1. AMSConformers

1.1. Python library for conformer generation

The conformers package holds a collection of tools for the generation of conformer sets for a molecule. It makes heavy use of the AMS Python library PLAMS. The main tools for conformer generation in the package are:

  • CREST conformer generation [1]
  • RDKit ETKDG conformer generation [2]

The top level classes in the conformers package are the UniqueConformers classes:

  • UniqueConformersCrest
  • UniqueConformersRMSD
  • UniqueConformersTFD
  • UniqueConformersAMS

The main task of the conformer class is to hold the conformers of a molecule and prune/filter out duplicates. There are several ways duplicates can be defined :

  • CREST duplicate recognition (UniqueConformersCrest) [1]
  • Duplicate recognition based on RMSD (UniqueConformersRMSD)
  • Duplicate recognition based on the Torsion Fingerprint Difference (UniqueConformersTFD) [3]
  • Duplicate recognition based on interatomic distances and torsion angles (UniqueConformersAMS)

Out of these conformer classes we recommend UniqueConformersCrest as the best compromise between accuracy and efficiency in detecting and filtering out duplicates.

In addition to duplicate pruning, all UniqueConformers classes are also linked to a default conformer generation method (which can be changed by the user). The conformer generation methods in turn are linked to default geometry optimization settings (also changeable by the user).

To create a conformer set, an empty instance of the conformers object is created, and then conformers can be added to it, read from file, or generated. The conformers in the set can then be manipulated and compared.


AMSConformers can also be used via the command line:

$AMSBIN/amsconformers my_molecule.xyz

Note: not all features from the python library can be used via the amsconformers command line tool. Type $AMSBIN/amsconformers -h for more info.

1.2. Simple example

In this example, conformers are generated using the RDKit ETKDG routines, and duplicates are filtered out using the AMS approach.

from scm.plams import Molecule, Settings
from scm.plams import init, finish
from scm.conformers import UniqueConformersCrest

# Set up the molecular data
mol = Molecule('mol.xyz')
conformers = UniqueConformersCrest()

# Set up PLAMS file handling

# Generate the conformers using the RDKit ETKDG method
# This method includes geometry optimization and pruning
conformers.generate(method='rdkit', nproc=1, maxjobs=4)


print ('NConformers: ',len(conformers))
print (conformers)

1.3. Comparing conformer sets

In this example, two conformer sets are read in from file and evaluated based on three criteria:

  1. The energy of the most stable conformer in the set
  2. The number of unique conformers in the set
  3. The number of conformer-clusters (similar conformers) in the set
import os
import numpy
from scm.plams import Molecule, DCDTrajectoryFile
from scm.conformers import UniqueConformersAMS, UniqueConformersCrest

import matplotlib

# Define some filenames
pathname = '.'
filename = os.path.join(pathname,'mol.xyz')
mol = Molecule(filename)
print ('nats: ',len(mol))

# Read in the enumerated conformers (the larger set)
conformers_enum = UniqueConformersAMS()
conformers_enum.accept_all = True
print ('Original number of enumerated conformers: ',len(conformers_enum))

# Read in the Crest conformers
conformers_crest = UniqueConformersAMS()
conformers_crest.accept_all = True
print ('Original number of CREST conformers: ',len(conformers_crest))

# Combine the two sets into one big set (remembering which conformers came from which set)
all_conformers, (indices1, indices2) = conformers_enum + conformers_crest
names = all_conformers.indices_to_names(indices1, indices2)
print('Total number of conformers: ',len(all_conformers))
print ('Indices CREST conformers in full set:')
print (indices2)

# Check the lowest energy of the CREST set
print ('\n=================================')
e_diff = conformers_crest.energies[0] - min(all_conformers.energies)
print ('Lowest energy (CREST wrt enumeration)                  : %20.10f kcal/mol'%(e_diff))
print ('Quality lowest energy conformer CREST (between 0 and 1): ',numpy.exp(-e_diff))
print ('=================================\n')

# Print the percentage of conformers found by the testset
print ('=================================')
print ('Fraction of conformers found by CREST      : ',len(conformers_crest)/len(all_conformers))
print ('Fraction of conformers found by enumeration: ',len(conformers_enum)/len(all_conformers))
print ('=================================\n')

# Make a dendrogram plot for the big set
dendrogram = all_conformers.get_dendrogram()
figure = all_conformers.get_plot_dendrogram(dendrogram, names=names, fontsize=8)

# Now assign conformers to clusters
cluster_indices1, cluster_indices2 = all_conformers.find_clusters(0.11,'distance',indices=(indices1, indices2))
nclusters = max(cluster_indices1+cluster_indices2)
print ('=================================')
print('nclusters: ',nclusters)
print ('Cluster number of each member of CREST set')
print (cluster_indices2)
print ('Fraction of clusters in CREST set     : ',len(set(cluster_indices2))/nclusters)
print ('Fraction of clusters in enumerated set: ',len(set(cluster_indices1))/nclusters)
print ('=================================')


[1](1, 2) DOI: 10.1039/c9cp06869d
[2]DOI: 10.1021/acs.jcim.5b00654
[3]DOI: 10.1021/ci2002318]