# 2.1. Conformers

There are four conformer classes, which differ in their approach to duplicate recognition but share largely the same interface. The full interface is described below for the UniqueConformersCrest class, followed by more abbreviated descriptions of the UniqueConformersTFD, UniqueConformersRMSD, and UniqueConformersAMS classes. Overall, we recommend UniqueConformersCrest for its good accuracy/efficiency ratio and its ability to find and store rotamers. The UniqueConformersAMS class is slow in filtering out duplicates for large symmetric molecular systems, but it is a good choice if conformer sets need to be compared and clustered. The UniqueConformersRMSD class is only able to filter out the most obvious duplicates: it cannot identify duplicates that involve symmetric images (e.g. rotations around methyl groups).

## 2.1.1. UniqueConformersCrest

A class holding the conformers of a molecule, using CREST duplicate recognition to filter out duplicates.

class UniqueConformersCrest(energy_threshold=0.05, rmsd_threshold=0.125, bconst_threshold=0.01)

Class representing a set of unique conformers

An instance of this class has the following attributes:

• molecule – A PLAMS molecule object defining the connection data of the molecule
• geometries – A list containing the coordinates of all conformers in the set
• energies – A list containing the energies of all conformers in the set
• rotamers – A list with UniqueConformersCrest objects representing the rotamer-set for each conformer
• check_for_duplicates – Only accept new conformer if candidate is not a duplicate (if False, there is still a check for isomers and bond changes)
• accept_isomers – Don’t reject isomers (default is to reject them)
• accept_all – Accept any candidate in the set without checks
• generator – A conformer generator object. Has to be set with set_generator(). The default generator is of the CRESTGenerator type.

A simple example of (parallel) use:

>>> from scm.plams import Molecule
>>> from scm.plams import init, finish
>>> from scm.conformers import UniqueConformersCrest

>>> # Set up the molecular data
>>> mol = Molecule('mol.xyz')
>>> conformers = UniqueConformersCrest()
>>> conformers.prepare_state(mol)

>>> # Set up PLAMS settings
>>> init()

>>> # Create the generator and run
>>> conformers.generate(nproc=1, maxjobs=12)

>>> finish()

>>> # Write the results to file
>>> print(conformers)
>>> conformers.write()


Note

The default generator for this conformer class is the CRESTGenerator, using the GFN1-xTB engine. This will generally take a lot of time. To speed things up, set a generator with a different engine prior to running generate():

>>> from scm.plams import Settings
>>> engine = Settings()
>>> engine.ForceField.Type = 'UFF'
>>> conformers.set_generator(method='crest', engine_settings=engine, nproc=1, maxjobs=12)

__init__(energy_threshold=0.05, rmsd_threshold=0.125, bconst_threshold=0.01)

Creates an instance of the conformer class

• energy_threshold – The energy difference above which conformers are always considered unique (kcal/mol).
• rmsd_threshold – RMSD below which conformers are considered duplicates (Angstrom).
• bconst_threshold – Relative rotational constant used to determine if conformers are unique or not.
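
As a rough illustration of how three thresholds of this kind can work together, the sketch below treats a candidate as a duplicate only if its energy difference, its RMSD, and its relative rotational-constant difference all stay below their thresholds. This is an assumption about the decision logic, not the library's implementation; `relative_bconst_diff` and `is_duplicate` are hypothetical helper names, and the numbers are made up.

```python
def relative_bconst_diff(b_ref, b_cand):
    """Largest relative difference between two sets of rotational constants."""
    return max(abs(r - c) / abs(r) for r, c in zip(b_ref, b_cand))


def is_duplicate(d_energy, rmsd, bconst_diff,
                 energy_threshold=0.05, rmsd_threshold=0.125, bconst_threshold=0.01):
    """Hypothetical rule: all three criteria must indicate similarity."""
    if abs(d_energy) > energy_threshold:
        return False  # energies differ too much: always considered unique
    return rmsd < rmsd_threshold and bconst_diff < bconst_threshold


# Made-up rotational constants that differ by at most about 0.5 percent
bdiff = relative_bconst_diff((1.000, 0.500, 0.400), (1.005, 0.5025, 0.4004))
print(is_duplicate(0.01, 0.05, bdiff))  # small energy gap, small RMSD
print(is_duplicate(0.20, 0.05, bdiff))  # energy gap above the threshold
```

With the default thresholds, the first call reports a duplicate and the second does not, because the energy difference alone already exceeds energy_threshold.
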
add_conformer(coords, energy, reorder=True)

Adds the new coordinates to the list of conformers, if they are not duplicates

• coords – A coordinate array for the candidate conformer
• energy – The energy of the candidate conformer
• reorder – Boolean specifying if the conformers should be ordered based on energy after addition of candidate

Note

If the conformer is not unique, this returns the index of its duplicate. If it is unique, this returns None.
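
The return-value convention described in the note can be illustrated with a toy container. `ToyConformerSet` is a hypothetical stand-in, not the library class, and it assumes (for brevity) that duplicates are recognized by energy alone.

```python
class ToyConformerSet:
    """Toy stand-in mimicking the add_conformer return convention."""

    def __init__(self, energy_threshold=0.05):
        self.energy_threshold = energy_threshold
        self.geometries = []
        self.energies = []

    def add_conformer(self, coords, energy, reorder=True):
        # Assumption: in this toy, two conformers are duplicates if their energies are close
        for i, e in enumerate(self.energies):
            if abs(e - energy) < self.energy_threshold:
                return i  # index of the existing duplicate
        self.geometries.append(coords)
        self.energies.append(energy)
        if reorder:
            # Keep geometries and energies sorted together by increasing energy
            order = sorted(range(len(self.energies)), key=self.energies.__getitem__)
            self.geometries = [self.geometries[i] for i in order]
            self.energies = [self.energies[i] for i in order]
        return None  # candidate was unique and has been stored


confs = ToyConformerSet()
first = confs.add_conformer([[0.0, 0.0, 0.0]], energy=1.20)
second = confs.add_conformer([[0.1, 0.0, 0.0]], energy=0.40)
third = confs.add_conformer([[0.0, 0.1, 0.0]], energy=1.22)
print(first, second, third)  # None None 1
```

The third candidate is rejected as a duplicate of the conformer that ended up at index 1 after reordering, so callers can use the returned index to find out which stored conformer it matched.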

set_generator(method='crest', engine_settings=None, nproc=1, max_energy=6.0, maxjobs=1)

Store a generator object

Note

Overwrites previous generator object

• method – A string, and one of the following options [‘crest’, ‘rdkit’]

• engine_settings – PLAMS Settings object:
>>> engine_settings = Settings()
>>> engine_settings.DFTB.Model = 'GFN1-xTB'

• nproc – Number of processors used for each single call to AMS

• max_energy – Maximum accepted energy difference from lowest energy conformer

• maxjobs – Maximum number of parallel AMS processes

generate(method='crest', nproc=1, maxjobs=1)

Generate conformers using the specified method

• method – A string, and one of the following options [‘crest’, ‘rdkit’]
• nproc – Number of processors used for each single call to AMS (only used if set_generator was not called)
• maxjobs – Maximum number of parallel AMS processes (only used if set_generator was not called)

optimize(convergence_level, optimizer=None, max_energy=None, engine_settings=None, nproc=1, maxjobs=1, name='go', verbose=False)

(Re)-Optimize the conformers currently in the set

• convergence_level – One of the convergence options (‘tight’, ‘vtight’, ‘loose’, etc.)

• optimizer – Instance of the ConformerOptimizer class. If not provided, an engine_settings object is required.

• engine_settings – PLAMS Settings object:
>>> engine_settings = Settings()
>>> engine_settings.DFTB.Model = 'GFN1-xTB'

get_diffs_for_candidate(coords, energy, iconf=None)

Find out how much the values in the candidate molecule differ from each conformer

• coords – Coordinate array for the candidate conformer
• energy – Energy of the candidate conformer (kcal/mol)
• iconf – Optional: A single conformer index to compare the candidate with (default is to compare to all)
read(dirname, name='crest', enfilename=None, reorder=True, filetype=None)

Read a conformer set from the specified directory

• dirname – The directory name containing the conformer file
• name – The name of the conformer file
• enfilename – Optionally the name of a file containing the conformer energies (default: energies_name.txt)
• reorder – Boolean specifying if the conformers need to be reordered based on energy
• filetype – Extension of the conformer file (‘dcd’, ‘rkf’, ‘xyz’). If not provided, it is determined from the extensions of files in dirname
write(dirname='.', name='crest', write_rotamers=False, filetype='dcd')

Write the conformers to file

• write_rotamers – Boolean specifying if the rotamers of the conformers should be written to files
• name – The name of the conformer file
• dirname – The directory name containing the conformer file
• filetype – Extension of the conformer file (‘dcd’ (default) or ‘rkf’).
clear()

Remove all conformers

copy()

Copy the conformer set

filter(max_energy=None)

Filter all conformers again, possibly with a maximum allowed (relative) energy
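
The energy-window part of such a filter can be sketched as follows. Only the relative-energy cut is shown; any renewed duplicate check is left out. `filter_by_relative_energy` is a hypothetical helper, and the energies are made-up values in kcal/mol.

```python
def filter_by_relative_energy(energies, max_energy=None):
    """Return indices of conformers within max_energy of the lowest energy."""
    if not energies:
        return []
    e_min = min(energies)
    return [i for i, e in enumerate(energies)
            if max_energy is None or e - e_min <= max_energy]


energies = [-10.0, -9.5, -3.0, -9.9]  # hypothetical conformer energies
kept = filter_by_relative_energy(energies, max_energy=6.0)
print(kept)  # [0, 1, 3]
```

The conformer at index 2 lies 7.0 above the minimum and is dropped; with max_energy=None every index is kept.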

find_clusters(dist=5.0, criterion='maxclust', method='average', indices=None)

Assign all conformers to clusters

• dist – Either the max number of clusters (for maxclust), or the maximum distance between clusters (for distance)
• criterion – Determines how many clusters to make (maxclust or distance).
• indices – A tuple with as elements lists of indices for subsets of conformers

Note

Uses scipy’s fcluster method

find_nth_conformer(i)

Find the index of the n-th conformer added (indices start at 0)

fit()

Fit all conformers onto the first one in the set, and resave

classmethod from_rdkitmol(rdmol, energies=None, reorder=True)

Get all the conformers from the RDKit molecule

get_all_energies()

Get all the energies in the set

get_all_geometries()

Get all the geometries in the set

get_all_rmsds()

Get the RMSD value from the lowest energy conformer for all conformers
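
For orientation, a plain RMSD with respect to the first (lowest-energy) geometry can be computed as in the sketch below. It assumes that all geometries share the same atom order and are already superimposed; a real implementation would typically align the structures first. The `rmsd` helper and the coordinates are illustrative only.

```python
import math


def rmsd(coords_a, coords_b):
    """Plain RMSD between two equally ordered coordinate lists (no alignment)."""
    sq = sum((a - b) ** 2
             for pa, pb in zip(coords_a, coords_b)
             for a, b in zip(pa, pb))
    return math.sqrt(sq / len(coords_a))


geometries = [
    [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]],  # reference: lowest-energy conformer
    [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]],  # identical to the reference
    [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0]],  # second atom moved
]
rmsds = [rmsd(geometries[0], g) for g in geometries]
print(rmsds)  # [0.0, 0.0, 1.0]
```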

get_conformers()

Returns the conformers as a list of molecules

get_dendrogram(method='average')

Gets a dendrogram reflecting the distances between conformers

Note

Uses scipy’s fcluster method

get_energies()

Returns the energies in reference to the most stable

get_molecule(i)

Return a molecule object for conformer i

get_plot_dendrogram(dend, names=None, fontsize=4)

Makes a plot of the dendrogram

get_rdkitmol()

Convert to RDKit molecule

get_rmsds_from_frame(frame)

Get all RMSDs from a certain frame

indices_to_names(indices1, indices2, name1='a', name2='b')

Convert two sets of indices to names for the conformers in self

Note

• Mostly for use related to clustering features
• Only works with two sets of indices.
• All indices need to be represented by these two lists
prepare_state(mol)

Set up all the molecule data

remove_conformer(index)

Remove a conformer from the set

remove_high_energy(max_energy)

Remove all high energy conformers

remove_non_minima(save_rejected_to_file=False, rejected_filename='rejected_non_minima_conformers.xyz')

Perform PES point characterizations for all conformers and remove the ones that are not local minima. If save_rejected_to_file is True, the rejected non-minimum conformers are saved to the file rejected_filename.

reorder()

Reorder conformers from smallest to largest energy

rmsds

Get the RMSD value from the lowest energy conformer for all conformers

set_energies(energies)

Set the energies of the conformers

## 2.1.2. UniqueConformersTFD

A class holding the conformers of a molecule, using the torsion fingerprint difference distance (TFD) to recognize and filter out duplicates.

class UniqueConformersTFD(tfd_threshold=0.05)

Class representing a set of unique conformers

An instance of this class has the following attributes:

• molecule – A PLAMS molecule object defining the connection data of the molecule
• rdmol – RDKit molecule object without conformers
• geometries – A list containing the coordinates of all conformers in the set
• energies – A list containing the energies of all conformers in the set
• check_for_duplicates – Only accept new conformer if candidate is not a duplicate (if False, there is still a check for isomers and bond changes)
• accept_isomers – Don’t reject isomers (default is to reject them)
• accept_all – Accept any candidate in the set without checks
• generator – A conformer generator object. Has to be set with set_generator(). The default generator is of the RDKitGenerator type.

A simple example of (parallel) use:

>>> from scm.plams import Molecule
>>> from scm.plams import init, finish
>>> from scm.conformers import UniqueConformersTFD

>>> # Set up the molecular data
>>> mol = Molecule('mol.xyz')
>>> conformers = UniqueConformersTFD()
>>> conformers.prepare_state(mol)

>>> # Set up PLAMS settings
>>> init()

>>> # Create the generator and run
>>> conformers.generate(nproc=1, maxjobs=12)

>>> finish()

>>> # Write the results to file
>>> print(conformers)
>>> conformers.write()


Note

The default generator for this conformer class is the RDKitGenerator, using the GFN1-xTB engine. This will generally take a lot of time. To speed things up, set a different generator prior to running generate():

>>> from scm.plams import Settings
>>> engine = Settings()
>>> engine.ForceField.Type = 'UFF'
>>> conformers.set_generator(method='rdkit', engine_settings=engine, nproc=1, maxjobs=12)

__init__(tfd_threshold=0.05)

Creates an instance of the conformer class

• tfd_threshold – Torsion fingerprint difference (TFD) threshold used for duplicate recognition (unitless)
prepare_state(mol)

Set up all the molecule data

• mol – PLAMS Molecule object
add_conformer(coords, energy, reorder=True)

Adds the new coordinates to the list of conformers, if they are not duplicates

• coords – A coordinate array for the candidate conformer
• energy – The energy of the candidate conformer
• reorder – Boolean specifying if the conformers should be ordered based on energy after addition of candidate

Note

If the conformer is not unique, this returns the index of its duplicate. If it is unique, this returns None.

set_generator(method='rdkit', engine_settings=None, nproc=1, max_energy=6.0, maxjobs=1)

Store a generator object

• method – A string, and one of the following options [‘crest’, ‘rdkit’]

• engine_settings – PLAMS Settings object:
>>> engine_settings = Settings()
>>> engine_settings.DFTB.Model = 'GFN1-xTB'

• nproc – Number of processors used for each single call to AMS

• max_energy – Maximum accepted energy difference from lowest energy conformer

• maxjobs – Maximum number of parallel AMS processes

Note

Overwrites previous generator object

generate(method='rdkit', nproc=1, maxjobs=1)

Generate conformers using the specified method

• method – A string, and one of the following options [‘crest’, ‘rdkit’]
• nproc – Number of processors used for each single call to AMS (only used if set_generator was not called)
• maxjobs – Maximum number of parallel AMS processes (only used if set_generator was not called)

get_diffs_for_candidate(coords, energy, iconf=None)

Find out how much the values in the candidate molecule differ from each conformer

• coords – Coordinate array for the candidate conformer
• energy – Energy of the candidate conformer (kcal/mol)
• iconf – Optional: A single conformer index to compare the candidate with (default is to compare to all)
read(dirname, name='tfd', enfilename=None, reorder=True, filetype='dcd')

Read a conformer set from the specified directory

• dirname – The directory name containing the conformer file
• name – The name of the conformer file
• enfilename – Optionally the name of a file containing the conformer energies (default: energies_name.txt)
• reorder – Boolean specifying if the conformers need to be reordered based on energy
• filetype – Extension of the conformer file (‘dcd’ (default) or ‘rkf’).
write(dirname='.', name='tfd', filetype='dcd')

Write the conformers to file

• name – The name of the conformer file
• dirname – The directory name containing the conformer file
• filetype – Extension of the conformer file (‘dcd’ (default) or ‘rkf’).
get_torsion_atoms()

Returns all the torsion atoms involved in the TFD

Note

Each contribution is a list of sets of four atoms. Mostly the list has only one entry, but in case of symmetry, more sets of 4 atoms can contribute to a single torsion value.

get_torsion_values(iconf)

Get the values of all the torsion angles for this conformer

Note

Each contribution is a list of torsion angles. Mostly the list has only one entry, but in the case of symmetry, or rings, several torsion angles contribute to a single TFP value.
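
To make the torsion quantities concrete, the sketch below computes a dihedral angle from four atom positions and a simplified torsion deviation between two conformers. It is a stand-in for illustration, not the library's or RDKit's TFD code: the deviation here is an unweighted mean of periodic angle differences scaled to [0, 1], and all function names and coordinates are hypothetical.

```python
import math


def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Dot product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points."""
    b1 = [b - a for a, b in zip(p0, p1)]
    b2 = [b - a for a, b in zip(p1, p2)]
    b3 = [b - a for a, b in zip(p2, p3)]
    n2 = cross(b2, b3)
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(cross(b1, b2), n2)
    return math.degrees(math.atan2(y, x))


def torsion_deviation(angles_a, angles_b):
    """Unweighted mean of periodic torsion differences, scaled to [0, 1]."""
    devs = []
    for a, b in zip(angles_a, angles_b):
        d = abs(a - b) % 360.0
        devs.append(min(d, 360.0 - d) / 180.0)
    return sum(devs) / len(devs)


angle = dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0))
print(round(angle, 6))
print(torsion_deviation([170.0, 60.0], [-170.0, 60.0]))
```

Note that 170 and -170 degrees differ by only 20 degrees once periodicity is taken into account, which is why the deviation is computed on the wrapped difference rather than on the raw one.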

## 2.1.3. UniqueConformersRMSD

A class holding the conformers of a molecule, using only RMSD to recognize and filter out duplicates.

class UniqueConformersRMSD(energy_threshold=0.05, rmsd_threshold=0.125)

Class representing a set of unique conformers

An instance of this class has the following attributes:

• molecule – A PLAMS molecule object defining the connection data of the molecule
• geometries – A list containing the coordinates of all conformers in the set
• energies – A list containing the energies of all conformers in the set
• check_for_duplicates – Only accept new conformer if candidate is not a duplicate (if False, there is still a check for isomers and bond changes)
• accept_isomers – Don’t reject isomers (default is to reject them)
• accept_all – Accept any candidate in the set without checks
• generator – A conformer generator object. Has to be set with set_generator(). The default generator is of the RDKitGenerator type.

A simple example of (parallel) use:

>>> from scm.plams import Molecule
>>> from scm.plams import init, finish
>>> from scm.conformers import UniqueConformersRMSD

>>> # Set up the molecular data
>>> mol = Molecule('mol.xyz')
>>> conformers = UniqueConformersRMSD()
>>> conformers.prepare_state(mol)

>>> # Set up PLAMS settings
>>> init()

>>> # Create the generator and run
>>> conformers.generate(nproc=1, maxjobs=12)

>>> finish()

>>> # Write the results to file
>>> print(conformers)
>>> conformers.write()


Note

The default generator for this conformer class is the RDKitGenerator, using the UFF engine.

__init__(energy_threshold=0.05, rmsd_threshold=0.125)

Creates an instance of the conformer class

• energy_threshold – The energy difference above which conformers are always considered unique (kcal/mol).
• rmsd_threshold – RMSD below which conformers are considered duplicates (Angstrom).
add_conformer(coords, energy, reorder=True)

Adds the new coordinates to the list of conformers, if they are not duplicates

• coords – A coordinate array for the candidate conformer
• energy – The energy of the candidate conformer
• reorder – Boolean specifying if the conformers should be ordered based on energy after addition of candidate

Note

If the conformer is not unique, this returns the index of its duplicate. If it is unique, this returns None.

set_generator(method='rdkit', engine_settings=None, nproc=1, max_energy=6.0, maxjobs=1)

Store a generator object

Note

Overwrites previous generator object

• method – A string, and one of the following options [‘crest’, ‘rdkit’]

• engine_settings – PLAMS Settings object:
>>> engine_settings = Settings()
>>> engine_settings.DFTB.Model = 'GFN1-xTB'

• nproc – Number of processors used for each single call to AMS

• max_energy – Maximum accepted energy difference from lowest energy conformer

• maxjobs – Maximum number of parallel AMS processes

generate(method='rdkit', nproc=1, maxjobs=1)

Generate conformers using the specified method

• method – A string, and one of the following options [‘crest’, ‘rdkit’]
• nproc – Number of processors used for each single call to AMS (only used if set_generator was not called)
• maxjobs – Maximum number of parallel AMS processes (only used if set_generator was not called)

optimize(convergence_level, optimizer=None, max_energy=None, engine_settings=None, nproc=1, maxjobs=1, name='go', verbose=False)

(Re)-Optimize the conformers currently in the set

• convergence_level – One of the convergence options (‘tight’, ‘vtight’, ‘loose’, etc.)

• optimizer – Instance of the ConformerOptimizer class. If not provided, an engine_settings object is required.

• engine_settings – PLAMS Settings object:
>>> engine_settings = Settings()
>>> engine_settings.DFTB.Model = 'GFN1-xTB'

get_diffs_for_candidate(coords, energy, iconf=None)

Find out how much the values in the candidate molecule differ from each conformer

• coords – Coordinate array for the candidate conformer
• energy – Energy of the candidate conformer (kcal/mol)
• iconf – Optional: A single conformer index to compare the candidate with (default is to compare to all)
read(dirname, name='rmsd', enfilename=None, reorder=True, filetype=None)

Read a conformer set from the specified directory

• dirname – The directory name containing the conformer file
• name – The name of the conformer file
• enfilename – Optionally the name of a file containing the conformer energies (default: energies_name.txt)
• reorder – Boolean specifying if the conformers need to be reordered based on energy
• filetype – Extension of the conformer file (‘dcd’, ‘rkf’, ‘xyz’). If not provided, it is determined from the extensions of files in dirname
write(dirname='.', name='rmsd', filetype='dcd')

Write the conformers to file

• write_rotamers – Boolean specifying if the rotamers of the conformers should be written to files
• name – The name of the conformer file
• dirname – The directory name containing the conformer file
• filetype – Extension of the conformer file (‘dcd’ (default) or ‘rkf’).

## 2.1.4. UniqueConformersAMS

A class holding the conformers of a molecule, using distance matrices and torsion angles to recognize and filter out duplicates.

class UniqueConformersAMS(energy_threshold=0.2, min_dihed=30, min_dist=0.1)

Class representing a set of unique conformers

An instance of this class has the following attributes:

• molecule – A PLAMS molecule object defining the connection data of the molecule
• geometries – A list containing the coordinates of all conformers in the set
• energies – A list containing the energies of all conformers in the set
• rotamers – A list with UniqueConformersAMS objects representing the rotamer-set for each conformer
• check_for_duplicates – Only accept new conformer if candidate is not a duplicate (if False, there is still a check for isomers and bond changes)
• accept_isomers – Don’t reject isomers (default is to reject them)
• accept_all – Accept any candidate in the set without checks
• generator – A conformer generator object. Has to be set with set_generator(). The default generator is of the RDKitGenerator type.

A simple example of (parallel) use:

>>> from scm.plams import Molecule
>>> from scm.plams import init, finish
>>> from scm.conformers import UniqueConformersAMS

>>> # Set up the molecular data
>>> mol = Molecule('mol.xyz')
>>> conformers = UniqueConformersAMS()
>>> conformers.prepare_state(mol)

>>> # Set up PLAMS settings
>>> init()

>>> # Create the generator and run
>>> conformers.generate(nproc=1, maxjobs=12)

>>> finish()

>>> # Write the results to file
>>> print(conformers)
>>> conformers.write()


The default generator for this conformer class is the RDKitGenerator. A list of all possible generators:

• RDKitGenerator
• TorsionGenerator
• CrestGenerator

By default the RDKitGenerator uses the UFF engine. To select a different engine, set a different generator prior to running generate():

>>> from scm.plams import Settings
>>> engine = Settings()
>>> engine.ForceField.Type = 'UFF'
>>> conformers.set_generator(method='rdkit', engine_settings=engine, nproc=1, maxjobs=12)


The RDKitGenerator first uses RDKit to generate an initial set of conformer geometries. These are then subjected to geometry optimization using an AMS engine, after which duplicates are filtered out. By default, the RDKitGenerator determines the number of initial conformers based on the number of rotatable bonds in the system. For a large molecule, this will result in a very large number of conformers. To set the number of initial conformers by hand, use:

>>> conformers.set_generator(method='rdkit', nproc=1, maxjobs=12)
>>> conformers.generator.set_number_initial_conformers(100)
>>> print('Initial number of conformers:', conformers.generator.ngeoms)

__init__(energy_threshold=0.2, min_dihed=30, min_dist=0.1)

Creates an instance of the conformer class

• energy_threshold – The energy difference above which conformers are always considered unique (kcal/mol).
• min_dist – Maximum difference a distance between two atoms can have for a conformer to be considered a duplicate.
• min_dihed – Maximum difference a dihedral can have for a conformer to be considered a duplicate.
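
A duplicate check based on distance matrices and dihedrals could look roughly like the sketch below. It is a simplified assumption about how min_dist and min_dihed might be applied, not the library's algorithm: every interatomic distance and every dihedral must agree within its tolerance. The units (Angstrom and degrees), the helper names, and the coordinates are all assumptions made for illustration.

```python
import math
from itertools import combinations


def pair_distances(coords):
    """All interatomic distances (upper triangle of the distance matrix)."""
    return [math.dist(a, b) for a, b in combinations(coords, 2)]


def periodic_diff(a, b):
    """Smallest absolute difference between two angles in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def looks_like_duplicate(coords_a, coords_b, diheds_a, diheds_b,
                         min_dist=0.1, min_dihed=30):
    """Hypothetical rule: all distances and dihedrals agree within tolerance."""
    dist_ok = all(abs(x - y) < min_dist
                  for x, y in zip(pair_distances(coords_a), pair_distances(coords_b)))
    dihed_ok = all(periodic_diff(a, b) < min_dihed
                   for a, b in zip(diheds_a, diheds_b))
    return dist_ok and dihed_ok


ref = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
cand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.55, 0.0)]
print(looks_like_duplicate(ref, cand, [60.0], [75.0]))   # small changes only
print(looks_like_duplicate(ref, cand, [60.0], [120.0]))  # dihedral differs by 60 degrees
```

Because interatomic distances do not depend on the overall orientation of the molecule, a comparison of this kind needs no prior superposition of the two structures.
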
prepare_state(mol, atoms_to_remove=None)

Set up all the molecule data

• mol – A PLAMS Molecule object
• atoms_to_remove – Optional: A list of atoms to be removed from the distance matrices (default is all H)
add_conformer(coords, energy, reorder=True)

Adds the new coordinates to the list of conformers, if they are not duplicates

• coords – A coordinate array for the candidate conformer
• energy – The energy of the candidate conformer
• reorder – Boolean specifying if the conformers should be ordered based on energy after addition of candidate

Note

If the conformer is not unique, this method returns the index of its duplicate. If it is unique, this returns None.

set_generator(method='rdkit', engine_settings=None, nproc=1, max_energy=6.0, maxjobs=1)

Store a generator object, to be used by the generate() method

• method – A string, and one of the following options [‘crest’, ‘rdkit’]

• engine_settings – PLAMS Settings object:
>>> engine_settings = Settings()
>>> engine_settings.DFTB.Model = 'GFN1-xTB'

• nproc – Number of processors used for each single call to AMS

• max_energy – Maximum accepted energy difference from lowest energy conformer

• maxjobs – Maximum number of parallel AMS processes

Note

Overwrites previously used generator object.

generate(method='rdkit', nproc=1, maxjobs=1)

Generate conformers using the specified method

• method – A string, and one of the following options [‘crest’, ‘rdkit’]
• nproc – Number of processors used for each single call to AMS (only used if set_generator was not called)
• maxjobs – Maximum number of parallel AMS processes (only used if set_generator was not called)

Note

If a generator was set previously with the set_generator() method, no arguments are required

get_diffs_for_candidate(coords, energy=0.0, iconf=None)

Find out how much the values in the candidate molecule differ from each conformer

• coords – Coordinate array for the candidate conformer
• energy – Energy of the candidate conformer (kcal/mol)
• iconf – Optional: A single conformer index to compare the candidate with (default is to compare to all)
read(dirname, name='ams', enfilename=None, reorder=True, filetype=None)

Read a conformer set from the specified directory

• dirname – The directory name containing the conformer file
• name – The name of the conformer file
• enfilename – Optionally the name of a file containing the conformer energies (default: energies_name.txt)
• reorder – Boolean specifying if the conformers need to be reordered based on energy
• filetype – Extension of the conformer file (‘dcd’, ‘rkf’, ‘xyz’). If not provided, it is determined from the extensions of files in dirname
write(dirname='.', name='ams', write_rotamers=False, filetype='dcd')

Write the conformers to file

• write_rotamers – Boolean specifying if the rotamers of the conformers should be written to files
• name – The name of the conformer file
• dirname – The directory name containing the conformer file
• filetype – Extension of the conformer file (‘dcd’ (default) or ‘rkf’).