2.1. Conformers

There are four conformer classes, which differ in their approach to duplicate recognition but share largely the same interface. The full interface is described below for the UniqueConformersCrest class, followed by shorter descriptions of the UniqueConformersTFD, UniqueConformersRMSD, and UniqueConformersAMS classes. Overall, we recommend UniqueConformersCrest for its good accuracy/efficiency ratio and its ability to find and store rotamers. The UniqueConformersAMS class is slow at filtering out duplicates for large symmetric molecular systems, but is a good choice if conformer sets need to be compared and clustered. The UniqueConformersRMSD class can only filter out the most obvious duplicates: it cannot identify duplicates that involve symmetric images (e.g. rotations around methyl groups).

2.1.1. UniqueConformersCrest

A class holding the conformers of a molecule, using CREST duplicate recognition to filter out duplicates.

class UniqueConformersCrest(energy_threshold=0.05, rmsd_threshold=0.125, bconst_threshold=0.003)

Class representing a set of unique conformers

An instance of this class has the following attributes:

  • molecule – A PLAMS molecule object defining the connection data of the molecule

  • geometries – A list containing the coordinates of all conformers in the set

  • energies – A list containing the energies of all conformers in the set

  • rotamers – A list with UniqueConformersCrest objects representing the rotamer-set for each conformer

  • generator – A conformer generator object. Has to be set with set_generator(). The default generator is of the CRESTGenerator type.

  • settings – All the user definable settings

    * check_for_duplicates – Only accept new conformer if candidate is not a duplicate

    * accept_isomers – Don’t reject isomers (default is to reject them)

    * accept_all – Accept any candidate in the set without checks

A simple example of (parallel) use:

>>> from scm.plams import Molecule
>>> from scm.plams import init, finish
>>> from scm.conformers import UniqueConformersCrest

>>> # Set up the molecular data
>>> mol = Molecule('mol.xyz')
>>> conformers = UniqueConformersCrest()
>>> conformers.prepare_state(mol)

>>> # Set up PLAMS settings
>>> init()

>>> # Create the generator and run
>>> conformers.generate(nproc=4)

>>> finish()

>>> # Write the results to file
>>> print(conformers)
>>> conformers.write()

Note

The default generator for this conformer class is the CRESTGenerator, using the GFN1-xTB engine. This will generally take a lot of time. To speed things up, set a generator with a different engine prior to running generate():

>>> from scm.plams import Settings
>>> engine = Settings()
>>> engine.ForceField.Type = 'UFF'
>>> conformers.set_generator(method='crest', engine_settings=engine, nproc=4)

__init__(energy_threshold=0.05, rmsd_threshold=0.125, bconst_threshold=0.003)

Creates an instance of the conformer class

  • energy_threshold – The energy difference above which conformers are always considered unique (kcal/mol).

  • rmsd_threshold – RMSD below which conformers are considered duplicates (Angstrom).

  • bconst_threshold – Relative rotational constant used to determine if conformers are unique or not.

    Note: the original CREST code from the Grimme group uses 0.01 as bconst_threshold, but this leads to many misclassifications (i.e. different conformers are classified as equivalent rotamers). A smaller default value is therefore used here.
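How the three thresholds might interact can be sketched as follows. This is an illustration only: the helper classify_candidate and the order of its tests are assumptions, not the library's implementation. Only the parameter names and default values are taken from the signature above.

```python
def classify_candidate(d_energy, rmsd, d_bconst,
                       energy_threshold=0.05,
                       rmsd_threshold=0.125,
                       bconst_threshold=0.003):
    """Classify a candidate against one stored conformer (hypothetical helper).

    d_energy : absolute energy difference (kcal/mol)
    rmsd     : RMSD between the two geometries (Angstrom)
    d_bconst : relative difference in rotational constants (unitless)
    """
    # An energy gap above the threshold always means a distinct conformer.
    if d_energy > energy_threshold:
        return "unique"
    # Nearly identical atom positions: a plain duplicate.
    if rmsd < rmsd_threshold:
        return "duplicate"
    # Same energy and same rotational constants, but different atom
    # positions: treated in this sketch as a rotamer of the stored conformer.
    if d_bconst < bconst_threshold:
        return "rotamer"
    return "unique"


print(classify_candidate(0.50, 0.01, 0.0001))  # large energy gap
print(classify_candidate(0.01, 0.05, 0.0001))  # same geometry
print(classify_candidate(0.01, 0.80, 0.0010))  # same shape, atoms permuted
print(classify_candidate(0.01, 0.80, 0.0080))  # different shape
# With the looser value of 0.01 the last case is merged into a rotamer set.
print(classify_candidate(0.01, 0.80, 0.0080, bconst_threshold=0.01))
```

In this sketch, the last call shows the kind of misclassification that a looser bconst_threshold can cause.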

add_conformer(coords, energy, reorder=True)

Adds the new coordinates to the list of conformers, if they are not duplicates

  • coords – A coordinate array for the candidate conformer

  • energy – The energy of the candidate conformer

  • reorder – Boolean specifying if the conformers should be ordered based on energy after addition of candidate

Note

If the conformer is not unique, this returns the index of its duplicate. If it is unique, this returns None.
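The return convention can be illustrated with a toy stand-in. The class ToyConformerSet below is hypothetical and compares energies only, whereas the real class also compares geometries; it merely mimics the behaviour of returning the duplicate's index or None.

```python
class ToyConformerSet:
    """Hypothetical stand-in that mimics the return convention only."""

    def __init__(self, energy_threshold=0.05):
        self.energy_threshold = energy_threshold
        self.energies = []

    def add_conformer(self, energy, reorder=True):
        # Toy duplicate test: energies closer than the threshold.
        for index, stored in enumerate(self.energies):
            if abs(stored - energy) <= self.energy_threshold:
                return index  # duplicate found, nothing is added
        self.energies.append(energy)
        if reorder:
            self.energies.sort()  # keep the set ordered by energy
        return None  # candidate was unique and has been stored


confs = ToyConformerSet()
results = [confs.add_conformer(e) for e in (1.20, 0.00, 1.22, 0.70)]
print(results)         # one entry per candidate: None or a duplicate index
print(confs.energies)  # energies of the conformers that were kept
```

A caller can therefore test the return value against None to find out whether the candidate was accepted.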

get_diffs_for_candidate(coords, energy, iconf=None)

Find out how much the values in the candidate molecule differ from each conformer

  • coords – Coordinate array for the candidate conformer

  • energy – Energy of the candidate conformer (kcal/mol)

  • iconf – Optional: A single conformer index to compare the candidate with (default is to compare to all)

_compute_inverse_size(coords)

Compute all distances in this molecule object

Note: This is part of the CREST distance matrix approach.

_get_family()

Find the names of the family of classes to which self belongs

_handle_duplicate(data, reorder, check_for_duplicates)

Find any duplicate/rotamer conformer, and swap it with the candidate if that lowers the energy

_handle_duplicates(data, reorder, check_for_duplicates)

Find any duplicates, and swap if it makes the energy lower

_log_energy_filtering(high_energies, max_energy)

Do some logging about high energy conformers that were filtered out

add_conformers(geometries, energies, max_energy=None)

Add a set of conformers and energies

clear()

Remove all conformers

convert_key_to_camelcase(key)

Convert the key to json format

convert_key_to_underscore_case(key)

Revert back to original value, and see if it works

convert_keys_to_camelcase(settings)

Convert all keys to CamelCase (except the last json ones like _type, _default, etc)

convert_keys_to_underscore_case(settings)

Revert all the keys to the Python settings

copy()

Copy the conformer set

static create_json_entry(typename, value, unique=True, choices=None, include=None)

Create a single entry

filter(max_energy=None)

Filter all conformers again, possibly with a maximum allowed (relative) energy
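The energy part of such a filter can be sketched in plain Python. The function filter_by_energy is a hypothetical helper, not the library method, and it assumes that max_energy is measured relative to the lowest energy in the set; the duplicate check that the real method also performs is left out.

```python
def filter_by_energy(geometries, energies, max_energy=None):
    """Keep conformers within max_energy of the lowest one (hypothetical helper)."""
    lowest = min(energies)
    kept = [
        (geom, energy)
        for geom, energy in zip(geometries, energies)
        if max_energy is None or energy - lowest <= max_energy
    ]
    # Order the surviving conformers from lowest to highest energy.
    kept.sort(key=lambda pair: pair[1])
    return [geom for geom, _ in kept], [energy for _, energy in kept]


# Placeholder labels stand in for coordinate arrays.
geoms, ens = filter_by_energy(
    ["conf_a", "conf_b", "conf_c"], [-9.5, -10.0, -7.0], max_energy=2.0
)
print(geoms, ens)
```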

find_clusters(dist=5.0, criterion='maxclust', method='average', indices=None)

Assign all conformers to clusters

  • dist – Either the max number of clusters (for maxclust), or the maximum distance between clusters (for distance)

  • criterion – Determines how many clusters to make (maxclust or distance).

  • indices – A tuple with as elements lists of indices for subsets of conformers

Note

Uses scipy’s fcluster method
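The difference between the two criteria can be illustrated with a small pure-Python stand-in for average-linkage clustering. This is not scipy's code and not the library's implementation; the function average_linkage and the distance matrix are made up for the example.

```python
def average_linkage(dist, max_clusters=None, max_distance=None):
    """Merge clusters until a stopping criterion is met (illustrative only).

    dist         : symmetric matrix of pairwise conformer distances
    max_clusters : stop once this many clusters remain ('maxclust'-like)
    max_distance : stop once the closest pair is further apart ('distance'-like)
    """
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > 1:
        if max_clusters is not None and len(clusters) <= max_clusters:
            break
        # Find the pair of clusters with the smallest average distance.
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                avg = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or avg < best[0]:
                    best = (avg, a, b)
        avg, a, b = best
        if max_distance is not None and avg > max_distance:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


dist = [
    [0.0, 0.2, 5.0, 5.1],
    [0.2, 0.0, 4.9, 5.0],
    [5.0, 4.9, 0.0, 0.3],
    [5.1, 5.0, 0.3, 0.0],
]
print(average_linkage(dist, max_clusters=2))
print(average_linkage(dist, max_distance=0.25))
```

With a cluster-count criterion the number of groups is fixed in advance, while a distance criterion lets the data decide how many groups remain.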

find_nth_conformer(i)

Find the index of the n-th conformer added (indices start at 0)

fit()

Fit the lowest energy conformer onto the reference molecule. Fit all other conformers onto the lowest one in the set.

classmethod from_rdkitmol(rdmol, energies=None, reorder=True)

Get all the conformers from the RDKit molecule

generate(method=None, nproc=1)

Generate conformers using the specified method

Note

Adjusts self

  • method – A string, and one of the following options

    [‘crest’, ‘rdkit’, ‘torsion’], or None to use a previously set generator

  • nproc – Number of processors used in total (only used if set_generator was not called)

get_all_energies()

Get all the energies in the set

get_all_geometries()

Get all the geometries in the set

get_all_rmsds()

Get the RMSD value from the lowest energy conformer for all conformers

get_conformers()

Returns the conformers as a list of molecules

get_dendrogram(method='average')

Gets a dendrogram reflecting the distances between conformers

Note

Uses scipy’s fcluster method

get_energies()

Returns the energies in reference to the most stable

get_json_settings()

Convert the settings object in self to a json style settings object

get_molecule(i)

Return a molecule object for conformer i

get_plot_dendrogram(dend, names=None, fontsize=4)

Makes a plot of the dendrogram

get_rdkitmol()

Convert to RDKit molecule

get_rmsds_from_frame(frame)

Get all RMSDs from a certain frame

handle_input(inp, names, handle_nested_objects)

Get the settings from the input, convert to underscore case, and insert them

  • inp - Input object

  • names - The names (self._name) of all the classes in the family of classes of self

  • handle_nested_objects - Boolean specifying if there can be encapsulated objects of the same family, for which settings are needed

indices_to_names(indices1, indices2, name1='a', name2='b')

Convert two sets of indices to names for the conformers in self

Note

  • Mostly for use related to clustering features

  • Only works with two sets of indices.

  • All indices need to be represented by these two lists

property molecule

Return the main molecule object that contains the bonds etc

optimize(convergence_level, optimizer=None, max_energy=None, engine_settings=None, nproc=1, name='go', verbose=False)

(Re)-Optimize the conformers currently in the set

  • convergence_level – One of the convergence options (‘Normal’, ‘Good’, ‘VeryGood’, ‘Excellent’)

  • optimizer – Instance of the ConformerOptimizer class. If not provided, an engine_settings object is required.

  • engine_settings – PLAMS Settings object:
    >>> engine_settings = Settings()
    >>> engine_settings.DFTB.Model = 'GFN1-xTB'
    
pass_settings(settings, encaps_settings=False)

Set the settings object into self

prepare_state(mol)

Set up all the molecule data

read(dirname='.', name='conformers', enfilename=None, reorder=True, filetype=None, read_rotamers=False)

Read a conformer set from the specified directory in DCD format

read_settings(inp, names)

Get the settings for the conformers object from the input in the required format

  • inp - Input object

  • names - The classnames of all the classes in the family of classes of obj

Note: This is complicated by the fact that a single family of classes may be spread over multiple input blocks.

remove_conformer(index)

Remove a conformer from the set

remove_high_energy(max_energy)

Remove all high energy conformers

remove_non_minima(save_rejected_to_file=False, rejected_filename='rejected_non_minima_conformers.xyz')

Perform PES point characterizations for all conformers and remove the ones that are not local minima. If save_rejected_to_file is true, rejected non-minimum conformers are saved to the file rejected_filename.

reorder()

Reorder conformers from smallest to largest energy

property rmsds

Get the RMSD value from the lowest energy conformer for all conformers

score(engine_settings, nproc=1, watch=True)

Re-score the conformers according to the energy of a single-point calculation with the specified engine settings. This method does not change the geometry of the conformers; it only computes the energy with the given engine and re-sorts them.

set_blocknames(blocknames)

Provide the relevant blocknames for the family of classes

  • blocknames – List of strings representing the input blocks relevant for this family of classes

set_energies(energies)

Set the energies of the conformers

set_generator(method, engine_settings=None, nproc=1, max_energy=None)

Store a generator object

Note

Overwrites previous generator object

  • method – A string, and one of the following options

    [‘crest’, ‘rdkit’, ‘torsion’, ‘annealing’]

  • engine_settings – PLAMS Settings object:

    >>> engine_settings = Settings()
    >>> engine_settings.DFTB.Model = 'GFN1-xTB'

  • nproc – Number of processors used in total

to_json()

Create a json settings object from self.settings

write(dirname='.', name='conformers', filetype='rkf', write_rotamers=True)

Write the conformers to file

2.1.2. UniqueConformersTFD

A class holding the conformers of a molecule, using the torsion fingerprint difference distance (TFD) to recognize and filter out duplicates.

class UniqueConformersTFD(energy_threshold=0.05, tfd_threshold=0.05)

Class representing a set of unique conformers

An instance of this class has the following attributes:

  • molecule – A PLAMS molecule object defining the connection data of the molecule

  • rdmol – RDKit molecule object without conformers

  • geometries – A list containing the coordinates of all conformers in the set

  • energies – A list containing the energies of all conformers in the set

  • generator – A conformer generator object. Has to be set with set_generator(). The default generator is of the CRESTGenerator type.

  • settings – User definable settings

    * check_for_duplicates – Only accept new conformer if candidate is not a duplicate

    * accept_isomers – Don’t reject isomers (default is to reject them)

    * accept_all – Accept any candidate in the set without checks

A simple example of (parallel) use:

>>> from scm.plams import Molecule
>>> from scm.plams import init, finish
>>> from scm.conformers import UniqueConformersTFD

>>> # Set up the molecular data
>>> mol = Molecule('mol.xyz')
>>> conformers = UniqueConformersTFD()
>>> conformers.prepare_state(mol)

>>> # Set up PLAMS settings
>>> init()

>>> # Create the generator and run
>>> conformers.generate(nproc=4)

>>> finish()

>>> # Write the results to file
>>> print(conformers)
>>> conformers.write()

Note

The default generator for this conformer class is the RDKitGenerator, using the GFN1-xTB engine. This will generally take a lot of time. To speed things up, set a different generator prior to running generate():

>>> from scm.plams import Settings
>>> engine = Settings()
>>> engine.ForceField.Type = 'UFF'
>>> conformers.set_generator(method='rdkit', engine_settings=engine, nproc=4)

__init__(energy_threshold=0.05, tfd_threshold=0.05)

Creates an instance of the conformer class

  • energy_threshold – The energy difference above which conformers are always considered unique (kcal/mol).

  • tfd_threshold – Torsion fingerprint difference (TFD) below which conformers are considered duplicates (unitless)
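A torsion fingerprint difference can be sketched as the mean of the normalized torsion deviations. The code below is a simplified, unweighted version that assumes every torsion is normalized by 180 degrees; the functions torsion_difference and tfd are hypothetical and ignore the weighting and symmetry handling that a full implementation may apply.

```python
def torsion_difference(a, b):
    """Smallest absolute difference between two angles in degrees (0 to 180)."""
    d = abs(a - b) % 360.0
    return 360.0 - d if d > 180.0 else d


def tfd(torsions_a, torsions_b):
    """Unweighted torsion fingerprint difference (simplified sketch)."""
    deviations = [
        torsion_difference(a, b) / 180.0
        for a, b in zip(torsions_a, torsions_b)
    ]
    return sum(deviations) / len(deviations)


reference = [60.0, 180.0, -60.0]
near_copy = [62.0, -179.0, -58.0]   # small deviations, one across +/-180
rotated = [60.0, 60.0, -60.0]       # middle torsion changed by 120 degrees

tfd_threshold = 0.05  # default from the signature above
print(tfd(reference, near_copy) < tfd_threshold)
print(tfd(reference, rotated) < tfd_threshold)
```

The periodic wrap in torsion_difference matters: without it, 180 and -179 degrees would count as almost a full turn apart.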

prepare_state(mol)

Set up all the molecule data

  • mol – PLAMS Molecule object

property rdmol

Return the RDKit molecule object (which does not contain the conformers)

add_conformer(coords, energy, reorder=True)

Adds the new coordinates to the list of conformers, if they are not duplicates

  • coords – A coordinate array for the candidate conformer

  • energy – The energy of the candidate conformer

  • reorder – Boolean specifying if the conformers should be ordered based on energy after addition of candidate

Note

If the conformer is not unique, this returns the index of its duplicate. If it is unique, this returns None.

get_diffs_for_candidate(coords, energy, iconf=None)

Find out how much the values in the candidate molecule differ from each conformer

  • coords – Coordinate array for the candidate conformer

  • energy – Energy of the candidate conformer (kcal/mol)

  • iconf – Optional: A single conformer index to compare the candidate with (default is to compare to all)

get_torsion_atoms()

Returns all the torsion atoms involved in the TFD

Note

Each contribution is a list of sets of four atoms. Mostly the list has only one entry, but in case of symmetry, more sets of 4 atoms can contribute to a single torsion value.

get_torsion_values(iconf)

Get the values of all the torsion angles for this conformer

Note

Each contribution is a list of torsion angles. Mostly the list has only one entry, but in the case of symmetry, or rings, several torsion angles contribute to a single TFP value.
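Each individual torsion angle follows from the positions of its four atoms. A minimal pure-Python sketch of that calculation is given below; the function dihedral is a hypothetical helper using the standard atan2 formula, and how the library combines several angles into one value is not shown.

```python
import math


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four 3D points."""
    def sub(a, b):
        return [a[k] - b[k] for k in range(3)]

    def cross(a, b):
        return [a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0]]

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    b1, b2, b3 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b1, b2), cross(b2, b3)  # normals of the two planes
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Schematic coordinates: the central bond runs along the z axis.
cis = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
trans = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
gauche = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
print(cis, trans, gauche)
```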

2.1.3. UniqueConformersRMSD

A class holding the conformers of a molecule, using only RMSD to recognize and filter out duplicates.

class UniqueConformersRMSD(energy_threshold=0.05, rmsd_threshold=0.125)

Class representing a set of unique conformers

An instance of this class has the following attributes:

  • molecule – A PLAMS molecule object defining the connection data of the molecule

  • geometries – A list containing the coordinates of all conformers in the set

  • energies – A list containing the energies of all conformers in the set

  • generator – A conformer generator object. Has to be set with set_generator(). The default generator is of the CRESTGenerator type.

  • settings – User definable settings

    * check_for_duplicates – Only accept new conformer if candidate is not a duplicate

    * accept_isomers – Don’t reject isomers (default is to reject them)

    * accept_all – Accept any candidate in the set without checks

A simple example of (parallel) use:

>>> from scm.plams import Molecule
>>> from scm.plams import init, finish
>>> from scm.conformers import UniqueConformersRMSD

>>> # Set up the molecular data
>>> mol = Molecule('mol.xyz')
>>> conformers = UniqueConformersRMSD()
>>> conformers.prepare_state(mol)

>>> # Set up PLAMS settings
>>> init()

>>> # Create the generator and run
>>> conformers.generate(nproc=4)

>>> finish()

>>> # Write the results to file
>>> print(conformers)
>>> conformers.write()

Note

The default generator for this conformer class is the RDKitGenerator, using the UFF engine.

__init__(energy_threshold=0.05, rmsd_threshold=0.125)

Creates an instance of the conformer class

  • energy_threshold – The energy difference above which conformers are always considered unique (kcal/mol).

  • rmsd_threshold – RMSD below which conformers are considered duplicates (Angstrom).
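Why an RMSD criterion with fixed atom ordering misses symmetric images can be seen in a small example. The coordinates below are schematic rather than a real molecule, and the rmsd function is a hypothetical helper that assumes the two geometries are already superimposed.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two superimposed geometries with identical atom order."""
    squared = sum(
        (a[k] - b[k]) ** 2
        for a, b in zip(coords_a, coords_b)
        for k in range(3)
    )
    return math.sqrt(squared / len(coords_a))


# Two fixed atoms, so that a rigid rotation of the whole structure
# cannot undo the permutation of the hydrogens.
anchor = [(0.0, 0.0, 0.0), (1.5, 0.0, -1.0)]
# Three methyl-like hydrogens, 120 degrees apart around the z axis.
hydrogens = [(1.0, 0.0, 0.4), (-0.5, 0.866, 0.4), (-0.5, -0.866, 0.4)]

original = anchor + hydrogens
# Rotating the methyl group by 120 degrees only relabels the hydrogens.
rotated = anchor + [hydrogens[1], hydrogens[2], hydrogens[0]]

rmsd_threshold = 0.125  # default from the signature above
print(sorted(original) == sorted(rotated))       # same set of positions
print(rmsd(original, rotated) < rmsd_threshold)  # yet not flagged as duplicate
```

Both structures occupy exactly the same positions, but the atom-by-atom comparison sees every hydrogen as displaced.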

prepare_state(mol)

Set up all the molecule data

  • mol – PLAMS Molecule object

add_conformer(coords, energy, reorder=True)

Adds a conformer to the list if requirements are met

Note

Adds every conformer

get_diffs_for_candidate(coords, energy, iconf=None)

Find out how much the values in the candidate molecule differ from each conformer

  • coords – Coordinate array for the candidate conformer

  • energy – Energy of the candidate conformer (kcal/mol)

  • iconf – Optional: A single conformer index to compare the candidate with (default is to compare to all)

_get_trimmed_rdmol(coords)

Get the trimmed RDKit Molecule from the full coordinates

_get_trimmed_rdmol_from_conf(conf)

Get the trimmed RDKit Molecule from the trimmed conformer

2.1.4. UniqueConformersAMS

A class holding the conformers of a molecule, using distance matrices and torsion angles to recognize and filter out duplicates.

class UniqueConformersAMS(energy_threshold=0.2, dihedral_threshold=30.0, distance_threshold=0.1)

Class representing a set of unique conformers

An instance of this class has the following attributes:

  • molecule – A PLAMS molecule object defining the connection data of the molecule

  • geometries – A list containing the coordinates of all conformers in the set

  • energies – A list containing the energies of all conformers in the set

  • rotamers – A list with UniqueConformersAMS objects representing the rotamer-set for each conformer

  • generator – A conformer generator object. Has to be set with set_generator(). The default generator is of the CrestGenerator type.

  • settings – All the user definable settings

    * check_for_duplicates – Only accept new conformer if candidate is not a duplicate

    * accept_isomers – Don’t reject isomers (default is to reject them)

    * accept_all – Accept any candidate in the set without checks

A simple example of (parallel) use:

>>> from scm.plams import Molecule
>>> from scm.plams import init, finish
>>> from scm.conformers import UniqueConformersAMS

>>> # Set up the molecular data
>>> mol = Molecule('mol.xyz')
>>> conformers = UniqueConformersAMS()
>>> conformers.prepare_state(mol)

>>> # Set up PLAMS settings
>>> init()

>>> # Create the generator and run
>>> conformers.generate(nproc=4)

>>> finish()

>>> # Write the results to file
>>> print(conformers)
>>> conformers.write()

The default generator for this conformer class is the RDKitGenerator. A list of all possible generators:

  • RDKitGenerator

  • TorsionGenerator

  • CrestGenerator

By default the RDKitGenerator uses the UFF engine. To select a different engine, set a different generator prior to running generate():

>>> from scm.plams import Settings
>>> engine = Settings()
>>> engine.ForceField.Type = 'UFF'
>>> conformers.set_generator(method='rdkit', engine_settings=engine, nproc=4)

The RDKitGenerator first uses RDKit to generate an initial set of conformer geometries. These are then subjected to geometry optimization using an AMS engine, after which duplicates are filtered out. By default, the RDKitGenerator determines the number of initial conformers based on the number of rotatable bonds in the system. For a large molecule, this will result in a very large number of conformers. To set the number of initial conformers by hand, use:

>>> conformers.set_generator(method='rdkit', nproc=4)
>>> conformers.generator.set_number_initial_conformers(100)
>>> print('Initial number of conformers: ', conformers.generator.settings.ngeoms)

__init__(energy_threshold=0.2, dihedral_threshold=30.0, distance_threshold=0.1)

Creates an instance of the conformer class

  • energy_threshold – The energy difference above which conformers are always considered unique (kcal/mol).

  • distance_threshold – Maximum difference a distance between two atoms can have for a conformer to be considered a duplicate.

  • dihedral_threshold – Maximum difference a dihedral can have for a conformer to be considered a duplicate.
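One way these three thresholds could combine is sketched below. The function is_duplicate_ams is hypothetical and not the library's implementation; it assumes distances in Angstrom and dihedrals in degrees, and that all distances and all dihedrals must agree within their thresholds for a candidate to count as a duplicate.

```python
def angle_difference(a, b):
    """Smallest absolute difference between two angles in degrees (0 to 180)."""
    d = abs(a - b) % 360.0
    return 360.0 - d if d > 180.0 else d


def is_duplicate_ams(distances_a, distances_b, dihedrals_a, dihedrals_b,
                     d_energy, energy_threshold=0.2,
                     dihedral_threshold=30.0, distance_threshold=0.1):
    """Compare two conformers by interatomic distances and dihedrals (sketch)."""
    # A large energy gap always means a distinct conformer.
    if d_energy > energy_threshold:
        return False
    max_distance_diff = max(
        abs(a - b) for a, b in zip(distances_a, distances_b)
    )
    max_dihedral_diff = max(
        angle_difference(a, b) for a, b in zip(dihedrals_a, dihedrals_b)
    )
    return (max_distance_diff <= distance_threshold
            and max_dihedral_diff <= dihedral_threshold)


# Made-up values for a handful of atom pairs and torsions.
dist_a, dist_b = [1.54, 2.52, 3.90], [1.55, 2.50, 3.85]
dih_a = [178.0, -60.0]

print(is_duplicate_ams(dist_a, dist_b, dih_a, [-175.0, -65.0], d_energy=0.1))
print(is_duplicate_ams(dist_a, dist_b, dih_a, [60.0, -65.0], d_energy=0.1))
print(is_duplicate_ams(dist_a, dist_b, dih_a, [-175.0, -65.0], d_energy=0.5))
```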

prepare_state(mol)

Set up all the molecule data

  • mol – A PLAMS Molecule object

add_conformer(coords, energy, reorder=True)

Adds the new coordinates to the list of conformers, if they are not duplicates

  • coords – A coordinate array for the candidate conformer

  • energy – The energy of the candidate conformer

  • reorder – Boolean specifying if the conformers should be ordered based on energy after addition of candidate

Note

If the conformer is not unique, this method returns the index of its duplicate. If it is unique, this returns None.

get_diffs_for_candidate(coords, energy=0.0, iconf=None)

Find out how much the values in the candidate molecule differ from each conformer

  • coords – Coordinate array for the candidate conformer

  • energy – Energy of the candidate conformer (kcal/mol)

  • iconf – Optional: A single conformer index to compare the candidate with (default is to compare to all)