RDKit interface

RDKit is a collection of cheminformatics and machine-learning software written in C++ and Python. PLAMS interface to RDKit originates from QMFlows project and features functions for generating PLAMS molecules from string representations (SMARTS, SMILES) as well as a handful of tools for dealing with proteins and PDB files.

from_rdmol(rdkit_mol, confid=-1, properties=True)[source]

Translate an RDKit molecule into a PLAMS molecule type. RDKit properties will be unpickled if their name ends with ‘_pickled’.

Parameters:
  • rdkit_mol (rdkit.Chem.Mol) – RDKit molecule
  • confid (int) – conformer identifier from which to take coordinates
  • properties (bool) – If all Chem.Mol, Chem.Atom and Chem.Bond properties should be converted from RDKit to PLAMS format.
Returns:

a PLAMS molecule

Return type:

Molecule

to_rdmol(plams_mol, sanitize=True, properties=True, assignChirality=False)[source]

Translate a PLAMS molecule into an RDKit molecule type. PLAMS Molecule, Atom or Bond properties are pickled if they are neither booleans, floats, integers, floats nor strings, the resulting property names are appended with ‘_pickled’.

Parameters:
  • plams_mol (Molecule) – A PLAMS molecule
  • sanitize (bool) – Kekulize, check valencies, set aromaticity, conjugation and hybridization
  • properties (bool) – If all Molecule, Atom and Bond properties should be converted from PLAMS to RDKit format.
  • assignChirality (bool) – Assign R/S and cis/trans information, insofar as this was not yet present in the PLAMS molecule.
Returns:

an RDKit molecule

Return type:

rdkit.Chem.Mol

from_smiles(smiles, nconfs=1, name=None, forcefield=None, rms=0.1)[source]

Generates PLAMS molecule(s) from a smiles strings.

Parameters:
  • smiles (str) – A smiles string
  • nconfs (int) – Number of conformers to be generated
  • name (str) – A name for the molecule
  • forcefield (str) – Choose ‘uff’ or ‘mmff’ forcefield for geometry optimization and ranking of comformations. The default value None results in skipping of the geometry optimization step.
  • rms (float) – Root Mean Square deviation threshold for removing similar/equivalent conformations
Returns:

A molecule with hydrogens and 3D coordinates or a list of molecules if nconfs > 1

Return type:

Molecule or list of PLAMS Molecules

from_smarts(smarts, nconfs=1, name=None, forcefield=None, rms=0.1)[source]

Generates PLAMS molecule(s) from a smarts strings. This allows for example to define hydrogens explicitly. However it is less suitable for aromatic molecules (use from_smiles in that case).

Parameters:
  • smarts (str) – A smarts string
  • nconfs (int) – Number of conformers to be generated
  • name (str) – A name for the molecule
  • forcefield (str) – Choose ‘uff’ or ‘mmff’ forcefield for geometry optimization and ranking of comformations. The default value None results in skipping of the geometry optimization step.
  • rms (float) – Root Mean Square deviation threshold for removing similar/equivalent conformations.
Returns:

A molecule with hydrogens and 3D coordinates or a list of molecules if nconfs > 1

Return type:

Molecule or list of PLAMS Molecules

get_conformations(mol, nconfs=1, name=None, forcefield=None, rms=-1, enforceChirality=False)[source]

Generates 3D conformation(s) for an rdkit_mol or a PLAMS Molecule

Parameters:
  • mol (rdkit.Chem.Mol or Molecule) – RDKit or PLAMS Molecule
  • nconfs (int) – Number of conformers to be generated
  • name (str) – A name for the molecule
  • forcefield (str) – Choose ‘uff’ or ‘mmff’ forcefield for geometry optimization and ranking of comformations. The default value None results in skipping of the geometry optimization step
  • rms (float) – Root Mean Square deviation threshold for removing similar/equivalent conformations.
  • enforceChirality (bool) – Enforce the correct chirality if chiral centers are present
Returns:

A molecule with hydrogens and 3D coordinates or a list of molecules if nconfs > 1

Return type:

Molecule or list of PLAMS Molecules

from_sequence(sequence, nconfs=1, name=None, forcefield=None, rms=0.1)[source]

Generates PLAMS molecule from a peptide sequence. Includes explicit hydrogens and 3D coordinates.

Parameters:
  • sequence (str) – A peptide sequence, e.g. ‘HAG’
  • nconfs (int) – Number of conformers to be generated
  • name (str) – A name for the molecule
  • forcefield (str) – Choose ‘uff’ or ‘mmff’ forcefield for geometry optimization and ranking of comformations. The default value None results in skipping of the geometry optimization step.
  • rms (float) – Root Mean Square deviation threshold for removing similar/equivalent conformations.
Returns:

A peptide molecule with hydrogens and 3D coordinates or a list of molecules if nconfs > 1

Return type:

Molecule or list of PLAMS Molecules

modify_atom(mol, idx, element)[source]

Change atom “idx” in molecule “mol” to “element” and add or remove hydrogens accordingly

Parameters:
  • mol (Molecule or rdkit.Chem.Mol) – molecule to be modified
  • idx (int) – index of the atom to be modified
  • element (str) –
Returns:

Molecule with new element and possibly added or removed hydrogens

Return type:

Molecule

apply_template(mol, template)[source]

Modifies bond orders in PLAMS molecule according template smiles structure.

Parameters:
  • mol (Molecule or rdkit.Chem.Mol) – molecule to be modified
  • template (str) – smiles string defining the correct chemical structure
Returns:

Molecule with correct chemical structure and provided 3D coordinates

Return type:

Molecule

apply_reaction_smarts(mol, reaction_smarts, complete=False, forcefield=None, return_rdmol=False)[source]

Applies reaction smirks and returns product. If returned as a PLAMS molecule, thismolecule.properties.orig_atoms is a list of indices of atoms that have not been changed (which can for example be used partially optimize new atoms only with the freeze keyword)

Parameters:
  • mol (Molecule or rdkit.Chem.Mol) – molecule to be modified
  • reactions_smarts (str) – Reactions smarts to be applied to molecule
  • complete (bool or float (value between 0 and 1)) – Apply reaction until no further changes occur or given fraction of reaction centers have been modified
  • forcefield (str) – Specify ‘uff’ or ‘mmff’ to apply forcefield based geometry optimization of product structures.
  • return_rdmol (bool) – return a RDKit molecule if true, otherwise a PLAMS molecule
Returns:

(product molecule, list of unchanged atoms)

Return type:

(Molecule, list of int)

readpdb(pdb_file, removeHs=False, proximityBonding=False, return_rdmol=False)[source]

Generate a molecule from a PDB file

Parameters:
  • pdb_file (str or file) – The PDB file to read
  • removeHs (bool) – Hydrogens are removed if True
  • proximityBonding (bool) – Enables automatic proximity bonding
  • return_rdmol (bool) – return a RDKit molecule if true, otherwise a PLAMS molecule
Returns:

The molecule

Return type:

Molecule or rdkit.Chem.Mol

writepdb(mol, pdb_file=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>)[source]

Write a PDB file from a molecule

Parameters:
  • mol (Molecule or rdkit.Chem.Mol) – molecule to be exported to PDB
  • pdb_file (str or file) – The PDB file to write to, or a filename
add_Hs(mol, forcefield=None, return_rdmol=False)[source]

Add hydrogens to protein molecules read from PDB. Makes sure that the hydrogens get the correct PDBResidue info.

Parameters:
  • mol (Molecule or rdkit.Chem.Mol) – Molecule to be protonated
  • forcefield (str) – Specify ‘uff’ or ‘mmff’ to apply forcefield based geometry optimization on new atoms.
  • return_rdmol (bool) – return a RDKit molecule if true, otherwise a PLAMS molecule
Returns:

A molecule with explicit hydrogens added

Return type:

Molecule or rdkit.Chem.Mol

partition_protein(mol, residue_bonds=None, split_heteroatoms=True, return_rdmol=False)[source]

Splits a protein molecule into capped amino acid fragments and caps.

Parameters:
  • mol (Molecule or rdkit.Chem.Mol) – A protein molecule
  • residue_bonds (tuple) – a tuple of pairs of residue number indicating which peptide bonds to split. If none, split all peptide bonds.
  • split_heteroatoms (bool) – if True, all bonds between a heteroatom and a non-heteroatom across residues are removed
Returns:

list of fragments, list of caps

get_backbone_atoms(mol)[source]

Return a list of atom indices corresponding to the backbone atoms in a peptide molecule. This function assumes PDB information in properties.pdb_info of each atom, which is the case if the molecule is generated with the “readpdb” or “from_sequence” functions.

Parameters:mol (Molecule or rdkit.Chem.Mol) – a peptide molecule
Returns:a list of atom indices
Return type:list
get_substructure(mol, func_list)[source]

Search for functional groups within a molecule based on a list of reference functional groups. SMILES strings, PLAMS and/or RDKit molecules can be used interchangeably in “func_list”.

Example:

>>> mol = from_smiles('OCCO')  # Ethylene glycol
>>> func_list = ['[H]O', 'C[N+]', 'O=PO']
>>> get_substructure(mol, func_list)

{'[H]O': [(<scm.plams.mol.atom.Atom at 0x125183518>,
           <scm.plams.mol.atom.Atom at 0x1251836a0>),
          (<scm.plams.mol.atom.Atom at 0x125183550>,
           <scm.plams.mol.atom.Atom at 0x125183240>)]}
Parameters:
  • mol (Molecule) – A PLAMS molecule.
  • func_list (list) – A list of functional groups. Functional groups can be represented by SMILES strings, PLAMS and/or RDKit molecules.
Returns:

A dictionary with functional groups from “func_list” as keys and a list of n-tuples with matching PLAMS Atom as values.