3.7.4. RDKit interface¶
RDKit is a collection of cheminformatics and machine-learning software written in C++ and Python. PLAMS interface to RDKit originates from QMFlows project and features functions for generating PLAMS molecules from string representations (SMARTS, SMILES) as well as a handful of tools for dealing with proteins and PDB files.
-
add_Hs
(mol, forcefield=None, return_rdmol=False)[source]¶ Add hydrogens to protein molecules read from PDB. Makes sure that the hydrogens get the correct PDBResidue info.
Parameters: Returns: A molecule with explicit hydrogens added
Return type: Molecule
or rdkit.Chem.Mol
-
apply_reaction_smarts
(mol, reaction_smarts, complete=False, forcefield=None, return_rdmol=False)[source]¶ Applies reaction smirks and returns product. If returned as a PLAMS molecule, thismolecule.properties.orig_atoms is a list of indices of atoms that have not been changed (which can for example be used partially optimize new atoms only with the freeze keyword)
Parameters: - mol (
Molecule
or rdkit.Chem.Mol) – molecule to be modified - reactions_smarts (str) – Reactions smarts to be applied to molecule
- complete (bool or float (value between 0 and 1)) – Apply reaction until no further changes occur or given fraction of reaction centers have been modified
- forcefield (str) – Specify ‘uff’ or ‘mmff’ to apply forcefield based geometry optimization of product structures.
- return_rdmol (bool) – return a RDKit molecule if true, otherwise a PLAMS molecule
Returns: (product molecule, list of unchanged atoms)
Return type: (
Molecule
, list of int)- mol (
-
apply_template
(mol, template)[source]¶ Modifies bond orders in PLAMS molecule according template smiles structure.
Parameters: Returns: Molecule with correct chemical structure and provided 3D coordinates
Return type:
-
get_backbone_atoms
(mol)[source]¶ Return a list of atom indices corresponding to the backbone atoms in a peptide molecule. This function assumes PDB information in properties.pdb_info of each atom, which is the case if the molecule is generated with the “readpdb” or “from_sequence” functions.
Parameters: mol ( Molecule
or rdkit.Chem.Mol) – a peptide moleculeReturns: a list of atom indices Return type: list
-
modify_atom
(mol, idx, element)[source]¶ Change atom “idx” in molecule “mol” to “element” and add or remove hydrogens accordingly
Parameters: Returns: Molecule with new element and possibly added or removed hydrogens
Return type:
-
to_rdmol
(plams_mol, sanitize=True, properties=True)[source]¶ Translate a PLAMS molecule into an RDKit molecule type. PLAMS
Molecule
,Atom
orBond
properties are pickled if they are neither booleans, floats, integers, floats nor strings, the resulting property names are appended with ‘_pickled’.Parameters: Returns: an RDKit molecule
Return type: rdkit.Chem.Mol
-
from_rdmol
(rdkit_mol, confid=-1, properties=True)[source]¶ Translate an RDKit molecule into a PLAMS molecule type. RDKit properties will be unpickled if their name ends with ‘_pickled’.
Parameters: Returns: a PLAMS molecule
Return type:
-
from_sequence
(sequence, nconfs=1, name=None, forcefield=None, rms=0.1)[source]¶ Generates PLAMS molecule from a peptide sequence. Includes explicit hydrogens and 3D coordinates.
Parameters: - sequence (str) – A peptide sequence, e.g. ‘HAG’
- nconfs (int) – Number of conformers to be generated
- name (str) – A name for the molecule
- forcefield (str) – Choose ‘uff’ or ‘mmff’ forcefield for geometry optimization and ranking of comformations. The default value None results in skipping of the geometry optimization step.
- rms (float) – Root Mean Square deviation threshold for removing similar/equivalent conformations.
Returns: A peptide molecule with hydrogens and 3D coordinates or a list of molecules if nconfs > 1
Return type: Molecule
or list of PLAMS Molecules
-
from_smiles
(smiles, nconfs=1, name=None, forcefield=None, rms=0.1)[source]¶ Generates PLAMS molecule(s) from a smiles strings.
Parameters: - smiles (str) – A smiles string
- nconfs (int) – Number of conformers to be generated
- name (str) – A name for the molecule
- forcefield (str) – Choose ‘uff’ or ‘mmff’ forcefield for geometry optimization and ranking of comformations. The default value None results in skipping of the geometry optimization step.
- rms (float) – Root Mean Square deviation threshold for removing similar/equivalent conformations
Returns: A molecule with hydrogens and 3D coordinates or a list of molecules if nconfs > 1
Return type: Molecule
or list of PLAMS Molecules
-
from_smarts
(smarts, nconfs=1, name=None, forcefield=None, rms=0.1)[source]¶ Generates PLAMS molecule(s) from a smarts strings. This allows for example to define hydrogens explicitly. However it is less suitable for aromatic molecules (use from_smiles in that case).
Parameters: - smarts (str) – A smarts string
- nconfs (int) – Number of conformers to be generated
- name (str) – A name for the molecule
- forcefield (str) – Choose ‘uff’ or ‘mmff’ forcefield for geometry optimization and ranking of comformations. The default value None results in skipping of the geometry optimization step.
- rms (float) – Root Mean Square deviation threshold for removing similar/equivalent conformations.
Returns: A molecule with hydrogens and 3D coordinates or a list of molecules if nconfs > 1
Return type: Molecule
or list of PLAMS Molecules
-
partition_protein
(mol, residue_bonds=None, split_heteroatoms=True, return_rdmol=False)[source]¶ Splits a protein molecule into capped amino acid fragments and caps.
Parameters: - mol (
Molecule
or rdkit.Chem.Mol) – A protein molecule - residue_bonds (tuple) – a tuple of pairs of residue number indicating which peptide bonds to split. If none, split all peptide bonds.
- split_heteroatoms (bool) – if True, all bonds between a heteroatom and a non-heteroatom across residues are removed
Returns: list of fragments, list of caps
- mol (