API¶
Classes¶
- class scm.reactmap.Molecule(atoms=None, bonds=None, smiles=None, graph=None, plams_mol=None, active_atoms=None, active_bonds=None)¶
Bases:
objectThe main class for inputing/using reactants and products.
There are four possible ways of initializing the molecule:
by explicitly specifying
atomsandbondsusing a SMILES string
using a networkx graph
using a PLAMS molecule
- Keyword Arguments:
atoms (
list(str)) – a list of atomic symbols. The order in the list corresponds to the index and will be used with thebondsargument.bonds (
list(tuple)) – a list of pairs (e1,e2) where e1 and e2 correspond to atom indices (integers) between which there is a bondsmiles (
str) – a SMILES string for inputting molecular structure(s)graph (
networkx.graph) – a networkx graph representing the molecular structure(s)plams_mol (
scm.plams.Molecule) – a PLAMS molecule. If bonds are not specified in the plams molecule, they will be automatically guessed.active_atoms (
list(int)) – a subset of the atom indices which are allowed to be mapped to an atom with a different index. Atoms outside this set will automatically be mapped using the identity function on the indices (e.g., if atom 5 is not in the active set, atom 5 in the reactants will be mapped to atom 5 in the products).active_bonds (
list(int)) – a subset of the bonds that are allowed to be broken (reactant) or formed (product) in a reaction. Bonds outside this set will be preserved but not necessarily mapped to atoms with the same indices.
Examples
Let’s input acetone first using the
atomsandbondsarguments:import scm.reactmap as rm reactant = rm.Molecule(atoms=['C','C','C','O','H','H','H','H','H','H'],bonds=[(0,1),(1,2),(1,3),(0,4),(0,5),(0,6),(3,7),(3,8),(3,9)])
And now as a SMILES sting using the
smileskeyword argument:reactant = rm.Molecule(smiles="CC(=O)C")- property H2_count¶
The number of H2 molecules in this
Moleculeobject.
- property active_atoms¶
A list of the indices of the active atoms for the
Moleculeinstance
- property active_bonds¶
A list of the active bonds for the
Moleculeinstance
- property assignable_atoms¶
Atoms which can be assigned to a different (non-identity) index. This also accounts for implicit H’s
- get_atom_implicitH(a: int)¶
This function returns the number of implicit Hydrogens for the atom with index a. If implicit Hydrogens are not being used, this function returns 0.
- get_graph(implicitH: bool | None = None)¶
The networkx graph representation of the chemical structure(s).
Note
The graph instance will only be calculated a single time as it is assumed that the reactant/product molecules will not change after input. This prevents redundant computations if the graph is needed in multiple places.
- Keyword Arguments:
implicitH (
bool) – whether or not to return the graph with implicit Hydrogens- Returns:
a networkx.graph instance
- property implicitH_valid¶
True if the
Moleculeobject can be used with theimplicitHdescription of the system; Otherwise, False (e.g., if there are hypervalent Hydrogens).- Type:
(bool)
- is_valid¶
a boolean denoting if input molecule is valid and can be used in reaction mapping; this may be False if the SMILES string is invalid, if one but not both of
atomsandbondsis provided, or if something is incorrect about the active atoms/bonds input.
- class scm.reactmap.Reaction(reactant, product, name=None)¶
Bases:
objectThe main class for inputting (reactant, product) pairs to specify a reaction and investigating the results of a mapping.
The Reaction class is intended as the minimal workspace for users of
reactmap. The Reaction class contains the reactant and product molecules and, once the problem has been solved, the optimal mapping and its associated cost.- Keyword Arguments:
reactant (
reactmap.Molecule) – The reactant in the reactionproduct (
reactmap.Molecule) – The product in the reactionname (
str) – A name for the reaction. If no name is supplied, a generic reaction name is generated.
Example
Let’s find the (hypothetical) mapping between CCCCO and CCOCC using the Reaction class to input the molecules. Then, we’ll print the mapping using the value stored in the Reaction instance.:
>>> import reactmap as rm >>> reac = rm.Molecule(smiles='CCCCO') >>> prod = rm.Molecule(smiles='CCOCC') >>> reaction = rm.Reaction(reactant = reac, product = prod) >>> rm.Map(reaction) >>> print(reaction.mapping) [4, 3, 0, 1, 2, 12, 13, 14, 10, 11, 5, 6, 8, 9, 7]
- property bipartite_graph¶
A networkx bipartite graph between the reactant and product atoms.
- property bond_breakages¶
The bonds that are removed from the reactant molecule in the optimal mapping.
- property bond_formations¶
The bonds that are formed in the product molecule in the optimal mapping.
- found_solution¶
Whether a solution has been found.
- Type:
(Bool)
- property implicitH¶
True/False if this reaction is treating the reactants/products using implicit H’s
- Type:
(Bool)
- invert_solution()¶
When a mapping has been calculated from reactant to product. Return an inverted mapping from product to reactant.
- is_feasible() bool¶
the reaction mapping problem has a solution if and only if the size of atom type sets is the same for reactant and product i.e., if we have 2 Oxygen atoms and 4 Carbon atoms for the reactants and 1 oxygen and 5 carbons for the products, then there is no feasible mapping
- output_image(filename)¶
Generates 2
.svgfiles showing an atom mapping between the reactants and the products.- Parameters:
filename (
str) – the base name for the two output.svgfiles
The two output files will be called
<filename>_reactants.svgand<filename>_products.svg. The reactants will show broken bonds in red and the products will show formed bonds in green. The atom indices in both images refer to the reactant indices. This means that the reactant indices correspond to the same indices as were given in the input. The product indices, however, correspond to the index of the reactant atom that was mapped to that product atom. As a concrete example, atom 15 in the reactants image always maps to atom 15 in the products image. Atom 15 in the reactants is also atom 15 in the original reactants input, but atom 15 in the products images may refer to atom 17 (for example) in the original product input.
- prod¶
The product molecule
- Type:
- reac¶
The reactant molecule
- Type:
- property reaction_parity¶
Either 0 or 1, The parity of the feasible solutions. The cost of the solution will always have the same parity as all feasible mapping costs differ by 2.
- Type:
(Int)
- show_results(initial_index=0)¶
Displays the results of the reaction mapping in a human-readable way.
- Keyword Arguments:
initial_index (
int) – The index of the first atom. You may want to set this to 1 if you prefer the first atom to have index ‘1’.
The output shows several pieces of information.
Problem details
Number of atoms
Atom counts
Number of reactant/product bonds
Atom mapping details
Reactant index - the index of the reactant atom
Product index - the index of the product atom mapped to by the corresponding reactant atom
Atom type - the chemical symbol of the atom
Atom Status - any of the following
A - an active atom
I - an inactive atom
D - a (duplicate) atom that was filtered out of the reaction (and not mapped) because it was part of a molecule that occurred in both reactants and products
Bond change details - a list of bonds that are broken and formed given the atom mapping
Solution details
Time to find the solution
The cost of the mapping
- property top_nodes¶
A list of the reactant atoms to distinguish the reactant and product atoms.
- valid_reaction¶
Whether the reaction is valid.
- Type:
(Bool)
- class scm.reactmap.Map(reaction, settings=None, mapping=None)¶
Bases:
objectThe class that solves the mapping problem on a Reaction instance.
This is the main class that handles feasibility checking, pre-processing, running heuristics, and setting up and running mixed-integer linear programming (MILP) problems to solve the reaction mapping problem. This class interfaces with Reaction instances and writes relevant mapping data to those instances. Optionally, a Settings object can be provided to specify problem settings.
- Keyword Arguments:
reaction (
reactmap.Reaction) – the reaction class that contains the reactant(s) and product(s). This can be a single Reaction object or a list of Reaction objects.settings (
reactmap.Settings) – the Settings class that contains user-specified problem options. If no Settings class is supplied, default values for all settings will be used.
Example
Let’s calculate the mapping cost between CCCCO and CCOCC:
>>> import scm.reactmap as rm >>> reac = rm.Molecule(smiles='CCCCO') >>> prod = rm.Molecule(smiles='CCOCC') >>> reaction = rm.Reaction(reactant = reac, product = prod) >>> rm.Map(reaction) >>> print(reaction.cost) 4
We can also use a Settings object to turn off heuristics:
>>> settings = rm.Settings(use_heuristics = False) >>> rm.Map(reaction, settings = settings) >>> print(reaction.cost) 4
- class scm.reactmap.Settings(**kwargs)¶
Bases:
objectThis is the main class for inputting settings for solving the problem(s).
Table 1 General problem solving options¶ Keyword
Options/Description
implicitH
True (default) - Treat all hydrogens as implicit. This greatly reduces the dimension of the optimization problem and can lead to significant performance gains. Note that these constraints assume all Hydrogens are bound to at most one other atom.
False - Do not use an implicit Hydrogen representation of the problem.
formulation
- These options allow the user to choose between MIP formulations.
‘bondmapping’ - use a larger MIP formulation that explicitly maps all preserved bonds.
‘bondchange’ - (default) - use a smaller MIP formulation that implicitly maps all preserved bonds. This is typically the superior formulation.
cbc_path
A path to the cbc executable if it is not automatically detected by the system (e.g., it is not in the $PATH variable)
warmstart
True - Uses the mapping in the reaction class (either given by the user, or from the heuristics) as an initial guess.
False (default) - Do not use the mappings as a starting guess.
preprocess
True (default) - Attempt to remove redundancies (e.g. identical molecular structures in both reactants and products) to speed up the problem.
False - Do not attempt to simplify the problem.
preprocess_map
True - Also solve the mappings for molecules that were filtered out by pre-processing (these maps should be obvious between identical molecular structures, but calculating them requires more computational effort)
False (default) - Ignore the mappings for filtered structures
time_limit
The total time the MILP solver is allowed to run. Default is no time limit.
Table 2 Heuristics options¶ Keyword
Options/Description
use_heuristics
True (default) - Attempt to solve the problem using inexpensive heuristics rather than the MIP.
False - Directly solve the MIP.
lb_bonds
True (default) - Calculate an extremely cheap but often loose lower bound based on the minimum number of required bond changes.
False - Do not calculate this lower bound.
lb_assignment
True (default) - Calculate a cheap and relatively tight lower bound based on solving an assignment problem with local valence difference costs.
False - Do not calculate this lower bound.
lb_branchtight
True - Calculate an expensive lower bound using an iterative procedure.
False (default) - Do not calculate this lower bound.
ub_assignment
True (default) - Calculate a cheap and often good upper bound using an assignment problem with costs based on local neighborhood difference between reactant and product atoms.
False - Do not calculate this upper bound.
ub_greedy
True - Use an expensive, greedy algorithm to try to find an initial atom mapping.
False (default) - Do not calculate this upper bound.
ub_qap
True - Solve the quadratic assignment problem to local optimality to find an upper bound for the reaction mapping problem. This can be expensive but can often produce a tight upper bound.
False (default) - Do not calculate this upper bound.
Table 3 MIP Constraint Options¶ Keyword
Options/Description
con_valance
True (default) - Use valence constraints. These constraints involve limiting the possible number of mapped bonds based on the valences of mapped atoms (e.g., if we map a carbon with 2 oxygen neighbors to a carbon with only 1 oxygen neighbor, we know that at most one of the C-O bonds can be preserved.)
False - Do not use valence constraints.
con_tightbonds
True - Use tightbonds constraints. These constraints tighten the model by taking into account that a certain bond can only be mapped once. These constraints are typically expensive and usually do not improve performance.
False (default) - Do not use these constraints.
con_symmetry
True - Take problem symmetries into account to reduce the number of redundant solutions. For example, in the CO2 molecule, the mappings involving the oxygens are interchangeable. These constraints typically do not improve performance.
False (default) - Do not use these constraints.
con_symmetry_form
- These options are only used if
con_symmetryis True 17 - Use a sparse symmetry-breaking constraint.
18 - Use a medium-sparse symmetry-breaking constraint.
19 (default) - Use a dense symmetry-breaking constraint.
con_symmetry_version
- These options are only used if
con_symmetryis True ‘cycle’ - consider cyclic symmetries
‘node’ - consider nodal symmetries
‘nodecycle’ (default) - consider both nodal and cyclic symmetries
con_symmetry_side
- These options are only used if
con_symmetryis True ‘reactant’ - use symmetries in the reactant molecule(s) to make constraints
‘product’ - use the symmetries in the product molecule(s)
‘reactantproduct’ (default) - use the symmetries in both reactants and products
Table 4 Output Options¶ Keyword
Options/Description
print_progress
True (default) - In solving a large problem set of reactions, write a one-line summary to stdout after each reaction map is solved.
False - Do not write anything to stdout.
Example
We can set options in two ways. First by initializing the settings class, and by changing them directly. We can see the set options with the show_option attribute. Here one can see that the implictH and formulation options have been changed:
>>> import reactmap as rm >>> settings = rm.Settings(implicitH = False) >>> settings.formulation = 'bondmapping' >>> settings.show_options() ────────────────────────────────────────────────────── RM Settings ────────────────────────────────────────────────────── Key Value Default ────────────────────────────────────────────────────── print_level status status log_level info info print_progress False False cbc_path None None warmstart False False preprocess True True preprocess_map False False use_heuristics True True lb_bonds False False lb_assignment True True lb_branchtight False False ub_assignment True True ub_qap False False ub_greedy False False implicitH False True con_valence True True con_tightbonds False False formulation bondmapping bondchange con_symmetry False False con_symmetry_version cycle cycle con_symmetry_form 19 19 con_symmetry_side reactantproduct reactantproduct write_log False False time_limit None None ──────────────────────────────────────────────────────
- __str__()¶
‘ Print a table of all settings, the current value and the default setting.
- restore_defaults()¶
Resets all options to the default values.
- show_options()¶
This displays all settings options as well as their current and default values.
- class scm.reactmap.RMLogger(print_level='error', log_level='info', write_log=False, log_name='RM_Log')¶
Bases:
objectThis is the logger class for reactmap. This class handles all output from other reactmap components. An instance of this class is available at the package level (as
reactmap.logger) and is used throughout the solution process.Table 5 Useful Output and Logging Options¶ Keyword
Options/Description
print_level
- These options indicate the level of information to write to stdout.
‘info’ - output every step of the solution process
‘status’ (default) - output a summary of the solution process
‘warning’ - only output warnings
‘error’ - only output errors
‘critical’ - only output critical errors
‘none’ - do not write to stdout
log_level
- These options indicate the level of information to write to the logfile.
write_logmust be True for these options to take effect. ‘info’ - output every step of the solution process
‘status’ (default) - output a summary of the solution process
‘warning’ - only output warnings
‘error’ - only output errors
‘critical’ - only output critical errors
‘none’ - do not write to log
write_log
True - Write a log file
False (default) - Do not write a log file
- add_to_log(message)¶
Adds a message to the log. This function also checks if the internal log string becomes to large, and will flush the log to disk.
- Keyword Arguments:
message (
str) – The message to add to the log.
- conclude_mapping(reaction)¶
When a mapping is finished. Write a small concluding message about the reaction results.
- Keyword Arguments:
reaction (
reactmap.Reaction) – The reaction that will is about to be solved.
- critical(message)¶
Create a message of level critical.
- Keyword Arguments:
message (
str) – The message to be logged and/or printed.
- error(message)¶
Create a message of level error.
- Keyword Arguments:
message (
str) – The message to be logged and/or printed.
- info(message)¶
Create a message of level info.
- Keyword Arguments:
message (
str) – The message to be logged and/or printed.
- new_reaction(reaction)¶
When starting to solve a new problem, this function will show some information about the problem, like the name of the reaction, the amount of atoms and number of bonds for both the reactant and product molecules.
- Keyword Arguments:
reaction (
reactmap.Reaction) – The reaction that will is about to be solved.
- open_new_log_file()¶
Open a log file named after the set log name.
- print_message(message)¶
Prints a message to the command line.
- Keyword Arguments:
message (
str) – The message to be printed.
- set_log_level(output)¶
- Set the log level that messages need to exceed in order to be saved to the log. The options are:
info : Show all messages. Also the least important messages as trying individual heuristics.
status : Show mayor points in the solving process, like trying heuristics or starting with solving the MILP model.
warning : Show warnings. For example when certain settings cannot be used and are thus turned off.
error : Show errors.
critical : Shows critical errors that force the solver to quit with this problem.
none : Show no messages.
- set_print_level(output)¶
- Set the print level that messages need to exceed in order to be printed in the command line. The options are:
info : Show all messages. Also the least important messages as trying individual heuristics.
status : Show mayor points in the solving process, like trying heuristics or starting with solving the MILP model.
warning : Show warnings. For example when certain settings cannot be used and are thus turned off.
error : Show errors.
critical : Shows critical errors that force the solver to quit with this problem.
none : Show no messages.
- start_new_log_file(log_name='RM_Log')¶
Start a new log file. If a previous log is still open, it will be closed.
- Keyword Arguments:
log_name (
str) – The name of the file. The default is ‘RM_Log’.
- status(message)¶
Create a message of level status.
- Keyword Arguments:
message (
str) – The message to be logged and/or printed.
- warning(message)¶
Create a message of level warning.
- Keyword Arguments:
message (
str) – The message to be logged and/or printed.
- write_data()¶
Flushes the current log to disk.
- property write_log¶
True/False if a log file should be written (defaults to False)
Submodules¶
heuristics is a submodule for using heuristic techniques to generate upper and lower bounds for reaction mapping problems. The HeuristicManager class interfaces with all heuristic techniques and manages the best values for upper and lower bounds to the problem. The HeuristicManager class should be sufficient for most users to try different heuristics.
- Classes:
HeuristicsManager
- Submodules:
lower_bound
upper_bound
- class scm.reactmap.heuristics.HeuristicsManager(settings)¶
Bases:
objectThe HeuristicsManager class is responsible for executing the heuristics (upper and lower bounds) on a reaction problem. The class will automatically be called by the Map class during a normal
reactmaprun. Advanced users may sometimes find it useful to work with instances of the HeuristicsManager class, but this will likely be for finding approximate solutions to very difficult problems or other advanced case studies.- Keyword Arguments:
reaction (
reactmap.Reaction) – The reaction problem for which to find the upper and lower bounds.settings (
reactmap.Settings) – The settings class that details which heuristics to try.
Example
We will use the HeuristicsManager class to find an upper bound (a feasible solution) to the reaction mapping problem between CCCCO and CCOCC. Here we find that the default upper bound options are sufficient in finding a tight upper bound.
>>> import reactmap as rm >>> reac = rm.Molecule(smiles='CCCCO') >>> prod = rm.Molecule(smiles='CCOCC') >>> reaction = rm.Reaction(reactant = reac, product = prod) >>> heurManager = rm.heuristics.HeuristicsManager(rm.Settings(ub_greedy = True)) >>> heurManager.run(reaction) >>> print(reaction.ub) 4 >>> rm.Map(reaction) >>> print(reaction.cost) 4
- run(reaction)¶
- try_lb(f, heur_name)¶
This function tries a lower bound. If a better lower bound was found it updates the appropriate fields in the reaction class.
- Keyword Arguments:
f (
func) – The function handle for a lower bound heuristicheur_name (
str) – The settings name of the lower bound. This is required for logging.
- try_ub(f, heur_name)¶
This function tries a upper bound. If a better upper bound was found it updates the appropriate fields in the reaction class.
- Keyword Arguments:
f (
func) – The function handle for a upper bound heuristicheur_name (
str) – The settings name of the upper bound. This is required for logging.