3. For developers

This section is meant for developers who want to extend, change, or fix the input definitions. I will describe the general flow of the automatic class generation and give general tips for adding new Block types, Key types, or new behavior.

3.1. Automatic class generation

See the following schematic for a general overview of the build process, or scroll past it for a more detailed text description.

[Figure: schematic of the input classes build process (_images/input_classes_build_process.svg)]

For every driver or engine in the AMS suite, there is a JSON file that defines the rules for valid text input following the input syntax. The syntax consists of keys and blocks. A key is a simple key-value pair, and a block is a named collection of keys and other blocks. The JSON definitions determine which blocks and keys are allowed in the text input, as well as the metadata for each block and key. The goal is to create a Python class system that allows for type hinting and autocompletion.

To achieve this, the input definition JSON files are used in the foray post_run step to instantiate scm.pisa.block.Block and scm.pisa.key.Key objects. These mostly behave as simple dataclass containers for the metadata defined in the JSON. Like Python dataclasses, the scm.pisa.block.Block class contains a __post_init__() method that is called after the __init__() method. This method recursively generates instance attributes for the keys and blocks defined inside the block. The Block and Key objects also contain functionality for changing the values of scm.pisa.key.Key objects and, finally, for generating valid text input.
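The recursive attribute generation can be illustrated with a minimal sketch. ToyKey, ToyBlock and the layout of the definition dictionary are hypothetical stand-ins invented for this illustration; they are not the real scm.pisa classes, and the dictionary does not reproduce the real JSON schema.

```python
# Toy stand-ins for illustration only; NOT the real scm.pisa classes.
class ToyKey:
    """Holds the metadata and the current value of a single key."""

    def __init__(self, name, default=None, comment=""):
        self.name = name
        self.val = default
        self.comment = comment


class ToyBlock:
    """Creates one attribute per key and per nested block in its definition."""

    def __init__(self, name, definition):
        self.name = name
        self._definition = definition
        # Mimic the dataclass convention: __post_init__ runs after __init__.
        self.__post_init__()

    def __post_init__(self):
        for key_name, meta in self._definition.get("keys", {}).items():
            setattr(self, key_name, ToyKey(key_name, **meta))
        # Nested blocks are built the same way, which makes this recursive.
        for block_name, sub in self._definition.get("blocks", {}).items():
            setattr(self, block_name, ToyBlock(block_name, sub))


# Assumed, simplified definition layout (not the real JSON schema).
definition = {
    "keys": {"Distance": {"default": -1.0, "comment": "Negative means automatic."}},
    "blocks": {"Term": {"keys": {"Factor": {"default": 1.0}}}},
}

root = ToyBlock("Capping", definition)
print(root.Distance.val, root.Term.Factor.val)
```

Because the attributes in this sketch only exist after the object has been created, an editor or static type checker cannot see them, which is the limitation that the generated source code addresses.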

To allow for static type checking and for autocompletion without a live Python kernel, this information cannot simply live in memory but must be written out as Python source code. The scm.pisa.block_subclass_string.BlockSubclassString class is capable of writing valid Python source code that defines a class in which the __post_init__() method is overridden to explicitly define all the keys and blocks inside the block. See the following snippet of autogenerated code:

from typing import Literal
from scm.pisa.block import EngineBlock, FixedBlock
from scm.pisa.key import BoolKey, FloatKey, MultipleChoiceKey, StringKey, BoolType

class HYBRID(EngineBlock):
    class __Capping(FixedBlock):
        '''This block is about capping details. Capping occurs with hydrogen atoms when a bond is broken between an atom inside the region and one outside.'''
        def __post_init__(self):
            self.AllowHighBondOrders: BoolType | BoolKey = BoolKey(name='AllowHighBondOrders', comment='Allows capping of interregional aromatic, double and triple bonds. This is normally not a good idea, since the capping is done with hydrogen atoms.', default=False)
            self.AtomicInfoForCappingAtom: str | StringKey = StringKey(name='AtomicInfoForCappingAtom', comment='The AtomicInfo for the capping atoms. Typically a string like ForceField.Type=X much like forcefield info is entered in the System block for normal atoms.', default='ForceField.Type=H_ ForceField.Charge=0.0')
            self.CappingElement: str | StringKey = StringKey(name='CappingElement', comment='The element to be used for capping. The hydrogen atom has the advantage that it is very small.', default='H')
            self.CheckCapping: BoolType | BoolKey = BoolKey(name='CheckCapping', comment='The same outside atom can be involved in multiple capping coordinate definitions. This is not a good idea, and this will not be accepted by using this check.', default=True)
            self.Distance: float | FloatKey = FloatKey(name='Distance', comment='A negative value means automatic. In that case the sum of covalent radii is used', default=-1.0)
            self.Option: Literal["Fractional", "Fixed"] = MultipleChoiceKey(name='Option', comment='The capping atom is always along the broken bond vector.\n\nThe bond distance between the capping atom and the two atoms are obtained from covalent radii, let us call them D1H and D2H.\n\nWith option=Fractional the capping is on the bond vector with the fraction D1H/(D1H+D2H).\n\nWith the Fixed option it at the distance D1H from atom 1. A distance of zero always means the coordinate of the inside atom.', gui_name='Capping option:', default='Fixed', choices=['Fractional', 'Fixed'])
    class __Energy(FixedBlock):
        '''This block is there to construct the energy.'''
        class __Term(FixedBlock):
            '''This block is there to construct the energy term. Can have multiple occurrences'''
            def __post_init__(self):
                self.Charge: float | FloatKey = FloatKey(name='Charge', comment='Net charge to be used for this energy term.', default=0.0)
                self.EngineID: str | StringKey = StringKey(name='EngineID', comment='Identifier for the engine')
                self.Factor: float | FloatKey = FloatKey(name='Factor', default=1.0)
                self.Region: str | StringKey = StringKey(name='Region', comment='Identifier for the region', gui_type='region')
                self.UseCappingAtoms: BoolType | BoolKey = BoolKey(name='UseCappingAtoms', comment='Whether to use capping for broken bonds', default=True)
        def __post_init__(self):
            self.DynamicFactors: Literal["Default", "UseLowestEnergy", "UseHighestEnergy"] = MultipleChoiceKey(name='DynamicFactors', comment='Default - use factors as set in the corresponding Term blocks; \nUseLowestEnergy - set all factors to 0 except for that of the engine with the lowest energy, which is set to 1; \nUseHighestEnergy - set all factors to 0 except for that of the engine with the highest energy, which is set to 1. \nThe last two options make sense only for non-QMMM hybrid calculation (that is, if the QMMM block is not present) and only when using engines whose energies can be compared directly.', gui_name='Adjust factors:', default='Default', choices=['Default', 'UseLowestEnergy', 'UseHighestEnergy'])
            self.Term = self.__Term(name='Term', comment='This block is there to construct the energy term. Can have multiple occurrences', unique=False)
    class __Engine(EngineBlock):
        '''The input for the computational (sub) engine. The header of the block determines the type of the engine. An optional second word in the header serves as the EngineID, if not present it defaults to the engine name. Currently it is not allowed to have a Hybrid engine as a sub engine.'''
        def __post_init__(self):
            pass
    class __QMMM(FixedBlock):
        '''This block is there to identify the QMMM engines.'''
        def __post_init__(self):
            self.Embedding: Literal["Mechanical", "Electrostatic"] = MultipleChoiceKey(name='Embedding', comment='Determines how the QM region is embedded into the MM region.\n\nMechanical embedding embedding can also be achieved using the Energy%Terms keywords, but the common case of a two region mechanical QM/MM embedding is easier to set up using this keyword.', default='Electrostatic', choices=['Mechanical', 'Electrostatic'])
            self.MMCharge: float | FloatKey = FloatKey(name='MMCharge', comment='Net charge to be used for the MM region.', default=0.0)
            self.MMEngineID: str | StringKey = StringKey(name='MMEngineID', comment='Identifier for the MM engine')
            self.MMEngineIsPolarizable: BoolType | BoolKey = BoolKey(name='MMEngineIsPolarizable', comment='Whether or not the MM engine has dynamic charges (for now not supported at all).', hidden=True, default=False)
            self.QMCharge: float | FloatKey = FloatKey(name='QMCharge', comment='Net charge to be used for the QM region.', default=0.0)
            self.QMEngineID: str | StringKey = StringKey(name='QMEngineID', comment='Identifier for the QM engine')
            self.QMRegion: str | StringKey = StringKey(name='QMRegion', comment='Identifier for the QM region. The rest of the system is considered the MM region.', gui_type='region')
            self.UseCappingAtoms: BoolType | BoolKey = BoolKey(name='UseCappingAtoms', comment='Whether to use capping for broken bonds.', default=True)
    def __post_init__(self):
        self.AllowSanityCheckWarnings: BoolType | BoolKey = BoolKey(name='AllowSanityCheckWarnings', comment='Sanity checks will be performed on the setup. If this option is on, only warnings are printed. If not the program will stop on warnings.', default=False)
        self.GuessAttributesOnce: BoolType | BoolKey = BoolKey(name='GuessAttributesOnce', comment='If any ForceField subengines are defined, and if automatic atom typing is possible, then the atom typing is done at the level of the Hybrid engine, and not of the ForceField subengines. This ensures that the same atom types and charges are used in each subsystem, so that pair energy terms that should cancel, will cancel. If set to False, then for each energy term the atom types and charges of the subsystem will be determined separately.', default=True)
        self.RestartSubEngines: BoolType | BoolKey = BoolKey(name='RestartSubEngines', comment='Save all the results of the subengines and pass those in a next geometry step or MD step.', default=True)
        self.TweakRequestForSubEngines: BoolType | BoolKey = BoolKey(name='TweakRequestForSubEngines', comment='Only request what is really needed, gradients and charges.', default=True)
        self.Capping = self.__Capping(name='Capping', comment='This block is about capping details. Capping occurs with hydrogen atoms when a bond is broken between an atom inside the region and one outside.')
        self.Energy = self.__Energy(name='Energy', comment='This block is there to construct the energy.')
        self.Engine: EngineBlock = self.__Engine(name='Engine', comment='The input for the computational (sub) engine. The header of the block determines the type of the engine. An optional second word in the header serves as the EngineID, if not present it defaults to the engine name. Currently it is not allowed to have a Hybrid engine as a sub engine.', unique=False, header=True)
        self.QMMM = self.__QMMM(name='QMMM', comment='This block is there to identify the QMMM engines.')

As you can see, all the actual functionality lives in the base classes defined in scm.pisa.block, and the autogenerated code only defines which blocks and keys exist in a driver block or engine block. All the blocks exist as nested class definitions, since block names are not guaranteed to be unique across the drivers and engines.
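To make the idea of writing class definitions out as source code concrete, here is a minimal sketch of a generator that emits nested class definitions with an explicit __post_init__() method. The block_source function, the ToyKey and ToyBlock classes and the definition layout are all hypothetical; this is not how BlockSubclassString is implemented, only an illustration of the general technique.

```python
import textwrap


# Toy base classes for illustration only; NOT the real scm.pisa classes.
class ToyKey:
    def __init__(self, name, default=None):
        self.name = name
        self.val = default


class ToyBlock:
    def __init__(self, name):
        self.name = name
        self.__post_init__()

    def __post_init__(self):
        pass


def block_source(class_name, definition, base="ToyBlock"):
    """Return Python source for a subclass with explicitly listed attributes."""
    lines = [f"class {class_name}({base}):"]
    # Nested blocks become nested classes, so equal names cannot clash globally.
    for sub_name, sub_def in definition.get("blocks", {}).items():
        nested = block_source(f"__{sub_name}", sub_def, base)
        lines.append(textwrap.indent(nested, "    "))
    lines.append("    def __post_init__(self):")
    body = []
    for key_name, default in definition.get("keys", {}).items():
        body.append(f"self.{key_name} = ToyKey(name={key_name!r}, default={default!r})")
    for sub_name in definition.get("blocks", {}):
        body.append(f"self.{sub_name} = self.__{sub_name}(name={sub_name!r})")
    if not body:
        body = ["pass"]
    lines.extend("        " + line for line in body)
    return "\n".join(lines)


# Assumed, simplified definition layout (not the real JSON schema).
definition = {"keys": {"Factor": 1.0}, "blocks": {"Term": {"keys": {"Charge": 0.0}}}}
source = block_source("Energy", definition)
print(source)

# Execute the generated source to check that it is valid Python.
namespace = {"ToyBlock": ToyBlock, "ToyKey": ToyKey}
exec(source, namespace)
energy = namespace["Energy"](name="Energy")
```

In this sketch the nested class is named with two leading underscores, as in the autogenerated snippet. Python's name mangling rewrites both the nested class name and the self.__Term reference inside the enclosing class to the same private attribute, so the lookup resolves correctly.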

3.2. Type hinting

To allow users to write some_block.some_key = True instead of some_block.some_key.val = True without the type checker complaining, every key attribute has a Union type hint that combines its actual key type with the type of the value it accepts. The __setattr__() method of scm.pisa.block.Block is overridden to make this assignment work at runtime. The Engine attribute of all driver blocks is special in the sense that it has the general type hint scm.pisa.block.EngineBlock, which accepts all existing engine blocks.
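A minimal sketch of how such a __setattr__() override could behave is given here. ToyKey and ToyBlock are hypothetical stand-ins, and the real scm.pisa.block.Block implementation may differ in its details.

```python
from typing import Union


# Toy stand-ins for illustration only; NOT the real scm.pisa classes.
class ToyKey:
    def __init__(self, name, default=None):
        self.name = name
        self.val = default


class ToyBlock:
    def __init__(self):
        # Union hint: a type checker accepts both a plain bool and a ToyKey.
        self.some_key: Union[bool, ToyKey] = ToyKey("some_key", default=False)

    def __setattr__(self, name, value):
        current = self.__dict__.get(name)
        if isinstance(current, ToyKey) and not isinstance(value, ToyKey):
            # A plain value assigned to an existing key is stored in the key,
            # so the ToyKey object itself is never replaced.
            current.val = value
        else:
            super().__setattr__(name, value)


block = ToyBlock()
block.some_key = True          # shorthand for block.some_key.val = True
print(type(block.some_key).__name__, block.some_key.val)
```

The attribute remains a key object after the assignment, so its metadata stays available while the user gets the shorter syntax.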

There is no way to create a type hint that warns users against setting IntKey or FloatKey attributes with boolean values, since Python considers bool a direct subclass of int. The type checker will therefore not warn users who write some_block.some_int_key = True, but they will encounter an exception at runtime.
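The following sketch shows why a static check cannot catch this and what an explicit runtime check could look like. ToyIntKey and its set() method are hypothetical, and the choice of TypeError is an assumption; the exception raised by the real IntKey is not specified here.

```python
# bool is a subclass of int, so any hint that accepts int also accepts bool.
print(issubclass(bool, int))


# Toy stand-in for illustration only; NOT the real scm.pisa.key.IntKey.
class ToyIntKey:
    def __init__(self, name, default=0):
        self.name = name
        self.val = default

    def set(self, value):
        # bool passes isinstance(value, int), so it is rejected explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError(f"{self.name} expects an int, got {type(value).__name__}")
        self.val = value


key = ToyIntKey("some_int_key")
key.set(3)

try:
    key.set(True)
    rejected = False
except TypeError:
    rejected = True
```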