8.1.3. ResultsImporters API

class ResultsImporter(job_collection=None, data_set=None, settings=None)

__init__(job_collection=None, data_set=None, settings=None)

Class for constructing job_collection.yaml, data_set.yaml, and job_collection_engines.yaml files

job_collection: JobCollection or str or None

If None, a new JobCollection will be created

If a str, will be read from that file.

data_set: DataSet, str, dict of DataSets, or None

If None, a new DataSet will be created

If a str, will be read from that file.

If a dict that has no ‘training_set’ key, that key will be created with a new DataSet.

settings: str, Settings, or dict

Some settings affecting the way the results importer works.

If a string is given, it should be the path to a results_importer_settings.yaml file. The settings are then loaded from that file. Otherwise, the settings are read from the Settings or dict.

If a setting is not set, it will take some default value.

‘remove_bonds’: bool (default True). Whether to delete the bonds from the jobs in the job collection. The bonds are not needed for ReaxFF or DFTB parametrization.
‘trim_settings’ : bool (default True). If the reference job is a GeometryOptimization but the newly added job is a SinglePoint, then remove the GeometryOptimization settings block from the settings. Similarly remove unused MolecularDynamics/PESScan blocks.
‘default_go_settings’ : Settings() containing default settings in the GeometryOptimization block of the AMS input. E.g. {‘MaxIterations’: 50, ‘PretendConverged’: True}
‘units’: Settings() containing the preferred units for different extractors. Units can be specified either with a known string or with a 2-tuple (string, float). For example {'energy': 'eV', 'forces': ('custom_unit', 12.34)}. The unit for ‘energy’ is also used for relative_energies. The known units are given in the PLAMS documentation.
‘allow_cache’: bool (default True). Whether to look up in the cache if a job has already been added to the job collection before adding another one.
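The two ways of specifying a unit can be illustrated with a small, self-contained sketch. It is not the ParAMS implementation: the helper name convert_from_au and the short conversion table are made up for illustration, and it assumes that the float in the 2-tuple is the conversion factor from atomic units.

```python
# Illustrative table of conversion factors from Hartree (atomic units).
# The complete list of known unit names is in the PLAMS documentation.
HARTREE_TO = {
    "hartree": 1.0,
    "eV": 27.211386245988,
    "kcal/mol": 627.5094740631,
}


def convert_from_au(value_au, unit):
    """Convert a value in atomic units.

    unit is either a known unit name (str) or a 2-tuple (name, factor),
    where factor is assumed to be the conversion factor from atomic units.
    Returns (converted_value, unit_name).
    """
    if isinstance(unit, tuple):
        name, factor = unit
        return value_au * float(factor), name
    return value_au * HARTREE_TO[unit], unit


print(convert_from_au(0.5, "eV"))
print(convert_from_au(0.5, ("custom_unit", 12.34)))
```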

results_importer_settings = Settings()
results_importer_settings.trim_settings = True
results_importer_settings.default_go_settings.MaxIterations = 50
results_importer_settings.units.energy = 'kcal/mol'
results_importer_settings.remove_bonds = False

sc = ResultsImporter(settings=results_importer_settings)
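The rule that unset settings fall back to defaults can be pictured with plain dictionaries. This is only a sketch: the function with_defaults is hypothetical, and the DEFAULTS table lists only the three boolean defaults stated above.

```python
# Defaults quoted in the settings list above; other settings are omitted here.
DEFAULTS = {"remove_bonds": True, "trim_settings": True, "allow_cache": True}


def with_defaults(user_settings):
    """Return the effective settings: defaults first, user values on top."""
    merged = dict(DEFAULTS)
    merged.update(user_settings)
    return merged


effective = with_defaults({"remove_bonds": False, "units": {"energy": "kcal/mol"}})
print(effective)
```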

classmethod from_yaml(folder: str): Initialize the ResultsImporter from a folder with .yaml files

classmethod from_params_results(results: ParAMSResults, settings: Settings | None = None): Initialize the ResultsImporter with job collection and data sets from ParAMSResults

add_singlejob(amsjob, properties, name=None, task='SinglePoint', data_set=None, subgroup=None, settings=None, extra_engine=None)

This function adds an entry to the job collection and an entry to the reference engine collection, using amsjob as a template. The items of properties are extractors.

Returns a list of expressions added to the data_set.

amsjob: AMSJob, or path to an ams.results folder, a VASP results folder, or a Quantum ESPRESSO .out file

Job with a finished reference calculation

data_set: str or None

For example ‘training_set’ or ‘validation_set’. A value of None means ‘training_set’.

subgroup: str or None

Set the SubGroup metadata for the DataSetEntries

task: str

The Task of the added job. Defaults to SinglePoint, no matter what the Task of the reference amsjob is. If you want GeometryOptimizations to be run during the parametrization, you must set task='GeometryOptimization' here.

settings: Settings

Custom settings for the added job. By default, all settings from the reference job are inherited.

Any settings provided with this argument will override the inherited settings from the reference job.

sett = Settings()
sett.input.ams.GeometryOptimization.MaxIterations = 50

ri = ResultsImporter()
ri.add_singlejob(..., task='GeometryOptimization', settings=sett)
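The override rule (inherited settings first, custom settings on top) behaves like a recursive dictionary merge. The sketch below uses plain dicts rather than PLAMS Settings objects; the function override and the reference values are hypothetical and only illustrate the precedence.

```python
import copy


def override(inherited, custom):
    """Return a copy of inherited in which values from custom take precedence.

    Nested dicts are merged key by key; anything else is replaced outright.
    """
    result = copy.deepcopy(inherited)
    for key, value in custom.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = override(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


# Illustrative settings of a reference job (values are placeholders).
reference = {
    "input": {"ams": {"GeometryOptimization": {"MaxIterations": 200, "Convergence": {"Gradients": 0.001}}}}
}
custom = {"input": {"ams": {"GeometryOptimization": {"MaxIterations": 50}}}}

merged = override(reference, custom)
print(merged)
```

Only MaxIterations changes in the merged result; the other inherited keys are kept, and the reference dict itself is left unmodified.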

properties: list or dict of extractors

If given as a list, the default settings (weights, sigma, unit) are used for each data_set entry.

The arguments to the extractor should be specified without the jobid.

Example: properties = ['energy', 'angle((0,1,2))'] will add two data_set entries, one for the energy and one for the angle between the first three atoms

Properties can also be a dict, where the keys are the extractors as above and the values are dicts containing the settings for the data set entry.

Example:

properties = {
    'energy': {
        'weight': 1.0,
        'sigma': 1.0,
        'unit': 'eV',
    },
    'forces': {
        'weight': ...,
        'sigma': ...,
        'unit': 'Ha/bohr',
        'weights_scheme': WeightsSchemeGaussian(normalization='numentries'),
    },
    'pes': { # this will translate the "min" to a fixed index (recommended)
         'relative_to': 'min',
    },
    'pes(relative_to="min")': { # the pes returned from this extractor might be relative to different datapoints
    },
    'angle((0,1,2))': {
        'weight': ...,
    }
}
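The list form and the dict form of properties carry the same information, which the following sketch makes explicit. The helper normalize_properties is hypothetical (not part of ParAMS) and assumes that an empty options dict means "use the default weight, sigma and unit".

```python
def normalize_properties(properties):
    """Map each extractor string to a dict of per-entry options.

    A list is treated as a dict whose values are all empty, i.e. every
    data_set entry uses the defaults (assumption made for this sketch).
    """
    if isinstance(properties, dict):
        return {extractor: dict(options) for extractor, options in properties.items()}
    return {extractor: {} for extractor in properties}


as_list = normalize_properties(["energy", "angle((0,1,2))"])
as_dict = normalize_properties({"energy": {"weight": 1.0, "unit": "eV"}})
print(as_list)
print(as_dict)
```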

add_pesscan_singlepoints(amsjob, properties, name=None, task='SinglePoint', start=0, end=None, step=1, indices=None, data_set=None, subgroup=None, settings=None, extra_engine=None)

The reference job must be a PESScan. This method extracts the converged points and adds them to the job collection as single points. Returns a list of expressions added to the data_set.

To add a job with Task ‘PESScan’ (for use with the pes* extractors), do NOT use this method but instead the add_singlejob() method.

amsjob: AMSJob or path to ams.results folder

The job must have had Task PESScan

properties: list or dictionary

Allowed keys are ‘energy’ (recommended for ML potentials), ‘relative_energies’ (recommended for ReaxFF/DFTB), and ‘forces’ (only applicable if CalcPropertiesAtPESPoints was set in the initial PES scan).

Note: For Volume scans with no other constraints, if CalcPropertiesAtPESPoints was not set then this function will automatically add zeros as the forces for all converged points.

name: str

Jobs in the job collection will get ID “name_frame003” etc.

task: str

Task, only ‘SinglePoint’ makes sense

start: int

Start step (0-based)

end: int or None

End step (0-based). If None, the entire trajectory is used

step: int

Use every step-th frame

indices: list of int

Manually specified list of indices. Overrides start/end/step if not None.

data_set: str

Dataset (‘training_set’, etc.)

subgroup: str

Set a custom SubGroup metadata key for the data_set entries.

For the property ‘relative_energies’ you can set the ‘relative_to’ option to specify the reference point. Allowed values:

‘min’ : smallest energy from indices subset

‘max’ : largest energy from indices subset

‘first’ : first energy from indices subset

‘last’ : last energy from indices subset

‘min_global’ : smallest energy in the trajectory

‘max_global’ : largest energy in the trajectory

‘first_global’ : first energy in the trajectory

‘last_global’ : last energy in the trajectory

If specifying e.g. ‘min_global’, then the smallest energy in the trajectory is always included, even if it is not covered by the indices subset.
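The difference between the subset and the _global variants can be sketched in a few lines of plain Python. The functions reference_energy and relative_energies below are hypothetical helpers, not ParAMS functions, and the energies are made-up numbers; the sketch only computes values and does not model how the extra reference frame is added to the job collection.

```python
def reference_energy(energies, indices, relative_to):
    """Pick the reference energy according to the relative_to option."""
    subset = [energies[i] for i in indices]
    # *_global options look at the whole trajectory, the others at the subset.
    pool = energies if relative_to.endswith("_global") else subset
    kind = relative_to.replace("_global", "")
    if kind == "min":
        return min(pool)
    if kind == "max":
        return max(pool)
    if kind == "first":
        return pool[0]
    if kind == "last":
        return pool[-1]
    raise ValueError(f"unknown relative_to value: {relative_to!r}")


def relative_energies(energies, indices, relative_to):
    """Energies of the selected frames relative to the chosen reference."""
    ref = reference_energy(energies, indices, relative_to)
    return [energies[i] - ref for i in indices]


energies = [0.30, 0.10, 0.00, 0.20, 0.40]  # hypothetical values for frames 0..4
indices = [0, 1, 4]  # frame 2 (the global minimum) is not selected

print(relative_energies(energies, indices, "min"))         # reference: frame 1
print(relative_energies(energies, indices, "min_global"))  # reference: frame 2
```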

Example:

add_pesscan_singlepoints('/path/to/ams.rkf', properties={
    'energy': {
        'weight': 1.5
    },
    'relative_energies': {
        'weight': 2.0,
        'sigma' : 0.1,
        'unit'  : 'eV',
        'relative_to': 'min_global'
    },
})

add_neb_singlepoints(amsjob, properties, name=None, task='SinglePoint', data_set=None, subgroup=None, images='highest', extra_engine=None)

Method for extracting frames from an NEB calculation, and adding singlepoint jobs for each of the frames. Returns a list of expressions added to the data_set.

amsjob: AMSJob or path to ams.results folder: The job must contain History and NEB sections on ams.rkf
properties: list or dictionary: Allowed keys are [‘energy’, ‘relative_energies’]
name: str: Jobs in the job collection will get ID “name_frame003” etc.
task: str: Task, only ‘SinglePoint’ makes sense.
data_set: str: Dataset (‘training_set’, etc.)
subgroup: str: Set custom SubGroup metadata for the data_set entries.
images: str, ‘highest’ or ‘all’: Whether to include only the highest energy image or all images
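The effect of the images option can be pictured as an index selection over the image energies. The helper select_images is hypothetical and the energies are invented; whether the end points count as images in the real method is not stated here, so the sketch simply treats every entry of the list as an image.

```python
def select_images(image_energies, images="highest"):
    """Return the indices of the NEB images to import."""
    if images == "all":
        return list(range(len(image_energies)))
    if images == "highest":
        # Index of the image with the largest energy.
        return [max(range(len(image_energies)), key=image_energies.__getitem__)]
    raise ValueError("images must be 'highest' or 'all'")


neb_energies = [0.0, 0.4, 0.9, 0.3, 0.1]  # hypothetical energies along the path
print(select_images(neb_energies))          # only the highest-energy image
print(select_images(neb_energies, "all"))   # every image
```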

add_trajectory_singlepoints(amsjob, properties, name=None, task='SinglePoint', start=0, end=None, step=1, N=None, indices=None, data_set=None, subgroup=None, settings=None, extra_engine=None)

ResultsImporter for extracting frames from a trajectory file, and adding singlepoint jobs for each of the frames. Returns a list of expressions added to the data_set.

To add a job with task ‘GeometryOptimization’ or ‘MolecularDynamics’, do NOT use this method but instead use the add_singlejob() method.

amsjob: AMSJob or path to ams.results folder: The job must contain a History section on ams.rkf, e.g. from a geometry optimization or MD simulation
properties: list or dictionary: Allowed keys are [‘energy’, ‘relative_energies’, ‘forces’, ‘stresstensor’]. Not all extractors are supported, since each individual frame does not constitute an AMSResults.

Note

The syntax forces(atom_index), for example forces(0) to extract the force components on the first atom, is not supported. It is only supported for the add_singlejob method.
name: str: Jobs in the job collection will get ID “name_frame003” etc.
task: str: Task, only ‘SinglePoint’ makes sense
start: int: Start step (0-based)
end: int or None: End step (0-based). If None, the entire trajectory is used
step: int: Use every step-th frame
N: int: Get N equally spaced frames in the interval [start, end). Overrides step if set.
indices: list of int: Manually specified list of indices. Overrides start/end/step if not None.
data_set: str: Dataset (‘training_set’, etc.)
subgroup: str: Set custom SubGroup metadata for the data_set entries.
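How start, end, step, N and indices interact can be sketched as follows. This is not the ParAMS code: select_frames and frame_job_id are hypothetical helpers, end is assumed to be exclusive (as in the [start, end) interval quoted for N), the rounding used for the N equally spaced frames is one possible choice, and the three-digit zero padding is inferred from the “name_frame003” example.

```python
def select_frames(n_frames, start=0, end=None, step=1, N=None, indices=None):
    """Return the 0-based frame indices to import.

    Precedence (as described in the parameter list): indices overrides
    start/end/step, and N overrides step.
    """
    if indices is not None:
        return list(indices)
    stop = n_frames if end is None else min(end, n_frames)
    if N is not None:
        span = stop - start
        # N frames spread evenly over the half-open interval [start, stop).
        return [start + (k * span) // N for k in range(N)]
    return list(range(start, stop, step))


def frame_job_id(name, index):
    """Build a job ID such as 'name_frame003' (padding width assumed)."""
    return f"{name}_frame{index:03d}"


print(select_frames(10, step=3))
print(select_frames(10, N=5))
print(select_frames(10, indices=[7, 1]))
print([frame_job_id("md", i) for i in select_frames(10, start=2, N=4)])
```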

For the property ‘relative_energies’ you can set the ‘relative_to’ option to specify the reference point. Allowed values:

‘min’ : smallest energy from indices subset

‘max’ : largest energy from indices subset

‘first’ : first energy from indices subset

‘last’ : last energy from indices subset

‘min_global’ : smallest energy in the trajectory

‘max_global’ : largest energy in the trajectory

‘first_global’ : first energy in the trajectory

‘last_global’ : last energy in the trajectory

If specifying e.g. ‘min_global’, then the smallest energy in the trajectory is always included, even if it is not covered by the indices subset.

Example:

add_trajectory_singlepoints('/path/to/ams.rkf', properties={
    'energy': {
        'weight': 1.5
    },
    'relative_energies': {
        'weight': 2.0,
        'sigma' : 0.1,
        'unit'  : 'eV',
        'relative_to': 'min_global'
    },
    'forces': {}
})

add_pesexploration_singlepoints(amsjob, properties, name=None, task='SinglePoint', indices=None, data_set=None, subgroup=None, settings=None, extra_engine=None)

ResultsImporter for PES exploration reference jobs. Returns a list of expressions added to the data_set.

To add jobs with task ‘PESExploration’ to the job collection (although you most likely do not want to do that because of the computational expense), do NOT use this method but instead the add_singlejob() method.

The method will add

Forward and reverse reaction barriers from any transition state
Relative energies between two minima connected by a transition state
Relative energies between the lowest-energy minimum and all other minima

The “lowest-energy” and “all other minima” refer to minima either explicitly specified in indices, or connected to one of the transition states in indices.
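The three kinds of quantities listed above are simple energy differences. The sketch below computes them for a made-up energy landscape; the state labels, energies and helper names are hypothetical and do not reflect how ParAMS reads the states from the reference job.

```python
# Hypothetical energies (eV) of three minima, keyed by a state label.
minima = {"min1": 0.00, "min2": 0.25, "min3": 0.10}
# Hypothetical transition state: (energy, reactant minimum, product minimum).
transition_states = {"ts1": (0.80, "min1", "min2")}


def barriers(ts_energy, e_reactant, e_product):
    """Forward and reverse barrier of a transition state."""
    return ts_energy - e_reactant, ts_energy - e_product


def relative_to_lowest(minima_energies):
    """Energies of all other minima relative to the lowest-energy minimum."""
    lowest = min(minima_energies, key=minima_energies.get)
    return {
        state: energy - minima_energies[lowest]
        for state, energy in minima_energies.items()
        if state != lowest
    }


for ts, (e_ts, reactant, product) in transition_states.items():
    forward, reverse = barriers(e_ts, minima[reactant], minima[product])
    connected = minima[product] - minima[reactant]
    print(ts, forward, reverse, connected)

print(relative_to_lowest(minima))
```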

Tip

Most PES explorations contain very many states. Select the subset you’re interested in with the indices, corresponding to state numbers.

amsjob: AMSJob or path to amsjob: An AMSJob or path to AMSJob
properties: list or dict: ‘energy’ and/or ‘relative_energies’
name: str: prefix for the jobids of the individual datapoints
task: str: must be ‘SinglePoint’
indices: None or list of int: The indices in this method are 1-based! The state numbering for a PES Exploration is 1-based, and those indices are used for everyday working. Therefore the same indexing scheme is kept.
data_set: Dataset
subgroup: Subgroup
settings: Settings

add_reaction_energy(reactants, products, normalization='r0', normalization_value=1.0, task='SinglePoint', weight=1.0, sigma=None, reactants_names=None, products_names=None, reference=None, unit=None, dupe_check=True, data_set=None, subgroup=None, settings=None, extra_engine=None, metadata=None)

ResultsImporter for adding reaction energies to the data_set.

reactants: list: a list of jobids, or a list of paths to ams.results folders or ams.rkf files, or a list of AMSJobs
products: list: a list of jobids, or a list of paths to ams.results folders or ams.rkf files, or a list of AMSJobs
normalization_species: str: ‘r0’ for the first reactant, ‘r1’ for the second reactant, etc. ‘p0’ for the first product, ‘p1’ for the second product, etc. This normalizes the chemical equation such that the coefficient in front of the specified species is normalization_value
normalization_value: float: Normalize the chemical equation such that the coefficient in front of normalization_species is this number
task: str, default ‘SinglePoint’: Set the task for the job collection entries (only if new entries are created from AMSJobs)
weight: float, optional: Weight for the data set entry
sigma: float, optional: Sigma for the data set entry
reactants_names: list: set the job_collection IDs (only if new entries are created from AMSJobs). By default the job name is used.
products_names: list: set the job_collection IDs (only if new entries are created from AMSJobs). By default the job name is used.
reference: float or None: The reaction energy. If None, an attempt will be made to calculate this, if all the constituent jobs (reactants and products) were loaded as AMSJobs
unit: str or 2-tuple: Energy unit. If 2-tuple should be of the form (“string”, “conversion_factor_from_au”)
dupe_check: bool: Check for duplicate data set entries
metadata: a dictionary containing metadata for the data set entry. Note: new key-value pairs may be added to the dictionary by this method.
settings: Settings: Additional job settings

This method is primarily to be used by providing a list of paths to ams.results folders. The method will

Create reference engines in the EngineCollection based on the engine settings for the provided jobs
Extract the final structures from the AMSJobs, and add them to the JobCollection with the provided Task and pertinent ReferenceEngineID
Balance the chemical equation, calculate the reference value, and add an entry to the DataSet
The metadata for the DataSet entry is augmented by INFO_ReferenceEngineIDs, which gives all the reference engines used to calculate the reference data
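The normalization step and the reference value can be illustrated with a short sketch. It assumes the equation has already been balanced (the balancing itself is not shown), that the reaction energy is products minus reactants, and it uses hypothetical helper names and placeholder energies rather than anything taken from ParAMS.

```python
def normalize_coefficients(reactant_coeffs, product_coeffs,
                           normalization="r0", normalization_value=1.0):
    """Rescale balanced coefficients so the chosen species gets normalization_value.

    normalization is 'r<i>' for reactant i or 'p<i>' for product i (0-based).
    """
    side = reactant_coeffs if normalization[0] == "r" else product_coeffs
    scale = normalization_value / side[int(normalization[1:])]
    return ([c * scale for c in reactant_coeffs],
            [c * scale for c in product_coeffs])


def reaction_energy(reactant_energies, product_energies,
                    reactant_coeffs, product_coeffs):
    """Reaction energy as sum over products minus sum over reactants."""
    products = sum(c * e for c, e in zip(product_coeffs, product_energies))
    reactants = sum(c * e for c, e in zip(reactant_coeffs, reactant_energies))
    return products - reactants


# 2 A + B -> 2 C, with placeholder energies for A, B and C.
r_coeffs, p_coeffs = normalize_coefficients([2, 1], [2], "r0", 1.0)
print(r_coeffs, p_coeffs)  # A + 0.5 B -> C
print(reaction_energy([-1.0, -2.0], [-2.1], r_coeffs, p_coeffs))
```

Choosing a different normalization species or value only rescales the equation, and the reference energy scales with it.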

store(folder=None, prefix='', backup=True, store_settings=True)

folder: str: If folder is not given, the current working directory is used. If the folder does not exist, it will be created.
prefix: str: Prefix the output file names, e.g. giving “prefixjob_collection.yaml”
backup: bool: Whether to back up any existing files by appending a .002 suffix or similar to the existing files.
store_settings: bool: Whether to save results_importer_settings.yaml

Saves job_collection.yaml, reference_engines.yaml, training_set.yaml. For each additional data_set (e.g. “validation_set”), a yaml file is also created.
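The file naming implied by prefix and backup can be sketched with pathlib. The helpers output_paths and backup_path are hypothetical, and the numbering scheme (first free suffix starting at .002) is an assumption based on the “.002 suffix or similar” wording; the actual scheme may differ.

```python
import tempfile
from pathlib import Path


def output_paths(folder, prefix="", data_sets=("training_set",)):
    """File paths written for a given prefix, one .yaml file per data set."""
    names = ["job_collection", "reference_engines", *data_sets]
    return [Path(folder) / f"{prefix}{name}.yaml" for name in names]


def backup_path(path):
    """First unused backup name of the form '<file>.002', '<file>.003', ..."""
    number = 2
    while True:
        candidate = path.with_name(f"{path.name}.{number:03d}")
        if not candidate.exists():
            return candidate
        number += 1


with tempfile.TemporaryDirectory() as tmp:
    existing = Path(tmp) / "job_collection.yaml"
    existing.write_text("placeholder\n")
    print(backup_path(existing).name)

print([p.name for p in output_paths("out", prefix="run1_",
                                    data_sets=("training_set", "validation_set"))])
```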

store_ase(folder: PathLike, format: str = 'extxyz') → List[Path]

Stores training_set.xyz and validation_set.xyz in the ASE format in the specified folder.

Only the following are included:

singlepoint jobs
single extractors (energy, forces)

folder: str: Folder to store the files in
format: str: format argument to pass to the ase.io.write method. Defaults to the ASE “extxyz” format.

add_engine_from_amsjob(amsjob, name=None)

Reads the engine definition from amsjob and adds it to the engine collection. Returns the ID of the engine in the engine collection.

amsjob: an AMSJob, or path to ams.results

name: str or None: If None, the engine name is created from the engine settings
