AMS worker

The PLAMS interface to the AMS driver and its engines through the AMSJob and AMSResults classes is technically quite similar to how one would run calculations from the command line or from the GUI: the AMS input files are written to disk, the AMS driver starts up, reads its input, performs the calculations and writes the results to disk in the form of human-readable text files as well as machine-readable binary files, usually in KF format. This setup has the advantage that any calculation that can be performed with AMS can be set up from PLAMS as an AMSJob, and that any result from any calculation can be accessed from PLAMS through the corresponding AMSResults instance. Furthermore, the resulting files on disk can often be visualized using the AMS GUI, as if the job had been set up and run through the graphical user interface. As such, this way of running AMS offers maximum flexibility and convenience to users.

However, for simple and fast jobs where we only care about some basic results, this flexibility comes at a cost: input files need to be created on disk, a process is launched, possibly reading all kinds of configuration and parameter files. The process writes more files to disk, which we later need to open again to extract (in the worst case) just a single number. The overhead might be irrelevant for sufficiently slow engines, but for a very fast force field this overhead can easily become the performance bottleneck.

Starting with the AMS2019.3 release, the AMS driver implements a special task, in which the running process listens for calculation requests on a named pipe (FIFO) and communicates the results of the calculations back on another pipe. This avoids the overhead of starting processes and eliminates all file-based I/O. You can find more information about the pipe interface in the AMS driver in the corresponding part of the documentation. In PLAMS the AMSWorker class is used to represent this running AMS driver process. The AMSWorker class handles all communication with the process and hides the technical details of the underlying communication protocol.
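To make the request/reply idea concrete, here is a minimal sketch of two named pipes used as a request channel and a reply channel. It is only an illustration of the mechanism and does not reproduce the actual AMS pipe protocol: the message format (a line of numbers answered by their sum) and the function names serve_once and ask are made up for this example, and a thread plays the role of the worker process. It relies on os.mkfifo and therefore only runs on POSIX systems.

```python
import os
import tempfile
import threading


def serve_once(request_path, reply_path):
    """Toy worker: read one request line and reply with the sum of its numbers."""
    with open(request_path, "r") as req:   # blocks until a writer connects
        numbers = [float(token) for token in req.readline().split()]
    with open(reply_path, "w") as rep:     # blocks until a reader connects
        rep.write(f"{sum(numbers)}\n")


def ask(request_path, reply_path, payload):
    """Toy client: send one request line and wait for the one-line reply."""
    with open(request_path, "w") as req:
        req.write(payload + "\n")
    with open(reply_path, "r") as rep:
        return float(rep.readline())


with tempfile.TemporaryDirectory() as workdir:
    request_path = os.path.join(workdir, "request")
    reply_path = os.path.join(workdir, "reply")
    os.mkfifo(request_path)   # POSIX only
    os.mkfifo(reply_path)

    server = threading.Thread(target=serve_once, args=(request_path, reply_path))
    server.start()
    answer = ask(request_path, reply_path, "1.5 2.5")
    server.join()

print(answer)
```

No regular files are written for the exchange itself, and the serving side could stay alive to answer further requests, which is the property the worker mode exploits.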

Consider the following short PLAMS script that calculates and prints the total GFN1-xTB energy for all molecules found in a folder full of xyz files. Using the regular AMSJob, this can be written as:

molecules = read_molecules('folder/with/xyz/files')

sett = Settings()
sett.input.ams.Task = 'SinglePoint'
sett.input.dftb.Model = 'GFN1-xTB'

for name, mol in molecules.items():
    results = AMSJob(name=name, molecule=mol, settings=sett).run()
    print('Energy of {} = {}'.format(name, results.get_energy()))

In order to switch this script over to using the AMSWorker, we need to make only a couple of changes:

molecules = read_molecules('folder/with/xyz/files')

sett = Settings()
sett.input.dftb.Model = 'GFN1-xTB'

with AMSWorker(sett) as worker:
    for name, mol in molecules.items():
        results = worker.SinglePoint(name, mol)
        print('Energy of {} = {}'.format(name, results.get_energy()))

In the first, AMSJob-based version, both the Settings instance and the Molecule instance were passed into the constructor of the AMSJob, while the AMSWorker constructor only accepts the Settings instance. The Molecule instance is passed in only later, to the SinglePoint method. This shows the basic usage of the AMSWorker class: create it once, supplying the desired Settings, and use these fixed settings for calculations on multiple molecules. It is not possible to change the Settings on an already running AMSWorker instance. If you have to switch Settings, you need to create a new AMSWorker with the new settings. It therefore only makes sense to use the AMSWorker if one has to do calculations on many molecules using the same settings.

Note that when using the AMSWorker, the type of results in the above example is no longer AMSResults: the call to SinglePoint returns an instance of AMSWorkerResults, which implements only a small subset of the methods available in the full AMSResults. This is the concession we have to make for using AMSWorker instead of AMSJob. The AMSResults class has many methods to extract arbitrary data from the result files of an AMS calculation. Since none of these files exist when communicating directly with the AMS process over a pipe, the AMSWorkerResults class cannot support any of these file-based methods.

Given these restrictions we recommend that users first try the traditional route of running the AMS driver via the AMSJob class, and only switch to the AMSWorker alternative if they observe a significant slowdown due to the startup and I/O cost. The overhead is likely only relevant for simple tasks (single points, geometry optimizations) using rather fast engines such as semi-empirical methods and force fields.

If the worker process fails to start up or terminates unexpectedly, an AMSWorkerError exception is raised. The standard output and standard error output of the failed worker process are stored in the stdout and stderr attributes of the AMSWorkerError. If an AMSWorkerError or AMSPipeRuntimeError exception occurs during SinglePoint, it is caught internally and stored in the error attribute of the returned AMSWorkerResults object for further inspection. These two types of exceptions are typically related to the calculation being performed (the combination of the Molecule and Settings), so they are not allowed to propagate out of SinglePoint, which matches the behavior of AMSJob in similar situations. However, other types of exceptions derived from AMSPipeError may also occur in AMSWorker. These correspond to other errors defined by the pipe protocol and propagate normally, because they represent programming and logic errors, protocol incompatibilities, or unsupported features. In any case, AMSWorker will be ready to handle another call to SinglePoint after an error.
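Because calculation-related errors are stored on the returned object rather than raised, calling code would typically branch on ok() before reading any results. The following is a minimal sketch of that pattern. The report function only uses the documented name, ok(), get_energy() and get_errormsg() members; FakeResults is a hypothetical stand-in that lets the snippet run without an AMS installation, and the energy value and error text are made up.

```python
def report(results):
    """Return a one-line summary for an AMSWorkerResults-like object."""
    if results.ok():
        return f"{results.name}: E = {results.get_energy():.6f} Hartree"
    # On failure the exception was stored on the object instead of being raised.
    return f"{results.name}: FAILED ({results.get_errormsg()})"


class FakeResults:
    """Hypothetical stand-in mimicking name, error, ok(), get_energy(), get_errormsg()."""

    def __init__(self, name, energy=None, error=None):
        self.name = name
        self.error = error
        self._energy = energy

    def ok(self):
        return self.error is None

    def get_energy(self):
        return self._energy

    def get_errormsg(self):
        return None if self.error is None else str(self.error)


lines = [
    report(FakeResults("water", energy=-5.77)),
    report(FakeResults("broken", error=RuntimeError("SCF did not converge"))),
]
print("\n".join(lines))
```

In a real script the objects passed to report would come from worker.SinglePoint(name, mol).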

PLAMS also provides the AMSWorkerPool class, which represents a pool of running AMSWorker instances, which dynamically pick tasks from a queue of calculations to be performed. This is useful for workflows that require the execution of many trivially parallel simple tasks. Using the AMSWorkerPool we could write the above example as:

molecules = read_molecules('folder/with/xyz/files')

sett = Settings()
sett.input.dftb.Model = 'GFN1-xTB'
sett.runscript.nproc = 1 # every worker is a serial process now

with AMSWorkerPool(sett, num_workers=4) as pool:
    results = pool.SinglePoints(molecules.items())
for r in results:
    print('Energy of {} = {}'.format(r.name, r.get_energy()))
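The phrase "dynamically pick tasks from a queue" can be illustrated with a small standard-library sketch. This is not the AMSWorkerPool implementation, only an assumption-laden model of the scheduling idea: threads stand in for worker processes, run_pool and compute are hypothetical names, and returning results in input order is a choice made for this sketch.

```python
import queue
import threading


def run_pool(tasks, num_workers, compute):
    """Run compute(task) for every task on num_workers threads, keeping input order."""
    todo = queue.Queue()
    for index, task in enumerate(tasks):
        todo.put((index, task))
    results = [None] * len(tasks)

    def work():
        while True:
            try:
                index, task = todo.get_nowait()
            except queue.Empty:
                return  # queue drained: this worker is done
            results[index] = compute(task)

    threads = [threading.Thread(target=work) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


squares = run_pool([1, 2, 3, 4, 5], num_workers=2, compute=lambda x: x * x)
print(squares)
```

A worker that finishes a cheap task immediately takes the next one, so uneven task durations balance out without any up-front partitioning.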

AMSWorker API

class AMSWorker(settings, workerdir_root='/tmp', workerdir_prefix='amsworker', use_restart_cache=True, keep_crashed_workerdir=False, always_keep_workerdir=False)

A class representing a running instance of the AMS driver as a worker process.

Users need to supply a Settings instance representing the input of the AMS driver process (see Preparing input), but not including the Task keyword in the input (the input.ams.Task key in the Settings instance). The Settings instance should also not contain a system specification in the input.ams.System block, the input.ams.Properties block, or the input.ams.GeometryOptimization block. Often the settings of the AMS driver in worker mode will come down to just the engine block.

The AMS driver will then start up as a worker, communicating with PLAMS via named pipes created in a temporary directory (determined by the workerdir_root and workerdir_prefix arguments). This temporary directory might also contain temporary files used by the worker process. Note that while an AMSWorker instance exists, the associated worker process can be assumed to be running and ready: If it crashes for some reason, it is automatically restarted.

The recommended way to start an AMSWorker is as a context manager:

with AMSWorker(settings) as worker:
    results = worker.SinglePoint('my_calculation', molecule)
# clean up happens automatically when leaving the block

If it is not possible to use the AMSWorker as a context manager, cleanup should be manually triggered by calling the stop() method.

stop(keep_workerdir=False)

Stops the worker process and removes its working directory.

This method should be called when the AMSWorker instance is not used as a context manager and the instance is no longer needed. Otherwise proper cleanup is not guaranteed to happen, the worker process might be left running and files might be left on disk.
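When a with block is not an option, a try/finally construct gives the same guarantee that stop() is called. The sketch below assumes only the documented SinglePoint, get_energy and stop methods; single_point_energy and make_worker are hypothetical names, and FakeWorker is a stand-in so the snippet runs without AMS. In real code make_worker could be something like lambda: AMSWorker(settings).

```python
def single_point_energy(make_worker, name, molecule):
    """Run one single point and always stop the worker afterwards."""
    worker = make_worker()
    try:
        return worker.SinglePoint(name, molecule).get_energy()
    finally:
        worker.stop()  # runs even if SinglePoint or get_energy raises


class FakeWorker:
    """Hypothetical stand-in for an AMSWorker; it acts as its own results object."""

    def __init__(self):
        self.stopped = False

    def SinglePoint(self, name, molecule):
        return self

    def get_energy(self):
        return -1.0  # made-up number

    def stop(self):
        self.stopped = True


fake = FakeWorker()
energy = single_point_energy(lambda: fake, "demo", molecule=None)
print(energy, fake.stopped)
```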

SinglePoint(name, molecule, prev_results=None, quiet=True, gradients=False, stresstensor=False, hessian=False, elastictensor=False, charges=False, dipolemoment=False, dipolegradients=False)

Performs a single point calculation on the geometry given by the Molecule instance molecule and returns an instance of AMSWorkerResults containing the results.

Every calculation should be given a name. Note that the name must be unique for this AMSWorker instance: One should not attempt to reuse calculation names with a given instance of AMSWorker.

By default only the total energy is calculated but additional properties can be requested using the corresponding keyword arguments:

  • gradients: Calculate the nuclear gradients of the total energy.

  • stresstensor: Calculate the clamped-ion stress tensor. This should only be requested for periodic systems.

  • hessian: Calculate the Hessian matrix, i.e. the second derivative of the total energy with respect to the nuclear coordinates.

  • elastictensor: Calculate the elastic tensor. This should only be requested for periodic systems.

  • charges: Calculate atomic charges.

  • dipolemoment: Calculate the electric dipole moment. This should only be requested for non-periodic systems.

  • dipolegradients: Calculate the nuclear gradients of the electric dipole moment. This should only be requested for non-periodic systems.
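Since some properties only make sense for periodic systems and others only for non-periodic ones, it can help to build the keyword arguments in one place. The helper below is a hypothetical convenience function, not part of PLAMS; it uses only keyword names from the list above, and which properties to request is an arbitrary choice for illustration.

```python
def property_requests(is_periodic):
    """Build SinglePoint keyword arguments that respect the periodicity notes."""
    kwargs = {"gradients": True, "charges": True}
    if is_periodic:
        kwargs["stresstensor"] = True   # periodic systems only
    else:
        kwargs["dipolemoment"] = True   # non-periodic systems only
    return kwargs


molecule_kwargs = property_requests(is_periodic=False)
crystal_kwargs = property_requests(is_periodic=True)
print(molecule_kwargs)
print(crystal_kwargs)
```

The resulting dictionary would then be unpacked into the call, for example worker.SinglePoint(name, mol, **molecule_kwargs).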

Users can pass an instance of a previously obtained AMSWorkerResults as the prev_results keyword argument. This can trigger a restart from previous results in the worker process, the details of which depend on the used computational engine: For example, a DFT based engine might restart from the electronic density obtained in an earlier calculation on a similar geometry. This is often useful to speed up series of sequentially dependent calculations:

mol = Molecule('some/system.xyz')
with AMSWorker(sett) as worker:
    last_results = None
    for i in range(num_steps):
        results = worker.SinglePoint(f'step{i}', mol, prev_results=last_results, gradients=True)
        # modify the geometry of mol using results.get_gradients()
        last_results = results

Note that restarting is disabled if the AMSWorker instance was created with use_restart_cache=False. It is still permitted to pass previous AMSWorkerResults instances as the prev_results argument, but no restarting will happen.

The quiet keyword can be used to obtain more output from the worker process. Note that the output of the worker process is not printed to the standard output but instead ends up in the ams.out file in the temporary working directory of the AMSWorker instance. This is mainly useful for debugging.

GeometryOptimization(name, molecule, prev_results=None, quiet=True, gradients=True, stresstensor=False, hessian=False, elastictensor=False, charges=False, dipolemoment=False, dipolegradients=False, method=None, coordinatetype=None, usesymmetry=None, optimizelattice=False, maxiterations=None, pretendconverged=None, calcpropertiesonlyifconverged=True, convquality=None, convenergy=None, convgradients=None, convstep=None, convstressenergyperatom=None)

Performs a geometry optimization on the Molecule instance molecule and returns an instance of AMSWorkerResults containing the results from the optimized geometry.

The geometry optimizer can be controlled using the following keyword arguments:

  • method: String identifier of a particular optimization algorithm.

  • coordinatetype: Select a particular kind of optimization coordinates.

  • usesymmetry: Enable the use of symmetry when applicable.

  • optimizelattice: Optimize the lattice vectors together with atomic positions.

  • maxiterations: Maximum number of iterations allowed.

  • pretendconverged: If set to true, non-converged geometry optimizations will be considered successful.

  • calcpropertiesonlyifconverged: Calculate properties (e.g. the Hessian) only if the optimization converged.

  • convquality: Overall convergence quality, see AMS driver manual for the GeometryOptimization task.

  • convenergy: Convergence criterion for the energy (in Hartree).

  • convgradients: Convergence criterion for the gradients (in Hartree/Bohr).

  • convstep: Convergence criterion for displacements (in Bohr).

  • convstressenergyperatom: Convergence criterion for the stress energy per atom (in Hartree).
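The convergence criteria are given in atomic units, so thresholds quoted in eV/Å have to be converted first. The sketch below shows one way to do that with CODATA 2018 conversion factors. The helper name ev_per_angstrom_to_au and the chosen values (0.05 eV/Å, 200 iterations) are illustrative assumptions, not recommended defaults.

```python
HARTREE_IN_EV = 27.211386245988     # CODATA 2018
BOHR_IN_ANGSTROM = 0.529177210903   # CODATA 2018


def ev_per_angstrom_to_au(value):
    """Convert a gradient threshold from eV/Angstrom to Hartree/Bohr."""
    return value / HARTREE_IN_EV * BOHR_IN_ANGSTROM


# Keyword arguments that could be passed on to GeometryOptimization.
opt_kwargs = {
    "maxiterations": 200,
    "convgradients": ev_per_angstrom_to_au(0.05),
}
print(opt_kwargs)
```

These keyword arguments would be unpacked into the call, for example worker.GeometryOptimization(name, mol, **opt_kwargs).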

ParseInput(program_name, text_input, string_leafs)

Parse the text input and return a Python dictionary representing the JSONified input.

  • program_name: the name of the program. This will be used for loading the appropriate JSON input definitions, e.g. if program_name='adf', the input definition file 'adf.json' will be used.

  • text_input: a string containing the text input to be parsed.

  • string_leafs: if True, the values in the parsed JSON input will always be strings, e.g. if the input contains 'SomeFloat 1.2', the JSON leaf node for 'SomeFloat' will be the string '1.2' (and not the float number 1.2). If False, the leaf values in the JSON input will be of the appropriate type, i.e. floats will be floats, strings will be strings, booleans will be booleans, etc.

AMSWorkerResults API

class AMSWorkerResults(name, molecule, results, error=None)

A specialized class encapsulating the results from calls to an AMSWorker.

Technical

AMSWorkerResults is not a subclass of Results or AMSResults. It does however implement some commonly used methods of the AMSResults class, so that results calculated by AMSJob and AMSWorker can be accessed in a uniform way.

property name

The name of a calculation.

This is the name that was passed into the AMSWorker method when this AMSWorkerResults object was created. It cannot be changed after the AMSWorkerResults instance has been created.
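Because every result carries the name of its calculation, a list of results can be turned into a lookup table. The following sketch assumes nothing beyond the name property; index_by_name is a hypothetical helper, and SimpleNamespace objects with made-up energies stand in for real AMSWorkerResults instances.

```python
from types import SimpleNamespace


def index_by_name(results):
    """Map calculation names to their result objects."""
    return {r.name: r for r in results}


# Stand-ins for AMSWorkerResults objects; the energies are made-up numbers.
results = [
    SimpleNamespace(name="water", energy=-5.77),
    SimpleNamespace(name="methane", energy=-4.17),
]
by_name = index_by_name(results)
print(sorted(by_name))
```

Since calculation names must be unique per AMSWorker instance, no entries are lost when building such a dictionary from the results of a single worker.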

ok()

Check if the calculation was successful. If not, the error attribute contains a corresponding exception.

Users should check if the calculation was successful before using the other methods of the AMSWorkerResults instance, as using them might raise a ResultsError exception otherwise.

get_errormsg()

Attempts to retrieve a human-readable error message from a crashed job. Returns None for jobs without errors.

get_energy(unit='au')

Return the total energy, expressed in unit.

get_gradients(energy_unit='au', dist_unit='au')

Return the nuclear gradients of the total energy, expressed in energy_unit / dist_unit.
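A common follow-up step is to turn gradients into forces, which are simply the negative gradients. The sketch below assumes the gradients arrive as one row of three components per atom (an assumption about the return shape that is not stated here) and uses plain nested lists with made-up numbers so that it runs without AMS or NumPy. Both helper names are hypothetical.

```python
def forces_from_gradients(gradients):
    """Forces are the negative of the nuclear gradients."""
    return [[-component for component in row] for row in gradients]


def largest_component(vectors):
    """Largest absolute Cartesian component, a typical convergence measure."""
    return max(abs(component) for row in vectors for component in row)


# Made-up gradients for a two-atom system, in Hartree/Bohr.
gradients = [[0.01, 0.0, -0.02],
             [-0.01, 0.0, 0.02]]
forces = forces_from_gradients(gradients)
print(forces, largest_component(forces))
```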

get_stresstensor()

Return the clamped-ion stress tensor, expressed in atomic units.

get_hessian()

Return the Hessian matrix, i.e. the second derivative of the total energy with respect to the nuclear coordinates, expressed in atomic units.

get_elastictensor()

Return the elastic tensor, expressed in atomic units.

get_charges()

Return the atomic charges, expressed in atomic units.

get_dipolemoment()

Return the electric dipole moment, expressed in atomic units.

get_dipolegradients()

Return the nuclear gradients of the electric dipole moment, expressed in atomic units. This is a (3*numAtoms x 3) matrix.

get_input_molecule()

Return a Molecule instance with the coordinates passed into the AMSWorker.

Note that this method may also be used if the calculation producing this AMSWorkerResults object has failed, i.e. ok() is False.

get_main_molecule()

Return a Molecule instance with the final coordinates.

get_main_ase_atoms()

Return an ASE Atoms instance with the final coordinates.

AMSWorkerPool API

class AMSWorkerPool(settings, num_workers, workerdir_root='/tmp', workerdir_prefix='awp', keep_crashed_workerdir=False)

A class representing a pool of AMS worker processes.

All workers of the pool are initialized with the same Settings instance, see the AMSWorker constructor for details.

The number of spawned workers is determined by the num_workers argument. For optimal performance on many small jobs it is recommended to spawn a number of workers equal to the number of physical CPU cores of the machine the calculation is running on, and to let every worker instance run serially:

import psutil

molecules = read_molecules('folder/with/xyz/files')

sett = Settings()
# ... more settings ...
sett.runscript.nproc = 1 # <-- every worker itself is serial (aka export NSCM=1)

with AMSWorkerPool(sett, psutil.cpu_count(logical=False)) as pool:
    results = pool.SinglePoints([ (name, molecules[name]) for name in sorted(molecules) ])

As with the underlying AMSWorker class, the location of the temporary directories can be changed with the workerdir_root and workerdir_prefix arguments.

It is recommended to use the AMSWorkerPool as a context manager in order to ensure that cleanup happens automatically. If it is not possible to use the AMSWorkerPool as a context manager, cleanup should be manually triggered by calling the stop() method.
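When choosing num_workers, it rarely helps to start more workers than there are tasks or cores. The helper below is a hypothetical sketch of such a rule using only the standard library. Note that os.cpu_count() reports logical rather than physical cores, so on machines with simultaneous multithreading the physical count (for example from psutil, as in the example above) may be the better input.

```python
import os


def pick_num_workers(num_tasks, available_cores=None):
    """Never start more workers than tasks or cores, and always at least one."""
    if available_cores is None:
        # os.cpu_count() counts logical cores and may return None.
        available_cores = os.cpu_count() or 1
    return max(1, min(num_tasks, available_cores))


print(pick_num_workers(3, available_cores=8))
print(pick_num_workers(100, available_cores=8))
```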

SinglePoints(items, watch=False, watch_interval=60)

Requests the pool to execute single point calculations for all items in the iterable items. Returns a list of AMSWorkerResults objects.

The items argument is expected to be an iterable of 2-tuples (name, molecule) and/or 3-tuples (name, molecule, kwargs), which are passed on to the SinglePoint method of the pool’s AMSWorker instances. (Here kwargs is a dictionary containing the optional keyword arguments and their values for this method.)

If watch is set to True, the AMSWorkerPool will regularly log progress information. The interval between messages can be set with the watch_interval argument in seconds.

As an example, the following call would do single point calculations with gradients and (only for periodic systems) stress tensors for all Molecule instances in the dictionary molecules.

results = pool.SinglePoints([ (name, molecules[name], {
                                 "gradients": True,
                                 "stresstensor": len(molecules[name].lattice) != 0
                              }) for name in sorted(molecules) ])

GeometryOptimizations(items, watch=False, watch_interval=60)

Requests the pool to execute geometry optimizations for all items in the iterable items. Returns a list of AMSWorkerResults objects for the optimized geometries.

If watch is set to True, the AMSWorkerPool will regularly log progress information. The interval between messages can be set with the watch_interval argument in seconds.

The items argument is expected to be an iterable of 2-tuples (name, molecule) and/or 3-tuples (name, molecule, kwargs), which are passed on to the GeometryOptimization method of the pool’s AMSWorker instances. (Here kwargs is a dictionary containing the optional keyword arguments and their values for this method.)
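Since 2-tuples and 3-tuples can be mixed, per-system options only need to be attached where they differ from the defaults. The sketch below builds such a list. make_items and overrides are hypothetical names, and plain object() placeholders stand in for Molecule instances so that the snippet runs without AMS; optimizelattice is one of the documented GeometryOptimization keywords.

```python
def make_items(molecules, overrides):
    """Build (name, molecule) or (name, molecule, kwargs) tuples, sorted by name."""
    items = []
    for name in sorted(molecules):
        if name in overrides:
            items.append((name, molecules[name], overrides[name]))
        else:
            items.append((name, molecules[name]))
    return items


# Placeholders for Molecule instances.
molecules = {"water": object(), "ice": object()}
items = make_items(molecules, {"ice": {"optimizelattice": True}})
print([len(item) for item in items])
```

The resulting list could then be handed to pool.GeometryOptimizations(items).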

stop()

Stops all worker processes and removes their working directories.

This method should be called when the AMSWorkerPool instance is not used as a context manager and the instance is no longer needed. Otherwise proper cleanup is not guaranteed to happen, worker processes might be left running and files might be left on disk.