Cookbook

This is a collection of code snippets showing how to perform recurrent PLAMS tasks.

Settings and input

Create an input block with an header

These Settings

sett = Settings()
sett.input.ams.SomeInputBlock['_h'] = 'MyHeader'
sett.input.ams.SomeInputBlock.SomeOption = 2

will generate the following text input when used for an AMSJob:

SomeInputBlock MyHeader
    SomeOption 2
End

Create an empty input block

These Settings

sett = Settings()
sett.input.ams.SomeInputBlock

will generate the following text input when used for an AMSJob:

SomeInputBlock
End

Create an input block with repeating keys

These Settings

sett = Settings()
sett.input.ams.constraints.atom = [1,2,3]

will generate the following text input when used for an AMSJob:

Constraints
    Atom 1
    Atom 2
    Atom 3
End

Repeating input block

These Settings

block_1 = Settings()
block_1.SomeOption = 1

block_2 = Settings()
block_2.SomeOption = 7

sett = Settings()
sett.input.ams.SomeInputBlock = [block_1, block_2]

will generate the following text input when used for an AMSJob:

SomeInputBlock
  SomeOption 1
End
SomeInputBlock
  SomeOption 7
End

Convert an AMS text input into an AMS job

The from_input() function can be used to convert text input into AMSJob instances.

job = AMSJob.from_input(name='jobname', text_input='''
Task SinglePoint
System
    Atoms
        H 0. 0. 0.
        H 0. 0. 1.
    End
End
Engine Band
    Basis
        Type DZ
        Core None
    End
EndEngine
''')

Convert an AMS text input into settings object

The Settings object corresponding to a piece of text input can easily be obtained as the .settings attribute of the AMSJob instance returned by the from_input() method.

settings = AMSJob.from_input("""
Task SinglePoint
Engine Band
    Basis
        Type DZ
        Core None
    End
EndEngine
""").settings

Convert an AMS .run file into an AMS job

The from_inputfile() method can be used to convert a .run file generated by the AMS GUI into a PLAMS AMSJob.

job = AMSJob.from_inputfile('/path/to/job.run')

Note

This function does not work on PLAMS-generated .run files. You can instead use the PLAMS-generated .in file.

Specify paths to files in the input

With PLAMS, you cannot specify relative paths to input files, because every PLAMS job launches in a new directory, which makes the relative paths invalid. To specify an absolute path, use os.path.abspath:

import os

sett = Settings()
sett.input.reaxff.forcefield = os.path.abspath('../my-forcefield.ff')

Restart from a previous job

To use restart features in AMS, for example the EngineRestart, or to read the InitialVelocities from the final velocities of a previous molecular dynamics run, you can use a convenient shortcut and simply assign the job to the corresponding settings entry:

sett = Settings()
sett.input.ams.EngineRestart = (previous_ams_job, 'engine') # resolves to the engine.rkf

sett2 = Settings()
sett2.input.ams.MolecularDynamics.InitialVelocities.Type = 'FromFile'
sett2.input.ams.MolecularDynamics.InitialVelocities.File = previous_ams_job # resolves to ams.rkf

Alternatively, call the rkfpath() method on the previous job’s AMSResults:

sett = Settings()
sett.input.ams.EngineRestart = previous_ams_job.results.rkfpath(file='engine')

sett2 = Settings()
sett2.input.ams.MolecularDynamics.InitialVelocities.Type = 'FromFile'
sett2.input.ams.MolecularDynamics.InitialVelocities.File = previous_ams_job.results.rkfpath()

Molecules

Generate a molecule from a SMILES string

See function from_smiles().

# Compute 10 conformers, optimize with UFF and pick the lowest in energy.
ethane = from_smiles('C-C', nconfs=10, forcefield='uff')[0]

Load all files in a folder as molecules

See function read_molecules().

molecules = read_molecules('/some_path/folder_containing_structure_files/')

for name, mol in molecules.items():
    print("Name of the file (without extension): ", name)
    print(mol)

Generate a liquid or gas mixture

# Generate a 2-to-1 water-acetonitrile mixture with a density of 0.9 g/cm^3 and approximately 200 atoms
water = from_smiles('O')
acetonitrile = from_smiles('CC#N')
mixture = packmol(molecules=[water, acetonitrile],
                  mole_fractions=[0.667, 0.333],
                  n_atoms=200,
                  density=0.9)

See Packmol example for more examples on how to construct liquid or gas mixtures and solid/liquid or solid/gas interfaces.

Write an ams.rkf-like trajectory

If you have a list of molecules, it can be convenient to write them to an AMS-like .rkf file so that you can visualize them in the GUI module AMSmovie.

from scm.plams import molecules_to_rkf, from_smiles

molecule_list = [from_smiles('C'), from_smiles('CC')]
molecules_to_rkf(molecule_list, 'output.rkf', overwrite=True)

Convert a trajectory to ams.rkf with bond guessing

See Convert to ams.rkf trajectory with bond guessing

Pre-optimize a molecule

# pre-optimize with UFF inside AMS
optimized_mol = preoptimize(mol)

See Quick jobs for details and options.

Counting rings

Rings inside molecules can be counted in various ways, which are not all giving the same results. With the help of the RDKit library, a vast variety of ring counting approaches is readily available. The general approach to using these functions in a PLAMS scripts is to convert your PLAMS Molecule into an RDKit molecule, see the page on the RDKit interface. This is how one searches for the smallest set of rings in a molecule:

# import RDKit
from rdkit import Chem

# create a PLAMS molecule and convert it to an RDKit Mol
dicyclopentadiene = from_smiles('C1C=CC2C1C3CC2C=C3')
rdmol = to_rdmol(dicyclopentadiene)

# Calculate smalles set of rings
for atoms in Chem.GetSymmSSSR(rdmol):
     print ([atom_id for atom_id in atoms], len(atoms))

For more information see also the RDKit manual.

Extracting Results

You can use the following snippets to retrieve results after running the required calculations:

Directly from Functions

Results can be either red from previous calculations (see Accessing Old Jobs) or from an AMSResults instance of a computation just executed within the same workflow. In either case an AMSResults object should be present at runtime:

myAMSJob.run()
myAMSResults = myAMSJob.results if myAMSJob.ok() else None

Warning

Access to any results data should only occur under the condition that AMSJob.ok() indicate a successful termination of the computation.

Examples: Total Energy and Final Structure

Multiple functions of the AMSResults API allow for simple access of the most common results.

myAMSEnergy = myAMSResults.get_energy(unit='au')

myAMSStructure = myAMSResults.get_main_molecule()

AMSResults API Functions

The following members of an AMSResults instance can be used as shown in the above examples to read results:

Property

Function

Return Type

Details

Structure

get_molecule(section)

Molecule

Structure from section

get_input_molecule()

Molecule

Input structure

get_main_molecule()

Molecule

Final structure from any AMS task

get_history_molecule(step)

Molecule

Structure from history section at step # step

Energy

get_energy()

Float

Final energy

Gradients

get_gradients()

Array (numpy)

Gradients from engine calculation

Stress tensor

get_stresstensor()

Array (numpy)

Stress tensor from periodic engine calculation

Hessian

get_hessian()

Array (numpy)

Hessian from frequency calculation (AMS/engine)

Elastic tensor

get_elastictensor()

Array (numpy)

Elastic tensor from periodic calculation

Frequencies

get_frequencies()

Array (numpy)

Vibrational frequencies

Atomic Charges

get_charges()

Array (numpy)

Atomic partial charges

Dipole vector

get_dipolemoment()

Array (numpy)

Electric dipole moment

Nuclear gradients of dipole vector

get_dipolegradients()

Array (numpy)

Nuclear Gradients of Electric dipole moment

From the RKF Interface

Other properties not listed in the table above should be retrieved using the readrkf() function:

myProperty = myAMSResults.readrkf(section, variable)

It is the responsibility of the user to provide the correct names for section and variable under which the required result is stored in the rkf file.

Finding Section/Variable Pairs

Looking up the names of the needed sections and variable within rkf files is typically needed for more intricate properties when writing a new PLAMS workflow. There are two main approaches to search for this information.

From Python Directories

The AMSResults member function get_rkf_skeleton() returns a dictionary containing the available sections as keys and the containing variable names as values

KFBrowser

KFBrowser is a GUI module used to inspect rkf files.

1. Open KFBrowser in the GUI via SCM → KFBrowser
2. By default KFBrowser opens the ams.rkf file. Where neccessary, switch to File → open → <engine>.rkf
3. Press ctrl + e or select File → Expert Mode to display the stored file contents
4. Find the entry of interest. While this is a sometimes not trivial step, most often the required variable is found in either the Properties or AMSResults sections.
5. Once found, the names for section and variable listed in the rkf file directly corresponds to the section/variable pair to be used in the readrkf function as shown above.

Note

When reading results from a different rkf file than ams.rkf the filename has to be specified as:

myEngineProperty = myAMSResults.readrkf(section, variable, file=<engine>)

whereas <engine> corresponds to the file <engine>.rkf present in the calculation directory.

From molecular dynamics trajectories

General MD properties

The KFHistory class can be used to iterate through the History or MDHistory of a trajectory. In this example the energy, temperature and pressure per frame are read and printed.

kf = KFReader(mdjob.results['ams.rkf'])
hist = KFHistory(kf, "History")
mdhist = KFHistory(kf, "MDHistory")

frame = 0
for E, T, p in zip(hist.iter("Energy"), mdhist.iter("Temperature"), mdhist.iter("Pressure")):
    frame += 1
    print("Frame: {} Energy: {} Temperature: {} Pressure: {}".format(frame, E, T, p))

Properties that can be iterated in this way are

Table 1 General properties in section History

Property

Return type

Unit

Coords

List of float

bohr

nLatticeVectors

Int

n.a.

LatticeVectors

List of float

bohr

Energy

Float

hartree

Gradients

List of float

hartree/bohr

StressTensor

List of float

atomic units

Note

For AMS MD simulations you must set MolecularDynamics.Trajectory.WriteGradients = "True" to store the gradients on the ams.rkf file.

Table 2 General MD properties in section MDHistory

Property

Return type

Unit

Step

Integer

n.a.

Time

Float

fs

TotalEnergy

Float

Hartree

PotentialEnergy

Float

Hartree

KineticEnergy

Float

Hartree

Temperature

Float

Kelvin

ConservedEnergy

Float

Hartree

Velocities

List of float

bohr/fs

Charges

List of float

n.a.

PressureTensor

List of float

hartree/bohr3

Pressure

Float

hartree/bohr3

Density

Float

dalton/bohr3

Number of molecules

Float

n.a.

To read a single property into a numpy array, you can run

import numpy as np

# mdjob is a finished AMSJob
coords = mdjob.results.get_history_property('Coords', history_section='History')
coords = np.array(coords).reshape(len(coords), -1, 3) # in bohr
print(coords.shape)

Set history_section='MDHistory' to read from the MDHistory section.

Molecules from trajectories

The coordinates of an MD trajectory can efficiently be obtained by creating an RKFTrajectoryFile. To create an instance of RKFTrajectoryFile, simply pass the according ams.rkf file to it. In this example, the atomic coordinates and lattice vectors are read via RKFTrajectoryFile while the PLAMS Molecule function get_center_of_mass() to calculate the center of mass for every frame.

rkf = RKFTrajectoryFile(mdjob.results['ams.rkf'])
mol = rkf.get_plamsmol()

for i in range(rkf.get_length()):
    coords,cell = rkf.read_frame(i,molecule=mol)
    print(coords, cell, mol.get_center_of_mass())

It is also possible to iterate through the History section of trajectory file. This can be useful in cases were the numbers of atoms is changing per frame or the coordinates per single molecule are needed. Here’s an example where the molecule types present in that particular frame are read for every frame:

kf = KFReader(mdjob.results['ams.rkf'])
mdhist = KFHistory(kf, "MDHistory")
hist = KFHistory(kf, "History")

# get number of distinct molecule types and all their formulas
number_of_molecules = kf.read('Molecules','Num molecules')
formulas = [ kf.read('Molecules',f'Molecule name {i+1}') for i in range(number_of_molecules) ]

for mols, step in zip( hist.iter("Mols.Type"), mdhist.iter("Step")):
    line = f"{step:8d} "
    for i in sorted(set(mols)): line += f"{formulas[i-1]:s} "
    print(line)

Accessing Old Jobs

The following illustrates how to load data from previously executed jobs.

Binding Native PLAMS Jobs

Warning

The jobs should be loaded with a version of PLAMS that is consistent with the version originally used to run the jobs.

From an existing PLAMS working directory with the contents

OLDDIR/
├── OLDJOB1/
|   ├── ams.log
|   ├── ams.rkf
|   ├── OLDJOB1.dill
|   ├── OLDJOB1.err
|   ├── OLDJOB1.in
|   ├── OLDJOB1.out
|   ├── OLDJOB1.run
|   ├── engine.rkf
|   ├── output.xyz
├── input
└── logfile

we can bind an instance of the AMSJob class by making use of the .dill file. The AMSJob object in turn contains a results object, which gives access to the data previously calculated. This can be achieved using the load() function as illustrated in the following snippet:

path       = "OLDDIR/OLDJOB1/OLDJOB1.dill"
single_JOB = load(path)                                       # AMSJob instance
if single_JOB.ok():
   energy     = single_JOB.results.get_energy()               # load the desired properties
   structure  = single_JOB.results.get_main_molecule()
   propertyX  = single_JOB.results.readrkf('AMSResults', 'DipoleMoment', file='engine')

More often than not, the working directory will include multiple individual subdirectories, each containing individual PLAMS job.

OLDDIR/
├── OLDJOB1/
|   ├── ams.log
|   ├── ams.rkf
|   ├── OLDJOB1.dill
|   ├── OLDJOB1.err
|   ├── OLDJOB1.in
|   ├── OLDJOB1.out
|   ├── OLDJOB1.run
|   ├── engine.rkf
|   ├── output.xyz
├── OLDJOB2/
|   ├── ams.log
|   ├── ams.rkf
|   ├── OLDJOB2.dill
|   ├── OLDJOB2.err
|   ├── OLDJOB2.in
|   ├── OLDJOB2.out
|   ├── OLDJOB2.run
|   ├── engine.rkf
|   ├── output.xyz
├── OLDJOB3/
|   ├── ams.log
|   ├── ams.rkf
|   ├── OLDJOB3.dill
|   ├── OLDJOB3.err
|   ├── OLDJOB3.in
|   ├── OLDJOB3.out
|   ├── OLDJOB3.run
|   ├── engine.rkf
|   ├── output.xyz
├── input
└── logfile

These can be loaded using the load_all() function and by providing only the path to the top-level directory:

path       = "OLDDIR"
all_JOBS   = load_all(path)

Note that load_all() wraps the load() function used above and therefore requires existing .dill files in each of the loaded subdirectories. The load_all() function yields a dictionary with the paths of the .dill files as keys and the corresponding job object as values:

print(all_JOBS)
{'/home/user/OLDDIR/OLDJOB1/OLDJOB1.dill': <scm.plams.interfaces.adfsuite.ams.AMSJob object at 0x7f0baad340b8>,
 '/home/user/OLDDIR/OLDJOB2/OLDJOB2.dill': <scm.plams.interfaces.adfsuite.ams.AMSJob object at 0x7f0baacf24a8>,
 '/home/user/OLDDIR/OLDJOB3/OLDJOB3.dill': <scm.plams.interfaces.adfsuite.ams.AMSJob object at 0x7f0baad06cf8>}

We can now access these AMSJob instances:

for this_JOB in all_JOBS.values():
   if this_JOB.ok():
      energy     = this_JOB.results.get_energy()
      structure  = this_JOB.results.get_main_molecule()
      propertyX  = this_JOB.results.readrkf('AMSResults', 'DipoleMoment', file='engine')

Binding old RKF Files

In cases where the .dill files are not available any more, it is still possible to load the contents of previously generated .rkf files into a PLAMS workflow:

path       = "OLDDIR/OLDJOB1/"
ext_JOB    = AMSJob.load_external(path)
if ext_JOB.ok():
   energy     = ext_JOB.results.get_energy()
   structure  = ext_JOB.results.get_main_molecule()

If the .rkf file does originate from some other source than any of the direct AMS engines, also an instance of the more generic SingleJob class can be used:

path       = "OLDDIR/OLDJOB1/ams.rkf"
ext_JOB    = SingleJob.load_external(path)

The downside of this latter approach is that the accessibility to the data is very limited and has to be implemented mostly in terms of pattern-matching searches in the output files.

An alternative way is to make use of the KFReader class:

path       = "OLDDIR/OLDJOB1/ams.rkf"
rkf_reader = KFReader(path)
n_steps    = rkf_reader.read("History", "nEntries")
energy     = rkf_reader.read("History", "Energy({})".format(n_steps))
structure  = rkf_reader.read("History", "Coords({})".format(n_steps))

Note that also the KFReader class lacks most of the shortcut functions of a proper AMSResults object so that the access to the data has to be specified manually.

Parallelization

Parallel job execution

PLAMS supports running multiple jobs in parallel. Details on the synchronization between parallel job executions can be found here. To make sure your PLAMS script can take maximum advantage of parallel job execution there is a simple rule: Make sure to create and run as many jobs as possible before starting to access any results. (This is because the access to results of a job may block until that job has finished, preventing you to submit more independent jobs in the meantime.)

In the common case where there are no dependencies between jobs, this means that we should set up and run all jobs before starting to access any results. The script below shows how to parallelize the trivially parallel task of just executing the same job on a set of molecules.

mols = read_molecules('my_molecules')
# mols is a dictionary mapping filenames (without
# extension) to plams.Molecule instances, e.g.:
# my_molecules/benzene.xyz would become mols['benzene']

sett = Settings()
# ... all your settings go here ...

config.default_jobrunner = JobRunner(parallel=True, maxjobs=8)
sett.runscript.nproc = 4
# run up to 8 jobs (using 4 cores each) in parallel

jobs    = { n: AMSJob(name=n, settings=sett, molecule=m) for n,m in mols.items() }
results = { n: j.run() for n,j in jobs.items() }
# make and run all jobs before accessing any results

for n,r in results.items():
   print(n, r.get_energy() if r.ok() else 'Failed')

Obviously, runscript.nproc could also be set on a per-job basis. This is useful if some of your jobs do not scale well with the number of CPU cores, or if you have jobs of very different computational cost.

PLAMS scripts under Slurm

A PLAMS script can be run under the Slurm batch system and execute jobs within the resource allocation that is created for the script. The script shown in the previous section should work just fine when run under a batch system. The only change you should make is not to set the maximum number of jobs.

config.default_jobrunner = JobRunner(parallel=True)

Technical

The execution of the job is implemented as a Slurm job step, which may need to wait for free resources before actually starting. As Slurm is taking care of limiting the number of simultaneously executing jobs, we no longer need to do that through the PLAMS JobRunner and can therefore skip the maxjobs argument of its constructor.

Furthermore you may need to wrap the call to the PLAMS launch script in a job script, in which we recommend you cd to the directory from which the job was submitted. This makes sure you will find the PLAMS working directory in the normal location. (If $AMSBIN is not already in your environment, this is also the place to set up the AMS environment.)

#!/bin/sh
cd "$SLURM_SUBMIT_DIR"
$AMSBIN/plams myscript.plms

The above job script can then simply be submitted to the batch system:

sbatch [...] myscript.sh

Alternatively you can also skip the job script, and submit the PLAMS launch script itself:

sbatch [...] --chdir=. $AMSBIN/plams myscript.plms

Warning

The integration of PLAMS, AMS and Slurm will only work on Slurm versions >=15. Furthermore AMS needs to use an MPI implementation that is integrated with the Slurm. This is the case for the IntelMPI builds of AMS, but not the OpenMPI builds. Please refer to the installation manual for details on the capabilities of the different MPI versions. If the batch system integration does not work for you, you can still run PLAMS scripts via the batch system, but you will be restricted to running on a single node and will need to use the maxjobs argument in the constructor of the JobRunner to limit the number of simultaneously running jobs.