Cookbook

This is a collection of code snippets showing how to perform recurrent PLAMS tasks.

Settings and input

Create an input block with an header

These Settings

sett = Settings()
sett.input.ams.SomeInputBlock['_h'] = 'MyHeader'
sett.input.ams.SomeInputBlock.SomeOption = 2

will generate the following text input when used for an AMSJob:

SomeInputBlock MyHeader
    SomeOption 2
End

Create an empty input block

These Settings

sett = Settings()
sett.input.ams.SomeInputBlock

will generate the following text input when used for an AMSJob:

SomeInputBlock
End

Convert an AMS text input into an AMS job

The from_input() function can be used to convert text input into AMSJob instances.

job = AMSJob.from_input(name='jobname', text_input='''
Task SinglePoint
System
    Atoms
        H 0. 0. 0.
        H 0. 0. 1.
    End
End
Engine Band
    Basis
        Type DZ
        Core None
    End
EndEngine
''')

Convert an AMS text input into settings object

The Settings object corresponding to a piece of text input can easily be obtained as the .settings attribute of the AMSJob instance returned by the from_input() method.

settings = AMSJob.from_input("""
Task SinglePoint
Engine Band
    Basis
        Type DZ
        Core None
    End
EndEngine
""").settings

Convert an AMS .run file into an AMS job

The from_inputfile() method can be used to convert a .run file generated by the AMS GUI into a PLAMS AMSJob.

job = AMSJob.from_inputfile('/path/to/job.run')

Note

This function does not work on PLAMS-generated .run files. You can instead use the PLAMS-generated .in file.

Specify paths to files in the input

With PLAMS, you cannot specify relative paths to input files, because every PLAMS job launches in a new directory, which makes the relative paths invalid. To specify an absolute path, use os.path.abspath:

import os

sett = Settings()
sett.input.reaxff.forcefield = os.path.abspath('../my-forcefield.ff')

Restart from a previous job

To use restart features in AMS, for example the EngineRestart, or to read the InitialVelocities from the final velocities of a previous molecular dynamics run, you can use a convenient shortcut and simply assign the job to the corresponding settings entry:

sett = Settings()
sett.input.ams.EngineRestart = (previous_ams_job, 'engine') # resolves to the engine.rkf

sett2 = Settings()
sett2.input.ams.MolecularDynamics.InitialVelocities.Type = 'FromFile'
sett2.input.ams.MolecularDynamics.InitialVelocities.File = previous_ams_job # resolves to ams.rkf

Alternatively, call the rkfpath() method on the previous job’s AMSResults:

sett = Settings()
sett.input.ams.EngineRestart = previous_ams_job.results.rkfpath(file='engine')

sett2 = Settings()
sett2.input.ams.MolecularDynamics.InitialVelocities.Type = 'FromFile'
sett2.input.ams.MolecularDynamics.InitialVelocities.File = previous_ams_job.results.rkfpath()

Molecules

Generate a molecule from a SMILES string

See function from_smiles().

# Compute 10 conformers, optimize with UFF and pick the lowest in energy.
ethane = from_smiles('C-C', nconfs=10, forcefield='uff')[0]

Load all files in a folder as molecules

See function read_molecules().

molecules = read_molecules('/some_path/folder_containing_structure_files/')

for name, mol in molecules.items():
    print("Name of the file (without extension): ", name)
    print(mol)

Counting rings

Rings inside molecules can be counted in various ways, which are not all giving the same results. With the help of the RDKit library, a vast variety of ring counting approaches is readily available. The general approach to using these functions in a PLAMS scripts is to convert your PLAMS Molecule into an RDKit molecule, see the page on the RDKit interface. This is how one searches for the smallest set of rings in a molecule:

# import RDKit
from rdkit import Chem

# create a PLAMS molecule and convert it to an RDKit Mol
dicyclopentadiene = from_smiles('C1C=CC2C1C3CC2C=C3')
rdmol = to_rdmol(dicyclopentadiene)

# Calculate smalles set of rings
for atoms in Chem.GetSymmSSSR(rdmol):
     print ([atom_id for atom_id in atoms], len(atoms))

For more information see also the RDKit manual.

Extracting Results

You can use the following snippets to retrieve results after running the required calculations:

Directly from Functions

Results can be either red from previous calculations (see Accessing Old Jobs) or from an AMSResults instance of a computation just executed within the same workflow. In either case an AMSResults object should be present at runtime:

myAMSJob.run()
myAMSResults = myAMSJob.results if myAMSJob.ok() else None

Warning

Access to any results data should only occur under the condition that AMSJob.ok() indicate a successful termination of the computation.

Examples: Total Energy and Final Structure

Multiple functions of the AMSResults API allow for simple access of the most common results.

myAMSEnergy = myAMSResults.get_energy(unit='au')

myAMSStructure = myAMSResults.get_main_molecule()

AMSResults API Functions

The following members of an AMSResults instance can be used as shown in the above examples to read results:

Property Function Return Type Details
Structure get_molecule(section) Molecule Structure from section
  get_input_molecule() Molecule Input structure
  get_main_molecule() Molecule Final structure from any AMS task
  get_history_molecule(step) Molecule Structure from history section at step # step
Energy get_energy() Float Final energy
Gradients get_gradients() Array (numpy) Gradients from engine calculation
Stress tensor get_stresstensor() Array (numpy) Stress tensor from periodic engine calculation
Hessian get_hessian() Array (numpy) Hessian from frequency calculation (AMS/engine)
Elastic tensor get_elastictensor() Array (numpy) Elastic tensor from periodic calculation
Frequencies get_frequencies() Array (numpy) Vibrational frequencies
Atomic Charges get_charges() Array (numpy) Atomic partial charges
Dipole vector get_dipolemoment() Array (numpy) Electric dipole moment
Nuclear gradients of dipole vector get_dipolegradients() Array (numpy) Nuclear Gradients of Electric dipole moment

From the RKF Interface

Other properties not listed in the table above should be retrieved using the readrkf() function:

myProperty = myAMSResults.readrkf(section, variable)

It is the responsibility of the user to provide the correct names for section and variable under which the required result is stored in the rkf file.

Finding Section/Variable Pairs

Looking up the names of the needed sections and variable within rkf files is typically needed for more intricate properties when writing a new PLAMS workflow. There are two main approaches to search for this information.

From Python Directories

The AMSResults member function get_rkf_skeleton() returns a dictionary containing the available sections as keys and the containing variable names as values

KFBrowser

KFBrowser is a GUI module used to inspect rkf files.

1. Open KFBrowser in the GUI via SCM → KFBrowser
2. By default KFBrowser opens the ams.rkf file. Where neccessary, switch to File → open → <engine>.rkf
3. Press ctrl + e or select File → Expert Mode to display the stored file contents
4. Find the entry of interest. While this is a sometimes not trivial step, most often the required variable is found in either the Properties or AMSResults sections.
5. Once found, the names for section and variable listed in the rkf file directly corresponds to the section/variable pair to be used in the readrkf function as shown above.

Note

When reading results from a different rkf file than ams.rkf the filename has to be specified as:

myEngineProperty = myAMSResults.readrkf(section, variable, file=<engine>)

whereas <engine> corresponds to the file <engine>.rkf present in the calculation directory.

From molecular dynamics trajectories

General MD properties

The KFHistory class can be used to iterate through the History or MDHistory of a trajectory. In this example the energy, temperature and pressure per frame are read and printed.

kf = KFReader(mdjob.results['ams.rkf'])
hist = KFHistory(kf, "History")
mdhist = KFHistory(kf, "MDHistory")

frame = 0
for E, T, p in zip(hist.iter("Energy"), mdhist.iter("Temperature"), mdhist.iter("Pressure")):
    frame += 1
    print("Frame: {} Energy: {} Temperature: {} Pressure: {}".format(frame, E, T, p))

Properties that can be iterated in this way are

Table 1 General properties in section History
Property Return type Unit
Coords List of float bohr
nLatticeVectors Int n.a.
LatticeVectors List of float bohr
Energy Float hartree
Gradients List of float hartree/bohr
StressTensor List of float atomic units

Note

For AMS MD simulations you must set MolecularDynamics.Trajectory.WriteGradients = "True" to store the gradients on the ams.rkf file.

Table 2 General MD properties in section MDHistory
Property Return type Unit
Step Integer n.a.
Time Float fs
TotalEnergy Float Hartree
PotentialEnergy Float Hartree
KineticEnergy Float Hartree
Temperature Float Kelvin
ConservedEnergy Float Hartree
Velocities List of float bohr/fs
Charges List of float n.a.
PressureTensor List of float hartree/bohr3
Pressure Float hartree/bohr3
Density Float dalton/bohr3
Number of molecules Float n.a.

To read a single property into a numpy array, you can run

import numpy as np

# mdjob is a finished AMSJob
coords = mdjob.results.get_history_property('Coords', history_section='History')
coords = np.array(coords).reshape(len(coords), -1, 3) # in bohr
print(coords.shape)

Set history_section='MDHistory' to read from the MDHistory section.

Molecules from trajectories

The coordinates of an MD trajectory can efficiently be obtained by creating an RKFTrajectoryFile. To create an instance of RKFTrajectoryFile, simply pass the according ams.rkf file to it. In this example, the atomic coordinates and lattice vectors are read via RKFTrajectoryFile while the PLAMS Molecule function get_center_of_mass() to calculate the center of mass for every frame.

rkf = RKFTrajectoryFile(mdjob.results['ams.rkf'])
mol = rkf.get_plamsmol()

for i in range(rkf.get_length()):
    coords,cell = rkf.read_frame(i,molecule=mol)
    print(coords, cell, mol.get_center_of_mass())

It is also possible to iterate through the History section of trajectory file. This can be useful in cases were the numbers of atoms is changing per frame or the coordinates per single molecule are needed. Here’s an example where the molecule types present in that particular frame are read for every frame:

kf = KFReader(mdjob.results['ams.rkf'])
mdhist = KFHistory(kf, "MDHistory")
hist = KFHistory(kf, "History")

# get number of distinct molecule types and all their formulas
number_of_molecules = kf.read('Molecules','Num molecules')
formulas = [ kf.read('Molecules',f'Molecule name {i+1}') for i in range(number_of_molecules) ]

for mols, step in zip( hist.iter("Mols.Type"), mdhist.iter("Step")):
    line = f"{step:8d} "
    for i in sorted(set(mols)): line += f"{formulas[i-1]:s} "
    print(line)

Accessing Old Jobs

The following illustrates how to load data from previously executed jobs.

Binding Native PLAMS Jobs

Warning

The jobs should be loaded with a version of PLAMS that is consistent with the version originally used to run the jobs.

From an existing PLAMS working directory with the contents

OLDDIR/
├── OLDJOB1/
|   ├── ams.log
|   ├── ams.rkf
|   ├── OLDJOB1.dill
|   ├── OLDJOB1.err
|   ├── OLDJOB1.in
|   ├── OLDJOB1.out
|   ├── OLDJOB1.run
|   ├── engine.rkf
|   ├── output.xyz
├── input
└── logfile

we can bind an instance of the AMSJob class by making use of the .dill file. The AMSJob object in turn contains a results object, which gives access to the data previously calculated. This can be achieved using the load() function as illustrated in the following snippet:

path       = "OLDDIR/OLDJOB1/OLDJOB1.dill"
single_JOB = load(path)                                       # AMSJob instance
if single_JOB.ok():
   energy     = single_JOB.results.get_energy()               # load the desired properties
   structure  = single_JOB.results.get_main_molecule()
   propertyX  = single_JOB.results.readrkf('AMSResults', 'DipoleMoment', file='engine')

More often than not, the working directory will include multiple individual subdirectories, each containing individual PLAMS job.

OLDDIR/
├── OLDJOB1/
|   ├── ams.log
|   ├── ams.rkf
|   ├── OLDJOB1.dill
|   ├── OLDJOB1.err
|   ├── OLDJOB1.in
|   ├── OLDJOB1.out
|   ├── OLDJOB1.run
|   ├── engine.rkf
|   ├── output.xyz
├── OLDJOB2/
|   ├── ams.log
|   ├── ams.rkf
|   ├── OLDJOB2.dill
|   ├── OLDJOB2.err
|   ├── OLDJOB2.in
|   ├── OLDJOB2.out
|   ├── OLDJOB2.run
|   ├── engine.rkf
|   ├── output.xyz
├── OLDJOB3/
|   ├── ams.log
|   ├── ams.rkf
|   ├── OLDJOB3.dill
|   ├── OLDJOB3.err
|   ├── OLDJOB3.in
|   ├── OLDJOB3.out
|   ├── OLDJOB3.run
|   ├── engine.rkf
|   ├── output.xyz
├── input
└── logfile

These can be loaded using the load_all() function and by providing only the path to the top-level directory:

path       = "OLDDIR"
all_JOBS   = load_all(path)

Note that load_all() wraps the load() function used above and therefore requires existing .dill files in each of the loaded subdirectories. The load_all() function yields a dictionary with the paths of the .dill files as keys and the corresponding job object as values:

print(all_JOBS)
{'/home/user/OLDDIR/OLDJOB1/OLDJOB1.dill': <scm.plams.interfaces.adfsuite.ams.AMSJob object at 0x7f0baad340b8>,
 '/home/user/OLDDIR/OLDJOB2/OLDJOB2.dill': <scm.plams.interfaces.adfsuite.ams.AMSJob object at 0x7f0baacf24a8>,
 '/home/user/OLDDIR/OLDJOB3/OLDJOB3.dill': <scm.plams.interfaces.adfsuite.ams.AMSJob object at 0x7f0baad06cf8>}

We can now access these AMSJob instances:

for this_JOB in all_JOBS.values():
   if this_JOB.ok():
      energy     = this_JOB.results.get_energy()
      structure  = this_JOB.results.get_main_molecule()
      propertyX  = this_JOB.results.readrkf('AMSResults', 'DipoleMoment', file='engine')

Binding old RKF Files

In cases where the .dill files are not available any more, it is still possible to load the contents of previously generated .rkf files into a PLAMS workflow:

path       = "OLDDIR/OLDJOB1/"
ext_JOB    = AMSJob.load_external(path)
if ext_JOB.ok():
   energy     = ext_JOB.results.get_energy()
   structure  = ext_JOB.results.get_main_molecule()

If the .rkf file does originate from some other source than any of the direct AMS engines, also an instance of the more generic SingleJob class can be used:

path       = "OLDDIR/OLDJOB1/ams.rkf"
ext_JOB    = SingleJob.load_external(path)

The downside of this latter approach is that the accessibility to the data is very limited and has to be implemented mostly in terms of pattern-matching searches in the output files.

An alternative way is to make use of the KFReader class:

path       = "OLDDIR/OLDJOB1/ams.rkf"
rkf_reader = KFReader(path)
n_steps    = rkf_reader.read("History", "nEntries")
energy     = rkf_reader.read("History", "Energy({})".format(n_steps))
structure  = rkf_reader.read("History", "Coords({})".format(n_steps))

Note that also the KFReader class lacks most of the shortcut functions of a proper AMSResults object so that the access to the data has to be specified manually.

Parallelization

PLAMS supports running multiple jobs in parallel. Details on the synchronization between parallel job executions can be found here. To make sure your PLAMS script can take maximum advantage of parallel job execution there is a simple rule: Make sure to create and run as many jobs as possible before starting to access any results. (This is because the access to results of a job may block until that job has finished, preventing you to submit more independent jobs in the meantime.)

In the common case where there are no dependencies between jobs, this means that we should set up and run all jobs before starting to access any results. The script below shows how to parallelize the trivially parallel task of just executing the same job on a set of molecules.

mols = read_molecules('my_molecules')
# mols is a dictionary mapping filenames (without
# extension) to plams.Molecule instances, e.g.:
# my_molecules/benzene.xyz would become mols['benzene']

sett = Settings()
# ... all your settings go here ...

config.default_jobrunner = JobRunner(parallel=True, maxjobs=8)
sett.runscript.nproc = 4
# run up to 8 jobs (using 4 cores each) in parallel

jobs    = { n: AMSJob(name=n, settings=sett, molecule=m) for n,m in mols.items() }
results = { n: j.run() for n,j in jobs.items() }
# make and run all jobs before accessing any results

for n,r in results.items():
   print(n, r.get_energy() if r.ok() else 'Failed')

Obviously, runscript.nproc could also be set on a per-job basis. This is useful if some of your jobs do not scale well with the number of CPU cores, or if you have jobs of very different computational cost.