# Automating Workflows¶

Identifying systems with certain physical properties out of a large database of molecules is a typical task that can be easily automatized. In this tutorial we will show how this can be achieved with a simple PLAMS script.

Note: this tutorial focuses on scripting aspects rather than studying physical phenomena. Thus, the computational methods used might not be entirely appropriate to describe the physical properties of interest.

## Introducing the case study¶

Imagine that, out of a large database of test systems, we want to identify those molecules that absorb light with an energy smaller than 4 eV.

A simple approach would be to calculate the excitation energies (and the corresponding oscillator strengths) using ADF’s time-dependent DFT (TD-DFT) for all the systems of the database. As TD-DFT is a computationally expensive method, this procedure can be quite demanding for larger numbers of molecules.

A feasible approach may thus be realized by pre-screening the large database of molecules first, e.g. by using a less accurate but more efficient method like time-dependent DFTB (TD-DFTB). The (computationally expensive) TD-DFT calculations with ADF need then to be conducted only for those systems exhibiting the most promising adsorption energies in the previous TD-DFTB calculations. Thus, the basic workflow may look as follows:

Step 1
• Calculate excitation energies and oscillator strengths with TD-DFTB.
• Select those molecules with electronic excitations of energies below 6 eV and nonvanishing oscillator strengths (since TD-DFTB is less accurate than TD-DFT, we opt for a larger energy threshold in this step). This should significantly reduce the number of molecules to be considered in the following steps.
Step 2
• Optimize the structures of the promising molecules with ADF.
Step 3
• Compute the electronic excitations energies using ADF’s TD-DFT and select the molecules with excitations of energies below 4 eV and significant oscillator strengths.

Depending on your command of the Python language, you may want to read the following Python tutorials before proceeding:

## Workflow script¶

Click here to download a tarball file (workflow_tutorial.tar) containing the scripts workflow.py, my_settings.py and a folder molecules with the molecular structures in .xyz format.

You can extract the files by typing in your terminal:

tar -xf workflow_tutorial.tar


The workflow.py script performs the three steps described above:

# workflow.py
# Workflow automation with PLAMS

import os
import sys
sys.path.append(os.getcwd()) # make sure we can import from the current dir
import my_settings # my_settings.py contains some predefined settings

def get_good_excitations(job, energy, osc_str):
"""Returns the excitations satisfying the condition e < en_thresh and o > osc_thresh.
Units: energies in eV and oscillator strengths in Debye"""
good_excitations = []
# Before reading the results, make sure that the job terminated normally
if job.check():
# The results on the binary KF file are in atomic units
exc = [Units.convert(e,'au','eV')
for e in job.results.readkf('Excitations SS A','excenergies')]
osc = [Units.convert(o,'au','Debye')
for o in job.results.readkf('Excitations SS A','oscillator strengths')]
for e,o in zip(exc,osc):
# Check whether the excitation satisfies our condition:
if e < energy and o > osc_str:
good_excitations.append((e,o))
return good_excitations

def get_opt_molecule(self):
return self.get_molecule('Molecule','Coords')

def get_opt_molecule(self):
return self.get_molecule('Geometry','xyz')

################################ Step 0 ################################

# Create a list of Molecules from the .xyz files in the directory 'molecules':

molecules_dir = 'molecules'

molecules = []
# Loop over all the file in the directory
for mol_name in os.listdir(molecules_dir):
mol = Molecule(os.path.join(molecules_dir,mol_name))
# the properties attribute of the Molecule object is a PLAMS’ Settings object
mol.properties.name = os.path.splitext(mol_name)[0]
molecules.append(mol)

################################ Step 1 ################################

print("Step 1: optimize molecules and calculate excitations with DFTB")

promising_molecules = []
for mol in molecules:
# The settings are defined in the module my_settings,
# which we imported at the beginning of our script
dftb_job = DFTBJob(molecule=mol, settings=my_settings.dftb_GO_plus_excitations(),
name='DFTB_' + mol.properties.name)
dftb_job.run()
good_excitations = get_good_excitations(dftb_job, energy=6.0, osc_str=1.0E-4)
if good_excitations:
print ("Found a molecule with a promising excitation: " + mol.properties.name)
promising_molecules.append(dftb_job.results.get_opt_molecule())

print("Number of promising molecules: %i " % len(promising_molecules))

################################ Step 2 ################################

print("Step 2: Geometry optimizations with ADF")

optimized_molecules = []
for mol in promising_molecules:

# Before reading the results, make sure that the job terminated normally

################################ Step 3 ################################

print("Step 3: Calculate the excitations with ADF")

for mol in optimized_molecules:
if good_excitations:
print ("Found a molecule with a good excitation: " + mol.properties.name)
print ("Excitations (energy [eV], oscillator strength [Debye]:" + str(good_excitations))


Go to the folder workflow_tutorial and execute the script by typing:

\$ADFBIN/plams workflow.py


The calculation will take about a minute and should yield:

Step 1: optimize molecules and calculate excitations with DFTB
[13:34:33] Job DFTB_AlF3 started
...
[13:34:41] Job DFTB_S2Cl2 finished with status 'successful'
Found a molecule with a promising excitation: S2Cl2
Number of promising molecules: 2
Step 2: Geometry optimizations with ADF
...
[13:34:53] Job ADF_GO_S2Cl2 finished with status 'successful'
Step 3: Calculate the excitations with ADF
...
[13:35:18] Job ADF_EXCI_S2Cl2 finished with status 'successful'
Found a molecule with a good excitation: S2Cl2
Excitations (energy [eV], oscillator strength [Debye]):
[(3.426, 0.011), (3.563, 0.015), (3.568, 0.001)]


In this simple example the large database of molecules consists of 5 systems. 2 of the 5 molecules pass the TD-DFTB screening step (Step 1), and eventually only one molecule, S2Cl2, has electronic excitations satisfying our condition of an excitation energy smaller than 4 eV with a significant corresponding oscillator strength.

## Settings library¶

The Settings objects specifying the input option for the computational engines are defined in my_settings.py. We encourage you to develop your own library of input options.

# my_settings.py
# Small library of predefined settings

from scm.plams import *

# Simple immutable settings:

def dftb_GO_plus_excitations():
sett = Settings()
sett.input.DFTB.ResourcesDir = 'QUASINANO2015'
sett.input.Properties.Excitations.TDDFTB.calc = 'singlet'
sett.input.Properties.Excitations.TDDFTB.lowest = 10
return sett

# You can improve flexibility by having optional arguments:

sett = Settings()
sett.input.Basis.Type = basis
sett.input.XC['_1'] = XC
sett.input.NumericalQuality = 'Basic'
sett.input.Geometry
return sett

sett = Settings()
sett.input.Basis.Type = basis
sett.input.XC['_1'] = XC
sett.input.Excitations.lowest = n_excitations
sett.input.Excitations.OnlySing = ''
sett.input.NumericalQuality = 'Basic'
sett.input.Symmetry = 'NoSym'
return sett


## Miscellaneous remarks¶

• Before accessing a Results object of a PLAMS’ job you should assert the correct termination of the corresponding job via the check() member function:

job.run()

if job.check():