4.1. ADF Suite

PLAMS offers interfaces to the three main binaries of the ADF Suite (ADF, BAND and DFTB) as well as to some small utility binaries like DENSF or FCF. All possible input keywords and options are covered, as well as extraction of arbitrary data from the binary files (called KF files) produced by these programs.

4.1.1. ADF, BAND and DFTB

ADF, BAND and DFTB are of course very different programs, but from our perspective they are rather similar. Their input files follow a common structure of blocks and subblocks. They store results as binary files in KF format and print a human-readable summary of the calculation to the standard output. They also share command line arguments, error messages etc. Thanks to that, the Python code responsible for creating, running and examining ADF, BAND and DFTB jobs overlaps a lot and can be grouped together in abstract classes. SCMJob and SCMResults are subclasses of, respectively, SingleJob and Results and serve as bases for concrete classes: ADFJob, BANDJob, DFTBJob, ADFResults, BANDResults and DFTBResults. Code contained in these concrete classes describes small technical differences and is used only internally, so they are omitted in the API specification below. From the user’s perspective they all follow the common interface defined by SCMJob and SCMResults. That means in your scripts you would create instances of ADFJob, BANDJob or DFTBJob, but the methods that you can use with them (and their corresponding results) are those of SCMJob and SCMResults.

4.1.1.1. Preparing input

Although input files for ADF, BAND and DFTB use different sets of keywords, they all have the same logical structure: they consist of blocks and subblocks containing keys and values. That kind of structure can be easily reflected by Settings objects since they are built in a similar way.

The input file is generated based on input branch of job’s Settings. All data present there is translated to input contents. Nested Settings instances define blocks and subblocks, as in the example below:

>>> myjob = ADFJob(molecule=Molecule('water.xyz'))
>>> myjob.settings.input.basis.type = 'DZP'
>>> myjob.settings.input.basis.core = 'None'
>>> myjob.settings.input.basis.createoutput = 'None'
>>> myjob.settings.input.scf.iterations = 100
>>> myjob.settings.input.scf.converge = '1.0e-06 1.0e-06'
>>> myjob.settings.input.save = 'TAPE13'

Input file created during execution of myjob looks like:

atoms
    #coordinates from water.xyz
end

basis
  core None
  createoutput None
  type DZP
end

save TAPE13

scf
  converge 1.0e-06 1.0e-06
  iterations 100
end

As you can see, entries present in myjob.settings.input are listed in alphabetical order. If an entry is a regular key-value pair, it is printed in one line (like save TAPE13 above). If an entry is a nested Settings instance, it is printed as a block and the entries in this instance correspond to the contents of the block. Both keys and values are kept in their original case. Strings put as values can contain spaces, like converge above: the whole string is printed after the key. That makes it possible to handle lines that need to contain more than one key=value pair. If you need to put a key without any value, True or an empty string can be given as a value:

>>> myjob.settings.input.geometry.SP = True
>>> myjob.settings.input.writefock = ''
# translates to:
geometry
  SP
end

writefock

If the value of a particular key is False, that key is omitted. To produce an empty block simply type:

>>> myjob.settings.input.geometry  # this is equivalent to myjob.settings.input.geometry = Settings()
#
geometry
end

The algorithm translating Settings contents into input file does not check the correctness of the data - it simply takes keys and values from Settings instance and puts them in the text file. Due to that you are not going to be warned if you make a typo, use wrong keyword or improper syntax. Beware of that.

>>> myjob.settings.input.dog.cat.apple = 'pear'
#
dog
  cat
    apple pear
  subend
end

Some blocks require (or allow) some data to be put in the header line, next to the block name. Special key _h is helpful in these situations:

>>> myjob.settings.input.someblock._h = 'header=very important'
>>> myjob.settings.input.someblock.key1 = 'value1'
>>> myjob.settings.input.someblock.key2 = 'value2'
#
someblock header=very important
  key1 value1
  key2 value2
end

The order of blocks within the input file and of subblocks within a parent block follows the Settings iteration order, which is lexicographical (however, SCMJob is smart enough to put blocks like DEFINE or UNITS at the top of the input). In rare cases you may want to override this order, for example when you supply the ATOMS block manually, which can be done when automatic molecule handling is disabled (see below). That behavior can be achieved with another type of special key: numbered keys _1, _2 and so on are printed as entire lines at the top of the block, in that order. The numbering has to be consecutive, which is why _4 in the example below is skipped:

>>> myjob.settings.input.block._1 = 'entire line that has to be the first line of block'
>>> myjob.settings.input.block._2 = 'second line'
>>> myjob.settings.input.block._4 = 'I will not be printed'
>>> myjob.settings.input.block.key1 = 'value1'
>>> myjob.settings.input.block.key2 = 'value2'
#
block
  entire line that has to be the first line of block
  second line
  key1 value1
  key2 value2
end

Sometimes one needs to put more instances of the same key within one block, like for example in CONSTRAINTS block in ADF. It can be done by using list of values instead of a single value:

>>> myjob.settings.input.constraints.atom = [1,5,4]
>>> myjob.settings.input.constraints.block = ['ligand', 'residue']
#
constraints
  atom 1
  atom 5
  atom 4
  block ligand
  block residue
end

Finally, in some rare cases key and value pair in the input needs to be printed in a form key=value instead of key value. When value is a string starting with the equal sign, no space is inserted between key and value:

>>> myjob.settings.input.block.key = '=value'
#
block
  key=value
end
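
The translation rules described in this section can be summarised in a short sketch. It is not the code used by SCMJob: a plain nested dict stands in for the Settings instance, the function names serialize and build_input are made up for this illustration, and the special treatment of blocks like DEFINE or UNITS is left out.

```python
def serialize(key, value, indent="", top=True):
    """Render one key/value pair of a nested dict as a list of input lines."""
    if isinstance(value, dict):                    # nested mapping -> block
        header = value.get("_h")                   # optional text for the header line
        lines = [indent + key + (" " + header if header else "")]
        inner = indent + "  "
        i = 1
        while f"_{i}" in value:                    # _1, _2, ... come first, verbatim
            lines.append(inner + str(value[f"_{i}"]))
            i += 1
        for k in sorted(value):                    # remaining keys alphabetically
            if not k.startswith("_"):
                lines.extend(serialize(k, value[k], inner, top=False))
        lines.append(indent + ("end" if top else "subend"))
        return lines
    if isinstance(value, list):                    # list -> the key repeated per element
        return [line for v in value for line in serialize(key, v, indent, top)]
    if value is False:                             # False -> key omitted
        return []
    if value is True or value == "":               # key without a value
        return [indent + key]
    text = str(value)
    sep = "" if text.startswith("=") else " "      # '=value' is glued to the key
    return [indent + key + sep + text]


def build_input(settings):
    """Serialize all top-level entries in alphabetical order."""
    lines = []
    for key in sorted(settings):
        lines.extend(serialize(key, settings[key]))
    return "\n".join(lines)


settings = {
    "someblock": {"_h": "header=very important", "key1": "value1", "key2": "value2"},
    "constraints": {"atom": [1, 5, 4], "block": ["ligand", "residue"]},
}
print(build_input(settings))
```

Like the real algorithm, the sketch performs no validation at all: whatever is stored in the dict ends up in the text.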

Sometimes a value of a key in the input file needs to be a path to some file, usually a KF file with results of some previous calculation. Of course such a path can be given explicitly (newjob.restart = '/home/user/science/plams.12345/oldjob/oldjob.t21'), but for the user’s convenience instances of SCMJob or SCMResults (or KFFile directly) can also be used. The algorithm will detect them and use an absolute path to the main KF file instead:

>>> myjob.settings.input.restart = oldjob
>>> myjob.settings.input.fragment.frag1 = fragjob
#
restart /home/user/science/plams.12345/oldjob/oldjob.t21
fragment
  frag1 /home/user/science/fragmentresults/somejob/somejob.t21
end
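
One way to picture this substitution is a lookup table that maps object types to functions producing their textual form, which is the role the special argument of _serialize_input() is documented to play. The sketch below only illustrates that idea: FakeJob, its kfpath() method and to_text are hypothetical names, not part of PLAMS.

```python
import os


class FakeJob:
    """Hypothetical stand-in for a finished job that knows where its KF file is."""

    def __init__(self, folder, name, suffix="t21"):
        self.folder, self.name, self.suffix = folder, name, suffix

    def kfpath(self):
        # absolute path of the form <folder>/<name>/<name>.<suffix>
        filename = f"{self.name}.{self.suffix}"
        return os.path.abspath(os.path.join(self.folder, self.name, filename))


# type -> function translating instances of that type into input text
special = {FakeJob: lambda job: job.kfpath()}


def to_text(value, special):
    """Return the string written to the input file for a single value."""
    for cls, translate in special.items():
        if isinstance(value, cls):
            return translate(value)
    return str(value)          # ordinary values are simply converted to strings


oldjob = FakeJob("/home/user/science/plams.12345", "oldjob")
print("restart", to_text(oldjob, special))
```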

A Molecule instance stored in the job’s molecule attribute is automatically processed during input file preparation and printed in the proper format, depending on the program. It is possible to disable that and give molecular coordinates explicitly as entries in myjob.settings.input. Automatic molecule processing can be turned off with myjob.settings.ignore_molecule = True.

4.1.1.2. Special atoms in ADF

In ADF atomic coordinates in atoms block can be enriched with some additional information like special names of atoms (for example in case of using different isotopes) or block/fragment membership. Since usually contents of atoms block are generated automatically based on the Molecule associated with a job, this information needs to be supplied inside the given Molecule instance. Details about every atom can be adjusted separately, by modifying attributes of a particular Atom instance according to the following convention:

  • Atomic symbol is generated based on atomic number stored in atnum attribute of a corresponding Atom. Atomic number 0 corresponds to the “dummy atom” for which the symbol is empty.
  • If an attribute ghost of an Atom is True, the above atomic symbol is prefixed with Gh. (including the dot).
  • If an Atom has an attribute name, its contents are added after the symbol, separated by a dot. Hence setting atnum to 0 and adjusting name makes it possible to put an arbitrary string as the atomic symbol.
  • If an Atom has an attribute fragment its contents are added after atomic coordinates with f= prefix.
  • If an Atom has an attribute block its contents are added after atomic coordinates with b= prefix.

The following example illustrates the usage of this mechanism:

>>> mol = Molecule('xyz/Ethanol.xyz')
>>> mol[0].ghost = True
>>> mol[1].name = 'D'
>>> mol[2].ghost = True
>>> mol[2].name = 'T'
>>> mol[3].atnum = 0
>>> mol[3].name = 'J.XYZ'
>>> mol[4].atnum = 0
>>> mol[4].name = 'J.ASD'
>>> mol[4].ghost = True
>>> mol[5].fragment = 'myfragment'
>>> mol[6].block = 'block1'
>>> mol[7].fragment = 'frag'
>>> mol[7].block = 'block2'
>>> myjob = ADFJob(molecule=mol)
#
atoms
      1      Gh.C       0.01247       0.02254       1.08262
      2       C.D      -0.00894      -0.01624      -0.43421
      3    Gh.H.T      -0.49334       0.93505       1.44716
      4     J.XYZ       1.05522       0.04512       1.44808
      5  Gh.J.ASD      -0.64695      -1.12346       2.54219
      6         H       0.50112      -0.91640      -0.80440 f=myfragment
      7         H       0.49999       0.86726      -0.84481 b=block1
      8         H      -1.04310      -0.02739      -0.80544 f=frag b=block2
      9         O      -0.66442      -1.15471       1.56909
end
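
The convention above can be expressed as a small function. The sketch below is an illustration only: the Atom dataclass, the three-element SYMBOLS table and the column widths (chosen to mimic the listing above) are assumptions, not the PLAMS implementation.

```python
from dataclasses import dataclass
from typing import Optional

SYMBOLS = {0: "", 1: "H", 6: "C", 8: "O"}   # tiny excerpt of the periodic table


@dataclass
class Atom:
    """Hypothetical stand-in holding only the attributes discussed above."""
    atnum: int
    coords: tuple
    ghost: bool = False
    name: Optional[str] = None
    fragment: Optional[str] = None
    block: Optional[str] = None


def atom_symbol(atom):
    symbol = SYMBOLS[atom.atnum]             # empty for the dummy atom (atnum 0)
    if atom.name:
        # name follows the symbol after a dot, or replaces an empty symbol
        symbol = f"{symbol}.{atom.name}" if symbol else atom.name
    if atom.ghost:
        symbol = "Gh." + symbol
    return symbol


def atom_line(index, atom):
    x, y, z = atom.coords
    line = f"{index:7d}{atom_symbol(atom):>10}{x:14.5f}{y:14.5f}{z:14.5f}"
    if atom.fragment:
        line += f" f={atom.fragment}"
    if atom.block:
        line += f" b={atom.block}"
    return line


print(atom_line(1, Atom(6, (0.01247, 0.02254, 1.08262), ghost=True)))
print(atom_line(8, Atom(1, (-1.04310, -0.02739, -0.80544), fragment="frag", block="block2")))
```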

4.1.1.3. Preparing runscript

Runscripts for ADF, BAND and DFTB are very simple: each is just a single execution of one of the binaries with proper standard input and output handling. The number of parallel processes (-n parameter) can be adjusted with myjob.settings.runscript.nproc.

4.1.1.4. Results extraction

All three programs print results to the standard output. The output file can be examined with standard text processing tools (grep_output() and awk_output()). Besides that, all calculation details are saved in a binary file in KF format. This file is called TAPE21 for ADF, RUNKF for BAND and dftb.rkf for DFTB. PLAMS renames those files to, respectively, [jobname].t21, [jobname].runkf and [jobname].rkf. Data stored in those files can be accessed using additional methods defined in the SCMResults class.

4.1.1.5. API

class SCMJob(molecule=None, name='plamsjob', settings=None, depend=None)[source]

Abstract class gathering common mechanisms for jobs with ADF Suite programs.

static _atom_symbol(atom)[source]

Return the atomic symbol of atom. Ensure proper formatting for ADFSuite input taking into account ghost and name entries in properties of atom.

_remove_mol()[source]

Remove from settings.input all entries added by _serialize_mol(). Abstract method.

_serialize_input(special)[source]

Transform all contents of settings.input branch into string with blocks, keys and values.

On the highest level alphabetic order of iteration is modified: keys occurring in attribute _top are printed first. Special values can be indicated with special argument, which should be a dictionary having types of objects as keys and functions translating these types to strings as values.

Automatic handling of molecule can be disabled with settings.ignore_molecule = True.

_serialize_mol()[source]

Process Molecule instance stored in molecule attribute and add it as relevant entries of settings.input branch. Abstract method.

check()[source]

Check if termination status variable from General section of main KF file equals NORMAL TERMINATION.

get_runscript()[source]

Generate a runscript. Returned string is of the form:

$ADFBIN/name [-n nproc] <jobname.in [>jobname.out]

name is taken from the class attribute _command. -n flag is added if settings.runscript.nproc exists. [>jobname.out] is used based on settings.runscript.stdout_redirect.
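
As an illustration of how such a line is assembled, here is a minimal sketch. The function runscript and its arguments are hypothetical; the binary name "adf" in the usage line is an assumed value of _command.

```python
def runscript(command, jobname, nproc=None, stdout_redirect=False):
    """Build a line of the form $ADFBIN/name [-n nproc] <jobname.in [>jobname.out]."""
    line = f"$ADFBIN/{command}"
    if nproc is not None:              # -n only when a process count is given
        line += f" -n {nproc}"
    line += f" <{jobname}.in"          # input is always read from standard input
    if stdout_redirect:                # output redirection is optional
        line += f" >{jobname}.out"
    return line


print(runscript("adf", "water", nproc=4, stdout_redirect=True))
```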

hash_input()[source]

Calculate the hash of the input file.

All instances of SCMJob or SCMResults present as values in settings.input branch are replaced with hashes of corresponding job’s inputs.

class SCMResults(job)[source]

Abstract class gathering common mechanisms for results of ADF Suite programs.

_atomic_numbers_input_order()[source]

Return a list of atomic numbers, in the input order. Abstract method.

_export_attribute(attr, other)[source]

If attr is a KF file take care of a proper path. Otherwise use parent method. See Results._copy_to for details.

_get_single_value(section, variable, output_unit, native_unit='au')[source]

A small method template for all the single number “get_something()” methods extracting data from main KF file. Returned value is converted from native_unit to output_unit.

_int2inp()[source]

Obtain mapping from internal atom order to the input one. Abstract method.

_kfpath()[source]

Return the absolute path to the main KF file.

_kfpresent()[source]

Check if this instance has a valid _kf attribute.

collect()[source]

Collect files present in the job folder. Use parent method from Results, then create an instance of KFFile for the main KF file and store it as _kf attribute.

get_molecule(section, variable, unit='bohr', internal=False, n=1)[source]

Read molecule coordinates from section/variable of the main KF file.

Returned Molecule instance is created by copying a molecule from associated SCMJob instance and updating atomic coordinates with values read from section/variable. The format in which coordinates are stored is not consistent for all programs or even for different sections of the same KF file. Sometimes coordinates are stored in bohr, sometimes in angstrom. The order of atoms can be either input order or internal order. These settings can be adjusted with unit and internal parameters. Some variables store more than one geometry, in those cases n can be used to choose the preferred one.

get_properties()[source]

Return a dictionary with all the entries from Properties section in the main KF file.

newkf(filename)[source]

Create new KFFile instance using file filename in the job folder.

Example usage:

>>> res = someadfjob.run()
>>> tape13 = res.newkf('$JN.t13')
>>> print(tape13.read('Geometry', 'xyz'))

readkf(section, variable)[source]

Read data from section/variable of the main KF file.

The type of the returned value depends on the type of variable defined inside KF file. It can be: single int, list of ints, single float, list of floats, single boolean, list of booleans or string.

refresh()[source]

Refresh the contents of files list. Use parent method from Results, then look at all attributes that are instances of KFFile and check if they point to existing files. If not, try to reinstantiate them with current job path (that can happen while loading a pickled job after the entire job folder was moved).

4.1.2. Other tools: Densf, FCF

Apart from the main computational programs mentioned above, ADFSuite offers a range of small utility tools that can be used to obtain more specific results. Those tools are usually based on a prior run of one of the main programs and need the KF file produced by it as a part of the input.

From the functional point of view these tools are very similar to ADF, BAND and DFTB. Their results are stored in KF files and their input files follow the same structure of blocks, keys and values. Because of that the same classes (SCMJob and SCMResults) are used as bases and hence preparation, running and results extraction for utility tools follow the rules described above in section 4.1.1 (ADF, BAND and DFTB).

The main difference is that usually utility jobs don’t need molecular coordinates as part of the input (they extract this information from previous calculation’s KF file). So no Molecule instance is needed and the molecule attribute of the job object is simply ignored. Because of that get_molecule() method does not work with FCFResults, DensfResults etc.

Below you can find the list of dedicated job classes that are currently available. Details about input specification for those jobs can be found in corresponding part of ADF Suite documentation.

class FCFJob(inputjob1=None, inputjob2=None, name='plamsjob', settings=None, depend=None)[source]

A class representing calculation of Franck-Condon factors using fcf program.

Two new attributes are introduced: inputjob1 and inputjob2. They are used to supply KF files from previous runs to fcf program. The value can either be a string with a path to KF file or an instance of any type of SCMJob or SCMResults (in this case the path to corresponding KF file will be extracted automatically). If the value of inputjob1 or inputjob2 is None, no automatic handling occurs and user needs to manually supply paths to input jobs using proper keywords placed in myjob.settings.input (STATES or STATE1 and STATE2).

The resulting TAPE61 file is renamed to jobname.t61.

class DensfJob(inputjob=None, name='plamsjob', settings=None, depend=None)[source]

A class representing calculation of molecular properties on a grid using densf program.

A new attribute inputjob is introduced to supply KF file from previously run job. The value can either be a string with a path to KF file or an instance of any type of SCMJob or SCMResults (in this case the path to corresponding KF file will be extracted automatically). If the value of inputjob is None, no automatic handling occurs and user needs to manually supply path to input job using INPUTFILE keyword placed in myjob.settings.input.

The resulting TAPE41 file is renamed to jobname.t41.

4.1.3. KF files

KF is the main format for storing binary data used in all ADFSuite programs. PLAMS offers an easy and efficient way of accessing the data stored in existing KF files, as well as modifying and creating them.

4.1.3.1. KFFile

class KFFile(path, autosave=True)[source]

A class for reading and writing binary files in KF format.

This class acts as a wrapper around KFReader collecting all the data written by user in some “temporary zone” and using Fortran binaries udmpkf and cpkf to write this data to the physical file when needed.

The constructor argument path should be a string with a path to an existing KF file or a new KF file that you wish to create. If a path to existing file is passed, new KFReader instance is created allowing to read all the data from this file.

When write() method is used, the new data is not immediately written to a disk. Instead of that, it is temporarily stored in tmpdata dictionary. When method save() is invoked, contents of that dictionary are written to a physical file and tmpdata is emptied.

Other methods like read() or delete_section() are aware of tmpdata and work correctly regardless of whether save() was called or not.

By default, save() is automatically invoked after each write(), so the physical file on disk is always up to date. This behavior can be adjusted with the autosave constructor parameter. Having autosave enabled is usually a good idea. However, if you need to write a lot of small pieces of data to your file, the overhead of calling udmpkf and cpkf after every write() can lead to significant delays. In such a case it is advised to disable autosave and call save() manually, when needed.
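
The buffering scheme described above can be sketched without any KF machinery. In the illustration below a plain dict plays the role of the file on disk and the class BufferedStore is a hypothetical stand-in, not KFFile itself; the flushes counter marks the points where the real class would have to call the external binaries.

```python
class BufferedStore:
    """Write buffer in front of a slow storage (here: a dict instead of a file)."""

    def __init__(self, disk, autosave=True):
        self.disk = disk          # {section: {variable: value}}, the "physical file"
        self.tmpdata = {}         # pending writes, same layout
        self.autosave = autosave
        self.flushes = 0          # number of expensive save operations performed

    def write(self, section, variable, value):
        self.tmpdata.setdefault(section, {})[variable] = value
        if self.autosave:
            self.save()

    def read(self, section, variable):
        # pending data shadows whatever is already stored on disk
        if variable in self.tmpdata.get(section, {}):
            return self.tmpdata[section][variable]
        return self.disk[section][variable]

    def save(self):
        if not self.tmpdata:
            return
        for section, variables in self.tmpdata.items():
            self.disk.setdefault(section, {}).update(variables)
        self.tmpdata = {}
        self.flushes += 1


disk = {}
kf = BufferedStore(disk, autosave=False)
for i in range(1000):
    kf.write("Data", f"x{i}", i)      # nothing touches the "disk" yet
value_before_save = kf.read("Data", "x7")
kf.save()                             # a single flush instead of a thousand
```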

Dictionary-like bracket notation can be used as a shortcut to read and write variables:

mykf = KFFile('someexistingkffile.kf')
#all three below are equivalent
x = mykf['General%Termination Status']
x = mykf[('General','Termination Status')]
x = mykf.read('General','Termination Status')

#all three below are equivalent
mykf['Geometry%xyz'] = somevariable
mykf[('Geometry','xyz')] = somevariable
mykf.write('Geometry','xyz', somevariable)

read(section, variable)[source]

Extract and return data for a variable located in a section.

For single-value numerical or boolean variables returned value is a single number or bool. For longer variables this method returns a list of values. For string variables a single string is returned.

write(section, variable, value)[source]

Write a variable with a value in a section. If such a variable already exists in this section, the old value is overwritten.

save()[source]

Save all changes stored in tmpdata to physical file on a disk.

delete_section(section)[source]

Delete the entire section from this KF file.

sections()[source]

Return a list with all section names, ordered alphabetically.

read_section(section)[source]

Return a dictionary with all variables from a given section.

Note

Some sections can contain very large amount of data. Turning them into dictionaries can cause memory shortage or performance issues. Use this method carefully.

__getitem__(name)[source]

Allow the use of x = mykf['section%variable'] or x = mykf[('section','variable')] instead of x = mykf.read('section', 'variable').

__setitem__(name, value)[source]

Allow the use of mykf['section%variable'] = value or mykf[('section','variable')] = value instead of mykf.write('section', 'variable', value).

__iter__()[source]

Iteration yields pairs of section name and variable name.

static _split(name)[source]

Ensure that a key used in bracket notation is of the form 'section%variable' or ('section','variable'). If so, return a tuple ('section','variable').
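
A key normalisation of this kind could look as follows. This is a sketch under assumptions: the function name split_key is made up, and raising ValueError for malformed keys is a choice made here, since the behavior for invalid keys is not specified above.

```python
def split_key(name):
    """Turn 'section%variable' or ('section', 'variable') into a 2-tuple."""
    if isinstance(name, tuple) and len(name) == 2:
        return name
    if isinstance(name, str) and name.count("%") == 1:
        section, variable = name.split("%")
        return section, variable
    # assumption: malformed keys are rejected with an exception
    raise ValueError(f"expected 'section%variable' or a 2-tuple, got {name!r}")


print(split_key("General%Termination Status"))
```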

static _str(val)[source]

Return a string representation of val in the form that can be understood by udmpkf.

4.1.3.2. KFReader

class KFReader(path, blocksize=4096, autodetect=True)[source]

An efficient Python-native reader of binary files in KF format.

This class offers read-only access to any fragment of data from a KF file. Unlike other Python KF readers, this one does not use the Fortran binary dmpkf to process KF files, but instead reads and interprets raw binary data straight from the file, on the Python level. That approach results in a significant speedup (by a factor of a few hundred for large files extracted variable by variable).

The constructor argument path should be a string with a path (relative or absolute) to an existing KF file.

blocksize indicates the length of basic KF file block. So far, all KF files produced by any of ADFSuite programs have the same block size of 4096 bytes. Unless you’re doing something very special, you should not touch this value.

The organization of data inside a KF file can depend on the machine on which the file was produced. Two parameters can vary: the length of an integer (32 or 64 bit) and the endianness (little or big). These parameters have to be determined before any reading can take place, otherwise the results will make no sense. If the constructor argument autodetect is True, the constructor attempts to automatically detect the format of a given KF file, which makes it possible to read files created on a machine with a different endianness or integer length. This automatic detection is enabled by default and it is advised to leave it that way. If you wish to disable it, you should set the endian and word attributes manually before reading anything (see the code for details).
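
The following snippet shows why these two parameters matter. It does not reproduce the detection procedure used by KFReader; it only demonstrates, with the standard struct module, how the same eight bytes yield different values under different assumptions.

```python
import struct

# Eight bytes as a program using 64-bit little-endian integers would write 4096.
raw = struct.pack("<q", 4096)

as_le64 = struct.unpack("<q", raw)[0]    # correct guess: one 64-bit little-endian int
as_be64 = struct.unpack(">q", raw)[0]    # wrong endianness: a huge unrelated number
as_le32 = struct.unpack("<2i", raw)      # wrong width: two 32-bit ints instead of one

print(as_le64, as_be64, as_le32)
```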

Note

This class consists of quite technical, low-level code. If you don’t need to modify or extend KFReader, you can safely ignore all private methods. All you need is read() and occasionally __iter__().

read(section, variable)[source]

Extract and return data for a variable located in a section.

For single-value numerical or boolean variables returned value is a single number or bool. For longer variables this method returns a list of values. For string variables a single string is returned.

__iter__()[source]

Iteration yields pairs of section name and variable name.

_autodetect()[source]

Try to automatically detect the format (int size and endian) of this KF file.

_read_block(f, pos)[source]

Read a single block of binary data from position pos in file f.

_parse(block, format)[source]

Translate a block of binary data into list of values in specified format.

format should be a list of pairs (a,t) where t is one of the following characters: 's' for string, 'i' for 32-bit integer, 'q' for 64-bit integer, 'd' for 64-bit float, and a is the number of occurrences (or length of a string).

For example, if format is equal to [(32,'s'),(4,'i'),(2,'d'),(2,'i')], the contents of block are divided into 72-byte chunks (32*1 + 4*4 + 2*8 + 2*4 = 72), possibly dropping the last one if it’s shorter than 72 bytes. Then each chunk is translated to a 9-tuple of a string, 4 ints, 2 floats and 2 ints. A list of such tuples is the returned value.
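
The chunking described above maps naturally onto Python’s struct module. The sketch below illustrates the idea and is not the actual KFReader code: the function name parse_block and the explicit endian argument are assumptions, and strings are returned as raw bytes.

```python
import struct


def parse_block(block, fmt, endian="<"):
    """Split block into fixed-size records described by fmt and unpack each one."""
    # [(32,'s'),(4,'i'),...] -> '<32s4i...'; an explicit endian prefix also
    # disables alignment padding, so the record size is the plain sum of sizes
    layout = endian + "".join(f"{count}{code}" for count, code in fmt)
    record = struct.Struct(layout)
    usable = len(block) - len(block) % record.size    # drop a trailing partial record
    return [record.unpack_from(block, offset)
            for offset in range(0, usable, record.size)]


fmt = [(32, "s"), (4, "i"), (2, "d"), (2, "i")]
one = struct.pack("<32s4i2d2i", b"General".ljust(32), 1, 2, 3, 4, 0.5, 1.5, 7, 8)
# two complete 72-byte records followed by 10 stray bytes that get dropped
records = parse_block(one * 2 + b"\x00" * 10, fmt)
print(len(records), records[0])
```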

_get_data(datablock)[source]

Extract all data from a single data block. Returned value is a 4-tuple of lists, one list for each data type (respectively: int, float, str, bool).

_create_index()[source]

Find and parse relevant index blocks of the KF file to extract the information about location of all sections and variables.

Two dictionaries are populated during this process. _data contains, for each section, a list of triples describing how logical blocks of data are mapped into physical ones. For example, _data['General'] = [(3,6,12), (9,40,45)] means that logical blocks 3-8 of section General are located in physical blocks 6-11 and logical blocks 9-13 in physical blocks 40-44. This list is always sorted via first tuple elements allowing efficient access to arbitrary logical block of each section.

The second dictionary, _sections, is used to locate each variable within its section. For each section, it contains another dictionary with an entry for each variable of this section. So _sections[sec][var] contains all the information needed to extract variable var from section sec. This is a 4-tuple containing the following information: variable type, logical block in which the variable first occurs, position within this block where its data starts and the length of the variable. Combining this information with the mapping stored in _data makes it possible to extract each single variable.
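
The logical-to-physical mapping stored in _data can be queried with a binary search, which is what keeping the list sorted by the first tuple element makes possible. The sketch below uses the example triples from above; the function name physical_block and the use of KeyError for unmapped blocks are assumptions made for this illustration.

```python
from bisect import bisect_right


def physical_block(mapping, logical):
    """Translate a logical block number of a section into a physical one.

    mapping is a list of (first_logical, first_physical, end_physical) triples
    sorted by first_logical; end_physical is exclusive.
    """
    starts = [first for first, _, _ in mapping]
    i = bisect_right(starts, logical) - 1      # last triple starting at or before logical
    if i < 0:
        raise KeyError(logical)
    first, phys_start, phys_end = mapping[i]
    physical = phys_start + (logical - first)
    if physical >= phys_end:                   # falls past the end of that run
        raise KeyError(logical)
    return physical


general = [(3, 6, 12), (9, 40, 45)]
print([physical_block(general, n) for n in (3, 8, 9, 13)])
```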

static _datablocks(lst, n=1)[source]

Transform a list of tuples [(x1,a1,b1),(x2,a2,b2),...] into an iterator over range(a1,b1)+range(a2,b2)+... Iteration starts from nth element of this list.