Summarizing PLAMS Jobs with JobAnalysis

This example demonstrates the functionality of the JobAnalysis tool, as described under Summarizing PLAMS Jobs with JobAnalysis.

Downloads: Notebook | Script

Requires: AMS2026 or later

Related examples
Related tutorials
Related documentation

Create Example Jobs

To begin with, we create a variety of AMS jobs with different settings, tasks, engines and calculation types.

This allows us to generate diverse example single point/geometry optimization calculations with DFTB, ADF etc.

from scm.plams import from_smiles, AMSJob, PlamsError, Settings, Molecule, Atom
from scm.base import ChemicalSystem
from scm.input_classes.drivers import AMS
from scm.input_classes.engines import DFTB
from scm.utils.conversions import plams_molecule_to_chemsys


def example_job_dftb(smiles, task, use_chemsys=False):
    """Create a DFTB AMSJob for the given SMILES string and AMS task.

    The input is built with the PISA (input classes) approach. If
    use_chemsys is truthy, the PLAMS Molecule is converted to a
    ChemicalSystem before being attached to the job.
    """
    # Generate molecule from smiles
    mol = from_smiles(smiles)
    if use_chemsys:
        mol = plams_molecule_to_chemsys(mol)

    # Set up calculation settings using PISA
    sett = Settings()
    sett.runscript.nproc = 1
    driver = AMS()
    driver.Task = task
    driver.Engine = DFTB()
    sett.input = driver
    return AMSJob(molecule=mol, settings=sett, name="dftb")


def example_job_adf(smiles, task, basis, gga=None, use_chemsys=False):
    """Create an ADF AMSJob for the given SMILES string and AMS task.

    `basis` selects the basis set type; an XC GGA functional is only set
    when `gga` is provided. If use_chemsys is truthy, the PLAMS Molecule
    is converted to a ChemicalSystem before being attached to the job.
    """
    # Generate molecule from smiles
    mol = from_smiles(smiles)
    if use_chemsys:
        mol = plams_molecule_to_chemsys(mol)

    # Set up calculation settings using standard settings
    sett = Settings()
    sett.runscript.nproc = 1
    sett.input.AMS.Task = task
    sett.input.ADF.Basis.Type = basis
    if gga:
        sett.input.ADF.XC.GGA = gga
    return AMSJob(molecule=mol, settings=sett, name="adf")


def example_job_neb(iterations, use_chemsys=False):
    """Create a DFTB NEB job (HCN isomerization) with an iteration limit."""
    # Set up molecules: initial HCN geometry and the final geometry with H moved
    main_molecule = Molecule()
    main_molecule.add_atom(Atom(symbol="C", coords=(0, 0, 0)))
    main_molecule.add_atom(Atom(symbol="N", coords=(1.18, 0, 0)))
    main_molecule.add_atom(Atom(symbol="H", coords=(2.196, 0, 0)))
    final_molecule = main_molecule.copy()
    final_molecule.atoms[1].x = 1.163
    final_molecule.atoms[2].x = -1.078

    # NEB jobs take a dict of molecules; "" is the main (initial) system
    mol = {"": main_molecule, "final": final_molecule}

    if use_chemsys:
        mol = {k: plams_molecule_to_chemsys(v) for k, v in mol.items()}

    # Set up calculation settings
    sett = Settings()
    sett.runscript.nproc = 1
    sett.input.ams.Task = "NEB"
    sett.input.ams.NEB.Images = 9
    sett.input.ams.NEB.Iterations = iterations
    sett.input.DFTB  # bare access creates an empty DFTB engine block in the input

    return AMSJob(molecule=mol, settings=sett, name="neb")

Now, we create a selection of jobs covering different systems and settings:

from scm.plams import config, JobRunner

# Run up to 8 jobs concurrently
config.default_jobrunner = JobRunner(parallel=True, maxthreads=8)

smiles = ["CC", "C", "O", "CO"]
tasks = ["SinglePoint", "GeometryOptimization"]
engines = ["DFTB", "ADF"]  # NOTE(review): unused below — candidate for removal
jobs = []
for i, s in enumerate(smiles):
    for j, t in enumerate(tasks):
        # Alternate molecule input type: PLAMS Molecule (even i) vs ChemicalSystem (odd i)
        job_dftb = example_job_dftb(s, t, use_chemsys=i % 2)
        job_adf1 = example_job_adf(s, t, "DZ", use_chemsys=True)
        job_adf2 = example_job_adf(s, t, "TZP", "PBE")
        jobs += [job_dftb, job_adf1, job_adf2]

job_neb1 = example_job_neb(10)
job_neb2 = example_job_neb(100, use_chemsys=True)
jobs += [job_neb1, job_neb2]

for j in jobs:
    j.run()

# Wait for each job to finish and check its status
for j in jobs:
    j.ok()
[23.03|10:47:05] JOB dftb STARTED
[23.03|10:47:05] JOB adf STARTED
[23.03|10:47:05] JOB adf STARTED
[23.03|10:47:05] JOB dftb STARTED
[23.03|10:47:05] JOB adf STARTED
[23.03|10:47:05] JOB adf STARTED
[23.03|10:47:05] Renaming job adf to adf.002
[23.03|10:47:05] JOB dftb STARTED
[23.03|10:47:05] Renaming job dftb to dftb.002
[23.03|10:47:05] JOB adf STARTED
... output trimmed ....
[23.03|10:47:31] JOB adf.008 SUCCESSFUL
[23.03|10:47:37] JOB adf.015 FINISHED
[23.03|10:47:37] JOB adf.015 SUCCESSFUL
[23.03|10:47:42] JOB adf.003 FINISHED
[23.03|10:47:42] JOB adf.003 SUCCESSFUL
[23.03|10:47:42] Waiting for job adf.004 to finish
[23.03|10:47:53] JOB adf.016 FINISHED
[23.03|10:47:53] JOB adf.016 SUCCESSFUL
[23.03|10:48:12] JOB adf.004 FINISHED
[23.03|10:48:12] JOB adf.004 SUCCESSFUL

Job Analysis

The JobAnalysis tool can be used to extract data from a large number of jobs, and analyse the results.

Adding and Loading Jobs

Jobs can be loaded by passing job objects directly to the JobAnalysis, or alternatively loading from a path. This latter option is useful for loading jobs run previously in other scripts.

from scm.plams import JobAnalysis
ja = JobAnalysis(jobs=jobs)
# ja = JobAnalysis(paths=[j.path for j in jobs]) # alternatively load jobs from a set of paths

Additional jobs can also be added or removed after initialization of the JobAnalysis tool.

extra_job = example_job_dftb("CCC", "SinglePoint")
extra_job.run()
extra_job.ok()

ja = ja.add_job(extra_job)
[23.03|10:58:08] JOB dftb STARTED
[23.03|10:58:08] Waiting for job dftb to finish
[23.03|10:58:08] Renaming job dftb to dftb.009
[23.03|10:58:08] JOB dftb.009 RUNNING
[23.03|10:58:08] JOB dftb.009 FINISHED
[23.03|10:58:09] JOB dftb.009 SUCCESSFUL

The loaded jobs and the initial analysis fields can be shown by displaying the JobAnalysis table:

ja.display_table()

Path

Name

OK

Check

ErrorMsg

/Users/ormrodmorley/Documents/code/ams/amshome/userdoc/PythonExamples/job-analysis/plams_workdir.004/dftb

dftb

True

True

None

/Users/ormrodmorley/Documents/code/ams/amshome/userdoc/PythonExamples/job-analysis/plams_workdir.004/adf.002

adf.002

True

True

None

/Users/ormrodmorley/Documents/code/ams/amshome/userdoc/PythonExamples/job-analysis/plams_workdir.004/adf

adf

True

True

None

/Users/ormrodmorley/Documents/code/ams/amshome/userdoc/PythonExamples/job-analysis/plams_workdir.004/dftb.002

dftb.002

True

True

None

/Users/ormrodmorley/Documents/code/ams/amshome/userdoc/PythonExamples/job-analysis/plams_workdir.004/adf.003

adf.003

True

True

None

/Users/ormrodmorley/Documents/code/ams/amshome/userdoc/PythonExamples/job-analysis/plams_workdir.004/adf.004

adf.004

True

True

None

/Users/ormrodmorley/Documents/code/ams/amshome/userdoc/PythonExamples/job-analysis/plams_workdir.004/dftb.003

dftb.003

True

True

None

/Users/ormrodmorley/Documents/code/ams/amshome/userdoc/PythonExamples/job-analysis/plams_workdir.004/adf.005

adf.005

True

True

None

/Users/ormrodmorley/Documents/code/ams/amshome/userdoc/PythonExamples/job-analysis/plams_workdir.004/adf.006

adf.006

True

True

None

/Users/ormrodmorley/Documents/code/ams/amshome/userdoc/PythonExamples/job-analysis/plams_workdir.004/dftb.004

dftb.004

True

True

None

/Users/ormrodmorley/Documents/code/ams/amshome/userdoc/PythonExamples/job-analysis/plams_workdir.004/adf.007

adf.007

True

True

None

/Users/ormrodmorley/Documents/code/ams/amshome/userdoc/PythonExamples/job-analysis/plams_workdir.004/adf.008

adf.008

True

True

None

/Users/ormrodmorley/Documents/code/ams/amshome/userdoc/PythonExamples/job-analysis/plams_workdir.004/dftb.005

dftb.005

True

True

None

/Users/ormrodmorley/Documents/code/ams/amshome/userdoc/PythonExamples/job-analysis/plams_workdir.004/adf.009

adf.009

True

True

None

/Users/ormrodmorley/Documents/code/ams/amshome/userdoc/PythonExamples/job-analysis/plams_workdir.004/adf.010

adf.010

True

True

None

/Users/ormrodmorley/Documents/code/ams/amshome/userdoc/PythonExamples/job-analysis/plams_workdir.004/dftb.006

dftb.006

True

True

None

/Users/ormrodmorley/Documents/code/ams/amshome/userdoc/PythonExamples/job-analysis/plams_workdir.004/adf.011

adf.011

True

True

None

/Users/ormrodmorley/Documents/code/ams/amshome/userdoc/PythonExamples/job-analysis/plams_workdir.004/adf.012

adf.012

True

True

None

/Users/ormrodmorley/Documents/code/ams/amshome/userdoc/PythonExamples/job-analysis/plams_workdir.004/dftb.007

dftb.007

True

True

None

/Users/ormrodmorley/Documents/code/ams/amshome/userdoc/PythonExamples/job-analysis/plams_workdir.004/adf.013

adf.013

True

True

None

/Users/ormrodmorley/Documents/code/ams/amshome/userdoc/PythonExamples/job-analysis/plams_workdir.004/adf.014

adf.014

True

True

None

/Users/ormrodmorley/Documents/code/ams/amshome/userdoc/PythonExamples/job-analysis/plams_workdir.004/dftb.008

dftb.008

True

True

None

/Users/ormrodmorley/Documents/code/ams/amshome/userdoc/PythonExamples/job-analysis/plams_workdir.004/adf.015

adf.015

True

True

None

/Users/ormrodmorley/Documents/code/ams/amshome/userdoc/PythonExamples/job-analysis/plams_workdir.004/adf.016

adf.016

True

True

None

/Users/ormrodmorley/Documents/code/ams/amshome/userdoc/PythonExamples/job-analysis/plams_workdir.004/neb

neb

False

False

NEB optimization did NOT converge

/Users/ormrodmorley/Documents/code/ams/amshome/userdoc/PythonExamples/job-analysis/plams_workdir.004/neb.002

neb.002

True

True

None

/Users/ormrodmorley/Documents/code/ams/amshome/userdoc/PythonExamples/job-analysis/plams_workdir.004/dftb.009

dftb.009

True

True

None

Adding and Removing Fields

On initialization, some analysis fields are automatically included in the analysis (Path, Name, OK, Check and ErrorMsg). These are useful to see which jobs were loaded, and whether they succeeded. However, one or more of these fields can be removed with the remove_field method.

ja = ja.remove_field("Path")

A range of other common fields can be added with the add_standard_field(s) method.

ja = ja.add_standard_fields(["Formula", "Smiles", "CPUTime", "SysTime"])

In addition, all fields deriving from the job input settings can be added with the add_settings_input_fields method. By default, these will have names corresponding to the concatenated settings entries. Individual settings fields can be added with the add_settings_field method. This is useful to see the differences in the input settings of various jobs which may have succeeded/failed.

ja = ja.add_settings_input_fields()

For output results, fields from the rkfs can be added with the add_rkf_field method, using a specified rkf file (default ams.rkf), section and variable.

ja = ja.add_rkf_field("General", "engine")

Finally, custom fields can also be added with the add_field method, by defining a field key, value accessor and optional arguments like the display name and value formatting. This is most useful to extract results from jobs using built-in methods on the job results class.

ja = ja.add_field(
    "Energy",
    lambda j: j.results.get_energy(unit="kJ/mol"),
    display_name="Energy [kJ/mol]",
    fmt=".2f",
)
ja = ja.add_field("AtomType", lambda j: [at.symbol for at in j.results.get_main_molecule()])
ja = ja.add_field("Charge", lambda j: j.results.get_charges())

ja.display_table(max_rows=5)

Name

OK

Check

ErrorMsg

Formula

Smiles

CPUTime

SysTime

InputAmsTask

InputAdfBasisType

InputAdfXcGga

InputAmsNebImages

InputAmsNebIterations

AmsGeneralEngine

Energy [kJ/mol]

AtomType

Charge

dftb

True

True

None

C2H6

CC

0.211799

0.023247

SinglePoint

None

None

None

None

dftb

-19594.01

[‘C’, ‘C’, ‘H’, ‘H’, ‘H’, ‘H’, ‘H’, ‘H’]

[-0.07293185 -0.07372966 0.0229089 0.02613236 0.02365099 0.02302581

0.02508041 0.02586303]

adf.002

True

True

None

C2H6

CC

3.853033

0.318997

SinglePoint

DZ

None

None

None

adf

-3973.29

[‘C’, ‘C’, ‘H’, ‘H’, ‘H’, ‘H’, ‘H’, ‘H’]

[-0.83243445 -0.83187828 0.27390333 0.28078289 0.27678388 0.27344691

0.27910823 0.28028749]

neb

False

False

NEB optimization did NOT converge

: CHN, final: CHN

: C=N, final: C#N

0.474088

0.086321

NEB

None

None

9

10

dftb

None

[‘C’, ‘N’, ‘H’]

None

neb.002

True

True

None

: CHN, final: CHN

: C=N, final: C#N

1.480769

0.313988

NEB

None

None

9

100

dftb

-14936.53

[‘C’, ‘N’, ‘H’]

[-0.00724506 -0.21146552 0.21871057]

dftb.009

True

True

None

C3H8

CCC

0.191499

0.013078

SinglePoint

None

None

None

None

dftb

-27967.66

[‘C’, ‘C’, ‘C’, ‘H’, ‘H’, ‘H’, ‘H’, ‘H’, ‘H’, ‘H’, ‘H’]

[-0.08484515 -0.0311758 -0.08604109 0.02623135 0.02847202 0.02639002

0.02083141 0.02232931 0.02618436 0.02685258 0.02477098]

Processing Data

Once an initial analysis has been created, the data can be further processed, depending on the use case. For example, to inspect the difference between failed and successful jobs, jobs can be filtered down and irrelevant fields removed.

Here we first filter the jobs to those which have the NEB task:

ja_neb = ja.filter_jobs(lambda data: data["InputAmsTask"] == "NEB")

Then we remove the “uniform fields” i.e. fields where all the values are the same. This lets us remove the noise and focus on the fields which have differences.

ja_neb = ja_neb.remove_uniform_fields(ignore_empty=True)
ja_neb.display_table()

Name

OK

Check

CPUTime

SysTime

InputAmsNebIterations

neb

False

False

0.474088

0.086321

10

neb.002

True

True

1.480769

0.313988

100

Another use case may be to analyze the results from one or more jobs. For this, it can be useful to utilize the expand functionality to convert job(s) to multiple rows. During this process, fields selected for expansion will have their values extracted into individual rows, whilst other fields have their values duplicated.

First we filter down to the ADF geometry optimizations of water:

ja_adf_water = ja.filter_jobs(
    lambda data: (
        data["InputAmsTask"] == "GeometryOptimization"
        and data["InputAdfBasisType"] is not None
        and data["Smiles"] == "O"
    )
)
ja_adf_water.display_table()

Name

OK

Check

ErrorMsg

Formula

Smiles

CPUTime

SysTime

InputAmsTask

InputAdfBasisType

InputAdfXcGga

InputAmsNebImages

InputAmsNebIterations

AmsGeneralEngine

Energy [kJ/mol]

AtomType

Charge

adf.011

True

True

None

H2O

O

2.684484

0.324325

GeometryOptimization

DZ

None

None

None

adf

-1316.30

[‘O’, ‘H’, ‘H’]

[-0.84168651 0.42084715 0.42083936]

adf.012

True

True

None

H2O

O

4.183557

0.484223

GeometryOptimization

TZP

PBE

None

None

adf

-1363.77

[‘O’, ‘H’, ‘H’]

[-0.6739805 0.33698187 0.33699863]

Then we “expand” a given field to flatten the arrays and have one row per entry in the array. This lets us see the charge per atom for each job:

ja_adf_water_expanded = (
    ja_adf_water.expand_field("AtomType").expand_field("Charge").remove_uniform_fields()
)
ja_adf_water_expanded.display_table()

Name

CPUTime

SysTime

InputAdfBasisType

InputAdfXcGga

Energy [kJ/mol]

AtomType

Charge

adf.011

2.684484

0.324325

DZ

None

-1316.30

O

-0.8416865089598673

adf.011

2.684484

0.324325

DZ

None

-1316.30

H

0.4208471524496712

adf.011

2.684484

0.324325

DZ

None

-1316.30

H

0.42083935651019355

adf.012

4.183557

0.484223

TZP

PBE

-1363.77

O

-0.6739804981076105

adf.012

4.183557

0.484223

TZP

PBE

-1363.77

H

0.3369818667176927

adf.012

4.183557

0.484223

TZP

PBE

-1363.77

H

0.3369986313899128

Expansion can be undone with the corresponding collapse method.

Fields can also be further filtered, modified or reordered to customize the analysis. This example also illustrates the “fluent” syntax of the JobAnalysis tool, whereby multiple methods can be chained together.

# Rename the settings-derived fields to friendlier display names and
# reorder so the per-atom data comes first.
ja_adf = (
    ja_adf_water_expanded.remove_field("Name")
    .format_field("CPUTime", ".2f")
    .format_field("Charge", ".4f")
    .rename_field("InputAdfBasisType", "Basis")
    # Fix: the second rename must target the XC field, not the (already
    # renamed) basis field, so the "GGA" column holds the functional.
    .rename_field("InputAdfXcGga", "GGA")
    .reorder_fields(["AtomType", "Charge", "Energy"])
)
ja_adf.display_table()

AtomType

Charge

Energy [kJ/mol]

CPUTime

SysTime

GGA

InputAdfXcGga

O

-0.8417

-1316.30

2.68

0.324325

DZ

None

H

0.4208

-1316.30

2.68

0.324325

DZ

None

H

0.4208

-1316.30

2.68

0.324325

DZ

None

O

-0.6740

-1363.77

4.18

0.484223

TZP

PBE

H

0.3370

-1363.77

4.18

0.484223

TZP

PBE

H

0.3370

-1363.77

4.18

0.484223

TZP

PBE

Extracting Analysis Data

Analysis data can be extracted in a variety of ways.

As has been demonstrated, a visual representation of the table can be easily generated using the to_table method (or display_table in a notebook). The format can be selected as markdown, html or rst. This will return the data with the specified display names and formatting.

print(ja_adf.to_table(fmt="rst"))
+----------+---------+-----------------+---------+----------+-----+---------------+
| AtomType | Charge  | Energy [kJ/mol] | CPUTime | SysTime  | GGA | InputAdfXcGga |
+==========+=========+=================+=========+==========+=====+===============+
| O        | -0.8417 | -1316.30        | 2.68    | 0.324325 | DZ  | None          |
+----------+---------+-----------------+---------+----------+-----+---------------+
| H        | 0.4208  | -1316.30        | 2.68    | 0.324325 | DZ  | None          |
+----------+---------+-----------------+---------+----------+-----+---------------+
| H        | 0.4208  | -1316.30        | 2.68    | 0.324325 | DZ  | None          |
+----------+---------+-----------------+---------+----------+-----+---------------+
| O        | -0.6740 | -1363.77        | 4.18    | 0.484223 | TZP | PBE           |
+----------+---------+-----------------+---------+----------+-----+---------------+
| H        | 0.3370  | -1363.77        | 4.18    | 0.484223 | TZP | PBE           |
+----------+---------+-----------------+---------+----------+-----+---------------+
| H        | 0.3370  | -1363.77        | 4.18    | 0.484223 | TZP | PBE           |
+----------+---------+-----------------+---------+----------+-----+---------------+

Alternatively, raw data can be retrieved via the get_analysis method, which returns a dictionary of analysis keys to values.

print(ja_adf.get_analysis())
{'AtomType': ['O', 'H', 'H', 'O', 'H', 'H'], 'Charge': [-0.8416865089598673, 0.4208471524496712, 0.42083935651019355, -0.6739804981076105, 0.3369818667176927, 0.3369986313899128], 'Energy': [-1316.2997406507902, -1316.2997406507902, -1316.2997406507902, -1363.76629419659, -1363.76629419659, -1363.76629419659], 'CPUTime': [2.684484, 2.684484, 2.684484, 4.183557, 4.183557, 4.183557], 'SysTime': [0.324325, 0.324325, 0.324325, 0.484223, 0.484223, 0.484223], 'InputAdfBasisType': ['DZ', 'DZ', 'DZ', 'TZP', 'TZP', 'TZP'], 'InputAdfXcGga': [None, None, None, 'PBE', 'PBE', 'PBE']}

Data can also be easily written to a csv file using to_csv_file, to be exported to another program.

csv_name = "./tmp.csv"
ja_adf.to_csv_file(csv_name)

# Name the file handle `csv_file` so it does not shadow the stdlib `csv` module
with open(csv_name) as csv_file:
    print(csv_file.read())
AtomType,Charge,Energy,CPUTime,SysTime,InputAdfBasisType,InputAdfXcGga
O,-0.8416865089598673,-1316.2997406507902,2.684484,0.324325,DZ,
H,0.4208471524496712,-1316.2997406507902,2.684484,0.324325,DZ,
H,0.42083935651019355,-1316.2997406507902,2.684484,0.324325,DZ,
O,-0.6739804981076105,-1363.76629419659,4.183557,0.484223,TZP,PBE
H,0.3369818667176927,-1363.76629419659,4.183557,0.484223,TZP,PBE
H,0.3369986313899128,-1363.76629419659,4.183557,0.484223,TZP,PBE

Finally, for more complex data analysis, the results can be converted to a pandas dataframe. This is recommended for more involved data manipulations.

df = ja_adf.to_dataframe()
df

AtomType

Charge

Energy

CPUTime

SysTime

InputAdfBasisType

InputAdfXcGga

0

O

-0.841687

-1316.299741

2.684484

0.324325

DZ

None

1

H

0.420847

-1316.299741

2.684484

0.324325

DZ

None

2

H

0.420839

-1316.299741

2.684484

0.324325

DZ

None

3

O

-0.673980

-1363.766294

4.183557

0.484223

TZP

PBE

4

H

0.336982

-1363.766294

4.183557

0.484223

TZP

PBE

5

H

0.336999

-1363.766294

4.183557

0.484223

TZP

PBE

See also

Python Script

#!/usr/bin/env python
# coding: utf-8

# ## Create Example Jobs

# To begin with, we create a variety of AMS jobs with different settings, tasks, engines and calculation types.
#
# This allows us to generate diverse example single point/geometry optimization calculations with DFTB, ADF etc.

from scm.plams import from_smiles, AMSJob, PlamsError, Settings, Molecule, Atom
from scm.base import ChemicalSystem
from scm.input_classes.drivers import AMS
from scm.input_classes.engines import DFTB
from scm.utils.conversions import plams_molecule_to_chemsys


def example_job_dftb(smiles, task, use_chemsys=False):
    """Build a DFTB AMSJob for the given SMILES string and AMS task.

    The input is assembled via the PISA input classes. When ``use_chemsys``
    is truthy, the PLAMS molecule is converted to a ChemicalSystem first.
    """
    molecule = from_smiles(smiles)
    if use_chemsys:
        molecule = plams_molecule_to_chemsys(molecule)

    # Assemble the AMS driver input via PISA
    driver = AMS()
    driver.Task = task
    driver.Engine = DFTB()

    settings = Settings()
    settings.runscript.nproc = 1
    settings.input = driver

    return AMSJob(molecule=molecule, settings=settings, name="dftb")


def example_job_adf(smiles, task, basis, gga=None, use_chemsys=False):
    """Build an ADF AMSJob for the given SMILES string and AMS task.

    ``basis`` selects the basis set type; an XC GGA functional is only
    added when ``gga`` is provided. When ``use_chemsys`` is truthy, the
    PLAMS molecule is converted to a ChemicalSystem first.
    """
    molecule = from_smiles(smiles)
    if use_chemsys:
        molecule = plams_molecule_to_chemsys(molecule)

    # Assemble the input via plain nested Settings
    settings = Settings()
    settings.runscript.nproc = 1
    settings.input.AMS.Task = task
    settings.input.ADF.Basis.Type = basis
    if gga:
        settings.input.ADF.XC.GGA = gga

    return AMSJob(molecule=molecule, settings=settings, name="adf")


def example_job_neb(iterations, use_chemsys=False):
    """Build a DFTB NEB job (HCN isomerization) with an iteration limit."""
    # Initial HCN geometry along the x-axis
    initial = Molecule()
    for symbol, x in (("C", 0), ("N", 1.18), ("H", 2.196)):
        initial.add_atom(Atom(symbol=symbol, coords=(x, 0, 0)))

    # Final geometry: hydrogen moved to the other side of the C-N unit
    final = initial.copy()
    final.atoms[1].x = 1.163
    final.atoms[2].x = -1.078

    # NEB jobs take a dict of systems; "" is the main (initial) system
    mol = {"": initial, "final": final}
    if use_chemsys:
        mol = {label: plams_molecule_to_chemsys(m) for label, m in mol.items()}

    settings = Settings()
    settings.runscript.nproc = 1
    settings.input.ams.Task = "NEB"
    settings.input.ams.NEB.Images = 9
    settings.input.ams.NEB.Iterations = iterations
    # Bare attribute access creates an empty DFTB engine block in the input
    settings.input.DFTB

    return AMSJob(molecule=mol, settings=settings, name="neb")


# Now, we create a selection of jobs covering different systems and settings:

from scm.plams import config, JobRunner

# Run up to 8 jobs concurrently
config.default_jobrunner = JobRunner(parallel=True, maxthreads=8)

smiles = ["CC", "C", "O", "CO"]
tasks = ["SinglePoint", "GeometryOptimization"]
jobs = []
for i, s in enumerate(smiles):
    # Fix: dropped the unused `engines` list and the unused inner
    # `enumerate` index (which was later shadowed by the run loop's `j`).
    for t in tasks:
        # Alternate molecule input type: PLAMS Molecule (even i) vs ChemicalSystem (odd i)
        job_dftb = example_job_dftb(s, t, use_chemsys=i % 2)
        job_adf1 = example_job_adf(s, t, "DZ", use_chemsys=True)
        job_adf2 = example_job_adf(s, t, "TZP", "PBE")
        jobs += [job_dftb, job_adf1, job_adf2]

job_neb1 = example_job_neb(10)
job_neb2 = example_job_neb(100, use_chemsys=True)
jobs += [job_neb1, job_neb2]

for j in jobs:
    j.run()

# Wait for each job to finish and check its status
for j in jobs:
    j.ok()


# ## Job Analysis

# The `JobAnalysis` tool can be used to extract data from a large number of jobs, and analyse the results.

# ### Adding and Loading Jobs
#
# Jobs can be loaded by passing job objects directly to the `JobAnalysis`, or alternatively loading from a path. This latter option is useful for loading jobs run previously in other scripts.

from scm.plams import JobAnalysis


# Build the analysis directly from the in-memory job objects
ja = JobAnalysis(jobs=jobs)
# ja = JobAnalysis(paths=[j.path for j in jobs]) # alternatively load jobs from a set of paths


# Additional jobs can also be added or removed after initialization of the `JobAnalysis` tool.

extra_job = example_job_dftb("CCC", "SinglePoint")
extra_job.run()
extra_job.ok()  # wait for the job to finish and check its status

# add_job returns the JobAnalysis instance, allowing fluent chaining
ja = ja.add_job(extra_job)


# The loaded jobs and the initial analysis fields can be shown by displaying the `JobAnalysis` table:

print(ja.to_table())


# ### Adding and Removing Fields

# On initialization, some analysis fields are automatically included in the analysis (`Path`, `Name`, `OK`, `Check` and `ErrorMsg`). These are useful to see which jobs were loaded, and whether they succeeded. However, one or more of these fields can be removed with the `remove_field` method.

ja = ja.remove_field("Path")


# A range of other common fields can be added with the `add_standard_field(s)` method.

ja = ja.add_standard_fields(["Formula", "Smiles", "CPUTime", "SysTime"])


# In addition, all fields deriving from the job input settings can be added with the `add_settings_input_fields` method. By default, these will have names corresponding to the concatenated settings entries. Individual settings fields can be added with the `add_settings_field` method. This is useful to see the differences in the input settings of various jobs which may have succeeded/failed.

ja = ja.add_settings_input_fields()


# For output results, fields from the rkfs can be added with the `add_rkf_field` method, using a specified rkf file (default `ams.rkf`), section and variable.

ja = ja.add_rkf_field("General", "engine")


# Finally, custom fields can also be added with the `add_field` method, by defining a field key, value accessor and optional arguments like the display name and value formatting. This is most useful to extract results from jobs using built-in methods on the job results class.

ja = ja.add_field(
    "Energy",
    lambda j: j.results.get_energy(unit="kJ/mol"),
    display_name="Energy [kJ/mol]",
    fmt=".2f",
)
ja = ja.add_field("AtomType", lambda j: [at.symbol for at in j.results.get_main_molecule()])
ja = ja.add_field("Charge", lambda j: j.results.get_charges())

print(ja.to_table(max_rows=5))


# ### Processing Data

# Once an initial analysis has been created, the data can be further processed, depending on the use case.
# For example, to inspect the difference between failed and successful jobs, jobs can be filtered down and irrelevant fields removed.
#
# Here we first filter the jobs to those which have the `NEB` task:

ja_neb = ja.filter_jobs(lambda data: data["InputAmsTask"] == "NEB")


# Then we remove the "uniform fields" i.e. fields where all the values are the same. This lets us remove the noise and focus on the fields which have differences.

ja_neb = ja_neb.remove_uniform_fields(ignore_empty=True)
print(ja_neb.to_table())


# Another use case may be to analyze the results from one or more jobs.
# For this, it can be useful to utilize the `expand` functionality to convert job(s) to multiple rows.
# During this process, fields selected for expansion will have their values extracted into individual rows, whilst other fields have their values duplicated.
#
# First we filter down to the ADF geometry optimizations of water:

ja_adf_water = ja.filter_jobs(
    lambda data: (
        data["InputAmsTask"] == "GeometryOptimization"
        and data["InputAdfBasisType"] is not None
        and data["Smiles"] == "O"
    )
)
print(ja_adf_water.to_table())


# Then we "expand" a given field to flatten the arrays and have one row per entry in the array. This lets us see the charge per atom for each job:

ja_adf_water_expanded = ja_adf_water.expand_field("AtomType").expand_field("Charge").remove_uniform_fields()
print(ja_adf_water_expanded.to_table())


# Expansion can be undone with the corresponding `collapse` method.
#
# Fields can also be further filtered, modified or reordered to customize the analysis. This example also illustrates the "fluent" syntax of the `JobAnalysis` tool, whereby multiple methods can be chained together.

# Rename the settings-derived fields to friendlier display names and
# reorder so the per-atom data comes first.
ja_adf = (
    ja_adf_water_expanded.remove_field("Name")
    .format_field("CPUTime", ".2f")
    .format_field("Charge", ".4f")
    .rename_field("InputAdfBasisType", "Basis")
    # Fix: the second rename must target the XC field, not the (already
    # renamed) basis field, so the "GGA" column holds the functional.
    .rename_field("InputAdfXcGga", "GGA")
    .reorder_fields(["AtomType", "Charge", "Energy"])
)
print(ja_adf.to_table())


# ### Extracting Analysis Data

# Analysis data can be extracted in a variety of ways.
#
# As has been demonstrated, a visual representation of the table can be easily generated using the `to_table` method (or `display_table` in a notebook).
# The format can be selected as markdown, html or rst. This will return the data with the specified display names and formatting.

print(ja_adf.to_table(fmt="rst"))


# Alternatively, raw data can be retrieved via the `get_analysis` method, which returns a dictionary of analysis keys to values.

print(ja_adf.get_analysis())


# Data can also be easily written to a csv file using `to_csv_file`, to be exported to another program.

csv_name = "./tmp.csv"
ja_adf.to_csv_file(csv_name)

# Name the file handle `csv_file` so it does not shadow the stdlib `csv` module
with open(csv_name) as csv_file:
    print(csv_file.read())


# Finally, for more complex data analysis, the results can be converted to a [pandas](https://pandas.pydata.org) dataframe. This is recommended for more involved data manipulations.

df = ja_adf.to_dataframe()
print(df)