Convert Between ParAMS and ASE Formats

This tutorial shows how to

  • convert between the ParAMS and ASE (Atomic Simulation Environment) data formats

Requires: AMS2026 or later

Overview

The ASE .xyz format (extended XYZ) is a convenient way to store single-point properties, such as energies and forces, for a collection of structures. Many academic machine-learning potential projects read and write it, which makes it a practical exchange format.

Here, we initialize a ResultsImporter object from ASE .xyz files and save a ResultsImporter to the ASE .xyz format.

For advanced usage, you can also work directly with lists of ASE Atoms and with the ParAMS JobCollection and DataSet classes. See the documentation for params_to_ase and ase_to_params.

from scm.params import ResultsImporter
import os

One-liner conversion from ParAMS .yaml to ASE .xyz

Here we use the examples directory that already contains files in the ParAMS .yaml format:

yaml_dir = os.path.expandvars(
    "$AMSHOME/scripting/scm/params/examples/DataSets/LiquidAr_32Atoms_100Frames"
)

# put training_set.xyz, validation_set.xyz in current working directory
xyz_target_dir = os.getcwd()

xyz_files = ResultsImporter.from_yaml(yaml_dir).store_ase(xyz_target_dir, format="extxyz")

print(xyz_files)
[PosixPath('/home/hellstrom/adfhome/scripting/scm/params/doc/source/examples/convert_ase/training_set.xyz'), PosixPath('/home/hellstrom/adfhome/scripting/scm/params/doc/source/examples/convert_ase/validation_set.xyz')]

One-liner conversion from ASE .xyz to ParAMS .yaml

xyz_dir = os.getcwd()

# put job_collection.yaml etc. in a folder yaml_ref_data
yaml_target_dir = "yaml_ref_data"

yaml_files = ResultsImporter.from_ase(
    f"{xyz_dir}/training_set.xyz", f"{xyz_dir}/validation_set.xyz"
).store(yaml_target_dir, backup=False)

print(yaml_files)
['yaml_ref_data/job_collection.yaml', 'yaml_ref_data/results_importer_settings.yaml', 'yaml_ref_data/training_set.yaml', 'yaml_ref_data/validation_set.yaml']

More about ASE .xyz format

Let’s look at the first few lines of the ASE .xyz files:

for file in xyz_files:
    print(f"--- first 5 lines of {file} ---")
    with open(file) as f:
        print("".join(f.readlines()[:5]))
--- first 5 lines of /home/hellstrom/adfhome/scripting/scm/params/doc/source/examples/convert_ase/training_set.xyz ---
32
Lattice="10.52 0.0 0.0 0.0 10.52 0.0 0.0 0.0 10.52" Properties=species:S:1:pos:R:3:forces:R:3 nAtoms=32.0 energy_weight=1.0 forces_weights=1.0 energy=-5354.689144659606 pbc="T T T"
Ar       5.22816630      -0.09999816       7.90156704       0.00827748       0.00316181      -0.00109192
Ar       5.36838670       0.11217830       2.61448917      -0.00296065      -0.01538116       0.00091621
Ar       5.47472123       5.39235886       7.71471647      -0.02373655      -0.02892664       0.00741117

--- first 5 lines of /home/hellstrom/adfhome/scripting/scm/params/doc/source/examples/convert_ase/validation_set.xyz ---
32
Lattice="10.52 0.0 0.0 0.0 10.52 0.0 0.0 0.0 10.52" Properties=species:S:1:pos:R:3:forces:R:3 nAtoms=32.0 energy_weight=1.0 forces_weights=1.0 energy=-5354.752060382252 pbc="T T T"
Ar       5.26000000       0.00000000       7.89000000      -0.00000013       0.00000037      -0.00000037
Ar       5.26000000       0.00000000       2.63000000      -0.00000013       0.00000037       0.00000013
Ar       5.26000000       5.26000000       7.89000000      -0.00000013      -0.00000013      -0.00000037
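
The second line of every frame is a comment line of key=value pairs (Lattice, Properties, energy, pbc, and so on), and the Properties entry describes the per-atom columns that follow. As an illustration only, the sketch below picks apart the training-set header shown above with the Python standard library. The helper names parse_extxyz_header and parse_properties are hypothetical (they are not part of ASE or ParAMS), and the code assumes the simple quoting seen in this file; for real work, read the file with ASE.

```python
import shlex


def parse_extxyz_header(line):
    """Split an extended-XYZ comment line into a dict of key -> string value."""
    fields = {}
    # shlex keeps a quoted value such as Lattice="a b c" together as one token
    for token in shlex.split(line):
        key, _, value = token.partition("=")
        fields[key] = value
    return fields


def parse_properties(spec):
    """Turn 'species:S:1:pos:R:3' into [(name, type_code, n_columns), ...]."""
    parts = spec.split(":")
    return [(parts[i], parts[i + 1], int(parts[i + 2])) for i in range(0, len(parts), 3)]


# Header copied from the first frame of training_set.xyz shown above
header = (
    'Lattice="10.52 0.0 0.0 0.0 10.52 0.0 0.0 0.0 10.52" '
    "Properties=species:S:1:pos:R:3:forces:R:3 nAtoms=32.0 energy_weight=1.0 "
    'forces_weights=1.0 energy=-5354.689144659606 pbc="T T T"'
)

fields = parse_extxyz_header(header)
lattice = [float(x) for x in fields["Lattice"].split()]  # 9 numbers, row by row
energy = float(fields["energy"])
columns = parse_properties(fields["Properties"])
n_columns = sum(width for _, _, width in columns)

print(columns)
print(n_columns)
```

The column widths add up to seven, matching the seven whitespace-separated fields on each atom line: the element symbol, three position components, and three force components.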

To read the ASE format, use ase.io.read("filename.xyz", ":") to get a list of ASE Atoms. For more details, see the ASE documentation.

import ase.io

list_of_ase_atoms = ase.io.read(xyz_files[0], ":")  # xyz_files[0] is the path to training_set.xyz
atoms = list_of_ase_atoms[0]
print("First structure:")
print(atoms)
print(f"Energy: {atoms.get_potential_energy()}")
First structure:
Atoms(symbols='Ar32', pbc=True, cell=[10.52, 10.52, 10.52], forces=..., calculator=SinglePointCalculator(...))
Energy: -5354.689144659606
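
The ":" index asks ASE for every frame in the file rather than a single one. To make the multi-frame layout concrete, here is a minimal standard-library sketch that walks through a file frame by frame: an atom-count line, a comment line, then that many atom lines, repeated. The function iter_xyz_frames is a hypothetical helper and the two-frame sample text is made up for the demonstration; neither comes from ASE or ParAMS.

```python
import io


def iter_xyz_frames(handle):
    """Yield (comment_line, atom_rows) for each frame in a multi-frame .xyz stream."""
    while True:
        count_line = handle.readline()
        if not count_line.strip():
            return  # end of file (or a trailing blank line)
        n_atoms = int(count_line)
        comment = handle.readline().rstrip("\n")
        # each atom row is split into its whitespace-separated fields
        atom_rows = [handle.readline().split() for _ in range(n_atoms)]
        yield comment, atom_rows


# Made-up two-frame sample, only to exercise the function
SAMPLE = """2
energy=-1.0 pbc="F F F"
Ar 0.0 0.0 0.0
Ar 0.0 0.0 3.8
2
energy=-1.1 pbc="F F F"
Ar 0.0 0.0 0.0
Ar 0.0 0.0 3.7
"""

frames = list(iter_xyz_frames(io.StringIO(SAMPLE)))
for comment, atom_rows in frames:
    symbols = [row[0] for row in atom_rows]
    print(len(atom_rows), symbols, comment)
```

To apply the same function to a file on disk, pass an open file handle, for example `with open(xyz_files[0]) as f: frames = list(iter_xyz_frames(f))`.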

More about ParAMS ResultsImporter initialized from ASE .xyz

ri = ResultsImporter.from_ase(f"{xyz_dir}/training_set.xyz", f"{xyz_dir}/validation_set.xyz")

Each structure is stored as an entry in the job collection:

for name in ri.job_collection:
    print(f"ID: {name}")
    print(ri.job_collection[name])
    break
ID: training_set0001
ReferenceEngineID: None
AMSInput: |
   properties
     gradients yes
   End
   system
     Atoms
                Ar       5.2281663000      -0.0999981600       7.9015670400
                Ar       5.3683867000       0.1121783000       2.6144891700
... output trimmed ....
                Ar      -0.1261450000       5.4866211100       2.6387010300
     End
     Lattice
           10.5200000000     0.0000000000     0.0000000000
            0.0000000000    10.5200000000     0.0000000000
            0.0000000000     0.0000000000    10.5200000000
     End
   End
   task singlepoint

The reference values are stored in the data sets:

# print the first training set entry
for ds_entry in ri.get_data_set("training_set"):
    print(ds_entry)
    break
---
Expression: energy('training_set0001')
Weight: 1.0
ReferenceValue: -5354.689144659606
Unit: eV, 27.211386245988

# print the first validation set entry
for ds_entry in ri.get_data_set("validation_set"):
    print(ds_entry)
    break
---
Expression: energy('validation_set0001')
Weight: 1.0
ReferenceValue: -5354.752060382252
Unit: eV, 27.211386245988

When you initialize a ResultsImporter with from_ase, its units are automatically set to the ASE units. Any new data you then import with that results importer is stored in those units (eV, eV/angstrom, etc.).

print(f"{ri.settings['units']['energy']=}")
print(f"{ri.settings['units']['forces']=}")
ri.settings['units']['energy']=('eV', 27.211386245988)
ri.settings['units']['forces']=('eV/angstrom', 51.422067476325886)
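
The number in each units tuple is numerically the conversion factor from Hartree atomic units to the named unit: 27.211386245988 is the value of one hartree in eV. Assuming that interpretation, the sketch below converts the first training-set reference energy back to hartree and checks that the forces factor is the energy factor divided by the Bohr radius in angstrom. The Bohr radius value is the CODATA 2018 figure and is an assumption introduced here; it does not appear in the tutorial output.

```python
import math

# Factors copied from the printed ResultsImporter settings above
HARTREE_TO_EV = 27.211386245988
HARTREE_PER_BOHR_TO_EV_PER_ANGSTROM = 51.422067476325886

# Assumption: CODATA 2018 Bohr radius in angstrom (not taken from the tutorial)
BOHR_TO_ANGSTROM = 0.529177210903

# Reference energy of training_set0001, stored in eV
energy_ev = -5354.689144659606
energy_hartree = energy_ev / HARTREE_TO_EV
print(f"{energy_hartree:.6f} hartree")

# hartree/bohr -> eV/angstrom is (eV per hartree) / (angstrom per bohr)
derived_force_factor = HARTREE_TO_EV / BOHR_TO_ANGSTROM
factors_agree = math.isclose(
    derived_force_factor, HARTREE_PER_BOHR_TO_EV_PER_ANGSTROM, rel_tol=1e-9
)
print(factors_agree)
```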

See also

Python Script

#!/usr/bin/env python
# coding: utf-8

# ## Overview
#
# The ASE .xyz format (extended XYZ) is a convenient way to store single-point properties, such as energies and forces, for a collection of structures. Many academic machine-learning potential projects read and write it, which makes it a practical exchange format.
#
# Here, we initialize a ResultsImporter object from ASE .xyz files and save a ResultsImporter to the ASE .xyz format.
#
# For advanced usage, you can also work directly with lists of ASE Atoms and with the ParAMS ``JobCollection`` and ``DataSet`` classes. See the documentation for ``params_to_ase`` and ``ase_to_params``.

from scm.params import ResultsImporter
import os


# ## One-liner conversion from ParAMS .yaml to ASE .xyz
#
# Here we use the examples directory that already contains files in the ParAMS .yaml format:

yaml_dir = os.path.expandvars("$AMSHOME/scripting/scm/params/examples/DataSets/LiquidAr_32Atoms_100Frames")

# put training_set.xyz, validation_set.xyz in current working directory
xyz_target_dir = os.getcwd()

xyz_files = ResultsImporter.from_yaml(yaml_dir).store_ase(xyz_target_dir, format="extxyz")

print(xyz_files)


# ## One-liner conversion from ASE .xyz to ParAMS .yaml

xyz_dir = os.getcwd()

# put job_collection.yaml etc. in a folder yaml_ref_data
yaml_target_dir = "yaml_ref_data"

yaml_files = ResultsImporter.from_ase(f"{xyz_dir}/training_set.xyz", f"{xyz_dir}/validation_set.xyz").store(
    yaml_target_dir, backup=False
)

print(yaml_files)


# ## More about ASE .xyz format
#
# Let's look at the first few lines of the ASE .xyz files:

for file in xyz_files:
    print(f"--- first 5 lines of {file} ---")
    with open(file) as f:
        print("".join(f.readlines()[:5]))


# To read the ASE format, use ``ase.io.read("filename.xyz", ":")`` to get a list of ASE Atoms. For more details, see the ASE documentation.

import ase.io

list_of_ase_atoms = ase.io.read(xyz_files[0], ":")  # xyz_files[0] is the path to training_set.xyz
atoms = list_of_ase_atoms[0]
print("First structure:")
print(atoms)
print(f"Energy: {atoms.get_potential_energy()}")


# ### More about ParAMS ResultsImporter initialized from ASE .xyz

ri = ResultsImporter.from_ase(f"{xyz_dir}/training_set.xyz", f"{xyz_dir}/validation_set.xyz")


# Each structure is stored as an entry in the job collection:

for name in ri.job_collection:
    print(f"ID: {name}")
    print(ri.job_collection[name])
    break


# The reference values are stored in the data sets:

# print the first training set entry
for ds_entry in ri.get_data_set("training_set"):
    print(ds_entry)
    break


# print the first validation set entry
for ds_entry in ri.get_data_set("validation_set"):
    print(ds_entry)
    break


# When you initialize a ResultsImporter with ``from_ase``, its units are automatically set to the ASE units. Any new data you then import with that results importer is stored in those units (eV, eV/angstrom, etc.).

print(f"{ri.settings['units']['energy']=}")
print(f"{ri.settings['units']['forces']=}")