Read, write, file formats

File formats overview:

Format                        Read          Write         Note
.in (AMS System block)        __init__()    __str__()     Lossless serialization, includes bonds
.in (AMS System block)        from_in()     write_in()    Lossless serialization, includes bonds
.kf, .rkf (AMS binary file)   from_kf()     write_kf()    Lossless serialization, includes bonds
.xyz (plain)                  from_xyz()    write_xyz()   May give rounding errors, does not include bonds, lattice, charge or atomic attributes
.xyz (AMS extended)           from_xyz()    write_xyz()   May give rounding errors, does not include bonds (see the AMS extended XYZ format documentation)
.xyz (ASE extended)           No            No
.cif                          No            No            Convert to/from PLAMS Molecule for .cif files
.mol2                         No            No            Convert to/from PLAMS Molecule for .mol2 files

You can create or serialize a ChemicalSystem object using various file formats. Of these, the System Block format is among the most important: it offers a versatile, text-based way to describe your chemical system. For more information on the syntax and options of the System Block, see the AMS System Block documentation.

All text-based formats can also be written using Python format strings:

ChemicalSystem.__format__(format_spec: str) -> str

Formats a ChemicalSystem into a string representation.

The format_spec string starts with an identifier for the format to write, followed by an optional : and a list of space-separated key=value pairs that configure the options of that format. For example, the following produces the AMS System block format in internal units (bohr) with the string "H2O" as the System name in the block header:

cs = ChemicalSystem(...)
print(f"{cs:in:units=internal name=H2O}")

This produces the output:

System H2O
    Atoms [bohr]
        O 0 0 -0.7262847342654172
        H 0 1.4669943905469853 0.3631423765813393
        H 0 -1.4669943905469853 0.3631423765813393
    End
End
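The layout of the format_spec string described above can be illustrated with a small, self-contained sketch. It is not the scm.libbase implementation: parse_format_spec and ToySystem are hypothetical names introduced only for this example, and the toy writer assumes that option values contain no spaces.

```python
def parse_format_spec(format_spec: str, default_format: str = "in"):
    """Split '<format>[:key=value key=value ...]' into (format, options)."""
    # Everything before the first ':' names the format; the rest holds options.
    name, _, option_text = format_spec.partition(":")
    options = {}
    for pair in option_text.split():
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {pair!r}")
        options[key] = value
    # An empty identifier falls back to the default format.
    return name or default_format, options


class ToySystem:
    """Hypothetical stand-in for illustration; not the scm.libbase class."""

    def __init__(self, atoms):
        self.atoms = atoms  # list of (symbol, x, y, z) tuples

    def __format__(self, format_spec: str) -> str:
        fmt, options = parse_format_spec(format_spec)
        if fmt != "in":
            raise ValueError(f"unsupported format {fmt!r}")
        # Only the 'name' option is handled in this toy writer.
        header = "System " + options["name"] if "name" in options else "System"
        lines = [header, "    Atoms"]
        lines += [f"        {s} {x!r} {y!r} {z!r}" for s, x, y, z in self.atoms]
        lines += ["    End", "End"]
        return "\n".join(lines)


toy = ToySystem([("O", 0.0, 0.0, 0.0)])
print(f"{toy:in:name=H2O}")
```

Python passes everything after the first colon of an f-string replacement field to __format__ unchanged, which is why the specification itself may contain further colons, spaces and = signs.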

The following formats and options are supported at the moment:

  • in for writing the AMS System block. This is the default format, if none is specified. The following options are supported:

    • name=... to put an arbitrary string as the system’s name into the block header.

    • units=[default|internal] to switch between the default units and the units used internally by the ChemicalSystem. Internally, the ChemicalSystem uses atomic units (e.g. bohr for lengths). Printing the System block in internal units avoids a possible loss of precision in the unit conversion and guarantees an exact ChemicalSystem -> str -> ChemicalSystem round-trip, i.e.:

      cs = ChemicalSystem(...)
      assert ChemicalSystem(f"{cs:in:units=internal}") == cs
      
    • skip=[gui|adf|band|forcefield|dftb|reaxff|qe|...] to suppress printing of a particular property group in the end-of-line string of an atom in the System%Atoms subblock. Multiple groups may be specified as a comma-separated list.

    • unused_atom_attributes=[drop|include] to determine whether unused atomic attribute groups should be written out as Modify%EnableAtomAttributes entries. The default is drop, i.e. they are not written. This means that unused atomic attribute groups are lost in the ChemicalSystem -> str -> ChemicalSystem round-trip. If preserving them is important, set this option to include.

  • xyz for writing a plain XYZ file without lattice or atomic attributes. This format has no options.

  • extended_xyz for writing the AMS extended XYZ format. This format has no options.
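The precision argument behind the units=internal option can be illustrated without the library. The sketch below is plain Python floating-point arithmetic, not scm.libbase code; the conversion factor is a rounded value assumed for this example, and the function names are hypothetical.

```python
# Rounded bohr-per-angstrom factor, assumed for this sketch only.
BOHR_PER_ANGSTROM = 1.8897261246


def exact_text_roundtrip(value: float) -> bool:
    """repr() gives the shortest decimal string that parses back to the same float."""
    return float(repr(value)) == value


def truncated_text_roundtrip(value: float, decimals: int = 6) -> bool:
    """Writing a fixed number of decimals usually does not reproduce the float."""
    return float(f"{value:.{decimals}f}") == value


def unit_roundtrip_error(value_bohr: float, factor: float = BOHR_PER_ANGSTROM) -> float:
    """Convert bohr -> angstrom -> bohr and return the absolute deviation.

    Each conversion is a rounded floating-point operation, so the result
    is close to, but not guaranteed to equal, the starting value.
    """
    in_angstrom = value_bohr / factor
    return abs(in_angstrom * factor - value_bohr)


z_bohr = 1.4669943905469853  # a coordinate in internal units (bohr)
print(exact_text_roundtrip(z_bohr))
print(truncated_text_roundtrip(z_bohr))
print(unit_roundtrip_error(z_bohr))
```

Writing the stored value directly, with enough digits, skips the conversion step entirely, which is the reason a round-trip in internal units can be exact.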

When you convert a ChemicalSystem object into a string (either by explicitly calling str(my_chemical_system) or by passing it to print()), the output is in the System Block format.

>>> from scm.libbase import ChemicalSystem

>>> # Read it from a text file in the 'System Block' format:
>>> my_chemical_system = ChemicalSystem.from_in(filename="water.in")

>>> print(my_chemical_system)

System
    Atoms
        O    0.0000000000000000  0.0000000000000000  0.0000000000000000
        H    1.0000000000000000  0.0000000000000000  0.0000000000000000
        H    0.0000000000000000  1.0000000000000000  0.0000000000000000
    End
End

>>> # Write it to a file in the 'System Block' format:
>>> my_chemical_system.write_in(filename="another_water.in")

Note on lossless serialization

When you read or write a ChemicalSystem using either the System Block format or a kf file, the serialization is lossless (for the System Block format, an exact round-trip is guaranteed when writing in internal units). In other words, writing the object to a kf file and reading it back yields an identical ChemicalSystem.

However, be cautious when using the xyz format, which is not lossless. Writing a ChemicalSystem to xyz and reading it back may introduce rounding errors and loses information such as the bonds between atoms.
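Why a plain XYZ round-trip loses bonds can be seen in a toy model. The sketch below does not use scm.libbase: ToyMolecule, to_plain_xyz and from_plain_xyz are hypothetical helpers that follow the common plain XYZ layout (atom count, comment line, then one "symbol x y z" line per atom), which has no place to store bonds.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMolecule:
    """Hypothetical stand-in for illustration; not the scm.libbase class."""
    atoms: list  # (symbol, x, y, z) tuples
    bonds: list = field(default_factory=list)  # (index_a, index_b, order) tuples


def to_plain_xyz(mol: ToyMolecule) -> str:
    """Write atom count, an empty comment line, then one line per atom."""
    lines = [str(len(mol.atoms)), ""]
    lines += [f"{s} {x:.8f} {y:.8f} {z:.8f}" for s, x, y, z in mol.atoms]
    return "\n".join(lines) + "\n"


def from_plain_xyz(text: str) -> ToyMolecule:
    """Read back only what the format stores: symbols and coordinates."""
    lines = text.splitlines()
    count = int(lines[0])
    atoms = []
    for line in lines[2:2 + count]:
        symbol, x, y, z = line.split()
        atoms.append((symbol, float(x), float(y), float(z)))
    return ToyMolecule(atoms)


water = ToyMolecule(
    atoms=[("O", 0.0, 0.0, 0.0), ("H", 1.0, 0.0, 0.0), ("H", 0.0, 1.0, 0.0)],
    bonds=[(0, 1, 1.0), (0, 2, 1.0)],
)
restored = from_plain_xyz(to_plain_xyz(water))
print(restored.bonds)     # the bond list did not survive the round-trip
print(restored == water)  # so the two objects no longer compare equal
```

The coordinates in this example survive only because 0.0 and 1.0 are exactly representable with eight decimals; arbitrary coordinates would also pick up rounding errors.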