2.13. The params Python library

The Getting Started: Lennard-Jones Potential for Argon and Calculate reference values with ParAMS tutorials used the ParAMS Main Script to run the parametrization or calculate reference data. They used .yaml files for the job collection, training set, and engine collection.

This tutorial shows how you can do the same things using the params Python library. The Python library is more flexible, and offers many useful functions for manipulating the contents of data sets, job collections, and parameter interfaces.
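
For example, a data set can be loaded from a .yaml file and inspected entry by entry. The following is a minimal sketch; it assumes that a DataSet behaves as a list of entries with expression, weight, and reference attributes (check the DataSet API documentation for the exact interface):

from scm.params import *

training_set = DataSet('training_set.yaml')
print(f'Number of entries: {len(training_set)}')
for entry in training_set:
    # the attribute names below are assumed; see the DataSet documentation
    print(entry.expression, entry.weight, entry.reference)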

2.13.1. Run an optimization with the Python library

Here, the Getting Started: Lennard-Jones Potential for Argon tutorial is run with the params Python library.

Download LJ_Ar_example.zip, but replace the params.conf.py file with a file my_optimization.py containing the following:

#!/usr/bin/env amspython

from scm.params import *

def main():
    ### The training_set and job_collection contain paths to the corresponding .yaml files
    training_set = DataSet('training_set.yaml')
    job_collection = JobCollection('job_collection.yaml')

    ### The LennardJonesParameters interface has two parameters:
    ### 'eps' (epsilon) and 'rmin' (distance at which the potential reaches a minimum)
    interface = LennardJonesParameters()
    interface['eps'].value = 3e-4     # Hartree
    interface['eps'].range = (1e-5, 1e-2)
    interface['rmin'].value = 4.0     # angstrom
    interface['rmin'].range = (1.0, 8.0)
    print("Initial parameters and ranges:")
    print(interface)

    ### Define an optimizer for the optimization task. Use either a CMAOptimizer or Scipy
    #optimizer = CMAOptimizer(sigma=0.1, popsize=10, minsigma=5e-4)
    optimizer = Scipy(method='Nelder-Mead')   # simplex minimizer from scipy.optimize

    ### loss function: 'sse' = sum of squared errors
    loss = 'sse'

    ### run the optimization in serial
    parallel = ParallelLevels(parametervectors=1, jobs=1)

    ### Callbacks allow further control of the optimization procedure
    ### Here, we stop the optimization after 2 minutes if it has not finished.
    callbacks = [Logger(), Timeout(60*2)]

    opt = Optimization(job_collection=job_collection,
                       data_sets=training_set,
                       parameter_interface=interface,
                       optimizer=optimizer,
                       loss=loss,
                       parallel=parallel,
                       callbacks=callbacks)

    opt.summary()
    opt.optimize()

if __name__ == '__main__':
    main()

This file is almost identical to the params.conf.py file. The differences are:

  • The job_collection and training_set are now instances of JobCollection and DataSet.
  • The optimization is handled by the Optimization class. The summary method prints out a summary (akin to summary.txt), and the optimize method runs the optimization; its return value can be inspected as shown below.
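
The return value of optimize can also be used programmatically. The following is a hypothetical sketch, reusing the opt instance from the script above; it assumes that optimize() returns a result object carrying the best parameters and the corresponding loss (the attribute names x and fx are assumptions, so consult the ParAMS API documentation for the exact return type):

result = opt.optimize()
# x and fx are assumed attribute names for the best parameter
# vector and the best loss value, respectively
print('Best loss:', result.fx)
print('Best parameters:', result.x)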

To run the file, use the amspython Python interpreter:

"$AMSBIN/amspython" my_optimization.py

2.13.2. Calculate reference values

Here, the Calculate reference values with ParAMS tutorial is run with the params Python library.

Download LJ_Ar_no_reference_data_example.zip, but replace the params.conf.py file with a file calculate_reference_values.py containing the following:

#!/usr/bin/env amspython

from scm.params import *

def main():
    job_collection = JobCollection('job_collection.yaml')
    data_set = DataSet('training_set.yaml')
    engine_collection = EngineCollection('job_collection_engines.yaml')

    dse = DataSetEvaluator()
    # 'saved_reference_calculations' is equivalent to 'reference.cache' from the `params genref` command
    dse.calculate_reference(job_collection, data_set, 
                            engine_collection, folder='saved_reference_calculations')

    # equivalent to 'training_set.ref.yaml' from the `params genref` command
    data_set.store('new_training_set.yaml') 


if __name__ == '__main__':
    main()

Here, the DataSetEvaluator class is used to calculate the reference values. After calculate_reference() has finished, the calculated values are stored on the data set entries, which is why data_set.store() writes a training set that includes reference values.
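
A quick way to verify the result is to load the stored file and print its entries. This sketch assumes that data set entries expose expression and reference attributes (see the DataSet API documentation):

from scm.params import *

data_set = DataSet('new_training_set.yaml')
for entry in data_set:
    # each entry should now carry a calculated reference value
    print(entry.expression, entry.reference)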

To run the file, use the amspython Python interpreter:

"$AMSBIN/amspython" calculate_reference_values.py

2.13.3. Training and validation sets with the Python library

Here, the Training and validation sets tutorial is run with the params Python library.

Download LJ_Ar_validation_set.zip, but replace the params.conf.py file with a file my_optimization.py containing the following:

#!/usr/bin/env amspython

from scm.params import *

def main():
    ### The training_set and job_collection contain paths to the corresponding .yaml files
    training_set = DataSet('training_set.yaml')
    validation_set = DataSet('validation_set.yaml')
    job_collection = JobCollection('job_collection.yaml')

    ### The LennardJonesParameters interface has two parameters:
    ### 'eps' (epsilon) and 'rmin' (distance at which the potential reaches a minimum)
    interface = LennardJonesParameters()
    interface['eps'].value = 3e-4     # Hartree
    interface['eps'].range = (1e-5, 1e-2)
    interface['rmin'].value = 4.0     # angstrom
    interface['rmin'].range = (1.0, 8.0)
    print("Initial parameters and ranges:")
    print(interface)

    ### Define an optimizer for the optimization task. Use either a CMAOptimizer or Scipy
    #optimizer = CMAOptimizer(sigma=0.1, popsize=10, minsigma=5e-4)
    optimizer = Scipy(method='Nelder-Mead')   # simplex minimizer from scipy.optimize

    ### loss function: 'sse' = sum of squared errors
    loss = 'sse'

    ### run the optimization in serial
    parallel = ParallelLevels(parametervectors=1, jobs=1)

    ### Callbacks allow further control of the optimization procedure
    ### Here, we stop the optimization after 2 minutes if it has not finished.
    callbacks = [Logger(), Timeout(60*2)]

    opt = Optimization(job_collection=job_collection,
                       data_sets=[training_set, validation_set], # the first entry must be the training set
                       parameter_interface=interface,
                       optimizer=optimizer,
                       loss=loss,
                       parallel=parallel,
                       callbacks=callbacks,
                       logger_every=5,
                       eval_every=5)

    opt.summary()
    opt.optimize()

if __name__ == '__main__':
    main()

The data_sets argument to the Optimization constructor now contains a list of DataSet instances. The first entry must be the training set. The logger_every and eval_every arguments control how often results are logged and how often the extra data sets (here, the validation set) are evaluated.
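
If you do not have a separate validation_set.yaml, you can create one by splitting an existing data set. The sketch below assumes a DataSet.split() method that returns randomly drawn subsets of the given relative sizes (check the DataSet API documentation for the exact signature):

from scm.params import *

data_set = DataSet('training_set.yaml')
# split() with relative sizes is assumed here; see the DataSet docs
training_set, validation_set = data_set.split(0.8, 0.2)
training_set.store('my_training_set.yaml')
validation_set.store('my_validation_set.yaml')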

To run the file, use the amspython Python interpreter:

"$AMSBIN/amspython" my_optimization.py