2.13. The params Python library
The Getting Started: Lennard-Jones Potential for Argon and Calculate reference values with ParAMS tutorials used the ParAMS Main Script to run the parametrization or calculate reference data. They used .yaml files for the job collection, training set, and engine collection.
This tutorial shows how you can do the same things using the params Python library. The Python library is more flexible and offers many useful functions for manipulating the contents of data sets, collections, and parameter interfaces.
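For example, the following sketch loads the job collection and training set from their .yaml files and adjusts a parameter interface. The len() calls are an assumption about the JobCollection and DataSet APIs (check the ParAMS API documentation); the parameter access follows the scripts below.

#!/usr/bin/env amspython
from scm.params import *

# Load the .yaml files used throughout these tutorials
job_collection = JobCollection('job_collection.yaml')
training_set = DataSet('training_set.yaml')

# Both collections behave like Python containers (len() support is assumed here)
print(len(job_collection), 'jobs,', len(training_set), 'training set entries')

# Parameter interfaces can be inspected and modified programmatically
interface = LennardJonesParameters()
interface['eps'].value = 3e-4  # Hartree
print(interface)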
2.13.1. Run an optimization with the Python library
Here, the Getting Started: Lennard-Jones Potential for Argon tutorial is run with the params Python library.
Download LJ_Ar_example.zip, but replace the params.conf.py file with a file my_optimization.py containing the following:
#!/usr/bin/env amspython
from scm.params import *

def main():
    ### The training_set and job_collection contain paths to the corresponding .yaml files
    training_set = DataSet('training_set.yaml')
    job_collection = JobCollection('job_collection.yaml')

    ### The LennardJonesParameters interface has two parameters:
    ### 'eps' (epsilon) and 'rmin' (distance at which the potential reaches a minimum)
    interface = LennardJonesParameters()
    interface['eps'].value = 3e-4  # Hartree
    interface['eps'].range = (1e-5, 1e-2)
    interface['rmin'].value = 4.0  # angstrom
    interface['rmin'].range = (1.0, 8.0)
    print("Initial parameters and ranges:")
    print(interface)

    ### Define an optimizer for the optimization task. Use either a CMAOptimizer or Scipy
    # optimizer = CMAOptimizer(sigma=0.1, popsize=10, minsigma=5e-4)
    optimizer = Scipy(method='Nelder-Mead')

    ### loss function: 'sse' = sum of squared errors
    loss = 'sse'

    ### run the optimization in serial
    parallel = ParallelLevels(parametervectors=1, jobs=1)

    ### Callbacks allow further control of the optimization procedure.
    ### Here, we stop the optimization after 2 minutes if it has not finished.
    callbacks = [Logger(), Timeout(60*2)]

    opt = Optimization(job_collection=job_collection,
                       data_sets=training_set,
                       parameter_interface=interface,
                       optimizer=optimizer,
                       loss=loss,
                       parallel=parallel,
                       callbacks=callbacks)

    opt.summary()
    opt.optimize()

if __name__ == '__main__':
    main()
This file is almost identical to the params.conf.py file. The differences are:

- The job_collection and training_set are now instances of JobCollection and DataSet.
- The optimization is handled by the Optimization class. The summary method prints out a summary (akin to summary.txt), and the optimize method runs the optimization.
To run the file, use the amspython Python interpreter:
"$AMSBIN/amspython" my_optimization.py
2.13.2. Calculate reference values
Here, the Calculate reference values with ParAMS tutorial is run with the params Python library.
Download LJ_Ar_no_reference_data_example.zip, but replace the params.conf.py file with a file calculate_reference_values.py containing the following:
#!/usr/bin/env amspython
from scm.params import *

def main():
    job_collection = JobCollection('job_collection.yaml')
    data_set = DataSet('training_set.yaml')
    engine_collection = EngineCollection('job_collection_engines.yaml')

    dse = DataSetEvaluator()

    # 'saved_reference_calculations' is equivalent to 'reference.cache' from the `params genref` command
    dse.calculate_reference(job_collection, data_set,
                            engine_collection, folder='saved_reference_calculations')

    # equivalent to 'training_set.ref.yaml' from the `params genref` command
    data_set.store('new_training_set.yaml')

if __name__ == '__main__':
    main()
Here, the DataSetEvaluator class is used to calculate the reference values.
To run the file, use the amspython Python interpreter:
"$AMSBIN/amspython" calculate_reference_values.py
2.13.3. Training and validation sets with the Python library
Here, the Training and validation sets tutorial is run with the params Python library.
Download LJ_Ar_validation_set.zip, but replace the params.conf.py file with a file my_optimization.py containing the following:
#!/usr/bin/env amspython
from scm.params import *

def main():
    ### The training_set and job_collection contain paths to the corresponding .yaml files
    training_set = DataSet('training_set.yaml')
    validation_set = DataSet('validation_set.yaml')
    job_collection = JobCollection('job_collection.yaml')

    ### The LennardJonesParameters interface has two parameters:
    ### 'eps' (epsilon) and 'rmin' (distance at which the potential reaches a minimum)
    interface = LennardJonesParameters()
    interface['eps'].value = 3e-4  # Hartree
    interface['eps'].range = (1e-5, 1e-2)
    interface['rmin'].value = 4.0  # angstrom
    interface['rmin'].range = (1.0, 8.0)
    print("Initial parameters and ranges:")
    print(interface)

    ### Define an optimizer for the optimization task. Use either a CMAOptimizer or Scipy
    # optimizer = CMAOptimizer(sigma=0.1, popsize=10, minsigma=5e-4)
    optimizer = Scipy(method='Nelder-Mead')

    ### loss function: 'sse' = sum of squared errors
    loss = 'sse'

    ### run the optimization in serial
    parallel = ParallelLevels(parametervectors=1, jobs=1)

    ### Callbacks allow further control of the optimization procedure.
    ### Here, we stop the optimization after 2 minutes if it has not finished.
    callbacks = [Logger(), Timeout(60*2)]

    opt = Optimization(job_collection=job_collection,
                       data_sets=[training_set, validation_set],  # the first entry must be the training set
                       parameter_interface=interface,
                       optimizer=optimizer,
                       loss=loss,
                       parallel=parallel,
                       callbacks=callbacks,
                       logger_every=5,
                       eval_every=5)

    opt.summary()
    opt.optimize()

if __name__ == '__main__':
    main()
The data_sets argument to the Optimization constructor contains a list of DataSet instances. The first entry must be the training set.
To run the file, use the amspython Python interpreter:
"$AMSBIN/amspython" my_optimization.py