Simple Active Learning (SAL) is a workflow for on-the-fly training (active learning) of machine learning (ML) potentials during molecular dynamics (MD). It is “simple” because it only applies to a single MD simulation.


The workflow:

  • Trains an initial ML potential

  • Runs the MD simulation

  • Pauses the MD simulation and launches new reference (typically DFT) calculations at set intervals or if the ML potential is not accurate enough

  • Retrains the ML potential on the new reference data

  • Rewinds the MD simulation to the last point where the ML potential was known to be accurate

  • Continues the MD simulation, pauses, retrains, rewinds, continues, …

Optionally, the workflow can be restarted from a previous workflow (skipping the initial training).
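
The loop described above can be illustrated with a toy sketch. This is not the AMS implementation or its API: the function name `simple_active_learning`, the `reach` parameter, and the accuracy criterion (the model counts as accurate at a frame if a training frame lies within `reach` frames of it) are all assumptions made purely for illustration. The MD run, reference calculation, and retraining are reduced to bookkeeping on frame numbers.

```python
def simple_active_learning(checkpoints, reach):
    """Toy model of the run / check / retrain / rewind loop.

    checkpoints: frame numbers at which the MD is paused and checked.
    reach: hypothetical measure of how far (in frames) the model can be
           trusted away from its nearest training frame.
    """
    training_frames = [0]   # "initial training" on the starting structure
    last_good = 0           # last frame where the model was known to be accurate
    log = []
    i = 0
    while i < len(checkpoints):
        frame = checkpoints[i]
        # (Toy) run MD from last_good up to `frame`, then pause and compare
        # against a reference calculation at `frame`.
        accurate = min(abs(frame - t) for t in training_frames) <= reach
        if accurate:
            log.append(("continue", frame))
            last_good = frame
            i += 1          # move on to the next segment
        else:
            # Add the new reference data, retrain, and rewind. The same
            # segment is then repeated starting from `last_good`.
            training_frames.append(frame)
            log.append(("retrain+rewind", last_good))
    return log, training_frames


if __name__ == "__main__":
    log, frames = simple_active_learning([2, 4, 7, 10], reach=2)
    for action, frame in log:
        print(action, frame)
```

With checkpoints at frames 2, 4, 7, and 10 and `reach=2`, the toy model passes the first check and is retrained once at each of the remaining three checkpoints, each time rewinding to the previous checkpoint.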


Fig. 3 Example: You run 10 ps MD, dividing the simulation into 4 segments (“active learning steps”) indicated by the blue bars. At the blue bars, the accuracy of the model is checked and a decision is made whether to continue the MD simulation or to retrain the model and rewind to a previous point.


Fig. 4 Example: You train a committee model that estimates the model’s uncertainty. As soon as the uncertainty increases above a given threshold, the MD simulation stops. The model is retrained and the simulation rewinds to the previous active learning step.
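
As a rough illustration of the committee idea, the sketch below treats the spread of the committee members' predictions as the uncertainty and reports the first frame where that spread exceeds a threshold. The choice of the population standard deviation as the uncertainty measure, the use of a single predicted value per member, and the names `committee_uncertainty` and `first_uncertain_frame` are assumptions for illustration only. The quantity and threshold actually used by the workflow are set in the active learning settings.

```python
from statistics import pstdev


def committee_uncertainty(predictions):
    """Spread (population standard deviation) of the committee members'
    predictions for a single frame."""
    return pstdev(predictions)


def first_uncertain_frame(trajectory_predictions, threshold):
    """Return the index of the first frame whose committee uncertainty
    exceeds `threshold`, or None if the whole trajectory stays below it."""
    for frame, predictions in enumerate(trajectory_predictions):
        if committee_uncertainty(predictions) > threshold:
            return frame
    return None


if __name__ == "__main__":
    # Made-up predictions from a three-member committee for three frames.
    trajectory = [
        [1.0, 1.0, 1.0],   # members agree
        [1.0, 1.1, 0.9],   # small disagreement
        [1.0, 1.5, 0.5],   # large disagreement
    ]
    print(first_uncertain_frame(trajectory, threshold=0.2))
```

In this picture, the MD simulation would stop at the returned frame, the model would be retrained, and the simulation would rewind to the previous active learning step.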

There are five main pieces of input:

  • Input system. This is the initial system for the MD simulation. The input is exactly the same as for any other AMS simulation.

  • Molecular dynamics settings. The simulation can be equilibrium or non-equilibrium MD. The settings/input are exactly the same as for any other AMS simulation.

  • Reference engine settings. This can be any engine, but would typically be one of the DFT engines ADF, BAND, or Quantum ESPRESSO. The settings/input are exactly the same as for any other AMS simulation. This engine determines the level of theory to which the ML potential is trained.

  • ParAMS ML training settings. You can train any ML potential that is supported by ParAMS, for example, M3GNet. The settings/input are exactly the same as for running standalone ParAMS with Task MachineLearning.

  • Active learning settings. These settings determine, for example, how frequently to launch new reference calculations and how to judge whether the ML potential is accurate enough.

The three main pieces of output are:

  • The requested MD trajectory, which can be analyzed for results

  • The trained ML model parameters, which can potentially also be used for other (production) simulations

  • All training and validation data, containing the results from the reference calculations


To run Simple Active Learning, you need licenses for:

  • Advanced workflows and tools (includes the workflow and ParAMS)

  • Classical force fields and machine learning potentials (to run the ML potential simulations)

  • The reference engine (e.g., ADF, BAND, or Quantum ESPRESSO)

What’s new in AMS2024?

The Simple Active Learning workflow is new in AMS2024.