Active Learning: Single-Molecule Setup and Run

Set up a minimal Simple Active Learning workflow for a small organic molecule, combining a reference method, molecular dynamics, and M3GNet retraining in Python.

Initial imports

from scm.simple_active_learning import SimpleActiveLearningJob
import scm.plams as plams
from scm.external_engines.core import interface_is_installed

assert interface_is_installed("m3gnet"), (
    "You must first install m3gnet with the AMS package manager"
)

Initialize PLAMS

plams.init()
PLAMS working folder: /path/to/plams_workdir

Input system

mol = plams.from_smiles("OCC=O")
for at in mol:
    at.properties = {}
plams.plot_molecule(mol)
[Figure: molecule plot generated from the notebook]

Reference engine settings

To keep the run time short, we use the UFF force field as the reference method. In production you would typically train against DFT reference data computed with ADF, BAND, or Quantum ESPRESSO.

ref_s = plams.Settings()
ref_s.input.ForceField.Type = "UFF"
ref_s.runscript.nproc = 1
print(plams.AMSJob(settings=ref_s).get_input())
Engine ForceField
  Type UFF
EndEngine

Molecular dynamics settings

Here, we use the convenient AMSNVTJob recipe to initialize some MD settings.

md_s = plams.AMSNVTJob(temperature=300, timestep=0.5, nsteps=10000).settings
print(plams.AMSJob(settings=md_s).get_input())
MolecularDynamics
  BinLog
    DipoleMoment False
    PressureTensor False
    Time False
... output trimmed ....
End

Task MolecularDynamics
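
As a quick sanity check on the MD settings above, the total simulated time follows from the step count and the time step. The sketch below assumes that the timestep argument is given in femtoseconds; the variable names are illustrative only and are not part of PLAMS.

```python
# Values copied from the AMSNVTJob call above.
nsteps = 10000
timestep_fs = 0.5  # assumption: the time step is expressed in femtoseconds

total_time_fs = nsteps * timestep_fs
total_time_ps = total_time_fs / 1000.0  # 1 ps = 1000 fs

print(f"Total MD time: {total_time_ps} ps")
```

Under that assumption the trajectory covers 5 ps, which is short but sufficient for a demonstration run on a small molecule.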

ParAMS ML Training settings

(Technical note: When using SimpleActiveLearningJob the ParAMS settings go under input.ams. When using ParAMSJob the settings instead simply go under input. See the ParAMS Python tutorials.)

ml_s = plams.Settings()
ml_s.input.ams.MachineLearning.Backend = "M3GNet"
ml_s.input.ams.MachineLearning.CommitteeSize = 1
ml_s.input.ams.MachineLearning.M3GNet.Model = "UniversalPotential"
ml_s.input.ams.MachineLearning.MaxEpochs = 200
print(SimpleActiveLearningJob(settings=ml_s).get_input())
MachineLearning
  Backend M3GNet
  CommitteeSize 1
  M3GNet
    Model UniversalPotential
  End
  MaxEpochs 200
End

Active learning settings

al_s = plams.Settings()
al_s.input.ams.ActiveLearning.Steps.Type = "Geometric"
al_s.input.ams.ActiveLearning.Steps.Geometric.Start = 10  # 10 MD frames
al_s.input.ams.ActiveLearning.Steps.Geometric.NumSteps = 5  # 5 AL steps
print(SimpleActiveLearningJob(settings=al_s).get_input())
ActiveLearning
  Steps
    Geometric
      NumSteps 5
      Start 10
    End
    Type Geometric
  End
End
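
To get a feel for what a geometric step sequence implies, the sketch below spaces 5 cumulative frame counts geometrically between Start = 10 and the 10000 MD steps set earlier. It assumes that the last step ends at the final MD frame and that values are rounded to the nearest integer; the exact rounding used by Simple Active Learning may differ. The function geometric_steps is a hypothetical helper, not part of the SCM API.

```python
def geometric_steps(start, stop, num_steps):
    """Return num_steps cumulative frame counts, geometrically spaced from start to stop."""
    if num_steps < 2:
        return [stop]
    # Constant ratio between consecutive step boundaries.
    ratio = (stop / start) ** (1.0 / (num_steps - 1))
    return [round(start * ratio**i) for i in range(num_steps)]


# Start and NumSteps come from al_s, the stop value from nsteps in md_s.
steps = geometric_steps(start=10, stop=10000, num_steps=5)
print(steps)
```

Under these assumptions the boundaries are 10, 56, 316, 1778, and 10000 frames, so each active learning step covers a longer MD segment than the previous one.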

Simple Active Learning Job

settings = ref_s + md_s + ml_s + al_s
job = SimpleActiveLearningJob(settings=settings, molecule=mol, name="sal")
print(job.get_input())
ActiveLearning
  Steps
    Geometric
      NumSteps 5
      Start 10
... output trimmed ....
     4 8 1.0
  End
  Charge 0
End
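
The + operator on plams.Settings merges nested settings branch by branch, which is why the four blocks above can be defined independently and then combined. The sketch below mimics that behaviour with plain dictionaries. Here merge is a hypothetical helper for illustration, not the PLAMS implementation, and it assumes that on a key conflict the right-hand operand wins.

```python
import copy


def merge(left, right):
    """Recursively merge two nested dicts; values from right win on conflict."""
    result = copy.deepcopy(left)
    for key, value in right.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            # Both sides have a nested branch: merge them key by key.
            result[key] = merge(result[key], value)
        else:
            # New key or a leaf value: take it from the right-hand side.
            result[key] = copy.deepcopy(value)
    return result


# Plain-dict stand-ins for ref_s, ml_s, and al_s.
ref = {"input": {"ForceField": {"Type": "UFF"}}, "runscript": {"nproc": 1}}
ml = {"input": {"ams": {"MachineLearning": {"Backend": "M3GNet", "MaxEpochs": 200}}}}
al = {"input": {"ams": {"ActiveLearning": {"Steps": {"Type": "Geometric"}}}}}

combined = merge(merge(ref, ml), al)
print(sorted(combined["input"]["ams"]))
```

Because the machine learning and active learning blocks both live under input.ams, they end up side by side in the merged result, while the reference engine settings stay directly under input.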

Run the job

job.run(watch=True)
[31.01|17:43:33] JOB sal STARTED
[31.01|17:43:33] JOB sal RUNNING
[31.01|17:43:34] Simple Active Learning 2023.205,  Nodes: 1, Procs: 1
[31.01|17:43:36] Composition of main system: C2H4O2
[31.01|17:43:36] All REFERENCE calculations will be performed with the following ForceField engine:
... output trimmed ....
[31.01|17:55:16] Active learning finished!
[31.01|17:55:16] Rerunning the simulation with the final parameters...
[31.01|17:56:52] Goodbye!
[31.01|17:56:52] JOB sal FINISHED
[31.01|17:56:52] JOB sal SUCCESSFUL

<scm.params.plams.simple_active_learning_job.SimpleActiveLearningResults at 0x7fd4176eaa90>

Python Script

#!/usr/bin/env python
# coding: utf-8

# ## Initial imports

from scm.simple_active_learning import SimpleActiveLearningJob
import scm.plams as plams
from scm.external_engines.core import interface_is_installed

assert interface_is_installed("m3gnet"), "You must first install m3gnet with the AMS package manager"


# ## Initialize PLAMS

plams.init()


# ## Input system

mol = plams.from_smiles("OCC=O")
for at in mol:
    at.properties = {}
plams.plot_molecule(mol)


# ## Reference engine settings
# To keep the run time short, we use the UFF force field as the reference method. In production you would typically train against DFT reference data computed with ADF, BAND, or Quantum ESPRESSO.

ref_s = plams.Settings()
ref_s.input.ForceField.Type = "UFF"
ref_s.runscript.nproc = 1


print(plams.AMSJob(settings=ref_s).get_input())


# ## Molecular dynamics settings
# Here, we use the convenient ``AMSNVTJob`` recipe to initialize some MD settings.

md_s = plams.AMSNVTJob(temperature=300, timestep=0.5, nsteps=10000).settings


print(plams.AMSJob(settings=md_s).get_input())


# ## ParAMS ML Training settings
#
# (Technical note: When using ``SimpleActiveLearningJob`` the ParAMS settings go under ``input.ams``. When using ``ParAMSJob`` the settings instead simply go under ``input``. See the ParAMS Python tutorials.)

ml_s = plams.Settings()
ml_s.input.ams.MachineLearning.Backend = "M3GNet"
ml_s.input.ams.MachineLearning.CommitteeSize = 1
ml_s.input.ams.MachineLearning.M3GNet.Model = "UniversalPotential"
ml_s.input.ams.MachineLearning.MaxEpochs = 200
print(SimpleActiveLearningJob(settings=ml_s).get_input())


# ## Active learning settings

al_s = plams.Settings()
al_s.input.ams.ActiveLearning.Steps.Type = "Geometric"
al_s.input.ams.ActiveLearning.Steps.Geometric.Start = 10  # 10 MD frames
al_s.input.ams.ActiveLearning.Steps.Geometric.NumSteps = 5  # 5 AL steps
print(SimpleActiveLearningJob(settings=al_s).get_input())


# ## Simple Active Learning Job

settings = ref_s + md_s + ml_s + al_s
job = SimpleActiveLearningJob(settings=settings, molecule=mol, name="sal")
print(job.get_input())


# ## Run the job

job.run(watch=True)