{ "cells": [ { "cell_type": "markdown", "id": "5c117f05-f316-4aac-8d53-30824e7487e0", "metadata": {}, "source": [ "## Load a SimpleActiveLearningJob from disk\n", "\n", "Use ``load_external`` to load the job from the previous Jupyter Notebook tutorial. Make sure to provide the correct path!" ] }, { "cell_type": "code", "execution_count": 2, "id": "6991e304-c3ed-49a6-bf81-d1123540958c", "metadata": {}, "outputs": [], "source": [ "from scm.simple_active_learning import SimpleActiveLearningJob\n", "import scm.plams as plams\n", "import matplotlib.pyplot as plt\n", "import os" ] }, { "cell_type": "code", "execution_count": 3, "id": "cd1c69be-d5fa-4f67-8213-9237b70b2943", "metadata": {}, "outputs": [], "source": [ "# replace the path with your own path !\n", "previous_sal_job_path = os.path.expandvars(\"$AMSHOME/examples/SAL/Output/SingleMolecule/plams_workdir/sal\")\n", "job = SimpleActiveLearningJob.load_external(previous_sal_job_path)" ] }, { "cell_type": "markdown", "id": "ec7211b3-c28f-4163-a56d-c376f0939925", "metadata": {}, "source": [ "## Access the log file\n", "\n", "The results of the active learning are printed in a human-friendly format in the log file. For example, let's print the last few lines:" ] }, { "cell_type": "code", "execution_count": 4, "id": "da07e278-6da4-4da2-bfcd-19274f6f8506", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[31.01|17:55:16] \n", "[31.01|17:55:16] Current (cumulative) timings:\n", "[31.01|17:55:16] Time (s) Fraction\n", "[31.01|17:55:16] Ref. 
calcs 20.44 0.034\n", "[31.01|17:55:16] ML training 365.82 0.608\n", "[31.01|17:55:16] Simulations 215.21 0.358\n", "[31.01|17:55:16] \n", "[31.01|17:55:16] \n", "[31.01|17:55:16] Step 5 finished successfully!\n", "[31.01|17:55:16] \n", "[31.01|17:55:16] --- Begin summary ---\n", "[31.01|17:55:16] Step Attempt Status Reason finalframe_forces_max_delta\n", "[31.01|17:55:16] 1 1 FAILED Inaccurate 1.7423\n", "[31.01|17:55:16] 1 2 SUCCESS Accurate 0.5734\n", "[31.01|17:55:16] 2 1 FAILED Inaccurate 1.0679\n", "[31.01|17:55:16] 2 2 SUCCESS Accurate 0.3920\n", "[31.01|17:55:16] 3 1 FAILED Inaccurate 0.9762\n", "[31.01|17:55:16] 3 2 FAILED Inaccurate 0.7560\n", "[31.01|17:55:16] 3 3 SUCCESS Accurate 0.1781\n", "[31.01|17:55:16] 4 1 FAILED Inaccurate 0.3389\n", "[31.01|17:55:16] 4 2 SUCCESS Accurate 0.2827\n", "[31.01|17:55:16] 5 1 SUCCESS Accurate 0.5300\n", "[31.01|17:55:16] --- End summary ---\n", "[31.01|17:55:16] \n", "[31.01|17:55:16] The engine settings for the final trained ML engine are:\n", "[31.01|17:55:16] \n", "Engine MLPotential\n", " Backend M3GNet\n", " MLDistanceUnit angstrom\n", " MLEnergyUnit eV\n", " Model Custom\n", " ParameterDir /home/hellstrom/adfhome/scripting/scm/params/examples/ActiveLearning/jupyter_notebooks/example1_molecule_2hydroxyethanal/plams_workdir/sal/step4_attempt1_training/results/optimization/m3gnet/m3gnet\n", "EndEngine\n", "\n", "\n", "\n", "[31.01|17:55:16] Active learning finished!\n", "[31.01|17:55:16] Rerunning the simulation with the final parameters...\n", "[31.01|17:56:52] Goodbye!\n", "\n" ] } ], "source": [ "n_lines = 40\n", "end_of_logfile = \"\\n\".join(job.results.read_file(\"simple_active_learning.log\").split(\"\\n\")[-n_lines:])\n", "print(end_of_logfile)" ] }, { "cell_type": "markdown", "id": "b935ea12-d613-425e-ba6f-5ca771480f84", "metadata": {}, "source": [ "Above we can easily see that there were 5 active learning steps, and the engine settings for the final trained ML potential.\n", "\n", "Tip: You can copy-paste 
the lines from ``Engine MLPotential`` to ``EndEngine`` into AMSinput to use those engine settings for other production simulations in the GUI." ] }, { "cell_type": "markdown", "id": "6bb116b9-e4e4-4cad-954a-5ba153dd5e3f", "metadata": {}, "source": [ "## Access the MD trajectories\n", "\n", "By default, the ``ActiveLearning%AtEnd%RerunSimulation`` option is enabled. This means that after the active learning loop has finished, the entire simulation is rerun from scratch with the final set of parameters.\n", "\n", "There are thus two trajectories:\n", "\n", "* A trajectory run entirely with the final parameters. It is a normal AMS job, located in the ``final_production_simulation`` directory if that directory exists.\n", "\n", "* A trajectory in which the parameters were updated on the fly, and which may also have an inconsistent MD sampling frequency. This trajectory is located in one of the ``stepX_attemptY_simulation`` directories.\n", "\n", "Use the ``get_simulation_directory`` method to get the corresponding directories. In this example, ``allow_final=True`` returns the final production simulation directory and ``allow_final=False`` returns the last ``stepX_attemptY_simulation`` directory.\n", "\n", "Let's start with the **final production simulation**:" ] }, { "cell_type": "code", "execution_count": 5, "id": "c8746749-2d8f-48cc-bc92-50c3ad8200ec", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/home/hellstrom/adfhome/scripting/scm/simple_active_learning/examples/Output/SingleMolecule/plams_workdir/sal/final_production_simulation\n" ] } ], "source": [ "final_production_simulation_dir = job.results.get_simulation_directory(allow_final=True)\n", "print(final_production_simulation_dir)" ] }, { "cell_type": "markdown", "id": "00fb1b05-6569-442b-b950-622f5c38d789", "metadata": {}, "source": [ "Load the job and view the trajectory in AMSmovie:" ] }, { "cell_type": "code", "execution_count": 6, "id": "b16c1c88-dffb-4c34-a98d-8dc9fef20516", "metadata": {}, "outputs": [], "source": [ "final_job = plams.AMSJob.load_external(final_production_simulation_dir)\n", "final_rkf = final_job.results.rkfpath()" ] 
}, { "cell_type": "code", "execution_count": 7, "id": "b6a9b03f-bfe7-47f1-ba36-ec67bb1b8948", "metadata": {}, "outputs": [], "source": [ "!amsmovie {final_rkf}" ] }, { "cell_type": "markdown", "id": "fc7ebca5-ff48-43bf-9528-a6a7874f8e2f", "metadata": {}, "source": [ "Next, let's get the **other trajectory (run with the engine updated on the fly)**:" ] }, { "cell_type": "code", "execution_count": 8, "id": "9fd27790-6216-43ac-94c0-1436edd59621", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/home/hellstrom/adfhome/scripting/scm/simple_active_learning/examples/Output/SingleMolecule/plams_workdir/sal/step5_attempt1_simulation\n" ] } ], "source": [ "onthefly_simulation_dir = job.results.get_simulation_directory(allow_final=False)\n", "print(onthefly_simulation_dir)" ] }, { "cell_type": "code", "execution_count": 9, "id": "27aec137-ad2c-4410-afdc-d55d2466b8dc", "metadata": {}, "outputs": [], "source": [ "onthefly_job = plams.AMSJob.load_external(onthefly_simulation_dir)\n", "onthefly_rkf = onthefly_job.results.rkfpath()" ] }, { "cell_type": "code", "execution_count": 10, "id": "50472af0-90b2-48ff-bec1-2f958d1331df", "metadata": {}, "outputs": [], "source": [ "!amsmovie {onthefly_rkf}" ] }, { "cell_type": "markdown", "id": "f16a7e03-9584-4d13-9b24-b106b146a3e6", "metadata": {}, "source": [ "Let's compare the engine energy as a function of time for the two trajectories:" ] }, { "cell_type": "code", "execution_count": 11, "id": "8edcccd8-5e11-4c4f-b230-bd81676249ba", "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "plt.plot(\n", " final_job.results.get_history_property(\"Time\", \"MDHistory\"),\n", " final_job.results.get_history_property(\"EngineEnergy\"),\n", ")\n", "plt.plot(\n", " onthefly_job.results.get_history_property(\"Time\", \"MDHistory\"),\n", " onthefly_job.results.get_history_property(\"EngineEnergy\"),\n", ")\n", "plt.legend([\"Final\", \"On-the-fly\"])\n", "plt.xlabel(\"Time (fs)\")\n", "plt.ylabel(\"Engine energy (hartree)\");" ] }, { "cell_type": "markdown", "id": "94e2bab0-0bc1-4d7b-bb6c-f7bfafc71b97", "metadata": {}, "source": [ "The energy profiles look quite similar. For the on-the-fly trajectory, there are more datapoints for short times since the Simple Active Learning tool samples more frequently in the beginning of the simulation when there are only a few MD steps per active learning step." ] }, { "cell_type": "markdown", "id": "d43181e1-f76c-4a32-9ddd-5996d981c4d4", "metadata": {}, "source": [ "## Access the ParAMS training results\n", "\n", "Similarly to the final production trajectory, there is an input option ``ActiveLearning%AtEnd%RetrainModel`` which will retrain the model at the end, guaranteeing that all the generated reference data is used during the training or validation. However, this option is off by default.\n", "\n", "The method ``get_params_results_directory()`` returns the ParAMS results directory, which can be\n", "* used as the value for ``MachineLearning%LoadModel`` to continue with another active learning run, or\n", "* opened in the ParAMS GUI to view all results, including loss function minimization and predicted-vs-reference scatter plots\n", "\n", "The method ``get_params_job()`` returns a ``ParAMSJob`` whose results can directly be accessed using the normal ``ParAMSJob`` and ``ParAMSResults`` python APIs." 
] }, { "cell_type": "code", "execution_count": 12, "id": "c5e72c41-dc4d-46fa-855d-24256cf557d6", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/home/hellstrom/adfhome/scripting/scm/simple_active_learning/examples/Output/SingleMolecule/plams_workdir/sal/step4_attempt1_training/results\n" ] } ], "source": [ "params_results_dir = job.results.get_params_results_directory(allow_final=True)\n", "print(params_results_dir)" ] }, { "cell_type": "markdown", "id": "30cb7d21-db9b-4613-a5c6-68b3f21addbc", "metadata": {}, "source": [ "**Open it in the ParAMS GUI**:" ] }, { "cell_type": "code", "execution_count": 13, "id": "070c3a5d-509b-43c6-ba37-4be432d487bb", "metadata": {}, "outputs": [], "source": [ "!params -gui \"{params_results_dir}\"" ] }, { "cell_type": "markdown", "id": "60348ad8-3615-4473-8f5d-f760d824a8d8", "metadata": {}, "source": [ "The ParAMS GUI is the best way to quickly get an overview of the data sets and results.\n", "\n", "However, it can also be useful to **access results from Python**.\n", "\n", "The ``ParAMSJob`` and ``ParAMSResults`` classes are described in detail in the ParAMS Python examples; here we only give a quick example:" ] }, { "cell_type": "code", "execution_count": 14, "id": "fa9cc66c-93e2-4ae7-9407-f3a92798511c", "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": { "needs_background": "light" }, "output_type": "display_data" } ], "source": [ "params_job = job.results.get_params_job()\n", "params_job.results.plot_simple_correlation(\"forces\", source=\"best\");" ] }, { "cell_type": "markdown", "id": "a42527f5-8498-4acd-bcf6-1ffa1fb589b5", "metadata": {}, "source": [ "If you want to access individual data entries or the MAE in Python, you can use the ``params_job.results.get_data_set_evaluator()`` method. See the ParAMS DataSetEvaluator documentation for details." ] }, { "cell_type": "markdown", "id": "95af11e4-8c9b-4922-a061-3efbc2c53456", "metadata": {}, "source": [ "## Access the production engine settings\n", "\n", "The engine settings used for production simulation can be accessed from the ParAMS job:" ] }, { "cell_type": "code", "execution_count": 15, "id": "8b435a36-a937-4199-b4fb-ada7bcc0fd18", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Engine MLPotential\n", " Backend M3GNet\n", " MLDistanceUnit angstrom\n", " MLEnergyUnit eV\n", " Model Custom\n", " ParameterDir /home/hellstrom/adfhome/scripting/scm/simple_active_learning/examples/Output/SingleMolecule/plams_workdir/sal/step4_attempt1_training/results/optimization/m3gnet/m3gnet\n", "EndEngine\n", "\n", "\n" ] } ], "source": [ "engine_settings = params_job.results.get_production_engine_settings()\n", "print(plams.AMSJob(settings=engine_settings).get_input())" ] }, { "cell_type": "markdown", "id": "26f36de7-0c44-4b49-8958-3dee5bb3e23e", "metadata": {}, "source": [ "## Access the reference data" ] }, { "cell_type": "markdown", "id": "f549be48-c774-4d40-9675-a8cdbd1a297e", "metadata": {}, "source": [ "The ``stepX_attemptY_reference_data`` directories can be accessed using ``get_reference_data_directory()``:" ] }, { "cell_type": "code", "execution_count": 16, "id": "50384340-2c0d-4133-8921-51b99e76ae1b", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ 
"/home/hellstrom/adfhome/scripting/scm/simple_active_learning/examples/Output/SingleMolecule/plams_workdir/sal/step5_attempt1_reference_data\n" ] } ], "source": [ "ref_dir = job.results.get_reference_data_directory()\n", "print(ref_dir)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.12" } }, "nbformat": 4, "nbformat_minor": 5 }