{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Manually set ParAMSJob Settings"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Call the ``get_input()`` function to see what the ``params.in`` file will look like.\n",
    "\n",
    "Note:\n",
    "\n",
    "* All paths must be absolute paths. This is needed to be able to run the job through PLAMS\n",
    "* ``DataSet`` and ``Optimizer`` are *recurring blocks*, so they are initialized as lists. The ``DataSet[0]`` syntax means that the settings are set for the *first* dataset, etc."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Task Optimization\n",
      "\n",
      "DataSet\n",
      "  Name training_set\n",
      "  Path /path/training_set.yaml\n",
      "end\n",
      "DataSet\n",
      "  Name validation_set\n",
      "  Path /path/validation_set.yaml\n",
      "end\n",
      "\n",
      "JobCollection /path/job_collection.yaml\n",
      "\n",
      "LoggingInterval\n",
      "  General 10\n",
      "end\n",
      "\n",
      "Optimizer\n",
      "  CMAES\n",
      "    Popsize 8\n",
      "    Sigma0 0.01\n",
      "  End\n",
      "  Type CMAES\n",
      "end\n",
      "\n",
      "ParallelLevels\n",
      "  Optimizations 1\n",
      "end\n",
      "\n",
      "SkipX0 No\n",
      "\n",
      "\n"
     ]
    }
   ],
   "source": [
    "from scm.plams import *\n",
    "from scm.params import *\n",
    "import os\n",
    "\n",
    "job = ParAMSJob()\n",
    "job.settings.input.Task = 'Optimization'\n",
    "job.settings.input.JobCollection = '/path/job_collection.yaml' # absolute path\n",
    "job.settings.input.DataSet = [Settings(), Settings()] #DataSet is a recurring block\n",
    "job.settings.input.DataSet[0].Name = 'training_set'\n",
    "job.settings.input.DataSet[0].Path = '/path/training_set.yaml' # absolute path\n",
    "job.settings.input.DataSet[1].Name = 'validation_set'\n",
    "job.settings.input.DataSet[1].Path = '/path/validation_set.yaml' # absolute path\n",
    "job.settings.input.LoggingInterval.General = 10\n",
    "job.settings.input.SkipX0 = 'No' # Booleans are specified as strings \"Yes\" or \"No\"\n",
    "job.settings.input.Optimizer = [Settings()] # Optimizer is a recurring block\n",
    "job.settings.input.Optimizer[0].Type = 'CMAES'\n",
    "job.settings.input.Optimizer[0].CMAES.Sigma0 = 0.01\n",
    "job.settings.input.Optimizer[0].CMAES.Popsize = 8\n",
    "job.settings.input.ParallelLevels.Optimizations = 1 # ParallelLevels is NOT a recurring block\n",
    "\n",
    "print(job.get_input())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Load a job from a params.in file\n",
    "\n",
    "If you already have a ``params.in`` file (for example created by the GUI or by hand), you can simply load it into a ``ParAMSJob`` using ``from_inputfile()``.\n",
    "\n",
    "Note that any paths in the ``params.in`` file get converted to absolute paths."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "task Optimization\n",
      "\n",
      "parameterinterface /home/hellstrom/adfhome/scripting/scm/params/examples/LJ_Ar/parameter_interface.yaml\n",
      "\n",
      "dataset\n",
      "  name training_set\n",
      "  path /home/hellstrom/adfhome/scripting/scm/params/examples/LJ_Ar/training_set.yaml\n",
      "end\n",
      "\n",
      "jobcollection /home/hellstrom/adfhome/scripting/scm/params/examples/LJ_Ar/job_collection.yaml\n",
      "\n",
      "exitcondition\n",
      "  timelimit 120\n",
      "  type TimeLimit\n",
      "end\n",
      "exitcondition\n",
      "  maxoptimizersconverged 1\n",
      "  type MaxOptimizersConverged\n",
      "end\n",
      "\n",
      "optimizer\n",
      "  scipy\n",
      "    algorithm Nelder-Mead\n",
      "  End\n",
      "  type Scipy\n",
      "end\n",
      "\n",
      "parallellevels\n",
      "  jobs 1\n",
      "  optimizations 1\n",
      "  parametervectors 1\n",
      "  processes 1\n",
      "end\n",
      "\n",
      "\n"
     ]
    }
   ],
   "source": [
    "params_in_file = os.path.expandvars('$AMSHOME/scripting/scm/params/examples/LJ_Ar/params.in')\n",
    "job = ParAMSJob.from_inputfile(params_in_file)\n",
    "print(job.get_input())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Create a ParAMSJob from a directory with .yaml files\n",
    "\n",
    "In the input file you need to specify many paths to different .yaml files. This can be tedious to set up manually. If you have a directory with .yaml files (e.g. the ``jobname.params`` directory created by the GUI), you can initialize a ParAMSJob to read those yaml files using ``from_yaml()``. The files need to have the default names:\n",
    "\n",
    "\n",
    "\n",
    "* job_collection.yaml\n",
    "* training_set.yaml\n",
    "* validation_set.yaml\n",
    "* job_collection_engines.yaml or engine_collection.yaml\n",
    "* parameter_interface.yaml or parameters.yaml\n",
    "\n",
    "Note 1: When you run a ParAMSJob, any .yaml files in the current working directory will be used if they have the default names and if the corresponding settings are unset. In this way, you do not need to specify the paths in the ``settings`` if you have the .yaml files in the same directory as the script ``.py`` file that runs the job.\n",
    "\n",
    "Note 2: ``from_yaml()`` only sets the settings for the yaml files and leaves all other settings empty. ``from_inputfile()`` reads all the settings from the params.in file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "ParameterInterface /home/hellstrom/adfhome/scripting/scm/params/examples/LJ_Ar/parameter_interface.yaml\n",
      "\n",
      "DataSet\n",
      "  Name training_set\n",
      "  Path /home/hellstrom/adfhome/scripting/scm/params/examples/LJ_Ar/training_set.yaml\n",
      "end\n",
      "\n",
      "JobCollection /home/hellstrom/adfhome/scripting/scm/params/examples/LJ_Ar/job_collection.yaml\n",
      "\n",
      "\n"
     ]
    }
   ],
   "source": [
    "job = ParAMSJob.from_yaml(os.path.expandvars('$AMSHOME/scripting/scm/params/examples/LJ_Ar'))\n",
    "print(job.get_input())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Input validation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The allowed settings blocks, keys, and values are described in the documentation. If you make a mistake in the block or key names, ``get_input()`` will raise an error:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Input error: unrecognized entry \"nonexistingkey\" found in line 10\n"
     ]
    }
   ],
   "source": [
    "job.settings.input.NonExistingKey = 3.14159\n",
    "try:\n",
    "    print(job.get_input())\n",
    "except Exception as e:\n",
    "    print(e)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If you want to print the input anyway, use ``get_input(validate=False)``:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "ParameterInterface /home/hellstrom/adfhome/scripting/scm/params/examples/LJ_Ar/parameter_interface.yaml\n",
      "\n",
      "DataSet\n",
      "  Name training_set\n",
      "  Path /home/hellstrom/adfhome/scripting/scm/params/examples/LJ_Ar/training_set.yaml\n",
      "end\n",
      "\n",
      "JobCollection /home/hellstrom/adfhome/scripting/scm/params/examples/LJ_Ar/job_collection.yaml\n",
      "\n",
      "NonExistingKey 3.14159\n",
      "\n",
      "\n"
     ]
    }
   ],
   "source": [
    "print(job.get_input(validate=False)) # print the input anyway"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Delete a block or key"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To delete an entry from the Settings, use ``del``:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "del job.settings.input.NonExistingKey "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Attributes for easier setup of .yaml files"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "``ParAMSJob`` has some special attributes which makes it easier to set up the settings. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "DataSet\n",
      "  Name training_set\n",
      "  path my_training_set.yaml\n",
      "end\n",
      "\n",
      "JobCollection my_job_collection.yaml\n",
      "\n",
      "\n"
     ]
    }
   ],
   "source": [
    "job = ParAMSJob()\n",
    "job.job_collection = 'my_job_collection.yaml' # will be converted to absolute path if it exists\n",
    "job.training_set = 'my_training_set.yaml' # will be converted to absolute path if it exists\n",
    "print(job.get_input())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note that ``job.training_set`` and ``job.validation_set`` are quite special: when you assign a string to them as above, it will set the corresponding ``path`` in the ``settings``. But when you read them you will get the corresponding Settings block:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Name: \ttraining_set\n",
      "path: \tmy_training_set.yaml\n",
      "\n",
      "<class 'scm.plams.core.settings.Settings'>\n"
     ]
    }
   ],
   "source": [
    "print(job.training_set)\n",
    "print(type(job.training_set))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "my_training_set.yaml\n",
      "<class 'str'>\n"
     ]
    }
   ],
   "source": [
    "print(job.training_set.path)\n",
    "print(type(job.training_set.path))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Assigning to ``job.validation_set`` will create another item in the ``job.settings.input.DataSet`` list:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "DataSet\n",
      "  Name training_set\n",
      "  path my_training_set.yaml\n",
      "end\n",
      "DataSet\n",
      "  Name validation_set\n",
      "  path validation_set.yaml\n",
      "end\n",
      "\n",
      "JobCollection my_job_collection.yaml\n",
      "\n",
      "\n"
     ]
    }
   ],
   "source": [
    "job.validation_set = 'validation_set.yaml'\n",
    "print(job.get_input())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To set other settings for the training set or validation set, use the standard dot-notation:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "DataSet\n",
      "  Name training_set\n",
      "  path my_training_set.yaml\n",
      "end\n",
      "DataSet\n",
      "  EvaluateEvery 100\n",
      "  Name validation_set\n",
      "  path validation_set.yaml\n",
      "end\n",
      "\n",
      "JobCollection my_job_collection.yaml\n",
      "\n",
      "LoggingInterval\n",
      "  General 100\n",
      "end\n",
      "\n",
      "\n"
     ]
    }
   ],
   "source": [
    "job.validation_set.EvaluateEvery = 100\n",
    "job.settings.input.LoggingInterval.General = 100\n",
    "print(job.get_input())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can also use the ``job.parameter_interface`` and ``job.engine_collection`` in the same way as ``job.job_collection``:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "ParameterInterface my_parameter_interface.yaml\n",
      "\n",
      "DataSet\n",
      "  Name training_set\n",
      "end\n",
      "\n",
      "EngineCollection my_engine_collection.yaml\n",
      "\n",
      "\n"
     ]
    }
   ],
   "source": [
    "job = ParAMSJob()\n",
    "job.parameter_interface = 'my_parameter_interface.yaml' # will be converted to absolute path if it exists\n",
    "job.engine_collection = 'my_engine_collection.yaml' # will be converted to absolute path if it exists\n",
    "print(job.get_input())\n",
    "# note: job.training_set is always defined, this is why a DataSet block is printed below"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Functions for recurring blocks: Optimizers, Stoppers, ExitConditions\n",
    "\n",
    "Use the below functions to easily add optimizers, stoppers, or exit conditions:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "DataSet\n",
      "  Name training_set\n",
      "end\n",
      "\n",
      "ExitCondition\n",
      "  MaxTotalFunctionCalls 100000\n",
      "  Type MaxTotalFunctionCalls\n",
      "end\n",
      "ExitCondition\n",
      "  TimeLimit 86400\n",
      "  Type TimeLimit\n",
      "end\n",
      "ExitCondition\n",
      "  StopsAfterConvergence\n",
      "    OptimizersConverged 3\n",
      "    OptimizersStopped 1\n",
      "  End\n",
      "  Type StopsAfterConvergence\n",
      "end\n",
      "\n",
      "Optimizer\n",
      "  CMAES\n",
      "    PopSize 8\n",
      "    Sigma0 0.01\n",
      "  End\n",
      "  Type CMAES\n",
      "end\n",
      "Optimizer\n",
      "  Scipy\n",
      "  End\n",
      "  Type Scipy\n",
      "end\n",
      "\n",
      "Stopper\n",
      "  BestFunctionValueUnmoving\n",
      "    Tolerance 0.1\n",
      "  End\n",
      "  Type BestFunctionValueUnmoving\n",
      "end\n",
      "Stopper\n",
      "  MaxFunctionCalls 1000\n",
      "  Type MaxFunctionCalls\n",
      "end\n",
      "\n",
      "\n"
     ]
    }
   ],
   "source": [
    "job = ParAMSJob()\n",
    "\n",
    "job.add_exit_condition(\"MaxTotalFunctionCalls\", 100000)\n",
    "job.add_exit_condition(\"TimeLimit\", 24*60*60)\n",
    "job.add_exit_condition(\"StopsAfterConvergence\", {'OptimizersConverged': 3, 'OptimizersStopped': 1})\n",
    "\n",
    "job.add_optimizer(\"CMAES\", {'Sigma0': 0.01, 'PopSize': 8})\n",
    "job.add_optimizer(\"Scipy\")\n",
    "\n",
    "job.add_stopper(\"BestFunctionValueUnmoving\", {'Tolerance': 0.1})\n",
    "job.add_stopper(\"MaxFunctionCalls\", 1000)\n",
    "\n",
    "print(job.get_input())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To delete an added recurring block, use ``pop`` together with zero-based indices:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "DataSet\n",
      "  Name training_set\n",
      "end\n",
      "\n",
      "ExitCondition\n",
      "  MaxTotalFunctionCalls 100000\n",
      "  Type MaxTotalFunctionCalls\n",
      "end\n",
      "ExitCondition\n",
      "  StopsAfterConvergence\n",
      "    OptimizersConverged 3\n",
      "    OptimizersStopped 1\n",
      "  End\n",
      "  Type StopsAfterConvergence\n",
      "end\n",
      "\n",
      "Optimizer\n",
      "  CMAES\n",
      "    PopSize 8\n",
      "    Sigma0 0.01\n",
      "  End\n",
      "  Type CMAES\n",
      "end\n",
      "\n",
      "Stopper\n",
      "  MaxFunctionCalls 1000\n",
      "  Type MaxFunctionCalls\n",
      "end\n",
      "\n",
      "\n"
     ]
    }
   ],
   "source": [
    "job.settings.input.ExitCondition.pop(1) # 2nd exit condition\n",
    "job.settings.input.Optimizer.pop(1) # 2nd optimizer\n",
    "job.settings.input.Stopper.pop(0) # first stopper\n",
    "print(job.get_input())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note: the ``ExitConditionBooleanCombination`` and ``StopperBooleanCombination`` work with indices starting with 1."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}