from __future__ import annotations
from pathlib import Path
from typing import Iterable, Literal, Sequence
from scm.pisa.block import DriverBlock, EngineBlock, FixedBlock, FreeBlock, InputBlock
from scm.pisa.key import BoolKey, FloatKey, FloatListKey, IntKey, IntListKey, MultipleChoiceKey, PathStringKey, StringKey, BoolType
class ParAMS(DriverBlock):
r"""
:ivar ApplyStoppersToBestOptimizer: By default the stoppers are not applied to the best optimizer (the one who has seen the best value thus far). This is because many stoppers are based on comparisons to the best optimizer, and in most scenarios one would like to keep a well-performing optimizer alive. For some stopper configurations this paradigm does not make sense and we would prefer to apply the stoppers equally to all optimizers.
:vartype ApplyStoppersToBestOptimizer: BoolType | BoolKey
:ivar CheckStopperInterval: Number of loss function evaluations between evaluations of the stopper conditions.
:vartype CheckStopperInterval: int | IntKey
:ivar EndTimeout: The amount of time the manager will wait trying to smoothly join each optimizer at the end of the run. If exceeded the manager will abandon the optimizer and shutdown. This can raise errors from the abandoned threads, but may be needed to ensure the manager closes and does not hang.
This option is often needed if the Scipy optimizers are being used and should be set to a low value.
:vartype EndTimeout: float | FloatKey
:ivar EngineCollection: Path to (optional) JobCollection Engines YAML file.
:vartype EngineCollection: str | StringKey
:ivar EvaluateLoss: Evaluate the loss function based on the job results. This will produce the same output files as Task Optimization.
If No, this will be skipped and only the jobs will be run (and saved).
Warning: If both Store Jobs and Evaluate Loss are No then this task will not produce any output.
:vartype EvaluateLoss: BoolType | BoolKey
:ivar ExitConditionBooleanCombination: If multiple ExitConditions are used, this key indicates how their evaluations relate to one another.
Use an integer to refer to an exit condition (defined by order in the input file).
Recognizes the symbols: ( ) & |
E.g. (1 & 2) | 3.
Defaults to an OR combination of all selected exit conditions.
:vartype ExitConditionBooleanCombination: str | StringKey
:ivar FilterInfiniteValues: If Yes, removes points from the calculation with non-finite loss values.
Non-finite points can cause numerical issues in the sensitivity calculation.
:vartype FilterInfiniteValues: BoolType | BoolKey
:ivar GlompoLogging: Include status and progress information from the optimization manager in the printstreams.
:vartype GlompoLogging: BoolType | BoolKey
:ivar GlompoSummaryFiles: Indicates what GloMPO-style outputs you would like saved to disk. Higher values also save all lower level information.
Available options:
• None: Nothing is saved.
• 1: YAML file with summary info about the optimization settings, performance and the result.
• 2: PNG file showing the trajectories of the optimizers.
• 3: HDF5 file containing iteration history for each optimizer.
• 4: More detailed HDF5 log including the residual results for each optimizer, data set and iteration.
:vartype GlompoSummaryFiles: Literal["None", "1", "2", "3", "4"]
:ivar JobCollection: Path to JobCollection YAML file.
:vartype JobCollection: str | StringKey
:ivar MoreExtractorsPath: Path to directory with extractors.
:vartype MoreExtractorsPath: str | Path | StringKey
:ivar NumberBootstraps: Number of repeats of the calculation with different sub-samples.
A small spread from a large number of bootstraps provides confidence on the estimation of the sensitivity.
:vartype NumberBootstraps: int | IntKey
:ivar NumberCalculationSamples: Number of samples from the full set available to use in the calculation.
If not specified or -1, uses all available points. For the sensitivity calculation, this will be redrawn for every bootstrap.
:vartype NumberCalculationSamples: int | IntKey
:ivar NumberSamples: Number of samples to generate during the sampling procedure.
:vartype NumberSamples: int | IntKey
:ivar PLAMSWorkingDirectory: Path to PLAMS working directory to temporarily hold Job results files.
:vartype PLAMSWorkingDirectory: str | Path | StringKey
:ivar ParameterInterface: Path to parameter interface YAML file.
:vartype ParameterInterface: str | StringKey
:ivar PrintStatusInterval: Number of seconds between printing of a status summary.
:vartype PrintStatusInterval: float | FloatKey
:ivar RandomSeed: Random seed to use during the sampling procedure (for reproducibility).
:vartype RandomSeed: int | IntKey
:ivar RestartDirectory: Specify a directory to continue interrupted GenerateReference or SinglePoint calculations. The directory depends on the task:
GenerateReference: results/reference_jobs
SinglePoint: results/single_point/jobs
Note: If you use the GUI this directory will be COPIED into the results folder and the name will be prepended with 'dep-'. This can take up a lot of disk space, so you may want to remove the 'dep-' folder after the job has finished.
:vartype RestartDirectory: str | Path | StringKey
:ivar ResultsDirectory: Directory in which output files will be created.
:vartype ResultsDirectory: str | Path | StringKey
:ivar ResumeCheckpoint: Path to checkpoint file from which a previous optimization can be resumed.
:vartype ResumeCheckpoint: str | Path | StringKey
:ivar RunReweightCalculation: Run a more expensive sensitivity calculation that will also return suggested weights for the training set which will produce more balanced sensitivities between all the parameters.
Note: The Gaussian kernel is recommended for the loss values kernel in this case.
:vartype RunReweightCalculation: BoolType | BoolKey
:ivar RunSampling: Produce a set of samples of the loss function and active parameters. Samples from the parameter space are drawn from a uniform random distribution.
Such a set of samples serves as the input to the sensitivity calculation.
:vartype RunSampling: BoolType | BoolKey
:ivar SampleWithReplacement: Sample from the available data with or without replacement.
This only has an effect if the number of samples used in the calculation is less than the total number available; otherwise sampling with replacement is used by necessity.
:vartype SampleWithReplacement: BoolType | BoolKey
:ivar SamplesDirectory: Path to an 'optimization' directory containing the results of a previously run sampling.
First looks for a 'glompo_log.h5' file. If not found, will look for 'running_loss.txt' and 'running_active_parameters.txt' in a sub-directory. The sub-directory used will depend on the DataSet Name.
For the Reweight calculation only a 'glompo_log.h5' file (with residuals) may be used.
:vartype SamplesDirectory: str | Path | StringKey
:ivar SaveResiduals: During the sampling, save the individual difference between reference and predicted values for every sample and training set item.
Required for the Reweight calculation, and will be automatically activated if the reweight calculation is requested.
Saving and analyzing the residuals can provide valuable insight into your training set, but can quickly occupy a large amount of disk space. Only save the residuals if you would like to run the reweight calculation or have a particular reason to do so.
:vartype SaveResiduals: BoolType | BoolKey
:ivar Scaler: Type of scaling applied to the parameters. A scaled input space is needed by many optimization algorithms.
Available options:
• Linear: Scale all parameters between 0 and 1.
• Std: Scale all parameters between -1 and 1.
• None: Applies no scaling.
• Optimizers (Default): Does not specify a scaling at the manager level, but allows the selection to be governed by the optimizer/s. If they do not require any particular scaler, then 'linear' is selected as the ultimate fallback.
:vartype Scaler: Literal["Linear", "Std", "None", "Optimizers"]
:ivar SetToAnalyze: Name of the data set to use for the sensitivity analysis.
:vartype SetToAnalyze: Literal["TrainingSet", "ValidationSet"]
:ivar ShareBestEvaluationBetweenOptimizers: Share new best evaluations from one optimizer to another.
Some algorithms can use this information to accelerate their own convergence. However, optimizers typically have to be configured to receive and handle the information.
This option can work very well with CMA-ES injections.
:vartype ShareBestEvaluationBetweenOptimizers: BoolType | BoolKey
:ivar SkipX0: Do not evaluate the initial parameters before starting the optimization.
If the initial parameters are evaluated and do not return a finite loss function value, the optimization will abort. A non-finite value typically indicates crashed jobs.
:vartype SkipX0: BoolType | BoolKey
:ivar SplitPrintstreams: Split print statements from each optimizer to separate files.
:vartype SplitPrintstreams: BoolType | BoolKey
:ivar StopperBooleanCombination: If multiple Stoppers are used, this is required to indicate how their evaluations relate to one another.
Use an integer to refer to a stopper (defined by order in input file).
Recognizes the symbols: ( ) & |
E.g. (1 & 2) | 3.
Defaults to an OR combination of all selected stoppers.
:vartype StopperBooleanCombination: str | StringKey
:ivar StoreJobs: Keeps the results files for each of the jobs.
If No, all pipeable jobs will be run through the AMS Pipe and no files will be saved (not even the ones not run through the pipe). If Auto, the pipeable jobs are run through the pipe and the results of nonpipeable jobs are saved to disk. If Yes, no jobs are run through the pipe and all job results are stored on disk.
Warning: If both Store Jobs and Evaluate Loss are No then task SinglePoint will not produce any output.
:vartype StoreJobs: Literal["Auto", "Yes", "No"]
:ivar Task: Task to run.
Available options:
• MachineLearning: Optimization for machine learning models.
• Optimization: Global optimization powered by GloMPO.
• Generate Reference: Run jobs with the reference engine to get reference values.
• Single Point: Evaluate the current configuration of jobs, training data, and parameters.
• Sensitivity: Measure the sensitivity of the loss function to each of the active parameters.
:vartype Task: Literal["Optimization", "GenerateReference", "SinglePoint", "Sensitivity", "MachineLearning"]
:ivar Validation: Fraction of the training set to be used as a validation set. Will be ignored if a validation set has been explicitly defined.
:vartype Validation: float | FloatKey
:ivar CheckpointControl: Settings to control the production of checkpoints from which the optimization can be resumed.
:vartype CheckpointControl: ParAMS._CheckpointControl
:ivar Constraints: Parameter constraint rules to apply to the loss function. One per line. Use 'p' to reference the parameter set. You may use indices or names in square brackets to refer to a specific variable.
Note that indices are absolute, i.e., they do not reference the active subset of parameters.
E.g.:
p[0]>=p[2]
p['O:p_boc2']==p['H:p_boc2']
:vartype Constraints: str | Sequence[str] | FreeBlock
:ivar ControlOptimizerSpawning: Control the spawning of optimizers. Note that this is different from ExitConditions. These simply stop new optimizers from spawning, but leave existing ones untouched. ExitConditions shut down active optimizers and stop the optimization.
:vartype ControlOptimizerSpawning: ParAMS._ControlOptimizerSpawning
:ivar DataSet: Configuration settings for each data set in the optimization.
:vartype DataSet: ParAMS._DataSet
:ivar Engine: If set, use this engine for the ParAMS SinglePoint. Mutually exclusive with EngineCollection.
:vartype Engine: EngineBlock
:ivar ExitCondition: A condition used to stop the optimization when it returns true.
:vartype ExitCondition: ParAMS._ExitCondition
:ivar Generator: A Generator used to produce x0 starting points for the optimizers.
:vartype Generator: ParAMS._Generator
:ivar LoggingInterval: Number of function evaluations between every log to file.
:vartype LoggingInterval: ParAMS._LoggingInterval
:ivar LossValuesKernel: Kernel applied to the loss function values in the sensitivity calculation.
:vartype LossValuesKernel: ParAMS._LossValuesKernel
:ivar MachineLearning: Options for Task MachineLearning.
:vartype MachineLearning: ParAMS._MachineLearning
:ivar Optimizer: An optimizer which may be used during the optimization.
:vartype Optimizer: ParAMS._Optimizer
:ivar OptimizerSelector: If multiple Optimizers are included, then this block must be included and configures the Selector which will choose between them.
:vartype OptimizerSelector: ParAMS._OptimizerSelector
:ivar ParallelLevels: Distribution of threads/processes between the parallelization levels.
:vartype ParallelLevels: ParAMS._ParallelLevels
:ivar ParametersKernel: Kernel applied to the parameters for which sensitivity is being measured.
:vartype ParametersKernel: ParAMS._ParametersKernel
:ivar Stopper: A Stopper used to terminate optimizers early.
:vartype Stopper: ParAMS._Stopper
"""
class _CheckpointControl(FixedBlock):
r"""
Settings to control the production of checkpoints from which the optimization can be resumed.
:ivar AtEnd: Create a checkpoint when the exit condition/s are triggered.
:vartype AtEnd: BoolType | BoolKey
:ivar AtInitialisation: Create a checkpoint immediately at the start of an optimization.
:vartype AtInitialisation: BoolType | BoolKey
:ivar CheckpointingDirectory: Directory in which the checkpoints will be saved.
Defaults to 'checkpoints' in the results directory
:vartype CheckpointingDirectory: str | Path | StringKey
:ivar EveryFunctionCalls: Create a checkpoint every n function evaluations.
If not specified or -1, checkpoints are not created based on function calls.
:vartype EveryFunctionCalls: int | IntKey
:ivar EverySeconds: Create a checkpoint every n seconds.
If not specified or -1, a checkpoint is not created based on time.
:vartype EverySeconds: float | FloatKey
:ivar KeepPast: Number of earlier checkpoints to keep. Older ones are deleted when a new one is created.
-1 does not delete any previous checkpoints, and 0 retains only the most recent checkpoint.
This number excludes the most recent checkpoint which is obviously always retained! So the actual number of files will be larger than this number by one.
:vartype KeepPast: int | IntKey
:ivar NamingFormat: Convention used to name the checkpoints.
The following special keys are supported:
• %(date): Current calendar date in YYYYMMDD format
• %(year): Year formatted to YYYY
• %(yr): Year formatted to YY
• %(month): Numerical month formatted to MM
• %(day): Calendar day of the month formatted to DD
• %(time): Current calendar time formatted to HHMMSS (24-hour style)
• %(hour): Hour formatted to HH (24-hour style)
• %(min): Minutes formatted to MM
• %(sec): Seconds formatted to SS
• %(count): Index count of the number of checkpoints constructed. Starts at zero, formatted to 3 digits.
:vartype NamingFormat: str | StringKey
:ivar RaiseFail: Raise an error and stop the optimization if a checkpoint fails to be constructed, otherwise issue a warning and continue the optimization.
:vartype RaiseFail: BoolType | BoolKey
"""
def __post_init__(self):
self.AtEnd: BoolType | BoolKey = BoolKey(name='AtEnd', comment='Create a checkpoint when the exit condition/s are triggered.', gui_name='Checkpoint at end: ', default=False)
self.AtInitialisation: BoolType | BoolKey = BoolKey(name='AtInitialisation', comment='Create a checkpoint immediately at the start of an optimization.', gui_name='Checkpoint at start: ', default=False)
self.CheckpointingDirectory: str | Path | StringKey = PathStringKey(name='CheckpointingDirectory', comment="Directory in which the checkpoints will be saved.\nDefaults to 'checkpoints' in the results directory", default='', ispath=True)
self.EveryFunctionCalls: int | IntKey = IntKey(name='EveryFunctionCalls', comment='Create a checkpoint every n function evaluations.\n\nIf not specified or -1, checkpoints are not created based on function calls.', gui_name='Checkpoint interval (function evaluations): ')
self.EverySeconds: float | FloatKey = FloatKey(name='EverySeconds', comment='Create a checkpoint every n seconds.\n\nIf not specified or -1, a checkpoint is not created based on time.', gui_name='Checkpoint interval (seconds): ', default=3600.0, unit='s')
self.KeepPast: int | IntKey = IntKey(name='KeepPast', comment='Number of earlier checkpoints to keep. Older ones are deleted when a new one is created.\n-1 does not delete any previous checkpoints, and 0 retains only the most recent checkpoint.\n\nThis number excludes the most recent checkpoint which is obviously always retained! So the actual number of files will be larger than this number by one.', gui_name='Number of older checkpoints to keep: ', default=0)
self.NamingFormat: str | StringKey = StringKey(name='NamingFormat', comment='Convention used to name the checkpoints.\n\nThe following special keys are supported:\n• %(date): Current calendar date in YYYYMMDD format\n• %(year): Year formatted to YYYY\n• %(yr): Year formatted to YY\n• %(month): Numerical month formatted to MM\n• %(day): Calendar day of the month formatted to DD\n• %(time): Current calendar time formatted to HHMMSS (24-hour style)\n• %(hour): Hour formatted to HH (24-hour style)\n• %(min): Minutes formatted to MM\n• %(sec): Seconds formatted to SS\n• %(count): Index count of the number of checkpoints constructed. Starts at zero, formatted to 3 digits.', default='glompo_checkpoint_%(date)_%(time)')
self.RaiseFail: BoolType | BoolKey = BoolKey(name='RaiseFail', comment='Raise an error and stop the optimization if a checkpoint fails to be constructed, otherwise issue a warning and continue the optimization.', gui_name='Exit on failed checkpoint: ', default=False)
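# Example (sketch, hypothetical values): periodic checkpoints every 30 minutes
# plus a final checkpoint, keeping two older checkpoints and naming them with the
# special keys documented above.
#
#   driver.CheckpointControl.EverySeconds = 1800.0
#   driver.CheckpointControl.AtEnd = True
#   driver.CheckpointControl.KeepPast = 2
#   driver.CheckpointControl.NamingFormat = 'glompo_checkpoint_%(date)_%(count)'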
class _Constraints(FreeBlock):
r"""
Parameter constraint rules to apply to the loss function. One per line. Use 'p' to reference the parameter set. You may use indices or names in square brackets to refer to a specific variable.
Note that indices are absolute, i.e., they do not reference the active subset of parameters.
E.g.:
p[0]>=p[2]
p['O:p_boc2']==p['H:p_boc2']
"""
def __post_init__(self):
pass
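# Example (sketch): per the Constraints vartype above, the rules can be given as
# a sequence of strings, one rule per line of the input block. The parameter
# names below are hypothetical.
#
#   driver.Constraints = ["p[0] >= p[2]", "p['O:p_boc2'] == p['H:p_boc2']"]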
class _ControlOptimizerSpawning(FixedBlock):
r"""
Control the spawning of optimizers. Note that this is different from ExitConditions. These simply stop new optimizers from spawning, but leave existing ones untouched. ExitConditions shut down active optimizers and stop the optimization.
:ivar MaxEvaluations: No new optimizers will be started after this number of function evaluations has been used.
Note, this is different from the equivalent exit condition which would terminate existing optimizers rather than simply not allowing new ones to spawn.
:vartype MaxEvaluations: int | IntKey
:ivar MaxOptimizers: No new optimizers will be started after this number has been spawned.
Note, this is different from the equivalent exit condition which would terminate existing optimizers rather than simply not allowing new ones to spawn.
:vartype MaxOptimizers: int | IntKey
"""
def __post_init__(self):
self.MaxEvaluations: int | IntKey = IntKey(name='MaxEvaluations', comment='No new optimizers will be started after this number of function evaluations has been used.\n\nNote, this is different from the equivalent exit condition which would terminate existing optimizers rather than simply not allowing new ones to spawn.', gui_name='– n loss function evaluations:')
self.MaxOptimizers: int | IntKey = IntKey(name='MaxOptimizers', comment='No new optimizers will be started after this number has been spawned.\n\nNote, this is different from the equivalent exit condition which would terminate existing optimizers rather than simply not allowing new ones to spawn.', gui_name='– n optimizers started:')
class _DataSet(FixedBlock):
r"""
Configuration settings for each data set in the optimization.
:ivar BatchSize: Number of data set entries to be evaluated per epoch. Default 0 means all entries.
:vartype BatchSize: int | IntKey
:ivar EvaluateEvery: This data set is evaluated every n evaluations of the training set.
This will always be set to 1 for the training set. For other data sets it will be adjusted to the closest multiple of LoggingInterval%General, i.e., you cannot evaluate an extra data set more frequently than you log it.
:vartype EvaluateEvery: int | IntKey
:ivar LossFunction: Loss function used to quantify the error between model and reference values. This becomes the minimization task.
Available options:
• mae: Mean absolute error
• rmse: Root mean squared error
• sse: Sum of squared errors
• sae: Sum of absolute errors
:vartype LossFunction: Literal["mae", "rmse", "sse", "sae"]
:ivar MaxJobs: Limit each evaluation to a subset of n jobs. Default 0 meaning all jobs are used.
:vartype MaxJobs: int | IntKey
:ivar MaxJobsShuffle: Use a different job subset for every evaluation.
:vartype MaxJobsShuffle: BoolType | BoolKey
:ivar Name: Unique data set identifier.
The first occurrence of DataSet will always be called training_set.
The second will always be called validation_set.
These cannot be overwritten.
Later occurrences will default to data_set_xx where xx starts at 03 and increments from there. This field can be used to customize the latter names.
:vartype Name: str | StringKey
:ivar Path: Path to DataSet YAML file.
:vartype Path: str | StringKey
:ivar UsePipe: Use AMS Pipe for suitable jobs to speed-up evaluation.
:vartype UsePipe: BoolType | BoolKey
"""
def __post_init__(self):
self.BatchSize: int | IntKey = IntKey(name='BatchSize', comment='Number of data set entries to be evaluated per epoch. Default 0 means all entries.', default=0)
self.EvaluateEvery: int | IntKey = IntKey(name='EvaluateEvery', comment='This data set is evaluated every n evaluations of the training set.\n\nThis will always be set to 1 for the training set. For other data sets it will be adjusted to the closest multiple of LoggingInterval%General, i.e., you cannot evaluate an extra data set more frequently than you log it.', default=1)
self.LossFunction: Literal["mae", "rmse", "sse", "sae"] = MultipleChoiceKey(name='LossFunction', comment='Loss function used to quantify the error between model and reference values. This becomes the minimization task.\n\nAvailable options:\n• mae: Mean absolute error\n• rmse: Root mean squared error\n• sse: Sum of squared errors\n• sae: Sum of absolute errors', default='sse', choices=['mae', 'rmse', 'sse', 'sae'])
self.MaxJobs: int | IntKey = IntKey(name='MaxJobs', comment='Limit each evaluation to a subset of n jobs. Default 0 meaning all jobs are used.', default=0)
self.MaxJobsShuffle: BoolType | BoolKey = BoolKey(name='MaxJobsShuffle', comment='Use a different job subset for every evaluation.', default=False)
self.Name: str | StringKey = StringKey(name='Name', comment='Unique data set identifier.\n\nThe first occurrence of DataSet will always be called training_set.\nThe second will always be called validation_set.\nThese cannot be overwritten.\n\nLater occurrences will default to data_set_xx where xx starts at 03 and increments from there. This field can be used to customize the latter names.', default='')
self.Path: str | StringKey = StringKey(name='Path', comment='Path to DataSet YAML file.')
self.UsePipe: BoolType | BoolKey = BoolKey(name='UsePipe', comment='Use AMS Pipe for suitable jobs to speed-up evaluation.', default=True)
class _Engine(EngineBlock):
r"""
If set, use this engine for the ParAMS SinglePoint. Mutually exclusive with EngineCollection.
"""
def __post_init__(self):
pass
class _ExitCondition(FixedBlock):
r"""
A condition used to stop the optimization when it returns true.
:ivar MaxOptimizersConverged: Return True after n optimizers have converged normally.
An optimizer that has 'converged' is distinct from an optimizer that was shutdown via the Stopper settings (i.e. 'stopped').
:vartype MaxOptimizersConverged: int | IntKey
:ivar MaxOptimizersStarted: Return True after n optimizers have been started.
Note, this is best used in combination with other conditions because it will stop the optimization as soon as the correct number have been started and not allow newly spawned optimizers to iterate at all.
:vartype MaxOptimizersStarted: int | IntKey
:ivar MaxOptimizersStopped: Return True after n optimizers have been stopped.
An optimizer that has been 'stopped' is distinct from an optimizer that stopped due to its internal convergence conditions (i.e., 'converged').
:vartype MaxOptimizersStopped: int | IntKey
:ivar MaxTotalFunctionCalls: Return True after n function calls have been executed.
:vartype MaxTotalFunctionCalls: int | IntKey
:ivar TargetFunctionValue: Return True after an optimizer finds a function value less than or equal to n.
:vartype TargetFunctionValue: float | FloatKey
:ivar TimeLimit: Return True after the entire optimization has been running for n seconds.
Note, this is NOT the time for any particular optimizer.
See also: Time Limit Through Restarts
:vartype TimeLimit: float | FloatKey
:ivar TimeLimitThroughRestarts: Return True after the sum-total of all optimization runs (i.e., through all restarts) has been running for n seconds.
Note, this is NOT the time for any particular optimizer.
See also: Time Limit
:vartype TimeLimitThroughRestarts: float | FloatKey
:ivar Type: A condition used to stop the optimization when it returns true.
Available options:
• Time Limit: Limit the optimization walltime.
• Max Function Calls: Stop the optimization after a certain number of loss function evaluations.
• Max Optimizers Started: Shut down after a certain number of optimizers have been spawned.
• Max Optimizers Converged: Shut down when optimizers have converged via their own internal conditions.
• Max Optimizers Stopped: Shut down when the Stopping configuration has resulted in a number of optimizers being stopped.
• Stops After Convergence: Stop when some optimizers have converged and others have been stopped.
• Time Limit Through Restarts: Limit the total cumulative walltime across multiple restarts of the optimization.
• Target Function Value: Shut down when a small enough loss function value is found.
:vartype Type: Literal["TimeLimit", "MaxTotalFunctionCalls", "MaxOptimizersStarted", "MaxOptimizersConverged", "MaxOptimizersStopped", "StopsAfterConvergence", "TimeLimitThroughRestarts", "TargetFunctionValue"]
:ivar StopsAfterConvergence: Returns True when at least n_a optimizers have been stopped *after* n_b optimizers have converged.
:vartype StopsAfterConvergence: ParAMS._ExitCondition._StopsAfterConvergence
"""
class _StopsAfterConvergence(FixedBlock):
r"""
Returns True when at least n_a optimizers have been stopped *after* n_b optimizers have converged.
:ivar OptimizersConverged: Minimum number of converged optimizers.
:vartype OptimizersConverged: int | IntKey
:ivar OptimizersStopped: Maximum number of stopped optimizers.
:vartype OptimizersStopped: int | IntKey
"""
def __post_init__(self):
self.OptimizersConverged: int | IntKey = IntKey(name='OptimizersConverged', comment='Minimum number of converged optimizers.', gui_name='Min # converged: ', default=1)
self.OptimizersStopped: int | IntKey = IntKey(name='OptimizersStopped', comment='Maximum number of stopped optimizers.', gui_name='Max # stopped: ', default=0)
def __post_init__(self):
self.MaxOptimizersConverged: int | IntKey = IntKey(name='MaxOptimizersConverged', comment="Return True after n optimizers have converged normally.\n\nAn optimizer that has 'converged' is distinct from an optimizer that was shutdown via the Stopper settings (i.e. 'stopped').")
self.MaxOptimizersStarted: int | IntKey = IntKey(name='MaxOptimizersStarted', comment='Return True after n optimizers have been started.\n\nNote, this is best used in combination with other conditions because it will stop the optimization as soon as the correct number have been started and not allow newly spawned optimizers to iterate at all.')
self.MaxOptimizersStopped: int | IntKey = IntKey(name='MaxOptimizersStopped', comment="Return True after n optimizers have been stopped.\n\nAn optimizer that has been 'stopped' is distinct from an optimizer that stopped due to its internal convergence conditions (i.e., 'converged').")
self.MaxTotalFunctionCalls: int | IntKey = IntKey(name='MaxTotalFunctionCalls', comment='Return True after n function calls have been executed.')
self.TargetFunctionValue: float | FloatKey = FloatKey(name='TargetFunctionValue', comment='Return True after an optimizer finds a function value less than or equal to n.')
self.TimeLimit: float | FloatKey = FloatKey(name='TimeLimit', comment='Return True after the entire optimization has been running for n seconds.\n\nNote, this is NOT the time for any particular optimizer.\n\nSee also: Time Limit Through Restarts', unit='s')
self.TimeLimitThroughRestarts: float | FloatKey = FloatKey(name='TimeLimitThroughRestarts', comment='Return True after the sum-total of all optimization runs (i.e., through all restarts) has been running for n seconds.\n\nNote, this is NOT the time for any particular optimizer.\n\nSee also: Time Limit', unit='s')
self.Type: Literal["TimeLimit", "MaxTotalFunctionCalls", "MaxOptimizersStarted", "MaxOptimizersConverged", "MaxOptimizersStopped", "StopsAfterConvergence", "TimeLimitThroughRestarts", "TargetFunctionValue"] = MultipleChoiceKey(name='Type', comment='A condition used to stop the optimization when it returns true.\n\nAvailable options:\n• Time Limit: Limit the optimization walltime.\n• Max Function Calls: Stop the optimization after a certain number of loss function evaluations.\n• Max Optimizers Started: Shutdown after a certain number of optimizers have been spawned.\n• Max Optimizers Converged: Shutdown when optimizers have converged via their own internal conditions.\n• Max Optimizers Stopped: Shutdown when the Stopping configuration has resulted in a number of optimizers being stopped.\n• Stops After Convergence: Stop when some optimizers have converged and others have been stopped.\n• Time Limit Through Restarts: Limit the total cumulative walltime across multiple restarts of the optimization.\n• Target Function Value: Shutdown when a small enough loss function value is found.', default='TimeLimit', choices=['TimeLimit', 'MaxTotalFunctionCalls', 'MaxOptimizersStarted', 'MaxOptimizersConverged', 'MaxOptimizersStopped', 'StopsAfterConvergence', 'TimeLimitThroughRestarts', 'TargetFunctionValue'])
self.StopsAfterConvergence: ParAMS._ExitCondition._StopsAfterConvergence = self._StopsAfterConvergence(name='StopsAfterConvergence', comment='Returns True when at least n_a optimizers have been stopped *after* n_b optimizers have converged.', gui_name='Max stops after n converged: ')
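# Example (sketch): a single exit condition stopping the optimization after a
# fixed number of loss function evaluations. Several ExitCondition blocks can be
# combined via ExitConditionBooleanCombination; the access pattern for repeated
# blocks is not shown here and may differ per AMS version.
#
#   driver.ExitCondition.Type = 'MaxTotalFunctionCalls'
#   driver.ExitCondition.MaxTotalFunctionCalls = 5000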
class _Generator(FixedBlock):
r"""
A Generator used to produce x0 starting points for the optimizers.
:ivar Type: Algorithm used to pick starting points for the optimizers.
Available options:
• Incumbent: Optimizers will be started at the best point seen thus far by any optimizer. First point is random.
• ExploreExploit: Early starting points are random, but later points are closer to good minima.
• Perturbation: Each starting point is drawn from a multivariate Gaussian distribution centred at the initial parameter set.
• Random: Optimizers are started at random locations in parameter space.
• SinglePoint: All optimizers are started at the initial parameter values.
:vartype Type: Literal["Incumbent", "ExploreExploit", "Perturbation", "Random", "SinglePoint"]
:ivar ExploreExploit: Blends a randomly generated point with the location of an existing optimizer, based on how far the optimization has progressed in time. The optimizer is chosen by a roulette selection weighted by its function value. Early in the optimization, optimizers are started randomly; later they are started near previously found good minima.
:vartype ExploreExploit: ParAMS._Generator._ExploreExploit
:ivar Perturbation: Randomly generates parameter vectors from a multivariate normal distribution around the starting parameters.
:vartype Perturbation: ParAMS._Generator._Perturbation
"""
class _ExploreExploit(FixedBlock):
r"""
Blends a randomly generated point with the location of an existing optimizer, based on how far the optimization has progressed in time. The optimizer is chosen by a roulette selection weighted by its function value. Early in the optimization, optimizers are started randomly; later they are started near previously found good minima.
:ivar Focus: The blend parameter between random point and incumbent points. Used as follows: p=(f_calls / max_f_calls) ** focus
:vartype Focus: float | FloatKey
:ivar MaxFunctionCalls: Maximum function calls allowed for the optimization; at and beyond this point there is a 100% chance that a previously evaluated point will be returned by the generator. If the optimization is not limited by the number of function calls, provide an estimate.
:vartype MaxFunctionCalls: int | IntKey
"""
def __post_init__(self):
self.Focus: float | FloatKey = FloatKey(name='Focus', comment='The blend parameter between random point and incumbent points. Used as follows: p=(f_calls / max_f_calls) ** focus', default=1.0)
self.MaxFunctionCalls: int | IntKey = IntKey(name='MaxFunctionCalls', comment='Maximum function calls allowed for the optimization, at and beyond this point there is a 100% chance that a previously evaluated point will be returned by the generator. If the optimization is not limited by the number of function calls, provide an estimate.')
class _Perturbation(FixedBlock):
r"""
Randomly generates parameter vectors from a multivariate normal distribution around the starting parameters.
:ivar StandardDeviation: Standard deviation of the multivariate normal distribution. Used here to control how wide the generator should explore around the starting parameters.
:vartype StandardDeviation: float | FloatKey
"""
def __post_init__(self):
self.StandardDeviation: float | FloatKey = FloatKey(name='StandardDeviation', comment='Standard deviation of the multivariate normal distribution. Used here to control how wide the generator should explore around the starting parameters.', default=0.2)
def __post_init__(self):
self.Type: Literal["Incumbent", "ExploreExploit", "Perturbation", "Random", "SinglePoint"] = MultipleChoiceKey(name='Type', comment='Algorithm used to pick starting points for the optimizers.\n\nAvailable options:\n• Incumbent: Optimizers will be started at the best point seen thus far by any optimizer. First point is random.\n• ExploreExploit: Early starting points are random, but later points are closer to good minima.\n• Perturbation: Each starting point is drawn from a multivariate Gaussian distribution centred at the initial parameter set.\n• Random: Optimizers are started at random locations in parameter space.\n• SinglePoint: All optimizers are started at the initial parameter values.', gui_name='Starting points generator:', default='SinglePoint', choices=['Incumbent', 'ExploreExploit', 'Perturbation', 'Random', 'SinglePoint'])
self.ExploreExploit: ParAMS._Generator._ExploreExploit = self._ExploreExploit(name='ExploreExploit', comment='Blends a randomly generated point with the location of an existing optimizer, based on the time progress through the optimization. The optimizer is chosen based on a weighted roulette selection based on their function value. Early in the optimization optimizers are started randomly, and later they are started near previously found good minima.')
self.Perturbation: ParAMS._Generator._Perturbation = self._Perturbation(name='Perturbation', comment='Randomly generates parameter vectors from a multivariate normal distribution around the starting parameters.')
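# Example (sketch, hypothetical value): start every optimizer from a perturbation
# of the initial parameters instead of the default SinglePoint generator.
#
#   driver.Generator.Type = 'Perturbation'
#   driver.Generator.Perturbation.StandardDeviation = 0.1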
class _LoggingInterval(FixedBlock):
r"""
Number of function evaluations between every log to file.
:ivar Flush: The number of function evaluations between flushes of the output streams to disk.
If Flush = General, then files will be flushed after every logged call.
:vartype Flush: int | IntKey
:ivar General: The number of function evaluations between writes to stdout, running_loss.txt, running_stats.txt and all `latest/` and `best/` files.
This is also the interval with which the validation set will be evaluated and logged.
:vartype General: int | IntKey
:ivar History: The number of function evaluations between saves to the history directory with copies of latest/.
:vartype History: int | IntKey
:ivar Parameters: The number of function evaluations between writes to running_active_parameters.txt.
:vartype Parameters: int | IntKey
"""
def __post_init__(self):
self.Flush: int | IntKey = IntKey(name='Flush', comment='The number of function evaluations between flushes of the output streams to disk.\n\nIf Flush = General, then files will be flushed after every logged call.', default=10)
self.General: int | IntKey = IntKey(name='General', comment='The number of function evaluations between writes to stdout, running_loss.txt, running_stats.txt and all `latest/` and `best/` files.\n\nThis is also the interval with which the validation set will be evaluated and logged.', default=10)
self.History: int | IntKey = IntKey(name='History', comment='The number of function evaluations between saves to the history directory with copies of latest/.', default=500)
self.Parameters: int | IntKey = IntKey(name='Parameters', comment='The number of function evaluations between writes to running_active_parameters.txt.', default=500)
class _LossValuesKernel(FixedBlock):
r"""
Kernel applied to the loss function values in the sensitivity calculation.
:ivar Alpha: Cut-off parameter for the Threshold kernel between zero and one.
All loss values are scaled by taking the logarithm and then adjusted to a range between zero and one. This parameter is a value within this scaled space.
:vartype Alpha: float | FloatKey
:ivar Gamma: Bandwidth parameter for the conjunctive-Gaussian kernel.
:vartype Gamma: float | FloatKey
:ivar Sigma: Bandwidth parameter for the Gaussian kernel.
If not specified or -1, calculates a reasonable default based on the number of parameters being tested.
:vartype Sigma: float | FloatKey
:ivar Type: Name of the kernel applied to the loss function values in the sensitivity calculation.
:vartype Type: Literal["Gaussian", "ConjunctiveGaussian", "Threshold", "Polynomial", "Linear"]
:ivar Polynomial: Settings for the Polynomial kernel.
:vartype Polynomial: ParAMS._LossValuesKernel._Polynomial
"""
class _Polynomial(FixedBlock):
r"""
Settings for the Polynomial kernel.
:ivar Order: Maximum order of the polynomial.
:vartype Order: int | IntKey
:ivar Shift: Free parameter (≥ 0) trading off higher-order versus lower-order effects.
:vartype Shift: float | FloatKey
"""
def __post_init__(self):
self.Order: int | IntKey = IntKey(name='Order', comment='Maximum order of the polynomial.', default=1)
self.Shift: float | FloatKey = FloatKey(name='Shift', comment='Free parameter (≥ 0) trading off higher-order versus lower-order effects.', default=0.0)
def __post_init__(self):
self.Alpha: float | FloatKey = FloatKey(name='Alpha', comment='Cut-off parameter for the Threshold kernel between zero and one.\n\nAll loss values are scaled by taking the logarithm and then adjusted to a range between zero and one. This parameter is a value within this scaled space. ')
self.Gamma: float | FloatKey = FloatKey(name='Gamma', comment='Bandwidth parameter for the conjunctive-Gaussian kernel.', default=0.3)
self.Sigma: float | FloatKey = FloatKey(name='Sigma', comment='Bandwidth parameter for the Gaussian kernel.\n\nIf not specified or -1, calculates a reasonable default based on the number of parameters being tested.')
self.Type: Literal["Gaussian", "ConjunctiveGaussian", "Threshold", "Polynomial", "Linear"] = MultipleChoiceKey(name='Type', comment='Name of the kernel applied to the loss function values in the sensitivity calculation.', default='ConjunctiveGaussian', choices=['Gaussian', 'ConjunctiveGaussian', 'Threshold', 'Polynomial', 'Linear'])
self.Polynomial: ParAMS._LossValuesKernel._Polynomial = self._Polynomial(name='Polynomial', comment='Settings for the Polynomial kernel.')
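# Example (sketch): a sensitivity run that first samples the loss function and
# then analyses it with a Gaussian kernel on the loss values. The sample counts
# below are hypothetical.
#
#   driver.Task = 'Sensitivity'
#   driver.RunSampling = True
#   driver.NumberSamples = 1000
#   driver.NumberBootstraps = 20
#   driver.LossValuesKernel.Type = 'Gaussian'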
class _MachineLearning(FixedBlock):
r"""
Options for Task MachineLearning.
:ivar Backend: The backend to use. You must separately install the backend before running a training job.
:vartype Backend: Literal["Custom", "M3GNet", "NequIP", "Test"]
:ivar CommitteeSize: The number of independently trained ML potentials.
:vartype CommitteeSize: int | IntKey
:ivar LoadModel: Load a previously fitted model from a ParAMS results directory. A ParAMS results directory should contain two subdirectories ``optimization`` and ``settings_and_initial_data``. This option ignores all settings inside model blocks.
:vartype LoadModel: str | Path | StringKey
:ivar MaxEpochs: Set the maximum number of epochs a backend should perform.
:vartype MaxEpochs: int | IntKey
:ivar RunAMSAtEnd: Whether to run the (committee) ML potential through AMS at the end. This will create the energy/forces scatter plots for the final trained model.
:vartype RunAMSAtEnd: BoolType | BoolKey
:ivar Custom: Set up a custom fitting program within ParAMS
:vartype Custom: ParAMS._MachineLearning._Custom
:ivar LossCoeffs: Modify the coefficients for the machine learning loss function. For backends that support weights, this is on top of the supplied dataset weights and sigmas.
:vartype LossCoeffs: ParAMS._MachineLearning._LossCoeffs
:ivar M3GNet: Options for M3GNet fitting.
:vartype M3GNet: ParAMS._MachineLearning._M3GNet
:ivar NequIP: Options for NequIP fitting.
:vartype NequIP: ParAMS._MachineLearning._NequIP
:ivar Target: Target values for stopping training. If both the training and validation metrics are smaller than the specified values, the training will stop early. Only supported by the M3GNet backend.
:vartype Target: ParAMS._MachineLearning._Target
"""
class _Custom(FixedBlock):
r"""
Set up a custom fitting program within ParAMS
:ivar File: Python file containing a function called 'get_fit_job' that returns a subclass of 'FitJob'
:vartype File: str | Path | StringKey
:ivar Arguments: Pass on keyword arguments to the 'get_fit_job' function.
:vartype Arguments: str | Sequence[str] | FreeBlock
"""
class _Arguments(FreeBlock):
r"""
Pass on keyword arguments to the 'get_fit_job' function.
"""
def __post_init__(self):
pass
def __post_init__(self):
self.File: str | Path | StringKey = PathStringKey(name='File', comment="Python file containing a function called 'get_fit_job' that returns a subclass of 'FitJob'", ispath=True)
self.Arguments: str | Sequence[str] | FreeBlock = self._Arguments(name='Arguments', comment="Pass on keyword arguments to the 'get_fit_job' function.")
class _LossCoeffs(FixedBlock):
r"""
Modify the coefficients for the machine learning loss function. For backends that support weights, this is on top of the supplied dataset weights and sigmas.
:ivar AverageForcePerAtom: For each force data entry, divide the loss contribution by the number of constituent atoms. This is the same as the behavior for ParAMS Optimization, but it is turned off by default in Task MachineLearning. For machine learning, setting this to 'No' can be better since larger molecules will contribute more to the loss. For backends that support weights, this is on top of the supplied dataset weights and sigmas.
:vartype AverageForcePerAtom: BoolType | BoolKey
:ivar Energy: Coefficient for the contribution of loss due to the energy. For backends that support weights, this is on top of the supplied dataset weights and sigmas.
:vartype Energy: float | FloatKey
:ivar Forces: Coefficient for the contribution of loss due to the forces. For backends that support weights, this is on top of the supplied dataset weights and sigmas.
:vartype Forces: float | FloatKey
"""
def __post_init__(self):
self.AverageForcePerAtom: BoolType | BoolKey = BoolKey(name='AverageForcePerAtom', comment="For each force data entry, divide the loss contribution by the number of constituent atoms. This is the same as the behavior for ParAMS Optimization, but it is turned off by default in Task MachineLearning. For machine learning, setting this to 'No' can be better since larger molecules will contribute more to the loss. For backends that support weights, this is on top of the supplied dataset weights and sigmas.", default=False)
self.Energy: float | FloatKey = FloatKey(name='Energy', comment='Coefficient for the contribution of loss due to the energy. For backends that support weights, this is on top of the supplied dataset weights and sigmas.', gui_name='Energy coefficient:', default=10.0)
self.Forces: float | FloatKey = FloatKey(name='Forces', comment='Coefficient for the contribution of loss due to the forces. For backends that support weights, this is on top of the supplied dataset weights and sigmas.', gui_name='Forces coefficient:', default=1.0)
class _M3GNet(FixedBlock):
r"""
Options for M3GNet fitting.
:ivar LearningRate: Learning rate for the M3GNet weight optimization.
:vartype LearningRate: float | FloatKey
:ivar Model: How to specify the model for the M3GNet backend. Either a Custom model can be made from scratch or an existing model directory can be loaded to obtain the model settings.
:vartype Model: Literal["UniversalPotential", "Custom", "ModelDir"]
:ivar ModelDir: Path to the directory defining the model. This folder should contain the files: 'checkpoint', 'm3gnet.data-00000-of-00001', 'm3gnet.index' and 'm3gnet.json'
:vartype ModelDir: str | Path | StringKey
:ivar Custom: Specify a custom M3GNet model.
:vartype Custom: ParAMS._MachineLearning._M3GNet._Custom
:ivar UniversalPotential: Settings for (transfer) learning with the M3GNet Universal Potential.
:vartype UniversalPotential: ParAMS._MachineLearning._M3GNet._UniversalPotential
"""
class _Custom(FixedBlock):
r"""
Specify a custom M3GNet model.
:ivar Cutoff: Cutoff radius of the graph
:vartype Cutoff: float | FloatKey
:ivar MaxL: Include spherical components up to order MaxL. Higher gives a better angular resolution, but increases computational cost substantially.
:vartype MaxL: int | IntKey
:ivar MaxN: Include radial components up to the MaxN'th root of the spherical Bessel function. Higher gives a better radial resolution, but increases computational cost substantially.
:vartype MaxN: int | IntKey
:ivar NumBlocks: Number of convolution blocks.
:vartype NumBlocks: int | IntKey
:ivar NumNeurons: Number of neurons in each layer.
:vartype NumNeurons: int | IntKey
:ivar ThreebodyCutoff: Cutoff radius of the three-body interaction.
:vartype ThreebodyCutoff: float | FloatKey
"""
def __post_init__(self):
self.Cutoff: float | FloatKey = FloatKey(name='Cutoff', comment='Cutoff radius of the graph', default=5.0, unit='angstrom')
self.MaxL: int | IntKey = IntKey(name='MaxL', comment='Include spherical components up to order MaxL. Higher gives a better angular resolution, but increases computational cost substantially.', default=3)
self.MaxN: int | IntKey = IntKey(name='MaxN', comment="Include radial components up to the MaxN'th root of the spherical Bessel function. Higher gives a better radial resolution, but increases computational cost substantially.", default=3)
self.NumBlocks: int | IntKey = IntKey(name='NumBlocks', comment='Number of convolution blocks.', gui_name='Number of convolution blocks: ', default=3)
self.NumNeurons: int | IntKey = IntKey(name='NumNeurons', comment='Number of neurons in each layer.', gui_name='Number of neurons per layer:', default=64)
self.ThreebodyCutoff: float | FloatKey = FloatKey(name='ThreebodyCutoff', comment='Cutoff radius of the three-body interaction.', default=4.0, unit='angstrom')
class _UniversalPotential(FixedBlock):
r"""
Settings for (transfer) learning with the M3GNet Universal Potential.
:ivar Featurizer: Train the Featurizer layer of the M3GNet universal potential.
:vartype Featurizer: BoolType | BoolKey
:ivar Final: Train the Final layer of the M3GNet universal potential.
:vartype Final: BoolType | BoolKey
:ivar GraphLayer1: Train the first Graph layer of the M3GNet universal potential.
:vartype GraphLayer1: BoolType | BoolKey
:ivar GraphLayer2: Train the second Graph layer of the M3GNet universal potential.
:vartype GraphLayer2: BoolType | BoolKey
:ivar GraphLayer3: Train the third Graph layer of the M3GNet universal potential.
:vartype GraphLayer3: BoolType | BoolKey
:ivar ThreeDInteractions1: Train the first ThreeDInteractions (three-body terms) layer of the M3GNet universal potential.
:vartype ThreeDInteractions1: BoolType | BoolKey
:ivar ThreeDInteractions2: Train the second ThreeDInteractions (three-body terms) layer of the M3GNet universal potential.
:vartype ThreeDInteractions2: BoolType | BoolKey
:ivar ThreeDInteractions3: Train the third ThreeDInteractions (three-body terms) layer of the M3GNet universal potential.
:vartype ThreeDInteractions3: BoolType | BoolKey
:ivar Version: Which version of the M3GNet Universal Potential to use.
:vartype Version: Literal["2022"]
"""
def __post_init__(self):
self.Featurizer: BoolType | BoolKey = BoolKey(name='Featurizer', comment='Train the Featurizer layer of the M3GNet universal potential.', gui_name='Train featurizer:', default=False)
self.Final: BoolType | BoolKey = BoolKey(name='Final', comment='Train the Final layer of the M3GNet universal potential.', gui_name='Train final layer:', default=True)
self.GraphLayer1: BoolType | BoolKey = BoolKey(name='GraphLayer1', comment='Train the first Graph layer of the M3GNet universal potential.', gui_name='Train layer 1 - graph:', default=False)
self.GraphLayer2: BoolType | BoolKey = BoolKey(name='GraphLayer2', comment='Train the second Graph layer of the M3GNet universal potential.', gui_name='Train layer 2 - graph:', default=False)
self.GraphLayer3: BoolType | BoolKey = BoolKey(name='GraphLayer3', comment='Train the third Graph layer of the M3GNet universal potential.', gui_name='Train layer 3 - graph:', default=True)
self.ThreeDInteractions1: BoolType | BoolKey = BoolKey(name='ThreeDInteractions1', comment='Train the first ThreeDInteractions (three-body terms) layer of the M3GNet universal potential.', gui_name='Train layer 1 - 3D interactions:', default=False)
self.ThreeDInteractions2: BoolType | BoolKey = BoolKey(name='ThreeDInteractions2', comment='Train the second ThreeDInteractions (three-body terms) layer of the M3GNet universal potential.', gui_name='Train layer 2 - 3D interactions:', default=False)
self.ThreeDInteractions3: BoolType | BoolKey = BoolKey(name='ThreeDInteractions3', comment='Train the third ThreeDInteractions (three-body terms) layer of the M3GNet universal potential.', gui_name='Train layer 3 - 3D interactions:', default=True)
self.Version: Literal["2022"] = MultipleChoiceKey(name='Version', comment='Which version of the M3GNet Universal Potential to use.', hidden=True, default='2022', choices=['2022'])
def __post_init__(self):
self.LearningRate: float | FloatKey = FloatKey(name='LearningRate', comment='Learning rate for the M3GNet weight optimization.', default=0.001)
self.Model: Literal["UniversalPotential", "Custom", "ModelDir"] = MultipleChoiceKey(name='Model', comment='How to specify the model for the M3GNet backend. Either a Custom model can be made from scratch or an existing model directory can be loaded to obtain the model settings.', default='UniversalPotential', choices=['UniversalPotential', 'Custom', 'ModelDir'])
self.ModelDir: str | Path | StringKey = PathStringKey(name='ModelDir', comment="Path to the directory defining the model. This folder should contain the files: 'checkpoint', 'm3gnet.data-00000-of-00001', 'm3gnet.index' and 'm3gnet.json'", ispath=True, gui_type='file')
self.Custom: ParAMS._MachineLearning._M3GNet._Custom = self._Custom(name='Custom', comment='Specify a custom M3GNet model.')
self.UniversalPotential: ParAMS._MachineLearning._M3GNet._UniversalPotential = self._UniversalPotential(name='UniversalPotential', comment='Settings for (transfer) learning with the M3GNet Universal Potential.')
class _NequIP(FixedBlock):
r"""
Options for NequIP fitting.
:ivar LearningRate: Learning rate for the NequIP weight optimization
:vartype LearningRate: float | FloatKey
:ivar Model: How to specify the model for the NequIP backend. Either a Custom model can be made from scratch or an existing 'model.pth' file can be loaded to obtain the model settings.
:vartype Model: Literal["Custom", "ModelFile"]
:ivar ModelFile: Path to the model.pth file defining the model.
:vartype ModelFile: str | Path | StringKey
:ivar UseRescalingFromLoadedModel: When loading a model with LoadModel or NequIP%ModelFile, do not recalculate the dataset rescaling but use the value from the loaded model.
:vartype UseRescalingFromLoadedModel: BoolType | BoolKey
:ivar Custom: Specify a custom NequIP model.
:vartype Custom: ParAMS._MachineLearning._NequIP._Custom
"""
class _Custom(FixedBlock):
r"""
Specify a custom NequIP model.
:ivar LMax: Maximum L value. 1 is probably high enough.
:vartype LMax: int | IntKey
:ivar MetricsKey: Which metric to use to generate the 'best' model.
:vartype MetricsKey: Literal["training_loss", "validation_loss"]
:ivar NumLayers: Number of interaction layers in the NequIP neural network.
:vartype NumLayers: int | IntKey
:ivar RMax: Distance cutoff for interactions.
:vartype RMax: float | FloatKey
"""
def __post_init__(self):
self.LMax: int | IntKey = IntKey(name='LMax', comment='Maximum L value. 1 is probably high enough.', default=1)
self.MetricsKey: Literal["training_loss", "validation_loss"] = MultipleChoiceKey(name='MetricsKey', comment="Which metric to use to generate the 'best' model.", default='validation_loss', choices=['training_loss', 'validation_loss'])
self.NumLayers: int | IntKey = IntKey(name='NumLayers', comment='Number of interaction layers in the NequIP neural network.', default=4)
self.RMax: float | FloatKey = FloatKey(name='RMax', comment='Distance cutoff for interactions.', gui_name='Distance cutoff:', default=3.5, unit='angstrom')
def __post_init__(self):
self.LearningRate: float | FloatKey = FloatKey(name='LearningRate', comment='Learning rate for the NequIP weight optimization', default=0.005)
self.Model: Literal["Custom", "ModelFile"] = MultipleChoiceKey(name='Model', comment="How to specify the model for the NequIP backend. Either a Custom model can be made from scratch or an existing 'model.pth' file can be loaded to obtain the model settings.", default='Custom', choices=['Custom', 'ModelFile'])
self.ModelFile: str | Path | StringKey = PathStringKey(name='ModelFile', comment='Path to the model.pth file defining the model.', ispath=True, gui_type='file')
self.UseRescalingFromLoadedModel: BoolType | BoolKey = BoolKey(name='UseRescalingFromLoadedModel', comment='When loading a model with LoadModel or NequIP%ModelFile, do not recalculate the dataset rescaling but use the value from the loaded model.', default=True)
self.Custom: ParAMS._MachineLearning._NequIP._Custom = self._Custom(name='Custom', comment='Specify a custom NequIP model.')
class _Target(FixedBlock):
r"""
Target values for stopping training. If both the training and validation metrics are smaller than the specified values, the training will stop early. Only supported by the M3GNet backend.
:ivar Forces: Forces (as reported by the backend)
:vartype Forces: ParAMS._MachineLearning._Target._Forces
"""
class _Forces(FixedBlock):
r"""
Forces (as reported by the backend)
:ivar Enabled: Whether to use target values for forces.
:vartype Enabled: BoolType | BoolKey
:ivar MAE: MAE for forces (as reported by the backend).
:vartype MAE: float | FloatKey
"""
def __post_init__(self):
self.Enabled: BoolType | BoolKey = BoolKey(name='Enabled', comment='Whether to use target values for forces.', default=True)
self.MAE: float | FloatKey = FloatKey(name='MAE', comment='MAE for forces (as reported by the backend).', default=0.05, unit='eV/angstrom')
def __post_init__(self):
self.Forces: ParAMS._MachineLearning._Target._Forces = self._Forces(name='Forces', comment='Forces (as reported by the backend)')
def __post_init__(self):
self.Backend: Literal["Custom", "M3GNet", "NequIP", "Test"] = MultipleChoiceKey(name='Backend', comment='The backend to use. You must separately install the backend before running a training job.', default='M3GNet', choices=['Custom', 'M3GNet', 'NequIP', 'Test'], hiddenchoices=['Custom', 'Test'], gui_type='literal choices')
self.CommitteeSize: int | IntKey = IntKey(name='CommitteeSize', comment='The number of independently trained ML potentials.', default=1)
self.LoadModel: str | Path | StringKey = PathStringKey(name='LoadModel', comment='Load a previously fitted model from a ParAMS results directory. A ParAMS results directory should contain two subdirectories ``optimization`` and ``settings_and_initial_data``. This option ignores all settings inside model blocks.', ispath=True, gui_type='directory')
self.MaxEpochs: int | IntKey = IntKey(name='MaxEpochs', comment='Set the maximum number of epochs a backend should perform.', default=1000)
self.RunAMSAtEnd: BoolType | BoolKey = BoolKey(name='RunAMSAtEnd', comment='Whether to run the (committee) ML potential through AMS at the end. This will create the energy/forces scatter plots for the final trained model.', gui_name='Run AMS at end:', default=True)
self.Custom: ParAMS._MachineLearning._Custom = self._Custom(name='Custom', comment='Set up a custom fitting program within ParAMS', hidden=True)
self.LossCoeffs: ParAMS._MachineLearning._LossCoeffs = self._LossCoeffs(name='LossCoeffs', comment='Modify the coefficients for the machine learning loss function. For backends that support weights, this is on top of the supplied dataset weights and sigmas.')
self.M3GNet: ParAMS._MachineLearning._M3GNet = self._M3GNet(name='M3GNet', comment='Options for M3GNet fitting.')
self.NequIP: ParAMS._MachineLearning._NequIP = self._NequIP(name='NequIP', comment='Options for NequIP fitting.')
self.Target: ParAMS._MachineLearning._Target = self._Target(name='Target', comment='Target values for stopping training. If both the training and validation metrics are smaller than the specified values, the training will stop early. Only supported by the M3GNet backend.')
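# Illustrative sketch (not part of the generated class definitions): the MachineLearning
# block above roughly corresponds to a text input of the following shape. The numbers and
# the choice of backend below are hypothetical; only the key names come from the block
# definitions above.
#
#   Task MachineLearning
#   MachineLearning
#     Backend M3GNet
#     MaxEpochs 500
#     CommitteeSize 2
#     Target
#       Forces
#         Enabled Yes
#         MAE 0.03
#       End
#     End
#   End
#
# With CommitteeSize greater than one, several ML potentials are trained independently, and
# RunAMSAtEnd (default Yes) produces the energy/forces scatter plots for the final model.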
[docs] class _Optimizer(FixedBlock):
r"""
An optimizer which may be used during the optimization.
:ivar Type: Name of the optimization algorithm or interface.
Available options:
• Adaptive Rate Monte Carlo
• CMAES (Covariance Matrix Adaptation Evolutionary Strategy)
• Nevergrad (Interface to many other algorithms)
• Scipy (Interface to classic optimization algorithms)
• Random Sampling (Uniform random sampling of the space for later analysis. NOT an actual optimizer)
• Grid Sampling (Gridwise sampling of the parameter space for later analysis. NOT an actual optimizer)
:vartype Type: Literal["AdaptiveRateMonteCarlo", "CMAES", "Nevergrad", "Scipy", "RandomSampling", "GridSampling"]
:ivar AdaptiveRateMonteCarlo: Settings for the Adaptive Rate Monte-Carlo optimizer
:vartype AdaptiveRateMonteCarlo: ParAMS._Optimizer._AdaptiveRateMonteCarlo
:ivar CMAES: Settings for the Covariance Matrix Adaptation Evolutionary Strategy
:vartype CMAES: ParAMS._Optimizer._CMAES
:ivar GridSampling: Settings for grid-wise sampling of the parameter space.
:vartype GridSampling: ParAMS._Optimizer._GridSampling
:ivar Nevergrad: Settings for the Nevergrad wrapper which gives access to many optimization algorithms.
:vartype Nevergrad: ParAMS._Optimizer._Nevergrad
:ivar RandomSampling: Settings for a totally random sampler of the parameter space.
:vartype RandomSampling: ParAMS._Optimizer._RandomSampling
:ivar Scipy: Settings for the Scipy wrapper which gives access to many optimization algorithms. For parallel optimizations, Nevergrad is preferred as it provides better control options.
:vartype Scipy: ParAMS._Optimizer._Scipy
"""
[docs] class _AdaptiveRateMonteCarlo(FixedBlock):
r"""
Settings for the Adaptive Rate Monte-Carlo optimizer
:ivar AcceptanceTarget: The target acceptance rate αₜ
:vartype AcceptanceTarget: float | FloatKey
:ivar Gamma: γ parameter.
:vartype Gamma: float | FloatKey
:ivar MoveRange: List of allowed move sizes.
:vartype MoveRange: Iterable[float] | FloatListKey
:ivar Phi: ϕ parameter.
:vartype Phi: float | FloatKey
:ivar SubIterations: Number of sub-iterations (ω)
:vartype SubIterations: int | IntKey
:ivar SuperIterations: Number of super-iterations (κ)
:vartype SuperIterations: int | IntKey
"""
def __post_init__(self):
self.AcceptanceTarget: float | FloatKey = FloatKey(name='AcceptanceTarget', comment='The target acceptance rate αₜ', gui_name='Target acceptance (αₜ)', default=0.25)
self.Gamma: float | FloatKey = FloatKey(name='Gamma', comment='γ parameter.', gui_name='γ: ', default=2.0)
self.MoveRange: Iterable[float] | FloatListKey = FloatListKey(name='MoveRange', comment='List of allowed move sizes.', gui_name='List of allowed move sizes: ', default=[0.9, 0.905, 0.91, 0.915, 0.92, 0.925, 0.93, 0.935, 0.94, 0.945, 0.95, 0.955, 0.96, 0.965, 0.97, 0.975, 0.98, 0.985, 0.99, 0.995, 1.005, 1.01, 1.015, 1.02, 1.025, 1.03, 1.035, 1.04, 1.045, 1.05, 1.055, 1.06, 1.065, 1.07, 1.075, 1.08, 1.085, 1.09, 1.095, 1.1])
self.Phi: float | FloatKey = FloatKey(name='Phi', comment='ϕ parameter.', gui_name='ϕ: ', default=1.0)
self.SubIterations: int | IntKey = IntKey(name='SubIterations', comment='Number of sub-iterations (ω)', gui_name='Number of sub-iterations (ω): ', default=100)
self.SuperIterations: int | IntKey = IntKey(name='SuperIterations', comment='Number of super-iterations (κ)', gui_name='Number of super-iterations (κ): ', default=1000)
[docs] class _CMAES(FixedBlock):
r"""
Settings for the Covariance Matrix Adaptation Evolutionary Strategy
:ivar ForceInjections: If Yes, injections of parameter vectors into the solver will be exact, guaranteeing that the injected solution will be in the next iteration's population.
If No, the injection will result in a nudge in the direction of the vector. Forcing the injection can limit global exploration, but non-forced injections may have little effect.
See also glompo.optimizer.cmaes.injectioninterval
:vartype ForceInjections: BoolType | BoolKey
:ivar InjectionInterval: Number of iterations between injections of the incumbent solution into the sampled population. Defaults to 0, i.e., no injections are performed.
Injections can be helpful in increasing the convergence speed and nudging optimizers toward a solution. This is a form of elitism and will limit exploration.
Of particular interest is pairing this key with glompo.ShareBestEvaluationBetweenOptimizers. In this case the incumbent that is injected comes from other optimizers run in parallel. This can help nudge optimizers towards better solutions while dramatically improving convergence speed.
See also glompo.optimizer.cmaes.forceinjections
:vartype InjectionInterval: int | IntKey
:ivar KeepFiles: Keep CMA specific history files about the state of the covariance matrix and other algorithm variables.
:vartype KeepFiles: BoolType | BoolKey
:ivar MinSigma: Convergence condition to terminate the optimization when the standard deviation of the sampling distribution falls below this value.
:vartype MinSigma: float | FloatKey
:ivar Popsize: Number of function evaluations per CMA-ES iteration.
If not specified or -1, then the population size will equal the number of workers available to the optimizer. This is computationally efficient but not algorithmically efficient.
A value of zero or one will set the population size to the value suggested by CMA based on the dimensionality of the problem. This produces the best algorithm performance but may be computationally inefficient if resources are left idling while waiting for other evaluations to complete.
Note: a population of one is not allowed by CMA, therefore, it is changed to the algorithm default. This also means that giving CMA only one worker will also change the popsize to the algorithm default.
:vartype Popsize: int | IntKey
:ivar Sampler: Choice of full or restricted Gaussian sampling procedures.
Options:
• full: Full sampling procedure
• vd: Restricted sampler for VD-CMA (Linear time/space comparison-based natural gradient optimization)
• vkd: Restricted sampler for VkD-CMA (Time/space variant of CMA-ES)
:vartype Sampler: Literal["full", "vd", "vkd"]
:ivar Sigma0: Initial standard deviation of the multivariate normal distribution from which trials are drawn.
The recommended range of values is between 0.01 and 0.5. Lower values sample very locally and converge quickly; higher values sample broadly but will take a long time to converge.
:vartype Sigma0: float | FloatKey
:ivar Verbose: Produce a printstream of results from within the optimizer itself.
:vartype Verbose: BoolType | BoolKey
:ivar Settings: 'argument value' pairs for extra CMA specific configuration arguments.
See CMAOptimizer API documentation for more details on which options are possible.
:vartype Settings: str | Sequence[str] | FreeBlock
"""
[docs] class _Settings(FreeBlock):
r"""
'argument value' pairs for extra CMA specific configuration arguments.
See CMAOptimizer API documentation for more details on which options are possible.
"""
def __post_init__(self):
pass
def __post_init__(self):
        self.ForceInjections: BoolType | BoolKey = BoolKey(name='ForceInjections', comment="If Yes, injections of parameter vectors into the solver will be exact, guaranteeing that the injected solution will be in the next iteration's population.\n\nIf No, the injection will result in a nudge in the direction of the vector. Forcing the injection can limit global exploration, but non-forced injections may have little effect.\n\nSee also glompo.optimizer.cmaes.injectioninterval", default=False)
self.InjectionInterval: int | IntKey = IntKey(name='InjectionInterval', comment='Number of iterations between injections of the incumbent solution into the sampled population. Defaults to 0, i.e., no injections are performed.\n\nInjections can be helpful in increasing the convergence speed and nudging optimizers toward a solution. This is a form of elitism and will limit exploration.\n\nOf particular interest is pairing this key with glompo.ShareBestEvaluationBetweenOptimizers. In this case the incumbent that is injected comes from other optimizers run in parallel. This can help nudge optimizers towards better solutions while dramatically improving convergence speed. \n\nSee also glompo.optimizer.cmaes.forceinjections', default=0)
self.KeepFiles: BoolType | BoolKey = BoolKey(name='KeepFiles', comment='Keep CMA specific history files about the state of the covariance matrix and other algorithm variables.', gui_name='Store covariance history: ', default=False)
self.MinSigma: float | FloatKey = FloatKey(name='MinSigma', comment='Convergence condition to terminate the optimization when the standard deviation of the sampling distribution falls below this value.')
self.Popsize: int | IntKey = IntKey(name='Popsize', comment='Number of function evaluations per CMA-ES iteration.\n\nIf not specified or -1, then the population size will equal the number of workers available to the optimizer. This is computationally efficient but not algorithmically efficient.\n\nA value of zero or one will set the population size to the value suggested by CMA based on the dimensionality of the problem. This produces the best algorithm performance but may be computationally inefficient if resources are left idling while waiting for other evaluations to complete.\n\nNote: a population of one is not allowed by CMA, therefore, it is changed to the algorithm default. This also means that giving CMA only one worker will also change the popsize to the algorithm default.')
self.Sampler: Literal["full", "vd", "vkd"] = MultipleChoiceKey(name='Sampler', comment='Choice of full or restricted Gaussian sampling procedures.\n\nOptions:\n• full: Full sampling procedure\n• vd: Restricted sampler for VD-CMA (Linear time/space comparison-based natural gradient optimization)\n• vkd: Restricted sampler for VkD-CMA (Time/space variant of CMA-ES)', default='full', choices=['full', 'vd', 'vkd'])
        self.Sigma0: float | FloatKey = FloatKey(name='Sigma0', comment='Initial standard deviation of the multivariate normal distribution from which trials are drawn.\n\nThe recommended range of values is between 0.01 and 0.5. Lower values sample very locally and converge quickly; higher values sample broadly but will take a long time to converge.', gui_name='σ₀: ', default=0.05)
self.Verbose: BoolType | BoolKey = BoolKey(name='Verbose', comment='Produce a printstream of results from within the optimizer itself.', default=False)
self.Settings: str | Sequence[str] | FreeBlock = self._Settings(name='Settings', comment="'argument value' pairs for extra CMA specific configuration arguments.\n\nSee CMAOptimizer API documentation for more details on which options are possible.", gui_name='Extra settings: ')
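# Illustrative sketch (not part of the generated class definitions): a possible CMA-ES
# optimizer configuration in text-input form. The values are hypothetical; the keys are the
# ones defined in the CMAES block above. As noted in the InjectionInterval comment, pairing
# injections with the top-level ShareBestEvaluationBetweenOptimizers key lets parallel
# optimizers nudge each other towards the best solution seen so far.
#
#   ShareBestEvaluationBetweenOptimizers Yes
#   Optimizer
#     Type CMAES
#     CMAES
#       Popsize 8
#       Sigma0 0.1
#       InjectionInterval 5
#       ForceInjections No
#     End
#   End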
[docs] class _GridSampling(FixedBlock):
r"""
Settings for grid-wise sampling of the parameter space.
:ivar NumberOfDivisions: Number of equal sub-divisions to take for each parameter. Total calls is n_steps^n_parameters.
:vartype NumberOfDivisions: int | IntKey
"""
def __post_init__(self):
self.NumberOfDivisions: int | IntKey = IntKey(name='NumberOfDivisions', comment='Number of equal sub-divisions to take for each parameter. Total calls is n_steps^n_parameters.', gui_name='Step in each dimension: ', default=10)
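# Cost note (illustration): the total number of loss evaluations for grid sampling is
# NumberOfDivisions to the power of the number of active parameters, so the default of 10
# divisions over 3 active parameters already costs 10**3 = 1000 evaluations. Grid sampling
# is therefore only practical for very small parameter spaces.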
[docs] class _Nevergrad(FixedBlock):
r"""
Settings for the Nevergrad wrapper which gives access to many optimization algorithms.
:ivar Algorithm: Optimization strategy to use.
See Nevergrad documentation for details of the available methods.
:vartype Algorithm: Literal["ASCMA2PDEthird", "ASCMADEQRthird", "ASCMADEthird", "AlmostRotationInvariantDE", "AvgMetaLogRecentering", "AvgMetaRecentering", "BO", "CM", "CMA", "CMandAS", "CMandAS2", "CMandAS3", "CauchyLHSSearch", "CauchyOnePlusOne", "CauchyScrHammersleySearch", "Cobyla", "DE", "DiagonalCMA", "DiscreteOnePlusOne", "DoubleFastGADiscreteOnePlusOne", "EDA", "ES", "FCMA", "HaltonSearch", "HaltonSearchPlusMiddlePoint", "HammersleySearch", "HammersleySearchPlusMiddlePoint", "LHSSearch", "LargeHaltonSearch", "LhsDE", "MEDA", "MPCEDA", "MetaLogRecentering", "MetaRecentering", "MixES", "MultiCMA", "MultiScaleCMA", "MutDE", "NGO", "NaiveIsoEMNA", "NaiveTBPSA", "NelderMead", "NoisyBandit", "NoisyDE", "NoisyDiscreteOnePlusOne", "NoisyOnePlusOne", "OAvgMetaLogRecentering", "ORandomSearch", "OScrHammersleySearch", "OnePlusOne", "OptimisticDiscreteOnePlusOne", "OptimisticNoisyOnePlusOne", "PBIL", "PCEDA", "PSO", "ParaPortfolio", "Portfolio", "Powell", "QORandomSearch", "QOScrHammersleySearch", "QrDE", "RCobyla", "RPowell", "RSQP", "RandomSearch", "RandomSearchPlusMiddlePoint", "RealSpacePSO", "RecES", "RecMixES", "RecMutDE", "RecombiningPortfolioOptimisticNoisyDiscreteOnePlusOne", "RotationInvariantDE", "SPSA", "SQP", "SQPCMA", "ScrHaltonSearch", "ScrHaltonSearchPlusMiddlePoint", "ScrHammersleySearch", "ScrHammersleySearchPlusMiddlePoint", "Shiva", "SplitOptimizer", "TBPSA", "TripleCMA", "TwoPointsDE", "cGA", "chainCMAPowel"]
:ivar Zero: Function value below which the algorithm will terminate.
:vartype Zero: float | FloatKey
:ivar Settings: 'argument value' pairs for extra algorithm-specific configuration arguments.
See Nevergrad API documentation for more details on which options are possible.
:vartype Settings: str | Sequence[str] | FreeBlock
"""
[docs] class _Settings(FreeBlock):
r"""
'argument value' pairs for extra algorithm-specific configuration arguments.
See Nevergrad API documentation for more details on which options are possible.
"""
def __post_init__(self):
pass
def __post_init__(self):
self.Algorithm: Literal["ASCMA2PDEthird", "ASCMADEQRthird", "ASCMADEthird", "AlmostRotationInvariantDE", "AvgMetaLogRecentering", "AvgMetaRecentering", "BO", "CM", "CMA", "CMandAS", "CMandAS2", "CMandAS3", "CauchyLHSSearch", "CauchyOnePlusOne", "CauchyScrHammersleySearch", "Cobyla", "DE", "DiagonalCMA", "DiscreteOnePlusOne", "DoubleFastGADiscreteOnePlusOne", "EDA", "ES", "FCMA", "HaltonSearch", "HaltonSearchPlusMiddlePoint", "HammersleySearch", "HammersleySearchPlusMiddlePoint", "LHSSearch", "LargeHaltonSearch", "LhsDE", "MEDA", "MPCEDA", "MetaLogRecentering", "MetaRecentering", "MixES", "MultiCMA", "MultiScaleCMA", "MutDE", "NGO", "NaiveIsoEMNA", "NaiveTBPSA", "NelderMead", "NoisyBandit", "NoisyDE", "NoisyDiscreteOnePlusOne", "NoisyOnePlusOne", "OAvgMetaLogRecentering", "ORandomSearch", "OScrHammersleySearch", "OnePlusOne", "OptimisticDiscreteOnePlusOne", "OptimisticNoisyOnePlusOne", "PBIL", "PCEDA", "PSO", "ParaPortfolio", "Portfolio", "Powell", "QORandomSearch", "QOScrHammersleySearch", "QrDE", "RCobyla", "RPowell", "RSQP", "RandomSearch", "RandomSearchPlusMiddlePoint", "RealSpacePSO", "RecES", "RecMixES", "RecMutDE", "RecombiningPortfolioOptimisticNoisyDiscreteOnePlusOne", "RotationInvariantDE", "SPSA", "SQP", "SQPCMA", "ScrHaltonSearch", "ScrHaltonSearchPlusMiddlePoint", "ScrHammersleySearch", "ScrHammersleySearchPlusMiddlePoint", "Shiva", "SplitOptimizer", "TBPSA", "TripleCMA", "TwoPointsDE", "cGA", "chainCMAPowel"] = MultipleChoiceKey(name='Algorithm', comment='Optimization strategy to use.\n\nSee Nevergrad documentation for details of the available methods.', default='TBPSA', choices=['ASCMA2PDEthird', 'ASCMADEQRthird', 'ASCMADEthird', 'AlmostRotationInvariantDE', 'AvgMetaLogRecentering', 'AvgMetaRecentering', 'BO', 'CM', 'CMA', 'CMandAS', 'CMandAS2', 'CMandAS3', 'CauchyLHSSearch', 'CauchyOnePlusOne', 'CauchyScrHammersleySearch', 'Cobyla', 'DE', 'DiagonalCMA', 'DiscreteOnePlusOne', 'DoubleFastGADiscreteOnePlusOne', 'EDA', 'ES', 'FCMA', 'HaltonSearch', 'HaltonSearchPlusMiddlePoint', 'HammersleySearch', 'HammersleySearchPlusMiddlePoint', 'LHSSearch', 'LargeHaltonSearch', 'LhsDE', 'MEDA', 'MPCEDA', 'MetaLogRecentering', 'MetaRecentering', 'MixES', 'MultiCMA', 'MultiScaleCMA', 'MutDE', 'NGO', 'NaiveIsoEMNA', 'NaiveTBPSA', 'NelderMead', 'NoisyBandit', 'NoisyDE', 'NoisyDiscreteOnePlusOne', 'NoisyOnePlusOne', 'OAvgMetaLogRecentering', 'ORandomSearch', 'OScrHammersleySearch', 'OnePlusOne', 'OptimisticDiscreteOnePlusOne', 'OptimisticNoisyOnePlusOne', 'PBIL', 'PCEDA', 'PSO', 'ParaPortfolio', 'Portfolio', 'Powell', 'QORandomSearch', 'QOScrHammersleySearch', 'QrDE', 'RCobyla', 'RPowell', 'RSQP', 'RandomSearch', 'RandomSearchPlusMiddlePoint', 'RealSpacePSO', 'RecES', 'RecMixES', 'RecMutDE', 'RecombiningPortfolioOptimisticNoisyDiscreteOnePlusOne', 'RotationInvariantDE', 'SPSA', 'SQP', 'SQPCMA', 'ScrHaltonSearch', 'ScrHaltonSearchPlusMiddlePoint', 'ScrHammersleySearch', 'ScrHammersleySearchPlusMiddlePoint', 'Shiva', 'SplitOptimizer', 'TBPSA', 'TripleCMA', 'TwoPointsDE', 'cGA', 'chainCMAPowel'])
self.Zero: float | FloatKey = FloatKey(name='Zero', comment='Function value below which the algorithm will terminate.', gui_name='Termination function value: ', default=0.0)
self.Settings: str | Sequence[str] | FreeBlock = self._Settings(name='Settings', comment="'argument value' pairs for extra algorithm-specific configuration arguments.\n\nSee Nevergrad API documentation for more details on which options are possible.", gui_name='Extra settings: ')
[docs] class _RandomSampling(FixedBlock):
r"""
Settings for a totally random sampler of the parameter space.
:ivar NumberOfSamples: Number of samples to generate.
:vartype NumberOfSamples: int | IntKey
:ivar RandomSeed: Random seed to use for the generator. Useful for reproducibility.
:vartype RandomSeed: int | IntKey
"""
def __post_init__(self):
self.NumberOfSamples: int | IntKey = IntKey(name='NumberOfSamples', comment='Number of samples to generate.', default=100)
self.RandomSeed: int | IntKey = IntKey(name='RandomSeed', comment='Random seed to use for the generator. Useful for reproducibility.')
[docs] class _Scipy(FixedBlock):
r"""
Settings for the Scipy wrapper which gives access to many optimization algorithms. For parallel optimizations, Nevergrad is preferred as it provides better control options.
:ivar Algorithm: Optimization strategy to use.
See Scipy documentation for details of the available methods.
:vartype Algorithm: Literal["Nelder-Mead", "Powell", "CG", "BFGS", "Newton-CG", "L-BFGS-B", "TNC", "COBYLA", "SLSQP", "trust-constr", "dogleg", "trust-ncg", "trust-exact", "trust-krylov"]
:ivar Hessian: Choice of the Hessian estimation method. Only for Newton-CG, dogleg, trust-ncg, trust-krylov, trust-exact and trust-constr.
:vartype Hessian: Literal["2-point", "3-point", "cs"]
:ivar Jacobian: Choice of gradient estimation method. Only for CG, BFGS, Newton-CG, L-BFGS-B, TNC, SLSQP, dogleg, trust-ncg, trust-krylov, trust-exact and trust-constr.
:vartype Jacobian: Literal["2-point", "3-point", "cs"]
:ivar Tolerance: Tolerance for termination. Interpretation linked to chosen method. See Scipy docs.
:vartype Tolerance: float | FloatKey
:ivar Settings: 'argument value' pairs for extra algorithm-specific configuration arguments.
See Scipy API documentation for more details on which options are possible.
:vartype Settings: str | Sequence[str] | FreeBlock
"""
[docs] class _Settings(FreeBlock):
r"""
'argument value' pairs for extra algorithm-specific configuration arguments.
See Scipy API documentation for more details on which options are possible.
"""
def __post_init__(self):
pass
def __post_init__(self):
self.Algorithm: Literal["Nelder-Mead", "Powell", "CG", "BFGS", "Newton-CG", "L-BFGS-B", "TNC", "COBYLA", "SLSQP", "trust-constr", "dogleg", "trust-ncg", "trust-exact", "trust-krylov"] = MultipleChoiceKey(name='Algorithm', comment='Optimization strategy to use.\n\nSee Scipy documentation for details of the available methods.', default='Nelder-Mead', choices=['Nelder-Mead', 'Powell', 'CG', 'BFGS', 'Newton-CG', 'L-BFGS-B', 'TNC', 'COBYLA', 'SLSQP', 'trust-constr', 'dogleg', 'trust-ncg', 'trust-exact', 'trust-krylov'])
self.Hessian: Literal["2-point", "3-point", "cs"] = MultipleChoiceKey(name='Hessian', comment='Choice of the Hessian estimation method. Only for Newton-CG, dogleg, trust-ncg, trust-krylov, trust-exact and trust-constr.', default='2-point', choices=['2-point', '3-point', 'cs'])
self.Jacobian: Literal["2-point", "3-point", "cs"] = MultipleChoiceKey(name='Jacobian', comment='Choice of gradient estimation method. Only for CG, BFGS, Newton-CG, L-BFGS-B, TNC, SLSQP, dogleg, trust-ncg, trust-krylov, trust-exact and trust-constr.', default='2-point', choices=['2-point', '3-point', 'cs'])
self.Tolerance: float | FloatKey = FloatKey(name='Tolerance', comment='Tolerance for termination. Interpretation linked to chosen method. See Scipy docs.', default=1e-06)
self.Settings: str | Sequence[str] | FreeBlock = self._Settings(name='Settings', comment="'argument value' pairs for extra algorithm-specific configuration arguments.\n\nSee Scipy API documentation for more details on which options are possible.", gui_name='Extra settings: ')
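# Note (illustration): when a Scipy optimizer is selected, the top-level EndTimeout key is
# often lowered (see the EndTimeout comment in this class) so that the manager does not hang
# waiting for Scipy threads to join at shutdown. A hypothetical combination:
#
#   EndTimeout 5.0
#   Optimizer
#     Type Scipy
#     Scipy
#       Algorithm L-BFGS-B
#     End
#   End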
def __post_init__(self):
self.Type: Literal["AdaptiveRateMonteCarlo", "CMAES", "Nevergrad", "Scipy", "RandomSampling", "GridSampling"] = MultipleChoiceKey(name='Type', comment='Name of the optimization algorithm or interface.\n\nAvailable options:\n• Adaptive Rate Monte Carlo\n• CMAES (Covariance Matrix Adaptation Evolutionary Strategy)\n• Nevergrad (Interface to many other algorithms)\n• Scipy (Interface to classic optimization algorithms)\n• Random Sampling (Uniform random sampling of the space for later analysis. NOT an actual optimizer)\n• Grid Sampling (Gridwise sampling of the parameter space for later analysis. NOT and actual optimizer)', default='CMAES', choices=['AdaptiveRateMonteCarlo', 'CMAES', 'Nevergrad', 'Scipy', 'RandomSampling', 'GridSampling'])
self.AdaptiveRateMonteCarlo: ParAMS._Optimizer._AdaptiveRateMonteCarlo = self._AdaptiveRateMonteCarlo(name='AdaptiveRateMonteCarlo', comment='Settings for the Adaptive Rate Monte-Carlo optimizer', gui_name='Adaptive Rate Monte-Carlo')
self.CMAES: ParAMS._Optimizer._CMAES = self._CMAES(name='CMAES', comment='Settings for the Covariance Matrix Adaptation Evolutionary Strategy', gui_name='CMA-ES')
self.GridSampling: ParAMS._Optimizer._GridSampling = self._GridSampling(name='GridSampling', comment='Settings for grid-wise sampling of the parameter space.', gui_name='Grid-wise Sampler')
self.Nevergrad: ParAMS._Optimizer._Nevergrad = self._Nevergrad(name='Nevergrad', comment='Settings for the Nevergrad wrapper which gives access to many optimization algorithms.', gui_name='Nevergrad')
self.RandomSampling: ParAMS._Optimizer._RandomSampling = self._RandomSampling(name='RandomSampling', comment='Settings for a totally random sampler of the parameter space.', gui_name='Random Sampler')
self.Scipy: ParAMS._Optimizer._Scipy = self._Scipy(name='Scipy', comment='Settings for the Scipy wrapper which gives access to many optimization algorithms. For parallel optimizations, Nevergrad is preferred as it provides better control options.', gui_name='Scipy (Minimize)')
[docs] class _OptimizerSelector(FixedBlock):
r"""
If multiple Optimizers are included, then this block must be included and configures the Selector which will choose between them.
:ivar Type: Name of the algorithm selecting between optimizers.
Available options:
• Cycle (Default): Sequential loop through available optimizers.
• Random: Optimizers are selected randomly from the available options.
• Chain: Time-based progression through the list of available optimizers. The next option will be started once a certain number of loss function evaluations have been used.
:vartype Type: Literal["Cycle", "Random", "Chain"]
:ivar Chain: Start different optimizers at different points in time based on the number of function evaluations used.
:vartype Chain: ParAMS._OptimizerSelector._Chain
"""
[docs] class _Chain(FixedBlock):
r"""
Start different optimizers at different points in time based on the number of function evaluations used.
:ivar Thresholds: List of loss function evaluation thresholds which switch to the next optimizer in the list. If there are n optimizers available, then this list should have n-1 thresholds.
:vartype Thresholds: Iterable[int] | IntListKey
"""
def __post_init__(self):
self.Thresholds: Iterable[int] | IntListKey = IntListKey(name='Thresholds', comment='List of loss function evaluation thresholds which switch to the next optimizer in the list. If there are n optimizers available, then this list should have n-1 thresholds.')
def __post_init__(self):
self.Type: Literal["Cycle", "Random", "Chain"] = MultipleChoiceKey(name='Type', comment='Name of the algorithm selecting between optimizers.\n\nAvailable options:\n• Cycle (Default): Sequential loop through available optimizers.\n• Random: Optimizers are selected randomly from the available options.\n• Chain: Time-based progression through the list of available optimizers. The next option will be started once a certain number of loss function evaluations have been used.', gui_name='Select optimizer by:', default='Cycle', choices=['Cycle', 'Random', 'Chain'])
self.Chain: ParAMS._OptimizerSelector._Chain = self._Chain(name='Chain', comment='Start different optimizers at different points in time based on the number of function evaluations used.')
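# Illustrative sketch (not part of the generated class definitions): with several Optimizer
# blocks, the OptimizerSelector decides which block each new optimizer is started from. A
# hypothetical Chain setup with two optimizers needs n-1 = 1 threshold:
#
#   Optimizer
#     Type CMAES
#   End
#   Optimizer
#     Type Scipy
#   End
#   OptimizerSelector
#     Type Chain
#     Chain
#       Thresholds 5000
#     End
#   End
#
# i.e. roughly the first 5000 loss evaluations are served by optimizers from the CMAES block,
# after which new optimizers are started from the Scipy block.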
[docs] class _ParallelLevels(FixedBlock):
r"""
Distribution of threads/processes between the parallelization levels.
:ivar CommitteeMembers: Maximum number of committee member optimizations to run in parallel. If set to zero, it will take the minimum of MachineLearning%CommitteeSize and the number of available cores (NSCM)
:vartype CommitteeMembers: int | IntKey
:ivar Cores: Number of cores to use per committee member optimization. By default (0) the available cores (NSCM) are divided equally among committee members. When using GPU offloading, consider setting this to 1.
:vartype Cores: int | IntKey
:ivar Jobs: Number of JobCollection jobs to run in parallel for each loss function evaluation.
:vartype Jobs: int | IntKey
:ivar Optimizations: Number of independent optimizers to run in parallel.
:vartype Optimizations: int | IntKey
:ivar ParameterVectors: Number of parameter vectors to try in parallel for each optimizer iteration. This level of parallelism can only be used with optimizers that support parallel optimization!
Default (0) will set this value to the number of cores on the system divided by the number of optimizers run in parallel, i.e., each optimizer will be given an equal share of the resources.
:vartype ParameterVectors: int | IntKey
:ivar Processes: Number of processes (MPI ranks) to spawn for each JobCollection job. This effectively sets the NSCM environment variable for each job.
A value of `-1` will disable explicit setting of related variables. We recommend a value of `1` in almost all cases. A value greater than 1 would only be useful if you parametrize DFTB with a serial optimizer and have very few jobs in the job collection.
:vartype Processes: int | IntKey
:ivar Threads: Number of threads to use for each of the processes. This effectively sets the OMP_NUM_THREADS environment variable.
Note that the DFTB engine does not use threads, so the value of this variable would not have any effect. We recommend always leaving it at the default value of 1. Please consult the manual of the engine you are parameterizing.
A value of `-1` will disable explicit setting of related variables.
:vartype Threads: int | IntKey
"""
def __post_init__(self):
        self.CommitteeMembers: int | IntKey = IntKey(name='CommitteeMembers', comment='Maximum number of committee member optimizations to run in parallel. If set to zero, it will take the minimum of MachineLearning%CommitteeSize and the number of available cores (NSCM)', gui_name='Number of parallel committee members:', default=1)
        self.Cores: int | IntKey = IntKey(name='Cores', comment='Number of cores to use per committee member optimization. By default (0) the available cores (NSCM) are divided equally among committee members. When using GPU offloading, consider setting this to 1.', gui_name='Processes (per Job):', default=0)
self.Jobs: int | IntKey = IntKey(name='Jobs', comment='Number of JobCollection jobs to run in parallel for each loss function evaluation.', gui_name='Jobs (per loss function evaluation):', default=0)
self.Optimizations: int | IntKey = IntKey(name='Optimizations', comment='Number of independent optimizers to run in parallel.', gui_name='Number of parallel optimizers:', default=1)
self.ParameterVectors: int | IntKey = IntKey(name='ParameterVectors', comment='Number of parameter vectors to try in parallel for each optimizer iteration. This level of parallelism can only be used with optimizers that support parallel optimization!\n\nDefault (0) will set this value to the number of cores on the system divided by the number of optimizers run in parallel, i.e., each optimizer will be given an equal share of the resources.', gui_name='Loss function evaluations (per optimizer):', default=0)
self.Processes: int | IntKey = IntKey(name='Processes', comment='Number of processes (MPI ranks) to spawn for each JobCollection job. This effectively sets the NSCM environment variable for each job.\n\nA value of `-1` will disable explicit setting of related variables. We recommend a value of `1` in almost all cases. A value greater than 1 would only be useful if you parametrize DFTB with a serial optimizer and have very few jobs in the job collection.', gui_name='Processes (per Job):', default=1)
        self.Threads: int | IntKey = IntKey(name='Threads', comment='Number of threads to use for each of the processes. This effectively sets the OMP_NUM_THREADS environment variable.\nNote that the DFTB engine does not use threads, so the value of this variable would not have any effect. We recommend always leaving it at the default value of 1. Please consult the manual of the engine you are parameterizing.\n\nA value of `-1` will disable explicit setting of related variables.', gui_name='Threads (per Process):', default=1)
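# Worked example (illustration, hypothetical numbers): the parallelization levels multiply.
# On a 32-core machine with Optimizations 4 and the remaining keys at their defaults,
# ParameterVectors becomes 32 / 4 = 8 parallel loss evaluations per optimizer iteration and
# each job runs with Processes 1 and Threads 1, so roughly
# Optimizations x ParameterVectors x Processes x Threads = 4 x 8 x 1 x 1 = 32 cores are in
# use at full load.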
[docs] class _ParametersKernel(FixedBlock):
r"""
Kernel applied to the parameters for which sensitivity is being measured.
:ivar Alpha: Cut-off parameter for the Threshold kernel between zero and one.
All loss values are scaled by taking the logarithm and then adjusted to a range between zero and one. This parameter is a value within this scaled space.
:vartype Alpha: float | FloatKey
:ivar Gamma: Bandwidth parameter for the conjunctive-Gaussian kernel.
:vartype Gamma: float | FloatKey
:ivar Sigma: Bandwidth parameter for the Gaussian kernel.
:vartype Sigma: float | FloatKey
:ivar Type: Name of the kernel applied to the parameters for which sensitivity is being measured.
:vartype Type: Literal["Gaussian", "ConjunctiveGaussian", "Threshold", "Polynomial", "Linear"]
:ivar Polynomial: Settings for the Polynomial kernel.
:vartype Polynomial: ParAMS._ParametersKernel._Polynomial
"""
[docs] class _Polynomial(FixedBlock):
r"""
Settings for the Polynomial kernel.
:ivar Order: Maximum order of the polynomial.
:vartype Order: int | IntKey
:ivar Shift: Free parameter (≥ 0) trading off higher-order versus lower-order effects.
:vartype Shift: float | FloatKey
"""
def __post_init__(self):
self.Order: int | IntKey = IntKey(name='Order', comment='Maximum order of the polynomial.', default=1)
self.Shift: float | FloatKey = FloatKey(name='Shift', comment='Free parameter (≥ 0) trading off higher-order versus lower-order effects.', default=0.0)
def __post_init__(self):
self.Alpha: float | FloatKey = FloatKey(name='Alpha', comment='Cut-off parameter for the Threshold kernel between zero and one.\n\nAll loss values are scaled by taking the logarithm and then adjusted to a range between zero and one. This parameter is a value within this scaled space. ')
self.Gamma: float | FloatKey = FloatKey(name='Gamma', comment='Bandwidth parameter for the conjunctive-Gaussian kernel.', default=0.1)
self.Sigma: float | FloatKey = FloatKey(name='Sigma', comment='Bandwidth parameter for the Gaussian kernel.', default=0.3)
self.Type: Literal["Gaussian", "ConjunctiveGaussian", "Threshold", "Polynomial", "Linear"] = MultipleChoiceKey(name='Type', comment='Name of the kernel to applied to the parameters for which sensitivity is being measured.', default='Gaussian', choices=['Gaussian', 'ConjunctiveGaussian', 'Threshold', 'Polynomial', 'Linear'])
self.Polynomial: ParAMS._ParametersKernel._Polynomial = self._Polynomial(name='Polynomial', comment='Settings for the Polynomial kernel.')
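# Illustrative sketch (not part of the generated class definitions): for Task Sensitivity the
# kernel applied to the active parameters is chosen in this block. A hypothetical selection of
# the conjunctive-Gaussian kernel with a narrower bandwidth than the default:
#
#   ParametersKernel
#     Type ConjunctiveGaussian
#     Gamma 0.05
#   End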
[docs] class _Stopper(FixedBlock):
r"""
A Stopper used to terminate optimizers early.
:ivar MaxFunctionCalls: Returns True after an optimizer has evaluated at least n points.
Use in an AND combination with other stoppers to keep optimizers alive for at least some period of time, or alone or in an OR combination to shut them down after some period of time.
:vartype MaxFunctionCalls: int | IntKey
:ivar MaxSequentialInvalidPoints: Return True if the optimizer has failed to find a finite function value in the last n iterations. Usually indicates a lost optimizer with a poor starting point.
:vartype MaxSequentialInvalidPoints: int | IntKey
:ivar OptimizerType: Intended for use with other stoppers to allow for specific hunting conditions based on the type of optimizer.
:vartype OptimizerType: Literal["AdaptiveRateMonteCarlo", "CMAES", "Nevergrad", "Scipy", "RandomSampling", "GridSampling"]
:ivar Type: Conditions used to stop individual optimizers. Only optimizers which have not seen the current best value will be hunted. The 'best' optimizer will always be left alive.
Available options:
• Best Function Value Unmoving: Shut down optimizers that haven't improved their best point in a long time.
• Current Function Value Unmoving: Shut down optimizers which have not changed their currently explored loss function value significantly in a long time.
• Max Sequential Invalid Points: Stop optimizers that are stuck in areas of parameter space which return non-finite loss function values.
• Max Function Calls: Stop optimizers when they have evaluated the loss function a certain number of times.
• Max Interoptimizer Distance: Stop optimizers too close together in parameter space.
• Min Step Size: Stop optimizers that are taking very small steps between iterations.
• Time Annealing: For use with other conditions. Keep an optimizer alive for longer if it's relatively new.
• Optimizer Type: For use with other conditions. Shut down optimizers of a specific type.
• Value Annealing: For use with other conditions. Keep an optimizer alive if it looks likely to overtake the best optimizer.
• Validation Worsening: Stops optimizers which are evaluating points where the validation set loss value is worse than the average of several previously seen ones.
:vartype Type: Literal["BestFunctionValueUnmoving", "CurrentFunctionValueUnmoving", "MaxSequentialInvalidPoints", "MaxFunctionCalls", "MaxInteroptimizerDistance", "MinStepSize", "TimeAnnealing", "OptimizerType", "ValueAnnealing", "ValidationWorsening"]
:ivar BestFunctionValueUnmoving: Return True if an optimizer's best value has not improved significantly in a given amount of time.
:vartype BestFunctionValueUnmoving: ParAMS._Stopper._BestFunctionValueUnmoving
:ivar CurrentFunctionValueUnmoving: Return True if an optimizer's current function value has not improved significantly in a given amount of time.
:vartype CurrentFunctionValueUnmoving: ParAMS._Stopper._CurrentFunctionValueUnmoving
:ivar MaxInteroptimizerDistance: Return True if two optimizers are evaluating points too close to one another in parameter space.
:vartype MaxInteroptimizerDistance: ParAMS._Stopper._MaxInteroptimizerDistance
:ivar MinStepSize: Return True if an optimizer's step size between evaluations is too small.
:vartype MinStepSize: ParAMS._Stopper._MinStepSize
:ivar TimeAnnealing: Keeps optimizers alive based on how long they have been alive. Randomly keeps optimizers alive based on how long (in function evaluations) they have been active. The newer an optimizer is, the more likely it will pass the test and be kept alive. Used to temper very strict termination conditions.
:vartype TimeAnnealing: ParAMS._Stopper._TimeAnnealing
:ivar ValidationWorsening: Return True if the loss value of the validation set is increasing.
:vartype ValidationWorsening: ParAMS._Stopper._ValidationWorsening
:ivar ValueAnnealing: Intended for use with other stoppers. This condition is unlikely to stop a victim which has a loss function value very near the best seen thus far, but the probability of stopping increases with the difference between them.
:vartype ValueAnnealing: ParAMS._Stopper._ValueAnnealing
"""
[docs] class _BestFunctionValueUnmoving(FixedBlock):
r"""
Return True if an optimizer's best value has not improved significantly in a given amount of time.
:ivar NumberOfFunctionCalls: Number of function evaluations between comparison points.
:vartype NumberOfFunctionCalls: int | IntKey
:ivar Tolerance: Function tolerance fraction between 0 and 1.
:vartype Tolerance: float | FloatKey
"""
def __post_init__(self):
self.NumberOfFunctionCalls: int | IntKey = IntKey(name='NumberOfFunctionCalls', comment='Number of function evaluations between comparison points.')
self.Tolerance: float | FloatKey = FloatKey(name='Tolerance', comment='Function tolerance fraction between 0 and 1.', default=0.0)
[docs] class _CurrentFunctionValueUnmoving(FixedBlock):
r"""
Return True if an optimizer's current function value has not improved significantly in a given amount of time.
:ivar NumberOfFunctionCalls: Number of function evaluations between comparison points.
:vartype NumberOfFunctionCalls: int | IntKey
:ivar Tolerance: Function tolerance fraction between 0 and 1.
:vartype Tolerance: float | FloatKey
"""
def __post_init__(self):
self.NumberOfFunctionCalls: int | IntKey = IntKey(name='NumberOfFunctionCalls', comment='Number of function evaluations between comparison points.')
self.Tolerance: float | FloatKey = FloatKey(name='Tolerance', comment='Function tolerance fraction between 0 and 1.', default=0.0)
[docs] class _MaxInteroptimizerDistance(FixedBlock):
r"""
Return True if two optimizers are evaluating points too close to one another in parameter space.
:ivar CompareAllOptimizers: If Yes, the distances between all optimizer combinations are compared; otherwise, worse optimizers are only compared to the best one.
:vartype CompareAllOptimizers: BoolType | BoolKey
:ivar MaxRelativeDistance: Fraction (between 0 and 1) of the maximum distance in the space (from the point at all lower bounds to the point at all upper bounds) below which the optimizers are deemed too close and the victim will be stopped.
:vartype MaxRelativeDistance: float | FloatKey
"""
def __post_init__(self):
        self.CompareAllOptimizers: BoolType | BoolKey = BoolKey(name='CompareAllOptimizers', comment='If Yes, the distances between all optimizer combinations are compared; otherwise, worse optimizers are only compared to the best one.', default=False)
self.MaxRelativeDistance: float | FloatKey = FloatKey(name='MaxRelativeDistance', comment='Fraction (between 0 and 1) of the maximum distance in the space (from the point at all lower bounds to the point at all upper bounds) below which the optimizers are deemed too close and the victim will be stopped.')
[docs] class _MinStepSize(FixedBlock):
r"""
Return True if an optimizer's step size between evaluations is too small.
:ivar NumberOfFunctionCalls: Number of function evaluations between comparison points.
:vartype NumberOfFunctionCalls: int | IntKey
:ivar Tolerance: Fraction (between 0 and 1) of the maximum distance in the space (from the point at all lower bounds to the point at all upper bounds)
:vartype Tolerance: float | FloatKey
"""
def __post_init__(self):
self.NumberOfFunctionCalls: int | IntKey = IntKey(name='NumberOfFunctionCalls', comment='Number of function evaluations between comparison points.')
self.Tolerance: float | FloatKey = FloatKey(name='Tolerance', comment='Fraction (between 0 and 1) of the maximum distance in the space (from the point at all lower bounds to the point at all upper bounds)', default=0.0)
[docs] class _TimeAnnealing(FixedBlock):
r"""
Keeps optimizers alive based on how long they have been alive. Randomly keeps optimizers alive based on how long (in function evaluations) they have been active. The newer an optimizer is, the more likely it will pass the test and be kept alive. Used to temper very strict termination conditions.
:ivar CriticalRatio: Critical ratio of function calls between stopper and victim above which the victim is guaranteed to survive. Values lower than one are looser and allow the victim to survive even if it has been in operation longer than the stopper. Values higher than one are stricter and may stop the victim even if it has iterated fewer times than the stopper.
:vartype CriticalRatio: float | FloatKey
"""
def __post_init__(self):
self.CriticalRatio: float | FloatKey = FloatKey(name='CriticalRatio', comment='Critical ratio of function calls between stopper and victim above which the victim is guaranteed to survive. Values lower than one are looser and allow the victim to survive even if it has been in operation longer than the stopper. Values higher than one are stricter and may stop the victim even if it has iterated fewer times than the stopper.', default=1.0)
[docs] class _ValidationWorsening(FixedBlock):
r"""
Return True if the loss value of the validation set is increasing.
:ivar NumberOfFunctionCalls: Number of function evaluations between comparison points.
:vartype NumberOfFunctionCalls: int | IntKey
:ivar Tolerance: The loss value (f) is modified by: f * (1 + `tol`). This can also be used to ignore the effects of mild fluctuations.
:vartype Tolerance: float | FloatKey
"""
def __post_init__(self):
self.NumberOfFunctionCalls: int | IntKey = IntKey(name='NumberOfFunctionCalls', comment='Number of function evaluations between comparison points.', default=1)
self.Tolerance: float | FloatKey = FloatKey(name='Tolerance', comment='The loss value (f) is modified by: f * (1 + `tol`). This can also be used to ignore the effects of mild fluctuations.', default=0.0)
[docs] class _ValueAnnealing(FixedBlock):
r"""
Intended for use with other stoppers. This condition is unlikely to stop a victim which has a loss function value very near the best seen thus far, but the probability of stopping increases with the difference between them.
:ivar CriticalStopChance: Value is the probability of stopping a victim whose loss function value is twice as large as the stopper's in absolute value. The default is 50%.
:vartype CriticalStopChance: float | FloatKey
"""
def __post_init__(self):
        self.CriticalStopChance: float | FloatKey = FloatKey(name='CriticalStopChance', comment="Value is the probability of stopping a victim whose loss function value is twice as large as the stopper's in absolute value. The default is 50%.", default=0.5)
def __post_init__(self):
self.MaxFunctionCalls: int | IntKey = IntKey(name='MaxFunctionCalls', comment='Returns True after an optimizer has evaluated at least n points.\n\nUse in an AND combination with other stoppers to keep optimizers alive for at least some period of time, or alone or in an OR combination to shut them down after some period of time.')
        self.MaxSequentialInvalidPoints: int | IntKey = IntKey(name='MaxSequentialInvalidPoints', comment='Return True if the optimizer has failed to find a finite function value in the last n iterations. Usually indicates a lost optimizer with a poor starting point.', default=1)
self.OptimizerType: Literal["AdaptiveRateMonteCarlo", "CMAES", "Nevergrad", "Scipy", "RandomSampling", "GridSampling"] = MultipleChoiceKey(name='OptimizerType', comment='Intended for use with other stoppers to allow for specific hunting conditions based on the type of optimizer.', choices=['AdaptiveRateMonteCarlo', 'CMAES', 'Nevergrad', 'Scipy', 'RandomSampling', 'GridSampling'])
self.Type: Literal["BestFunctionValueUnmoving", "CurrentFunctionValueUnmoving", "MaxSequentialInvalidPoints", "MaxFunctionCalls", "MaxInteroptimizerDistance", "MinStepSize", "TimeAnnealing", "OptimizerType", "ValueAnnealing", "ValidationWorsening"] = MultipleChoiceKey(name='Type', comment="Conditions used to stop individual optimizers. Only optimizers which have not seen the current best value will be hunted. The 'best' optimizer will always be left alive.\n\nAvailable options:\n• Best Function Value Unmoving: Shutdown optimizers who haven't improved their best point in a long time.\n• Current Function Value Unmoving: Shutdown optimizers which have not changed their currently explored loss function value a lot in a long time.\n• Max Sequential Invalid Points: Stop optimizers that are stuck in areas of parameter space which return non-finite loss function values.\n• Max Function Calls: Stop optimizers when they have evaluated the loss function a certain number of times.\n• Max Interoptimizer Distance: Stop optimizers too close together in parameter space.\n• Min Step Size: Stop optimizers who are taking very small steps between iterations.\n• Time Annealing: For use with other conditions. Keep an optimizer alive for longer if it's relatively new.\n• Optimizer Type: For use with other conditions. Shutdown optimizers of a specific type.\n• Value Annealing: For use with other conditions. Keep an optimizer alive if it looks likely to overtake the best optimizer.\n• Validation Worsening: Stops optimizers which are evaluating points where the validation set loss value is worse than the average of several previously seen ones.", choices=['BestFunctionValueUnmoving', 'CurrentFunctionValueUnmoving', 'MaxSequentialInvalidPoints', 'MaxFunctionCalls', 'MaxInteroptimizerDistance', 'MinStepSize', 'TimeAnnealing', 'OptimizerType', 'ValueAnnealing', 'ValidationWorsening'])
self.BestFunctionValueUnmoving: ParAMS._Stopper._BestFunctionValueUnmoving = self._BestFunctionValueUnmoving(name='BestFunctionValueUnmoving', comment="Return True if an optimizer's best value has not improved significantly in a given amount of time.")
self.CurrentFunctionValueUnmoving: ParAMS._Stopper._CurrentFunctionValueUnmoving = self._CurrentFunctionValueUnmoving(name='CurrentFunctionValueUnmoving', comment="Return True if an optimizer's current function value has not improved significantly in a given amount of time.", gui_name='Current function value unmoving: ')
self.MaxInteroptimizerDistance: ParAMS._Stopper._MaxInteroptimizerDistance = self._MaxInteroptimizerDistance(name='MaxInteroptimizerDistance', comment='Return True if two optimizers are evaluating points too close to one another in parameter space.')
self.MinStepSize: ParAMS._Stopper._MinStepSize = self._MinStepSize(name='MinStepSize', comment="Return True if an optimizer's step size between evaluations is too small.")
self.TimeAnnealing: ParAMS._Stopper._TimeAnnealing = self._TimeAnnealing(name='TimeAnnealing', comment='Keeps optimizers alive based on how long they have been alive. Randomly keeps optimizers alive based on how long (in function evaluations) they have been active. The newer an optimizer is, the more likely it will pass the test and be kept alive. Used to temper very strict termination conditions.')
self.ValidationWorsening: ParAMS._Stopper._ValidationWorsening = self._ValidationWorsening(name='ValidationWorsening', comment='Return True if the loss value of the validation set is increasing.')
self.ValueAnnealing: ParAMS._Stopper._ValueAnnealing = self._ValueAnnealing(name='ValueAnnealing', comment='Intended for use with other stoppers. This condition is unlikely to stop a victim which has a loss function value very near the best seen thus far, but the probability of stopping increases with the difference between them.')
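# Illustrative sketch (not part of the generated class definitions): several Stopper blocks
# can be combined through the top-level StopperBooleanCombination key, referring to each
# block by its order in the input file. A hypothetical AND combination that keeps every
# optimizer alive for at least 500 evaluations before the "unmoving" check may stop it:
#
#   Stopper
#     Type MaxFunctionCalls
#     MaxFunctionCalls 500
#   End
#   Stopper
#     Type BestFunctionValueUnmoving
#     BestFunctionValueUnmoving
#       NumberOfFunctionCalls 100
#       Tolerance 0.01
#     End
#   End
#   StopperBooleanCombination (1 & 2)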
def __post_init__(self):
self.ApplyStoppersToBestOptimizer: BoolType | BoolKey = BoolKey(name='ApplyStoppersToBestOptimizer', comment='By default the stoppers are not applied to the best optimizer (the one who has seen the best value thus far). This is because many stoppers are based on comparisons to the best optimizer, and in most scenarios one would like to keep a well-performing optimizer alive. For some stopper configurations this paradigm does not make sense and we would prefer to apply the stoppers equally to all optimizers.', default=False)
self.CheckStopperInterval: int | IntKey = IntKey(name='CheckStopperInterval', comment='Number of loss function evaluations between evaluations of the stopper conditions.', default=1)
self.EndTimeout: float | FloatKey = FloatKey(name='EndTimeout', comment='The amount of time the manager will wait trying to smoothly join each optimizer at the end of the run. If exceeded the manager will abandon the optimizer and shutdown. This can raise errors from the abandoned threads, but may be needed to ensure the manager closes and does not hang.\n\nThis option is often needed if the Scipy optimizers are being used and should be set to a low value.', gui_name='Optimizer wait time on end (s): ', default=10.0)
self.EngineCollection: str | StringKey = StringKey(name='EngineCollection', comment='Path to (optional) JobCollection Engines YAML file.', default='job_collection_engines.yaml')
self.EvaluateLoss: BoolType | BoolKey = BoolKey(name='EvaluateLoss', comment='Evaluate the loss function based on the job results. This will produce the same output files as Task Optimization. \nIf No, this will be skipped and only the jobs will be run (and saved).\n\nWarning: If both Store Jobs and Evaluate Loss are No then this task will not produce any output.', default=True)
        self.ExitConditionBooleanCombination: str | StringKey = StringKey(name='ExitConditionBooleanCombination', comment='If multiple ExitConditions are used, this key indicates how their evaluations relate to one another.\n\nUse an integer to refer to an exit condition (defined by order in input file).\n\nRecognizes the symbols: ( ) & |\n\nE.g. (1 & 2) | 3.\n\nDefaults to an OR combination of all selected exit conditions.', gui_name='Combine exit conditions:')
self.FilterInfiniteValues: BoolType | BoolKey = BoolKey(name='FilterInfiniteValues', comment='If Yes, removes points from the calculation with non-finite loss values.\n\nNon-finite points can cause numerical issues in the sensitivity calculation.', default=True)
self.GlompoLogging: BoolType | BoolKey = BoolKey(name='GlompoLogging', comment='Include status and progress information from the optimization manager in the printstreams.', gui_name='Print manager logging messages: ', default=True)
self.GlompoSummaryFiles: Literal["None", "1", "2", "3", "4"] = MultipleChoiceKey(name='GlompoSummaryFiles', comment='Indicates what GloMPO-style outputs you would like saved to disk. Higher values also save all lower level information.\n\nAvailable options:\n• None: Nothing is saved.\n• 1: YAML file with summary info about the optimization settings, performance and the result.\n• 2: PNG file showing the trajectories of the optimizers.\n• 3: HDF5 file containing iteration history for each optimizer.\n• 4: More detailed HDF5 log including the residual results for each optimizer, data set and iteration.', gui_name='GloMPO summary files: ', default='None', choices=['None', '1', '2', '3', '4'])
self.JobCollection: str | StringKey = StringKey(name='JobCollection', comment='Path to JobCollection YAML file.', default='job_collection.yaml')
self.MoreExtractorsPath: str | Path | StringKey = PathStringKey(name='MoreExtractorsPath', comment='Path to directory with extractors.', default='extractors', ispath=True)
self.NumberBootstraps: int | IntKey = IntKey(name='NumberBootstraps', comment='Number of repeats of the calculation with different sub-samples.\n\nA small spread from a large number of bootstraps provides confidence on the estimation of the sensitivity.', gui_name='Repeat calculation n times: ', default=1)
self.NumberCalculationSamples: int | IntKey = IntKey(name='NumberCalculationSamples', comment='Number of samples from the full set available to use in the calculation.\n\nIf not specified or -1, uses all available points. For the sensitivity calculation, this will be redrawn for every bootstrap.', gui_name='Number of samples per repeat: ')
self.NumberSamples: int | IntKey = IntKey(name='NumberSamples', comment='Number of samples to generate during the sampling procedure.', gui_name='Generate n samples: ', default=1000)
self.PLAMSWorkingDirectory: str | Path | StringKey = PathStringKey(name='PLAMSWorkingDirectory', comment='Path to PLAMS working directory to temporarily hold Job results files.', default='', ispath=True)
self.ParameterInterface: str | StringKey = StringKey(name='ParameterInterface', comment='Path to parameter interface YAML file.', default='parameter_interface.yaml')
self.PrintStatusInterval: float | FloatKey = FloatKey(name='PrintStatusInterval', comment='Number of seconds between printing of a status summary.', default=600.0, unit='s')
self.RandomSeed: int | IntKey = IntKey(name='RandomSeed', comment='Random seed to use during the sampling procedure (for reproducibility).')
self.RestartDirectory: str | Path | StringKey = PathStringKey(name='RestartDirectory', comment="Specify a directory to continue interrupted GenerateReference or SinglePoint calculations. The directory depends on the task: \n\nGenerateReference: results/reference_jobs\nSinglePoint: results/single_point/jobs\n\nNote: If you use the GUI this directory will be COPIED into the results folder and the name will be prepended with 'dep-'. This can take up a lot of disk space, so you may want to remove the 'dep-' folder after the job has finished.", gui_name='Load jobs from: ', default='', ispath=True, gui_type='directory')
self.ResultsDirectory: str | Path | StringKey = PathStringKey(name='ResultsDirectory', comment='Directory in which output files will be created.', gui_name='Working directory: ', default='results', ispath=True)
self.ResumeCheckpoint: str | Path | StringKey = PathStringKey(name='ResumeCheckpoint', comment='Path to checkpoint file from which a previous optimization can be resumed.', ispath=True)
self.RunReweightCalculation: BoolType | BoolKey = BoolKey(name='RunReweightCalculation', comment='Run a more expensive sensitivity calculation that will also return suggested weights for the training set which will produce more balanced sensitivities between all the parameters.\n\nNote: The Gaussian kernel is recommended for the loss values kernel in this case.', default=False)
self.RunSampling: BoolType | BoolKey = BoolKey(name='RunSampling', comment='Produce a set of samples of the loss function and active parameters. Samples from the parameter space are drawn from a uniform random distribution.\n\nSuch a set of samples serves as the input to the sensitivity calculation.', default=False)
        self.SampleWithReplacement: BoolType | BoolKey = BoolKey(name='SampleWithReplacement', comment='Sample from the available data with or without replacement.\n\nThis only has an effect if the number of samples for the calculation is less than the total number available; otherwise replacement is used by necessity.', default=True)
self.SamplesDirectory: str | Path | StringKey = PathStringKey(name='SamplesDirectory', comment="Path to an 'optimization' directory containing the results of a previously run sampling.\n\nFirst looks for a 'glompo_log.h5' file. If not found, will look for 'running_loss.txt' and 'running_active_parameters.txt' in a sub-directory. The sub-directory used will depend on the DataSet Name.\n\nFor the Reweight calculation only a 'glompo_log.h5' file (with residuals) may be used.", default='', ispath=True, gui_type='directory')
self.SaveResiduals: BoolType | BoolKey = BoolKey(name='SaveResiduals', comment='During the sampling, save the individual difference between reference and predicted values for every sample and training set item.\nRequired for the Reweight calculation, and will be automatically activated if the reweight calculation is requested.\n\nSaving and analyzing the residuals can provide valuable insight into your training set, but can quickly occupy a large amount of disk space. Only save the residuals if you would like to run the reweight calculation or have a particular reason to do so.', default=False)
self.Scaler: Literal["Linear", "Std", "None", "Optimizers"] = MultipleChoiceKey(name='Scaler', comment="Type of scaling applied to the parameters. A scaled input space is needed by many optimization algorithms.\n\nAvailable options:\n• Linear: Scale all parameters between 0 and 1.\n• Std: Scale all parameters between -1 and 1.\n• None: Applies no scaling.\n• Optimizers (Default): Does not specify a scaling at the manager level, but allows the selection to be governed by the optimizer/s. If they do not require any particular scaler, then 'linear' is selected as the ultimate fallback.", gui_name='', default='Optimizers', choices=['Linear', 'Std', 'None', 'Optimizers'])
self.SetToAnalyze: Literal["TrainingSet", "ValidationSet"] = MultipleChoiceKey(name='SetToAnalyze', comment='Name of the data set to use for the sensitivity analysis.', gui_name='Analyze: ', default='TrainingSet', choices=['TrainingSet', 'ValidationSet'])
self.ShareBestEvaluationBetweenOptimizers: BoolType | BoolKey = BoolKey(name='ShareBestEvaluationBetweenOptimizers', comment='Share new best evaluations from one optimizer to another.\n\nSome algorithms can use this information to accelerate their own convergence. However, optimizers typically have to be configured to receive and handle the information.\n\nThis option can work very well with CMA-ES injections.', default=False)
self.SkipX0: BoolType | BoolKey = BoolKey(name='SkipX0', comment='Do not evaluate the initial parameters before starting the optimization.\n\nIf the initial parameters are evaluated and do not return a finite loss function value, the optimization will abort. A non-finite value typically indicates crashed jobs.', gui_name='Skip initial parameter evaluation: ', default=False)
self.SplitPrintstreams: BoolType | BoolKey = BoolKey(name='SplitPrintstreams', comment='Split print statements from each optimizer to separate files.', gui_name='Split optimizer printstreams: ', default=True)
self.StopperBooleanCombination: str | StringKey = StringKey(name='StopperBooleanCombination', comment='If multiple Stoppers are used this is required to indicate how their evaluations relate to one another.\n\nUse an integer to refer to a stopper (defined by order in input file).\n\nRecognizes the symbols: ( ) & |\n\nE.g. (1 & 2) | 3.\n\nDefaults to an OR combination of all selected stoppers.', gui_name='Combine stoppers:')
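# Example (added illustration): combining three Stopper blocks so that either the first
# two must fire together or the third fires alone. The integers refer to the stoppers in
# the order they appear in the input, as described in the comment above; the `driver`
# name is an assumption of this sketch.
#
#     driver.StopperBooleanCombination = '(1 & 2) | 3'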
self.StoreJobs: Literal["Auto", "Yes", "No"] = MultipleChoiceKey(name='StoreJobs', comment='Keeps the results files for each of the jobs.\nIf No, all pipeable jobs will be run through the AMS Pipe and no files will be saved (not even the ones not run through the pipe). If Auto, the pipeable jobs are run through the pipe and the results of nonpipeable jobs are saved to disk. If Yes, no jobs are run through the pipe and all job results are stored on disk. \n\nWarning: If both Store Jobs and Evaluate Loss are No then task SinglePoint will not produce any output.', default='Auto', choices=['Auto', 'Yes', 'No'])
self.Task: Literal["Optimization", "GenerateReference", "SinglePoint", "Sensitivity", "MachineLearning"] = MultipleChoiceKey(name='Task', comment='Task to run.\n\nAvailable options:\n• MachineLearning: Optimization for machine learning models.\n• Optimization: Global optimization powered by GloMPO.\n• GenerateReference: Run jobs with the reference engine to get reference values.\n• SinglePoint: Evaluate the current configuration of jobs, training data, and parameters.\n• Sensitivity: Measure the sensitivity of the loss function to each of the active parameters.', default='Optimization', choices=['Optimization', 'GenerateReference', 'SinglePoint', 'Sensitivity', 'MachineLearning'])
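# Example (added illustration): a SinglePoint evaluation that keeps all job output on
# disk. As warned in the StoreJobs comment above, a SinglePoint with both StoreJobs No
# and EvaluateLoss No produces no output at all. The `driver` name is an assumption of
# this sketch.
#
#     driver.Task = 'SinglePoint'
#     driver.StoreJobs = 'Yes'     # keep every job's result files; bypass the AMS Pipe
#     driver.EvaluateLoss = True   # also evaluate the loss, as for Task Optimization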
self.Validation: float | FloatKey = FloatKey(name='Validation', comment='Fraction of the training set to be used as a validation set. Will be ignored if a validation set has been explicitly defined.')
self.CheckpointControl: ParAMS._CheckpointControl = self._CheckpointControl(name='CheckpointControl', comment='Settings to control the production of checkpoints from which the optimization can be resumed.', gui_name='Checkpointing options: ')
self.Constraints: str | Sequence[str] | FreeBlock = self._Constraints(name='Constraints', comment="Parameter constraint rules to apply to the loss function. One per line. Use 'p' to reference the parameter set. You may use indices or names in square brackets to refer to a specific variable.\n\nNote that indices are absolute, i.e., they do not reference the active subset of parameters.\n\nE.g.:\np[0]>=p[2]\np['O:p_boc2']==p['H:p_boc2']")
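# Example (added illustration): constraint rules given as a sequence of strings, matching
# the `str | Sequence[str] | FreeBlock` annotation above. Indices are absolute into the
# full parameter set; names use the same square-bracket syntax. The parameter names are
# the placeholder examples from the comment above, and `driver` is an assumed variable.
#
#     driver.Constraints = [
#         'p[0]>=p[2]',
#         "p['O:p_boc2']==p['H:p_boc2']",
#     ]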
self.ControlOptimizerSpawning: ParAMS._ControlOptimizerSpawning = self._ControlOptimizerSpawning(name='ControlOptimizerSpawning', comment='Control the spawning of optimizers. Note that this is different from ExitConditions: spawning controls simply stop new optimizers from starting, but leave existing ones untouched, whereas ExitConditions shut down active optimizers and stop the optimization.')
self.DataSet: ParAMS._DataSet = self._DataSet(name='DataSet', comment='Configuration settings for each data set in the optimization.', unique=False, gui_type='Repeat at least once')
self.Engine: EngineBlock = self._Engine(name='Engine', comment='If set, use this engine for the ParAMS SinglePoint. Mutually exclusive with EngineCollection. ', header=True)
self.ExitCondition: ParAMS._ExitCondition = self._ExitCondition(name='ExitCondition', comment='A condition used to stop the optimization when it returns true.', unique=False)
self.Generator: ParAMS._Generator = self._Generator(name='Generator', comment='A Generator used to produce x0 starting points for the optimizers.', gui_name='Starting point generator: ')
self.LoggingInterval: ParAMS._LoggingInterval = self._LoggingInterval(name='LoggingInterval', comment='Number of function evaluations between every log to file.', gui_type='Repeat at least once')
self.LossValuesKernel: ParAMS._LossValuesKernel = self._LossValuesKernel(name='LossValuesKernel', comment='Kernel applied to the loss function values during the sensitivity calculation.')
self.MachineLearning: ParAMS._MachineLearning = self._MachineLearning(name='MachineLearning', comment='Options for Task MachineLearning.')
self.Optimizer: ParAMS._Optimizer = self._Optimizer(name='Optimizer', comment='An optimizer which may be used during the optimization.', unique=False, gui_name='Optimization algorithm: ', gui_type='Repeat at least once')
self.OptimizerSelector: ParAMS._OptimizerSelector = self._OptimizerSelector(name='OptimizerSelector', comment='If multiple Optimizers are included, then this block must be included and configures the Selector which will choose between them.', gui_name='Selection between optimizer types: ')
self.ParallelLevels: ParAMS._ParallelLevels = self._ParallelLevels(name='ParallelLevels', comment='Distribution of threads/processes between the parallelization levels.', gui_name='Parallelization distribution: ')
self.ParametersKernel: ParAMS._ParametersKernel = self._ParametersKernel(name='ParametersKernel', comment='Kernel applied to the parameters for which sensitivity is being measured.')
self.Stopper: ParAMS._Stopper = self._Stopper(name='Stopper', comment='A Stopper used to terminate optimizers early.', unique=False)