3.11. Loss Functions

Loss functions are metrics that evaluate the residuals vector between reference and predicted properties, \((\boldsymbol{w}/\boldsymbol{\sigma})(\boldsymbol{y} - \boldsymbol{\hat{y}})\), which is generated every time DataSet.evaluate() is called. Note that although DataSet.evaluate() returns the non-weighted residuals, the loss function always receives a residuals vector weighted by \(w/\sigma\).

By default, the following string keywords are recognized as loss functions:
  • lad, lae : Least Absolute Error
  • rmsd, rmse : Root-Mean-Square Error
  • mad, mae : Mean Absolute Error
  • sse, rss : Sum of Squared Errors (this is the default optimization loss)

and can be passed to an Optimization in one of the following ways:

my_optimization = Optimization(*args, loss='mae') # As the string keyword

from scm.params.core.lossfunctions import MAE # Loss functions are not imported automatically
my_optimization = Optimization(*args, loss=MAE()) # Or directly

After calling my_optimization.optimize(), predicted properties will be compared to the reference values with the MAE. A loss function can also be passed to DataSet.evaluate() in the same manner.

3.11.1. Least Absolute Error

class LAE(initial_fx=0)

Least absolute error (LAE), least absolute deviations (LAD) loss.

(3.5)\[L_\mathrm{LAE} = \sum_{i=1}^N | y_i - \hat{y}_i |\]

Accessible with the strings 'lae', 'lad'.

3.11.2. Mean Absolute Error

class MAE(initial_fx=0)

Mean Absolute Error (MAE, MAD) loss.

(3.6)\[L_\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^N | y_i - \hat{y}_i |\]

Accessible with the strings 'mae', 'mad'.

3.11.3. Root-Mean-Square Error

class RMSE(initial_fx=0)

Root-Mean-Square Error (RMSE, RMSD) loss.

(3.7)\[L_\mathrm{RMSE} = \sqrt{ \frac{1}{N} \sum_{i=1}^N (y_i - \hat{y}_i)^2 }\]

Accessible with the strings 'rmse', 'rmsd'.

3.11.4. Sum of Squared Errors

class SSE(initial_fx=0)

Residual Sum of Squares (RSS) or Sum of Squared Errors (SSE) loss. This loss function is commonly used for ReaxFF parameter fitting.

(3.8)\[L_\mathrm{SSE} = \sum_{i=1}^N (y_i - \hat{y}_i)^2\]

Accessible with the strings 'sse', 'rss'.
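As a quick check of the four definitions above, the metrics can be reproduced with plain NumPy on a single residuals vector (a standalone sketch that does not use the params package itself):

```python
import numpy as np

# Example weighted residuals, (w/sigma)*(y - yhat)
r = np.array([0.5, -1.0, 2.0, -0.5])
N = r.size

lae = np.sum(np.abs(r))        # Least Absolute Error, Eq. (3.5)
mae = np.mean(np.abs(r))       # Mean Absolute Error, Eq. (3.6)
rmse = np.sqrt(np.mean(r**2))  # Root-Mean-Square Error, Eq. (3.7)
sse = np.sum(r**2)             # Sum of Squared Errors, Eq. (3.8)

# The metrics are simply related: MAE = LAE/N and SSE = N*RMSE**2
assert np.isclose(mae, lae / N)
assert np.isclose(sse, N * rmse**2)
```

Since all four are monotone transformations of each other for a fixed N, the choice of loss mainly affects how the loss value is reported and how outlier entries are weighted (quadratic vs. absolute).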

3.11.5. Loss Function API

User-specific loss functions can be defined by inheriting from the base class below. Make sure that your loss defines the attributes fx and contribution; the latter should store the percentage contribution of each entry's residuals to the overall loss function value.

Note that although the residuals are depicted as a single vector throughout the documentation, the data structure that a Loss receives is a List[1d array], where every element in the list stores the (weighted) residuals vector of the respective Data Set entry.
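Because a loss receives a list of per-entry arrays rather than one flat vector, a common first step is to concatenate them into the single residuals vector the formulas refer to (a sketch independent of the params package; the example values are arbitrary):

```python
import numpy as np

# One (weighted) residuals array per Data Set entry, as passed to Loss.__call__()
residuals = [np.array([0.1, -0.2]), np.array([0.3]), np.array([-0.4, 0.5, 0.0])]

# Concatenate into the single residuals vector used in Eqs. (3.5)-(3.8)
flat = np.concatenate(residuals)
print(flat.size)  # 6 elements in total
```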

class Loss(initial_fx=0)

Base class for the mathematical definition of a loss function.

Attributes:

fx : float
The return value of __call__(), i.e., the total loss function value after the evaluation.
contribution : list or ndarray of floats
Stores the per-entry contribution to fx, with one element per input entry passed to __call__().
__init__(initial_fx=0)

Initialize the loss with a starting value of initial_fx

__call__(residuals: List[numpy.ndarray]) → float

When DataSet.evaluate() is called, reference and predicted values are extracted for each entry and combined into a weighted list of residuals where every entry represents \((w_i/\sigma_i)(y_i-\hat{y}_i)\). The loss computes a metric given this residuals vector.

The contribution attribute should be calculated in this method.

Parameters:
residuals : List of 1d arrays
List of \((w_i/\sigma_i)(y_i-\hat{y}_i)\) elements.
Returns:
float
Total calculated loss
__repr__()

Allow string representations of built-in losses. Modify this method in your child class if it requires additional arguments at __init__.
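The API above can be sketched with a hypothetical user-defined loss. The base class here is only a minimal stand-in mimicking the documented __init__ signature (the real one lives in scm.params.core.lossfunctions), and SumAbsCubed is an invented metric for illustration:

```python
import numpy as np

class Loss:
    """Minimal stand-in for the Loss base class documented above (sketch)."""
    def __init__(self, initial_fx=0):
        self.fx = initial_fx
        self.contribution = None

class SumAbsCubed(Loss):
    """Hypothetical custom loss: sum of |r|^3 over all residual elements."""

    def __call__(self, residuals):
        # residuals is a List[1d array]: one weighted vector per Data Set entry
        per_entry = np.array([np.sum(np.abs(r) ** 3) for r in residuals])
        self.fx = float(per_entry.sum())
        # Percentage contribution of every entry to the total loss value
        self.contribution = 100.0 * per_entry / self.fx
        return self.fx

    def __repr__(self):
        # Extended because __init__ takes no extra arguments; add them here if yours does
        return 'SumAbsCubed()'

loss = SumAbsCubed()
total = loss([np.array([1.0, -2.0]), np.array([3.0])])  # 1 + 8 = 9 and 27
```

After the call, loss.fx holds 36.0 and loss.contribution holds [25.0, 75.0], i.e., the first entry accounts for a quarter of the total loss.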