The metrica package contains more than 40 functions. Two
arguments are always required: observed (Oi; a.k.a. actual,
measured, truth, target) and predicted (Pi; a.k.a.
simulated, fitted) values. There is also an optional data
argument that lets you pass an existing data frame containing both the observed
and predicted vectors.
Some functions, such as the slope of the linear regression describing the
bivariate scatter, also require the axis orientation to be defined.
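Why the axis orientation matters can be illustrated with a minimal Python sketch. It is not the package's implementation: the helper names `ols_slope` and `sma_slope` and the toy data are assumptions made for this example. It compares the ordinary least squares (OLS) slope with the standardized major axis (SMA) slope when the observed and predicted axes are swapped.

```python
from math import sqrt
from statistics import fmean


def _sums(x, y):
    """Centered sums of squares and cross-products for two equal-length lists."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxx, syy, sxy


def ols_slope(x, y):
    """Slope of the OLS regression of y on x (hypothetical helper)."""
    sxx, _, sxy = _sums(x, y)
    return sxy / sxx


def sma_slope(x, y):
    """Slope of the SMA regression of y on x: sign(r) * sd(y) / sd(x) (hypothetical helper)."""
    sxx, syy, sxy = _sums(x, y)
    sign = 1.0 if sxy >= 0 else -1.0
    return sign * sqrt(syy / sxx)


# Toy data, invented for illustration only.
obs = [1.0, 2.0, 3.0, 4.0, 5.0]
pred = [1.2, 1.9, 3.4, 3.8, 5.3]

# "PO" orientation: predicted on the y axis, observed on the x axis.
ols_po, sma_po = ols_slope(obs, pred), sma_slope(obs, pred)
# "OP" orientation: observed on the y axis, predicted on the x axis.
ols_op, sma_op = ols_slope(pred, obs), sma_slope(pred, obs)

# The product of the two OLS slopes equals r squared, so they are not reciprocal
# unless the fit is perfect. The two SMA slopes are exact reciprocals.
print("OLS product:", ols_po * ols_op)
print("SMA product:", sma_po * sma_op)
```

The SMA line is the same whichever variable is placed on the y axis. The OLS line changes with the orientation, which is why an OLS-based function needs that choice stated explicitly.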
The functions currently included cover “regression error” metrics
(i.e. prediction performance for continuous variables). Classification
error metrics are coming soon.
Always keep in mind that predicted values should come from out-of-bag
samples (data not seen during training) to avoid overestimating prediction
performance.
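The risk of evaluating on data the model has already seen can be shown with a small Python sketch. It rests on stated assumptions: the data are invented, the "model" is a one-nearest-neighbor predictor that simply memorizes its training data, and leave-one-out prediction stands in for out-of-bag sampling.

```python
from math import sqrt


def rmse(obs, pred):
    """Root mean squared error between observed and predicted values."""
    return sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def nearest_neighbor_predict(x_train, y_train, x_new):
    """Return the y of the training point whose x is closest to x_new."""
    idx = min(range(len(x_train)), key=lambda i: abs(x_train[i] - x_new))
    return y_train[idx]


# Toy data, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]

# In-sample: every point is predicted by a model that has already seen it.
in_sample = [nearest_neighbor_predict(x, y, xi) for xi in x]

# Leave-one-out: every point is predicted by a model trained without it.
held_out = []
for i in range(len(x)):
    x_train = x[:i] + x[i + 1:]
    y_train = y[:i] + y[i + 1:]
    held_out.append(nearest_neighbor_predict(x_train, y_train, x[i]))

rmse_in = rmse(y, in_sample)
rmse_out = rmse(y, held_out)
print("in-sample RMSE:", rmse_in)
print("held-out RMSE:", rmse_out)
```

The in-sample error is exactly zero because each prediction is the point's own value. The held-out error is clearly larger and is the better estimate of how the model behaves on new data.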
# | Metric | Definition | Details | Formula |
---|---|---|---|---|
A | RSS | Residual sum of squares (a.k.a. sum of squares) | The sum of squared differences between predicted and observed values. It is the basis of many squared-scale error metrics, such as the MSE | |
B | TSS | Total sum of squares | The sum of the squared differences between the observations and their mean. It is used as a reference error, for example, to estimate explained variance | |
C | var_u | Sample variance, uncorrected | The mean of the squared differences between the values of a variable x and its mean (divided by n, not n-1) | |
D | uSD | Sample standard deviation, uncorrected | The square root of the mean of the squared differences between the values of a variable x and its mean (divided by n, not n-1) | |
1 | B0 | Intercept of SMA regression | The standardized major axis (SMA) is a symmetric linear regression (results and interpretation are invariant to axis orientation), recommended for describing the bivariate scatter instead of OLS regression (the classic linear model, whose results vary with the axis orientation). B0 can be used to test agreement along with B1 (H0: B0 = 0, B1 = 1). Warton et al. (2006) | |
2 | B1 | Slope of SMA regression | The standardized major axis (SMA) is a symmetric linear regression (results and interpretation are invariant to axis orientation), recommended for describing the bivariate scatter instead of OLS regression (the classic linear model, whose results vary with the axis orientation). B1 can be used to test isometry of the PO scatter (H0: B1 = 1). B1 also represents the ratio of standard deviations (So and Sp). Warton et al. (2006) | |
3 | r | Pearson’s correlation coefficient | Strength of linear association between P and O. However, it measures “precision” but not accuracy. Kirch (2008) | |
4 | R2 | Coefficient of determination | Strength of linear association between P and O. However, it measures “precision” but not accuracy | |
5 | Xa | Accuracy coefficient | Measures accuracy. Used to adjust the precision measured by r to estimate agreement | |
6 | CCC | Concordance correlation coefficient | Tests agreement. It combines precision (r) and accuracy (Xa) components. Easy to interpret. Lin (1989) | |
7 | MAE | Mean Absolute Error | Measures both lack of accuracy and lack of precision on an absolute scale. It keeps the same units as the response variable. Less sensitive to outliers than the MSE or RMSE. Willmott & Matsuura (2005) | |
8 | RMAE | Relative Mean Absolute Error | Normalizes the MAE with respect to the mean of observations | |
9 | MAPE | Mean Absolute Percentage Error | Percentage units (scale-independent). Easy to explain and to compare performance across models with different response variables. Asymmetric and unbounded | |
10 | SMAPE | Symmetric Mean Absolute Percentage Error | SMAPE tackles the asymmetry issues of MAPE and includes lower (0%) and upper (200%) bounds. Makridakis (1993) | |
11 | RAE | Relative Absolute Error | RAE normalizes MAE with respect to the total absolute error. Lower bound at 0 (perfect fit) and no upper bound (infinity) | |
12 | RSE | Relative Squared Error | Proportion of the total sum of squares that corresponds to differences between predictions and observations (residual sum of squares) | |
13 | MBE | Mean Bias Error | Main bias error metric. Same units as the response variable. Related to differences between means of predictions and observations. Negative values indicate overestimation. Positive values indicate underestimation. Unbounded. Also known as average error. Janssen & Heuberger (1995) | |
14 | PBE | Percentage Bias Error | Useful to identify systematic over- or under-predictions. Percentage units. As with the MBE, negative PBE values indicate overestimation, while positive values indicate underestimation. Unbounded. Gupta et al. (1999) | |
15 | PAB | Percentage Additive Bias | Percentage of the MSE related to systematic additive issues in the predictions. Related to the difference between the means of predictions and observations | |
16 | PPB | Percentage Proportional Bias | Percentage of the MSE related to systematic proportionality issues in the predictions. Related to the slope of the regression line describing the bivariate scatter | |
17 | MSE | Mean Squared Error | Comprises both accuracy and precision. High sensitivity to outliers | |
18 | RMSE | Root Mean Squared Error | Comprises both precision and accuracy, and has the same units as the variable of interest. Very sensitive to outliers | |
19 | RRMSE | Relative Root Mean Squared Error | RMSE normalized by the mean of observations | |
20 | RSR | Root Mean Standard Deviation Ratio | RMSE normalized by the standard deviation of observations. Moriasi et al. (2007) | |
21 | iqRMSE | Inter-quartile Normalized Root Mean Squared Error | RMSE normalized by the interquartile range (between the 25th and 75th percentiles) | |
22 | MLA | Mean Lack of Accuracy | Bias component of MSE decomposition. Correndo et al. (2021) | |
23 | MLP | Mean Lack of Precision | Variance component of MSE decomposition. Correndo et al. (2021) | |
24 | PLA | Percentage Lack of Accuracy | Percentage of the MSE related to lack of accuracy (systematic differences) in the predictions. Correndo et al. (2021) | |
25 | PLP | Percentage Lack of Precision | Percentage of the MSE related to lack of precision (unsystematic differences) in the predictions. Correndo et al. (2021) | |
26 | SB | Squared Bias | Additive bias component, MSE decomposition. Kobayashi and Salam (2000) | |
27 | SDSD | Squared Difference between Standard Deviations | Proportional bias component, MSE decomposition. Kobayashi and Salam (2000) | |
28 | LCS | Lack of Correlation | Random error component, MSE decomposition. Kobayashi and Salam (2000) | |
29 | Ue | Random error proportion | Ue estimates the proportion of the total sum of squares related to the random error (unsystematic error or variance), following the sum of squares decomposition suggested by Smith and Rose (1995), also known as Theil’s partial inequalities | |
30 | Uc | Lack of Consistency error proportion | Uc estimates the proportion of the total sum of squares related to the lack of consistency (proportional bias), following the sum of squares decomposition suggested by Smith and Rose (1995), also known as Theil’s partial inequalities | |
31 | Ub | Mean Bias error proportion | Ub estimates the proportion of the total sum of squares related to the mean bias, following the sum of squares decomposition suggested by Smith and Rose (1995), also known as Theil’s partial inequalities | |
32 | NSE | Nash and Sutcliffe’s Model Efficiency | Model efficiency using squared residuals normalized by the variance of observations. Nash and Sutcliffe (1970) | |
33 | E1 | Absolute Model Efficiency | Model efficiency. Modification of NSE using absolute residuals instead of squared residuals. Legates and McCabe (1999) | |
34 | Erel | Relative Model Efficiency | Compared to the NSE, Erel is suggested as more sensitive to systematic over- or under-predictions. Krause et al. (2005) | |
35 | KGE | Kling-Gupta Model Efficiency | Model efficiency with accuracy, precision, and consistency components. Kling et al. (2012) | |
36 | d | Index of Agreement | Measures accuracy and precision using squared residuals. Dimensionless (normalized). Bounded [0;1]. Asymmetric. Willmott (1981) | |
37 | d1 | Modified Index of Agreement | Measures accuracy and precision using absolute residuals (1). Dimensionless (normalized). Bounded [0;1]. Asymmetric. Willmott et al. (1985) | |
38 | d1r | Refined Index of Agreement | Refines d1 by modifying the denominator (potential error) used to normalize the absolute error. Willmott et al. (2012) | |
39 | RAC | Robinson’s Agreement Coefficient | RAC measures both accuracy and precision (general agreement). Dimensionless (normalized). Bounded [0;1]. Symmetric. Robinson (1957; 1959) | |
40 | AC | Ji and Gallo’s Agreement Coefficient | AC measures both accuracy and precision (general agreement). Dimensionless (normalized). Positively bounded [-infinity;1]. Symmetric. Ji and Gallo (2006) | |
41 | lambda | Duveiller’s Lambda Coefficient | lambda measures both accuracy and precision. Dimensionless (normalized). Bounded [-1;1]. Symmetric. Equivalent to CCC when r is greater than or equal to 0. Duveiller et al. (2016) | |
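Several of the definitions in the table can be checked numerically with a short Python sketch. This is an illustration under stated assumptions, not the package's code: the function name `regression_metrics` and the data are invented, the MBE sign follows the convention stated in the table (negative means overestimation), all standard deviations are uncorrected (divided by n), and SDSD is taken as the squared difference between the standard deviations, as defined by Kobayashi and Salam (2000).

```python
from math import sqrt
from statistics import fmean


def regression_metrics(obs, pred):
    """Compute a handful of the tabulated metrics (hypothetical helper)."""
    n = len(obs)
    mo, mp = fmean(obs), fmean(pred)

    rss = sum((o - p) ** 2 for o, p in zip(obs, pred))  # residual sum of squares
    tss = sum((o - mo) ** 2 for o in obs)                # total sum of squares

    # Uncorrected standard deviations and covariance (divide by n, not n-1).
    so = sqrt(tss / n)
    sp = sqrt(sum((p - mp) ** 2 for p in pred) / n)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred)) / n

    r = cov / (so * sp)
    mse = rss / n
    ccc = 2 * cov / (so ** 2 + sp ** 2 + (mo - mp) ** 2)

    return {
        "RSS": rss,
        "TSS": tss,
        "MAE": sum(abs(o - p) for o, p in zip(obs, pred)) / n,
        "MBE": mo - mp,  # assumed sign: negative when predictions run high
        "MSE": mse,
        "RMSE": sqrt(mse),
        "NSE": 1 - rss / tss,
        "r": r,
        "CCC": ccc,
        "Xa": ccc / r,  # accuracy component, so that CCC = r * Xa
        "SB": (mo - mp) ** 2,
        "SDSD": (so - sp) ** 2,
        "LCS": 2 * so * sp * (1 - r),
    }


# Toy data, invented for illustration only.
obs = [1.0, 2.0, 3.0, 4.0, 5.0]
pred = [1.2, 1.9, 3.4, 3.8, 5.3]

m = regression_metrics(obs, pred)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

With uncorrected standard deviations, SB + SDSD + LCS adds up to the MSE, and CCC factors into precision (r) times accuracy (Xa). Both identities are quick consistency checks on any implementation of these metrics.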
Correndo et al. (2021). Revisiting linear regression to test
agreement in continuous predicted-observed datasets. Agric. Syst.
192, 103194.
Duveiller et al. (2016). Revisiting the concept of a symmetric
index of agreement for continuous datasets. Sci. Rep. 6, 1-14.
Gupta et al. (1999). Status of automatic calibration for
hydrologic models: Comparison with multilevel expert calibration. J.
Hydrologic Eng. 4(2): 135-143.
Janssen & Heuberger (1995). Calibration of process-oriented
models. Ecol. Modell. 83, 55-66.
Ji & Gallo (2006). An agreement coefficient for image
comparison. Photogramm. Eng. Remote Sensing 72(7), 823–833.
Kling et al. (2012). Runoff conditions in the upper Danube basin
under an ensemble of climate change scenarios. J. Hydrol., 424-425,
264-277.
Kirch (2008). Pearson’s Correlation Coefficient. In: Kirch W.
(ed.) Encyclopedia of Public Health. Springer, Dordrecht.
Krause et al. (2005). Comparison of different efficiency criteria
for hydrological model assessment. Adv. Geosci. 5, 89–97.
Kobayashi & Salam (2000). Comparing simulated and measured
values using mean squared deviation and its components. Agron. J.
92, 345–352.
Legates & McCabe (1999). Evaluating the use of
“goodness-of-fit” measures in hydrologic and hydroclimatic model
validation. Water Resour. Res.
Lin (1989). A concordance correlation coefficient to evaluate
reproducibility. Biometrics 45 (1), 255–268.
Makridakis (1993). Accuracy measures: theoretical and practical
concerns. Int. J. Forecast. 9, 527-529.
Moriasi et al. (2007). Model Evaluation Guidelines for Systematic
Quantification of Accuracy in Watershed Simulations. Trans. ASABE
50, 885–900.
Nash & Sutcliffe (1970). River flow forecasting through
conceptual models part I - A discussion of principles. J. Hydrol.
10(3), 282–290.
Robinson (1957). The statistical measurement of agreement.
Am. Sociol. Rev. 22(1), 17-25.
Robinson (1959). The geometric interpretation of agreement.
Am. Sociol. Rev. 24(3), 338-345.
Smith & Rose (1995). Model goodness-of-fit analysis using
regression and related techniques. Ecol. Model. 77, 49–64.
Warton et al. (2006). Bivariate line-fitting methods for
allometry. Biol. Rev. Camb. Philos. Soc. 81, 259–291.
Willmott (1981). On the validation of models. Phys. Geogr. 2,
184–194.
Willmott et al. (1985). Statistics for the evaluation and
comparison of models. J. Geophys. Res. 90, 8995.
Willmott & Matsuura (2005). Advantages of the mean absolute
error (MAE) over the root mean square error (RMSE) in assessing average
model performance. Clim. Res. 30, 79–82.
Willmott et al. (2012). A refined index of model performance.
Int. J. Climatol. 32, 2088–2094.
Yang et al. (2014). An evaluation of the statistical methods for
testing the performance of crop models with observed data. Agric.
Syst. 127, 81-89.