es() is a part of the smooth package. It allows constructing Exponential Smoothing (also known as ETS) models, selecting the most appropriate one among the 30 possible ones, including exogenous variables, and much more.
In this vignette we will use data from the Mcomp package, so it is advised to install it.
Let’s load the necessary packages:
require(smooth)
require(Mcomp)
You may note that Mcomp depends on the forecast package, and if you load both forecast and smooth, you will get a message that the forecast() function is masked from the environment. There is nothing to worry about: smooth uses this function for consistency purposes, and it is exactly the same as the original forecast() in the forecast package. The function was included in smooth only in order to avoid adding forecast to the dependencies of the package.
The simplest call of this function is:
es(M3$N2457$x, h=18, holdout=TRUE)
## Time elapsed: 1.02 seconds
## Model estimated: ETS(MAN)
## Persistence vector g:
## alpha beta
## 0.108 0.000
## Initial values were optimised.
## 5 parameters were estimated in the process
## Residuals standard deviation: 0.414
## Cost function type: MSE; Cost function value: 1268104.714
##
## Information criteria:
## AIC AICc BIC
## 1648.418 1649.078 1661.292
## Forecast errors:
## MPE: 24.1%; Bias: 84.9%; MAPE: 39.3%; SMAPE: 48%
## MASE: 2.883; sMAE: 117.6%; RelMAE: 1.232; sMSE: 234.5%
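In this case the function uses a branch and bound algorithm to form a pool of models to check, and after that constructs the model with the lowest information criterion. As we can see, it also produces an output with brief information about the model, which contains: the time elapsed, the type of ETS model selected, the persistence vector (smoothing parameters), how the initial values were obtained, the number of estimated parameters, the standard deviation of the residuals, the cost function type and its value, the information criteria, and the forecast errors (reported because we set holdout=TRUE).
The function has also produced a graph with actuals, fitted values and point forecasts.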
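The information criteria in the output are linked through standard formulas, which can help when comparing models by hand. The sketch below recomputes AICc and BIC from the printed AIC, using the 5 estimated parameters reported above. The sample size of 97 in-sample observations is an assumption inferred from the printed values rather than something the output states, and the variable names are illustrative.

```r
# Values copied from the output above
aic <- 1648.418
nParam <- 5
# Assumption: 97 in-sample observations (not printed by es())
nObs <- 97

# AICc adds a small-sample correction to AIC
aicc <- aic + 2 * nParam * (nParam + 1) / (nObs - nParam - 1)
# BIC replaces the penalty 2 * k with k * log(n)
bic <- aic - 2 * nParam + nParam * log(nObs)

print(round(c(AIC = aic, AICc = aicc, BIC = bic), 3))
```

Under these assumptions the results should agree with the printed AICc and BIC up to rounding of the AIC value.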
If we need prediction intervals, then we run:
es(M3$N2457$x, h=18, holdout=TRUE, intervals=TRUE)
## Time elapsed: 1.32 seconds
## Model estimated: ETS(MAN)
## Persistence vector g:
## alpha beta
## 0.108 0.000
## Initial values were optimised.
## 5 parameters were estimated in the process
## Residuals standard deviation: 0.414
## Cost function type: MSE; Cost function value: 1268104.714
##
## Information criteria:
## AIC AICc BIC
## 1648.418 1649.078 1661.292
## 95% parametric prediction intervals were constructed
## 89% of values are in the prediction interval
## Forecast errors:
## MPE: 24.1%; Bias: 84.9%; MAPE: 39.3%; SMAPE: 48%
## MASE: 2.883; sMAE: 117.6%; RelMAE: 1.232; sMSE: 234.5%
Due to the multiplicative nature of the error term in the model, the intervals are asymmetric. This is the expected behaviour. The other thing to note is that the output now also reports the theoretical width of the prediction intervals (here 95%) and their actual coverage in the holdout.
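To see why a multiplicative error leads to asymmetric bounds, and what the coverage figure measures, consider the rough sketch below. It assumes a log-normal multiplicative error, which is a simplification chosen for illustration and not necessarily how es() constructs its intervals. The point forecast, the bounds used for the coverage check and the holdout values are all hypothetical.

```r
# Hypothetical point forecast; sigma borrowed from the printed residual sd
pointForecast <- 8600
sigma <- 0.414
level <- 0.95

# Standard normal quantiles for the lower and upper bounds
z <- qnorm(c((1 - level) / 2, (1 + level) / 2))

# Assumption: the error multiplies the forecast and is log-normal
bounds <- pointForecast * exp(z * sigma)
lowerGap <- pointForecast - bounds[1]
upperGap <- bounds[2] - pointForecast
print(c(lowerGap = lowerGap, upperGap = upperGap))

# Coverage: share of holdout values that fall inside the interval
holdoutValues <- c(5, 7, 9, 30)   # hypothetical actuals
lower <- rep(4, 4)                # hypothetical lower bounds
upper <- rep(10, 4)               # hypothetical upper bounds
coverage <- mean(holdoutValues >= lower & holdoutValues <= upper)
print(coverage)
```

The upper gap exceeds the lower one because the bounds are symmetric on the log scale, not on the original scale.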
If we save the model (and let’s say we want it to work silently):
ourModel <- es(M3$N2457$x, h=18, holdout=TRUE, silent="all")
we can then reuse it for different purposes:
es(M3$N2457$x, model=ourModel, h=18, holdout=FALSE, intervals="np", level=0.93)
## Time elapsed: 0.08 seconds
## Model estimated: ETS(MAN)
## Persistence vector g:
## alpha beta
## 0.108 0.000
## Initial values were provided by user.
## 1 parameter was estimated in the process
## 4 parameters were provided
## Residuals standard deviation: 0.428
## Cost function type: MSE; Cost function value: 1938252.127
##
## Information criteria:
## AIC AICc BIC
## 1993.245 1993.280 1995.990
## 93% nonparametric prediction intervals were constructed
We can also extract the type of model in order to reuse it later:
modelType(ourModel)
## [1] "MAN"
This handy function, by the way, also works with ets() from the forecast package.
We can then use only the persistence or only the initial values from the model to construct another one:
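The string returned by modelType() follows the usual ETS naming: the first letter is the error type, the second the trend type (optionally followed by "d" for a damped trend), and the last the seasonality type. The helper below, decodeETS(), is a hypothetical illustration written in base R and is not part of smooth or forecast.

```r
# Hypothetical helper: split an ETS code such as "MAN" or "AAdN" into components
decodeETS <- function(model) {
  components <- c(A = "additive", M = "multiplicative", N = "none")
  parts <- strsplit(model, "")[[1]]
  # A "d" after the trend letter marks a damped trend
  damped <- "d" %in% parts
  parts <- parts[parts != "d"]
  list(error = unname(components[parts[1]]),
       trend = unname(components[parts[2]]),
       damped = damped,
       seasonality = unname(components[parts[3]]))
}

str(decodeETS("MAN"))
str(decodeETS("AAdN"))
```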
es(M3$N2457$x, model=modelType(ourModel), h=18, holdout=FALSE, initial=ourModel$initial, silent="graph")
## Time elapsed: 0.02 seconds
## Model estimated: ETS(MAN)
## Persistence vector g:
## alpha beta
## 0.119 0.000
## Initial values were provided by user.
## 3 parameters were estimated in the process
## 2 parameters were provided
## Residuals standard deviation: 0.431
## Cost function type: MSE; Cost function value: 1937089.528
##
## Information criteria:
## AIC AICc BIC
## 1997.176 1997.392 2005.411
es(M3$N2457$x, model=modelType(ourModel), h=18, holdout=FALSE, persistence=ourModel$persistence, silent="graph")
## Time elapsed: 0.02 seconds
## Model estimated: ETS(MAN)
## Persistence vector g:
## alpha beta
## 0.108 0.000
## Initial values were optimised.
## 3 parameters were estimated in the process
## 2 parameters were provided
## Residuals standard deviation: 0.431
## Cost function type: MSE; Cost function value: 1937169.382
##
## Information criteria:
## AIC AICc BIC
## 1997.181 1997.397 2005.416
or provide some arbitrary values:
es(M3$N2457$x, model=modelType(ourModel), h=18, holdout=FALSE, initial=1500, silent="graph")
## Time elapsed: 0.1 seconds
## Model estimated: ETS(MAN)
## Persistence vector g:
## alpha beta
## 0.117 0.000
## Initial values were optimised.
## 5 parameters were estimated in the process
## Residuals standard deviation: 0.435
## Cost function type: MSE; Cost function value: 1936550.947
##
## Information criteria:
## AIC AICc BIC
## 2001.144 2001.695 2014.869
Using some other parameters may lead to a completely different model and forecasts:
es(M3$N2457$x, h=18, holdout=TRUE, cfType="aMSTFE", bounds="a", ic="BIC", intervals=TRUE)
## Time elapsed: 0.38 seconds
## Model estimated: ETS(MNN)
## Persistence vector g:
## alpha
## 0.08
## Initial values were optimised.
## 3 parameters were estimated in the process
## Residuals standard deviation: 0.42
## Cost function type: aMSTFE; Cost function value: 246.291
##
## Information criteria:
## AIC AICc BIC
## 25551.52 25556.16 25690.55
## 95% parametric prediction intervals were constructed
## 72% of values are in the prediction interval
## Forecast errors:
## MPE: 33.3%; Bias: 90.4%; MAPE: 43.3%; SMAPE: 56.3%
## MASE: 3.232; sMAE: 131.9%; RelMAE: 1.381; sMSE: 277.6%
You can play around with all the available parameters to see how they affect the final model.
In order to combine forecasts we need to use the letter “C”:
es(M3$N2457$x, model="CCN", h=18, holdout=TRUE, silent="graph")
## Estimation progress: 10%20%30%40%50%60%70%80%90%100%... Done!
## Time elapsed: 0.96 seconds
## Model estimated: ETS(CCN)
## Initial values were optimised.
## Residuals standard deviation: 1408.59
## Cost function type: MSE
##
## Information criteria:
## Combined AICc
## 1647.651
## Forecast errors:
## MPE: 27.8%; Bias: 88.4%; MAPE: 40.5%; SMAPE: 50.8%
## MASE: 3.005; sMAE: 122.6%; RelMAE: 1.284; sMSE: 249.9%
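One common way to combine forecasts from several models is to weight each model by its information criterion, so that models with a lower value contribute more. The sketch below illustrates this idea with made-up numbers. Whether es() uses exactly this weighting scheme is an assumption here, and the AICc values and point forecasts are hypothetical.

```r
# Hypothetical AICc values and point forecasts of three candidate models
aiccValues <- c(ANN = 1690.2, AAN = 1687.7, MNN = 1648.4)
pointForecasts <- c(ANN = 8200, AAN = 8700, MNN = 8600)

# Weights decay exponentially with the distance from the lowest AICc
delta <- aiccValues - min(aiccValues)
icWeights <- exp(-0.5 * delta) / sum(exp(-0.5 * delta))

# The combined forecast is the weighted average of the individual ones
combinedForecast <- sum(icWeights * pointForecasts)
print(round(icWeights, 4))
print(combinedForecast)
```

With a large gap in AICc, almost all the weight goes to the best model, so the combination stays close to its forecast.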
Model selection from a specified pool and combination of forecasts are called, respectively, using:
es(M3$N2457$x, model=c("ANN","AAN","AAdN","ANA","AAA","AAdA"), h=18, holdout=TRUE, silent="graph")
## Estimation progress: 17%33%50%67%83%100%... Done!
## Time elapsed: 0.61 seconds
## Model estimated: ETS(AAN)
## Persistence vector g:
## alpha beta
## 0 0
## Initial values were optimised.
## 5 parameters were estimated in the process
## Residuals standard deviation: 1410.988
## Cost function type: MSE; Cost function value: 1888265.077
##
## Information criteria:
## AIC AICc BIC
## 1687.037 1687.697 1699.911
## Forecast errors:
## MPE: 27.7%; Bias: 88.2%; MAPE: 40.9%; SMAPE: 51.2%
## MASE: 3.022; sMAE: 123.3%; RelMAE: 1.291; sMSE: 251.8%
es(M3$N2457$x, model=c("CCC","ANN","AAN","AAdN","ANA","AAA","AAdA"), h=18, holdout=TRUE, silent="graph")
## Estimation progress: 17%33%50%67%83%100%... Done!
## Time elapsed: 0.61 seconds
## Model estimated: ETS(CCC)
## Initial values were optimised.
## Residuals standard deviation: 1361.192
## Cost function type: MSE
##
## Information criteria:
## Combined AICc
## 1688.456
## Forecast errors:
## MPE: 25.6%; Bias: 86.1%; MAPE: 39.9%; SMAPE: 49.3%
## MASE: 2.939; sMAE: 119.9%; RelMAE: 1.256; sMSE: 241.7%
Now let’s introduce some artificial exogenous variables:
x <- cbind(rnorm(length(M3$N2457$x),50,3),rnorm(length(M3$N2457$x),100,7))
and fit a model with all the exogenous variables first:
es(M3$N2457$x, model="ZZZ", h=18, holdout=TRUE, xreg=x)
## Time elapsed: 0.43 seconds
## Model estimated: ETSX(MNN)
## Persistence vector g:
## alpha
## 0.148
## Initial values were optimised.
## 5 parameters were estimated in the process
## Residuals standard deviation: 0.413
## Xreg coefficients were estimated in a normal style
## Cost function type: MSE; Cost function value: 1258845.079
##
## Information criteria:
## AIC AICc BIC
## 1647.707 1648.367 1660.581
## Forecast errors:
## MPE: 24.4%; Bias: 86.8%; MAPE: 39.7%; SMAPE: 48.6%
## MASE: 2.912; sMAE: 118.8%; RelMAE: 1.244; sMSE: 241.6%
or construct a model with selected exogenous variables (based on IC):
es(M3$N2457$x, model="ZZZ", h=18, holdout=TRUE, xreg=x, xregDo="select")
## Time elapsed: 0.35 seconds
## Model estimated: ETS(MNN)
## Persistence vector g:
## alpha
## 0.145
## Initial values were optimised.
## 3 parameters were estimated in the process
## Residuals standard deviation: 0.415
## Cost function type: MSE; Cost function value: 1288657.07
##
## Information criteria:
## AIC AICc BIC
## 1647.978 1648.413 1658.277
## Forecast errors:
## MPE: 26.3%; Bias: 87%; MAPE: 39.8%; SMAPE: 49.4%
## MASE: 2.944; sMAE: 120.1%; RelMAE: 1.258; sMSE: 242.7%
or the one with the updated xreg:
ourModel <- es(M3$N2457$x, model="ZZZ", h=18, holdout=TRUE, xreg=x, updateX=TRUE)
If we want to check whether lagged x can be used for forecasting purposes, we can use the xregExpander() function:
es(M3$N2457$x, model="ZZZ", h=18, holdout=TRUE, xreg=xregExpander(x), xregDo="select")
## Time elapsed: 0.74 seconds
## Model estimated: ETSX(MNN)
## Persistence vector g:
## alpha
## 0.15
## Initial values were optimised.
## 4 parameters were estimated in the process
## Residuals standard deviation: 0.41
## Xreg coefficients were estimated in a normal style
## Cost function type: MSE; Cost function value: 1258903.036
##
## Information criteria:
## AIC AICc BIC
## 1645.712 1646.147 1656.011
## Forecast errors:
## MPE: 26%; Bias: 86.9%; MAPE: 39.8%; SMAPE: 49.2%
## MASE: 2.926; sMAE: 119.4%; RelMAE: 1.25; sMSE: 236.2%
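The idea of expanding a variable into its lagged and lead versions can be sketched in base R as follows. The lagMatrix() helper is hypothetical and only illustrates the concept. It is not the implementation of xregExpander(), and it simply leaves NA where a shifted value is unavailable.

```r
# Hypothetical helper: build shifted copies of a vector.
# Negative shifts are lags (past values), positive shifts are leads.
lagMatrix <- function(x, lags = c(-1, 0, 1)) {
  n <- length(x)
  stopifnot(all(abs(lags) < n))
  shifted <- sapply(lags, function(k) {
    if (k < 0) {
      c(rep(NA, -k), x[seq_len(n + k)])   # value observed |k| steps earlier
    } else if (k > 0) {
      c(x[(k + 1):n], rep(NA, k))         # value observed k steps later
    } else {
      x
    }
  })
  colnames(shifted) <- paste0("shift", lags)
  shifted
}

expanded <- lagMatrix(1:5)
print(expanded)
```

Each column could then be passed as a separate exogenous variable, leaving it to the selection step to decide which shifts are useful.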
If we are confused about the type of the estimated model, the formula() function will help us:
formula(ourModel)
## [1] "y[t] = (l[t-1] + b[t-1]) * exp(a1[t-1] * x1[t] + a2[t-1] * x2[t]) * e[t]"
A feature available since 2.1.0 is fitting an ets() model and then using its parameters in es():
etsModel <- forecast::ets(M3$N2457$x)
esModel <- es(M3$N2457$x, model=etsModel, h=18)
The point forecasts in the majority of cases should be the same, but the prediction intervals may be different (especially if the error term is multiplicative):
forecast(etsModel,h=18,level=0.95)
## Point Forecast Lo 95 Hi 95
## Aug 1992 8619.214 1215.16444 16023.26
## Sep 1992 8674.340 1086.29318 16262.39
## Oct 1992 8729.467 958.84520 16500.09
## Nov 1992 8784.593 832.69344 16736.49
## Dec 1992 8839.719 707.72264 16971.71
## Jan 1993 8894.845 583.82787 17205.86
## Feb 1993 8949.971 460.91328 17439.03
## Mar 1993 9005.097 338.89106 17671.30
## Apr 1993 9060.223 217.68056 17902.77
## May 1993 9115.349 97.20746 18133.49
## Jun 1993 9170.475 -22.59688 18363.55
## Jul 1993 9225.602 -141.79599 18593.00
## Aug 1993 9280.728 -260.44882 18821.90
## Sep 1993 9335.854 -378.61019 19050.32
## Oct 1993 9390.980 -496.33117 19278.29
## Nov 1993 9446.106 -613.65942 19505.87
## Dec 1993 9501.232 -730.63954 19733.10
## Jan 1994 9556.358 -847.31330 19960.03
forecast(esModel,h=18,level=0.95)
## Point forecast Lower bound (2.5%) Upper bound (97.5%)
## Aug 1992 8619.214 3651.070 20145.95
## Sep 1992 8674.340 3749.634 20954.14
## Oct 1992 8729.467 3785.703 22242.53
## Nov 1992 8784.593 3751.308 23007.18
## Dec 1992 8839.719 3785.024 23182.85
## Jan 1993 8894.845 3743.804 23952.08
## Feb 1993 8949.971 3801.560 25118.78
## Mar 1993 9005.097 3868.788 26087.47
## Apr 1993 9060.223 3896.555 26881.35
## May 1993 9115.349 4017.338 27289.40
## Jun 1993 9170.475 4054.433 28671.74
## Jul 1993 9225.602 4076.137 29672.31
## Aug 1993 9280.728 4048.523 29589.86
## Sep 1993 9335.854 4011.002 30739.02
## Oct 1993 9390.980 4059.053 31987.42
## Nov 1993 9446.106 4083.420 32661.74
## Dec 1993 9501.232 4160.683 33274.27
## Jan 1994 9556.358 4174.078 34747.98
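The difference between the two sets of intervals can be checked directly from the printed numbers. The sketch below takes the first forecast row (Aug 1992) from each table and compares the distance from the point forecast to each bound; the variable names are illustrative.

```r
# First row (Aug 1992) copied from the two tables above
pointForecast <- 8619.214
etsBounds <- c(lower = 1215.16444, upper = 16023.26)
esBounds <- c(lower = 3651.070, upper = 20145.95)

# Distance from the point forecast to each bound
etsGaps <- c(pointForecast - etsBounds[["lower"]], etsBounds[["upper"]] - pointForecast)
esGaps <- c(pointForecast - esBounds[["lower"]], esBounds[["upper"]] - pointForecast)

print(round(etsGaps, 2))   # both distances are practically equal
print(round(esGaps, 2))    # the upper distance is much larger
```

The ets() bounds are symmetric around the point forecast and eventually become negative at longer horizons, while the es() bounds are asymmetric and stay positive in the table above.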