Complex Exponential Smoothing

By | October 22, 2016

A couple of days ago my ex-student Ivan Svetunkov successfully defended his PhD. My thanks to both Siem Jan Koopman and Rebecca Killick, who were his examiners and whose questions led to a very interesting discussion.

Ivan’s PhD topic is a new model, the Complex Exponential Smoothing (CES). In this post I will very briefly demonstrate the key advantages of the proposed model over conventional exponential smoothing. I will not go into the mathematical details, which can be found in this working paper. Furthermore, a CES implementation is available in the smooth package for R.

One of the motivating ideas behind CES is that recent research has demonstrated that the standard forms of exponential smoothing may not be able to capture all the patterns observed in real time series. In particular, the interpretation of the level and trend components, once one goes into the details, is elusive and arguably the distinction is ad hoc, perhaps creating more modelling woes than it solves. Instead, CES avoids this separation and introduces a new flexible component that captures various types of information, beyond the level component, as necessary to achieve a good fit to the given data.

Effectively, CES is able to model a wide spectrum of level and trend time series without having to switch between different models, as is the case for exponential smoothing. CES is able to model time series that are either stationary or non-stationary in the mean, whereas exponential smoothing is capable of modelling only non-stationary ones. Because CES does not switch between level and trend models, but smoothly “slides” between the two types, it not only captures more behaviours, but also avoids any abrupt changes in successive forecasts of a time series. For example, as new data become available and new forecasts are generated, a different form of exponential smoothing may become optimal, for instance according to the Akaike Information Criterion, and the new forecasts may become substantially different. In turn, this may cause issues for the users of the forecasts, as they become very erratic. The Multiple Aggregation Prediction Algorithm (MAPA) addressed this by strengthening the identification and estimation of exponential smoothing, but at the cost of multiple model fits. CES does so by proposing an alternative to modelling series as if they are separable into level and trend components.

The following example illustrates the point. I have simulated a simple exponential smoothing (level) and a Holt linear trend exponential smoothing (trend) time series. Both time series are 8 years long (96 monthly observations). For the last 4 years I perform a rolling origin evaluation experiment, so at least 4 years of data are used to fit the first batch of models. At each forecast origin the models are optimised and the appropriate forecasts are produced. The fitting sample is then extended by one observation and the process is repeated, until there is no remaining sample. Fig. 1 and Fig. 2 visualise the process for the level and trend series respectively.
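The rolling origin procedure described above can be sketched in a few lines of Python. This is only a minimal illustration under my own assumptions, not the code behind the figures: it uses plain simple exponential smoothing rather than CES or the full ETS family, picks the smoothing parameter by a grid search on in-sample squared errors, and assumes a 12-step forecast horizon. The function names, the simulation parameters and the horizon are all hypothetical.

```python
import random


def simulate_level_series(n=96, alpha=0.3, level=100.0, sigma=5.0, seed=42):
    """Simulate a level series: y_t = l_{t-1} + e_t, l_t = l_{t-1} + alpha * e_t."""
    rng = random.Random(seed)
    series = []
    for _ in range(n):
        error = rng.gauss(0.0, sigma)
        series.append(level + error)
        level += alpha * error
    return series


def ses_fit(y, alpha):
    """Run simple exponential smoothing; return (final level, one-step-ahead SSE)."""
    level = y[0]  # assumption: initialise the level with the first observation
    sse = 0.0
    for obs in y[1:]:
        error = obs - level
        sse += error ** 2
        level += alpha * error
    return level, sse


def ses_forecast(y, horizon):
    """Choose alpha by grid search on in-sample SSE, then produce a flat forecast."""
    grid = [i / 100 for i in range(1, 100)]
    best_alpha = min(grid, key=lambda a: ses_fit(y, a)[1])
    level, _ = ses_fit(y, best_alpha)
    return [level] * horizon


def rolling_origin(y, min_train=48, horizon=12):
    """Refit at every origin, forecast, and record the MAE on the remaining sample."""
    maes = []
    for origin in range(min_train, len(y)):
        train = y[:origin]
        test = y[origin:origin + horizon]  # shorter near the end of the series
        forecast = ses_forecast(train, len(test))
        maes.append(sum(abs(a - f) for a, f in zip(test, forecast)) / len(test))
    return maes


series = simulate_level_series()
maes = rolling_origin(series)
print(f"origins: {len(maes)}, average MAE: {sum(maes) / len(maes):.2f}")
```

With 96 observations and a minimum fitting sample of 48, the loop visits 48 forecast origins, extending the fitting sample by one observation each time.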


Fig. 1: CES and ETS rolling forecasts on test set for the level time series.


Fig. 2: CES and ETS rolling forecasts on test set for the trend time series.

Observe the erratic changes in the forecasts in the ETS case. These are caused by switching between level and trend models, which are not always correctly identified. For the level series the misspecification occurs 32% of the time, whereas for the trend series it occurs 24% of the time. The figures provide the average Mean Absolute Error (MAE) across forecast origins, showing that CES is more accurate. Of course, these are just two examples, and I refer you to the working paper for a more rigorous evaluation of CES.
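To give a feel for how selection by information criterion can flip between a level and a trend model from one origin to the next, here is a rough Python sketch. It is a simplification under my own assumptions and not the ETS procedure used for the figures: only two candidate models are compared, parameters come from a coarse grid, initial states are set heuristically, and the AIC is computed from the one-step-ahead squared errors assuming Gaussian errors (constants dropped). All names, parameter counts and settings are hypothetical.

```python
import math
import random


def simulate_level_series(n=96, alpha=0.3, level=100.0, sigma=5.0, seed=1):
    """Simulate a level-only series, so that 'level' is the correct model."""
    rng = random.Random(seed)
    series = []
    for _ in range(n):
        error = rng.gauss(0.0, sigma)
        series.append(level + error)
        level += alpha * error
    return series


def sse_level(y, alpha):
    """One-step-ahead SSE of simple exponential smoothing."""
    level, sse = y[0], 0.0
    for obs in y[1:]:
        error = obs - level
        sse += error ** 2
        level += alpha * error
    return sse


def sse_trend(y, alpha, beta):
    """One-step-ahead SSE of an additive trend model in error-correction form."""
    level = y[0]
    trend = (y[12] - y[0]) / 12  # assumption: heuristic initial slope
    sse = 0.0
    for obs in y[1:]:
        error = obs - (level + trend)
        sse += error ** 2
        level = level + trend + alpha * error
        trend = trend + beta * error
    return sse


def aic(sse, n, n_params):
    """Gaussian AIC up to an additive constant."""
    return n * math.log(sse / n) + 2 * n_params


def select_model(y):
    """Return 'level' or 'trend', whichever has the lower AIC on this sample."""
    grid = [i / 20 for i in range(1, 20)]
    n = len(y) - 1  # number of one-step-ahead errors
    best_level = min(sse_level(y, a) for a in grid)
    # assumption: restrict beta <= alpha, a commonly used parameter region
    best_trend = min(sse_trend(y, a, b) for a in grid for b in grid if b <= a)
    # assumed parameter counts: smoothing parameters + initial states + variance
    return "level" if aic(best_level, n, 3) <= aic(best_trend, n, 5) else "trend"


series = simulate_level_series()
choices = [select_model(series[:origin]) for origin in range(48, 96)]
switches = sum(a != b for a, b in zip(choices, choices[1:]))
wrong = choices.count("trend") / len(choices)
print(f"trend selected at {wrong:.0%} of origins, {switches} switches between models")
```

Every switch counted here corresponds to a forecast origin at which the shape of the forecast can change abruptly, from flat to trended or back, which is the behaviour that a single model such as CES avoids.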

An advantage of CES is that it sidesteps model selection. The same model is adequate for both level and trend series. This is not possible with conventional time series models. We argue that this simplifies the forecasting process substantially, both in terms of complexity and computational requirements. Fig. 3 illustrates the point by providing the total computation time for each example above. ETS requires substantially more time, as 19 different models are trialed at each forecast origin.


Fig. 3: Total computation time for each example.

Ivan looked into the properties of the basic CES model, deriving likelihood, parameter bounds and variance expressions. He then proceeded to extend the model to the seasonal case. The conventional repertoire of series that exponential smoothing models is tackled by only two variants of CES, non-seasonal and seasonal, which keeps the model selection challenge relatively simple. Obviously, we do not claim dominance of CES over ETS in every case, but the empirical results are favourable to CES, especially when the qualitative elements of the comparison (simplified or no model selection) are considered. Detailed results and descriptions are in two papers currently under review, which I hope I will be able to post here soon. Meanwhile, there are two presentations by Ivan that you may find helpful, about CES and seasonal CES.

If you want to trial CES for yourselves, I recommend exploring the smooth package for R, which is available on CRAN.

Finally, let me congratulate Ivan on successfully defending his PhD!
