Multiple temporal aggregation: the story so far. Part I

April 27, 2017

Over the last few years I have been working (with my co-authors!) on the idea of Multiple Temporal Aggregation (MTA) for time series forecasting. Several papers have been published that introduce the idea, develop it further, or test its effectiveness for forecasting.

In this series of blog posts I will try to summarise the progress so far, and highlight ways that you can use it. This first post will summarise the papers so far and give an overview of the main findings. Later posts will focus on explaining how MTA works.

The key points behind MTA are the following:

  • It is a radically different approach to time series modelling, recognising that the data sampling frequency may not be the best for a given modelling purpose.
  • A time series is modelled simultaneously at multiple temporal aggregation levels that can be easily generated from the original data. At each level an appropriate model is fit, focusing on the components of the series that are strengthened by temporal aggregation.
  • If forecasting is the objective, then the produced forecast reconciles the information from all these models. This makes the forecast robust to modelling uncertainty and lessens the importance of model selection.
  • The resulting forecasts have been shown to be reliable and typically outperform the conventional modelling approach.
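
To make the points above concrete, here is a minimal sketch of the general idea: build several non-overlapping temporal aggregation levels from one series, forecast at each level, and combine the results. It is not the MAPA or Temporal Hierarchies algorithm from the papers. The function names, the fixed smoothing parameter, the simple averaging of forecasts and the toy data are all assumptions made for illustration.

```python
from statistics import mean


def aggregate(series, k):
    """Sum non-overlapping buckets of length k.

    Leftover observations are dropped from the start of the series,
    so the most recent data are always kept.
    """
    start = len(series) % k
    trimmed = series[start:]
    return [sum(trimmed[i:i + k]) for i in range(0, len(trimmed), k)]


def ses_forecast(series, alpha=0.3):
    """One-step-ahead simple exponential smoothing with a fixed alpha."""
    if not series:
        raise ValueError("series is too short for this aggregation level")
    level = series[0]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level


def mta_forecast(series, levels=(1, 2, 3, 4), alpha=0.3):
    """Average the per-period forecasts obtained at each aggregation level."""
    per_period = []
    for k in levels:
        aggregated = aggregate(series, k)
        # Dividing by k brings the aggregate forecast back to the original scale.
        per_period.append(ses_forecast(aggregated, alpha) / k)
    return mean(per_period)


# Toy monthly data, invented purely for illustration.
monthly = [10, 12, 14, 13, 15, 17, 16, 18, 20, 19, 21, 23]
print(aggregate(monthly, 3))   # quarterly totals
print(mta_forecast(monthly))   # combined one-step-ahead forecast per month
```

The papers combine information across levels in more refined ways (for example, MAPA works with the estimated components of exponential smoothing models), so treat the plain average here only as a stand-in for that step.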

Table 1 summarises our contributions on MTA so far (follow the links to access the papers). We have also released two R packages that implement MTA: MAPA and thief. The former implements, as the name suggests, MAPA, while the latter provides code to use Temporal Hierarchies.

Table 1. Papers on MTA

| Paper | Summary |
|---|---|
| Kourentzes et al. 2014. Improving forecasting by estimating time series structural components across multiple frequencies. | The initial paper on MTA modelling. It introduces the Multiple Aggregation Prediction Algorithm (MAPA) and demonstrates its superior performance on the well-known M3 competition. |
| Petropoulos and Kourentzes 2014. Forecast combinations for intermittent demand. | Expands MAPA for the case of intermittent demand. |
| Kourentzes and Petropoulos 2016. Forecasting with multivariate temporal aggregation: The case of promotional modelling. | Expands MAPA for promotional modelling purposes at Stock Keeping Unit level. |
| Barrow and Kourentzes 2016. Distributions of forecasting errors of forecast combinations: implications for inventory management. | Provides evidence of very strong performance of MAPA over established benchmarks for demand forecasting and inventory management purposes. |
| Athanasopoulos et al. 2017. Forecasting with temporal hierarchies. | Introduces a general framework for MTA: Temporal Hierarchies that allows use of any model/method to produce forecasts at each level. |
| Kourentzes et al. 2017. Demand forecasting by temporal aggregation: using optimal or multiple aggregation levels? | Demonstrates that MTA modelling is more robust to uncertainty than modelling either using the original data or using a single (optimal) temporal aggregation level. |

To give you an idea of the reported improvements, I have collated some of the results from the papers above. The best forecast in each column, in all tables, is highlighted in boldface. Table 2 provides a summary for the quarterly and monthly M3 datasets. The benchmarks are the Exponential Smoothing (ETS) family of models, with automatic model selection (via AICc), and Theta, the best performing method in the original M3 competition, a position it held for almost 15 years! In this case both MAPA and Temporal Hierarchies make use of the ETS family of models, so their results are directly comparable with the ETS row and give you a feeling for the improvement that MTA provides over conventional time series forecasting.

Tables 3 and 4 provide results for a number of real datasets. Table 4 also provides results on a variety of simulated ARIMA series. The detailed results can be found in the respective papers. In all cases MAPA is better than, or at least as good as, the various benchmarks. Table 5 provides results on real series that include promotional periods. There are two comparisons: forecasts without and with promotional information. In both cases the MTA-based forecasts (MAPA) are on average the most accurate.

Table 2. sMAPE results on M3 quarterly and monthly data¹

| Forecast | Quarterly set | Monthly set |
|---|---|---|
| Exponential Smoothing (ETS) | 9.94% | 14.45% |
| Theta (M3 competition)² | 8.96% | 13.85% |
| MAPA (Kourentzes et al. 2014) | 9.58% | 13.69% |
| Temporal Hierarchies (Athanasopoulos et al. 2017) | 9.70% | 13.61% |

¹ Papers provide results on more robust metrics!
² Best performance in the original M3 competition.
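
For readers unfamiliar with the metric in Table 2, the sketch below computes a symmetric MAPE in percent. It assumes the variant commonly associated with the M3 competition, the mean of 200·|actual − forecast| / (|actual| + |forecast|). The papers may differ in detail, and the handling of a zero denominator here is my own assumption.

```python
def smape(actuals, forecasts):
    """Symmetric MAPE in percent, averaged over all forecast points."""
    if len(actuals) != len(forecasts) or not actuals:
        raise ValueError("inputs must be non-empty and of equal length")
    terms = []
    for a, f in zip(actuals, forecasts):
        denominator = abs(a) + abs(f)
        # Assumption: a point where actual and forecast are both zero counts as zero error.
        terms.append(0.0 if denominator == 0 else 200.0 * abs(a - f) / denominator)
    return sum(terms) / len(terms)


# Toy numbers, invented for illustration.
print(smape([100, 100], [110, 90]))
```

Note that an over-forecast and an under-forecast of the same absolute size do not contribute equally, which is one reason the papers also report more robust metrics.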

Table 3. Scaled RMSE results on Fast Moving Consumer Goods sales (Barrow and Kourentzes, 2016)

| Forecast | 1-step ahead | 3-steps ahead | 5-steps ahead |
|---|---|---|---|
| Naive | 0.882 | 0.900 | 0.919 |
| ETS | 0.677 | 0.688 | 0.711 |
| AR | 0.707 | 0.719 | 0.737 |
| ARIMA | 1.446 | 0.701 | 0.721 |
| Theta | 0.674 | 0.685 | 0.705 |
| MAPA | 0.668 | 0.670 | 0.687 |

Table 4. Average Relative MAE on simulated and real data (Kourentzes et al., 2017)

| Forecast | Simulated ARIMA | Manufacturing | Call centre |
|---|---|---|---|
| Single Exponential Smoothing (SES) | 1.000 | 1.000 | 1.000 |
| Exponential Smoothing (ETS) | 0.985 | 1.011 | 1.005 |
| Optimal Temporal Aggregation & SES | 0.974 | 0.999 | 1.080 |
| MAPA | 0.971 | 0.994 | 0.979 |

Table 5. Scaled MAE results on SKUs with promotions (Kourentzes and Petropoulos, 2016)

| Forecast | 4-steps ahead | 8-steps ahead | 12-steps ahead |
|---|---|---|---|
| Naive | 0.743 | 0.818 | 0.704 |
| ETS | 0.704 | 0.774 | 0.701 |
| MAPA | 0.679 | 0.754 | 0.736 |
| Regression + Promotional | 0.611 | 0.659 | 0.714 |
| ETS + Promotional | 0.642 | 0.627 | 0.543 |
| MAPA + Promotional | 0.525 | 0.521 | 0.515 |
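
Table 4 reports Average Relative MAE, where values below 1 mean a method beats the benchmark. The sketch below assumes the usual definition, the geometric mean across series of each method's MAE divided by the benchmark's MAE. It also assumes SES is the benchmark, as the 1.000 row suggests. The function names and toy numbers are mine, not the paper's.

```python
import math


def mae(actuals, forecasts):
    """Mean absolute error for one series."""
    return sum(abs(a - f) for a, f in zip(actuals, forecasts)) / len(actuals)


def avg_rel_mae(method_maes, benchmark_maes):
    """Geometric mean of per-series MAE ratios (method / benchmark)."""
    if len(method_maes) != len(benchmark_maes) or not method_maes:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(v <= 0 for v in (*method_maes, *benchmark_maes)):
        raise ValueError("MAE values must be strictly positive")
    # Averaging logs and exponentiating gives the geometric mean.
    log_ratios = [math.log(m / b) for m, b in zip(method_maes, benchmark_maes)]
    return math.exp(sum(log_ratios) / len(log_ratios))


# Toy per-series MAEs, invented for illustration.
print(avg_rel_mae([0.9, 1.1, 0.8], [1.0, 1.0, 1.0]))
```

The geometric mean treats a halving and a doubling of the error as offsetting each other, so a few series with extreme ratios do not dominate the summary.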

The main argument in all papers is that MTA improves forecast accuracy by mitigating modelling uncertainty. As we will see, this comes at no additional data cost and with relatively limited additional computation. An added benefit, which is not very evident from the summary tables provided here, is that MTA forecasts are reliable for both short- and long-term forecasting, providing a way to reconcile operational, tactical and strategic planning.

Unpublished results on different applications provide a similar picture in terms of accuracy. There is also evidence that MTA can strengthen statistical tests, as the initial results of this experiment show. However, all this is ongoing research, so until a full analysis is conducted and the results are peer reviewed, I would take these with a pinch of salt!

In the following blog posts I will explain how MTA works and elaborate on the results from the various papers.

Multiple Temporal Aggregation: the story so far: Part I; Part II.
