In the previous post we saw how the Multiple Aggregation Prediction Algorithm (MAPA) implements the ideas of MTA. We also saw that it has some limitations, particularly that it requires splitting forecasts into subcomponents (level, trend and seasonality). Although some forecasting methods provide such outputs naturally, for example Exponential Smoothing and Theta, others do not. More crucially, manually adjusted forecasts do not either, and even though it is possible to use MAPAx for that, a simpler approach would be welcome. This is where Temporal Hierarchies, an alternative way to implement MTA, become quite useful.

Temporal Hierarchies borrow many ideas from cross-sectional hierarchies and organise the different temporal aggregation levels as a hierarchy. Consider for example four quarterly observations. The first two quarters constitute the first half-year, and the last two quarters constitute the second half-year. The two half-years add up to make a complete year. These connections imply a hierarchy, much like sales of different packet sizes of a product in a supermarket can be organised in a product hierarchy. However, temporal hierarchies have one key advantage over cross-sectional ones: they are uniquely specified by the problem at hand. Suppose I am given monthly data to forecast. There is a single hierarchy across temporal aggregation levels, much like in the quarterly example before, that I need to deal with, irrespective of the item I need to forecast, the way I got the forecast or the properties of the time series. Once this unique hierarchy is defined (and all the data come from temporally aggregated views of the original time series), all that is left to do is to forecast across the hierarchy, i.e. at all temporal aggregation levels, and reconcile the forecasts. The act of reconciliation brings together information from all modelling levels, with the MTA benefits discussed in the previous posts.

Some hierarchies are more complex than others. The quarterly hierarchy, from the example above, is a very simple three-level hierarchy (quarters, half-years, years). A monthly hierarchy is more complex, because there is more than one way to reach yearly data from monthly data. For example, one could aggregate by 2 months, then these by 2 (4-monthly level), and then that by 3 (yearly level). Alternatively, one could aggregate to quarterly data, then half-yearly and then yearly. The two aggregation paths can happen in parallel. The temporal hierarchy is made up of all possible paths. Note that in contrast to MAPA, levels that do not fully add up to a yearly time series are excluded (intuitively they do not belong in any path from the bottom disaggregate level to the top yearly level). This has the advantage that the forecasting model or method does not need to deal with series that may have fractional seasonality. Nonetheless, this is an interesting future research avenue.
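To make the rule concrete: a level belongs to the temporal hierarchy only if its bucket size divides the number of observations in a year exactly. The snippet below is a minimal sketch of that rule; the function name `hierarchy_levels` is mine, for illustration only, and is not taken from any package.

```python
def hierarchy_levels(periods_per_year):
    """Return the bucket sizes k (in base periods) that add up exactly to a year."""
    return [k for k in range(1, periods_per_year + 1) if periods_per_year % k == 0]


# Monthly data: monthly, 2-monthly, quarterly, 4-monthly, half-yearly and yearly levels.
monthly_levels = hierarchy_levels(12)

# Quarterly data: quarters, half-years and years.
quarterly_levels = hierarchy_levels(4)

# Bucket sizes that do not divide 12 are the 'in-between' levels left out of the hierarchy.
excluded_monthly = [k for k in range(1, 13) if k not in monthly_levels]

print(monthly_levels)
print(quarterly_levels)
print(excluded_monthly)
```

For monthly data this keeps bucket sizes 1, 2, 3, 4, 6 and 12, which cover both aggregation paths described above (1, 2, 4, 12 and 1, 3, 6, 12).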

The following interactive plot provides the temporal hierarchies for common types of time series. Observe that many have multiple pathways to the top yearly level (for example, monthly time series), and some are very simple hierarchies (for example, days in week). Use the highlight option to easily visualise the various pathways. Once visualised, the analogies with cross-sectional hierarchies are apparent.

To forecast we need to populate every level of the hierarchy with a forecast. So for example, for the quarterly hierarchy we need to provide 3 sets of forecasts, one for the quarterly time series, one for the semi-yearly and one for the yearly. Imagine that each hierarchy depicts one year's worth of forecasts, but obviously we can produce the same hierarchy for the next year and so on. Mathematically this is just another column of forecasts to be handled by the hierarchy, so in fact it is trivial to do. But an implication is that forecasts are produced in horizons that are multiples of full years (and then any shorter horizons are used accordingly). People are more familiar with two specific cases of temporal hierarchies. One is when we need to produce a total figure over a period, for example for tactical/strategic forecasts. This is simply the bottom-up interpretation of temporal hierarchies: forecasts from the lowest level are summed to a higher level. The other alternative is to produce a forecast and then use a 'profile' to split this further. In supply chain forecasting and call centres this is very common, in breaking weekly forecasts into daily profiles, or daily forecasts into intra-daily profiles. This is merely the top-down interpretation of temporal hierarchies.
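The two familiar special cases can be sketched in a few lines. All numbers below are made up for illustration, and the helper names `bottom_up` and `top_down` are hypothetical rather than part of any package.

```python
def bottom_up(bottom_forecasts, k):
    """Sum non-overlapping buckets of k bottom-level forecasts."""
    return [sum(bottom_forecasts[i:i + k])
            for i in range(0, len(bottom_forecasts), k)]


def top_down(total_forecast, profile):
    """Split an aggregate forecast using a profile of proportions."""
    scale = sum(profile)  # normalise, in case the profile does not sum to 1
    return [total_forecast * p / scale for p in profile]


# Bottom-up: one year of (made-up) quarterly forecasts summed upwards.
quarterly = [110.0, 95.0, 120.0, 135.0]
half_yearly = bottom_up(quarterly, 2)
yearly = bottom_up(quarterly, 4)

# Top-down: a (made-up) weekly forecast split with an assumed day-of-week profile.
weekly_total = 700.0
day_profile = [0.2, 0.2, 0.2, 0.2, 0.1, 0.05, 0.05]
daily = top_down(weekly_total, day_profile)

print(half_yearly, yearly)
print(daily)
```

Both cases use forecasts from a single level only; the reconciliation discussed next differs in that it uses the forecasts from every level.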

**Forecasting with Temporal Hierarchies**

You may have already noticed that there is nothing to restrict the source of forecasts. They can be based on some statistical model, judgement, a mix of both, differ amongst levels, or come from whatever other exotic source. This is a substantial advantage over MAPA, and temporal hierarchies provide a flexible MTA foundation. In reconciling the forecasts there are a couple of complications that we deal with in the paper (the scale and variance of the forecasts are different, which needs to be taken into account during reconciliation). I mentioned earlier that temporal hierarchies are unique. This substantially simplifies the solution, but I will not go into the mathematical details here.
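To give a flavour of what reconciliation does without the full mathematics, here is a rough sketch for the quarterly hierarchy using plain, unweighted least squares. This is a deliberate simplification and not the method in the paper: it ignores exactly the scale and variance differences mentioned above. The base forecasts are made up, and `solve` and `reconcile_ols` are illustrative helper names.

```python
# Summing matrix for one year of the quarterly hierarchy.
# Rows: year, half-year 1, half-year 2, then the four quarters.
S = [
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [list(map(float, row)) + [float(b[i])] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def reconcile_ols(base):
    """Project base forecasts for all levels onto the space of coherent forecasts."""
    n_bottom = len(S[0])
    rows = range(len(S))
    sts = [[sum(S[r][i] * S[r][j] for r in rows) for j in range(n_bottom)]
           for i in range(n_bottom)]
    sty = [sum(S[r][i] * base[r] for r in rows) for i in range(n_bottom)]
    bottom = solve(sts, sty)  # reconciled quarterly forecasts
    return [sum(s * b for s, b in zip(row, bottom)) for row in S]


# Made-up base forecasts that do not add up: the quarters sum to 460, the year says 500.
base = [500.0, 210.0, 260.0, 110.0, 95.0, 120.0, 135.0]
reconciled = reconcile_ols(base)
print([round(x, 2) for x in reconciled])
```

After reconciliation the quarters sum to the half-years and the half-years to the year, and every level has been adjusted using information from the others. Forecasts that already add up are left unchanged.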

In the following interactive plot you can choose from the usual time series I have been using as examples in this series of posts to produce base forecasts (conventionally built from a single level, in red) and Temporal Hierarchy Forecasts (THieF, in blue). I provide the forecasts across the various temporal aggregation levels permitted by the hierarchy. Observe how the information across the temporal aggregation levels is shared in the THieF forecasts to achieve better modelling of the series. You can also choose between three different forecasts: exponential smoothing, ARIMA and naive. The naive forecasts are quite illuminating in showing how the multiple views offered by THieF achieve superior results. The other two types of forecasts are quite illustrative as well.

I also provide the Mean Absolute Error (MAE) for the base and THieF forecasts for the disaggregate series. You will observe that on average THieF forecasts are more accurate. The gains improve at more aggregate levels. In the paper we demonstrate with simulations that in various scenarios of uncertainty (parameter, model) THieF performs better than, or at least as well as, base forecasts.
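For reference, MAE is the average absolute difference between actuals and forecasts. A minimal sketch follows, with made-up numbers that are not the values behind the plot:

```python
def mean_absolute_error(actuals, forecasts):
    """Average of the absolute forecast errors."""
    if len(actuals) != len(forecasts):
        raise ValueError("actuals and forecasts must have the same length")
    return sum(abs(a - f) for a, f in zip(actuals, forecasts)) / len(actuals)


# Made-up example: four observations and two competing sets of forecasts.
actuals = [100.0, 120.0, 90.0, 110.0]
forecasts_a = [90.0, 125.0, 100.0, 100.0]
forecasts_b = [96.0, 122.0, 95.0, 105.0]

print(mean_absolute_error(actuals, forecasts_a))
print(mean_absolute_error(actuals, forecasts_b))
```

A lower value means a more accurate forecast over the evaluated period.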

To sum up, forecasting with temporal hierarchies:

- offers a very flexible framework to implement MTA, with all its advantages;
- is independent of the source of forecasts, allowing different additional information to be provided at different levels, if available;
- has been shown to offer substantial gains in terms of accuracy over base forecasts, by blending the information available across temporal aggregation levels;
- provides reconciled short-term (disaggregate) and long-term (aggregate) forecasts, leading to aligned operational, tactical and strategic planning.

If you want to try it out we have released the thief package for R.

A final note on THieF. THieF and MAPA both perform very well and neither is a clear winner in terms of forecast accuracy alone. The two MTA alternatives handle information in a different way. MAPA also takes advantage of the 'in-between' levels that THieF excludes. The good performance of both, even though they have some key differences, is exciting: it gives further merit to MTA and offers some clear directions for future work!

Multiple Temporal Aggregation: the story so far: Part I; Part II; Part III; Part IV.

**Abstract**

The four major Scandinavian economies (Denmark, Finland, Sweden and Norway) have high workforce mobility and, depending on market dynamics, unemployment in one country can be influenced by conditions in the neighbouring ones. We provide evidence that Vector Autoregressive (VAR) modelling of unemployment between the four countries produces more accurate predictions than constructing independent forecasting models. However, given the dimensionality of the VAR model, its specification and estimation can become challenging, particularly when modelling unemployment across multiple factors. To overcome this we consider the hierarchical structure of unemployment in Scandinavia, looking at three dimensions: age, country and gender. This allows us to construct multiple complementary hierarchies, aggregating across each dimension. The resulting grouped hierarchy imposes a well-defined structure on the forecasting problem. By producing forecasts across the hierarchy, under the restriction that they are reconciled across the hierarchical structure, we provide an alternative way to establish connections between the time series that describe the four countries. We demonstrate that this approach is not only competitive with VAR modelling, but as each series is modelled independently, we can easily employ advanced forecasting models, in which case independent and VAR forecasts are substantially outperformed. Our results illustrate that there are three useful alternatives to model connections between series: directly through multivariate vector models, through the covariance of the prediction errors across a hierarchy of series, and through the implicit restrictions enforced by the hierarchical structure. We provide evidence of the performance of each, as well as their combination.

**Abstract**

In this paper we explore how judgment can be used to improve model selection for forecasting. We benchmark the performance of judgmental model selection against the statistical one, based on information criteria. Apart from the simple model choice approach, we also examine the efficacy of a judgmental model build approach, where experts are asked to decide on the existence of the structural components (trend and seasonality) of the time series. The sample consists of almost 700 participants who took part in a custom-designed laboratory experiment. The results suggest that humans perform model selection differently from statistics. When forecasting performance is assessed, individual judgmental model selection performs as well as, if not better than, statistical model selection. Simple combination of the statistical and judgmental selections and judgmental aggregation significantly outperform both statistical and judgmental selection.

**Abstract**

With thousands of call centres worldwide employing millions and serving billions of customers as a first point of contact, accurate scheduling and capacity planning of resources is important. Forecasts are required as inputs for such scheduling and planning in the short, medium and long term. Current approaches involve forecasting weekly demand and subsequent disaggregation into half-hourly, hourly and daily time buckets, as forecasts are required to support multiple decisions and plans. Once the weekly call volume forecasts are prepared, accounting for any seasonal variations, they are broken down into higher frequencies using appropriate proportions that mainly capture the intra-week and intra-day seasonality. Although this ensures reconciled forecasts across all levels, and therefore aligned decision making, it is potentially not optimal in terms of forecasting. On the other hand, producing forecasts at the highest available frequency, and aggregating to lower frequencies, may also not be ideal as very long lead-time forecasts may be required. A third option, which is more appropriate from a forecasting standpoint, is to produce forecasts at different levels using appropriate models for each. Although this has the potential to generate good forecasts, in terms of decision making the forecasts are not aligned, which may cause organisational problems. Recently, Kourentzes et al. (2014) proposed the Multiple Aggregation Prediction Algorithm (MAPA), where forecasting with multiple temporal aggregation (MTA) levels allows both accurate and reconciled forecasts. The main idea of MTA is to model a series at multiple aggregation levels separately, taking advantage of the information that is highlighted at each level, and subsequently combine the forecasts by using the implied temporal hierarchical structure. Athanasopoulos et al. (2017) proposed a more general MTA framework than MAPA, defining appropriate temporal hierarchies and reconciliation mechanisms, and thus providing an MTA forecasting framework that is very flexible and model independent, while retaining all the benefits of MAPA. Given the high frequency, multi-temporal nature of the forecast requirements and the subsequent planning associated with call centre arrival forecasting, MTA becomes a natural, yet unexplored, candidate for call centre forecasting. This work evaluates whether there are any benefits from temporal aggregation, both at the level of decision making and at the level of aggregation, in terms of forecast accuracy and operational efficiency. In doing so, various methods of disaggregation are considered when the decision level and the forecasting level differ, including methods which result in reconciled and unreconciled forecasts. The findings of this study will contribute to call centre management practice by proposing best approaches for forecasting call centre data at the various decision levels, taking into account accuracy and operational efficiency, but will also contribute to research on the use of temporal hierarchies in the area of high frequency time series data.

In this third post about modelling with Multiple Temporal Aggregation (MTA), I will explain how the Multiple Aggregation Prediction Algorithm (MAPA) works, which was the first incarnation of MTA for forecasting.

MAPA is quite simple in its logic:

- a time series is temporally aggregated into multiple levels, at each level strengthening and weakening various components of the time series, as discussed before;
- at each level an independent exponential smoothing (ETS) model is fit and its components are extracted;
- the ETS components are combined, using a few tricks (see the paper for details), to produce the final forecast, which borrows information from all levels.

In the original MAPA paper alternative combination approaches were trialed, where all temporal aggregation levels were given equal importance and combined through mean or median. For the seasonal component (which is the high frequency one), this causes an important issue: as the seasonality is filtered at the aggregate levels, it is effectively shrunk towards zero. Therefore, the combined seasonal component will be shrunk as well. Originally this was addressed by using a simple heuristic, combining the ETS forecast of the original series, with the MAPA forecast (hybrid approach). This effectively means that we use a weighted combination, where the first temporal aggregation level is given more weight than all other temporal aggregation levels together. Empirical evidence suggests that this re-weighting is beneficial.
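The shrinkage and the effect of the hybrid re-weighting can be seen with a toy calculation. The sketch below uses made-up seasonal components, already expressed on the original time scale, and assumes the hybrid is a simple average of the ETS and MAPA forecasts; the helper names are illustrative and this is not the code of the MAPA package.

```python
def combine(components, weights):
    """Weighted sum of per-level seasonal components, period by period."""
    n_periods = len(components[0])
    return [sum(w * c[t] for w, c in zip(weights, components))
            for t in range(n_periods)]


def equal_weights(n_levels):
    """Mean combination: every aggregation level counts the same."""
    return [1.0 / n_levels] * n_levels


def hybrid_weights(n_levels):
    """Implied weights when the level-1 ETS forecast is averaged (assumed 50/50)
    with the equal-weight combination across all levels."""
    weights = [0.5 / n_levels] * n_levels
    weights[0] += 0.5
    return weights


# Made-up seasonal components for three levels; at the two aggregate levels
# the seasonality has been filtered out, so they contribute zeros.
seasonal_by_level = [
    [12.0, -6.0, -12.0, 6.0],  # level 1: ETS on the original series
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]

mean_combined = combine(seasonal_by_level, equal_weights(3))
hybrid_combined = combine(seasonal_by_level, hybrid_weights(3))

print(mean_combined)    # shrunk to a third of the level-1 seasonality
print(hybrid_combined)  # shrunk less, as level 1 carries more weight
print(hybrid_weights(12)[0], sum(hybrid_weights(12)[1:]))
```

Under the 50/50 assumption the first level always receives a weight above one half, whatever the number of levels, which is the sense in which it outweighs all other levels together.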

The later-developed w.mean and w.median weight schemes attempt to do the same with variable weights across temporal aggregation levels for the seasonal component. In fact, when dealing with high frequency time series, it is always recommended to use these.

The interactive plot below illustrates how MAPA works. You can choose between various time series, the combination scheme for the components across temporal aggregation levels, and whether the hybrid forecast (effectively a re-weighting of the components) is used or not. The top plot shows the identified ETS models at each temporal aggregation level. Greyed cells indicate levels for which no seasonality is estimated. These are levels that would require fractional seasonality, not permitted by conventional ETS. The second row of plots provides the forecasted ETS components across temporal aggregation levels, as well as the combined one (thick black line). The components of the first aggregation level are those of the ETS fitted to the original time series, and are plotted with a thicker line. For the seasonal component only levels that can be seasonal are used. Note the difference between the trajectories for the ETS and MAPA components, as well as the various components at different temporal aggregation levels. The bottom plot provides the forecasts of ETS and MAPA. The MAPA forecast is simply the addition of the three MAPA components. If Hybrid is used, the resulting MAPA forecast is the combination of the MAPA components and the ETS forecast (which is equivalent to a specific weighting scheme of the components).

In most cases the level and trend components are different from the ETS ones, and the seasonality is always somewhat shrunk, depending on the combination weights scheme used.

Why does MAPA work? It captures low frequency components (trend) better, because of the temporal aggregation. Furthermore, it does not rely on a single ETS model that may or may not be well identified, therefore mitigating model uncertainty. I argue that MTA, as implemented in MAPA is a neat trick to extract more information from a time series… for free!

What are the limitations of MAPA? Quite a few, but there are two major ones: (i) the combination weight schemes are ad hoc, although there is strong evidence that equal weights are not the best solution; (ii) the identification of the ETS model at levels where any seasonality would be fractional is weakened, because that seasonality is not modelled and is left to contaminate the level and slope components. Arguably, the use of ETS at its core is another potential limitation, although that is possible to lift.

In practice, MAPA was the first method after almost 15 years to improve upon the M3 competition results and since then there is mounting empirical evidence supporting its good forecasting performance, as well as extensions to incorporate explanatory variables and forecast intermittent demand time series. An interesting finding is that MAPA is very robust against misspecification, when compared to more conventional approaches that attempt to capitalise on a single (optimal) level of temporal aggregation. If you want to try it out, there is a package for R available (or on GitHub).

Ultimately the true contribution of MAPA was to demonstrate that the ideas behind MTA were sound and useful for forecasting! For this, the paper that introduced MAPA recently received the International Journal of Forecasting 2014-2015 best paper award; I am very humbled and happy for this!

Although the aforementioned limitations are not resolved, MAPA motivated research into Temporal Hierarchies that provide a more thorough foundation for using MTA in forecasting, overcoming many of MAPA’s issues, and enabling multiple avenues of future research. This will be the topic of a future post in the series. Till then, I will conclude by mentioning that MAPA in many applications still provides more accurate forecasts than Temporal Hierarchies, demonstrating that it is still an interesting research topic.


This paper proposes the use of the Multiple Temporal Aggregation approach that I have been posting about, and introduces the MAPA forecasting method, for which there is an R package. The other shortlisted papers were of very good quality and I am humbled by the choice of the editorial board. My thanks to my co-authors and the reviewers, who made this paper possible.

- Time series exploration
- Univariate (extrapolative) forecasting
- Intermittent demand series forecasting
- Forecasting with regression
- Special topics: (i) Hierarchical forecasting; (ii) ABC-XYZ analysis; and (iii) LASSO regression

Material:

- Workshop notes: these provide code examples with comments. You will also find some references for the various methods used in the workshop.
- Workshop slides: these provide an **extremely brief** overview of some of the methods used and their implementation.
- Workshop R solution scripts: these replicate the examples in the notes.
- Workshop data: these are needed to replicate the examples in the notes and scripts.

The notes are aimed at researchers and experienced practitioners, who are comfortable with the theory behind the various models and methods. I hope you find this material useful!

A couple more packages to explore:

- thief: A package that implements forecasting with temporal hierarchies
- smooth: A package that provides alternative implementations of exponential smoothing, ARIMA and other exciting forecasting models.

Tactical forecasting in supply chain management supports planning for inventory, scheduling production, and raw material purchase, amongst other functions. It typically refers to forecasts up to 12 months ahead. Traditional forecasting models take into account univariate information extrapolating from the past, but cannot anticipate macroeconomic events, such as steep increases or declines in national economic activity. In practice this is countered by using managerial expert judgement, which is well known to suffer from various biases, is expensive and not scalable. This paper evaluates multiple approaches to improve tactical sales forecasting using macro-economic leading indicators. The proposed statistical forecast automatically selects both the type of leading indicators and the order of the lead for each of the selected indicators. However, as the future values of the leading indicators are unknown, an additional uncertainty is introduced. This uncertainty is controlled in our methodology by restricting inputs to an unconditional forecasting setup. We compare this with the conditional setup, where future indicator values are assumed to be known, and assess the theoretical loss of forecast accuracy. We also evaluate purely statistical model building against judgement-aided models, where potential leading indicators are pre-filtered by experts, quantifying the accuracy-cost trade-off. The proposed framework improves on forecasting accuracy over established time series benchmarks, while providing useful insights about the key leading indicators. We evaluate the proposed approach on a real case study and find 18.8% accuracy gains over the current forecasting process.

Download paper.

In this post I will demonstrate the effects of temporal aggregation and motivate the use of multiple temporal aggregation (MTA). I will not delve into the econometric aspects of the discussion, but it is worthwhile to summarise key findings from the literature. A concise forecasting related summary is available in our recent paper Athanasopoulos et al. (2017), section 2:

- Temporal aggregation changes the (identifiable) structure of the time series;
- As the aggregation level increases, fewer components appear, and higher-frequency components (for example, seasonality and promotions) become weaker or vanish altogether;
- Temporal aggregation reduces the sample size, resulting in loss of estimation efficiency. To make this simple, if you have 4 years of monthly data and you aggregate your series to a yearly level, you will have to build a model with only four data points: risky!
- There are accuracy gains to be had, but identifying the (single) appropriate temporal aggregation level is very difficult! Yet, it still simplifies some problems, like intermittent demand forecasting.

What do these mean for our forecasts? Well, if you work on the basis that the true model is an elusive idea, these are not too prescriptive for constructing forecasts. I will try to give you an intuition visually. In the following interactive visualisation you can choose between different time series and plot the original and temporally aggregated data, together with a seasonal plot. The seasonal plot will be shown only when it is feasible, i.e. the resulting seasonality after the aggregation has an integer period greater than 1. For each series I also fit an appropriate exponential smoothing model (selected using AICc) and provide a list of the fitted components for all temporal aggregation levels, up to yearly data. I also provide the relevant forecast. Observe a few things:

- The identified exponential smoothing models are often different across temporal aggregation levels. In particular the seasonality is filtered as we aggregate into bigger time buckets. Of course, for some aggregation levels (for example, aggregate every 5-months) the resulting series has a non-integer seasonal period and typical forecasting methods cannot capture it and instead it contributes to the error part;
- Other aspects of the series, like outliers, vanish as we aggregate to higher levels;
- Sometimes aggregation makes the time series easier to model, and sometimes it over-smooths the series! The forecasts surely vary a lot as we aggregate.

At minimum we can say that temporal aggregation alters the identifiable parts of the time series, strengthening low-frequency components (such as trend), while weakening high-frequency components (such as seasonality). Depending on the forecasting objective, this may result in better forecasts, especially if we are aiming at long term forecasts. Furthermore, simply because the temporal aggregation filters part of the noise (it is a moving average filter!) it may just be better to model a series at a more aggregate level.
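For readers who want to experiment, non-overlapping temporal aggregation takes only a few lines. The sketch below uses a made-up monthly series; dropping incomplete buckets from the start of the series is my assumption here, and the function names are illustrative.

```python
def aggregate(series, k):
    """Sum non-overlapping buckets of k observations.

    Observations that do not fill a complete bucket are dropped from the
    start of the series (an assumption, so the latest data are retained).
    """
    start = len(series) % k
    return [sum(series[i:i + k]) for i in range(start, len(series), k)]


def seasonal_period(original_period, k):
    """Seasonal period after aggregating by k, or None when it is not an
    integer greater than 1 (no seasonal plot is possible in that case)."""
    if original_period % k != 0:
        return None
    period = original_period // k
    return period if period > 1 else None


# Four years of made-up monthly data: a mild trend plus a December peak.
monthly = [100 + t + (30 if t % 12 == 11 else 0) for t in range(48)]

quarterly = aggregate(monthly, 3)
yearly = aggregate(monthly, 12)

print(len(monthly), len(quarterly), len(yearly))
print(seasonal_period(12, 3), seasonal_period(12, 5), seasonal_period(12, 12))
```

Four years of monthly data leave only four yearly observations, and aggregating every 5 months produces a non-integer seasonal period, which is why the seasonal plot is unavailable at such levels.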

The main problem in the literature is that it is very difficult to know the optimal temporal aggregation level, the one that will maximise your forecast accuracy, for **real data**. This is not a trivial point: there are theoretical solutions suggesting the optimal temporal aggregation level for various data generation processes, but they rely on full knowledge of the process! Well, if I knew the process, then forecasting it would be trivial. Recent research showed that although we can easily show benefits on simulated data, it becomes much more complicated with real data, for which the true model is unknown.

If we connect the dots, there are four key arguments in favour of MTA (discussed in more detail in these three papers [1], [2] and [3]):

- Because we are provided with a time series sampled at some time interval, we do not have to model it at that level! It may be better to do so at some aggregate level.
- Temporal aggregation can be beneficial for forecasting, but identifying a single optimal level of aggregation is very challenging, so why not use multiple?
- Using multiple levels we avoid relying on a single forecasting model, therefore we mitigate modelling uncertainty by considering multiple (different) models across temporal aggregation levels.
- Holistic modelling of the time series information: Models built on the original data or on low aggregation levels can focus more on high frequency components, while models built on high aggregation levels focus on low frequency components, which may not be easy to capture in the originally sampled time series.

All these points suggest that using Multiple Temporal Aggregation levels should be useful, but we have not yet addressed the question of how to do this! I will introduce our first attempt to do this, the Multiple Aggregation Prediction Algorithm (MAPA), in the next post in the series.



In this series of blog posts I will try to summarise the progress so far, and highlight ways that you can use it. This first post will summarise the papers so far and give an overview of the main findings. Later posts will focus on explaining how MTA works.

The key points behind MTA are the following:

- It is a radically different approach to time series modelling, recognising that the data sampling frequency may not be the best for a given modelling purpose.
- A time series is modelled simultaneously at multiple temporal aggregation levels that can be easily generated from the original data. At each level an appropriate model is fit, focusing on the components of the series that are strengthened by temporal aggregation.
- If forecasting is the objective, then the produced forecast reconciles the information from all these models. This makes the forecast robust to modelling uncertainty and lessens the importance of model selection.
- The resulting forecasts have been shown to be reliable and typically outperform the conventional modelling approach.

Table 1 summarises our contributions on MTA so far (follow the links to access the papers). We have also released two R packages that implement MTA: MAPA and thief. The former implements, as the name suggests, MAPA, while the latter provides code to use Temporal Hierarchies.

Paper | Summary |
---|---|
Kourentzes et al. 2014. Improving forecasting by estimating time series structural components across multiple frequencies. | The initial paper on MTA modelling. It introduces the Multiple Aggregation Prediction Algorithm (MAPA) and demonstrates its superior performance on the well-known M3 competition. |
Petropoulos and Kourentzes 2014. Forecast combinations for intermittent demand. | Expands MAPA for the case of intermittent demand. |
Kourentzes and Petropoulos 2016. Forecasting with multivariate temporal aggregation: The case of promotional modelling. | Expands MAPA for promotional modelling purposes at Stock Keeping Unit level. |
Barrow and Kourentzes 2016. Distributions of forecasting errors of forecast combinations: implications for inventory management. | Provides evidence of very strong performance of MAPA over established benchmarks for demand forecasting and inventory management purposes. |
Athanasopoulos et al. 2017. Forecasting with temporal hierarchies. | Introduces a general framework for MTA: Temporal Hierarchies that allows use of any model/method to produce forecasts at each level. |
Kourentzes et al. 2017. Demand forecasting by temporal aggregation: using optimal or multiple aggregation levels? | Demonstrates that MTA modelling is more robust to uncertainty than modelling either using the original data or using a single (optimal) temporal aggregation level. |

To give you an idea of the reported improvements, I have collated some of the results from the papers above. The best forecast in each column, in all tables, is highlighted in boldface. Table 2 provides a summary for the quarterly and monthly M3 datasets, using as benchmarks the Exponential Smoothing (ETS) family of models, with automatic model selection (via AICc), and Theta, the best performing method on the original M3 competition – a position it held for almost 15 years! In this case both MAPA and Temporal Hierarchies make use of the ETS family of models, so you can get a feeling of the improvement provided by MTA over conventional time series forecasting, as the results are directly comparable with the ETS row.

Tables 3 and 4 provide results for a number of real datasets. Table 4 also provides results on a variety of simulated ARIMA series. The detailed results can be found in the respective papers. In all cases MAPA is better, or at least as good, compared to the various benchmarks. Table 5 provides results on real series that have promoted periods. There are two comparisons: forecasts without and with promotional information. In both cases MTA based forecasts (MAPA) are on average the most accurate.

Forecast | Quarterly set | Monthly set |
---|---|---|
Exponential Smoothing (ETS) | 9.94% | 14.45% |
Theta (M3 competition) | **8.96%** | 13.85% |
MAPA (Kourentzes et al. 2014) | 9.58% | 13.69% |
Temporal Hierarchies (Athanasopoulos et al. 2017) | 9.70% | **13.61%** |

Forecast | 1-step ahead | 3-steps ahead | 5-steps ahead |
---|---|---|---|
Naive | 0.882 | 0.900 | 0.919 |
ETS | 0.677 | 0.688 | 0.711 |
AR | 0.707 | 0.719 | 0.737 |
ARIMA | 1.446 | 0.701 | 0.721 |
Theta | 0.674 | 0.685 | 0.705 |
MAPA | **0.668** | **0.670** | **0.687** |

Forecast | Simulated ARIMA | Manufacturing | Call centre |
---|---|---|---|
Single Exponential Smoothing (SES) | 1.000 | 1.000 | 1.000 |
Exponential Smoothing (ETS) | 0.985 | 1.011 | 1.005 |
Optimal Temporal Aggregation & SES | 0.974 | 0.999 | 1.080 |
MAPA | **0.971** | **0.994** | **0.979** |

Forecast | 4-steps ahead | 8-steps ahead | 12-steps ahead |
---|---|---|---|
Naive | 0.743 | 0.818 | 0.704 |
ETS | 0.704 | 0.774 | 0.701 |
MAPA | 0.679 | 0.754 | 0.736 |
Regression + Promotional | 0.611 | 0.659 | 0.714 |
ETS + Promotional | 0.642 | 0.627 | 0.543 |
MAPA + Promotional | **0.525** | **0.521** | **0.515** |

The main argument in all papers is that MTA helps to improve forecast accuracy due to the way it mitigates modelling uncertainty. As we will see this comes at no additional data cost and relatively limited additional computations. An added benefit, which is not very evident from the summarised tables provided here, is that the MTA forecasts are reliable both for short and long term forecasting, providing a way to reconcile operational, tactical and strategic planning.

Unpublished results on different applications provide a similar picture in terms of accuracy. There is also evidence that MTA can strengthen statistical tests, as the initial results of this experiment show. However, all this is ongoing research, so until a full analysis is conducted and the results are peer reviewed, I would add a pinch of salt to these!

In following blog posts I will explain how MTA works and elaborate more on results from the various papers.

