
ISF 2015 invited session on “Forecasting with Combinations and Hierarchies”

I am organising a special session at the upcoming International Symposium on Forecasting on the topic “Forecasting with Combinations and Hierarchies”. Please contact me if you are interested in contributing to this session.

In many applications there are time series that can be hierarchically organised and grouped or aggregated in several ways, based on some predefined structure. For example, an organisation may be interested in forecasting a quantity defined by geographical boundaries, which can be modelled at a detailed disaggregate level or at an aggregate top level. Such problems are typically modelled using hierarchical time series methods: bottom-up, top-down or middle-out. These methods attempt to reconcile the forecasts for the various time series by producing forecasts at a single level of the hierarchy and aggregating or disaggregating them, as appropriate, to the rest of the hierarchy.

More recently, a method based on optimal combinations has been proposed (Hyndman et al., 2011) that has been shown to perform favourably against competing hierarchical methods (Athanasopoulos et al., 2009). The idea behind this approach is to linearly combine the predictions from all the time series in the hierarchy, where the weights of the linear combination are defined by the hierarchy imposed by the problem definition. Such combinations are optimal in minimising the reconciliation error, under certain assumptions. Current research has demonstrated that these assumptions are perhaps too strong, and different combination weighting schemes have been proposed that empirically perform better (Athanasopoulos et al., 2014), although the best combining scheme remains an open question.
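
To make the setting concrete, here is a minimal sketch in R of both approaches, assuming the hts package (an assumption of mine; the session itself is not tied to any particular software). It builds a small two-level hierarchy and produces bottom-up and optimal-combination forecasts:

library(hts)

# Four bottom-level monthly series, grouped into two regions and a total
set.seed(1)
bts <- ts(matrix(rnorm(4 * 48, mean = 100, sd = 5), ncol = 4), frequency = 12)
colnames(bts) <- c("A1", "A2", "B1", "B2")

# Two-level hierarchy: Total -> {A, B} -> {A1, A2, B1, B2}
y <- hts(bts, nodes = list(2, c(2, 2)))

fc.bu   <- forecast(y, h = 12, method = "bu",   fmethod = "ets")  # bottom-up
fc.comb <- forecast(y, h = 12, method = "comb", fmethod = "ets")  # optimal combination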

Irrespective of the methodology followed to solve the hierarchical/combination forecasting problem, this research is important for practice, as many organisations and companies face hierarchical cross-sectional problems defined by products, market segments, geographical boundaries, etc., or combinations of these factors. Recently, Kourentzes et al. (2014) proposed that instead of focusing our investigation on cross-sectional hierarchies, we can define hierarchies of a temporal nature, which are applicable to any forecasting problem. They showed empirically that using multiple temporal aggregation levels of a time series and combining the forecasts produced at each level resulted in predictions that substantially outperformed conventional forecasting approaches. The combination of forecasts across temporal hierarchies allows the various structural components of the time series to be captured better, while mitigating the impact of model misspecification, in terms of either model form or parameters. Their methodology therefore demonstrated that hierarchical combination problems are relevant to any forecasting application and deserve more attention in research, as there is substantial potential for impact on practice.

References

Hyndman, R. J.; Ahmed, R. A.; Athanasopoulos, G. & Shang, H. L. Optimal combination forecasts for hierarchical time series. Computational Statistics and Data Analysis, 2011, 55, 2579-2589

Athanasopoulos, G.; Ahmed, R. A. & Hyndman, R. J. Hierarchical forecasts for Australian domestic tourism. International Journal of Forecasting, 2009, 25, 146-166

Athanasopoulos, G.; Hyndman, R. J.; Kourentzes, N. & Petropoulos, F. Forecasting hierarchical time series. The 2014 International Symposium on Forecasting, 2014, Rotterdam.

Kourentzes, N.; Petropoulos, F. & Trapero, J. R. Improving forecasting by estimating time series structural components across multiple frequencies. International Journal of Forecasting, 2014, 30, 291-302

Special Session on Energy Forecasting – EURO 2015

Juan R. Trapero and I are organising a special session on Energy Forecasting within the Forecasting & Time Series Prediction stream at EURO 2015.

Energy modelling and forecasting have become essential to optimising the generation, control and distribution processes of countries’ energy systems. This special session calls for abstracts analysing concepts, models, methodologies and case studies that contribute to strengthening our knowledge of this important area. Topics of interest include (but are not limited to) forecasting of: electricity load and prices; renewables, including wind, solar, wave energy and biomass; coal and gas demand, oil products and their derivatives; as well as models for predicting the energy mix.

We invite you to submit your relevant research to this special session. To do so you will need to submit your abstract using the organised session option. Please use the following submission code: 4e0e6c1f

Juan, Alberto Martin and I have been working on solar irradiation forecasting, and we plan to present our work at the conference. For a working paper on our research so far, please visit Juan’s blog.

Update: Added submission code for the special session.

The Bias Coefficient: a new metric for forecast bias

In this post I introduce a new bias metric that has several desirable properties over traditional ones.

When evaluating forecasting performance it is important to look at two elements: forecasting accuracy and bias. Although there has been substantial progress in the measurement of accuracy, with various metrics being proposed, there has been rather limited progress in measuring bias. A typical metric is the Mean Error (ME):

ME = n^{-1}\sum_{j=1}^n{e_j},

where n is the number of errors e_j. Although in principle this is a scale-dependent metric, this limitation is overcome by appropriately scaling the raw errors. Nonetheless, the ME still suffers from a number of other limitations. First, the size of the errors is lost: large positive and negative errors cancel out. Although the bias information is retained, we cannot infer whether an observed bias is associated with large or small errors. Second, the size of the bias is very difficult to interpret, as the ME does not have any natural upper or lower bounds. Variants of the ME, such as the Mean Percentage Error (MPE), have been proposed to provide an easier-to-communicate bias size, expressed as a percentage:

MPE = \frac{100}{n}\sum_{j=1}^n{\frac{e_j}{y_j}},

where y_j is the actual observation at time j. However, the MPE is again unbounded and introduces additional complications. For example, the MPE is not symmetric in how negative and positive biases are measured. Consider the following: the forecast for a period is 90, while the observed demand is 100. The error, measured as the difference between actual and forecasted values, is 10 and the MPE is 10%. If, on the other hand, the forecast was 100 and the demand was 90, the error would be -10, but the MPE would be -11.1%, even though the bias is of the same size.
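
A quick numerical sketch in base R of the two scenarios above illustrates both issues:

actual <- c(100, 90)   # observed demand in the two scenarios
fcst   <- c(90, 100)   # the corresponding forecasts
e <- actual - fcst     # errors: 10 and -10, same size, opposite sign
mean(e)                # ME = 0: the two errors cancel out
100 * e / actual       # MPE terms: 10% and -11.1%, not symmetric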

In this working paper we propose a new metric, the Root Error, that attempts to overcome these limitations. Complex number analysis is not a common tool in forecasting; however, it has certain properties that we can take advantage of here. We calculate the Root Error (RE) for each period as:

z_j = \sqrt{e_j} = a + bi.

Since errors can be negative, z_j can be a real or an imaginary number, and i is the imaginary unit, which satisfies i^2 = -1. The value a is the real part and b is the imaginary part of the complex number. For positive errors:

a = \sqrt{e_j}

and b = 0, while for negative errors a = 0 and

b = \sqrt{|e_j|}.

We can summarise the RE across several errors with the Sum Root Error (SRE) and the Mean Root Error (MRE):

SRE = \sum_{j=1}^n{\sqrt{e_j}} = \sum_{j=1}^n{a_j} + i\sum_{j=1}^n{b_j},
MRE = \frac{1}{n}SRE = \frac{1}{n}\sum_{j=1}^n{a_j} + \frac{i}{n}\sum_{j=1}^n{b_j}.
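
These calculations are easy to reproduce in base R directly from the definitions above; the errors in this small sketch are purely illustrative:

e   <- c(4, -9, 1, -16)       # some illustrative errors
z   <- sqrt(as.complex(e))    # root errors: 2, 0+3i, 1, 0+4i
SRE <- sum(z)                 # 3+7i
MRE <- mean(z)                # 0.75+1.75i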

These metrics are relatively robust to outliers due to the square root involved in the calculation. Let us assume that we have three experts: A, B and C. The following figure shows how the ME and the MRE summarise the bias and error information differently.

[Figures rooterror.fig1 and rooterror.fig2: ME and MRE for experts A, B and C]

If the negative and positive errors are equal, then a and b will be equal and the MRE will lie on the diagonal. If the positive or the negative errors dominate, then the MRE will lie below or above the diagonal respectively. The size of the bias is represented by how distant the MRE of each expert is from the diagonal. Also note that the MRE retains the size of the errors, clearly highlighting that expert B is more inaccurate, although less biased, than expert C.

Expressing the complex errors in their polar form allows a more intuitive interpretation. Let us define γ as the angle of the MRE:

\gamma = \arctan(b/a), \text{ if } a > 0,

where arctan is the inverse tangent and γ is expressed in radians. If a = 0 and b > 0 then γ = π/2. If both a and b are 0 then γ = 0. Notice that the magnitude of the MRE does not relate to the bias. Following the previous description of unbiasedness, a forecast will be unbiased for γ = π/4, while all forecast errors will have an angle γ between 0 and π/2. Instead of using γ, which is expressed in radians, we can define the Bias Coefficient:

\kappa = 1 - {4\gamma}/{\pi}.
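
As a small sketch, the bias coefficient can be computed in R directly from an MRE value using these definitions (the TStools functions discussed below do this for you):

MRE   <- complex(real = 0.75, imaginary = 1.75)  # the MRE from the earlier sketch
gamma <- atan(Im(MRE) / Re(MRE))                 # the angle; here Re(MRE) > 0
kappa <- 1 - 4 * gamma / pi                      # about -0.48, i.e. over-forecasting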

The bias coefficient is a unit-free metric. A forecast that is always over the observed values will have a bias coefficient equal to -1, indicating constant over-forecasting, while the bias coefficient will be equal to 1 in the opposite case. Let us visualise the bias coefficient in the following figure. Assuming a large number of forecasts for different time series, the MRE is calculated per time series and subsequently the bias coefficient. This is then visualised using a boxplot. The dot in the boxplot denotes the mean MRE.

[Figure rooterror.fig3: boxplot of bias coefficients across time series]

The bias coefficient:

  • is bounded, so we can characterise biases as strong or weak, with -1 and 1 marking maximally biased forecasts.
  • can be read similarly to the well-known linear correlation coefficient. A zero value means no bias, while other values indicate weak to strong bias, negative or positive. This makes it very easy to interpret and gives a non-relative sense of whether a forecast exhibits strong bias or not.
  • is free of units or scale, allowing comparisons and summaries across different time series without any pre-processing.
  • being unit-free and bounded, is ideal for benchmarking the bias behaviour of different forecasts, methods, experts, organisations, sectors, etc.

The Root Error has several interesting properties as a metric, and the bias coefficient is only one of the ways in which it can be used to characterise the performance of forecasts. This working paper introduces the Root Error and discusses many of the properties and uses of the new metric. I believe you will find it interesting.

I have updated TStools to include three new functions: mre, mre.plot and bias.coeff to help you experiment with the new metric and visualisations in the paper.

Here is a quick demonstration of how to use these functions. Let us first create some random errors:

> err <- runif(10,-10,10)

Now let us calculate the Mean Root Error of these:

> library(TStools)
> re <- mre(err)
> re
[1] 0.754675+1.924278i

Now let us calculate the bias coefficient for this:

> bias.coeff(re)
[1] -0.5241242

For the 10 random errors of this example we find a relatively strong negative bias. The bias.coeff function can also output histograms and boxplots of the bias coefficients for the input MRE.

Measuring the behaviour of experts on demand forecasting: a complex task

N. Kourentzes, J. R. Trapero and I. Svetunkov, 2014.

Forecasting plays a crucial role in decision making, and accurate forecasts can bring important benefits for organizations. Human judgement is a significant element when preparing these forecasts. Judgemental forecasts made by experts may influence accuracy, since experts can incorporate information that is difficult to structure and include in statistical models. Typically, such judgemental forecasts may enhance accuracy under certain circumstances, although they are biased given the nature of human behaviour. Although researchers have been actively looking into possible causes of human bias, there has been limited research devoted to empirically measuring it, to the extent that conclusions can be totally divergent depending on the error metric chosen. Furthermore, most error metrics focus on quantifying the magnitude of the error, while the measurement of bias has remained relatively overlooked. Therefore, in order to assess human behaviour and performance, an error metric able to measure both the magnitude and the bias of the error should be designed. This paper presents a novel metric that overcomes the aforementioned limitations through an innovative application of complex number theory. The methodology is successfully applied to analyse the judgemental forecasts of a household products manufacturer. This new point of view is also utilized to revisit related problems, such as the mechanistic integration of judgemental forecasts and the bias-accuracy trade-off.

Download paper.

Supply Chain Forecasting: Questionnaire

Aris Syntetos, Mohamed Zied Babai, John Boylan, Stephan Kolassa and Kostas Nikolopoulos have prepared a questionnaire about supply chain forecasting to support their research for a review article on the topic. I think that surveying the community to identify important areas for academics and practitioners is commendable. You can participate by submitting your views via the survey link below.

We’re looking for your help!

We have been commissioned by the European Journal of Operational Research to write a review article on Supply Chain Forecasting. In undertaking this task, we would like to ensure that the topics covered reflect the priorities of the forecasting community as a whole. To that end, we have prepared an (anonymous) questionnaire that aims to identify the most important areas that should be addressed in our paper, both from academic and practitioner perspectives.

To participate in this study, visit the Supply Chain Questionnaire (http://www.surveymonkey.com/s/G9TMWYD). It will take no more than 5-10 minutes of your time. Please indicate if you are an academic or a practitioner in the last section of the questionnaire.

We are working on very tight deadlines! Your timely help and participation is appreciated. For any enquiries or comments on this work please contact Aris Syntetos. Thank you for your time and co-operation,

A.A. Syntetos, M.Z. Babai, J.E. Boylan, S. Kolassa & K. Nikolopoulos

Time series forecasting competition with computational intelligence methods

I recently became aware of a new forecasting competition: “International Forecasting Competition – Computational Intelligence in Forecasting”. The competition involves forecasting 91 time series of annual, quarterly, monthly and daily sampling frequency and of various lengths. Although the competition is focused on computational intelligence methods (incl. fuzzy methods, artificial neural networks, evolutionary algorithms, decision & regression trees, support vector machines, hybrid approaches, etc.), other forecasting methods are welcome as benchmarks.

The submission deadline for the forecasts is the 11th of January, 2015. The competition will be hosted as a special session at the upcoming IFSA/EUSFLAT 2015 conference. You can find more details on the competition’s website.

Additive and multiplicative seasonality – can you identify them correctly?

Seasonality is a common characteristic of time series. It can appear in two forms: additive and multiplicative. In the former case the amplitude of the seasonal variation is independent of the level, whereas in the latter it depends on the level. The following figure highlights this:

[Figure mseas.fig1: additive vs multiplicative seasonality]

Note that in the example of multiplicative seasonality the seasonal pattern becomes “wider” as the level increases. Obviously, if the level were decreasing, the seasonal amplitude in the multiplicative case would decrease as well. To select the appropriate model for producing our forecasts we need to know the type of seasonality we are dealing with. How do you compare against statistical identification? Select additive or multiplicative in the demonstration below and submit your choice to see if you can do better than statistics and the average accuracy of participants so far.
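
If you would like to generate your own examples, here is a small R sketch that applies the same seasonal pattern additively and multiplicatively to an increasing level:

level  <- seq(100, 200, length.out = 48)          # increasing level over four years
season <- rep(10 * sin(2 * pi * (1:12) / 12), 4)  # a simple monthly seasonal shape

y.add <- ts(level + season, frequency = 12)              # amplitude stays constant
y.mul <- ts(level * (1 + season / 100), frequency = 12)  # amplitude grows with the level

plot(cbind(additive = y.add, multiplicative = y.mul))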

Guest post: On the robustness of bagging exponential smoothing

This is a guest blog entry by Fotios Petropoulos.

A few months ago, Bergmeir, Hyndman and Benitez made available a very interesting working paper titled “Bagging exponential smoothing methods using STL decomposition and Box-Cox transformation”. In short, they successfully employed the bootstrap aggregation technique to improve the performance of exponential smoothing. The bootstrap is applied to the remainder of the series, which is extracted via STL decomposition of the Box-Cox transformed data so that the variance is stabilised. After generating many different instances of the remainder, they reconstruct the series from its structural components, ending with an ensemble of 30 series (the original plus 29 bootstraps). Using ETS (selecting the most appropriate method from the exponential smoothing family via information criteria), they produce a set of forecasts for each one of these series. The final point forecasts are produced by calculating the arithmetic mean.
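
To fix ideas, here is a heavily simplified R sketch of the procedure, assuming the forecast package; note that it resamples the STL remainder in the simplest possible way, purely for illustration, rather than using the more careful bootstrap of the original paper:

library(forecast)

bagged.ets <- function(y, h = 18, n.boot = 29) {
  lambda <- BoxCox.lambda(y)                  # stabilise the variance
  y.bc   <- BoxCox(y, lambda)
  dec    <- stl(y.bc, s.window = "periodic")  # seasonal + trend + remainder
  seas   <- dec$time.series[, "seasonal"]
  trend  <- dec$time.series[, "trend"]
  rem    <- dec$time.series[, "remainder"]

  series <- list(y)                           # member 1 is the original series
  for (i in seq_len(n.boot)) {
    rem.b <- ts(sample(rem, replace = TRUE),
                start = start(y), frequency = frequency(y))
    series[[i + 1]] <- InvBoxCox(seas + trend + rem.b, lambda)
  }

  fcs <- sapply(series, function(x) forecast(ets(x), h = h)$mean)
  ts(rowMeans(fcs),                           # final forecasts: arithmetic mean
     start = tsp(y)[2] + 1 / frequency(y), frequency = frequency(y))
}

# fc <- bagged.ets(AirPassengers, h = 12)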

The performance of their approach (BaggedETS) is measured using the M3-Competition data. In fact, the new approach succeeds in improving on the forecasts of ETS for every data frequency, while a variation of it produces the best results for the monthly subset, even outperforming the Theta method. I should admit that this last bit intrigued me to investigate in more depth how the new approach works and whether it can be improved further. To this end, I tried to explore the following:

  1. Is the new approach robust? In essence, the presented results are based on a single run. However, given the way bootstrapping works, each run may give different results. Also, what is the impact of the ensemble size (essentially the number of bootstrapped series) on forecasting performance? A recent study by the owner of this blog suggests that larger ensembles result in more robust performance in the case of neural networks. Is this also true for the ETS bootstraps?
  2. How about using different combination operators? The arithmetic mean is only one possibility. How about the median or the mode? Both have shown improved performance in the aforementioned study by Nikos – a demonstration of the performance of the different operators can be found here.
  3. Will the new approach work with different forecasting methods? How about replacing ETS with ARIMA, or even the Theta method?

To address these questions, I set up a simulation using the monthly series from the M3-Competition. For each time series, I expanded the number of bootstrapped series to 300. Then, for every ensemble size from 5 up to 150 members, in steps of 5, I randomly selected the required number of bootstrapped series so as to create 20 groups. The point forecasts for each group were calculated not only with the arithmetic mean but also with the median. For each operator and each ensemble size considered, a five-number summary (minimum, Q1, median, Q3 and maximum sMAPE) over the 20 groups is provided graphically in the following figure.

[Figure baggedETS.fig1: five-number summary of sMAPE by ensemble size, for the mean and median operators, with ETS as the estimator]

So, provided the ensemble has at least 20 members, we can safely argue that the performance of the new approach is better than that of the Theta method 75% of the time (and by far always better than that of ETS). Having said that, the larger the number of bootstraps, the better and more robust the performance of the new approach. This is especially evident for the median operator. Another significant result is that the median operator performs better than the mean operator at all times. Both results are in line with the conclusions of the aforementioned paper.

In addition, I repeated the procedure using the Theta method as the estimator. The results are presented in the next figure.

[Figure baggedETS.fig2: five-number summary of sMAPE by ensemble size, for the mean and median operators, with the Theta method as the estimator]

Once more, the new approach proves to be better than the estimator applied only to the original data. So, it might be of value to check whether this particular way of creating bootstrapped time series can be generalised, so that the new approach can be regarded as a “self-improving mechanism”. However, it is worth mentioning that in this case the arithmetic mean operator generally performs better than the median operator (at least with regard to the median forecast error), resulting, though, in more variable (less robust) performance.

Comparing the two figures, it is interesting that BaggedETS is slightly better than BaggedTheta for ensemble sizes greater than 50. But this comes at a computational cost: BaggedETS is (more or less) 40 times slower at producing forecasts than BaggedTheta. This may render the application of BaggedETS problematic even for a few thousand SKUs if these are to be forecast daily.

To conclude, I find the new approach (BaggedETS) a very interesting technique that results in improved forecasting performance. An appropriate selection of the ensemble size and the combination operator can lead to robust forecasting performance. Also, one may be able to use this approach with other forecasting methods, so as to create BaggedTheta, BaggedSES, BaggedDamped, BaggedARIMA…

Fotios Petropoulos (October 31, 2014)

Acknowledgments

I would like to thank Christoph Bergmeir for providing clarifications on the algorithm, which made the replication possible. Also, I am grateful to Nikos for hosting this post on his forecasting blog.

Update 13/11/2014: Minor edits/clarifications by Fotis.

Exponential smoothing demo

In my experience, users of exponential smoothing often have limited insight into how the various smoothing parameters interact. I built this small demo to illustrate how the different smoothing parameters and exponential smoothing components interact. You can choose between some simulated and some real time series, with the option to add outliers or level shifts to the series, to explore the effect of the different parameters in each case. The parameters can be set either in their traditional form or in the state space reformulation.

You can download the R code for this demo here.
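
If you prefer to experiment in the console instead of the demo, here is a minimal sketch assuming the forecast package, fixing the smoothing parameters at two contrasting sets of values:

library(forecast)

# Smooth, slowly adapting level/trend/season versus very reactive ones
fit.smooth   <- ets(AirPassengers, model = "AAA", alpha = 0.1, beta = 0.01, gamma = 0.01)
fit.reactive <- ets(AirPassengers, model = "AAA", alpha = 0.9, beta = 0.30, gamma = 0.01)

plot(forecast(fit.smooth, h = 24))
plot(forecast(fit.reactive, h = 24))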

SAS-IIF grants to promote forecasting research

Every year the International Institute of Forecasters (IIF), in collaboration with SAS, supports forecasting research with grants of up to $5,000. The application deadline for this year is the 30th of September 2014.

Applications must include:

  • Description of the project, up to 4 pages;
  • Letter of support from the researcher’s home institution;
  • Brief CV, up to 4 pages;
  • Budget and workplan for the project.

For more information have a look at the IIF website and blog, where you can also find previously awarded grants to give you an idea of their scope.