Author Archives: Nikos

Update for nnfor: reuse and retrain models

A feature that was missing in the nnfor package was the ability to reuse and retrain mlp and elm model fits. This is now possible with the new arguments model and retrain.

As an example, let us use the AirPassengers time series, with three different sample sizes and re-use and re-train the same model in various combinations.

# Get some data
y <- AirPassengers
y1 <- window(y,end=c(1958,12))
y2 <- window(y,end=c(1959,12))
y3 <- window(y,end=c(1960,12))

# Fit NN 
fit <- list()
fit[[1]] <- mlp(y1)
fit[[2]] <- mlp(y2,model=fit[[1]])
fit[[3]] <- mlp(y2,model=fit[[1]],retrain=TRUE)
fit[[4]] <- mlp(y3,model=fit[[1]])
fit[[5]] <- mlp(y3,model=fit[[3]])
fit[[6]] <- mlp(y3,model=fit[[1]],retrain=TRUE)
fit[[7]] <- mlp(y3)

# Get MSE and number of lags
mse <- unlist(lapply(fit,function(x){x$MSE}))
lags <- unlist(lapply(fit,function(x){length(x$lags)}))
Model Series Sample Training Sample Retrain MSE Lags
fit[[1]] up to 1958 up to 1958 X 6.73 9
fit[[2]] up to 1958 up to 1958 61.25 9
fit[[3]] up to 1959 up to 1959 X 6.68 9
fit[[4]] up to 1960 up to 1958 541.13 9
fit[[5]] up to 1960 up to 1959 260.22 9
fit[[6]] up to 1960 up to 1960 X 12.65 9
fit[[7]] up to 1960 up to 1960 New fit 7.95 10

As you can see, with different in-sample data and no retraining the in-sample MSE deteriorates. Using the “up to 1958” fit on the “up to 1959” and “up to 1960” samples returns MSE of 61.25 and 541.13 respectively. If we refit the network (keeping the specification fixed) the error reduces to 6.68 and 12.65 respectively. Building a new model on the “up to 1960” data finds a different lag structure (increasing the used lags from 9 to 10), resulting in an MSE of 7.95.

The same arguments can be used with elm.

Can you predict the closing price of Bitcoin?

Lately, there has been a lot of talks whether Bitcoin is a bubble (about to burst) or not. The discussion is quite interesting, not only because there is potentially a lot of money involved, but also because it shows how our economic theories are primarily unclear and secondarily incomplete on concepts such as bubbles and cryptocurrencies. Irrespectively of the various views of the “real” value of Bitcoin, most agree that it is fueled by lots of speculation. That made me wonder, how well can crowdsourcing anticipate its movements?

I built the app below that will update with the daily closing prices and also show how successful or not yesterday’s predictions were. Apart from getting your forecasts, I would be interested to get your views about this in the comments.

Check again tomorrow to see what was the outcome on today’s closing price!
I provide here the latest tweets about bitcoin, to fuel your predictions!

MAPAx example for R

I have been asked a few time to provide an example how to use MAPAx, from the MAPA package for R, so I prepared this blog post. I admit the documentation could be better, so I put together this example from a retailing case – the original setting MAPAx was developed for (see paper here). This provides a step by step use of MAPAx.

First we load the package and some data.

y <- cbind(c(1030.829,893.551,1084.09,1278.436,936.708,915.322,885.713,2364.399,774.88,977.506,831.616,813.656,1569.956,967.925,806.146,1063.117,1123.787,906.686,996.498,1088.464,977.414,1128.328,896.594,1007.172,1046.379,1514.648,1626.115,2959.558,838.506,949.377,1433.307,805.048,1218.907,872.43,1730.103,865.734,1845.713,919.291,1003.363,1102.969,847.38,1965.26,809.673,953.193,1066.089,991.352,1115.694,1003.333,1090.48,930.749,1006.184,1239.068,873.707,728.583,881.316,1302.468,997.442,3481.118,841.042,997.601,1830.194,909.693,2358.406,2573.673,777.08,773.781,945.424,968.646,1074.589,1046.22,1155.559,990.627,931.943,786.285,2297.025,628.166,889.238,937.631,1113.925,870.384,1018.375,799.458,1542.328,1879.587,750.307,1087.948,1247.803,1052.352,883.899,793.126,913.736,1082.142,968.823,2099.176,841.224,964.227,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA))
x <- cbind(c(0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0),
y <- y[!]
y <- ts(y,frequency=12)

This toy set includes sales of a heavily promoted item, with three different promotion. Let’s visualise what we have:

cmp <- brewer.pal(3,"Set1")
for (i in 1:3){

Now let’s fit a MAPA and MAPAx and produce forecasts. I am modelling the sales in logs to capture the multiplicative promotional effects. For other types of applications this is not needed.

mapafit.x <- mapaest(log(y),type="es",display=1,outplot=1,xreg=x)
mapafit <- mapaest(log(y),type="es",outplot=1)

frc.x <- mapafor(log(y),mapafit.x,ifh=13,fh=13,xreg=x,conf.lvl=c(0.8,0.9,0.95),comb="w.mean")
frc <- mapafor(log(y),mapafit,ifh=13,fh=13,conf.lvl=c(0.8,0.9,0.95),comb="w.mean")

Let’s plot the results

plot(1:96, y, type="l",xlim=c(1,109),xaxs="i",main="MAPA",xlab="Period",ylab="Sales")
for (i in 1:96){
cmp <- brewer.pal(9,"Reds")[4:2]
for (i in 1:3){

plot(1:96, y, type="l",xlim=c(1,109),xaxs="i",main="MAPAx",xlab="Period",ylab="Sales")
for (i in 1:96){
cmp <- brewer.pal(9,"Reds")[4:2]
for (i in 1:3){

As you can see, MAPA provides a flat forecast. This is “correct” in the sense that apart from promotions this time series contains like information that could be captured by exponential smoothing. On the other hand, MAPAx is provided with the promotional information and makes use of it.

nnfor on github

I have put up a github repository for the nnfor package for R:

I will be putting updates and fixes there, before they are pushed on CRAN. You can also report there bugs.

You can install the current github version with:

if (!require("devtools")){install.packages("devtools")}

Congratulations Dr. Sagaert!

Yesterday, Yves Sagaert successfully defended his PhD in a public presentation at Ghent University! Yves’ PhD research has been on: tactical sales forecasting with external leading indicators. It has been a pleasure to work with Yves over the past years!

During his PhD he published two papers, with more currently under review:

Looking forward to seeing his future work!

New R package nnfor: time series forecasting with neural networks

My new R package nnfor is available on CRAN. This collects the various neural network functions that appeared in TStools. See this post for demo of these functions. In summary the package includes:

  • Automatic, semi-automatic or fully manual specification of MLP neural networks for time series modelling, that helps in specifying inputs with lags of the target and exogenous variables. It can automatically deal with pre-processing (differencing and scaling) and identify the number of hidden nodes. The user can control which of these settings are left on automatic or not.
  • A few options for building network ensembles.
  • Plotting functions of the network topology, fit and forecast.
  • All the above for ELMs (Extreme Learning Machines).
  • Support for Temporal Hierarchies Forecasting, with the thief package for R.

This builds on the neuralnet package for R, and provides the code to make the networks capable of handling time series data automatically. Although that package is quite flexible, it is computationally expensive and does not permit for deep learning. The plan is to eventually implement such capabilities in the package.

There are numerous papers that support the ideas used to put together this package:

  • In my new book, Ord et al., 2017, Principles of Business Forecasting, 2e, Wessex Press Publishing. Chapter 10 describes the basic logic in building MLP networks for time series forecasting. This package implements the logic described there.
  • This paper demonstartes the performance of the input variable selection algorithm: Crone and Kourentzes, 2010, Feature selection for time series prediction – a combined filter and wrapper approach for neural networks. Neurocmputing, 73, 1923-1936. There is some influence from this proceedings paper as well. (These feel like really old papers!)
  • This paper looks at the combination operator for the ensembles. Please move away from the average! Kourenztes et al., 2014, Neural network ensembles operators for time series forecasting. Expert Systems with Applications, 41, 4235-4244.

The neural network functions in TStools will be removed, initially pointing towards this package and latter removed completely.

There is a github repository for this, where I will be posting updates and fixes till they go on CRAN:

Happy (nonlinear) forecasting!

Principles of Business Forecasting 2e

I recently got my hands on a physical copy of my new book: Principles of Business Forecasting (2nd edition).

Ord, K., Fildes, R. and Kourentzes, N., 2017. Principles of business forecasting. 2nd ed. Wessex Press Publishing Co.

I was invited by Keith Ord and Robert Fildes to join them in writing the much-revised 2nd edition of the book. The book is aimed at both practitioners and students and it differs from typical time series textbooks in being focused on business forecasting, with appropriate focus on various methods, as well as processes and judgemental forecasting.

The book’s chapters give you an idea of the various topics covered:

  • Chapter 1: Forecasting, the Why and the How
  • Chapter 2: Basic Tools for Forecasting
  • Chapter 3: Forecasting Non-Seasonal Series
  • Chapter 4: Seasonal Series: Forecasting and Decomposition
  • Chapter 5: State-Space Models for Time Series
  • Chapter 6: Autoregressive Integrated Moving Average (ARIMA) Models
  • Chapter 7: Simple Linear Regression for Forecasting
  • Chapter 8: Multiple Regression for Time Series
  • Chapter 9: Model Building
  • Chapter 10: Advanced Methods of Forecasting
  • Chapter 11: Judgment-Based Forecasting
  • Chapter 12: Putting Forecasting Methods to Work
  • Chapter 13: Forecasting in Practice

The content of some chapters is self-evident, though others cover a broad set of topics. For example, chapter 10 looks at logistic regression and neural networks, amongst other topics, while chapters 12 and 13 provide a lot of the business context that is missing in many of the available time series textbooks. The book is supported by online material, including R based exercises and examples. You can find more information about the book here, or head to the publisher’s website.

One forecast I surely got wrong is how much work is involved in writing a book! Nonetheless, it has been a fantastic experience to co-author this book with Keith and Robert! I hope you will find it equally interesting and rewarding to read it.

OR59 Keynote: Uncertainty in predictive modelling

I recently presented at the OR59 conference my views and current work (with colleagues) on uncertainty in predictive modelling. I think this is a topic that deserves quite a bit of research attention, as it has substnatial implications for estimation, model selection and eventually decision making.

The talk has three parts:

  • Argue (as others before me!) that model based uncertainty (the sigma we get from our models) is not the full story, and estimation/model uncertainty should be accounted in prediction intervals and decision making. Key point: most model outputs assume that the model itself is true, which is… not true!
  • Provide initial results from an approach to directly account for model selection uncertainty that leads is improvements in model selection, but also leads naturally to model combination, aswering what and when to combine.
  • Demonstrate how work in Multiple Temporal Aggregation is an effective way at addressing modelling uncertainty, summarising research up to this point.

You can download the talk here.


Forecasts are central to decision making. Over the last decades there have been substantial innovations in business forecasting, resulting in increased accuracy of forecasts. Models and modelling principles have matured to address company problems in a realistic sense, i.e. they are aware of the requirements and limitations of practice; and tested empirically to demonstrate their effectiveness. Furthermore, there has been a shift in recognising the importance of having models instead of methods to facilitate parameterisation, model selection and the generation of prediction intervals. The latter has been instrumental in refocusing from point forecasts to prediction intervals, which reflect the relevant risk for the decisions supported by the forecasts. At the same time the quality and quantity of potential model inputs has increased exponentially, permitting models to use more information sources and support higher frequency of decision making, such as daily and weekly planning cycles. All these have facilitated and made necessary an increase in automation of the forecasting process, bringing to the forefront a new dimension of uncertainty: the model selection and specification uncertainty. The uncertainty captured in the prediction intervals assumes that the selected model is `true’. This is hardly the case in practice and we should account for that additional uncertainty. First, we discuss the uncertainties implied in model selection and specification. Then we proceed to develop a way to measure this uncertainty and derive a new way to perform model selection. We demonstrate that that this not only leads to superior selection, but also provides a natural link to model combination and specifying the relevant pool of models. Last, we demonstrate that once we recognise the uncertainty in model specification, we can extract more information from our data by using the multiple temporal aggregation frameworks, and empirically show the achieved increase in forecast accuracy and reliability.

Unconstraining Methods for Revenue Management Systems under Small Demand

N. Kourentzes, D. Li and A.K. Strauss, 2017. Journal of Revenue & Pricing Management.

Sales data often only represents a part of the demand for a service product owing to constraints such as capacity or booking limits. Unconstraining methods are concerned with estimating the true demand from such constrained sales data. This paper addresses the frequently encountered situation of observing only a few sales events at the individual product level and proposes variants of small demand forecasting methods to be used for unconstraining. The usual procedure is to aggregate data; however, in that case we lose information on when restrictions were imposed or lifted within a given booking profile. Our proposed methods exploit this information and are able to approximate convex, concave or homogeneous booking curves. Furthermore, they are numerically robust due to our proposed group-based parameter optimization. Empirical results on accuracy and revenue performance based on data from a major car rental company indicate revenue improvements over a best practice benchmark by statistically significant 0.5%-1.4% in typical scenarios.

Download paper.

Benchmarking Facebook’s Prophet

Last February Facebook open sourced its Prophet forecasting tool. Since, it had appeared in quite a few discussions online. A good thing about Prophet is that it one can use it very easily through R (and Python). This gave me the opportunity to benchmark it against some more standard – and not! – forecasting models and methods. To do this I tried it on the M3 competition dataset (available through the Mcomp package for R).

I should start by saying that the development team of Prophet suggests that its strengths are:

  • high-frequency data (hourly, daily, or weekly) with multiple seasonalities, such as day of week and time of year;
  • special events and bank holidays that are not fixed in the year;
  • in the presence of missing values or large outliers;
  • changes in the historical trends, which themselves are non-linear growth curves.

The M3 dataset has multiple series of micro/business interest and as a recent presentation by E. Spiliotis et al. at ISF2017 (slides 11-12) indicated, the characteristics of the time series overlap with typical business time series, albeit not high frequency. However, a lot of business forecasting is still not hourly or daily, so not including high frequency examples for many business forecasters is not necessarily an issue when benchmarking Prophet.

The setup of the experiment is:

  • Use Mean Absolute Scaled Error (MASE). I chose this measure as it has good statistical properties and has become quite common in forecasting research.
  • Use rolling origin evaluation, so as ensure that the reported figures are robust against particularly lucky (or unlucky) forecast origins and test sets.
  • Use the forecast horizons and test sets indicated in Table 1, for each M3 subset.
Table 1. M3 dataset
Set No. of series Horizon Test set
Yearly 645 4 8
Quarterly 756 4 8
Monthly 1428 12 18
Other 174 12 18

I used a number of benchmarks from some existing packages in R, namely:

  • forecast package, from which I used the exponential smoothing (ets) and ARIMA (auto.arima) functions. Anybody doing forecasting in R is familiar with this package! ETS and ARIMA over the years have been shown to be very strong benchmarks for business forecasting tasks and specifically for the M3 dataset.
  • smooth package. This is a less known package that offers alternative implementations of exponential smoothing (es) and ARIMA (auto.ssarima), which follow a different modelling philoshopy than the forecast package equivalents. If you are interested, head over to Ivan’s blog to read more about these (and other nice blog posts).  forecast and smooth packages used together offer a tremendous flexibiltiy in ETS and ARIMA modelling.
  • MAPA and thief packages, which both implement Multiple Temporal Aggregation (MTA) for forecasting, following to alternative approaches that I detail here (for MAPA) and here (for THieF). I included these as they have been shown to perform quite well on such tasks.

The idea here is to give Prophet a hard time, but also avoid using too exotic forecasting methods.

I provide the mean and median MASE across all forecast origins and series for each subset in tables 2 and 3 respectively. In brackets I provide the percentage difference from the ETS’ accuracy. In boldface I have highlight the best forecast for each M3 subset. Prophet results are in blue. I provide two MAPA results, the first uses the default options, whereas the second uses comb=”w.mean” that is more mindful of seasonality. For THieF I only provide the default result (using ETS), as in principle it could be applied to any forecast on the table.

Table 2. Mean MASE results
Set ETS ARIMA ES (smooth) SSARIMA (smooth) MAPA MAPA (w.mean) THieF (ETS) Prophet
Yearly 0.732 (0.00%) 0.746 (-1.91%) 0.777 (-6.15%) 0.783 (-6.97%) 0.732 (0.00%) 0.732 (0.00%) 0.732 (0.00%) 0.954 (-30.33%)
Quarterly 0.383 (0.00%) 0.389 (-1.57%) 0.385 (-0.52%) 0.412 (-7.57%) 0.386 (-0.78%) 0.384 (-0.26%) 0.400 (-4.44%) 0.553 (-44.39%)
Monthly 0.464 (0.00%) 0.472 (-1.72%) 0.465 (-0.22%) 0.490 (-5.60%) 0.459 (+1.08%) 0.458 (+1.29%) 0.462 (+0.43%) 0.586 (-26.29%)
Other 0.447 (0.00%) 0.460 (-2.91%) 0.446 (+0.22%) 0.457 (-2.24%) 0.444 (+0.67%) 0.444 (+0.67%) 0.447 (0.00%) 0.554 (-23.94%)

Table 3. Median MASE results
Set ETS ARIMA ES (smooth) SSARIMA (smooth) MAPA MAPA (w.mean) THieF (ETS) Prophet
Yearly 0.514 (0.00%) 0.519 (-0.97%) 0.511 (+0.58%) 0.524 (-1.95%) 0.520 (-1.17%) 0.520 (-1.17%) 0.514 (0.00%) 0.710 (-38.13%)
Quarterly 0.269 (0.00%) 0.266 (+1.12%) 0.256 (+4.83%) 0.278 (-3.35%) 0.254 (+5.58%) 0.254 (+5.58%) 0.262 (+2.60%) 0.388 (-44.24%)
Monthly 0.353 (0.00%) 0.348 (+1.42%) 0.351 (+0.57%) 0.373 (-5.67%) 0.352 (+0.28%) 0.351 (+0.57%) 0.351 (+0.57%) 0.473 (-33.99%)
Other 0.275 (0.00%) 0.269 (+2.18%) 0.270 (+1.82%) 0.268 (+2.55%) 0.283 (-2.91%) 0.283 (-2.91%) 0.275 (0.00%) 0.320 (-16.36%)

Some comments about the results:

  • Prophet performs very poorly. The dataset does not contain multiple seasonalities, but it does contain human-activity based seasonal patters (quarterly and monthly series), changing trends and outliers or other abrupt changes (especially the `other’ subset), where Prophet should do ok. My concern is not that it is not ranking first, but that at best it is almost 16% worse than exponential smoothing (and at worst almost 44%!);
  • ETS and ARIMA between packages perform reasonably similar, indicating that although there are implementation differences, both packages have followed sound modelling philoshopies;
  • MAPA and THieF are meant to work on the quarterly and monthly subsets, where, in line with the research, they improve upon their base model (ETS).

In all fairness, more testing is needed on high frequency data with multiple seasonalities before one should conclude about the performance of Prophet. Nonetheless. for the vast majority of business forecasting needs (such as supply chain forecasting), Prophet does not seem to perform that well. As a final note, this is an open source project, so I am expecting over time to see interesting improvements.

Finally, I want to thank Oliver Schaer for providing me with Prophet R code examples! You can also find some examples here.