
Improving your forecasts using multiple temporal aggregation

In most business forecasting applications, the problem usually dictates the sampling frequency of the data that we collect and use for forecasting. Conventional approaches try to extract information from the historical observations to build a forecasting model. In this article, we explore how transforming the data through temporal aggregation allows us to gather additional information about the series at hand, resulting in better forecasts. We discuss a recently introduced modelling approach that combines information from multiple levels of temporal aggregation, each highlighting different features of the series, into a single robust and accurate forecast.

  • Temporal aggregation can aid the identification of series characteristics, as these are enhanced across different frequencies. Moreover, this simple trick reduces intermittency, allowing the use of established conventional forecasting methods for slow moving data.
  • Using multiple temporal aggregation levels can lead to substantial improvements in terms of forecasting performance, especially for longer horizons, as the various long term components of the series are better captured.
  • Combining across different levels of aggregation leads to estimates that are reconciled across all frequencies. From a practitioner’s point of view, this is very important, as it produces forecasts that are reconciled for operational, tactical and strategic horizons.
  • The associated R package MAPA allows for direct use of this algorithm in practice through the open source R statistical software.

Why temporal aggregation?

Typically, for short-term operational forecasts monthly, weekly, or even daily data are used. On the other hand, a common suggestion is to use quarterly or yearly data to produce long-term forecasts. Such data contain less detail and are often smoother, providing a better view of the long-term behaviour of the series. Of course, forecasts produced from different frequencies of the series are bound to be different, often making the cumulative operational forecasts over a period differ from the tactical or strategic forecasts derived from the cumulative demand for the same period. Nonetheless, in practice, it is most common that the same data and models are used to produce both short- and long-term forecasts, with apparent limitations, in particular for the long-horizon forecasts.

Different frequencies of the data can reveal or conceal various time series features. When fast-moving time series are considered, random variations and seasonal patterns are more apparent in daily, weekly or monthly data. Using non-overlapping temporal aggregation it is easy to construct lower-frequency time series. The level of aggregation refers to the size of the time buckets in the temporal aggregation procedure and is directly linked with the frequency of the data. Increasing the aggregation level results in a series of lower frequency. At the same time, this process acts as a filter, smoothing the high-frequency features and providing a better approximation of the long-term components of the data, such as the level, trend and cycle. Notice in Fig. 1 how the series changes at various levels of aggregation. The original monthly series is dominated by the seasonal component, while at the 12th level of aggregation the now-annual series is dominated by a shift in the level and a weak trend.

fors1.fig1

Fig. 1. A monthly fast-moving time series at different levels of non-overlapping temporal aggregation.

In the case of intermittent demand data, moving from higher (monthly) to lower (yearly) frequency data reduces (or even removes) the intermittence of the data, minimising the number of periods with zero demands. This can make conventional forecasting methods applicable to the problem. Fig. 2 demonstrates how such time series change behaviour as the level of the aggregation changes.

fors1.fig2

Fig. 2. A monthly slow-moving time series at different levels of non-overlapping temporal aggregation.

The presence and the magnitude of such time series features, both for fast- and slow-moving items, affect forecasting accuracy. So, the obvious question for a practising forecaster is: “Which aggregation level of my data should I use?”

Don’t trust your models!

As the time series features change with the frequency of the data (or the level of aggregation), different methods will be identified as optimal. These will produce different forecasts, which will ultimately lead to different decisions. In essence, we have to deal with true model uncertainty, that is, the appropriateness or misspecification of the model identified as optimal at a specific level of data aggregation. Even if we ignore the issue of temporal aggregation, with a single time series we face two major types of uncertainty: i) data uncertainty: are we just very lucky, or unlucky, with the sample we have? As more observations become available, how would our model and its parameters change? and ii) model uncertainty: is the model or its parameters appropriate, or does the model provide poor forecasts because of potential optimisation issues? The second problem is exacerbated for models with many degrees of freedom. Temporal aggregation reveals these differences and forces us to face this issue.

A potential answer to this issue is to consider all alternative views of the data, by using multiple aggregation levels, therefore reducing the aforementioned uncertainties. Each view is then modelled separately, best capturing the different time series features that are amplified at each level. As the models will be different, instead of preferring a single one, we combine them (or their forecasts) into a robust final forecast that takes into account information from all frequencies of the data.

So, what we propose is not to trust a single model on a single aggregation level of the data, which would unavoidably lead to the selection of a single “optimal” model. Instead, consider multiple aggregation levels. This approach attempts to reduce the risk of selecting a bad model, parameterised on a single view of the data, thus mitigating the importance of model selection.

The how-to

In this section we explain how the proposed Multiple Aggregation Prediction Algorithm (MAPA) should be applied in practice, through three steps: aggregation, forecast and combination. A graphical illustration of the proposed framework is presented in Fig. 3, contrasting it with the standard forecasting approach that is usually applied in practice.

fors1.fig3

Fig. 3. The standard versus the MAPA forecasting approach.

Step 1: Aggregation.

In the standard approach the input of the statistical forecasting methods is a single frequency of the data, usually the sampling frequency used during data collection or the one at which the forecasts will later be used. So, the aggregation level is driven either by data availability or by the intended use of the forecasts. On the other hand, MAPA uses multiple instances of the same data, which correspond to different frequencies or aggregation levels. Suppose we have 4 years of monthly observations (48 data points). In order to transform these to quarterly data (1 quarter = 3 months), we define 48/3 = 16 time buckets. Then we aggregate, using simple summation, so as to calculate the cumulative demand over each of these time buckets. This process creates 16 aggregated observations that match the required quarterly frequency. In this case, the aggregation level is equal to 3 periods, while the transformed frequency is equal to 1/3 of the original.

If the number of available observations at the original frequency is not a multiple of the selected aggregation level, then some data points are dropped in the temporal aggregation process. Continuing our previous example, if the aggregation level is set to 5, then only 9 time buckets can be defined, corresponding to 45 monthly observations. In this case, 3 monthly periods are discarded. We choose to discard observations from the beginning of the series, as the most recent ones are considered more relevant.
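To make the aggregation step concrete, here is a minimal sketch in R of the non-overlapping aggregation just described; the temporal_aggregate function is illustrative and not part of the MAPA package, and the AirPassengers dataset is used only as example data:

temporal_aggregate <- function(y, level) {
  buckets <- floor(length(y) / level)       # number of complete time buckets
  y <- tail(y, buckets * level)             # drop leftover observations from the start
  colSums(matrix(y, nrow = level))          # sum the demand within each bucket
}

y <- as.numeric(AirPassengers)              # 144 monthly observations, used as an example
temporal_aggregate(y, 3)                    # 48 quarterly totals
temporal_aggregate(y, 5)                    # aggregation level 5: the 4 oldest months are dropped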

This process of transforming the original data to alternative (lower) frequencies, using multiple aggregation levels, can continue as long as all transformed series have enough data to produce statistical forecasts. If the original sampling frequency is monthly, and given that companies usually hold 3 to 5 years of history, we suggest that the aggregation process continues up to the yearly frequency (aggregation level equal to 12 periods). This is also adequate to highlight long-term movements of the series.

In any case, while the starting frequency is always bounded by the sampling of the raw data, we propose that the upper level of aggregation should reach at least the annual level, where seasonality is filtered completely from the data and long-term (low frequency) components, such as the trend, dominate. Of course, the range of aggregation levels should contain the ones that are relevant to the intended use of the forecasts: for example, monthly forecasts for S&OP, or yearly forecasts for long-term planning. This ensures that the resulting MAPA forecast captures all relevant time series features and therefore provides temporally reconciled forecasts.

The output of this first step is a set of series, all corresponding to the same base data but translated to different frequencies.

Step 2: Forecasting.

Each one of the series calculated in step 1 should be forecasted separately. In the forecasting literature, several automatic model selection protocols exist. For fast-moving items exponential smoothing is regarded as reliable and relatively accurate. It corresponds to models that may include trend (damped or not) and seasonality, while allowing for multiple forms of interaction (additive or multiplicative). A widely used approach is the automatic selection of the best model form, based on the minimisation of a predefined information criterion (such as the AIC). It is expected that on our set of series different models will be selected across the various aggregation levels. Seasonality and various high frequency components are expected to be modelled better at lower aggregation levels (such as monthly data), while the long-term trend will be better captured as the frequency is decreased (yearly data).
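As an illustration of this step, the sketch below fits an ETS model with automatic AIC-based selection (via the ets function of the forecast package) at every aggregation level, reusing the illustrative temporal_aggregate helper from the previous step; the exact models MAPA fits may differ:

library(forecast)
levels <- 1:12                                        # from monthly up to annual buckets
fits <- lapply(levels, function(k) {
  m <- ifelse(12 %% k == 0, 12 / k, 1)                # seasonal period only if it is an integer
  ets(ts(temporal_aggregate(y, k), frequency = m))    # automatic model selection via AIC
})
sapply(fits, function(f) f$method)                    # selected model form per aggregation level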

In the case of intermittent demand series, there are classification schemes that allow selecting between the two most widely used forecasting methods: Croston’s method and the Syntetos-Boylan Approximation (SBA). Under such schemes, the level of intermittency and the variability of the demand determine which method is more appropriate. Using multiple aggregation levels will change both the intermittency and the variability, so we expect that different methods will be identified as optimal across the various frequencies. Furthermore, at higher levels of aggregation the resulting series may contain no zero-demand periods. In these cases, we propose using conventionally optimised Simple Exponential Smoothing, taking advantage of its good performance on non-intermittent time series. Moreover, the transformation of the data from slow-moving to fast-moving may reveal regular time series components (such as trend or seasonality) that are not visible at the lower aggregation levels, where noise is the dominant component. Any time series sampled at a sufficiently high frequency becomes intermittent; therefore, slow- and fast-moving series should be considered as a continuum.
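The following rough sketch illustrates the kind of classification described above, using the average inter-demand interval and the squared coefficient of variation of the non-zero demand sizes; the cut-off values (1.32 and 0.49) are the commonly cited Syntetos-Boylan ones and are shown only for illustration, not as the exact scheme used in MAPA:

classify_demand <- function(y) {
  nz <- y[y > 0]
  if (length(nz) == length(y)) return("SES")   # no zero-demand periods left after aggregation
  adi <- length(y) / length(nz)                # average inter-demand interval
  cv2 <- (sd(nz) / mean(nz))^2                 # variability of the non-zero demand sizes
  if (adi <= 1.32 && cv2 <= 0.49) "Croston" else "SBA"
}

classify_demand(c(0, 3, 0, 0, 2, 0, 4, 0, 0, 1))   # returns "SBA" for this intermittent example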

The output of this step is multiple sets of forecasts, each one corresponding to an alternative frequency.

Step 3: Combination.

The final stage of the proposed MAPA approach refers to the appropriate combination of the different forecasts derived from the alternative frequencies.

Before being able to combine the forecasts produced at the various frequencies, we need to transform them back to the original frequency. While the aggregation of forecasts from lower aggregation levels is straightforward, disaggregation can be more complicated. Many alternative strategies have been discussed in the literature; however, simply dividing each forecast into equal parts at the higher-frequency level has been shown to be both effective and simple to use. Consider, for example, that after the forecasting step we have some aggregated forecasts at the quarterly frequency. In order to transform each of these point forecasts to the original (monthly) frequency, we simply divide it by three and use the same value for all three months. So, the forecast for January will be the same as the forecast for February or March, and all of these will be equal to the forecast for Q1 divided by 3.
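For example, a minimal sketch of this equal-split disaggregation in R (the numbers are illustrative):

quarterly_fc <- c(300, 330, 360, 390)            # illustrative quarterly forecasts
monthly_fc <- rep(quarterly_fc / 3, each = 3)    # Jan = Feb = Mar = Q1 / 3, and so on
monthly_fc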

Once all forecasts from the various aggregation levels are translated back to the original frequency, they are combined into a final forecast. The combination can be done by calculating means, medians or other operators, such as trimmed means of the forecasts. We have found that both means and medians perform well, although the latter are more appropriate to ensure that the combined final forecasts are not affected by extreme values.
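Continuing the sketch above, the combination step could look as follows, where each row holds the back-transformed forecasts of one aggregation level (the values are again illustrative):

fc_matrix <- rbind(level1  = c(100, 104, 108),
                   level3  = c(101, 101, 101),
                   level12 = c(103, 103, 103))   # back-transformed forecasts per level
apply(fc_matrix, 2, median)                      # final combined forecast for each horizon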

It may be the case that the intended use of the forecasts is at a different frequency from the one at which the data were sampled. Or it might be that forecasts at various aggregation levels are needed for operational, tactical or strategic planning. To achieve this, simply aggregate the final forecast to the desired level(s). A convenient property of MAPA forecasts is that they are appropriate for all these levels, as they contain information from all of them. Therefore, such forecasts at the various aggregation levels are temporally reconciled, and there is no longer a need to work out how to make forecasts agree that would otherwise typically differ.

The case of seasonality

One problem that arises from the application of the algorithm described above concerns the excessive dampening of the seasonal component of the time series. Consider the case of monthly data that are converted to all frequencies down to yearly. If the original series is seasonal, then seasonality will be modelled only at aggregation levels 1 (which corresponds to the original frequency), 2, 3, 4 and 6. At the same time, seasonality will not be modelled at any of the other aggregation levels considered, as it is either fractional or completely filtered out.

Calculating a simple combination across the forecasts of all levels will essentially dampen the seasonal pattern. This is illustrated in Fig. 4, where two models, one with seasonality (red) and one without (green), are combined and the resulting forecast has a poor fit, due to the dampening of the seasonality.

fors1.fig4

Fig. 4. Example of dampened seasonal component due to forecast combination.

In order to address this issue the combination should be done on the model components and not on the forecasts. This is trivial when using the exponential smoothing family of methods, as it provides estimates of each component (level, trend and seasonality) separately. The combination of the seasonal component will only take into account the aggregation levels where seasonality is permitted (1, 2, 3, 4 and 6). On the other hand, the level and trend will be modelled at every aggregation level. If at one level a model with no trend is selected, then the trend component for that aggregation level is equal to zero. Therefore, instead of combining the forecasts, one has to first combine all the level estimates from the various aggregation levels, then all the trend estimates and then the seasonal estimates. The resulting combined components are then used to calculate the final forecast, in the same way as one would do with conventional exponential smoothing.
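A conceptual sketch of this component-wise combination is given below; the component values are placeholders, whereas in MAPA they are derived from the ETS states at each aggregation level, expressed at the original frequency:

level_est <- c(100, 101, 99, 100, 102, 101)      # level estimates from aggregation levels 1, 2, 3, 4, 6 and 12
trend_est <- c(0.5, 0.4, 0.0, 0.6, 0.5, 0.7)     # zero where a no-trend model was selected
seas_est  <- rbind(c(1.20, 0.85, 0.95),          # seasonal indices, only from levels 1, 2, 3, 4 and 6
                   c(1.15, 0.90, 0.95),
                   c(1.25, 0.80, 0.95),
                   c(1.20, 0.85, 0.95),
                   c(1.15, 0.90, 0.95))

comb_level <- median(level_est)                  # combine each component separately
comb_trend <- median(trend_est)
comb_seas  <- apply(seas_est, 2, median)
h <- 1:3
(comb_level + h * comb_trend) * comb_seas        # final forecast, assuming multiplicative seasonality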

Other ways of tackling this modelling issue would include the consideration of only some of the aggregation levels or the introduction of models that can deal with fractional seasonality.

Is it worth the extra effort?

The short answer is yes! The MAPA approach has been tested on both fast-moving and slow-moving demand data, providing improved forecasting performance when compared with traditional approaches. In more detail, the proposed approach gives better estimates both in terms of accuracy and bias. In the case of fast-moving data, the MAPA improves on exponential smoothing at most data frequencies, while being especially accurate at longer-term forecasts. This is a direct outcome of fitting models at high aggregation levels, where the level and the long-term trend of the series are best identified. In the case of slow-moving data, the MAPA, under the enhanced selection scheme (Croston-SBA-SES), performs better when compared to any single method, while its performance is also superior to that of the original selection scheme (Croston-SBA).

On top of any improvements in the forecasting performance of the MAPA, another advantage of this algorithm is that the decision maker does not have to select a single aggregation level a priori. While, in some cases, setting the aggregation level equal to the lead time plus the review period makes sense, removing this hyper-parameter can be regarded as an advantage, simplifying the forecasting process.

It’s not only about accuracy

Although there are accuracy and robustness benefits from using MAPA, the key advantage is an organisational one. Combining multiple temporal aggregation levels (thus capturing both high and low frequency components) leads to more accurate forecasts for both short and (especially) long horizons. Most importantly, this strategy provides reconciled forecasts across all frequencies, which is suitable for aligning operational, tactical and strategic decision making. This is useful for practice as the same forecast can be used for all three levels of planning.

Therefore, forecasts and decisions are reconciled and there is no need to alter or prefer forecasts from a particular level, something that in practice is often done ad hoc with detrimental effects. This is a significant drive towards the “one-number” forecast, where several decisions and organisational functions are based on the same view about the future, thus aligning them. It is not uncommon in practice that short-term operational forecasts, driving demand planning and inventory management, are incompatible with long-term budget forecasts. MAPA addresses this issue by providing a single reconciled view across the various planning horizons. This is just the first step towards fully reconciled forecasts within an industrial setting. Cross-sectional reconciliation and effective interaction and communication between the different stakeholders are only some of the remaining aspects of the same problem.

Use it in practice

If you are interested in using this approach for forecasting, a starting point is the MAPA package for R. Follow that link for examples on how to run MAPA with your own data.

Further reading

N. Kourentzes, F. Petropoulos and J.  R. Trapero, 2014, Improving forecasting by estimating time series structural components across multiple frequencies. International Journal of Forecasting, 30: 291-302.

F. Petropoulos and N. Kourentzes, 2015, Forecast combinations for intermittent demand. Journal of the Operational Research Society, 66: 914-924.

This text is an adapted version of:

F. Petropoulos and N. Kourentzes, 2014, Improving Forecasting via Multiple Temporal Aggregation, Foresight: The International Journal of Applied Forecasting.

Forecast Combinations for Intermittent Demand

F. Petropoulos and N. Kourentzes, 2015, Journal of the Operational Research Society, 66: 914-924. http://dx.doi.org/10.1057/jors.2014.62

Intermittent demand is characterised by infrequent demand arrivals, where many periods have zero demand, coupled with varied demand sizes. The dual source of variation renders forecasting for intermittent demand a very challenging task. Many researchers have focused on the development of specialised methods for intermittent demand. However, apart from a case study on hierarchical forecasting, the effects of combining, which is a standard practice for regular demand, have not been investigated. This paper empirically explores the efficiency of forecast combinations in the intermittent demand context. We examine both method and temporal combinations of forecasts. The first are based on combinations of different methods on the same time series, while the latter use combinations of forecasts produced on different views of the time series, based on temporal aggregation. Temporal combinations of single or multiple methods are investigated, leading to a new time series classification, which leads to model selection and combination. Results suggest that appropriate combinations lead to improved forecasting performance over single methods, as well as simplifying the forecasting process by limiting the need for manual selection of methods or hyper-parameters of good performing benchmarks. This has direct implications for intermittent demand forecasting in practice.

Download paper.
A simplified discussion of the article can be found here.
Some updated results can be found here.

Update for TStools

Amongst various minor improvements a few interesting new functions have been added to TStools package for R:

  • Theta method
  • ABC-XYZ analysis

The Theta method was found to be the most accurate method in the M3 forecasting competition, but since then there has been limited use of it. As it was later shown, the M3 competition setup of the Theta method is equivalent to single exponential smoothing with drift and an additive constant, applied with classical decomposition.
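For readers unfamiliar with the method, here is a minimal sketch of the classical Theta idea on a deseasonalised series, not the TStools implementation: the Theta0 line is a linear trend, the Theta2 line doubles the local curvature, and the two extrapolations are averaged (ses comes from the forecast package):

library(forecast)
set.seed(1)
y <- 50 + 0.8 * (1:60) + rnorm(60, sd = 3)        # toy deseasonalised series
h <- 12
tt <- seq_along(y)
fit0 <- lm(y ~ tt)                                # Theta0 line: linear trend
theta0_fc <- predict(fit0, data.frame(tt = length(y) + 1:h))
theta2 <- 2 * y - fitted(fit0)                    # Theta2 line: doubled local curvature
theta2_fc <- as.numeric(ses(theta2, h = h)$mean)  # SES extrapolation of the Theta2 line
(theta0_fc + theta2_fc) / 2                       # equal-weight combination of the two lines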

This is a beta function incorporating a few tweaks over the original method:

  • Decomposition is performed only if seasonality is detected. The original method tested this using an ACF based test. This implementation uses the approach used in the seasplot function;
  • Seasonality can be additive or multiplicative, estimated using classical decomposition or a pure seasonal exponential smoothing model;
  • If decomposition is performed then the Theta0 line is based on the trend component, rather than the deseasonalised time series;
  • Theta0 is a linear trend only if changes in the level are detected, otherwise it is just a mean level estimation;
  • Theta0 and Theta2 lines, as well as the pure seasonal model, can be optimised with various cost functions. The idea is that it may be beneficial to have a robust Theta0 line (optimised on MAE) and a more reactive Theta2 line, and so on.

Limited benchmarking has shown that this implementation is marginally more accurate than the original Theta method, or the reformulated one. The best setup and options are still under investigation!

To use the method simply run:

> theta(referrals,h=18)

The output includes the forecasted values, the result of trend and seasonality identification, the predicted values of the various Theta lines and the seasonal component, model parameters and cost functions used. There are two different visualisations available, just the forecast:
tstoolsR.fig10
or the forecast with the Theta lines. I quite prefer the second one, as it explains how the forecast is calculated:
tstoolsR.fig11

There are three functions to support the ABC-XYZ analysis. Let us first get some quarterly time series to analyse:

> library(Mcomp)
> M3.subset <- subset(M3, "QUARTERLY")
> x <- matrix(NA,46,100)
> for (i in 1:100){
  x.temp <- c(M3.subset[[i]]$x,M3.subset[[i]]$xx)
  x[1:length(x.temp),i] <-x.temp
}

To perform ABC analysis use the abc function. Unless the importance of a series is provided externally, it is estimated as the mean sales volume of the product.

> imp <- abc(x,outplot=TRUE)

tstoolsR.fig.12
The plot shows how concentrated the value is in the A, B and C classes. The bigger the difference between the straight diagonal line and the concentration curve, the more value is concentrated in the A and B classes. For example, here we have set class A to contain 20% of the products, but it accounts for 28.7% of the value. Another example, with very high concentration, is the following:

tstoolsR.fig.17

Here A products account for 47.9% of the total value. On the other hand, the C products account for only 11.8% of the total value, even though this class contains 50% of the items.

Although ABC analysis usually classifies items into three groups, it is fairly easy to introduce more groups, depending on the percentages used:

> abc(x,c(0.1,0.2,0.3,0.4))

tstoolsR.fig.13
Similarly, XYZ analysis can be done using the xyz function. Forecastability can be estimated in three different ways. The default is to calculate a scaled in-sample RMSE of the best of two methods: Naive and Seasonal Naive. The scaling is done by dividing the RMSE by the mean of each time series. The second approach is to calculate the coefficient of variation for each time series. The last approach is to fit an ETS model (using the forecast package) and again calculate the scaled in-sample RMSE. The advantage of the first approach is that it can handle seasonal time series and is relatively fast. The second approach is very simple, but it does not handle trended or seasonal time series well. The last approach selects the appropriate ETS model for each time series, thus being able to handle most types of time series. On the downside, it requires substantially more time.

> frc <- xyz(x,m=4,outplot=TRUE)

tstoolsR.fig.14
As with the abc function, external measures of forecastability can be used and any number of classes.

Finally, the abcxyz function visualises the combined classification:

> abcxyz(imp,frc)
   Z  Y  X
A  4  7  9
B  6  7 17
C 10 16 24

tstoolsR.fig.15
This function also supports a custom number of classes.
tstoolsR.fig.16

Critical values for the Nemenyi test

The critical distance for the Nemenyi test is calculated as:

r_{\alpha,K,N} \approx q_{\alpha,K}\sqrt{\frac{K(K+1)}{6N}},

where \alpha is the confidence level, K is the number of models and N is the number of measurements. To calculate q_{\alpha,K}, the Studentised range statistic for infinite degrees of freedom divided by \sqrt{2} is used. The values of q_{\alpha,K} for up to K=100 can be downloaded here as a csv file.
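As a quick sketch (not the author's code), the critical distance can also be computed directly in R with the qtukey function, which gives the Studentised range quantile:

nemenyi_cd <- function(conf = 0.95, K, N) {
  q <- qtukey(conf, nmeans = K, df = Inf) / sqrt(2)   # q_{alpha,K}, with alpha the confidence level
  q * sqrt(K * (K + 1) / (6 * N))                     # critical distance
}

nemenyi_cd(0.95, K = 4, N = 50)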

You may also be interested in the implementations of the Nemenyi test in MatLab and R, or a discussion on how to best visualise its results and its connection with the MCB test.

How to use R packages from GitHub

Update: The following R commands will install a package directly from GitHub to R:

> if (!require("devtools"))
    install.packages("devtools")
> devtools::install_github("YYY/ZZZ")

where YYY is the GitHub username and ZZZ is the package name. For example to install TStools use:

> devtools::install_github("trnnick/TStools")

This makes the process described below unnecessary, however if you want the ‘full’ story read on!


I typically use RStudio in Windows when I code in R, so I will give my examples using that. However this is not crucial for the process, as long as you have some basic knowledge of building packages.

Step 1. Download the files from GitHub

I will first save the files locally. I will be using TStools for this example. At the GitHub page you will see at the bottom-right corner a Download ZIP button. Use that to get a full copy of the directory.

githubR.fig1

Unzip the contents of the file in a temporary directory.

Step 2. Get Rtools if you don’t have it already

Rtools is a collection of resources for building R under Microsoft Windows. If you have built a package in the past chances are that Rtools is already installed. If you don’t have it then you can get it here. Install it and proceed to the next step.

Step 3. Create a new project in R

In RStudio go to File -> New Project

githubR.fig2

In the new window select Existing Directory and browse to the folder where you unzipped the file you downloaded from GitHub. Drill down until you see the R subdirectory and press Select Folder.

githubR.fig3

and then click Create Project.

Step 4. Build the package

With the project loaded, all that you need to do now is click on Build & Reload in the Build tab. This is typically on the top-left of your screen. RStudio will now build the package and add it to your package list.

githubR.fig4

If you can see the tabs Environment and History, but not Build, then the project is not loaded or something went wrong in the project creation. Repeat step 3 or try looking for the project file in the folder where you unzipped the files from GitHub. If there is an .Rproj file you should be able to load it by going to File -> Open Project in the RStudio menu.

Step 5. Close project

The last thing you need to do is go to File -> Close Project in RStudio menu. The package you downloaded from GitHub is now ready to use. You can now delete the temporary directory, including the project files.

Of course there are many different ways to do this process. Here I highlighted one of the simplest ones (I think!).

Increasing knowledge base for nowcasting GDP by quantifying the sentiment about the state of economy

N. Kourentzes and F. Petropoulos, 2014.

Predicting the present state of the economy can be challenging. Recently, econometric models that are capable of using multiple sources of hard and soft information, published at various frequencies and dates, to produce nowcasts of policy relevant variables have been proposed. Following the progress in the modelling aspects of nowcasting, the literature has started exploring new useful inputs. ‘Big data’ and various Internet sources can be used in order to further enhance nowcasting models, by enriching them with new types of information, as well as being more up to date. This research proposes a methodology to create useful inputs by mining news articles, capturing the current sentiment in the media about the state of the economy. News articles are nowadays easy to access and process, as most news outlets provide online versions of them. Furthermore, the high frequency of publication and the representation of the current economic discourse are invaluable for nowcasts. The case study of the Greek economy is used and rolling nowcasts are produced for the period of 2008 to 2013. Empirical evidence suggests that the inclusion of such information improves the accuracy of nowcasts.

Download paper.

Participate in our Judgemental Model Selection Experiment!

We are inviting you to participate in a web-based judgmental forecasting exercise. You are asked to select the best model, based on your judgment, for 32 time series.

The exercise consists of four rounds. Each round will contain 8 series and will be followed by a short questionnaire, while different types of information will be provided on top of the estimated forecasts for the future periods. Detailed instructions will be given prior to each round. You may complete the exercise at once (preferred option) or in multiple instances. Your progress will be saved and you may resume at your convenience.

At the end of the exercise you will be provided with your score. Your final performance will be calculated based on the rank of the models corresponding to your choices, according to the actual (out-of-sample) performance of each model. The participants within the Top-20 will be awarded with Amazon Gift Cards (£50 each).

You can access the web-based exercise here: Forecasting Society

Thank you in advance for your participation!

Fotios Petropoulos and Nikolaos Kourentzes

TStools for R

This is a collection of functions for time series analysis/modelling for R. Follow link to GitHub. If you need help installing this package in R have a look at this post. Alternatively just type in R the following commands:

> if (!require("devtools"))
    install.packages("devtools")
> devtools::install_github("trnnick/TStools")

At the time of posting the following functions are included:

  • cmav: Centred moving average;
  • coxstuart: Cox-Stuart test;
  • decomp: Time series decomposition;
  • kdemode: Mode estimation via KDE;
  • nemenyi: Friedman and Nemenyi tests;
  • regopt: Identify regression beta using various cost functions;
  • seasplot: Seasonal plots.

Here are some examples of what these functions do.

The cmav function will calculate the centred moving average of a time series. It differs from the ma function in the forecast package in the sense that it can provide values for the first and last segment of the time series, where the centred moving average cannot be calculated, by fitting an ETS model.

> cmav(referrals,outplot=TRUE)

tstoolR.fig9

The decomp function performs classical decomposition on a time series.

> decomp(referrals,outplot=TRUE)

tstoolR.fig1
You can estimate the seasonal component using either the mean or the median of the historical indices, or fit a pure seasonal model. Here, I also choose the type of decomposition and ask for the seasonal indices for the next 12 months.

> decomp(referrals,outplot=TRUE,type="pure.seasonal",decomposition="additive",h=12)

tstoolR.fig2

The kdemode can be used to find the mode of a distribution by kernel density estimation. The bandwidth is automatically identified.

> data <- rlnorm(200,mean=0,sd=1)
kdemode(data,outplot=TRUE)

tstoolR.fig3
The bandwidth is estimated automatically using the excellent estimator proposed by Botev et al. Alternatively, the bandwidth can be calculated using either the Sheather and Jones method or the Silverman heuristic. This was quite useful for producing the mode ensemble neural networks that are proposed here.

 

The nemenyi function can be used to run the Nemenyi test to compare models. Here is an example:

> N <- 50
> M <- 4
> data <- matrix( rnorm(N*M,mean=0,sd=1), N, M) 
> data[,2] <- data[,2]+1
> data[,3] <- data[,3]+0.7
> data[,4] <- data[,4]+0.5
> colnames(data) <- c("Method A","Method B","Method C","Method D");
> nemenyi(data,conf.int=0.95,plottype="vline")

tstoolR.fig4
I can also get the MCB test by using:

> nemenyi(data,conf.int=0.95,plottype="mcb")

tstoolR.fig5
These tests use the same statistic. A discussion on this and various visualisations can be found here.

 

Finally, the seasplot function produces seasonal plots. It differs from the similar seasonplot function in the forecast package in that it can automatically test for the presence of a trend and remove it, if present.

> seasplot(referrals)

tstoolR.fig6
Furthermore, there are a few alternative visualisations you can do with seasplot. Here are some examples:
tstoolR.fig7
tstoolR.fig8
You may wonder how that p-value is calculated. Based on the definition of deterministic seasonality, I just compare the distributions of the seasonal elements and test whether at least one is different. I do this using the nonparametric Friedman test.
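As a rough sketch of the idea (assumed here, not the seasplot internals), one can arrange a detrended series so that rows are years and columns are seasons, and pass it to R's friedman.test:

set.seed(1)
y <- rep(c(5, 0, -3, 1), 12) + rnorm(48)          # toy detrended quarterly series
seas <- matrix(y, ncol = 4, byrow = TRUE)         # rows = years, columns = quarters
friedman.test(seas)                               # a small p-value suggests at least one season differs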

The plan is to keep on updating TStools on GitHub. When there are substantial updates, I will post them here.

For examples of some of the new features of TStools have a look here.

Multiple Aggregation Prediction Algorithm (MAPA)

MAPA code for R is available on CRAN.
An online interactive demo of MAPA can be found here.

Here is a quick demonstration of what you can do with the code. The easiest way to produce a forecast with MAPA is to use the mapasimple function.

> mapasimple(admissions)
     t+1      t+2      t+3      t+4      t+5      t+6      t+7      t+8      t+9     t+10     t+11     t+12 
457438.0 446869.3 450146.7 462231.5 457512.8 467895.1 457606.0 441295.7 471611.2 454282.0 458308.0 453472.5

This also gives you a simple plot of the series and the forecast:

I can ask for a detailed view of the predicted states at each temporal aggregation level:

> mapasimple(admissions,outplot=2,paral=2)


In this example I also used paral=2. This creates a parallel cluster, executes mapasimple and then closes the cluster. If you already have a parallel cluster running you can use paral=1.
Using the functions mapaest and mapafor I can have more detailed control over the estimation and the forecasts across the different levels of temporal aggregation.

> mapafit <- mapaest(admissions,paral=2)
> mapafor(admissions,mapafit)

The first function estimates the model fit at each temporal aggregation level and also provides a visualisation of the identified ETS components.
mapaR.fig2
The second function provides in- and out-of-sample forecasts. By default only one-step-ahead in-sample forecasts are given.
mapaR.fig3
This is easy to change, for instance to 12-steps ahead:

> mapafor(admissions,mapafit,ifh=12,fh=0)

mapaR.fig4
I can stop plotting the output by setting outplot=0 in any of the above functions. The functions have several more options, allowing you to set the maximum temporal aggregation level, the type of MAPA combination, etc. Finally, the function mapa is a wrapper for both mapaest and mapafor, so these can be run with a single call.
Since version 1.5 of the package there are some new interesting features. The first one is to force a particular exponential smoothing model on all aggregation levels. This can be simply done by using the option model:

> mapa(admissions,model="AAdN",paral=2)

A nonseasonal damped trend model is fitted to the time series in this case. Since MAPA can no longer change between models and choose a simpler one, it is possible that the preselected model will have too many degrees of freedom for the aggregate version of a given series. In this case no model is fitted.
mapaR.fig6
Furthermore, if a seasonal model is selected, for any aggregation levels with non-integer seasonality a non-seasonal version of that model will be fitted.
Another new option is the ability to calculate empirical prediction intervals. As these require simulating forecasts for their calculation, they are computationally expensive and not provided by default. To get the 80%, 90%, 95% and 99% prediction intervals you can use:

> mapa(admissions,conf.lvl=c(0.8,0.9,0.95,0.99),paral=2)

mapaR.fig7
For more details about the Multiple Aggregation Prediction Algorithm (MAPA) look at this post.

Update: The MAPA package now allows modelling high-frequency time series and including regressors (MAPAx).

Neural network ensemble operators for time series forecasting

N. Kourentzes, D. K. Barrow and S. F. Crone, 2014, Expert Systems with Applications, 41: 4235-4244. http://dx.doi.org/10.1016/j.eswa.2013.12.011

The combination of forecasts resulting from an ensemble of neural networks has been shown to outperform the use of a single “best” network model. This is supported by an extensive body of literature, which shows that combining generally leads to improvements in forecasting accuracy and robustness, and that using the mean operator often outperforms more complex methods of combining forecasts. This paper proposes a mode ensemble operator based on kernel density estimation, which unlike the mean operator is insensitive to outliers and deviations from normality, and unlike the median operator does not require symmetric distributions. The three operators are compared empirically and the proposed mode ensemble operator is found to produce the most accurate forecasts, followed by the median, while the mean has relatively poor performance. The findings suggest that the mode operator should be considered as an alternative to the mean and median operators in forecasting applications. Experiments indicate that mode ensembles are useful in automating neural network models across a large number of time series, overcoming issues of uncertainty associated with data sampling, the stochasticity of neural network training, and the distribution of the forecasts.

Download paper.
An interactive summary of the paper can be found here.