
Multiple temporal aggregation: the story so far. Part III: MAPA

Multiple Aggregation Prediction Algorithm (MAPA)

In this third post about modelling with Multiple Temporal Aggregation (MTA), I will explain how the Multiple Aggregation Prediction Algorithm (MAPA) works, which was the first incarnation of MTA for forecasting.

MAPA is quite simple in its logic:

  1. a time series is temporally aggregated into multiple levels, at each level strengthening and weakening various components of the time series, as discussed before;
  2. at each level an independent exponential smoothing (ETS) model is fit and its components are extracted;
  3. the ETS components are combined, using a few tricks (see the paper for details), to produce the final forecast, which borrows information from all levels.
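The first two steps can be sketched in a few lines of R. This is a simplified illustration only (the combination tricks of step 3 are in the paper), assuming the forecast package for ets() and using non-overlapping sums for the aggregation:

```r
# Steps 1-2 of MAPA, sketched by hand: aggregate, then fit ETS per level
library(forecast)

y <- AirPassengers                    # monthly series, frequency 12
levels <- c(1, 2, 3, 4, 6, 12)        # aggregation levels that divide 12

fits <- lapply(levels, function(k) {
  n <- floor(length(y) / k) * k       # trim so all buckets are complete
  yk <- colSums(matrix(tail(as.numeric(y), n), nrow = k))
  ets(ts(yk, frequency = 12 / k))     # step 2: independent ETS at this level
})
sapply(fits, function(f) f$method)    # identified ETS form per level
```

The identified ETS forms will typically differ across levels; that difference is exactly the information that step 3 combines.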

In the original MAPA paper alternative combination approaches were trialled, where all temporal aggregation levels were given equal importance and combined through mean or median. For the seasonal component (which is the high frequency one), this causes an important issue: as the seasonality is filtered at the aggregate levels, it is effectively shrunk towards zero. Therefore, the combined seasonal component will be shrunk as well. Originally this was addressed by using a simple heuristic, combining the ETS forecast of the original series with the MAPA forecast (the hybrid approach). This effectively means that we use a weighted combination, where the first temporal aggregation level is given more weight than all other temporal aggregation levels together. Empirical evidence suggests that this re-weighting is beneficial.

The subsequently developed w.mean and w.median weighting schemes attempt to do the same with variable weights across temporal aggregation levels for the seasonal component. When dealing with high frequency time series, it is recommended to always use these.
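In the MAPA package for R these choices are exposed through arguments (names as in the package at the time of writing; see ?mapa for your version):

```r
# MAPA with different combination schemes; hybrid re-weights towards the
# ETS of the original series
library(MAPA)
frc.mean  <- mapa(AirPassengers, ppy = 12, fh = 12, comb = "mean",   hybrid = FALSE)
frc.wmean <- mapa(AirPassengers, ppy = 12, fh = 12, comb = "w.mean", hybrid = TRUE)
```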

The interactive plot below illustrates how MAPA works. You can choose between various time series, the combination scheme for the components across temporal aggregation levels, and whether the hybrid forecast (effectively a re-weighting of the components) is used or not. The top plot shows the identified ETS models at each temporal aggregation level. Greyed cells indicate levels at which no seasonality is estimated. These are levels that would require fractional seasonality, which is not permitted by conventional ETS. The second row of plots provides the forecasted ETS components across temporal aggregation levels, as well as the combined one (thick black line). The components of the first aggregation level are those of the ETS fitted to the original time series, and are plotted with a thicker line. For the seasonal component, only levels that can be seasonal are used. Note the difference between the trajectories for the ETS and MAPA components, as well as the various components at different temporal aggregation levels. The bottom plot provides the forecasts of ETS and MAPA. The MAPA forecast is simply the addition of the three MAPA components. If Hybrid is used, the resulting MAPA forecast is the combination of the MAPA components and the ETS forecast (which is equivalent to a specific weighting scheme of the components).

In most cases the level and trend components are different from the ETS ones, and the seasonality is always somewhat shrunk, depending on the combination weights scheme used.

Why does MAPA work? It captures low frequency components (trend) better, because of the temporal aggregation. Furthermore, it does not rely on a single ETS model that may or may not be well identified, therefore mitigating model uncertainty. I argue that MTA, as implemented in MAPA, is a neat trick to extract more information from a time series… for free!

What are the limitations of MAPA? Quite a few, but two stand out: (i) the combination weighting schemes are ad hoc, and there is strong evidence that equal weights are surely not the best solution; (ii) the identification of the ETS model at levels where the seasonality would be fractional is weakened, because that seasonality is ignored and allowed to contaminate the level and slope components. Arguably, the use of ETS at its core is another potential limitation, although that one is possible to lift.

In practice, MAPA was the first method after almost 15 years to improve upon the M3 competition results and since then there is mounting empirical evidence supporting its good forecasting performance, as well as extensions to incorporate explanatory variables and forecast intermittent demand time series. An interesting finding is that MAPA is very robust against misspecification, when compared to more conventional approaches that attempt to capitalise on a single (optimal) level of temporal aggregation. If you want to try it out, there is a package for R available (or on GitHub).

Ultimately the true contribution of MAPA was to demonstrate that the ideas behind MTA were sound and useful for forecasting! For this, the paper that introduced MAPA recently received the International Journal of Forecasting 2014-2015 best paper award; I am very humbled and happy for this!

Although the aforementioned limitations are not resolved, MAPA motivated research into Temporal Hierarchies that provide a more thorough foundation for using MTA in forecasting, overcoming many of MAPA’s issues, and enabling multiple avenues of future research. This will be the topic of a future post in the series. Till then, I will conclude by mentioning that MAPA in many applications still provides more accurate forecasts than Temporal Hierarchies, demonstrating that it is still an interesting research topic.

Multiple Temporal Aggregation: the story so far: Part I; Part II; Part III; Part IV.

International Journal of Forecasting 2014-2015 best paper award

In the very enjoyable and stimulating International Symposium on Forecasting that just finished in Cairns, Australia, the International Journal of Forecasting (IJF) best paper award for the years 2014-2015 (list of past papers can be found here) was given to one of my papers: Improving forecasting by estimating time series structural components across multiple frequencies!

This paper proposes the Multiple Temporal Aggregation approach that I have been posting about, and introduces the MAPA forecasting method, for which there is an R package. The other shortlisted papers were of very high quality and I am humbled by the choice of the editorial board. My thanks to my co-authors and the reviewers, who made this paper possible.

Material for 'Forecasting with R: A practical workshop' at ISF2017

You can find the material from yesterday’s workshop at the International Symposium on Forecasting, 2017 here. The workshop notes assume some knowledge of what the various forecasting methods do, and the main focus is on showing which functions to use and how, so as to perform a wide variety of forecasting tasks:

  • Time series exploration
  • Univariate (extrapolative) forecasting
  • Intermittent demand series forecasting
  • Forecasting with regression
  • Special topics: (i) Hierarchical forecasting; (ii) ABC-XYZ analysis; and (iii) LASSO regression

Material:

  1. Workshop notes: these provide code examples with comments. You will also find some references for the various methods used in the workshop.
  2. Workshop slides: these provide an extremely brief overview of some of the methods used and their implementation.
  3. Workshop R solution scripts: these replicate the examples in the notes.
  4. Workshop data: these are needed to replicate the examples in the notes and scripts.

The notes are aimed at researchers and experienced practitioners, who are comfortable with the theory behind the various models and methods. I hope you find this material useful!


Tactical sales forecasting using a very large set of macroeconomic indicators

Y.R. Sagaert, E-H. Aghezzaf, N. Kourentzes and B. Desmet, 2018. European Journal of Operational Research, 264.2 (2018): 558-569. https://doi.org/10.1016/j.ejor.2017.06.054

Tactical forecasting in supply chain management supports planning for inventory, scheduling production, and raw material purchase, amongst other functions. It typically refers to forecasts up to 12 months ahead. Traditional forecasting models take into account univariate information extrapolating from the past, but cannot anticipate macroeconomic events, such as steep increases or declines in national economic activity. In practice this is countered by using managerial expert judgement, which is well known to suffer from various biases, is expensive and is not scalable. This paper evaluates multiple approaches to improve tactical sales forecasting using macroeconomic leading indicators. The proposed statistical forecast automatically selects both the type of leading indicators and the order of the lead for each of the selected indicators. However, as the future values of the leading indicators are unknown, additional uncertainty is introduced. This uncertainty is controlled in our methodology by restricting inputs to an unconditional forecasting setup. We compare this with the conditional setup, where future indicator values are assumed to be known, and assess the theoretical loss of forecast accuracy. We also evaluate purely statistical model building against judgement-aided models, where potential leading indicators are pre-filtered by experts, quantifying the accuracy-cost trade-off. The proposed framework improves on forecasting accuracy over established time series benchmarks, while providing useful insights about the key leading indicators. We evaluate the proposed approach on a real case study and find 18.8% accuracy gains over the current forecasting process.

Download paper.

Multiple temporal aggregation: the story so far. Part II: The effects of TA

The effects of temporal aggregation

In this post I will demonstrate the effects of temporal aggregation and motivate the use of multiple temporal aggregation (MTA). I will not delve into the econometric aspects of the discussion, but it is worthwhile to summarise key findings from the literature. A concise forecasting-related summary is available in section 2 of our recent paper, Athanasopoulos et al. (2017):

  • Temporal aggregation changes the (identifiable) structure of the time series;
  • As the aggregation level increases, fewer components appear, and higher-frequency components (for example, seasonality and promotional effects) become weaker or vanish altogether;
  • Temporal aggregation reduces the sample size, resulting in a loss of estimation efficiency. To put it simply, if you have 4 years of monthly data and aggregate your series to a yearly level, you will have to build a model with only four data points, which is risky!
  • There are accuracy gains to be had, but identifying the (single) appropriate temporal aggregation level is very difficult! Yet, it still simplifies some problems, like intermittent demand forecasting.
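Non-overlapping temporal aggregation itself is trivial to compute. A base-R sketch of the sample-size point above:

```r
# Aggregating a monthly series into quarterly and yearly buckets
y <- as.numeric(AirPassengers)        # 144 monthly observations

agg <- function(y, k) {               # sum non-overlapping buckets of size k
  n <- floor(length(y) / k) * k
  colSums(matrix(tail(y, n), nrow = k))
}

quarterly <- agg(y, 3)                # 48 points, seasonal period becomes 4
yearly    <- agg(y, 12)               # 12 points, seasonality is gone
length(yearly)                        # only 12 observations left to model!
```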

What do these mean for our forecasts? Well, if you work on the basis that the true model is an elusive idea, these findings are not too prescriptive for constructing forecasts. I will try to give you a visual intuition. In the following interactive visualisation you can choose between different time series and plot the original and temporally aggregated data, together with a seasonal plot. The seasonal plot is shown only when it is feasible, i.e. when the seasonality resulting from the aggregation has an integer period greater than 1. For each series I also fit an appropriate exponential smoothing model (selected using AICc) and provide a list of the fitted components for all temporal aggregation levels, up to yearly data. I also provide the relevant forecast. Observe a few things:

  • The identified exponential smoothing models are often different across temporal aggregation levels. In particular, the seasonality is filtered as we aggregate into bigger time buckets. Of course, for some aggregation levels (for example, aggregating every 5 months) the resulting series has a non-integer seasonal period, which typical forecasting methods cannot capture; it instead contributes to the error part;
  • Other aspects of the series, like outliers, vanish as we aggregate to higher levels;
  • Sometimes aggregation makes the time series easier to model, and sometimes it over-smooths the series! The forecasts certainly vary a lot as we aggregate.


At minimum we can say that temporal aggregation alters the identifiable parts of the time series, strengthening low-frequency components (such as trend), while weakening high-frequency components (such as seasonality). Depending on the forecasting objective, this may result in better forecasts, especially if we are aiming at long term forecasts. Furthermore, simply because the temporal aggregation filters part of the noise (it is a moving average filter!) it may just be better to model a series at a more aggregate level.
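The noise-filtering point is easy to see on simulated white noise: averaging non-overlapping buckets of size k is a moving average filter that shrinks the noise variance by roughly a factor of k. A toy check (my illustration, not from any of the papers):

```r
# Aggregation (by averaging) filters noise: variance shrinks roughly as 1/k
set.seed(1)
e  <- rnorm(1200)                     # white noise, variance ~ 1
e3 <- colMeans(matrix(e, nrow = 3))   # aggregate buckets of 3 by their mean
c(var(e), var(e3))                    # second variance is roughly a third
```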

The main problem in the literature is that it is very difficult to know, for real data, the optimal temporal aggregation level that will maximise your forecast accuracy. This is not a trivial point: there are theoretical solutions suggesting the optimal temporal aggregation level for various data generation processes, but they rely on full knowledge of the process! Well, if I knew the process, then forecasting it would be trivial. Recent research showed that although we can easily demonstrate benefits on simulated data, things become much more complicated with real data, where the true model is unknown.

If we connect the dots, there are four key arguments in favour of MTA (discussed in more detail in these three papers [1], [2] and [3]):

  • Because we are provided with a time series sampled at some time interval, we do not have to model it at that level! It may be better to do so at some aggregate level.
  • Temporal aggregation can be beneficial for forecasting, but identifying a single optimal level of aggregation is very challenging, so why not use multiple?
  • Using multiple levels we avoid relying on a single forecasting model, therefore we mitigate modelling uncertainty by considering multiple (different) models across temporal aggregation levels.
  • Holistic modelling of the time series information: models built on the original data or on low aggregation levels can focus more on high frequency components, while models built on high aggregation levels focus on low frequency components, which may not be easy to capture in the originally sampled time series.

All these points suggest that using multiple temporal aggregation levels should be useful, but we have not yet addressed the question of how to do this! I will introduce our first attempt, the Multiple Aggregation Prediction Algorithm (MAPA), in the next post in the series.


 

Multiple temporal aggregation: the story so far. Part I

Over the last few years I have been working (with my co-authors!) on the idea of Multiple Temporal Aggregation (MTA) for time series forecasting. A number of papers have been published introducing and developing the idea further, or testing its effectiveness for forecasting.

In this series of blog posts I will try to summarise the progress so far, and highlight ways that you can use it. This first post will summarise the papers so far and give an overview of the main findings. Later posts will focus on explaining how MTA works.

The key points behind MTA are the following:

  • It is a radically different approach to time series modelling, recognising that the data sampling frequency may not be the best for a given modelling purpose.
  • A time series is modelled simultaneously at multiple temporal aggregation levels that can be easily generated from the original data. At each level an appropriate model is fit, focusing on the components of the series that are strengthened by temporal aggregation.
  • If forecasting is the objective, then the produced forecast reconciles the information from all these models. This makes the forecast robust to modelling uncertainty and lessens the importance of model selection.
  • The resulting forecasts have been shown to be reliable and typically outperform the conventional modelling approach.

Table 1 summarises our contributions on MTA so far (follow the links to access the papers). We have also released two R packages that implement MTA: MAPA and thief. The former implements, as the name suggests, MAPA, while the latter provides code to use Temporal Hierarchies.

Table 1. Papers on MTA
Paper Summary
Kourentzes et al. 2014. Improving forecasting by estimating time series structural components across multiple frequencies. The initial paper on MTA modelling. It introduces the Multiple Aggregation Prediction Algorithm (MAPA) and demonstrates its superior performance on the well-known M3 competition.
Petropoulos and Kourentzes 2014. Forecast combinations for intermittent demand. Expands MAPA for the case of intermittent demand.
Kourentzes and Petropoulos 2016. Forecasting with multivariate temporal aggregation: The case of promotional modelling. Expands MAPA for promotional modelling purposes at Stock Keeping Unit level.
Barrow and Kourentzes 2016. Distributions of forecasting errors of forecast combinations: implications for inventory management. Provides evidence of very strong performance of MAPA over established benchmarks for demand forecasting and inventory management purposes.
Athanasopoulos et al. 2017. Forecasting with temporal hierarchies. Introduces a general framework for MTA: Temporal Hierarchies that allows use of any model/method to produce forecasts at each level.
Kourentzes et al. 2017. Demand forecasting by temporal aggregation: using optimal or multiple aggregation levels? Demonstrates that MTA modelling is more robust to uncertainty than modelling either using the original data or using a single (optimal) temporal aggregation level.

To give you an idea of the reported improvements, I have collated some of the results from the papers above. The best forecast in each column, in all tables, is highlighted in boldface. Table 2 provides a summary for the quarterly and monthly M3 datasets, using as benchmarks the Exponential Smoothing (ETS) family of models, with automatic model selection (via AICc), and Theta, the best performing method on the original M3 competition – a position it held for almost 15 years! In this case both MAPA and Temporal Hierarchies make use of the ETS family of models, so you can get a feeling of the improvement provided by MTA over conventional time series forecasting, as the results are directly comparable with the ETS row.

Tables 3 and 4 provide results for a number of real datasets. Table 4 also provides results on a variety of simulated ARIMA series. The detailed results can be found in the respective papers. In all cases MAPA is better, or at least as good, compared to the various benchmarks. Table 5 provides results on real series that have promoted periods. There are two comparisons: forecasts without and with promotional information. In both cases MTA based forecasts (MAPA) are on average the most accurate.

Table 2. sMAPE results on M3 quarterly and monthly data¹
Forecast Quarterly set Monthly set
Exponential Smoothing (ETS) 9.94% 14.45%
Theta (M3 competition)² 8.96% 13.85%
MAPA (Kourentzes et al. 2014) 9.58% 13.69%
Temporal Hierarchies (Athanasopoulos et al. 2017) 9.70% 13.61%

¹ Papers provide results on more robust metrics!
² Best performance in the original M3 competition.

Table 3. Scaled RMSE results on Fast Moving Consumer Goods sales (Barrow and Kourentzes, 2016)
Forecast 1-step ahead 3-steps ahead 5-steps ahead
Naive 0.882 0.900 0.919
ETS 0.677 0.688 0.711
AR 0.707 0.719 0.737
ARIMA 1.446 0.701 0.721
Theta 0.674 0.685 0.705
MAPA 0.668 0.670 0.687
Table 4. Average Relative MAE on simulated and real data (Kourentzes et al., 2017)
Forecast Simulated ARIMA Manufacturing Call centre
Single Exponential Smoothing (SES) 1.000 1.000 1.000
Exponential Smoothing (ETS) 0.985 1.011 1.005
Optimal Temporal Aggregation & SES 0.974 0.999 1.080
MAPA 0.971 0.994 0.979
Table 5. Scaled MAE results on SKUs with promotions (Kourentzes and Petropoulos, 2016)
Forecast 4-steps ahead 8-steps ahead 12-steps ahead
Naive 0.743 0.818 0.704
ETS 0.704 0.774 0.701
MAPA 0.679 0.754 0.736
Regression + Promotional 0.611 0.659 0.714
ETS + Promotional 0.642 0.627 0.543
MAPA + Promotional 0.525 0.521 0.515

The main argument in all papers is that MTA helps to improve forecast accuracy due to the way it mitigates modelling uncertainty. As we will see this comes at no additional data cost and relatively limited additional computations. An added benefit, which is not very evident from the summarised tables provided here, is that the MTA forecasts are reliable both for short and long term forecasting, providing a way to reconcile operational, tactical and strategic planning.

Unpublished results on different applications provide a similar picture in terms of accuracy. There is also evidence that MTA can strengthen statistical tests, as the initial results of this experiment show. However, all this is ongoing research, so until a full analysis is conducted and the results are peer reviewed, I would add a pinch of salt to these!

In following blog posts I will explain how MTA works and elaborate more on results from the various papers.


Demand forecasting by temporal aggregation: using optimal or multiple aggregation levels?

N. Kourentzes, B. Rostami-Tabar and D.K. Barrow, 2017, Journal of Business Research. http://doi.org/10.1016/j.jbusres.2017.04.016

Recent advances have demonstrated the benefits of temporal aggregation for demand forecasting, including increased accuracy, improved stock control and reduced modelling uncertainty. With temporal aggregation a series is transformed, strengthening or attenuating different elements and thereby enabling better identification of the time series structure. Two different schools of thought have emerged. The first focuses on identifying a single optimal temporal aggregation level at which a forecasting model maximises its accuracy. In contrast, the second approach fits multiple models at multiple levels, each capable of capturing different features of the data. Both approaches have their merits, but so far they have been investigated in isolation. We compare and contrast them from a theoretical and an empirical perspective, discussing the merits of each, comparing the realised accuracy gains under different experimental setups, as well as the implications for business practice. We provide suggestions when to use each for maximising demand forecasting gains.

Download paper.

R package (MAPA). Code for optimal aggregation level: function get.opt.k in TStools package.

Can you spot trend in time series?

Past experiments have demonstrated that humans (with or without formal training) are quite good at visually identifying the structure of time series. Trend is a key component, and arguably the most relevant to practice, as many of the forecasts that affect our lives have to do with potential increases or decreases of economic variables. Forecasters and econometricians often rely on formal statistical tests, while practitioners typically use their intuition, to assess whether a time series exhibits trend. There is fairly limited research contrasting the two. Furthermore, trend is sometimes understood with a rather vague definition. I do not think it is an exaggeration to suggest that even experts can often be confused about the exact definition (and effect) of a unit root and a trend.
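For reference, two of the formal checks often reached for, assuming the tseries package (the experiment compares against tests of this kind, though not necessarily these exact ones):

```r
# Common formal tests related to trend and unit roots
library(tseries)
kpss.test(AirPassengers, null = "Trend")  # stationarity around a deterministic trend
adf.test(AirPassengers)                   # augmented Dickey-Fuller unit root test
```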

To understand more about this, I set up a simple experiment to collect evidence on how humans perceive trends. The experiment below asks you to distinguish between trended and non-trended time series. For every 10 time series that you assess, it will provide you with some statistics on your accuracy and the accuracy of some statistical tests (by no means an exhaustive list!). It also provides overall statistics from all participants so far. As you can see, it is not so trivial to correctly identify the presence of trend! What do you think, can you do better than the average performance so far?

Forecasting with Temporal Hierarchies

G. Athanasopoulos, R. J. Hyndman, N. Kourentzes and F. Petropoulos, 2017, European Journal of Operational Research. http://doi.org/10.1016/j.ejor.2017.02.046

This paper introduces the concept of Temporal Hierarchies for time series forecasting. A temporal hierarchy can be constructed for any time series by means of non-overlapping temporal aggregation. Predictions constructed at all aggregation levels are combined with the proposed framework to result in temporally reconciled, accurate and robust forecasts. The implied combination mitigates modelling uncertainty, while the reconciled nature of the forecasts results in a unified prediction that supports aligned decisions at different planning horizons: from short-term operational up to long-term strategic planning. The proposed methodology is independent of forecasting models. It can embed high level managerial forecasts that incorporate complex and unstructured information with lower level statistical forecasts. Our results show that forecasting with temporal hierarchies increases accuracy over conventional forecasting, particularly under increased modelling uncertainty. We discuss organisational implications of the temporally reconciled forecasts using a case study of Accident & Emergency departments.

Download paper.

R package (thief).
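A minimal run of the package, reconciling ETS forecasts across the temporal hierarchy of a monthly series (argument names as in the package; see ?thief for your version):

```r
# Temporal hierarchy forecasts with thief
library(thief)
frc <- thief(AirPassengers, h = 12, usemodel = "ets")
plot(frc)
```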

Forecasting time series with neural networks in R

I have been looking for a package to do time series modelling in R with neural networks for quite some time, with limited success. The only implementation I am aware of that takes care of autoregressive lags in a user-friendly way is the nnetar function in the forecast package, written by Rob Hyndman. In my view there is space for a more flexible implementation, so I decided to write a few functions for that purpose. For now these are included in the TStools package that is available on GitHub, but when I am happy with their performance and flexibility I will put them in a package of their own.

Here I will provide a quick overview of what is available right now. I plan to write a more detailed post about these functions when I get the time.

For this example I will model the AirPassengers time series available in R. I have kept the last 24 observations as a test set and will use the rest to fit the neural networks. Currently two types of neural network are available, both feed-forward: (i) multilayer perceptrons (use function mlp); and (ii) extreme learning machines (use function elm).

# Fit MLP
library(TStools)                              # mlp() and elm() currently live here
y.in <- window(AirPassengers, end=c(1958,12)) # hold out the last 24 observations
tst.n <- 24                                   # test set size, used for forecasting
mlp.fit <- mlp(y.in)
plot(mlp.fit)
print(mlp.fit)

This is the basic command to fit an MLP network to a time series. It will attempt to automatically specify the autoregressive inputs and any necessary pre-processing of the time series. With the pre-specified arguments it trains 20 networks, each with a single hidden layer of 5 nodes, and combines them into an ensemble forecast. You can override any of these settings. The output of print is a summary of the fitted network:

MLP fit with 5 hidden nodes and 20 repetitions.
Series modelled in differences: D1.
Univariate lags: (1,3,4,6,7,8,9,10,12)
Deterministic seasonal dummies included.
Forecast combined using the median operator.
MSE: 6.2011.

As you can see the function determined that level differences are needed to capture the trend. It also selected some autoregressive lags and decided to also use dummy variables for the seasonality. Using plot displays the architecture of the network (Fig. 1).

Fig. 1. Output of plot(mlp.fit).

The light red inputs represent the binary dummies used to code seasonality, while the grey ones are autoregressive lags. To produce forecasts you can type:

mlp.frc <- forecast(mlp.fit,h=tst.n)
plot(mlp.frc)

Fig. 2 shows the ensemble forecast, together with the forecasts of the individual neural networks. You can control the way that forecasts are combined (I recommend using the median or mode operators), as well as the size of the ensemble.

Fig. 2. Output of the plot function for the MLP forecasts.
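For instance, both can be overridden when fitting (the reps and comb arguments as I currently use them; check the function documentation, as these may change):

```r
# 30 networks combined with the mode operator instead of the defaults
mlp3.fit <- mlp(y.in, reps = 30, comb = "mode")
```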

You can also let it choose the number of hidden nodes. There are various options for that, but all are computationally expensive (I plan to move the base code to CUDA at some point, so that computational cost stops being an issue).

# Fit MLP with automatic hidden layer specification
mlp2.fit <- mlp(y.in,hd.auto.type="valid",hd.max=10)
print(round(mlp2.fit$MSEH,4))

This will evaluate from 1 up to 10 hidden nodes and pick the best on validation set MSE. You can also use cross-validation (if you have patience…). You can ask it to output the errors for each size:

        MSE
H.1  0.0083
H.2  0.0066
H.3  0.0065
H.4  0.0066
H.5  0.0071
H.6  0.0074
H.7  0.0061
H.8  0.0076
H.9  0.0083
H.10 0.0076

There are a few experimental options for specifying various aspects of the neural networks; these are not fully documented and it is probably best to stay away from them for now!

ELMs work pretty much in the same way, although for these the automatic specification of the hidden layer is the default.

# Fit ELM
elm.fit <- elm(y.in)
print(elm.fit)
plot(elm.fit)

This gives the following network summary:

ELM fit with 100 hidden nodes and 20 repetitions.
Series modelled in differences: D1.
Univariate lags: (1,3,4,6,7,8,9,10,12)
Deterministic seasonal dummies included.
Forecast combined using the median operator.
Output weight estimation using: lasso.
MSE: 83.0044.

I appreciate that using 100 hidden nodes on such a short time series can make some people uneasy, but I am using a shrinkage estimator instead of conventional least squares to estimate the weights, which in fact eliminates most of the connections. This is apparent in the network architecture in Fig. 3. Only the nodes connected with the black lines to the output layer contribute to the forecasts. The remaining connection weights have been shrunk to zero.

Fig. 3. ELM network architecture.

Another nice thing about these functions is that you can call them from the thief package, which implements Temporal Hierarchies forecasting in R. You can do that in the following way:

# Use THieF
library(thief)
mlp.frc.thief <- thief(y.in, h=tst.n, forecastfunction=mlp.thief)

There is a similar function for using ELM networks: elm.thief.

Since I kept a test set for this simple example, I benchmark the forecasts against exponential smoothing:

Method MAE
MLP (5 nodes) 62.471
MLP (auto) 48.234
ELM 48.253
THieF-MLP 45.906
ETS 64.528
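The MAE figures are computed against the held-out observations along these lines (y.out being the 24 test observations; the accuracy function from the forecast package could be used instead):

```r
# Test-set MAE for an ensemble forecast
y.out <- window(AirPassengers, start = c(1959, 1))
mae <- function(frc) mean(abs(as.numeric(y.out) - as.numeric(frc$mean)))
mae(mlp.frc)     # MLP (5 nodes) row of the table
```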

Temporal hierarchies, like MAPA, are great for making your forecasts more robust and often more accurate. However, with neural networks the additional computational cost is evident!

These functions are still in development, so the default values may change and there are a few experimental options that may give you good results or not!