Evaluation of input variable selection methodologies for multilayer perceptrons for high frequency time series

April 19, 2009

N. Kourentzes and S. F. Crone, 2009, The 29th Annual International Symposium on Forecasting, Hong Kong.

Neural networks (NN) have been successfully applied in several time series forecasting applications. Past forecasting competitions, such as the NN3, NN5 and the MH competitions, have shown that as data frequency increases, the accuracy of NN relative to benchmarks increases too, which provides evidence of promising NN performance on high frequency forecasting problems. However, most published modelling methodologies for NN have been developed for low frequency data, such as monthly time series. The literature suggests that modelling tools for low frequency data do not readily apply to high frequency problems, which exhibit different properties: multiple overlying seasonalities, large amounts of data, persisting outliers, etc. A number of modelling challenges therefore arise as the time series frequency increases. Furthermore, the selection of input variables, which is the most important determinant of NN accuracy, is usually based on tools developed for low frequency problems, such as ACF and PACF analysis, which become problematic as the frequency increases. This leaves open the question of how to model the input vector of NN for high frequency data and whether the methodologies developed in the past are still applicable.
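To make the ACF-based idea concrete, the following is a minimal sketch of one common selection rule: keep every lag whose sample autocorrelation falls outside an approximate 95% confidence band. The function names, the synthetic series and the choice of band are assumptions made for illustration; the presentation does not state which ACF or PACF variant was evaluated.

```python
import math

def sample_acf(series, max_lag):
    """Sample autocorrelation for lags 1..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]

def significant_lags(series, max_lag):
    """Lags whose autocorrelation lies outside the approximate 95% band."""
    bound = 1.96 / math.sqrt(len(series))  # large-sample white-noise band
    acf = sample_acf(series, max_lag)
    return [k for k, r in enumerate(acf, start=1) if abs(r) > bound]

# Synthetic daily series with a weekly cycle (illustrative data only).
daily = [10 + 5 * math.sin(2 * math.pi * t / 7) for t in range(364)]
lags = significant_lags(daily, max_lag=28)
```

On a strongly seasonal series such as this one, the rule flags a large share of the candidate lags, which hints at why a plain ACF cut-off may yield impractically large input vectors as the data frequency increases.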

This analysis evaluates how several widely used approaches to NN input variable selection, based on ACF and PACF analysis, regression and heuristics, perform when applied to high frequency data. It discusses the challenges that arise in modelling high frequency data and provides evidence on which of these methodologies remain useful for high frequency problems. A large set of daily time series is used to evaluate the competing input variable selection methodologies, following established standards of valid empirical evaluation: a homogeneous set of time series, rolling origin evaluation, robust error measures and statistical tests to determine how the methodologies compare with each other and against a set of established benchmarks.
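Rolling origin evaluation can be sketched as follows: the forecast origin moves forward one observation at a time, a forecast is produced from each origin, and the errors from all origins are pooled. The seasonal naive forecaster, the median absolute error and all names below are assumptions chosen for illustration; the presentation does not specify which benchmarks or error measures were used.

```python
from statistics import median

def seasonal_naive(history, horizon, season=7):
    """Forecast by repeating the last observed seasonal cycle."""
    start = len(history) - season
    return [history[start + (h % season)] for h in range(horizon)]

def rolling_origin_errors(series, first_origin, horizon, forecaster):
    """Absolute errors pooled over every forecast origin."""
    errors = []
    for origin in range(first_origin, len(series) - horizon + 1):
        history = series[:origin]                  # data known at the origin
        actual = series[origin:origin + horizon]   # held-out future values
        forecast = forecaster(history, horizon)
        errors.extend(abs(a - f) for a, f in zip(actual, forecast))
    return errors

# Toy daily series with a perfectly repeating weekly pattern.
series = [t % 7 for t in range(70)]
errors = rolling_origin_errors(series, first_origin=56, horizon=7,
                               forecaster=seasonal_naive)
median_abs_error = median(errors)  # a robust summary of the pooled errors
```

Any forecaster with the same `(history, horizon)` signature, for example a trained NN wrapped in a function, could be passed in place of `seasonal_naive` so that models and benchmarks are compared on identical origins.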

Download presentation.
