{"id":500,"date":"2014-07-04T16:16:23","date_gmt":"2014-07-04T16:16:23","guid":{"rendered":"http:\/\/kourentzes.com\/forecasting\/?p=500"},"modified":"2014-12-18T10:40:18","modified_gmt":"2014-12-18T10:40:18","slug":"benchmarking-your-forecasting-method","status":"publish","type":"post","link":"https:\/\/kourentzes.com\/forecasting\/2014\/07\/04\/benchmarking-your-forecasting-method\/","title":{"rendered":"Benchmark your forecasts"},"content":{"rendered":"<p style=\"text-align: justify;\">Over the years I have reviewed numerous papers that do not properly benchmark the various methods proposed. In my opinion if a paper has an empirical evaluation, then it must have appropriate benchmarks as well. Otherwise, one cannot claim that convincing empirical evidence is provided. The argument is simple: if the proposed method does not provide benefits over established benchmarks, what does it offer? Of course it is important to be careful not to focus just on forecasting accuracy, as there may be other benefits such as automation, robustness, computational efficiency, etc.<\/p>\n<p style=\"text-align: justify;\">This is very relevant to practice as well. Often companies do not benchmark their forecasts and have little evidence on whether their forecasts are fine, whether they could improve, or how to improve. In my consultancy work this is often the first step that I take: benchmark the current process.<\/p>\n<p style=\"text-align: justify;\">Of course, this is often easier said than done. One needs to have benchmarks implemented and these should not require too much expertise to use. This is where all the work by various members of the community comes in handy! Currently there are multiple <a href=\"http:\/\/cran.r-project.org\/web\/views\/TimeSeries.html\">time series packages<\/a> for R freely available. 
A lot of these are very easy to use.<\/p>\n<p style=\"text-align: justify;\">Let us assume that we have a time series we want to forecast and that we have developed a new method. This is how easy it is to produce some benchmark forecasts:<\/p>\n<pre>&gt; <span style=\"color: #008000;\"># Load the necessary libraries<\/span>\r\n&gt; library(forecast)\r\n&gt; library(MAPA)\r\n&gt; library(tsintermittent)\r\n&gt; \r\n&gt; <span style=\"color: #008000;\"># In-sample data: 'y.trn'<\/span>\r\n&gt; <span style=\"color: #008000;\"># Out-of-sample data: 'y.tst'<\/span>\r\n&gt; <span style=\"color: #008000;\"># Forecasts from our new brilliant method are stored in 'mymethod'<\/span>\r\n&gt; \r\n&gt; <span style=\"color: #008000;\"># Start timer - Just to see how long this takes<\/span>\r\n&gt; tm &lt;- proc.time()\r\n&gt; \r\n&gt; <span style=\"color: #008000;\"># Let's produce some benchmarks<\/span>\r\n&gt; frc &lt;- array(NA,c(7,24)) # 7 benchmarks, forecast horizon 24\r\n&gt; \r\n&gt; fit.ets &lt;- ets(y.trn)\r\n&gt; fit.arima &lt;- auto.arima(y.trn)\r\n&gt; fit.mapa &lt;- mapaest(y.trn,paral=1,outplot=FALSE)\r\n&gt; <span style=\"color: #008000;\"># Naive: repeat the last in-sample observation<\/span>\r\n&gt; frc[1,] &lt;- rep(y.trn[length(y.trn)],24)\r\n&gt; frc[2,] &lt;- forecast(fit.ets,h=24)$mean\r\n&gt; frc[3,] &lt;- forecast(fit.arima,h=24)$mean\r\n&gt; frc[4,] &lt;- mapafor(y.trn,fit.mapa,fh=24,ifh=0,outplot=FALSE)$outfor\r\n&gt; frc[5,] &lt;- crost(y.trn,h=24)$frc.out\r\n&gt; frc[6,] &lt;- crost(y.trn,h=24,type=\"sba\")$frc.out\r\n&gt; frc[7,] &lt;- tsb(y.trn,h=24)$frc.out\r\n&gt; rownames(frc) &lt;- c(\"Naive\",\"ETS\",\"ARIMA\",\"MAPA\",\"Croston\",\"SBA\",\"TSB\")\r\n&gt; \r\n&gt; <span style=\"color: #008000;\"># Calculate accuracy<\/span>\r\n&gt; PE &lt;- (matrix(rep(y.tst,8),nrow=8,byrow=TRUE) - \r\n+ rbind(mymethod,frc))\/matrix(rep(y.tst,8),nrow=8,byrow=TRUE)\r\n&gt; MAPE &lt;- rowMeans(abs(PE))*100\r\n&gt; \r\n&gt; <span style=\"color: #008000;\"># Stop timer<\/span>\r\n&gt; tm &lt;- proc.time() - tm\r\n<\/pre>\n<p>So what we have here are forecasts from 
&#8216;mymethod&#8217; and some benchmarks:<\/p>\n<ul>\n<li>Naive: hopefully our method predicts better than the random walk.<\/li>\n<li>ETS: state space exponential smoothing from the &#8216;forecast&#8217; package.<\/li>\n<li>ARIMA: automatically specified ARIMA, using auto.arima from the &#8216;forecast&#8217; package.<\/li>\n<li>MAPA: multiple aggregation prediction algorithm using ets from the &#8216;MAPA&#8217; package.<\/li>\n<li>Croston&#8217;s method: this is supposed to be used for intermittent data, but I just wanted to demonstrate how easy it is to use this as a benchmark. This is from the &#8216;tsintermittent&#8217; package.<\/li>\n<li>SBA: this is a variant of Croston&#8217;s method from the &#8216;tsintermittent&#8217; package.<\/li>\n<li>TSB: another intermittent demand method from the &#8216;tsintermittent&#8217; package.<\/li>\n<\/ul>\n<p style=\"text-align: justify;\">Of course not all benchmarks are appropriate, but I wanted to demonstrate how easy it is to use them. Accuracy is assessed using Mean Absolute Percentage Error (MAPE), for t+1 up to t+24 forecasts.<\/p>\n<pre>&gt; print(round(MAPE,2))\r\n<span style=\"color: #000000;\">mymethod    Naive      ETS    ARIMA     MAPA  Croston      SBA      TSB \r\n    5.76     7.37     6.00     7.34     5.59     5.76     7.07     7.26<\/span> \r\n&gt; print(tm)\r\n<span style=\"color: #000000;\">   user  system elapsed \r\n   4.24    0.00    4.86<\/span> \r\n<\/pre>\n<p style=\"text-align: justify;\">Apparently mymethod is pretty good, but in terms of accuracy MAPA seems to be doing better. Perhaps it is interesting to see that Croston&#8217;s method is not that bad, which makes sense considering that for non-intermittent data it is equivalent to exponential smoothing.<\/p>\n<p style=\"text-align: justify;\">The whole benchmarking took very little coding and only 4.86 seconds to run. This could be sped up using parallel processing, which both the &#8216;forecast&#8217; and &#8216;MAPA&#8217; packages support. 
Of course we would prefer to have used rolling origin evaluation (cross-validation), and this could be implemented easily with a loop.<\/p>\n<p style=\"text-align: justify;\">Here is the series and the various forecasts:<\/p>\n<pre>&gt; cmp = rainbow(8, start = 0\/6, end = 4\/6)\r\n&gt; ts.plot(y.trn,y.tst,ts(t(rbind(mymethod,frc)),frequency=12,end=end(y.tst)),\r\n+ col=c(\"black\",\"black\",cmp))\r\n&gt; legend(\"bottomleft\",c(\"MyMethod\",rownames(frc)),col=cmp,lty=1,ncol=2)\r\n<\/pre>\n<p><a href=\"http:\/\/kourentzes.com\/forecasting\/wp-content\/uploads\/2014\/07\/bench.fig1_.png\"><img decoding=\"async\" loading=\"lazy\" class=\"wp-image-501 size-full aligncenter\" src=\"http:\/\/kourentzes.com\/forecasting\/wp-content\/uploads\/2014\/07\/bench.fig1_.png\" alt=\"bench.fig1\" width=\"400\" height=\"274\" srcset=\"https:\/\/kourentzes.com\/forecasting\/wp-content\/uploads\/2014\/07\/bench.fig1_.png 721w, https:\/\/kourentzes.com\/forecasting\/wp-content\/uploads\/2014\/07\/bench.fig1_-150x102.png 150w, https:\/\/kourentzes.com\/forecasting\/wp-content\/uploads\/2014\/07\/bench.fig1_-300x205.png 300w, https:\/\/kourentzes.com\/forecasting\/wp-content\/uploads\/2014\/07\/bench.fig1_-660x453.png 660w\" sizes=\"(max-width: 400px) 100vw, 400px\" \/><\/a><\/p>\n<p style=\"text-align: justify;\">The example series is &#8216;referrals&#8217; from the &#8216;MAPA&#8217; package. 
In other posts here you can find more information about the functions in the <a title=\"Multiple Aggregation Prediction Algorithm (MAPA)\" href=\"http:\/\/kourentzes.com\/forecasting\/2014\/04\/19\/multiple-aggregation-prediction-algorithm-mapa\/\">MAPA<\/a> and <a title=\"Intermittent demand forecasting package for R\" href=\"http:\/\/kourentzes.com\/forecasting\/2014\/06\/23\/intermittent-demand-forecasting-package-for-r\/\">tsintermittent<\/a> packages.<\/p>\n<p style=\"text-align: justify;\">To conclude:<strong> benchmark your forecasts, it is easy and necessary!<\/strong><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Over the years I have reviewed numerous papers that do not properly benchmark the various methods proposed. In my opinion if a paper has an empirical evaluation, then it must have appropriate benchmarks as well. Otherwise, one cannot claim that convincing empirical evidence is provided. The argument is simple: if the proposed method does not\u2026 <span class=\"read-more\"><a href=\"https:\/\/kourentzes.com\/forecasting\/2014\/07\/04\/benchmarking-your-forecasting-method\/\">Read More &raquo;<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[41],"tags":[24,32,22,38,39],"_links":{"self":[{"href":"https:\/\/kourentzes.com\/forecasting\/wp-json\/wp\/v2\/posts\/500"}],"collection":[{"href":"https:\/\/kourentzes.com\/forecasting\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/kourentzes.com\/forecasting\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/kourentzes.com\/forecasting\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/kourentzes.com\/forecasting\/wp-json\/wp\/v2\/comments?post=500"}],"version-history":[{"count":0,"href":"https:\/\/kourentzes.com\/forecasting\/wp-json\/wp\/v2\/posts\/500\/revisions"}],"wp:attachment":[{"href":"https:\/\/kourentzes.com\/forecasting\/wp-json\/wp\/v2\/media?parent=500"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/kourentzes.com\/forecasting\/wp-json\/wp\/v2\/categories?post=500"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/kourentzes.com\/forecasting\/wp-json\/wp\/v2\/tags?post=500"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}