{"id":1402,"date":"2017-07-29T07:36:59","date_gmt":"2017-07-29T07:36:59","guid":{"rendered":"http:\/\/kourentzes.com\/forecasting\/?p=1402"},"modified":"2017-07-29T08:37:02","modified_gmt":"2017-07-29T08:37:02","slug":"benchmarking-facebooks-prophet","status":"publish","type":"post","link":"https:\/\/kourentzes.com\/forecasting\/2017\/07\/29\/benchmarking-facebooks-prophet\/","title":{"rendered":"Benchmarking Facebook&#8217;s Prophet"},"content":{"rendered":"<p>In February 2017 Facebook open sourced its <a href=\"https:\/\/facebookincubator.github.io\/prophet\/\" target=\"_blank\" rel=\"noopener\">Prophet<\/a> forecasting tool. Since then, it has appeared in quite a few discussions online. A good thing about Prophet is that one can use it very easily through R (and Python). This gave me the opportunity to benchmark it against some more standard &#8211; and not! &#8211; forecasting models and methods. To do this I tried it on the M3 competition dataset (available through the <a href=\"https:\/\/cran.r-project.org\/package=Mcomp\" target=\"_blank\" rel=\"noopener\">Mcomp<\/a> package for R).<\/p>\n<p>I should start by saying that the development team of Prophet suggests that its <a href=\"https:\/\/research.fb.com\/prophet-forecasting-at-scale\/\" target=\"_blank\" rel=\"noopener\">strengths are<\/a>:<\/p>\n<ul>\n<li>high-frequency data (hourly, daily, or weekly) with multiple seasonalities, such as day of week and time of year;<\/li>\n<li>special events and bank holidays that are not fixed in the year;<\/li>\n<li>the presence of missing values or large outliers;<\/li>\n<li>changes in the historical trends, which themselves are non-linear growth curves.<\/li>\n<\/ul>\n<p>The M3 dataset has multiple series of micro\/business interest and, as a recent presentation by E. Spiliotis et al.
at ISF2017 (<a href=\"https:\/\/www.researchgate.net\/publication\/318594980_Data_as_a_service_Providing_new_datasets_to_the_forecasting_community_for_time_series_analysis\" target=\"_blank\" rel=\"noopener\">slides 11-12<\/a>) indicated, the characteristics of the time series overlap with typical business time series, albeit not at high frequency. However, a lot of business forecasting is still not hourly or daily, so for many business forecasters the absence of high-frequency examples is not necessarily an issue when benchmarking Prophet.<\/p>\n<p>The setup of the experiment is:<\/p>\n<ul>\n<li>Use Mean Absolute Scaled Error (<a href=\"https:\/\/www.otexts.org\/fpp\/2\/5\" target=\"_blank\" rel=\"noopener\">MASE<\/a>). I chose this measure as it has good statistical properties and has become quite common in forecasting research.<\/li>\n<li>Use rolling origin evaluation, so as to ensure that the reported figures are robust against particularly lucky (or unlucky) forecast origins and test sets.<\/li>\n<li>Use the forecast horizons and test sets indicated in Table 1, for each M3 subset.<\/li>\n<\/ul>\n<div class=\"table-responsive\"><table  style=\"width:90%;  margin-left:auto;margin-right:auto\"  class=\"easy-table easy-table-default \" >\n<caption>Table 1. M3 dataset<\/caption>\n<thead>\r\n<tr><th  style=\"width:50px;text-align:left\" >Set<\/th>\n<th  style=\"width:50px;text-align:center\" >No.
of series<\/th>\n<th  style=\"width:50px;text-align:center\" > Horizon<\/th>\n<th  style=\"width:50px;text-align:center\" > Test set<\/th>\n<\/tr>\n<\/thead>\n<tbody>\r\n<tr><td  style=\"text-align:left\" >Yearly<\/td>\n<td  style=\"text-align:center\" >645<\/td>\n<td  style=\"text-align:center\" >4<\/td>\n<td  style=\"text-align:center\" >8<\/td>\n<\/tr>\n\r\n<tr><td  style=\"text-align:left\" >Quarterly<\/td>\n<td  style=\"text-align:center\" >756<\/td>\n<td  style=\"text-align:center\" >4<\/td>\n<td  style=\"text-align:center\" >8<\/td>\n<\/tr>\n\r\n<tr><td  style=\"text-align:left\" >Monthly<\/td>\n<td  style=\"text-align:center\" >1428<\/td>\n<td  style=\"text-align:center\" >12<\/td>\n<td  style=\"text-align:center\" >18<\/td>\n<\/tr>\n\r\n<tr><td  style=\"text-align:left\" >Other<\/td>\n<td  style=\"text-align:center\" >174<\/td>\n<td  style=\"text-align:center\" >12<\/td>\n<td  style=\"text-align:center\" >18<\/td>\n<\/tr>\n<\/tbody><\/table><\/div>\n<p>I used a number of benchmarks from some existing packages in R, namely:<\/p>\n<ul>\n<li><a href=\"https:\/\/cran.r-project.org\/package=forecast\" target=\"_blank\" rel=\"noopener\">forecast<\/a> package, from which I used the exponential smoothing (ets) and ARIMA (auto.arima) functions. Anybody doing forecasting in R is familiar with this package! Over the years, ETS and ARIMA have been shown to be very strong benchmarks for business forecasting tasks and specifically for the M3 dataset.<\/li>\n<li><a href=\"https:\/\/cran.r-project.org\/package=smooth\" target=\"_blank\" rel=\"noopener\">smooth<\/a> package. This is a lesser-known package that offers alternative implementations of exponential smoothing (es) and ARIMA (auto.ssarima), which follow a different modelling philosophy than the forecast package equivalents.
If you are interested, head over to <a href=\"http:\/\/forecasting.svetunkov.ru\/en\/author\/config\/\" target=\"_blank\" rel=\"noopener\">Ivan&#8217;s blog<\/a> to read more about these (and other nice blog posts). The forecast and smooth packages used together offer tremendous flexibility in ETS and ARIMA modelling.<\/li>\n<li><a href=\"https:\/\/cran.r-project.org\/package=MAPA\" target=\"_blank\" rel=\"noopener\">MAPA<\/a> and <a href=\"https:\/\/cran.r-project.org\/package=thief\">thief<\/a> packages, which both implement Multiple Temporal Aggregation (<a href=\"http:\/\/kourentzes.com\/forecasting\/2017\/04\/27\/multiple-temporal-aggregation-the-story-so-far-part-i\/\">MTA<\/a>) for forecasting, following two alternative approaches that I detail <a href=\"http:\/\/kourentzes.com\/forecasting\/2017\/07\/04\/multiple-temporal-aggregation-the-story-so-far-part-iii\/\">here<\/a> (for MAPA) and <a href=\"http:\/\/kourentzes.com\/forecasting\/2017\/07\/22\/multiple-temporal-aggregation-the-story-so-far-part-iv-temporal-hierarchies\/\">here<\/a> (for THieF). I included these as they have been shown to perform quite well on such tasks.<\/li>\n<\/ul>\n<p>The idea here is to give Prophet a hard time, but also to avoid overly exotic forecasting methods.<\/p>\n<p>I provide the mean and median MASE across all forecast origins and series for each subset in Tables 2 and 3, respectively. In brackets I provide the percentage difference from the accuracy of ETS (negative values indicate worse accuracy than ETS). In boldface I have highlighted the best forecast for each M3 subset. Prophet results are in <span style=\"color: #3366ff;\">blue<\/span>. I provide two MAPA results: the first uses the default options, whereas the second uses comb=&quot;w.mean&quot;, which is more mindful of seasonality.
For THieF I only provide the default result (using ETS), as in principle it could be applied to any forecast on the table.<br \/>\n<div class=\"table-responsive\"><table  style=\"width:90%;  margin-left:auto;margin-right:auto\"  class=\"easy-table easy-table-default \" >\n<caption>Table 2. Mean MASE results<\/caption>\n<thead>\r\n<tr><th  style=\"width:50px;text-align:left\" >Set<\/th>\n<th  style=\"width:50px;text-align:center\" >ETS<\/th>\n<th  style=\"width:50px;text-align:center\" > ARIMA<\/th>\n<th  style=\"width:50px;text-align:center\" > ES (smooth)<\/th>\n<th  style=\"width:50px;text-align:center\" > SSARIMA (smooth)<\/th>\n<th  style=\"width:50px;text-align:center\" > MAPA<\/th>\n<th  style=\"width:50px;text-align:center\" > MAPA (w.mean)<\/th>\n<th  style=\"width:50px;text-align:center\" > THieF (ETS)<\/th>\n<th  style=\"width:50px;text-align:center\" > <span style=\"color: #3366ff;\">Prophet<\/span><\/th>\n<\/tr>\n<\/thead>\n<tbody>\r\n<tr><td  style=\"text-align:left\" >Yearly<\/td>\n<td  style=\"text-align:center\" ><strong>0.732 (0.00%)<\/strong><\/td>\n<td  style=\"text-align:center\" >0.746 (-1.91%)<\/td>\n<td  style=\"text-align:center\" >0.777 (-6.15%)<\/td>\n<td  style=\"text-align:center\" >0.783 (-6.97%)<\/td>\n<td  style=\"text-align:center\" >0.732 (0.00%)<\/td>\n<td  style=\"text-align:center\" >0.732 (0.00%)<\/td>\n<td  style=\"text-align:center\" >0.732 (0.00%)<\/td>\n<td  style=\"text-align:center\" ><span style=\"color: #3366ff;\">0.954 (-30.33%)<\/span><\/td>\n<\/tr>\n\r\n<tr><td  style=\"text-align:left\" >Quarterly<\/td>\n<td  style=\"text-align:center\" ><strong>0.383 (0.00%)<\/strong><\/td>\n<td  style=\"text-align:center\" >0.389 (-1.57%)<\/td>\n<td  style=\"text-align:center\" >0.385 (-0.52%)<\/td>\n<td  style=\"text-align:center\" >0.412 (-7.57%)<\/td>\n<td  style=\"text-align:center\" >0.386 (-0.78%)<\/td>\n<td  style=\"text-align:center\" >0.384 (-0.26%)<\/td>\n<td  style=\"text-align:center\" >0.400 (-4.44%)<\/td>\n<td  
style=\"text-align:center\" ><span style=\"color: #3366ff;\">0.553 (-44.39%)<\/span><\/td>\n<\/tr>\n\r\n<tr><td  style=\"text-align:left\" >Monthly<\/td>\n<td  style=\"text-align:center\" >0.464 (0.00%)<\/td>\n<td  style=\"text-align:center\" >0.472 (-1.72%)<\/td>\n<td  style=\"text-align:center\" >0.465 (-0.22%)<\/td>\n<td  style=\"text-align:center\" >0.490 (-5.60%)<\/td>\n<td  style=\"text-align:center\" >0.459 (+1.08%)<\/td>\n<td  style=\"text-align:center\" ><strong>0.458 (+1.29%)<\/strong><\/td>\n<td  style=\"text-align:center\" >0.462 (+0.43%)<\/td>\n<td  style=\"text-align:center\" ><span style=\"color: #3366ff;\">0.586 (-26.29%)<\/span><\/td>\n<\/tr>\n\r\n<tr><td  style=\"text-align:left\" >Other<\/td>\n<td  style=\"text-align:center\" >0.447 (0.00%)<\/td>\n<td  style=\"text-align:center\" >0.460 (-2.91%)<\/td>\n<td  style=\"text-align:center\" >0.446 (+0.22%)<\/td>\n<td  style=\"text-align:center\" >0.457 (-2.24%)<\/td>\n<td  style=\"text-align:center\" ><strong>0.444 (+0.67%)<\/strong><\/td>\n<td  style=\"text-align:center\" ><strong>0.444 (+0.67%)<\/strong><\/td>\n<td  style=\"text-align:center\" >0.447 (0.00%)<\/td>\n<td  style=\"text-align:center\" ><span style=\"color: #3366ff;\">0.554 (-23.94%)<\/span><\/td>\n<\/tr>\n<\/tbody><\/table><\/div><\/p>\n<div class=\"table-responsive\"><table  style=\"width:90%;  margin-left:auto;margin-right:auto\"  class=\"easy-table easy-table-default \" >\n<caption>Table 3.
Median MASE results<\/caption>\n<thead>\r\n<tr><th  style=\"width:50px;text-align:left\" >Set<\/th>\n<th  style=\"width:50px;text-align:center\" >ETS<\/th>\n<th  style=\"width:50px;text-align:center\" > ARIMA<\/th>\n<th  style=\"width:50px;text-align:center\" > ES (smooth)<\/th>\n<th  style=\"width:50px;text-align:center\" > SSARIMA (smooth)<\/th>\n<th  style=\"width:50px;text-align:center\" > MAPA<\/th>\n<th  style=\"width:50px;text-align:center\" > MAPA (w.mean)<\/th>\n<th  style=\"width:50px;text-align:center\" > THieF (ETS)<\/th>\n<th  style=\"width:50px;text-align:center\" > <span style=\"color: #3366ff;\">Prophet<\/span><\/th>\n<\/tr>\n<\/thead>\n<tbody>\r\n<tr><td  style=\"text-align:left\" >Yearly<\/td>\n<td  style=\"text-align:center\" >0.514 (0.00%)<\/td>\n<td  style=\"text-align:center\" >0.519 (-0.97%)<\/td>\n<td  style=\"text-align:center\" ><strong>0.511 (+0.58%)<\/strong><\/td>\n<td  style=\"text-align:center\" >0.524 (-1.95%)<\/td>\n<td  style=\"text-align:center\" >0.520 (-1.17%)<\/td>\n<td  style=\"text-align:center\" >0.520 (-1.17%)<\/td>\n<td  style=\"text-align:center\" >0.514 (0.00%)<\/td>\n<td  style=\"text-align:center\" ><span style=\"color: #3366ff;\">0.710 (-38.13%)<\/span><\/td>\n<\/tr>\n\r\n<tr><td  style=\"text-align:left\" >Quarterly<\/td>\n<td  style=\"text-align:center\" >0.269 (0.00%)<\/td>\n<td  style=\"text-align:center\" >0.266 (+1.12%)<\/td>\n<td  style=\"text-align:center\" >0.256 (+4.83%)<\/td>\n<td  style=\"text-align:center\" >0.278 (-3.35%)<\/td>\n<td  style=\"text-align:center\" ><strong>0.254 (+5.58%)<\/strong><\/td>\n<td  style=\"text-align:center\" ><strong>0.254 (+5.58%)<\/strong><\/td>\n<td  style=\"text-align:center\" >0.262 (+2.60%)<\/td>\n<td  style=\"text-align:center\" ><span style=\"color: #3366ff;\">0.388 (-44.24%)<\/span><\/td>\n<\/tr>\n\r\n<tr><td  style=\"text-align:left\" >Monthly<\/td>\n<td  style=\"text-align:center\" >0.353 (0.00%)<\/td>\n<td  
style=\"text-align:center\" ><strong>0.348 (+1.42%)<\/strong><\/td>\n<td  style=\"text-align:center\" >0.351 (+0.57%)<\/td>\n<td  style=\"text-align:center\" >0.373 (-5.67%)<\/td>\n<td  style=\"text-align:center\" >0.352 (+0.28%)<\/td>\n<td  style=\"text-align:center\" >0.351 (+0.57%)<\/td>\n<td  style=\"text-align:center\" >0.351 (+0.57%)<\/td>\n<td  style=\"text-align:center\" ><span style=\"color: #3366ff;\">0.473 (-33.99%)<\/span><\/td>\n<\/tr>\n\r\n<tr><td  style=\"text-align:left\" >Other<\/td>\n<td  style=\"text-align:center\" >0.275 (0.00%)<\/td>\n<td  style=\"text-align:center\" >0.269 (+2.18%)<\/td>\n<td  style=\"text-align:center\" >0.270 (+1.82%)<\/td>\n<td  style=\"text-align:center\" ><strong>0.268 (+2.55%)<\/strong><\/td>\n<td  style=\"text-align:center\" >0.283 (-2.91%)<\/td>\n<td  style=\"text-align:center\" >0.283 (-2.91%)<\/td>\n<td  style=\"text-align:center\" >0.275 (0.00%)<\/td>\n<td  style=\"text-align:center\" ><span style=\"color: #3366ff;\">0.320 (-16.36%)<\/span><\/td>\n<\/tr>\n<\/tbody><\/table><\/div>\n<p>Some comments about the results:<\/p>\n<ul>\n<li>Prophet performs very poorly. The dataset does not contain multiple seasonalities, but it does contain human-activity-based seasonal patterns (quarterly and monthly series), changing trends and outliers or other abrupt changes (especially the &#8216;other&#8217; subset), where Prophet should do OK.
My concern is not that it is not ranking first, but that at best it is about 16% worse than exponential smoothing (and at worst about 44%!);<\/li>\n<li>ETS and ARIMA perform reasonably similarly between packages, indicating that although there are implementation differences, both packages have followed sound modelling philosophies;<\/li>\n<li>MAPA and THieF are meant to work on the quarterly and monthly subsets, where, in line with the research, they generally improve upon their base model (ETS).<\/li>\n<\/ul>\n<p>In all fairness, more testing is needed on high frequency data with multiple seasonalities before drawing conclusions about the performance of Prophet. Nonetheless, for the vast majority of business forecasting needs (such as supply chain forecasting), Prophet does not seem to perform that well. As a final note, this is an open source project, so I expect to see interesting improvements over time.<\/p>\n<p>Finally, I want to thank <a href=\"http:\/\/www.lancaster.ac.uk\/lums\/people\/oliver-schaer\" target=\"_blank\" rel=\"noopener\">Oliver Schaer<\/a> for providing me with Prophet R code examples!
You can also find some examples <a href=\"https:\/\/facebookincubator.github.io\/prophet\/docs\/quick_start.html#r-api\" target=\"_blank\" rel=\"noopener\">here<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In February 2017 Facebook open sourced its Prophet forecasting tool. Since then, it has appeared in quite a few discussions online. A good thing about Prophet is that one can use it very easily through R (and Python). This gave me the opportunity to benchmark it against some more standard &#8211; and not!
&#8211; forecasting models\u2026 <span class=\"read-more\"><a href=\"https:\/\/kourentzes.com\/forecasting\/2017\/07\/29\/benchmarking-facebooks-prophet\/\">Read More &raquo;<\/a><\/span><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[41],"tags":[14,24,32,38,39],"_links":{"self":[{"href":"https:\/\/kourentzes.com\/forecasting\/wp-json\/wp\/v2\/posts\/1402"}],"collection":[{"href":"https:\/\/kourentzes.com\/forecasting\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/kourentzes.com\/forecasting\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/kourentzes.com\/forecasting\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/kourentzes.com\/forecasting\/wp-json\/wp\/v2\/comments?post=1402"}],"version-history":[{"count":10,"href":"https:\/\/kourentzes.com\/forecasting\/wp-json\/wp\/v2\/posts\/1402\/revisions"}],"predecessor-version":[{"id":1412,"href":"https:\/\/kourentzes.com\/forecasting\/wp-json\/wp\/v2\/posts\/1402\/revisions\/1412"}],"wp:attachment":[{"href":"https:\/\/kourentzes.com\/forecasting\/wp-json\/wp\/v2\/media?parent=1402"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/kourentzes.com\/forecasting\/wp-json\/wp\/v2\/categories?post=1402"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/kourentzes.com\/forecasting\/wp-json\/wp\/v2\/tags?post=1402"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}