In this article, we examine the usage of two well-known time-series algorithms for a specific dataset. Moreover, we explore how to develop a model customized for the specific dataset at hand in order to increase the performance of the prediction model.
How many items of merchandise should a company reorder? How often is article X bought in a certain season?
Demand forecasts help to answer such questions: they attempt to infer future demand from historical sales data.
Demand forecasting uses statistical analysis to identify repeating sales patterns from historical data. This allows companies to make informed decisions about ordering goods and adjusting prices.
There are various models for statistical analysis, all of which have their advantages and disadvantages. In this article, we compare two models – the SARIMAX model and the ETS model – using data from a multi-location retail chain as an example, and then provide a guide on how to extend one of them into a customized model for the use-case.
We use a data sample with sales information from the US retailer Walmart. The time span from which we draw the historical sales data ranges from January 2011 to June 2016.
In over 40,000 time-series, we look at information on more than 3,000 individual items sold from seven departments and three product categories (Food, Household, Hobbies) in ten shops within the states of California, Wisconsin and Texas. The dataset also includes information on prices and special events such as holidays, religious festivals, the Supplemental Nutrition Assistance Program (SNAP), and sporting events.
To reduce the complexity of the analysis, we focus on predicting total sales for combinations of categories and shops. This reduces the number of time-series from over 40,000 to 30.
In a first step, we examine the seasonality in the data. To do this, we calculate the percentage deviation of sales volume from its average on a weekly, monthly, and annual basis, as shown in Figure 1. The data are aggregated as combinations of states and product categories.
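The seasonality check can be sketched in a few lines of pandas. The series below is purely illustrative (flat weekdays, higher weekends) and stands in for the aggregated Walmart data; the computation mirrors the weekly panels of Figure 1.

```python
import pandas as pd

# Hypothetical stand-in for daily unit sales of one shop/category combination.
idx = pd.date_range("2011-01-29", periods=365, freq="D")
sales = pd.Series(100 + 20 * (idx.dayofweek >= 5), index=idx, name="sales")

# Percentage deviation of each weekday's average from the overall mean.
weekly_dev = sales.groupby(sales.index.dayofweek).mean() / sales.mean() - 1
print((weekly_dev * 100).round(1))  # deviation in percent, Mon=0 … Sun=6
```

The same pattern – group by month or by day-of-year instead of weekday – yields the monthly and yearly panels.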
Figures 1a, 1b, and 1c show a weekly pattern: sales fall at the beginning of the week, reach their minimum on Wednesday or Thursday, rise on Friday and peak at the weekend. The weekly seasonality is generally similar in all categories and shops.
The monthly patterns shown in Figures 1d, 1e, and 1f vary greatly by category: for “Food” and, to some extent, “Household”, there is a small peak in the middle of the month, followed by a slight increase. In contrast, in the “Hobbies” and “Household” categories, sales are higher at the beginning and end of the month, which might be due to pay checks being distributed at the end of the month. In the “Food” category, there are some spikes in the first half of the month. These are likely related to SNAP food subsidies, which are paid out several times during the first 15 days of each month.
Figures 1g, 1h, and 1i illustrate yearly patterns which differ strongly depending on the category. The only common effect is a drop in sales in May. Sales in “Hobbies” increase in December, which is probably caused by Christmas gift sales. “Household” sales increase at the beginning of spring and in autumn, most likely due to outdoor products. “Food” sales remain roughly constant, except in the winter when they increase in Wisconsin, perhaps due to stockpiling for the winter.
In summary, the initial visualisation reveals:

- a stable weekly seasonality that is similar across categories and shops,
- monthly patterns that vary by category and are plausibly linked to pay checks and SNAP payments,
- yearly patterns that differ strongly by category and are partly driven by events such as Christmas.
Conclusion: For a sound analysis, yearly seasonality must be decoupled from the impact of events such as holidays, since these events account for part of the higher sales. We therefore require a flexible model that is capable of representing trends, multiple seasonalities, and events.
A common class of algorithms for building time-series models is Autoregressive Integrated Moving Average (ARIMA). Among the ARIMA variants, Seasonal Autoregressive Integrated Moving Average with Exogenous regressors (SARIMAX) is best suited for modelling trends, multiple seasonalities and special events.
To ensure that all model components have an impact on the forecasts, we include a full year in the test dataset. After analysing our example data with the SARIMAX model, the results can be summarised as follows:
⇒ With a mean relative error of 25 percent, the SARIMAX model performed rather poorly for this dataset.
Exponential Smoothing (ETS) models are another well-established class of time-series algorithms. There are several variants of ETS models to cover different structures of time-series data.
For the described dataset, we model a multiplicative error to prevent the error term from varying too much as the level of the series changes. We observe linear growth, which is best modelled with an additive trend; we therefore choose an additive damped trend. The damping effect restrains the trend below the shop's usual sales cap. Lastly, we observed that multiplicative seasonality performed best for this dataset, meaning that changes in sales on a particular day of the week are proportional to the overall level of sales.
ETS models estimate these components recursively, updating the estimates at each point in time. They can be characterised as filters passing through the data and continuously revising their estimated components. Thus, they provide the best possible prediction one step ahead of the last observed data point.
After analysing our sample data with the ETS model, the results can be summarised as follows:
⇒ The ETS model performed worse than the SARIMAX model: its mean, median, maximum, and minimum errors are all higher. This is because the SARIMAX model exploits additional information about special events.
Thus, we so far have two models that capture the weekly seasonality, trend, and level quite well, but not the extreme peaks, which are mostly caused by special events. This can be seen in Figure 2.
To remedy this, we extend the ETS model to capture special events. Special events can be added to the ETS model as dummy variables in a regression manner.
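Building such dummy regressors can be sketched as follows. The event list, the column-naming scheme, and the choice of lead/lag offsets (up to three days before, one day after) are illustrative assumptions; the resulting 0/1 matrix is what would enter the extended ETS model as regressors.

```python
import pandas as pd

# Hypothetical event calendar for one month of the forecast horizon.
idx = pd.date_range("2016-11-01", "2016-11-30", freq="D")
events = {"Thanksgiving": "2016-11-24"}  # illustrative event list

dummies = pd.DataFrame(0, index=idx, columns=list(events))
for name, day in events.items():
    day = pd.Timestamp(day)
    dummies.loc[day, name] = 1
    # Lead/lag dummies: up to three days before and one day after the event.
    for offset in (-3, -2, -1, 1):
        col = f"{name}{offset:+d}d"
        dummies[col] = 0
        shifted = day + pd.Timedelta(days=offset)
        if shifted in dummies.index:
            dummies.loc[shifted, col] = 1
```

Each column is a separate regressor, so every event contributes several parameters – which is why the number of parameters grows quickly, as discussed below.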
When adding events to the model, it is important to determine the right level of granularity. Too coarse a grouping loses too much relevant information, whereas too fine a grouping reduces the predictive power of the model due to overfitting. For this dataset, the prediction model can be built at three levels of granularity:

- ETSXC: events grouped into categories,
- ETSXI: events modelled individually,
- ETSXIBADY: events modelled individually, additionally including the days before and after each event.
In our example, the categorical grouping in ETSXC increased the prediction accuracy in comparison to SARIMAX or ETS base models. The modelling of individual events in ETSXI even outperforms the ETSXC model.
Including up to three days before and one day after each single event leads to 150 additional parameters in our example. To keep both the computation time and the risk of overfitting low, we prune the model in an initial phase and include only the most significant events.
Figure 3 illustrates the ETSXIBADY model compared to the ETS base model and the real observation series around Thanksgiving.
After analysing our sample data with the extended ETS model, the results can be summarised as follows:
⇒ Including events in the ETS model, regardless of their granularity, improves accuracy by about 15 percent compared to the SARIMAX and ETS base models.