Analyse sales and predict future demand

To make economic decisions, companies need to be able to predict future demand. There are various algorithms upon which such prediction models can be built.
by Parinaz Ameri

In this article, we examine the use of two well-known time-series algorithms on a specific dataset. We also explore how to develop a model customised to the dataset at hand in order to improve the performance of the prediction model.
How many items of merchandise should a company reorder? How often is article X bought in a certain season?
So-called demand forecasts help to answer such questions: they attempt to draw conclusions about future demand from historical sales data. Important in this context are:
- Information about demand in the past over multiple time periods
- Fluctuations that occur due to seasonal changes
- The effect of special offers
Demand forecasting uses statistical analysis to identify repeating sales patterns from historical data. This allows companies to make informed decisions about ordering goods and adjusting prices.
There are various models for statistical analysis, each with its own advantages and disadvantages. In this article, we compare two of them – the SARIMAX model and the ETS model – using data from a multi-location retailer as an example, and then show how to extend one of them into a model customised to the use case.
Starting point: Our Walmart dataset
We use a data sample with sales information from the US retailer Walmart. The time span from which we draw the historical sales data ranges from January 2011 to June 2016.
In over 40,000 time-series, we look at information on more than 3,000 individual items sold from seven departments and three product categories (Food, Household, Hobbies) in ten shops within the states of California, Wisconsin and Texas. The dataset also includes information on prices and special events such as holidays, religious festivals, the Supplemental Nutrition Assistance Program (SNAP), and sporting events.
To reduce the complexity of the analysis, we focus on predicting total sales for combinations of categories and shops. This reduces the number of time-series from over 40,000 to 30.
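As an illustration, this aggregation step might look as follows in pandas. The file and column names ("sales.csv", "store_id", "cat_id", "date", "sales") are assumptions rather than details from the article, so this is only a minimal sketch of the idea.

```python
import pandas as pd

# Minimal sketch: aggregate item-level daily sales into one series per
# (store, category) combination. File and column names are assumptions.
sales = pd.read_csv("sales.csv", parse_dates=["date"])

totals = (
    sales.groupby(["store_id", "cat_id", "date"], as_index=False)["sales"]
         .sum()
)

# 10 shops x 3 categories -> 30 aggregated daily time series
series = {
    key: grp.set_index("date")["sales"].sort_index()
    for key, grp in totals.groupby(["store_id", "cat_id"])
}
print(len(series))  # expected: 30
```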
Initial analysis: weekly, monthly, and yearly seasonality
As a first step, we examine the seasonality in the data. To do this, we calculate the percentage deviation of sales volume from its average on a weekly, monthly, and annual basis, as shown in Figure 1. The data are aggregated as combinations of states and product categories.
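A sketch of this calculation for a single aggregated daily series (here called `y`, an assumed pandas Series with a DatetimeIndex, carried over from the previous sketch) could look like this:

```python
import pandas as pd

def pct_deviation(y: pd.Series, grouper) -> pd.Series:
    """Mean sales per group, expressed as percentage deviation from the overall mean."""
    return (y.groupby(grouper).mean() / y.mean() - 1.0) * 100.0

# y is one aggregated daily sales series with a DatetimeIndex (assumption)
weekly  = pct_deviation(y, y.index.dayofweek)  # day of week: Mon=0 ... Sun=6
monthly = pct_deviation(y, y.index.day)        # day of month: 1 ... 31
yearly  = pct_deviation(y, y.index.month)      # month of year: Jan=1 ... Dec=12
```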
Figures 1a, 1b, and 1c show a weekly pattern: sales fall at the beginning of the week, reach their minimum on Wednesday or Thursday, rise on Friday and peak at the weekend. The weekly seasonality is generally similar across all categories and shops.
The monthly patterns shown in Figures 1d, 1e, and 1f vary greatly by category: for "Food" and partly "Household", there is a small peak in the middle of the month, followed by a slight increase. In the "Hobbies" and "Household" categories, sales are higher at the beginning and end of the month, which might be due to paychecks being distributed at the end of the month. In the "Food" category, there are several spikes in the first half of the month, most likely related to SNAP food subsidies, which are paid out several times during the first 15 days of each month.
Figures 1g, 1h, and 1i illustrate yearly patterns, which differ strongly depending on the category. The only common effect is a drop in sales in May. Sales in "Hobbies" increase in December, probably driven by Christmas gift purchases. "Household" sales increase at the beginning of spring and in autumn, most likely due to outdoor products. "Food" sales remain roughly constant, except in winter, when they increase in Wisconsin, perhaps due to stockpiling.
In summary, the initial visualisation reveals:
- Most series demonstrate an upward trend, although trends are not necessarily consistent over time.
- Monthly seasonality seems to capture SNAP days in the “Food” category. Higher sales are generally recorded at the beginning and end of the month.
- Yearly seasonality varies within categories, with the only common finding being a decline in sales in May.
- Overall, weekly seasonality is the strongest. Monthly and especially yearly seasonality are less consistent and therefore less pronounced.
Conclusion: for a sound analysis, yearly seasonality must be decoupled from the impact of events such as holidays, since higher sales may be attributable to those events rather than to the season itself. We therefore require a flexible model that can represent trends, multiple seasonalities and events.
The SARIMAX model: trends, multiple seasonalities and special events
A common class of algorithms for building time-series models is the Autoregressive Integrated Moving Average (ARIMA) family. Among its variants, the Seasonal Autoregressive Integrated Moving Average with Exogenous regressors (SARIMAX) model is best suited for modelling trends, multiple seasonalities and special events.
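As a rough illustration of how such a model can be set up with statsmodels: the article does not state the exact configuration, so the orders below are placeholders, `y` is one of the aggregated daily series and `events` a 0/1 dummy matrix of special days (both assumptions; a possible construction of `events` is sketched further below).

```python
import numpy as np
from statsmodels.tsa.statespace.sarimax import SARIMAX

# Hold out the last full year as the test set.
train, test = y[:-365], y[-365:]
exog_train, exog_test = events.loc[train.index], events.loc[test.index]

model = SARIMAX(
    train,
    exog=exog_train,
    order=(1, 1, 1),              # non-seasonal ARIMA part (placeholder values)
    seasonal_order=(1, 0, 1, 7),  # weekly seasonality: period of 7 days
)
result = model.fit(disp=False)

forecast = result.get_forecast(steps=len(test), exog=exog_test).predicted_mean

# Relative error, roughly in the spirit of the error figures quoted below
rel_error = np.mean(np.abs((test.values - forecast.values) / test.values)) * 100
print(f"relative error: {rel_error:.2f}%")
```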
To ensure that all model components have an impact on the forecasts, we include a full year in the test dataset. After analysing our example data with the SARIMAX model, the results can be summarised as follows:
- Weekly seasonality, trend and level are captured relatively well by the model.
- The occasional peaks are not captured by the model.
- The accuracy of the prediction varies widely with a minimum error of 8.82 percent and a maximum error of 54.71 percent.
⇒ With a relative mean error of 25 percent on average, the SARIMAX model performed rather poorly on this dataset.
ETS models: error, trend and seasonality
Exponential Smoothing (ETS) models are another well-established class of time-series algorithms. There are several variants of ETS models to cover different structures of time-series data.
For the described dataset, we model a multiplicative error, so that the error scales with the level of the series rather than ballooning as sales volumes change. We observe roughly linear growth, which is best modelled with an additive trend; we therefore include an additive damped trend, where the damping keeps the extrapolated trend from drifting beyond the shop's usual sales level. Lastly, we observed that multiplicative seasonality performed best for this dataset, which means that changes in sales on a particular day of the week are proportional to the overall level of sales.
ETS models estimate these components sequentially by repeating an update step for each point in time. They can be thought of as filters that pass through the data and continuously update their component estimates. As a result, they provide the best possible one-step-ahead prediction at each point where data has been observed.
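In statsmodels, the configuration described above could be written roughly as follows, again using one daily series `y` as an assumption; note that the multiplicative components require strictly positive values.

```python
from statsmodels.tsa.exponential_smoothing.ets import ETSModel

train, test = y[:-365], y[-365:]  # same hold-out year as before

ets = ETSModel(
    train,
    error="mul",          # multiplicative error
    trend="add",          # additive trend ...
    damped_trend=True,    # ... with damping
    seasonal="mul",       # multiplicative seasonality
    seasonal_periods=7,   # weekly pattern
)
ets_result = ets.fit()

ets_forecast = ets_result.forecast(steps=len(test))
```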
After analysing our sample data with the ETS model, the results can be summarised as follows:
- Like SARIMAX, the ETS model captures trend, level and weekly seasonality quite well. This is not the case for series with abrupt changes in level.
- The model seems to consistently underestimate extreme observations, regardless of whether they represent surprisingly high or low turnover.
⇒ The model performed worse than the SARIMAX model, with higher mean, median, maximum and minimum errors. This is because the SARIMAX model draws on additional information about special events.
Extension of the ETS model: Additional capture of special events
So far, we thus have two models that capture weekly seasonality, trend and level quite well, but not the extreme peaks, which are mostly caused by special events. This can be seen in Figure 2.
To remedy this, we extend the ETS model to capture special events. Special events can be added to the ETS model as dummy variables, in the manner of a regression.
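The article does not spell out how the dummies are integrated, and the ETS implementation in statsmodels does not take exogenous regressors directly, so the following is only one possible approximation of the idea: estimate the event effects with a linear regression, fit the ETS model on the event-adjusted series, and add the estimated effects back to the forecast. The variables `train`, `test` and `events` are carried over from the earlier sketches.

```python
import statsmodels.api as sm
from statsmodels.tsa.exponential_smoothing.ets import ETSModel

# Estimate the average effect of each event day with a simple regression.
X_train = sm.add_constant(events.loc[train.index])
event_fit = sm.OLS(train, X_train).fit()

# Event effect only (without the intercept), so the adjusted series keeps its level.
effect_train = events.loc[train.index] @ event_fit.params[events.columns]
effect_test = events.loc[test.index] @ event_fit.params[events.columns]

# Fit the ETS model on the event-adjusted series ...
adjusted = train - effect_train  # assumed to remain strictly positive
ets_x = ETSModel(
    adjusted, error="mul", trend="add", damped_trend=True,
    seasonal="mul", seasonal_periods=7,
).fit()

# ... and add the estimated event effects back to the forecast.
forecast_x = ets_x.forecast(steps=len(test)) + effect_test.values
```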
When adding events to the model, it is important to choose the right level of granularity. Too coarse a grouping discards relevant information, whereas too fine a breakdown inflates the number of parameters and reduces the model's predictive power through overfitting. For this dataset, the prediction model can be built at three levels of granularity (see the sketch after the list):
- ETSXC: Grouping of events according to their categories
- ETSXI: Modelling each specific event individually
- ETSXIBADY: Additionally adding dummies for the three days before and one day after each event
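As a sketch of how such dummy matrices might be built with pandas, assuming a calendar table with one row per date and columns "event_name" and "event_type" (these names are assumptions based on the dataset description, not taken from the article):

```python
import pandas as pd

calendar = pd.read_csv("calendar.csv", parse_dates=["date"]).set_index("date")

# ETSXC: one dummy column per event category
etsxc = pd.get_dummies(calendar["event_type"]).astype(int)

# ETSXI: one dummy column per individual event
etsxi = pd.get_dummies(calendar["event_name"]).astype(int)

# ETSXIBADY: additionally mark the three days before and one day after each event
before = [etsxi.shift(-k).add_suffix(f"_minus{k}d") for k in (1, 2, 3)]
after = [etsxi.shift(1).add_suffix("_plus1d")]
etsxibady = pd.concat([etsxi] + before + after, axis=1).fillna(0).astype(int)
```

A matrix of this kind can then serve as the `events` regressor used in the sketches above.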
In our example, the categorical grouping in ETSXC increased prediction accuracy compared with the SARIMAX and ETS base models. Modelling individual events in ETSXI outperforms even the ETSXC model.
Including up to three days before and one day after each individual event adds around 150 parameters in our example. To keep both the computation time and the risk of overfitting low, we prune the model in an initial optimisation phase and include only the most significant events.
Figure 3 illustrates the ETSXIBADY model compared to the ETS base model and the real observation series around Thanksgiving.
After analysing our sample data with the extended ETS model, the results can be summarised as follows:
- The model with individual events (ETSXI) performs slightly better (about 5 percent) than the model that groups events into categories (ETSXC).
- Modelling with influential individual events (ETSXIBADY) leads to an even higher overall accuracy compared to grouping the events into categories.
- The ETSXIBADY model can successfully capture most of the peaks in the dataset.
⇒ Including events in the ETS model, regardless of their granularity, leads to an increased accuracy of about 15 percent compared to SARIMAX and the ETS base model.
Conclusion
- Analysing historical sales data to identify repetitive patterns and predict future demand enables companies to make more economical decisions.
- Suggestion: for every dataset, it is worth fitting several well-known models and using them as baselines for comparison.
- The model created using Exponential Smoothing (ETS) performed similarly to the SARIMAX model: both performed relatively well in capturing trend, level and seasonality, but failed to predict peaks caused by special events.
- Building a customised model by extending the ETS model to include the effects of the most influential events can increase prediction accuracy by at least 15 percent (to about 92 percent in total).
