Coffee Demand Prediction

Background

The rapidly growing number of customers has caused the coffee shop to encounter complications in inventory management. Moreover, the social restrictions established by the government to prevent COVID-19 transmission have led to uncertain demand and an unpredictable number of customers.

Therefore, a forecasting model and an accompanying application were developed and implemented in the business in order to:

Avoid oversupply and undersupply of coffee beans
Establish a method for integrating the forecasting model into the business system
Serve as a fundamental step in deciding a supply order

Methodology

The steps in developing and implementing the model include:

Data Collection: Gathering internal data (coffee demand) from the POS system and external data (weather, holidays, etc.)
Data Pre-processing: Transforming transaction data into coffee demand
Model Training: Model selection, hyperparameter tuning, feature engineering
Performance Comparison: Comparing the errors and results generated by the models
Model Implementation: Dashboard creation and supply order planning
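The pre-processing step above can be sketched as follows. This is only an illustration: the record layout (date, product, quantity), the product names, and the grams-per-cup values are assumptions, not details taken from the actual POS data.

```python
from collections import defaultdict
from datetime import date

# Hypothetical grams of beans used per cup; these values are not from the project.
GRAMS_PER_CUP = {"espresso": 18, "latte": 18, "cold brew": 25}


def daily_demand_grams(transactions, grams_per_cup=GRAMS_PER_CUP):
    """Sum coffee bean usage per day from (date, product, quantity) records."""
    demand = defaultdict(int)
    for day, product, quantity in transactions:
        grams = grams_per_cup.get(product)
        # Products that use no beans (e.g. pastries) are skipped.
        if grams is not None:
            demand[day] += grams * quantity
    return dict(sorted(demand.items()))


# Made-up transactions for illustration only.
transactions = [
    (date(2021, 6, 1), "latte", 2),
    (date(2021, 6, 1), "croissant", 1),
    (date(2021, 6, 1), "cold brew", 1),
    (date(2021, 6, 2), "espresso", 3),
]
print(daily_demand_grams(transactions))
```

Each resulting date/grams pair would then be one row of the demand series that the models are trained on.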

The machine learning, including the data processing, was performed using the Python programming language, while Power BI was used for dashboard modeling.

The training dataset covers the period from '1/1/2020' to '31/5/2021', while the testing dataset covers the period from '1/6/2021' to '30/31/2020'
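A chronological split of this kind might look like the sketch below. It relies only on the stated start of the testing period (1/6/2021, day/month/year); the function name and the sample rows are hypothetical.

```python
from datetime import date

# First day of the testing period, as stated in the text (day/month/year).
TEST_START = date(2021, 6, 1)


def split_by_date(rows, test_start=TEST_START):
    """Split (date, demand_grams) rows into train and test sets by date."""
    train = [row for row in rows if row[0] < test_start]
    test = [row for row in rows if row[0] >= test_start]
    return train, test


# Made-up rows for illustration only.
rows = [
    (date(2021, 5, 30), 1200),
    (date(2021, 5, 31), 1350),
    (date(2021, 6, 1), 1100),
]
train, test = split_by_date(rows)
print(len(train), len(test))
```

Splitting by date rather than at random keeps every test observation later than the training data, which matches how a forecast is used in practice.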

The dependent variable is the coffee bean demand in grams, which is the value to be predicted. All other variables are independent (predictor) variables.

Data Exploration

To understand the data better before model training, data exploration was performed for the following aspects:

Actual data plotting with information regarding the enforced social restrictions

Time series decomposition (trend and monthly seasonality)

Day vs average demand

Rain intensity vs average demand
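The two "vs average demand" comparisons above amount to grouping daily demand by a category and averaging each group. A minimal sketch, assuming each record is a dict with hypothetical keys "date", "rain", and "demand_g" and made-up values:

```python
from collections import defaultdict
from datetime import date
from statistics import mean

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def average_demand_by(records, key):
    """Average the 'demand_g' field of the records within each group."""
    groups = defaultdict(list)
    for record in records:
        groups[key(record)].append(record["demand_g"])
    return {group: mean(values) for group, values in groups.items()}


# Made-up records; the rain categories are assumptions.
records = [
    {"date": date(2021, 6, 7), "rain": "none", "demand_g": 1000},
    {"date": date(2021, 6, 14), "rain": "heavy", "demand_g": 1200},
    {"date": date(2021, 6, 5), "rain": "none", "demand_g": 1500},
]

by_day = average_demand_by(records, lambda r: WEEKDAYS[r["date"].weekday()])
by_rain = average_demand_by(records, lambda r: r["rain"])
print(by_day)
print(by_rain)
```

The same helper serves both comparisons because only the grouping key changes.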

Forecasting Result

The models were trained with two different datasets:

A dataset that consists of the date variable only
A dataset that uses all the variables listed in the previous table

The MLR model using all variables achieved the lowest MAPE, with a value of 41.428, whereas the DT model using all variables achieved the lowest RMSE and MAE, with values of 132.625 and 95.14 respectively.

Overall, the models that use all variables produce smaller errors than those that use only the date variable.
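For reference, the three error measures used in the comparison can be computed as in the sketch below. It uses the standard definitions and assumes MAPE is reported as a percentage; the sample values are made up and are not the project's results.

```python
from math import sqrt


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return sqrt(squared / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Undefined when any actual value is zero.
    """
    ratios = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100 * ratios / len(actual)


# Made-up daily demand in grams, for illustration only.
actual = [1000, 800, 1200]
predicted = [900, 1000, 1200]
print(mae(actual, predicted), rmse(actual, predicted), mape(actual, predicted))
```

RMSE penalizes large misses more heavily than MAE, while MAPE is relative to the size of the actual demand, so different models can rank best under different measures, as seen in the results above.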

The ANN-MLP forecast more closely resembles the variation in the past data, whereas the FARIMA forecast forms a nearly straight line; in other words, its forecast values are almost identical.

Consequently, ANN-MLP with parameters (4, 0.2, 1) was selected as the best model. This model was used to generate the forecast values for the next 30 days in the dashboard.

Model Implementation

Based on the prediction pattern and error measurements, the DT model was selected as the best model to predict future coffee bean demand.

The predicted values displayed are based on the prediction results of the DT model. The dashboard shows not only the future coffee bean demand, but also the number of bags that need to be ordered.
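Converting forecast demand into a number of bags is essentially a rounding-up calculation. A minimal sketch, assuming a hypothetical bag size of 1000 g and an optional current stock level, neither of which is stated in the source:

```python
from math import ceil


def bags_to_order(forecast_grams, bag_size_grams=1000, stock_grams=0):
    """Return the whole number of bags needed to cover the forecast demand.

    bag_size_grams and stock_grams are illustrative assumptions.
    """
    needed = max(0, sum(forecast_grams) - stock_grams)
    # Round up, because a partial bag cannot be ordered.
    return ceil(needed / bag_size_grams)


# Made-up three-day forecast in grams.
forecast = [1200, 900, 1100]
print(bags_to_order(forecast, bag_size_grams=1000, stock_grams=500))
```

Rounding up errs on the side of a small surplus rather than a shortage; whether that is the right trade-off depends on how quickly the beans lose freshness.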

The FIFO method is used to estimate the cost of buying coffee beans, which can help the company account for its income and expenses.

The total price shown is an estimate of the costs that need to be incurred for ordering coffee beans in July 2021.
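One way to read the FIFO (first in, first out) costing mentioned above is that the oldest purchased beans are counted as used first, each at its own purchase price. The sketch below illustrates that idea only; the lot structure, the prices, and the function name are assumptions, and the dashboard's actual calculation may differ.

```python
from collections import deque


def fifo_cost(lots, grams_used):
    """Cost of the beans consumed, taking the oldest purchase lots first.

    lots: iterable of (grams, price_per_gram) tuples, ordered oldest first.
    """
    queue = deque(lots)
    remaining = grams_used
    cost = 0
    while remaining > 0 and queue:
        grams, price = queue.popleft()
        taken = min(grams, remaining)
        cost += taken * price
        remaining -= taken
    if remaining > 0:
        raise ValueError("not enough stock to cover the requested usage")
    return cost


# Made-up lots: 1000 g bought at 2 per gram, then 2000 g bought at 3 per gram.
lots = [(1000, 2), (2000, 3)]
print(fifo_cost(lots, 1500))
```

In this example the first 1000 g are costed at the older price and the remaining 500 g at the newer one.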