Coffee Demand Prediction


Background

A rapidly growing number of customers has complicated inventory management at the coffee shop. Moreover, the social restrictions imposed by the government to prevent COVID-19 transmission have led to demand uncertainty and an unpredictable number of customers.

Therefore, a forecasting model and its application were developed for implementation in the business in order to:

  • Avoid oversupply and undersupply of coffee beans
  • Establish a method for integrating the forecasting model into the business system
  • Serve as a fundamental step in deciding a supply order

Methodology

The steps in developing and implementing the model include:

  • Data Collection: Gathering internal data (coffee demand) from the POS system and external data (weather, holidays, etc.)
  • Data Pre-processing: Transforming transaction data into coffee demand
  • Model Training: Model selection, hyperparameter tuning, and feature engineering
  • Performance Comparison: Comparing the errors and results generated by the models
  • Model Implementation: Dashboard creation and supply order planning

[Figure: Process Flow]
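The pre-processing step, turning POS transactions into daily coffee bean demand, can be sketched roughly as follows. This is only an illustration: the transaction layout, the product names, and the grams-per-serving values are assumptions, not details taken from the project.

```python
from collections import defaultdict
from datetime import date

# Hypothetical POS export: (sale date, product name, quantity sold).
transactions = [
    (date(2021, 6, 1), "espresso", 3),
    (date(2021, 6, 1), "latte", 2),
    (date(2021, 6, 2), "espresso", 1),
]

# Assumed grams of beans per serving; real values would come from the shop's recipes.
GRAMS_PER_SERVING = {"espresso": 18, "latte": 18}


def daily_demand_grams(rows):
    """Sum the bean usage in grams for each calendar day."""
    totals = defaultdict(int)
    for day, product, quantity in rows:
        totals[day] += quantity * GRAMS_PER_SERVING[product]
    # Return the days in chronological order.
    return dict(sorted(totals.items()))


demand = daily_demand_grams(transactions)
```

Here `demand` maps each date to the total grams of beans used that day, which is the form a forecasting model would typically consume.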

The machine learning work, including the data processing, was performed using Python, while Power BI was used for dashboard modeling.

The training dataset covers the period from '1/1/2020' to '31/5/2021', while the testing dataset covers the period from '1/6/2021' to '30/31/2020'
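A chronological split along these dates might look like the sketch below. The dates are read as day/month/year, and the record layout and demand values are made up for illustration.

```python
from datetime import datetime


def parse_day_first(text):
    """Parse a day/month/year string such as '31/5/2021'."""
    return datetime.strptime(text, "%d/%m/%Y").date()


TRAIN_START = parse_day_first("1/1/2020")
TRAIN_END = parse_day_first("31/5/2021")


def split_by_date(records):
    """Split (date, demand) records into training and testing sets by date."""
    train = [r for r in records if TRAIN_START <= r[0] <= TRAIN_END]
    test = [r for r in records if r[0] > TRAIN_END]
    return train, test


# Illustrative records only: (date, demand in grams).
records = [
    (parse_day_first("30/5/2021"), 1200),
    (parse_day_first("31/5/2021"), 1350),
    (parse_day_first("1/6/2021"), 1100),
]
train, test = split_by_date(records)
```

Splitting by date rather than at random keeps every test observation later than the training data, which matches how the forecast is used in practice.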

Data Variables

The dependent variable is the coffee bean demand in grams, which is the value to be predicted. The other variables are the independent (predictor) variables.

Data Exploration

To better understand the data before model training, data exploration was performed on the following aspects:

  • Actual data plotting with information regarding the enforced social restrictions
  • Time series decomposition (trend and monthly seasonality)
  • Day vs average demand
  • Rain intensity vs average demand

[Figures: Time Series Decomposition (Weekly Seasonal, Monthly Seasonal); Average Daily Demand; Average Rain Condition]
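The "day vs average demand" comparison can be reproduced with a simple group-and-average, sketched below. The demand figures are invented, and the `(date, grams)` record layout is an assumption.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Illustrative daily demand in grams (not real data).
daily = [
    (date(2021, 5, 3), 1000),   # Monday
    (date(2021, 5, 4), 1200),   # Tuesday
    (date(2021, 5, 10), 1400),  # Monday
    (date(2021, 5, 11), 1000),  # Tuesday
]


def average_by_weekday(rows):
    """Group daily demand by day of the week and average each group."""
    groups = defaultdict(list)
    for day, grams in rows:
        groups[WEEKDAYS[day.weekday()]].append(grams)
    return {name: mean(values) for name, values in groups.items()}


weekday_average = average_by_weekday(daily)
```

The same grouping would work for rain intensity by keying on a rain category instead of the weekday name.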

Forecasting Result

The models were trained with two different datasets:

  • A dataset that consists of the date variable only
  • A dataset that uses all the variables listed in the previous table

[Figure: Forecast Result]

The MLR model using all variables achieved the lowest MAPE, with a value of 41.428, whereas the DT model using all variables achieved the lowest RMSE and MAE, with values of 132.625 and 95.14 respectively.

Overall, the models that use all variables produce smaller errors than those that use only the date variable.
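For reference, the three error measures compared above can be computed as in this minimal sketch. The sample values are illustrative, and MAPE is expressed here as a percentage, which is an assumption about how the reported figure is scaled.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error; penalizes large misses more than MAE."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error; undefined if any actual value is 0."""
    ratios = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100 * ratios / len(actual)


# Illustrative values only, in grams.
actual = [100, 200, 400]
predicted = [110, 180, 400]

errors = {
    "MAE": mae(actual, predicted),
    "RMSE": rmse(actual, predicted),
    "MAPE": mape(actual, predicted),
}
```

Because RMSE and MAE are in grams while MAPE is relative, different models can rank first on different measures, as happens with MLR and DT here.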

[Figures: FARIMA Result; ANN-MLP Result]

The ANN-MLP forecast more closely resembles the variation in the past data, whereas the FARIMA forecast forms a nearly straight line; in other words, its forecast values are almost identical.

Consequently, ANN-MLP with the parameters (4,0.2,1) was selected as the best model. This model was used to generate the forecast values for the next 30 days in the dashboard.

Model Implementation

Based on the prediction pattern and error measurements, the DT model was selected as the best model for predicting future coffee bean demand.

Dashboard Prototype

The displayed predicted values are based on the prediction results of the DT model. The dashboard shows not only the future demand for coffee beans, but also the number of bags that need to be ordered.
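One plausible way to turn forecast demand into a number of bags is sketched below. The bag size, the current stock, and the forecast values are all assumptions for illustration; the dashboard's actual calculation is not described here.

```python
import math


def bags_to_order(forecast_grams, stock_grams, bag_size_grams):
    """Round the forecast shortfall up to a whole number of bags."""
    shortfall = max(0, sum(forecast_grams) - stock_grams)
    return math.ceil(shortfall / bag_size_grams)


# Hypothetical inputs: three days of forecast demand, 500 g in stock, 1 kg bags.
forecast = [1200, 1350, 1100]
bags = bags_to_order(forecast, stock_grams=500, bag_size_grams=1000)
```

Rounding up avoids undersupply at the cost of carrying a partial bag into the next period.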

The FIFO method is used to estimate the cost of buying coffee beans, which can help the company assess its income and expenses.
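As a rough illustration of FIFO costing, the sketch below uses up the oldest purchase first and prices each gram at the rate paid for that purchase. The purchase layers and prices are hypothetical, and the project's own cost calculation may differ.

```python
from collections import deque


def fifo_cost(layers, grams_used):
    """Cost of the beans used, taking the oldest purchases first.

    layers: list of (grams, price_per_gram) tuples, oldest purchase first.
    Returns the total cost and the purchase layers still in stock.
    """
    queue = deque(layers)
    cost = 0
    remaining = grams_used
    while remaining > 0:
        if not queue:
            raise ValueError("not enough stock to cover the usage")
        grams, price = queue.popleft()
        taken = min(grams, remaining)
        cost += taken * price
        remaining -= taken
        if grams > taken:
            # Put back the unused part of this purchase layer.
            queue.appendleft((grams - taken, price))
    return cost, list(queue)


# Hypothetical purchases: 1000 g at 2 per gram, then 1000 g at 3 per gram.
cost, stock_left = fifo_cost([(1000, 2), (1000, 3)], grams_used=1500)
```

In this example the first 1000 g are charged at the older price and the next 500 g at the newer one, leaving 500 g of the second purchase in stock.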

The total price shown is an estimate of the costs that need to be incurred for ordering coffee beans in July 2021.


Copyright © 2023 Giovanni Abel Christian