Forecasting Item Consumption in Hotel Storage with the Autoregressive Integrated Moving Average Method

Hotels today maintain storage as a space for keeping every kind of item. The stored items serve the needs of the staff as well as the hotel's operations. Item consumption runs smoothly as long as items are resupplied; however, mistakes often occur in resupplying. To prevent these mistakes, a reference is needed for controlling the monthly quantity of incoming items while keeping in mind the quantity that should be in storage. The reference used here is a monthly forecast of item consumption. Forecasting was done with the Autoregressive Integrated Moving Average (ARIMA) method. Building the ARIMA model takes five steps: plot identification, model identification, model estimation, selection of the best model, and prediction (forecasting). The input variable in this research is the time series of the storage's item consumption from January 2018 until October 2020, and the output variable is the predicted item consumption for the next period, November to December 2020. The number of items left in storage is subtracted from the prediction to obtain the minimum quantity of each item to be brought in for the month.


I. INTRODUCTION
Storage is a facility that cannot be separated from the operational activities of a company. Hotels today have storage as a room used to store various kinds of items, both food and non-food. Operational activities in storage greatly affect the company's operations, in terms of both performance and the daily needs of employees and visitors.
A common problem in hotel storage is a sudden shortage or stock-out of items. Worse still is uncertainty about resupply when suppliers report that the items the storage wants to order are insufficient or unavailable; the uncertainty can also stem from the company's limited budget. Although the Pacific Palace Hotel storage staff know the employees' needs, which are constant, they cannot directly know the needs of visitors, because the number of hotel visitors changes from time to time, as does the number of events held at the hotel. Both the number of visitors and the number of events also affect the hotel's administrative work. In addition, one factor behind the uncertainty in resupply at the Pacific Palace Hotel storage is excessive or uncontrolled resupply of items that have a low consumption level, whose consumption level has not changed significantly, or whose expiration dates have not been taken into account, which results in a budget shortfall for resupply. Therefore, preventive action in the form of early resupply is necessary to avoid these problems.
This action can be supported by predicting the consumption level of certain items from previously recorded consumption data. Companies usually make periodic predictions: annually, monthly, weekly, or daily. For resupply problems, monthly predictions can be made by processing consumption data arranged in monthly periods. The most important thing in making a prediction is that it should not lead the company to a wrong decision, which can happen when the prediction has a large error. This data processing can be done with various methods, one of which is Machine Learning with the Autoregressive Integrated Moving Average (ARIMA) method. Machine Learning is a part of Artificial Intelligence that plays a role in building systems from data. ARIMA is well suited to producing accurate short-term predictions, but not long-term predictions.
One study involving the ARIMA method is "Prediction of Short-Term National Staple Prices Using ARIMA". The research, conducted by Rasyidi (2017), was motivated by the view that uncontrolled fluctuations in the price of basic commodities can cause losses to both consumers and producers, and it serves as one preventive step against these problems. The ARIMA model, applied over several prediction horizons, produced a fairly accurate level of prediction, 97.78%. However, the accuracy for the three staples with the greatest error across the prediction horizons (shallots, red chilies, and curly red chilies) tends to change rapidly because they are influenced by holidays on the Hijri calendar.
The researcher reiterates the problem stated above: if no anticipation is made at the time of resupply, the hotel's operational activities will certainly face obstacles.

II. METHODS
Machine Learning is an artificial intelligence approach that mimics human behavior to solve problems or perform automation. Like an Artificial Neural Network, Machine Learning is characterized by a training process, so it requires data to learn from (training data) (Putra, 2019).
Predictive research is quantitative research. Quantitative research methods are known as discovery methods because they can develop into a variety of new science and technology. The data used in this research method are in the form of numbers and statistics (Sugiyono, 2012). Predictive research can be carried out through trend studies, which look at developments over a certain period of time (present or past) in order to reveal future trends (Sudaryono, 2015).
Inventory is a common term for everything, including the resources of an organization, that is stored with the aim of anticipating demand fulfillment (Handoko, 1999). There is also the notion of optimal supply, which means that the number of items stored is at the best and most profitable level (Lahu & Sumarauw, 2017). Furthermore, supply control is an activity aimed at not depleting existing supplies and at maintaining an optimal level so that inventory costs remain stable (Indah et al., 2018).
The data were obtained through unstructured interviews and observations, which are among the data collection methods classified by Sugiyono (2012). The data used in this study are secondary data, since they had already been recorded in a report, and they are time series data, since they were recorded with a time history.
The method used in this research is the ARIMA method, known as Box-Jenkins. ARIMA is a prediction algorithm based on the idea that the information about past values contained in the time series itself can be used to predict future values (Prabhakaran, 2019).
The researcher uses the Python programming language as the tool for data processing. Python has several advantages: it is open source; it is easy to write and read, with a syntax close to human language (English); and it offers many packages for scientific computing, artificial intelligence, machine learning, and data science (Adriyan, 2019).
The data processing steps in the construction of the ARIMA model are as follows (Nofiyanto et al., 2015):

1. Identify data plots
The data obtained are first plotted so that the stationarity of the data can be seen from the plot. The plot is displayed with the pandas package (to read the .csv file) and matplotlib, by importing pyplot (to process and display the plot). If the data plot is not stationary, a differencing process is necessary. Differencing a series simply means computing the differences between consecutive observations to form a new time series (Rasyidi, 2017). The stationarity of a plot is identified with the Augmented Dickey-Fuller (ADF) test, available as adfuller in the statsmodels package in Python. The data used are the cumulative consumption data from January 2018 until October 2020.

2. Identify the model
After the data reach stationarity in the mean, the terms are identified from the PACF (Partial Autocorrelation Function) plot for the AR (Autoregressive) terms and from the ACF (Autocorrelation Function) plot for the MA (Moving Average) terms. The two plots can be used to identify several possible models that are suitable for time series prediction. The terms are identified by looking at the lags that fall outside the shaded area, that is, beyond the significance limit. Both plots are generated with the statsmodels package in Python by importing plot_acf and plot_pacf.

3. Estimating the model
Once a suitable model has been determined and its parameters estimated, the significance of the coefficients is tested. If a coefficient of the model is not significant, the model cannot be used for prediction. Estimation is done with the ARIMA summary generated in Python, using ARMA and ARIMA imported from the statsmodels package.

4. Selection of the best model
Several things need to be considered in choosing a model:
a. The parsimony principle, which means that the model must be as simple as possible, with few parameters, so that the model is more stable. The model parameters chosen are those with the minimum AIC (Akaike's Information Criterion) value among all possible model parameters, as shown in the resulting ARIMA summary.
b. The model must satisfy the underlying assumptions. For each parameter chosen, it must be checked whether the p-value of each AR and MA term listed in the ARIMA summary is significant. If the p-value of any term is not significant, the model parameters cannot be selected.
Steps 3 and 4 can also be switched: the parameters with the lowest AIC value among all possible parameters are chosen first, and then the significance of each term (AR and MA) in the chosen parameters is tested.
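
The selection rule above can be illustrated with a small sketch that works on already fitted candidates. The AIC values and p-values are invented for illustration, and Candidate and select_best are hypothetical names. Candidates without any AR or MA term are skipped, in line with the later remark that (0, 0) terms cannot be used to build a prediction model.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    order: tuple        # (p, d, q)
    aic: float          # AIC reported in the ARIMA summary
    term_pvalues: dict  # p-values of the AR and MA terms only


def select_best(candidates, alpha=0.05):
    """Return the lowest-AIC candidate whose AR/MA terms are all significant."""
    valid = [
        c for c in candidates
        if c.term_pvalues and all(p < alpha for p in c.term_pvalues.values())
    ]
    return min(valid, key=lambda c: c.aic) if valid else None


# Invented summary values for three candidate models of one item.
candidates = [
    Candidate((1, 1, 0), 251.3, {"ar.L1": 0.003}),
    Candidate((1, 1, 1), 248.9, {"ar.L1": 0.210, "ma.L1": 0.001}),
    Candidate((0, 1, 1), 250.1, {"ma.L1": 0.000}),
]

best = select_best(candidates)
print(best.order)  # (0, 1, 1)
```

In this example (1, 1, 1) has the lowest AIC but is rejected because its AR term is not significant, so the next-lowest AIC among the remaining candidates is chosen.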

5. Prediction (Forecasting)
The model considered best can be used for prediction, in particular for values in the next several periods. The predictions are also produced with ARIMA imported from the statsmodels package in Python, in the same way as the ARIMA summary. The predicted value is used as a consideration in determining the quantity of each item to bring in during that period.

III. RESULT AND DISCUSSION
In this research, the 5 items with the highest level of consumption were taken from the 14 items used in hotel office operations, over the period from January 2018 to October 2020. The ranking was based on the total consumption of each item over the 34 months, the highest monthly consumption figure within the 34 months, and the average consumption per month, calculated in Microsoft Excel with the Sum, Max, and Average functions. In the end 6 items were taken, because the items ranked 1st to 4th are the same under all three criteria (tissue, water gallon, pen, and stapler filling), while the item ranked 5th by Sum and Average (HVS paper) differs from the one ranked 5th by Max (masking tape). The time series of all 6 items were processed to identify the plots. In the plots of the six items, the level of consumption decreased starting in March, until it reached 0 in April 2020 and May 2020. The decrease was caused by the poor business conditions the hotel experienced during the COVID-19 pandemic: the reduced number of hotel visitors had an impact on administrative activities at the hotel office, and the number of employees working there was also reduced. Due to the declining state of the business, Pacific Palace Hotel decided to temporarily suspend its operational activities from April to May 2020.
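
The item ranking was done in Microsoft Excel; purely as an illustration, the same Sum, Max, and Average ranking could be sketched in pandas as below. The item names and consumption values are randomly generated placeholders, not the hotel's data.

```python
import numpy as np
import pandas as pd

# Made-up monthly consumption for 14 hypothetical items over 34 months.
rng = np.random.default_rng(3)
items = [f"item_{i:02d}" for i in range(1, 15)]
months = pd.period_range("2018-01", periods=34, freq="M")
rates = rng.integers(5, 60, size=14)
data = pd.DataFrame(rng.poisson(lam=rates, size=(34, 14)), index=months, columns=items)

# One row per item, one column per ranking criterion.
stats = pd.DataFrame({"Sum": data.sum(), "Max": data.max(), "Average": data.mean()})

# Top 5 items under each criterion, then the union of the three lists.
top5 = {name: set(stats[name].nlargest(5).index) for name in stats.columns}
selected = sorted(set.union(*top5.values()))

print(stats.sort_values("Sum", ascending=False).head())
print("Items selected:", selected)
```

Sum and Average always give the same ranking because every item covers the same number of months, so only the Max criterion can add items beyond the first five.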
Stationarity of the plots can be identified with the Augmented Dickey-Fuller (ADF) test. The null hypothesis (H0) of the ADF test is that the time series is non-stationary. If the p-value obtained from the test is lower than the significance level (< 0.05), H0 is rejected and the time series can be concluded to be stationary (Prabhakaran, 2019). If the p-value is > 0.05, the order of differencing is then sought. The p-values obtained are reported in a table. The only time series that requires differencing on this basis is the water gallon series (p-value above the significance level). However, even though the five other series are stationary, differencing is still needed for the pen, stapler filling, masking tape, and HVS paper series, because the AR and MA terms obtained from their original data plots are both 0, so that the prediction model parameter becomes (0, 0). Such parameters cannot be used to build a prediction model.
The ideal differencing term is the minimum order of differencing needed to obtain a series that is close to stationary, where the autocorrelation plot reaches zero rapidly. If the autocorrelation is positive over a large number of lags, the series needs further differencing. If, on the other hand, the lag-1 autocorrelation is strongly negative, the series is over-differenced. The differencing term can also be determined with the ADF test on the differenced series. One example of research that follows this method is the study by As'ad et al. (2017), in which the ADF test was carried out on the differenced series to obtain a differencing term of 3, with p-values for the 1st, 2nd, and 3rd differenced series of 0.4427, 0.06441, and 0.04239, respectively. The possible differencing terms of the five evaluated series are given in a table.
The AR (p) and MA (q) terms are determined by identifying the significant lags, those whose values fall outside the shaded area (beyond the significance limit), on the partial autocorrelation and autocorrelation plots, respectively. If a lag value is outside the shaded area, the lag can be concluded to be significant.

Item      PACF (p)   ACF (q)
Tissue    0 and 1    0

Based on the results of the evaluation of the differencing, AR (PACF), and MA (ACF) terms, several possible parameters were obtained for the ARIMA model.

Table 6. Possible ARIMA Parameter

[Table: Item and Equation columns for Tissue, Water Gallon, Stapler Filling, Masking Tape, and HVS Paper]

The chosen parameters are used to predict the consumption of the 6 items (tissue, water gallon, pen, stapler filling, masking tape, and HVS paper) for the next 2 months (November and December 2020). The prediction results are reported with all decimal numbers rounded up. These results can be applied to the resupply of items at the Pacific Palace Hotel's storage.
The quantity of each item left in storage at that time must be known. The quantity needed for resupply is then obtained by subtracting the quantity left in storage from the predicted quantity. This prediction is carried out as a practice of economic lot sizing (minimizing the cost per unit) and of anticipation supply (preparing seasonal supply) (Handoko, 1999).
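
The resupply calculation described above amounts to a subtraction, sketched below. The forecast and stock figures are invented, resupply_quantity is a hypothetical helper, and treating a negative result as zero (nothing to order) is an assumption added for this example.

```python
import math


def resupply_quantity(forecast, stock_left):
    """Rounded-up forecast minus the stock on hand, never below zero."""
    needed = math.ceil(forecast) - stock_left
    return max(needed, 0)


# Invented forecast values and stock counts for two items.
forecasts = {"tissue": 41.3, "pen": 12.0}
stock_left = {"tissue": 15, "pen": 20}

plan = {item: resupply_quantity(forecasts[item], stock_left[item]) for item in forecasts}
print(plan)  # {'tissue': 27, 'pen': 0}
```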
ARIMA processing in Python uses several commonly used packages, namely pandas, matplotlib, and statsmodels, complemented by several other packages so that the processing runs normally.
Item resupply in storage can be managed by calculating the quantity of each item that must be brought in for the month.
ARIMA is a prediction method, but it is not suitable for long-term forecasting. For example, when the researcher produced predictions for later months, some of the predictions were negative. The researcher suggests rerunning the ARIMA process whenever the item consumption data are updated.
Initially, the research used the consumption data of the last 10 months (January to October 2020), before the researcher extended it to the last 34 months (January 2018 to October 2020). The sample was extended because no AR or MA terms could be identified from the plots, even after the plots were differenced. Those terms were identified once the sample data had been extended. ARIMA is a Machine Learning method, so a larger sample is preferable for making the results of data processing more accurate. The researcher reaffirms that this research can serve as a reference for designing a resupply decision-making system.