Forecasting and Mapping Coffee Borer Beetle Attacks Using GSTAR-SUR Kriging and GSTARX-SUR Kriging Models

The research aimed to use Generalized Space Time Autoregressive (GSTAR) and GSTARX modeling with the Seemingly Unrelated Regression (SUR) approach and combine them with the Kriging interpolation technique in an unobserved location. The case study was coffee borer beetle forecasting in Probolinggo Regency, East Java, Indonesia, with Watupanjang Village as the unobserved location. The results show that GSTAR-SUR Kriging and GSTARX-SUR Kriging models can predict coffee borer beetle attacks in unobserved areas with high accuracy. It is indicated by the Mean Absolute Percentage Error (MAPE) values of less than 10%. The addition of exogenous variables (rainfall) into the model is proven to improve the accuracy of the model. The Root-Mean-Square Error (RMSE) value of the GSTARX-SUR Kriging model is smaller than the GSTAR-SUR Kriging model. The structure of the model produced from the research, GSTARX-SUR (1,[1,12])(10,0,0), can be used as a reference in modeling coffee borer beetle attacks in other regencies. Map of forecasting coffee borer beetle attack shows that the spread of coffee borer beetle attack is spatial clustering with the attack center located in the eastern region of Probolinggo Regency.


I. INTRODUCTION
Generalized Space Time Autoregressive (GSTAR) is a multivariate time series data model that involves location aspects. The GSTAR model is a natural generalization of Space Time Autoregressive (STAR) models, allowing the autoregressive parameters to vary per location. Hence, the GSTAR model is applicable to the heterogeneous characteristic of sample locations (Ruchjana, Borovkova, & Lopuhaa, 2012). The parameter estimation of the GSTAR model with Ordinary Least Square (OLS) has weaknesses when inter-location residuals are correlated. To overcome this issue, Seemingly Unrelated Regression (SUR) approach is done using Generalized Least Square (GLS) as a parameter estimation method for the model (Iriany, Suhariningsih, Ruchjana, & Setiawan, 2013). GSTAR model with the SUR approach is known as GSTAR-SUR.
The GSTAR model will undoubtedly be more informative and useful by adding exogenous variables to the model. Because exogenous variables are also affected by time, the model can use the transfer function approach. Modeling by entering exogenous variables into the GSTAR model is known as GSTARX. Suhartono, Wahyuningrum, Setiawan, and Akbar (2016) developed the GSTARX-GLS model in forecasting the inflation rate in four major cities in East Java. They used the increase in fuel oil as a non-metric exogenous variable. The results showed that GSTARX was better than the Vector Autoregressive Integrated Moving Average with Exogenous Variable (VARIMAX) model. Moreover, Astuti, Ruchjana, and Soemartini (2017) applied GSTARX model to predict an export volume of Crude Palm Oil (CPO) in several locations on the island of Sumatera, in which X is the international CPO prices. CPO export volume in one area was affected by the CPO export volume in the past in the same location, the volume of CPO exports in the past in other areas, and international CPO prices.
Meanwhile, Andayani, Sumertajaya, Ruchjana, and Aidi (2017) compared the Generalized Space Time Autoregressive Integrated Moving Average (GSTARIMA) and Generalized Space Time Autoregressive Integrated Moving Average with Exogenous Variable (GSTARIMAX) models in approaching rice price data in six provinces in Java. In the research, the exogenous variables used were metric data, namely milled dry grain price. The results showed that the GSTARIMAX was better than the GSTARIMA model. The three previous researchers prove that adding exogenous variables into the GSTAR model can improve forecasting accuracy.
GSTAR model normally can only predict an event in the future in locations where data are indeed used to build the model. The case that often happens is that a location does not have data or complete data. Hence, several alternatives can be done, one of which is to combine the GSTAR model with interpolation techniques. Research on this matter has been carried out by Abdullah et al. (2018). They combined GSTAR with Kriging interpolation, known as GSTAR-Kriging model. In this previous research, exogenous variables are not added to the model. According to Ruchjana et al. (2012), GSTAR model with autoregressive order (p) and spatial order (λ) can be written as follows: (1) The z (t) is (N×1) vector observations at the t-time. Then, λ k is spatial order from k-th AR, and is a diagonal matrix with diagonal elements as autoregressive (AR) and space-time for each location (Ф kl (1) , …, Ф kl (N) ). Last, is white noise with an average vector of 0 and a matrix of variance-covariance σ 2 I (Borovkova, Lopuhaä, & Ruchjana, 2008).
GSTARX model is a GSTAR model that considers the existence of exogenous variables. The variables are thought to affect modeled endogenous variables. GSTARX model with the autoregressive order (p), spatial order (λ), and the order transfer function (b, r, s) can be written as follows: ( 2) The is m x 1 variable vector X at t -b time, and is m xm diagonal matrix of the transfer function parameters, with and . Then, SUR is an equation parameter estimation using GLS (Nisak, 2016). SUR model consists of several equations in which the remainder does not correlate between observations in one equation. However, it relates one equation with another. To test the correlation between the equations, the Lagrange Multiplier test statistic is used (Greene, 2012). SUR model with m equations is stated as follows (Nisak, 2016). It shows Z as observations vector, X as diagonal matrix of independent variables, β as parameter model, and ε as residuals vector. (3) Assumptions that must be fullfiled in the SUR model are E(ε) = 0 and E(εε') = σ ij I T, where E(ε) is the expected value of residuals vector, E(εε') is the expected value of residuals vector multiplication, σ ij is variance matrix, I T is identity matrix, and i, j = 1,2, … , m. Variance-covariance matrix stated by Ω is in Equation (4). The matrix Ω sized is (N×T) × (N×T).
(4) According to Setiawan, Suhartono, and Prastuti (2016), parameter estimation in GSTARX-SUR model is done by applying the GLS method. It minimizes the number of general squares of . The results from GLS estimators for GSTAR-SUR and GSTARX-SUR models are obtained with Equation (5). The is parameter estimators, and Ω is variance-covariance matrix. (5) Next, Kriging interpolation is a mathematical function to estimate values at locations where the data are not available or not sampled based on the sampled points around it using the suitable semivariogram model. Semivariogram describes and models the spatial autocorrelation between data from a variable and functions as a variance measure (Gaetan & Guyon, 2010). Semivariogram is divided into two kinds: theoretical semivariogram and experimental semivariogram. There are various theoretical semivariogram models, such as spherical, exponential, gaussian, and circular models. An experimental semivariogram is based on the spatial correlation value between two variables separated by a certain distance. The experimental semivariogram is formulated in Equation (6) (Cressie, 1993). It shows s i as sample point location; Z(s i ) as observation value at the location of s i ; h as the distance between two sample points of s i ; s i + h as pair of sample points within h; and N(h) as the number of data pairs that have a distance of h.
After the experimental semivariogram values are obtained, parameters can be calculated for theoretical semivariogram calculations. Some parameters used to find values in theoretical semivariograms are sill, nugget, and range (Webster & Oliver, 1992). After obtaining the values of the three parameters, the theoretical semivariogram values are calculated and compared with the experimental semivariogram.
The research uses GSTAR model with SUR approach and tries to add exogenous variables to the model and integrate it with Kriging interpolation techniques in forecasting in unobserved locations. The case study is forecasting coffee borer beetle in Probolinggo Regency. This case study is chosen by considering that coffee is an essential commodity for the national economy and has a high economic value. The problems in the coffee industry are the low productivity of coffee plants and the low quality of coffee beans. The leading cause is the high attack of pests and diseases. One of the main pests that attack coffee plants is the coffee borer beetle. The damage by this pest can reach 4050% of the weight of coffee beans. One of the centers of coffee borer infestation in East Java is Probolinggo Regency.
The coffee borer beetle is a major pest in coffee plantations worldwide (Infante, Pérez, & Vega, 2012). It causes a decrease in productivity and quality of the beans. It is a dark black beetle borer known as Hypothenemus Hampei Ferr. It is also a major plant pest of coffee plants because of its rapid development (Baker, Barrera, & Rivas, 1992). It attacks young fruits and causes fruit declines. Meanwhile, the attacks on fairly old fruit cause defective holes and poor quality coffee beans (Damon, 2000;Jaramillo, Borgemeister, & Baker, 2006). It means that fruit borer attacks cause low production and reduce the quality of coffee beans, which results in increased costs for sorting defective seeds. Population dynamics and patterns of infestation by the coffee borer beetle are closely related to climate factors such as rainfall and relative humidity, and Forecasting and Mapping Coffee..... (Henny Pramoedyo et al.) the physiology of coffee plants (Jaramillo et al., 2006).

II. METHODS
The research is quantitative research. The data consist of secondary data and primary data. Secondary data are used to build the model (training data), and primary data validate the model obtained. The training data are 66 observations from January 2014 to June 2019. Meanwhile, the testing data are three observations from July to September 2019. Data are multivariate time series in ten villages in the six biggest coffee producing sub-districts in Probolinggo Regency. Secondary data are obtained from Balai Besar Perbenihan dan Proteksi Tanaman Perkebunan (BBPPTP) in Surabaya and the official website of the National Aeronautics and Space Administration (NASA). Then, the primary data are obtained from direct observation of the research location. The research variable is the percentage of the coffee borer beetle monthly attack area (in percentage) as a predictor variable and monthly rainfall (in millimeters) as an exogenous variable.
There are six steps of the data analysis in the research. First, perform the GSTAR and GSTARX model by identifying the GSTAR and GSTARX model orders and estimating the parameters of the GSTAR-SUR and GSTARX-SUR models. Second, forecast the next three months using the obtained GSTAR-SUR and GSTARX-SUR models. Third, model the GSTAR-SUR Kriging and GSTARX-SUR Kriging by determining nine observed locations and one unobserved location, estimating the parameters of the GSTARX-SUR model at observed locations with SUR and unobserved locations with Kriging interpolation, and establishing GSTARX-SUR Kriging forecasting model in all locations. Fourth, forecast the next three months using the obtained GSTAR-SUR Kriging and GSTARX-SUR Kriging models. Fifth, determine the best forecasting model based on the Root Mean Squared Error (RMSE) and Mean Absolute Percentage Error (MAPE) values. Last, make a forecast map of coffee borer beetle attacks in the Probolinggo Regency.

III. RESULTS AND DISCUSSIONS
The spatial order in the research is limited to λ = 1, and the order of time (p) is determined through the Matrix Partial Cross Correlation Function (MPCCF) and Corrected Akaike's Information Criterion (AICC) schemes. Based on the MPCCF and AICC schemes, it can be concluded that the order is p = 1. In addition to lag 1, GSTAR model time order also adds lag 12 by considering that data plots for coffee borer beetle attacks tend to have a 12-month seasonal pattern. Hence, the GSTAR model (λ, p) with spatial lag 1 is expressed as GSTAR (1, [1,12]). Furthermore, the order of the transfer function is identified using the Cross-Correlation Function (CCF) plot. The structure of the GSTARX-SUR model can be expressed by GSTARX-SUR (1, [1,12]) (10,0,0). Estimation of GSTAR model parameters is done by using the OLS method first to generate the residuals. From the residuals of the OLS method, the obtained matrix of variance-covariance will be used in estimating model parameters using the SUR method. In summary, the estimation results of the GSTAR-SUR model parameters are presented in Table 1. From the estimation results of the GSTAR model parameters, there are locations where the coffee borer beetle attack is affected by the attack one month earlier at that location. Meanwhile, some attacks are affected by the attack in one month and one year earlier at that location, and some are affected by the one attack in the previous month at the surrounding location. It shows that the coffee borer beetle attack in Probolinggo Regency is influenced not only by the time aspect but also by the spatial aspect.
Similar to the GSTAR, the GSTARX model parameter estimation is performed using the OLS method to generate the remainder. Furthermore, from the remainder of the OLS method, a matrix of variance-covariance will estimate the model parameters using the SUR method. In summary, the estimation results of GSTARX-SUR model parameters are presented in Table 2.  From the estimation results of the GSTARX-SUR model parameters in Table 2, the results of the significance testing of the GSTARX-SUR model parameters between locations also vary considerably. Based on the results of the parametric test of ω 10 , the rainfall in ten months earlier significantly influences the coffee borer beetle attack in nine locations. It shows that the coffee borer beetle attack in Probolinggo Regency is affected by the time and spatial aspects and rainfall from the previous ten months. Based on the estimation results of the model parameters in Table 1 and Table 2, GSTAR-SUR and GSTARX-SUR models can predict the coffee borer beetle attacks in ten villages in six sub-districts in the Probolinggo Regency. Next, using the GSTAR-SUR and GSTARX-SUR models, the forecasting of coffee borer beetle attack can be done. Figure 1 presents a plot of prediction data and actual data in Segaran Village, Tiris Sub-district.
In the research, from ten villages in Probolinggo Regency, there are nine villages with completely available data. However, the data of Watupanjang Village, Krucil Sub-district village are not available. This village is chosen as an unobserved location. The location of coffee plantations in the area is difficult to reach due to difficult road access. GSTARX-SUR model in nine locations is observed using stages as well as GSTARX-SUR model in ten locations. The forecasting stage is done after the Kriging interpolation process to get the estimated value of the GSTARX-SUR Kriging model parameters in an unobserved location.
Estimating model parameters at unobserved locations is done by Kriging interpolation using the estimated values of model parameters at nine observed locations. Comparison of the estimated results of the GSTAR Kriging and GSTARX Kriging model parameters with the ordinary GSTAR and GSTARX models in Watupanjang Village is presented in Table 3 and Table 4.  -0,021 -0,277 ω 10 (7) 0,037 0,043 In Table 3, the results of the estimated parameters of the GSTAR-SUR Kriging model do not differ greatly from the ordinary GSTAR-SUR model. Meanwhile, in Table 4, the estimation results of the GSTARX-SUR Kriging model parameters for the three parameters are relatively not much different, and the other two parameters are slightly different. However, the difference is not so big. The results obtained are similar to Abdullah et al. (2018). The estimation results of the model parameters in both the observed locations and one unobserved location are used to form the GSTAR-SUR Kriging and GSTARX-SUR Kriging models to predict the coffee borer beetle attacks in ten locations.
The goodness of the GSTAR-SUR Kriging and GSTARX-SUR Kriging can be seen by looking at the forecasting of MAPE value. It can also compare RMSE values between the GSTAR-SUR Kriging and GSTAR-SUR and GSTARX-SUR Kriging and GSTARX-SUR. Forecasting models are reliable if they have a MAPE forecasting value of less than 10%. The best model selection criteria are performed using MAPE and RMSE values. The best model is the model with smaller MAPE and RMSE values. Table 5 presents the MAPE and RMSE values of the GSTAR-SUR Kriging and GSTARX-SUR Kriging models.  Table 5, both GSTAR-SUR Kriging and GSTARX-SUR Kriging have MAPE values of less than 10%, so both of them are appropriate to predict coffee borer beetle attacks in Probolinggo Regency. When those are compared, the accuracy of the GSTARX-SUR Kriging is better than the GSTAR-SUR Kriging. It is indicated by the overall MAPE and RMSE values of the GSTARX-SUR Kriging, which are smaller than the GSTAR-SUR Kriging.
The researchers then map the forecasting of coffee borer beetle attacks in eight coffee-producing districts in Probolinggo Regency. It is the forecasting result of interpolation of the coffee borer beetle attack in ten locations in six coffee-producing districts in the Probolinggo Regency and the results of the GSTAR-SUR Kriging and GSTARX-SUR Kriging models. From each forecasting model, two forecasting maps are formed, namely the forecasting map of coffee borer beetle attack for July 2019 and August 2019. Meanwhile, forecasting results for September 2019 are not mapped because the attacks are relatively low and evenly distributed in all districts. The map of forecasting coffee borer attack in July and August 2019 from GSTAR-SUR Kriging is presented in Figures 2 and 3.
From the forecasting map of the coffee borer beetle attacks resulting from the GSTAR Kriging model, coffee borer beetle attacks in July are predicted to be quite high in the eastern region of Probolinggo Regency, especially in the Tiris area. The coffee borer beetle attack in the eastern area of Probolinggo Regency is relatively higher than the western area. It considers that the type of coffee planted in the eastern area is generally Robusta coffee, which is more susceptible to the coffee borer beetle attacks. The coffee borer beetle attacks are predicted to begin to decline in August in almost all regions. The coffee borer beetle attacks in September are predicted to decrease, and the intensity of the attack is no more than 7%. The map of forecasting coffee borer beetle attack for July and August 2019 using GSTARX-SUR Kriging model is presented in Figures 4  and 5.
In Figures 4 and 5, the coffee borer beetle attacks in July are predicted to be quite high in the eastern region of Probolinggo Regency, especially in the Tiris, Krucil, and Gading areas. The coffee borer beetle attacks in the eastern region are relatively higher than in the western region. The coffee borer beetle attacks are predicted to start to decline in August in almost all regions. Only in the southern part of Tiris, which has relatively high attack intensity. The coffee borer beetle attacks in September are predicted to decrease, and the intensity of the attack is no more than 7%. Comparing the forecasting map, the GSTARX-SUR Kriging provides a slightly higher forecasting value than the GSTAR-SUR Kriging. Based on the forecasting map of coffee borer beetle attacks, it is recommended to the Probolinggo Regency plantation office together with farmers to carry out Integrated Pest Control (IPM) during June. It is to prevent the high intensity of the coffee borer beetle attacks, which are predicted to occur during July in the Tiris, Krucil, and Gading areas. It also needs to be done considering that the distribution pattern of coffee borer beetle attacks tends to be in a seasonal pattern. Thus, the successful pest control this year will affect the coffee borer beetle attacks in the next year's harvest. Another effort that can be done is by cleaning the remaining attacked coffee fruit (not harvested) and not leaving it to break the life cycle of the coffee borer beetle attacks. The long dry season during 2019 also has the potential to slow down the flowering phase. It is estimated that the harvest period in 2020 will be delayed so that the peak of the attack next year slightly changes. If the usual peak of the coffee borer beetle attacks occurs in June and July, in 2020, it will shift slightly to July and August 2020.

GSTAR-SUR
Kriging and GSTARX-SUR Kriging models can predict coffee borer beetle attacks in unobserved locations with high accuracy. It is indicated by MAPE values of less than 10%. The addition of exogenous variables (rainfall) into the model is proven to improve the accuracy of the model. The RMSE value of the GSTARX-SUR Kriging model is smaller than the GSTAR-SUR Kriging model. The structure of the model produced from the research, GSTARX-SUR (1, [1,12])(10,0,0), can be used as a reference in modeling coffee borer attacks in other regencies. Map of forecasting coffee borer beetle attack shows that the spread is spatial clustering with the attack center located in the eastern region of Probolinggo Regency.
The GSTAR-SUR Kriging and GSTARX-SUR Kriging models have limitations. The forecasting in an unobserved location still requires observational data at that location. The data can be observational data at one and twelve months before. Therefore, for further research, modifications can be made in the flow of interpolation. Future researchers can conduct Kriging interpolation first to predict time series data in unobserved locations. After completing data in all locations, GSTAR-SUR or GSTARX-SUR models can be formed as usual. In addition, they can also combine the GSTARX-SUR model with other interpolation techniques besides Kriging, such as Inverse Distance Weighted (IDW) or Spline.