Volatility Fitting Performance of QGARCH (1,1) Model with Student-t, GED, and SGED Distributions

The research had two objectives. First, it compared the performance of the Generalized Autoregressive Conditional Heteroscedasticity (GARCH)(1,1) and Quadratic GARCH (QGARCH)(1,1) models based on their fit to real data sets. The models assumed that the return error follows one of four distributions: Normal (Gaussian), Student-t, Generalized Error Distribution (GED), and Skewed GED (SGED). Maximum likelihood estimation is usually employed in estimating the GARCH model, but it may not be easily applied to more complicated ones. Second, it provided two ways to estimate the considered models: the Generalized Reduced Gradient (GRG) Non-Linear method in Excel's Solver and the Adaptive Random Walk Metropolis (ARWM) method in the Scilab program. The real data in the empirical study were the Financial Times Stock Exchange Milano Italia Borsa (FTSEMIB) and Stoxx Europe 600 indices, observed daily from January 2000 to December 2017. They were used to test the conditional variance process and to see whether the estimation methods could adapt to the complicated models. The analysis shows that the GRG Non-Linear method in Excel's Solver and the ARWM method give close results, which indicates a good estimation ability. Based on the Akaike Information Criterion (AIC), the QGARCH(1,1) model provides a better fit than the GARCH(1,1) model under each distribution specification. Overall, the QGARCH(1,1) model with SGED provides the best fit for both data sets.


I. INTRODUCTION
Volatility is an essential part of economic analysis and decision making, such as determining option prices (Bi, Yousuf, & Dash, 2014; Huang, Wang, & Hansen, 2017). According to Abdalla and Winker (2012), statistical volatility can be interpreted as the standard deviation of changes in the value (return) of an asset over a specific period. The volatility of time series data can exhibit heteroscedasticity, which means that the volatility varies over time.
A popular model for heteroscedastic volatility is the Generalized Autoregressive Conditional Heteroscedasticity (GARCH)(1,1) model of Bollerslev (1986). The GARCH(1,1) model has been modified and extended to allow an asymmetric relationship between volatility and return. According to Francq and Zakoian (2019), asymmetry means that past positive and negative returns of the same absolute value have different effects on current volatility. Several asymmetric GARCH models have been proposed in the literature, such as the Asymmetric GARCH (AGARCH) of Engle and Ng (1993), the Glosten-Jagannathan-Runkle GARCH (GJR-GARCH) of Glosten, Jagannathan, and Runkle (1993), the Exponential GARCH (EGARCH) of Nelson (1991), and the Quadratic ARCH (QARCH) of Sentana (1995). Among them, the research focuses on the Quadratic GARCH (QGARCH) model, which includes an additional term to describe the skewness property (Takaishi, 2009).
In addition to the asymmetry in the data, many financial studies have found that the return distributions of financial assets are heavy-tailed and skewed. To handle heavy tails in the context of the ARCH/GARCH model, Bollerslev (1987) applied the Student-t distribution, and Nelson (1991) proposed the Generalized Error Distribution (GED). Meanwhile, to handle both tail thickness and skewness, Theodossiou (2015) introduced the Skewed Generalized Error Distribution (SGED). To the best of the authors' knowledge, no study compares those distributions in the context of the QGARCH model. Therefore, the first contribution of the research is an empirical comparison of the GARCH and QGARCH models with Normal, Student-t, GED, and SGED distributions. In this research, the volatility fitting performance of the competing models is investigated using the log-likelihood ratio test and the Akaike Information Criterion (AIC). Furthermore, the model parameters are estimated using the Generalized Reduced Gradient (GRG) Non-Linear method provided by Excel's Solver and the Adaptive Random Walk Metropolis (ARWM) method in a Markov Chain Monte Carlo (MCMC) scheme implemented in the Scilab program. Both methods were successfully applied by Nugroho et al. (2019a) and Nugroho, Susanto, Prasetia, and Rorimpandey (2019b) in estimating other GARCH models. Therefore, the second contribution is to evaluate the ability of the GRG Non-Linear method in Excel's Solver to estimate the QGARCH models.

II. METHODS
The real data used in the application are the Financial Times Stock Exchange Milano Italia Borsa (FTSEMIB) and Stoxx Europe 600 (Stoxx600) indices. The FTSEMIB is the primary benchmark stock index for the Italian stock exchange. It measures the performance of the 40 most-traded stock classes on the exchange. Meanwhile, the Stoxx600 index is derived from the Stoxx Europe Total Market Index and covers the largest 600 stocks across 17 countries of the European region. The daily returns of both indices cover the period from January 2000 to December 2017 (4,433 and 4,502 observations for FTSEMIB and Stoxx600, respectively) and are publicly available in the Oxford-Man Institute's realized library. The daily percentage return at time t, $R_t$, is calculated using Equation (1), where $S_t$ denotes the asset price at time t.
$$R_t = 100 \times \left(\ln S_t - \ln S_{t-1}\right) \tag{1}$$
The summary statistics for both return series are in Table 1. The Jarque-Bera (JB) normality test shows that the returns are not normally distributed: the JB statistics are greater than the critical value of 5.99 at the 5% significance level. The critical value is obtained from the chi-square distribution table since the JB statistic asymptotically follows a chi-square distribution with 2 degrees of freedom (Jarque, 2011). Heavy tails in the returns are indicated by kurtosis values greater than 3, whereas asymmetry is indicated by skewness that is not equal to zero. Therefore, the Student-t, GED, and SGED distributions are expected to be more appropriate for the return error than the Normal distribution.
QARCH is one type of asymmetric ARCH model that allows an asymmetric relationship between past returns and current volatility (Sentana, 1995). The QGARCH(1,1) model is similar to the GARCH(1,1) model, with an additional parameter to capture the volatility-return relationship.
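As an illustration of the data preparation described in this section, the following Python sketch computes percentage returns and the Jarque-Bera statistic. It is not the authors' Excel or Scilab implementation: the function names and the toy price series are hypothetical, and the returns are assumed to be log returns scaled by 100.

```python
import math

def percentage_log_returns(prices):
    """Percentage log returns 100 * (ln S_t - ln S_{t-1}) (assumed return definition)."""
    return [100.0 * (math.log(b) - math.log(a)) for a, b in zip(prices, prices[1:])]

def jarque_bera(x):
    """Return (JB statistic, sample skewness, sample kurtosis) from central moments."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    return jb, skew, kurt

# Hypothetical toy prices, not the FTSEMIB or Stoxx600 data.
prices = [100.0, 101.5, 99.8, 100.2, 103.0, 102.1]
returns = percentage_log_returns(prices)
jb, skew, kurt = jarque_bera(returns)
# Normality is rejected at the 5% level when JB exceeds the chi-square(2) value 5.99.
reject_normality = jb > 5.99
```

With only a handful of observations the test has little power; the sketch only shows how the quantities reported in Table 1 could be computed.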
The QGARCH(1,1) model is expressed as follows:
$$\sigma_t^2 = \omega + \alpha R_{t-1}^2 + \beta \sigma_{t-1}^2 + \gamma R_{t-1} \tag{2}$$
where $\sigma_t^2$ denotes the conditional variance of the return $R_t$. The constraints ω > 0, α ≥ 0, β ≥ 0, and γ² ≤ 4αω ensure positivity of the conditional variance, and 0 ≤ α + β < 1 is required for variance stationarity. The parameter γ captures the asymmetric effect. When γ = 0, the model reduces to the GARCH(1,1) model. When γ > 0, a past positive return increases the current variance more than a past negative return of the same magnitude. Conversely, when γ < 0, a past negative return increases the current variance more than a past positive return of the same magnitude. This phenomenon is known as the leverage effect.
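The variance recursion can be sketched in Python as follows. This is a minimal illustration that assumes the usual QGARCH(1,1) form of Sentana (1995), in which the asymmetry enters through a term that is linear in the lagged return. The function names, the parameter values, and the choice of the sample variance as the starting value are assumptions made for the example.

```python
def qgarch_variance(returns, omega, alpha, beta, gamma, sigma2_0=None):
    """Conditional variances under the assumed recursion
    sigma2[t] = omega + alpha * R[t-1]**2 + beta * sigma2[t-1] + gamma * R[t-1]."""
    n = len(returns)
    if sigma2_0 is None:
        # Starting value: sample variance of the returns (an assumption).
        mean = sum(returns) / n
        sigma2_0 = sum((r - mean) ** 2 for r in returns) / n
    sigma2 = [sigma2_0]
    for t in range(1, n):
        r_prev = returns[t - 1]
        sigma2.append(omega + alpha * r_prev ** 2 + beta * sigma2[-1] + gamma * r_prev)
    return sigma2

def is_admissible(omega, alpha, beta, gamma):
    """Positivity (gamma**2 <= 4 * alpha * omega) and variance stationarity checks."""
    positive = omega > 0 and alpha >= 0 and beta >= 0 and gamma ** 2 <= 4 * alpha * omega
    return positive and alpha + beta < 1

# Hypothetical parameter values: a negative gamma gives a leverage effect.
after_loss = qgarch_variance([-1.0, 0.0], 0.1, 0.1, 0.8, -0.05, sigma2_0=1.0)[1]
after_gain = qgarch_variance([1.0, 0.0], 0.1, 0.1, 0.8, -0.05, sigma2_0=1.0)[1]
```

Here `after_loss` exceeds `after_gain`, which mirrors the sign interpretation of γ; setting `gamma=0` recovers the GARCH(1,1) recursion.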
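When the return error $\varepsilon_t$ follows the Normal distribution, the total log-likelihood function of the model is given by:
$$\ln L(R \mid \theta) = -\frac{1}{2} \sum_{t=1}^{T} \left[ \ln\left(2\pi\sigma_t^2\right) + \frac{R_t^2}{\sigma_t^2} \right] \tag{3}$$
where $L(R \mid \theta)$ denotes the likelihood function of the data $R = (R_1, \ldots, R_T)$ conditional on the parameter vector θ, and $\sigma_t^2$ is the conditional variance at time t. Following Bollerslev (1987), the total log-likelihood function for the model in which $\varepsilon_t$ follows the Student-t distribution is expressed as:
$$\ln L(R \mid \theta) = \sum_{t=1}^{T} \left[ \ln\Gamma\left(\frac{\nu+1}{2}\right) - \ln\Gamma\left(\frac{\nu}{2}\right) - \frac{1}{2}\ln\left(\pi(\nu-2)\sigma_t^2\right) - \frac{\nu+1}{2}\ln\left(1 + \frac{R_t^2}{(\nu-2)\sigma_t^2}\right) \right] \tag{4}$$
The parameter ν > 2 is the degrees of freedom, which controls the tail thickness. As ν goes to infinity, the distribution approaches the Normal distribution. Meanwhile, smaller degrees of freedom mean heavier tails (Blangiardo & Cameletti, 2015).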
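For readers who prefer code to formulas, the two log-likelihoods can be sketched in Python. The sketch assumes zero-mean returns with conditional variances already computed, and it uses the standardized (unit-variance) Student-t form associated with Bollerslev (1987). The function names and the sample numbers are hypothetical.

```python
import math

def loglik_normal(returns, sigma2):
    """Gaussian log-likelihood of zero-mean returns given conditional variances."""
    return sum(-0.5 * (math.log(2.0 * math.pi * s2) + r * r / s2)
               for r, s2 in zip(returns, sigma2))

def loglik_student_t(returns, sigma2, nu):
    """Log-likelihood under a standardized Student-t error with nu > 2."""
    if nu <= 2:
        raise ValueError("nu must exceed 2 for the variance to be finite")
    const = (math.lgamma((nu + 1.0) / 2.0) - math.lgamma(nu / 2.0)
             - 0.5 * math.log(math.pi * (nu - 2.0)))
    return sum(const - 0.5 * math.log(s2)
               - 0.5 * (nu + 1.0) * math.log1p(r * r / ((nu - 2.0) * s2))
               for r, s2 in zip(returns, sigma2))

# Hypothetical returns and variances, including one large outlier.
sample_r = [0.5, -1.2, 0.3, 6.0]
sample_s2 = [1.0, 1.5, 0.8, 1.0]
ll_normal = loglik_normal(sample_r, sample_s2)
ll_t5 = loglik_student_t(sample_r, sample_s2, nu=5.0)
```

Because of the outlier, `ll_t5` is larger than `ll_normal`, which is the heavy-tail effect that motivates the Student-t specification. For very large ν the two functions nearly coincide.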
For the GED, the parameter ν > 0 describes the tail thickness. If ν = 2, the GED reduces to the Normal distribution. If ν < 2, the GED has thicker tails than the Normal distribution.
Finally, in the total log-likelihood function for the model with SGED, κ > 0 is a shape parameter that controls the height and tails of the distribution, and λ is the skewness parameter, with −1 < λ < 1. When κ = 2 and λ = 0, the SGED reduces to the Normal distribution.
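The role of the tail-thickness parameter can be checked numerically with a short Python sketch. It assumes the unit-variance GED parameterization of Nelson (1991), in which a scale factor λ is a function of ν; the function names are hypothetical, and the code is not taken from the authors' implementation.

```python
import math

def ged_lambda(nu):
    """Scale factor lambda(nu) that gives the GED unit variance (Nelson, 1991 form)."""
    log_lam = 0.5 * (-(2.0 / nu) * math.log(2.0)
                     + math.lgamma(1.0 / nu) - math.lgamma(3.0 / nu))
    return math.exp(log_lam)

def loglik_ged(returns, sigma2, nu):
    """Log-likelihood of zero-mean returns under a unit-variance GED error, nu > 0."""
    lam = ged_lambda(nu)
    const = (math.log(nu) - math.log(lam)
             - (1.0 + 1.0 / nu) * math.log(2.0) - math.lgamma(1.0 / nu))
    return sum(const - 0.5 * math.log(s2)
               - 0.5 * abs(r / (lam * math.sqrt(s2))) ** nu
               for r, s2 in zip(returns, sigma2))

# At nu = 2 the GED log-likelihood equals the Gaussian one.
ll_ged_2 = loglik_ged([0.5], [2.0], 2.0)
ll_gauss = -0.5 * (math.log(2.0 * math.pi * 2.0) + 0.5 ** 2 / 2.0)
# At nu = 1 an outlier is penalized far less than under the Normal case.
ll_outlier_nu1 = loglik_ged([6.0], [1.0], 1.0)
ll_outlier_nu2 = loglik_ged([6.0], [1.0], 2.0)
```

The comparison of `ll_outlier_nu1` with `ll_outlier_nu2` shows why ν < 2 corresponds to thicker tails than the Normal distribution.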
Solver is a Microsoft Excel add-in for solving and analyzing optimization models (maximization or minimization), including non-linear problems. Financial practitioners commonly use Excel's Solver because no computer programming knowledge is needed to solve numerical optimization problems. The researchers choose the GRG Non-Linear method in Excel's Solver to find the parameter values that maximize the log-likelihood. This method is often the preferred choice for general use (Rothwell, 2017). In the GRG Non-Linear scheme, the existing values of the decision parameters are taken as the initial solution, and small changes in the parameters are then made to improve the objective value (Powell & Batt, 2014).
Next, MCMC is a Bayesian algorithm consisting of two main steps: generating random variables as Markov chains and applying a Monte Carlo approach to calculate statistics from the Markov chains (Marin & Robert, 2014). Within the Bayesian framework, the random variables are generated from the posterior distribution. The posterior distribution of the parameter θ conditional on the data y is given in Equation (12), where $L(y \mid \theta)$ is the likelihood function and ρ(θ) denotes the prior distribution.
$$p(\theta \mid y) \propto L(y \mid \theta)\,\rho(\theta) \tag{12}$$
One method for generating a Markov chain is the ARWM method, which is statistically efficient in estimating GARCH(1,1) models (Nugroho, 2018). The ARWM procedure is as follows. First, the parameter $\theta_0$ and the step size $s_0$ are initialized. Second, starting at iteration n = 1, a proposal $\theta^* = \theta_{n-1} + s_{n-1}\eta_n$ is generated, where $\eta_n \sim N(0,1)$; the Metropolis ratio $r = \min\{1, p(\theta^* \mid y) / p(\theta_{n-1} \mid y)\}$ is calculated; and $u \sim U(0,1)$ is generated. If u < r, then $\theta_n = \theta^*$; otherwise, $\theta_n = \theta_{n-1}$. Third, given $s_n \in [s_{\min}, s_{\max}]$, a candidate step size w is calculated from $m(\theta^*)$, the acceptance frequency of the proposals $\theta^*$. If $w < s_{\max}$, then $s_n = w$; otherwise, $s_n = s_{\max}$.
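The sampling steps above can be sketched in Python for a single scalar parameter. This is only an illustration under stated assumptions. The step-size update used here is a generic diminishing adaptation that steers the acceptance rate toward a target value; it is a stand-in and not necessarily the rule used by the authors. The function name `arwm`, the target rate of 0.44, and the standard Normal test posterior are all hypothetical choices.

```python
import math
import random

def arwm(log_post, theta0, n_iter=6000, s0=1.0, s_min=1e-3, s_max=10.0,
         target_rate=0.44, seed=0):
    """Adaptive random walk Metropolis for one scalar parameter (illustrative)."""
    rng = random.Random(seed)
    theta, step = theta0, s0
    lp = log_post(theta)
    chain, n_accept = [], 0
    for n in range(1, n_iter + 1):
        proposal = theta + step * rng.gauss(0.0, 1.0)  # random walk proposal
        lp_prop = log_post(proposal)
        ratio = math.exp(min(0.0, lp_prop - lp))       # Metropolis ratio r
        accepted = rng.random() < ratio                # accept when u < r
        if accepted:
            theta, lp = proposal, lp_prop
            n_accept += 1
        chain.append(theta)
        # ASSUMED adaptation rule: grow the step after an acceptance, shrink it
        # after a rejection, with adjustments that die out as n increases.
        w = step * math.exp((float(accepted) - target_rate) / n ** 0.6)
        step = min(s_max, max(s_min, w))               # keep step in [s_min, s_max]
    return chain, n_accept / n_iter

# Hypothetical target: a standard Normal log-posterior (up to a constant).
chain, acc_rate = arwm(lambda th: -0.5 * th * th, theta0=3.0, seed=1)
kept = chain[1000:]                 # discard the first 1,000 draws as burn-in
post_mean = sum(kept) / len(kept)
post_var = sum((x - post_mean) ** 2 for x in kept) / len(kept)
```

For a model with several parameters, the same update could be applied to one parameter at a time, with `log_post` evaluating the log-likelihood plus the log-prior.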

III. RESULTS AND DISCUSSIONS
For the GRG Non-Linear scheme in Excel's Solver, the initial values of the model parameters are set first, and the estimation steps then follow Nugroho, Susanto, and Rosely (2018). In the MCMC scheme, the ARWM method is implemented in code written for the Scilab program. With the same starting values as in Excel's Solver, the MCMC algorithm runs for 6,000 iterations for each parameter to generate the Markov chains. The first 1,000 samples are discarded to reduce the non-stationarity caused by the initial values. The remaining 5,000 samples are used to calculate the posterior mean and the 95% Highest Posterior Density (HPD) interval of Chen and Shao (1999). The prior distribution on the parameters ω, α, β, and κ is the left-truncated Normal distribution N(0, 1000), as in Ardia and Hoogerheide (2010). Meanwhile, the prior on λ is the Normal distribution N(0, 1000), and the prior on ν is the exponential distribution Exp(0.01), as in Deschamps (2006).
The estimation result of the MCMC method is used as a reference for assessing the estimation accuracy of the GRG Non-Linear method in Excel's Solver. Hence, the sampling efficiency of the MCMC method is considered first. The sampling efficiency can be assessed by visual inspection of the trace plot of the sampled values for each parameter (Turner, Sederberg, Brown, & Steyvers, 2013). The trace plot for a parameter is a time series plot of the samples (the realizations of the Markov chain) against the iteration number (Roy, 2020). Inspecting the trace plot is the most common graphical method for assessing Markov chain convergence, and it is done by examining how the chain mixes. When a trace plot moves stably across the whole parameter space (the values of the samples), the chain is said to be well mixed, and it converges faster (Tsikerdekis, 2016).
As an illustration, the research reports the sampling efficiency only for the most complicated case, i.e., the QGARCH(1,1) model with SGED. Figure 1 presents the trace plots of the last 5,000 samples of the estimated parameters in the QGARCH(1,1) model with SGED fitted to the FTSEMIB data. The trace plots show that the chains fluctuate around their averages, that is, they are stationary. It indicates that the chains are well mixed and converge to their posteriors. Therefore, the ARWM method in the MCMC scheme is efficient for estimating the model. This result supports the results of Nugroho (2018).
Next, Tables 2 and 3 show the estimated parameters for all models fitted to the FTSEMIB and Stoxx600 data, respectively. First, the value of λ in the case of the GED is not estimated directly. It is calculated using Equation (6) from the estimated value of ν. Second, although Excel's Solver does not enforce strict inequality constraints (">" or "<"), the constraint α + β < 1 is not violated in the research. Such a violation was found by Nugroho et al. (2019b).
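A 95% HPD interval can be approximated from posterior draws by searching for the shortest interval that spans 95% of the sorted samples, which is the idea behind the Chen and Shao (1999) estimator. The Python sketch below illustrates this under simplifying assumptions; the function names and the simulated draws are hypothetical and do not reproduce the paper's results.

```python
import random

def hpd_interval(samples, prob=0.95):
    """Shortest interval whose endpoints are sorted draws spanning about prob of them."""
    xs = sorted(samples)
    n = len(xs)
    m = max(1, min(n - 1, int(prob * n + 1e-9)))  # index gap between the endpoints
    best = min(range(n - m), key=lambda j: xs[j + m] - xs[j])
    return xs[best], xs[best + m]

def excludes(interval, value):
    """True when value lies outside the interval, i.e. the parameter is 'significant'."""
    lower, upper = interval
    return value < lower or value > upper

# Hypothetical posterior draws for an asymmetry parameter centered at -0.05.
rng = random.Random(0)
gamma_draws = [rng.gauss(-0.05, 0.01) for _ in range(5000)]
gamma_hpd = hpd_interval(gamma_draws)
gamma_significant = excludes(gamma_hpd, 0.0)
```

In this simulated case the interval lies entirely below zero, so the parameter would be judged significant in the sense used later for Table 4.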
Furthermore, the differences between the estimates obtained from Excel's Solver and MCMC are relatively small, so both methods give close results. Therefore, the GRG Non-Linear method in Excel's Solver has a good ability to estimate all models, even though the objective function (log-likelihood) has a complicated form. Excel's Solver can thus be recommended for financial practitioners who have little computer programming knowledge. A disadvantage of the GRG Non-Linear method in Excel's Solver is that it does not report parameter significance, that is, whether an estimated value is statistically different from a reference value such as zero. Meanwhile, the MCMC algorithm can provide the significance of the estimated parameter values based on the generated Markov chain. Table 4 presents the 95% HPD intervals for the asymmetric parameter γ in the QGARCH model, the parameter ν in the GED, and the parameters λ and κ in the SGED. Under each distribution, the 95% HPD interval for γ does not include zero. It indicates that the asymmetric parameter is significant and needs to be incorporated into the GARCH model. In particular, the values of γ are negative, which means that a past negative return increases the current variance more than a past positive return of the same magnitude. So, the leverage effect exists in the Stoxx600 data.
For the parameter ν of the GED, the 95% HPD interval excludes 2 and lies below 2 for both data sets. It means that both data sets support the GED rather than the Normal distribution, in line with the preliminary analysis. This result suggests the necessity of the heavy-tail feature in explaining the characteristics of both data sets. Regarding the parameters of the SGED, the 95% HPD interval for λ does not include 0, and the 95% HPD interval for κ does not include 2. It indicates that the parameters λ and κ are significant. Both data sets provide supporting evidence for the SGED rather than the Normal distribution, in line with the preliminary analysis. This result suggests the necessity of both skewness and kurtosis features in explaining the skewed and heavy-tailed characteristics of both data sets.
In the model evaluation, when two competing models are nested (one model contains the other), the goodness of fit can be assessed using the Log-likelihood Ratio Test (LRT). The LRT between a basic model ($M_0$) and an alternative model ($M_1$) is based on the following statistic (Francq & Zakoian, 2019):
$$LRT = 2\left[\ln L(M_1) - \ln L(M_0)\right] \tag{13}$$
where $\ln L(M_i)$ denotes the log-likelihood value for $M_i$. Since the LRT statistic follows a chi-square distribution, the critical values at the significance levels of 1%, 5%, and 10% are 6.64, 3.84, and 2.71, respectively, for 1 degree of freedom. For 2 degrees of freedom, they are 9.21, 5.99, and 4.61, respectively. In this case, the degrees of freedom are the difference in the number of parameters between the two competing models. The LRT rejects the basic model if the value of the LRT statistic is greater than the critical value.
Regarding the distributions, the research has two cases of nested models, i.e., both the GED and the SGED nest the Normal distribution. Based on the LRT, Table 5 reports the performance comparison of the GED and SGED specifications against the Normal distribution, applied to the GARCH(1,1) and QGARCH(1,1) models. The hypotheses of the LRT are as follows:
H0: M0 (model with Normal distribution)
H1: M1 (model with either GED or SGED)
The results show that the LRT statistics for all cases are greater than the critical value at the 1% significance level. This finding indicates the rejection of H0. In other words, the GED or the SGED is more appropriate for all observed data than the Normal distribution. It supports the previous results on the significance of the skewness and heavy-tailedness parameters in Table 4.
Regarding all specifications, since the candidate models are not all nested (some cannot be obtained from the others), the AIC of Akaike (1998) can be used to determine the best-fitting model. An AIC score is calculated using Equation (14) (Snipes & Taylor, 2014):
$$AIC = 2k - 2\ln(\hat{L}) \tag{14}$$
where k is the number of estimated parameters and $\hat{L}$ is the likelihood value at the estimated parameters. A lower AIC indicates a better fit.
Table 6 presents the AIC values for all cases. First, considering the comparison among the four distributions, the results show that the SGED best fits the Stoxx600 data in each model, followed by the Student-t, GED, and Normal distributions. This finding confirms the previous results on the significance of the parameters of the GED and SGED in Table 4. Meanwhile, for the FTSEMIB data, the SGED provides the best fit for the QGARCH model only. So, it suggests applying a flexible distribution that accommodates heavy tails and skewness in the return distribution. Second, comparing the GARCH(1,1) and QGARCH(1,1) models under each distribution, the results show that the QGARCH(1,1) model fits better than the GARCH(1,1) model. Finally, the AIC suggests the QGARCH(1,1) model with SGED as the best-fitting model.
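The two model-comparison tools used in this section can be sketched in a few lines of Python. The sketch assumes the standard definitions, namely an LRT statistic equal to twice the log-likelihood difference and AIC = 2k − 2 ln(L). The log-likelihood values and parameter counts below are made up for illustration and are not the values reported in Tables 5 and 6.

```python
import math

def lrt_statistic(loglik_basic, loglik_alternative):
    """Likelihood ratio statistic for a basic model nested in an alternative model."""
    return 2.0 * (loglik_alternative - loglik_basic)

def chi2_sf(x, df):
    """Upper-tail chi-square probability, closed forms for df = 1 and df = 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("only df = 1 or df = 2 is supported in this sketch")

def aic(loglik, n_params):
    """Akaike Information Criterion; lower values indicate a better fit."""
    return 2.0 * n_params - 2.0 * loglik

# Hypothetical log-likelihoods: Normal (4 parameters) versus SGED (6 parameters).
ll_normal_model, ll_sged_model = -7250.0, -7180.0
lrt = lrt_statistic(ll_normal_model, ll_sged_model)
p_value = chi2_sf(lrt, df=2)          # two extra parameters in the alternative
reject_normal = lrt > 9.21            # 1% critical value for 2 degrees of freedom
best_by_aic = min(
    [("Normal", aic(ll_normal_model, 4)), ("SGED", aic(ll_sged_model, 6))],
    key=lambda item: item[1],
)[0]
```

The p-value route and the critical-value route give the same decision; the table-based critical values in the text correspond to `chi2_sf` returning 0.01, 0.05, or 0.10.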

IV. CONCLUSIONS
The research analyzes the volatility fitting performance of the GARCH(1,1) and QGARCH(1,1) models on the FTSEMIB and Stoxx600 indices by assuming that the return error follows the Normal, Student-t, GED, or SGED distribution. The models are estimated using the GRG Non-Linear method in Excel's Solver and the ARWM method in an MCMC scheme implemented in the Scilab program. The estimation results show that the GRG Non-Linear method in Excel's Solver produces estimates similar to those of the ARWM method. Therefore, the first main finding is that the GRG Non-Linear method in Excel's Solver can reliably estimate the complicated models. Based on the LRT, the GED and SGED statistically outperform the Normal distribution. The second main finding is that, based on the AIC, the QGARCH(1,1) model with SGED provides the best fit for both data sets.
These findings contribute to the existing literature in two ways. They show that Excel's Solver is a usable estimation tool for financial practitioners who have limited programming knowledge, and they extend the evidence on QGARCH models and their fitting performance in the stock market. Therefore, the results have practical implications for financial practitioners, who can use a simple alternative estimation method and improve the optimality of their investment strategies.
The research has two limitations. First, it considers only three non-Normal distributions. Assuming other distributions may make the model fit the data better. Second, the model relies solely on daily returns. Incorporating realized measures in the variance process may improve the modeling and forecasting of financial volatility. As a possible extension of the research, other non-Normal distributions can be assumed, and realized variance can be incorporated as a term in the conditional variance process. These aspects are the topics of the authors' current research.