Winsorized Modified One Step M-Estimator as a Measure of the Central Tendency in the Alexander-Govern Test

This research compared independent-group tests based on a parametric technique. The Alexander-Govern (AG) test uses the mean as its measure of central tendency. It is a good alternative to the ANOVA, the Welch test and the James test because, under variance heterogeneity with normal data, it gives good control of Type I error rates and high power while remaining easy to compute. The test, however, is not robust to non-normal data. The trimmed mean was applied to the test as its central tendency measure under non-normality for the two-group condition, but as the number of groups increased above two, the test failed to control Type I error rates. The MOM estimator was therefore applied to the test as its central tendency measure, since it is not influenced by the number of groups; however, under extreme skewness and kurtosis the MOM estimator could no longer control the Type I error rates. In this study, the Winsorized MOM estimator was used in the AG test as its measure of central tendency under non-normality. For each test in the research design, 5,000 data sets were simulated and analysed using the Statistical Analysis Software (SAS) package. The results show that the Winsorized modified one-step M-estimator in the Alexander-Govern (AGWMOM) test gave the best control of Type I error rates under non-normality compared with the AG test, the AGMOM test and the ANOVA, with the highest number of conditions satisfying both the lenient and stringent criteria of robustness.


INTRODUCTION
In this study, five different tests were used, namely: (i) the Alexander-Govern (AG) test, (ii) the Modified One-Step M-estimator (MOM) in the Alexander-Govern test (AGMOM), (iii) the Winsorized Modified One-Step M-estimator in the Alexander-Govern test (AGWMOM), (iv) the t-test and (v) the ANOVA. These tests were performed under two-, four- and six-group conditions, each under the g- and h-distribution.
The g- and h-distribution is used to determine the level of skewness and kurtosis in a data distribution. The ANOVA is useful in many areas of life, for example in agriculture, sociology, banking, economics and medicine, as stated by Pardo, Pardo, Vincente and Esteban (1997). Three basic assumptions must be satisfied for the ANOVA to work correctly: homogeneity of variance, normality of the data and independence of the observations. The ANOVA is used for comparing the differences between three or more means. It is applicable in testing the equality of the central tendency of data sets and is robust to small deviations from normality, mainly when the sample size is large enough to guarantee normality, as explained by Wilcox (1997; 2003).
Researchers such as Yusof, Abdullah, Yahaya and Othman (2011) found that variance heterogeneity and non-normality are the main problems affecting the ANOVA: they inflate the Type I error rates and reduce the power of the test. The problem of variance heterogeneity has been addressed by several researchers and some alternatives have been provided. Welch (1951) introduced the Welch test for testing the hypothesis that two populations have equal means. It has been mentioned in the literature as a good alternative to the ANOVA (Algina, Oshima & Lin, 1994).
The Welch test gives good control of Type I error rates when the variances are unequal, and it is a good parametric alternative under heteroscedasticity. However, for small sample sizes, the Welch test fails to control Type I error rates as the number of groups increases (Wilcox, 1988). The James test was introduced by James (1951) as another alternative to the ANOVA under variance heterogeneity. The test weights the sample means, and it has been discussed in the literature as a good alternative to the ANOVA (Oshima & Algina, 1992; Wilcox, 1988).
When the sample size is small under non-normal data, the James test also fails to control Type I error rates. Both the Welch test and the James test are used for analysing non-normal data with variance heterogeneity (Brunner, Dette, & Munk, 1997; Krishnamoorthy, Lu, & Matthew, 2007; Wilcox & Keselman, 2003). The Alexander-Govern test was proposed by Alexander and Govern (1994) to handle heterogeneity of variance under normal data, but the test is not robust to non-normality. Schneider and Penfield (1997) and Myers (1998) suggested that the Alexander-Govern test is a better alternative than the James test and the Welch test, respectively. Myers (1998) noted that the Alexander-Govern test gives outstanding control of Type I error rates for variance heterogeneity under normal data. Lix and Keselman (1998) proposed a better alternative to the mean by introducing the trimmed mean into several robust test statistics, which improves their performance under non-normality.
A further alternative to the trimmed mean is a highly robust estimator called the modified one-step M-estimator (MOM). Othman et al. (2004) explained that the MOM estimator trims only the extreme observations, depending on the shape of the distribution. Under a skewed distribution, the amount of trimming need not be the same at both tails; for example, when the distribution is skewed to the right, more of the right tail is trimmed. With any estimator based on trimming, the trimming process itself is crucial. The trimmed mean trims data symmetrically, without regard to the nature of the distribution, whereas the MOM estimator trims only the observations detected as outliers: when outliers are detected in both tails, the data are trimmed symmetrically; when only one tail contains outliers, the data are trimmed asymmetrically, meaning that only one tail of the data set is trimmed. In addition, Schneider and Penfield (1997) concluded that, under variance heterogeneity, the Alexander-Govern test is a better alternative to the ANOVA than the Welch test and the James test because it is simpler to compute, gives good control of Type I error rates, and produces high power under most experimental conditions. However, as discussed by Myers (1998), the test is suitable only for normal data and not for non-normal data.
Ochuko, Abdullah, Zain, and Yahaya (2015) explained that Winsorization replaces an outlying value with the closest non-outlying value in the ordered data. Winsorization has advantages over trimming, namely: (1) it replaces an outlying value with the value closest to the position where the outlier is located, (2) the sample size of the data remains the same, and (3) it helps to prevent loss of information.
One of the recommended alternatives to the trimmed mean is the MOM estimator, which is capable of detecting the presence of outliers in a data distribution (Yusof, Abdullah, Yahaya, & Othman, 2011). The MOM estimator empirically trims only the extreme observations (Othman, Keselman, Padmanabhan, Wilcox, & Fradette, 2004). The main constraint in using the MOM estimator as the central tendency measure in the Alexander-Govern test, however, is that it fails to control Type I error rates when g = 0.5 and h = 0.5. This study therefore uses the Winsorized modified one-step M-estimator as the central tendency measure in the Alexander-Govern test to strengthen the test under non-normality, in the presence of variance heterogeneity, for g = 0.5 and h = 0.5, so as to give remarkable control of Type I error rates and high power.

METHODS
The Alexander-Govern test was introduced by Alexander and Govern (1994) and uses the mean as its central tendency measure. Under normality it gives remarkable control of Type I error rates and high power under variance heterogeneity, but it is not robust to non-normal data. The test compares two or more groups, and its test statistic is derived using the following procedure.
The procedure for obtaining the test statistic of the Alexander-Govern test begins by ordering the data in each of the J groups (j = 1, …, J). In each group, the mean is calculated as

\bar{X}_j = \frac{1}{n_j} \sum_{i=1}^{n_j} X_{ij},

where X_{ij} represents the observed ordered random sample and n_j is the sample size of group j. The mean is the central tendency measure of the Alexander-Govern (AG) test. After obtaining the mean, the usual unbiased estimate of the variance is

S_j^2 = \frac{1}{n_j - 1} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2,

where \bar{X}_j estimates \mu_j for population j. The standard error of the mean is

S_{\bar{X}_j} = \sqrt{S_j^2 / n_j}.

The weight w_j for each group, where \sum_j w_j must equal 1, is calculated as

w_j = \frac{1 / S_{\bar{X}_j}^2}{\sum_{k=1}^{J} 1 / S_{\bar{X}_k}^2}.

The null hypothesis of the Alexander and Govern (1994) test for the equality of means under heterogeneity of variance is

H_0: \mu_1 = \mu_2 = \dots = \mu_J,

and the alternative hypothesis contradicts the null hypothesis. The variance-weighted estimate of the grand mean across all groups is

\hat{\mu} = \sum_{j=1}^{J} w_j \bar{X}_j,

where w_j is the weight and \bar{X}_j the mean of each independent group in the observed ordered data sets. The t statistic for each group, with \nu_j = n_j - 1 degrees of freedom, is

t_j = \frac{\bar{X}_j - \hat{\mu}}{S_{\bar{X}_j}},

where \nu_j is the degrees of freedom of group j.
The t statistic is calculated for each group and converted to a standard normal deviate z_j using Hill's (1970) normalization approximation, as in Alexander and Govern (1994):

z_j = c + \frac{c^3 + 3c}{b} - \frac{4c^7 + 33c^5 + 240c^3 + 855c}{10b^2 + 8bc^4 + 1000b},

where a = \nu_j - 0.5, b = 48a^2 and c = \sqrt{a \ln(1 + t_j^2/\nu_j)}. The test statistic for the AG test is then defined as

A = \sum_{j=1}^{J} z_j^2,

which is referred to a chi-square distribution with (J - 1) degrees of freedom at \alpha = 0.05. If the p-value obtained for the AG test is greater than 0.05, the test is regarded as not significant; otherwise, the test is significant.
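As an illustration, the procedure above can be sketched in Python. This is a minimal sketch of the AG computation (not the SAS implementation used in the study); the function name and structure are illustrative.

```python
import math

def alexander_govern(groups):
    """Sketch of the Alexander-Govern test statistic for J independent groups.

    `groups` is a list of 1-D sequences of observations. Returns the test
    statistic A, which follows a chi-square with J-1 df under H0.
    """
    means, se2 = [], []
    for g in groups:
        n = len(g)
        m = sum(g) / n
        s2 = sum((x - m) ** 2 for x in g) / (n - 1)   # unbiased variance
        means.append(m)
        se2.append(s2 / n)                            # squared SE of the mean
    # inverse-variance weights, summing to 1
    total_inv = sum(1 / v for v in se2)
    w = [(1 / v) / total_inv for v in se2]
    grand = sum(wj * mj for wj, mj in zip(w, means))  # variance-weighted grand mean
    A = 0.0
    for g, m, v in zip(groups, means, se2):
        nu = len(g) - 1
        t = (m - grand) / math.sqrt(v)
        # Hill's (1970) normalization of t_j to a standard normal deviate
        a = nu - 0.5
        b = 48 * a ** 2
        c = math.sqrt(a * math.log(1 + t * t / nu))
        z = c + (c**3 + 3*c) / b \
              - (4*c**7 + 33*c**5 + 240*c**3 + 855*c) / (10*b**2 + 8*b*c**4 + 1000*b)
        A += z * z    # sign of z is that of t, but only z^2 enters the statistic
    return A
```

The returned A is compared against the chi-square critical value with J - 1 degrees of freedom (3.84 for J = 2 at the 0.05 level).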
Let X_1, X_2, …, X_n denote the observed ordered data set, with sample size n and group sizes j.
Firstly, the median M of the data set is calculated by selecting the middle value of the ordered observations, and the median absolute deviation about the median is MAD = median{|X_i - M|}. As stated by Wilcox and Keselman (2003), the constant 0.6745 is used to rescale the MAD estimator, so that MAD_n = MAD / 0.6745 estimates \sigma when sampling from a normal distribution. Outliers in a data distribution are then detected using

\frac{|X_{ij} - M_j|}{MAD_n} > K,

where X_{ij} is the observed ordered random sample, M_j is the median of the ordered random samples and MAD_n is the rescaled median absolute deviation about the median. The value K = 2.24 was proposed by Wilcox and Keselman (2003) for detecting the appearance of outliers in a data distribution, because it has a very small standard error when the data are normal. This detection rule defines the MOM estimator's criterion for flagging outliers. In this research, the mean is replaced with the MOM estimator as the measure of central tendency in the Alexander-Govern test.
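A minimal Python sketch of this MAD-median outlier-detection rule (the function name is illustrative):

```python
import statistics

K = 2.24  # cutoff proposed by Wilcox and Keselman (2003)

def detect_outliers(x, k=K):
    """Flag each observation as an outlier using the MAD-median rule."""
    m = statistics.median(x)
    mad = statistics.median(abs(xi - m) for xi in x)
    madn = mad / 0.6745          # rescaled so MADn estimates sigma under normality
    return [abs(xi - m) / madn > k for xi in x]
```

For example, in the sample [1, 2, 3, 4, 100] only the value 100 exceeds the K = 2.24 cutoff and is flagged.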
The WMOM estimator is applied to the data distribution, where each detected outlier is replaced with the closest value to the position where the outlier is located. The WMOM estimator is then calculated by averaging the Winsorized data distribution:

\bar{X}_{WMOM} = \frac{1}{n} \sum_{i=1}^{n} Y_i,

where Y_1, …, Y_n is the Winsorized sample. The WMOM estimator replaces the mean as the central tendency measure in the Alexander-Govern test for two reasons: first, to remove the influence of outliers from the data distribution; second, to make the Alexander-Govern test robust to non-normal data.
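One way to implement this Winsorization in Python, assuming (as the text suggests) that each MOM-detected outlier is replaced by the nearest retained non-outlying value on its side of the distribution, is the following sketch:

```python
import statistics

def winsorized_mom(x, k=2.24):
    """Replace each MAD-median-detected outlier with the nearest retained
    value, then average the Winsorized sample (WMOM estimator)."""
    m = statistics.median(x)
    mad = statistics.median(abs(xi - m) for xi in x)
    madn = mad / 0.6745
    # values not flagged as outliers by the K = 2.24 rule
    inliers = [xi for xi in x if abs(xi - m) / madn <= k]
    lo, hi = min(inliers), max(inliers)
    # clamp each outlier to the closest retained value (Winsorization)
    w = [min(max(xi, lo), hi) for xi in x]
    return sum(w) / len(w)
```

For the sample [1, 2, 3, 4, 100], the outlier 100 is replaced by 4, and the WMOM estimate is the mean of [1, 2, 3, 4, 4], i.e. 2.8, while the sample size stays at n = 5.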
The Winsorized sample variance is defined as

S_{WMOM}^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (Y_i - \bar{X}_{WMOM})^2.

The standard error of the WMOM estimator is estimated by bootstrapping. The symbol * indicates that x* is not the exact value of x but a resampled version of x. In estimating the standard error from bootstrap samples, the number of replications B typically falls within the range 25-200. According to Efron and Tibshirani (1998), a bootstrap size of B = 50 is sufficient to give a reasonable estimate of the standard error of the MOM estimator, and the same number of bootstrap samples was used in this research.
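The bootstrap standard-error estimate described here can be sketched as follows, with B = 50 as in the study. The function and parameter names are illustrative; in practice `estimator` would be the WMOM estimator of the previous section.

```python
import math
import random
import statistics

def bootstrap_se(x, estimator, B=50, seed=0):
    """Estimate the standard error of `estimator` from B bootstrap resamples.

    Efron and Tibshirani suggest B in roughly 25-200; B = 50 is used here.
    """
    rng = random.Random(seed)
    reps = []
    for _ in range(B):
        resample = [rng.choice(x) for _ in x]   # x*: a resampled version of x
        reps.append(estimator(resample))        # bootstrap replication theta*(b)
    mean_rep = sum(reps) / B                    # theta*(.)
    return math.sqrt(sum((r - mean_rep) ** 2 for r in reps) / (B - 1))
```

Using the sample mean as a stand-in estimator shows the mechanics; the same call pattern applies with the WMOM estimator.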
Secondly, the bootstrap replication corresponding to each bootstrap sample is defined as \hat{\theta}^{*}(b) = s(x^{*b}), b = 1, …, B, the value of the estimator applied to the b-th resample, and the bootstrap estimate of the standard error is

S_e = \sqrt{\frac{1}{B - 1} \sum_{b=1}^{B} \left( \hat{\theta}^{*}(b) - \hat{\theta}^{*}(\cdot) \right)^2}, \quad \hat{\theta}^{*}(\cdot) = \frac{1}{B} \sum_{b=1}^{B} \hat{\theta}^{*}(b).

The weight w_j for the Winsorized data distribution is defined as

w_j = \frac{1 / S_{e_j}^2}{\sum_{k=1}^{J} 1 / S_{e_k}^2},

where S_{e_j} is the estimated standard error for group j. The variance-weighted estimate of the grand mean of the Winsorized data across all groups is

\hat{\mu} = \sum_{j=1}^{J} w_j \bar{X}_{WMOM_j},

where w_j is the weight for the Winsorized data distribution and \bar{X}_{WMOM_j} is the WMOM estimate of group j. The t statistic for each group is

t_j = \frac{\bar{X}_{WMOM_j} - \hat{\mu}}{S_{e_j}},

where \bar{X}_{WMOM_j}, \hat{\mu} and S_{e_j} are the Winsorized MOM estimate, the grand mean of the Winsorized data distribution and the standard error of the Winsorized data distribution, respectively. As in the Alexander-Govern technique, each t_j value is transformed to a standard normal deviate z_j using Hill's (1970) normalization approximation, and the hypothesis tested for \mu_j is

H_0: \mu_1 = \mu_2 = \dots = \mu_J, for j = 1, …, J.

The test statistic of the Winsorized Modified One-Step M-estimator in the Alexander-Govern test across all groups in the observed random sample is

A_{WMOM} = \sum_{j=1}^{J} z_j^2,

which follows a chi-square distribution with J - 1 degrees of freedom at the \alpha = 0.05 level of significance. The p-value is obtained from the standard chi-square distribution table. When the p-value of the AGWMOM test is less than 0.05, the test is significant; otherwise, the test is not significant.
The variables used in this research are balanced and unbalanced sample sizes, equal and unequal variances, group sizes, nature of pairing and type of distribution. All these variables were manipulated to show the strengths and weaknesses of the AG test, the AGMOM test, the AGWMOM test, the t-test and the ANOVA, respectively. Table 1. Characteristics of the g- and h-distribution

g      h      Skewness   Kurtosis    Type of distribution
0      0      0          3           Standard normal
0      0.5    0          11986.20    Symmetric heavy-tailed
0.5    0      1.81       18393.60    Skewed normal-tailed
0.5    0.5    120.10     18393.60    Skewed heavy-tailed
Source: Wilcox (1997)

The Type I error rates of the five tests used in this research are judged against three categories of robustness: (i) tests that fall within the stringent criterion of robustness, (ii) tests that fall within the lenient criterion of robustness, and (iii) tests that fall within neither the stringent nor the lenient criterion and are regarded as not robust. This research adopts the stringent criterion of robustness, the interval (0.042-0.058), to judge the robustness of the tests (Lix & Keselman, 1998), and also the lenient criterion of robustness, the interval (0.025-0.075), as explained by Bradley (1978). These intervals were selected to identify the tests that give remarkable control of Type I error rates. Tables 2, 3, 4 and 5 give the Type I error rates for the two-group condition; Tables 6, 7, 8 and 9 for the four-group condition; and Tables 10, 11, 12 and 13 for the six-group condition. Within those tables, the bolded and italicized values fall strictly within the stringent criterion of robustness, the bolded values fall within the lenient criterion, and the un-bolded values are regarded as not robust. Across the distributions and across all group conditions, the AGWMOM test produced 60 out of the total 84 pairing conditions within the stringent and lenient criteria of robustness. The AG test has 56 out of 84 pairing conditions within the stringent and lenient criteria of robustness.
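For reference, g- and h-distributed data are generated by transforming standard normal deviates Z via X = ((exp(gZ) - 1)/g) · exp(hZ²/2), with X = Z · exp(hZ²/2) when g = 0. A small Python sketch (function names are illustrative; the study itself used SAS):

```python
import math
import random

def g_and_h(z, g, h):
    """Transform a standard normal deviate z into a g-and-h variate.

    g controls skewness, h controls tail heaviness; g = h = 0 gives
    back the standard normal deviate itself.
    """
    core = z if g == 0 else (math.exp(g * z) - 1) / g
    return core * math.exp(h * z * z / 2)

def sample_g_and_h(n, g, h, seed=0):
    """Draw n observations from the g-and-h distribution."""
    rng = random.Random(seed)
    return [g_and_h(rng.gauss(0, 1), g, h) for _ in range(n)]
```

Setting (g, h) to (0, 0), (0, 0.5), (0.5, 0) and (0.5, 0.5) reproduces the four distributional conditions of Table 1.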
The AGMOM test has 51 out of 84 pairing conditions within the stringent and lenient criteria of robustness, and the ANOVA has 34 out of 84.

CONCLUSIONS
The AGWMOM test gave the best control of Type I error rates under non-normality compared with the AG test, the AGMOM test and the ANOVA, because it consistently produced the highest number of conditions satisfying both the stringent and lenient criteria of robustness.