## Use of Equal Variance Assumption with ANOVA

The equal variance assumption is important in statistics because it applies to two of the most widely used tools: the two-sample t-test and Analysis of Variance (ANOVA). Both of these tools are used to test whether there are differences in population means, based upon the evidence present in samples of data taken from the respective populations. Please continue reading to learn more about the Equal Variance Assumption, or EVA.

Since this assumption plays a prominent role when utilizing these popular tools, it is often suggested that tests be run to check on the validity of this assumption. When doing so, one should always keep in mind the following points:

__1. Robustness when sample sizes are equal__
Both the two-sample t-test and ANOVA are very robust to violations of the equal variance assumption when the sample sizes are equal, or nearly equal.

__2. Alternative procedures__
The two-sample t-test can be used either with or without the assumption of equal variances. In fact, the default method in Minitab does not assume equal variances.
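
As a minimal sketch of the two forms of the test outside Minitab, the snippet below uses SciPy's `ttest_ind`, whose `equal_var` flag switches between the pooled test and the unequal-variance (Welch) test. The sample values are made up for illustration.

```python
import numpy as np
from scipy import stats

# Hypothetical samples, made up for illustration only.
sample_a = np.array([10.1, 9.8, 10.4, 10.0, 9.9, 10.2])
sample_b = np.array([11.5, 9.0, 12.8, 8.7, 13.1, 10.4, 12.2, 9.6])

# Pooled test: assumes both populations share one variance.
pooled = stats.ttest_ind(sample_a, sample_b, equal_var=True)

# Welch's test: does not assume equal variances.
welch = stats.ttest_ind(sample_a, sample_b, equal_var=False)

print(f"pooled: t={pooled.statistic:.3f}, p={pooled.pvalue:.4f}")
print(f"Welch:  t={welch.statistic:.3f}, p={welch.pvalue:.4f}")
```

Running both and comparing the p-values is a quick way to see how much the assumption matters for a given data set.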

__3. Data transformations__
If the variances of the samples are correlated with the size of the data (as Y increases, the variance of Y increases), it may be possible to use a log transformation (or the Box-Cox power transformation) to correct the problem.
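
The effect of a log transformation can be sketched with simulated data. The groups below are hypothetical lognormal samples, chosen so that the spread grows with the mean; the seed, group means, and sample size are assumptions for illustration only.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical data: three groups whose spread grows with their mean
# (lognormal, so the standard deviation is roughly proportional to the mean).
groups = [rng.lognormal(mean=m, sigma=0.3, size=30) for m in (1.0, 2.0, 3.0)]

raw_sds = [g.std(ddof=1) for g in groups]
log_sds = [np.log(g).std(ddof=1) for g in groups]

print("SDs on the raw scale:", np.round(raw_sds, 3))
print("SDs on the log scale:", np.round(log_sds, 3))
print("largest/smallest SD, raw:", round(max(raw_sds) / min(raw_sds), 2))
print("largest/smallest SD, log:", round(max(log_sds) / min(log_sds), 2))

# The log transformation is the lambda = 0 member of the Box-Cox family.
same_as_log = np.allclose(stats.boxcox(groups[0], lmbda=0), np.log(groups[0]))
print("Box-Cox with lambda=0 matches log:", same_as_log)
```

Both transformations require strictly positive data.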

__4. The effects of unequal variances__
In all cases, unequal variances affect the overall estimate of the error variance. This in turn affects the corresponding t or F statistics, which in turn affects the reported p-values. The reported p-value (the probability, if the means were truly equal, of seeing a difference at least as large as the one observed) tends to be underestimated when the assumption of equal variances is violated, most notably when the smaller samples have the larger variances. In other words, the true p-value is somewhat larger than the reported p-value.
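
For reference, the standard textbook form of the pooled error variance and the pooled two-sample t statistic is given below. The symbols ($n_i$ for sample sizes, $s_i^2$ for sample variances, $\bar{y}_i$ for sample means) are notation introduced here for illustration:

$$
s_p^2 = \frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2},
\qquad
t = \frac{\bar{y}_1 - \bar{y}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}
$$

Because each sample variance is weighted by $n_i - 1$, the pooled estimate is pulled toward the variance of the larger sample. One-way ANOVA pools the same way across all groups when it forms the mean square error in the denominator of F.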

__5. Parametric tests vs non-parametric tests__
There are non-parametric alternatives to both the two-sample t-test and ANOVA. However, it should be noted that, when they are used to compare locations, these also assume that the underlying distributions have the same shape. Stating that the distributions have the same shape is tantamount to stating that they have the same variance. Since both the parametric and the non-parametric tests assume basically the same thing, the parametric tests are generally preferred, since they tend to be more powerful when their assumptions are met.
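
As a sketch of how the parametric and non-parametric results can be compared side by side, the snippet below uses SciPy's Mann-Whitney and Kruskal-Wallis tests as the non-parametric counterparts. The choice of these two tests and the measurement values are assumptions for illustration; the original text does not name specific procedures.

```python
from scipy import stats

# Hypothetical measurements from three process settings (illustrative values only).
setting_a = [12.1, 11.8, 12.6, 12.0, 11.9, 12.3]
setting_b = [12.9, 13.4, 12.7, 13.1, 13.0, 12.8]
setting_c = [12.2, 12.5, 11.7, 12.4, 12.0, 12.6]

# Two samples: Mann-Whitney U test alongside the two-sample t-test.
mw = stats.mannwhitneyu(setting_a, setting_b, alternative="two-sided")
tt = stats.ttest_ind(setting_a, setting_b)

# Three or more samples: Kruskal-Wallis test alongside one-way ANOVA.
kw = stats.kruskal(setting_a, setting_b, setting_c)
aov = stats.f_oneway(setting_a, setting_b, setting_c)

print(f"Mann-Whitney p={mw.pvalue:.4f}   t-test p={tt.pvalue:.4f}")
print(f"Kruskal-Wallis p={kw.pvalue:.4f} ANOVA p={aov.pvalue:.4f}")
```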

__6. Differences in variance can also be desirable__
One of the major goals of every project should be to reduce the variation in the CTQ, or Y. Don’t forget that the two-sample t-test and ANOVA are both methods for detecting changes in the mean of Y. If changes in the variation are observed, these are important, perhaps more important than the changes in the mean one is testing for. These differences should be studied to determine if they are consistent.

What is the bottom line? If you use a statistical tool that assumes equal variance, you can and probably should test this assumption. Remember that if the sample sizes are equal, or nearly equal, this assumption can be relaxed a great deal. Also, as a rule of thumb, even when the sample sizes are not nearly equal, there is usually no problem provided the largest sample standard deviation is not more than twice the size of the smallest sample standard deviation. When there is a problem, remember that what may be a “problem” as far as testing for differences in the mean of Y can also be a “solution” for determining ways to reduce the variation in Y. Be sure to check whether the variances increase as the size of the data increases. This is not at all uncommon, and can often be remedied with a log or Box-Cox power transformation of the data. If all else fails, remember that the p-value you see may be smaller than it actually should be. This is only cause for concern when the p-value is marginally significant. You might want to run the non-parametric alternative (if there is one) and see if the results agree. Finally, consider the practical significance as well as the statistical significance.

**How it Fits With the Breakthrough Strategy (DMAIC)**

__Analyze Phase__

In the Analyze Phase of a Black Belt project, the Black Belt must isolate variables which exert leverage on the CTQ. These leverage variables are uncovered through the use of various statistical tools designed to detect differences in means, differences in variances, patterns in means, or patterns in variances, in the case where the CTQ is continuous.

Two of the prominent tools for detecting differences in means are the two-sample t-test and ANOVA. The assumptions for both of these are the same, since the two-sample t-test (in its pooled, equal-variance form) is just a special case of one-way ANOVA. Both of these tools also have non-parametric alternatives, at least in some cases: there are non-parametric alternatives for one-way and two-way ANOVA, and the t-test also has a version where equal variances are not assumed. As mentioned earlier, one should use good sense and judgement when deciding which test is more appropriate.

It is a good idea to test for equal variances. If the test fails, check the ratio of the largest sample standard deviation to the smallest sample standard deviation. If the sample sizes are equal, or nearly equal, there should not be a problem unless this ratio is larger than 4. Even when the sample sizes are not nearly equal, there should not be a problem unless this ratio is greater than 2. More importantly, remember that you are using these tools to detect changes in the mean of Y, and the “problem” you are having is that you have detected changes in the variation of Y at the same time. Of course, that may help you satisfy the ultimate goal of reducing the variation of Y.

When using these tools, don’t forget to try data transformations, especially when the sample variances increase as the sample means increase. Keep in mind that the p-values you see may be underestimated if the variances are not equal. If a t-test or ANOVA is used and the reported p-value is marginally significant, then the actual p-value may be marginally insignificant. When using any statistical tool, one should always consider the practical significance of the result as well as the statistical significance of the result before passing final judgement.
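
The standard-deviation ratio rule of thumb described above can be sketched as a small helper. The function name and the `balance_tolerance` cutoff for "nearly equal" sample sizes are hypothetical; the text gives the ratio limits of 4 and 2 but does not define "nearly equal" numerically.

```python
import numpy as np

def sd_ratio_check(samples, balance_tolerance=0.8):
    """Apply the largest-to-smallest standard deviation rule of thumb.

    `balance_tolerance` is an assumed cutoff: sample sizes count as
    "nearly equal" when smallest n / largest n is at least this value.
    Returns (ratio, limit, within_limit).
    """
    sizes = [len(s) for s in samples]
    sds = [np.std(s, ddof=1) for s in samples]
    ratio = max(sds) / min(sds)
    nearly_equal = min(sizes) / max(sizes) >= balance_tolerance
    limit = 4.0 if nearly_equal else 2.0
    return float(ratio), limit, bool(ratio <= limit)

# Hypothetical samples for illustration.
a = [1, 2, 3, 4, 5]
b = [2, 4, 6, 8, 10]
ratio, limit, ok = sd_ratio_check([a, b])
print(f"ratio={ratio:.2f}, limit={limit}, within limit: {ok}")
```

The check is only a screening step; it does not replace a look at the data or a formal test for equal variances.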

__Improve Phase__

In the Improve Phase, the Black Belt will often use designed experiments to make dramatic improvements in the performance of the CTQ. A designed experiment is a procedure for simultaneously altering all of the leverage variables discovered in the Analyze Phase and observing what effects these changes have on the CTQ. The Black Belt must determine exactly which leverage variables are critical to improving the performance of the CTQ, and establish settings for those critical variables.

In order to determine whether an effect from a leverage variable, or an interaction between 2 or more leverage variables, is statistically significant, the Black Belt will often utilize an ANOVA table, or a Pareto chart of effects, or a normal plot of effects. The results from all of these are based upon the estimate of the error variance, which is affected when the sample variances are not equal. Fortunately, most designed experiments are balanced; in other words, the sample sizes are all equal. Thus, the equal variances assumption can be relaxed for balanced experiments. Still, it is good practice to check for radical departures from the equal variance assumption. In order to do so you must remember one thing – it is the variances of the residuals that are assumed to be equal, not the variances of the Y values. This means that you would need to organize the residuals into groups based upon the values of the factors, one group for each term in the model, and then get an estimate of the variance from each group. For a one-way ANOVA this is not too hard to do. For a two-way ANOVA, it is still not difficult. But, for more than two factors, this becomes increasingly difficult to do. An alternative would be to examine the residuals for extreme outliers, or plot the residuals against each factor as well as against the fits. These plots will at least show whether any of the factors have main effects on the variability of the residuals, and whether the residual variance is correlated with the response.
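
Grouping residuals by factor level can be sketched for a small two-factor experiment as follows. The data are hypothetical, the model is assumed to be the full two-way model with interaction (so each fit is a cell mean), and the per-level variances are only a rough diagnostic rather than a formal test.

```python
import numpy as np

# Hypothetical balanced 2x2 experiment with 3 replicates per cell (illustrative values).
factor_a = np.array(["lo"] * 6 + ["hi"] * 6)
factor_b = np.array((["lo"] * 3 + ["hi"] * 3) * 2)
y = np.array([10.2, 9.8, 10.0, 12.1, 11.9, 12.3,
              14.0, 15.5, 13.1, 18.9, 16.2, 17.6])

# Full two-way model with interaction: the fit for each observation is its cell mean.
fits = np.empty_like(y)
for a in np.unique(factor_a):
    for b in np.unique(factor_b):
        cell = (factor_a == a) & (factor_b == b)
        fits[cell] = y[cell].mean()
residuals = y - fits

def residual_variance_by_level(factor):
    """Rough variance of the residuals within each level of one factor."""
    return {str(level): float(residuals[factor == level].var(ddof=1))
            for level in np.unique(factor)}

var_by_a = residual_variance_by_level(factor_a)
var_by_b = residual_variance_by_level(factor_b)
print("Residual variance by level of A:", var_by_a)
print("Residual variance by level of B:", var_by_b)
```

In this made-up data set the residuals are far more variable at the high level of factor A, which is the kind of pattern the residual plots are meant to reveal.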

Also, bear in mind that the p-values in the ANOVA table may be slightly underestimated, and that the cutoff line in the Pareto chart is slightly higher than it should be. In other words, some effects that are observed to be marginally statistically significant could actually be marginally insignificant. In any case where an effect is marginally significant, or marginally insignificant, from a statistical point of view, one should always ask whether it is of practical significance before passing final judgement.

**Considerations for the Equal Variance Assumption**

The use of statistical tools does not follow some exact cookbook. By their very nature, there is always an element of error associated with these tools. The same is true about the assumption of equal variances. Since the Y variable for both the two-sample t-test and ANOVA is the same for all samples, it is not likely that the variances will differ greatly from one sample to another. On the other hand, since one of the goals of a good project is to reduce the variation in Y, extreme differences in variability should be studied to determine why they occurred. It is not uncommon to observe that the variance of Y increases as the mean of Y increases, a condition that can be easily remedied with a data transformation. So, when assessing the validity of the equal variances assumption, keep in mind the points outlined earlier:

1. Robustness when sample sizes are equal.
2. Alternative procedures.
3. Data transformations.
4. The effects of unequal variances.
5. Parametric tests vs non-parametric tests.
6. Changes in variation can be desirable.

**Cookbook for Checking the Equal Variance Assumption**

**Minitab** uses a homogeneity of variance test for assessing equal variances.

To check equal variances for two or more samples of data:

1. Go to the Stat menu and select **ANOVA**.
2. Select Test for Equal Variances.
3. In the Response box, enter the column containing the Y data.
4. In the Factors box, enter the factor or factors. For a two-sample t-test, the factor is simply an indicator variable that states which sample each Y value belongs to. For **ANOVA**, there may be one or more factors.
5. Click OK.
6. Minitab displays the test results. Check the p-value associated with Bartlett’s test, Levene’s test, or the F test. Also check the confidence intervals for the standard deviations.
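
For readers working outside Minitab, a comparable check can be sketched with SciPy's `bartlett` and `levene` functions. This is an assumed equivalent rather than a reproduction of Minitab's output, and the group values are hypothetical.

```python
from scipy import stats

# Hypothetical Y values split by a three-level factor (illustrative only).
group_1 = [10.0, 10.1, 9.9, 10.2, 9.8, 10.0, 10.1, 9.9]
group_2 = [10.5, 9.2, 11.3, 8.8, 10.9, 9.4, 11.0, 8.9]
group_3 = [10.1, 10.3, 9.7, 10.0, 9.9, 10.2, 9.8, 10.0]

# Bartlett's test: sensitive to departures from normality.
bartlett = stats.bartlett(group_1, group_2, group_3)

# Levene's test centered on the median: less sensitive to non-normal data.
levene = stats.levene(group_1, group_2, group_3, center="median")

print(f"Bartlett: statistic={bartlett.statistic:.3f}, p={bartlett.pvalue:.4g}")
print(f"Levene:   statistic={levene.statistic:.3f}, p={levene.pvalue:.4g}")
```

A small p-value from either test is evidence against equal variances.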

Please don’t hesitate to comment should you have any additional questions regarding the Equal Variance Assumption.


*Used with permission from sixsigmaz.com*
