An Introduction to ANOVA

Analysis of variance, or ANOVA, is a statistical technique for testing whether there is a significant difference between the means of two or more comparison groups.

There are several types of ANOVA tests. The common ones are One-Way ANOVA, Two-Way ANOVA and N-Way ANOVA. One-Way ANOVA tests differences between groups defined by one independent variable. Two-Way ANOVA is used when there are two independent variables. N-Way ANOVA, as its name suggests, is used when there are more than two independent variables.

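ANOVA tests the null hypothesis that the population means of all k comparison groups are equal:

$$H_0: \mu_1 = \mu_2 = \cdots = \mu_k$$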

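The alternative hypothesis is that at least one group mean differs from the others:

$$H_1: \mu_i \neq \mu_j \text{ for at least one pair } (i, j)$$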

The formula for One-Way ANOVA is shown in the ANOVA table as follows:

| Source of Variation | Sum of Squares | Degrees of Freedom | Mean Squares (MS) | F |
|---|---|---|---|---|
| Between | $SSB = \sum_{j=1}^{k} n_j (\bar{X}_j - \bar{X})^2$ | $df_1 = k - 1$ | $MSB = SSB / df_1$ | $F = MSB / MSE$ |
| Error | $SSE = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2$ | $df_2 = n - k$ | $MSE = SSE / df_2$ | |
| Total | $SST = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (X_{ij} - \bar{X})^2$ | $df = n - 1$ | | |

where
$X_{ij}$ is an individual observation (the ith observation in the jth group).
$\bar{X}_j$ is the sample mean of the jth group.
$\bar{X}$ is the overall sample mean.
$k$ is the number of independent comparison groups.
$n_j$ is the number of observations, or sample size, in the jth group.
$n$ is the total number of observations, or total sample size.
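
As a rough illustration of how the table entries are computed, here is a minimal Python sketch that uses only the standard library. The function name `one_way_anova` and the three small groups of numbers are made up for this example and do not come from any real study.

```python
def one_way_anova(groups):
    """Return the one-way ANOVA table quantities for a list of samples."""
    k = len(groups)                            # number of comparison groups
    n = sum(len(g) for g in groups)            # total sample size
    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [sum(g) / len(g) for g in groups]

    # Between-group sum of squares: n_j * (group mean - overall mean)^2
    ssb = sum(len(g) * (m - grand_mean) ** 2
              for g, m in zip(groups, group_means))
    # Error (within-group) sum of squares: (X_ij - group mean)^2
    sse = sum((x - m) ** 2
              for g, m in zip(groups, group_means) for x in g)
    # Total sum of squares: (X_ij - overall mean)^2
    sst = sum((x - grand_mean) ** 2 for g in groups for x in g)

    df1, df2 = k - 1, n - k
    msb, mse = ssb / df1, sse / df2
    return {"SSB": ssb, "SSE": sse, "SST": sst,
            "df1": df1, "df2": df2,
            "MSB": msb, "MSE": mse, "F": msb / mse}


# Hypothetical data: three groups of three observations each.
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
table = one_way_anova(groups)
for name, value in table.items():
    print(name, value)
```

For this made-up data the group means are 5, 7 and 9 and the overall mean is 7, so SSB = 24, SSE = 6 and SST = 30 (note that SSB + SSE = SST), giving MSB = 12, MSE = 1 and F = 12.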

F is the ratio of the mean square between groups (MSB) to the mean square error (MSE). Another way to look at this is:

$$F = \frac{\text{variation between sample means}}{\text{variation within the samples}}$$

The F ratio is used to decide whether to reject the null hypothesis. Under the null hypothesis, the two variations are expected to be roughly equal, which produces an F-statistic close to 1. A larger F ratio indicates that the variation between sample means is greater than the variation within the samples, which is evidence of a difference between the group means.
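
One way to see this behaviour is a small simulation. The sketch below is only illustrative: the helper names, the group size of 10, the standard deviation of 1 and the group means are all assumptions chosen for the example. It draws normally distributed samples many times and averages the resulting F ratios, once with equal group means and once with one group shifted.

```python
import random


def f_ratio(groups):
    """F = MSB / MSE for a list of samples (one-way ANOVA)."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ssb = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ssb / (k - 1)) / (sse / (n - k))


def simulate_mean_f(group_means, size=10, sd=1.0, reps=2000, seed=0):
    """Average F ratio over repeated samples with the given true means."""
    rng = random.Random(seed)   # fixed seed so the run is repeatable
    total = 0.0
    for _ in range(reps):
        groups = [[rng.gauss(mu, sd) for _ in range(size)]
                  for mu in group_means]
        total += f_ratio(groups)
    return total / reps


# Null hypothesis true: all three (hypothetical) groups share one mean.
f_null = simulate_mean_f([0.0, 0.0, 0.0])
# Null hypothesis false: the third group mean is shifted by 2.
f_shift = simulate_mean_f([0.0, 0.0, 2.0])
print("average F, equal means:  ", f_null)
print("average F, shifted group:", f_shift)
```

With equal means the average F should come out near 1, while the shifted group should push it well above 1. The exact numbers depend on the seed and the settings chosen.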

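There are three important ANOVA assumptions:

1. The observations in each group are drawn from a normally distributed population.
2. The populations have equal variances (homogeneity of variance).
3. The observations are independent of one another.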

Under these assumptions, and when the null hypothesis is true, the F ratio follows an F distribution with df1 and df2 degrees of freedom. With this distribution, we can calculate the probability of observing an F-statistic at least as high as the value we obtained. This probability is known as the p-value. It is computed assuming the null hypothesis is true; it is not the probability that the null hypothesis itself is true.
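
The p-value is normally read from the F distribution, but its meaning can be sketched with a simulation that needs only the standard library. The code below is an approximation and not the usual exact calculation: it generates many data sets for which the null hypothesis holds (normal data, equal variances, same group sizes as the observed data) and counts how often the simulated F is at least as large as the observed one. The function names and the sample data are made up for this example.

```python
import random


def f_ratio(groups):
    """F = MSB / MSE for a list of samples (one-way ANOVA)."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ssb = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ssb / (k - 1)) / (sse / (n - k))


def monte_carlo_p_value(groups, reps=5000, seed=1):
    """Estimate P(F >= observed F) under the null hypothesis by simulation."""
    observed = f_ratio(groups)
    sizes = [len(g) for g in groups]
    rng = random.Random(seed)   # fixed seed so the run is repeatable
    hits = 0
    for _ in range(reps):
        # Under the null all groups share one mean. F does not change when
        # every observation is shifted or rescaled by the same amount, so
        # standard normal draws are sufficient here.
        fake = [[rng.gauss(0.0, 1.0) for _ in range(m)] for m in sizes]
        if f_ratio(fake) >= observed:
            hits += 1
    return hits / reps


# Hypothetical data: three groups of three observations each (F = 12).
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
p_estimate = monte_carlo_p_value(groups)
print("estimated p-value:", p_estimate)
```

The estimate is only as precise as the number of repetitions allows. In practice a statistics package would compute the tail probability of the F distribution directly.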

Usually we need a small p-value to safely reject the null hypothesis. A typical significance level is 0.05: we reject the null hypothesis when the p-value is below 0.05, which means that when the null hypothesis is in fact true there is, on average, a 1 in 20 chance that we reject it.
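
The "1 in 20" interpretation can be checked with a short simulation. The sketch below makes several assumptions for illustration: exactly three groups of eight normally distributed observations, all with the same mean, and made-up helper names. With three groups, df1 = 2, and in that special case the upper-tail probability of the F distribution has the closed form (df2 / (df2 + 2F))^(df2 / 2), so no statistics library is needed.

```python
import random


def f_ratio(groups):
    """F = MSB / MSE for a list of samples (one-way ANOVA)."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ssb = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ssb / (k - 1)) / (sse / (n - k))


def p_value_three_groups(f, df2):
    """Upper-tail F probability; valid only when df1 = 2 (three groups)."""
    return (df2 / (df2 + 2.0 * f)) ** (df2 / 2.0)


def false_rejection_rate(alpha=0.05, size=8, reps=4000, seed=2):
    """Share of simulated true-null data sets rejected at level alpha."""
    rng = random.Random(seed)   # fixed seed so the run is repeatable
    df2 = 3 * size - 3          # n - k with n = 3 * size and k = 3
    rejections = 0
    for _ in range(reps):
        # All three groups come from the same normal population,
        # so the null hypothesis is true by construction.
        groups = [[rng.gauss(0.0, 1.0) for _ in range(size)]
                  for _ in range(3)]
        if p_value_three_groups(f_ratio(groups), df2) < alpha:
            rejections += 1
    return rejections / reps


rate = false_rejection_rate()
print("share of true null hypotheses rejected:", rate)
```

The printed share should land close to 0.05, that is, roughly one false rejection in every 20 tests, with some run-to-run variation from the simulation.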
