Let's practice this.

The social class variable we have been considering has four levels.

That is, k = 4. If the original alpha was 0.05, what

is going to be the modified significance level for the multiple comparisons test?

Each group needs to be compared to three other groups,

resulting in a total of 12 comparisons.

But, if you have already compared group one to group two, say middle class to

lower class, you don't need to go back around and compare lower to middle again.

So, the total number of comparisons can be cut in half.

Hence the formula: k times (k minus one), divided by two,

which results in six total comparisons. Dividing

the original alpha of 0.05 by this number corrects our significance level down to about 0.0083.

This is the significance level we're going to use for the pairwise comparisons,

to see whether the two means in each pair are different from each other.
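
The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name `corrected_alpha` is made up for this example, and dividing alpha by the number of comparisons is what is commonly called the Bonferroni correction.

```python
def corrected_alpha(alpha, k):
    """Return the number of pairwise comparisons among k groups
    and the significance level divided by that number."""
    comparisons = k * (k - 1) // 2   # each pair of groups is counted once
    return comparisons, alpha / comparisons

# Values from the lecture: four social classes, original alpha of 0.05
comparisons, alpha_star = corrected_alpha(0.05, 4)
print(comparisons, round(alpha_star, 4))  # 6 0.0083
```

With more groups the corrected level shrinks quickly, since the number of comparisons grows roughly with the square of k.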

There are a couple of other considerations when doing

these multiple comparisons after ANOVA.

The first is related to the constant variance condition.

Since ANOVA

requires this condition to be met, we now need to rethink the standard error and

the degrees of freedom to be used in the multiple comparisons tests.

And, of course, now we have a new modified significance level that we're going to

compare to the p-values of these tests to determine significance.

So what are the consistent standard error and

degrees of freedom we need to use?

The formula for the standard error should seem familiar to you,

except that instead of the individual group variances,

we're actually using mean squared error from the ANOVA table.

Remember, the mean squared error is actually the average within-group

variance, so

we're still getting at the same thing, the individual group variances, but now,

we have a consistent measure that we can use for all of the tests.

If, indeed, the constant variance condition is satisfied,

this value should be very close to your group variances anyway.

And the consistent degrees of freedom is going to be the DF

error from the ANOVA output, as opposed to the minimum of the sample sizes minus one,

from the two groups that we're comparing.
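
A minimal sketch of that standard error, assuming the usual form in which the mean squared error takes the place of each group's variance: the square root of MSE over n1 plus MSE over n2. The function name and the input values below are hypothetical and are not the lecture's data.

```python
import math

def pairwise_standard_error(mse, n1, n2):
    """Standard error for comparing two group means after ANOVA.
    The same mean squared error is used for both groups instead of
    each group's own variance."""
    return math.sqrt(mse / n1 + mse / n2)

# Hypothetical inputs for illustration only (NOT from the lecture)
mse_example = 4.0
n1_example, n2_example = 100, 25
se_example = pairwise_standard_error(mse_example, n1_example, n2_example)
print(round(se_example, 4))
```

Because every pairwise test uses the same MSE, only the two sample sizes change from one comparison to the next.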

So let's put all of this information together.

Pick one of the pairs of groups and do the comparison.

Is there a significant difference in the average vocabulary scores

of middle and lower class Americans?

Our hypotheses are as follows: under the null hypothesis there is no difference,

so the averages are equal, and under the alternative hypothesis

the averages are different.

We calculate our T-score as the difference between the two group means,

minus zero, our null value, divided by the standard error,

which is calculated using the mean squared error from the ANOVA output.

That is 6.76 for the average middle class score, minus 5.07 for

the lower class score, divided by the standard error,

which yields a point estimate of 1.69 and

a standard error of 0.315, for an overall T-score of 5.365.

And the degrees of freedom comes from the ANOVA table, 791.
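
The calculation just described can be checked with a short sketch. The numbers are the ones quoted in the lecture; the variable names are made up for this example.

```python
# Values quoted in the lecture
mean_middle = 6.76      # average vocabulary score, middle class
mean_lower = 5.07       # average vocabulary score, lower class
null_value = 0          # no difference under the null hypothesis
standard_error = 0.315  # computed from the ANOVA mean squared error
df_error = 791          # degrees of freedom from the ANOVA table

point_estimate = mean_middle - mean_lower
t_score = (point_estimate - null_value) / standard_error
print(round(point_estimate, 2), round(t_score, 3))  # 1.69 5.365
```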

Let's clean up our slate here, write down our test statistic and

the degrees of freedom, and

before we can finally get to the p-value, remember to always draw the curve.

So here is our T-distribution with 791 degrees of freedom, which by the way,

remember, is going to be just like the normal distribution at this point,

because of the really high degrees of freedom.

And a T-score that's so high,

that's five standard deviations away from the center,

is going to result in a really tiny tail area, because it's really unusual

to get an observation that's more than five standard deviations from the mean.

This is also clear from the sketch: the tail areas are really skinny

when you get that far from the center.

We can also use R for this.

We use the pt function, and remember to multiply one of your tail areas by

2 to account for the 2-sided alternative hypothesis, and

also remember that the significance level we're going to use for

this test is that modified significance level we calculated earlier.

So even though that modified significance level is very conservative and stringent, because we

have such a tiny p-value, we can once again reject the null hypothesis.
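
As a rough check on that conclusion, here is a sketch in Python. It swaps in the normal approximation for the T-distribution, which the lecture notes is reasonable at 791 degrees of freedom. In R the exact tail area would come from something like `2 * pt(5.365, df = 791, lower.tail = FALSE)`. The helper name `two_sided_normal_p` is made up for this example.

```python
import math

t_score = 5.365        # test statistic from the lecture
alpha_star = 0.05 / 6  # modified significance level, about 0.0083

def two_sided_normal_p(z):
    """Two-sided tail area under the standard normal curve.
    Used here as an APPROXIMATION to the t tail area with 791 df."""
    return math.erfc(abs(z) / math.sqrt(2))

p_value = two_sided_normal_p(t_score)
reject_null = p_value < alpha_star
print(p_value, reject_null)
```

The approximate p-value is far below the modified significance level, so the decision would not change with the exact T-distribution tail area.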