Normality and Homogeneity Testing

#1
Okay, here's the scenario: We have a study with a negative control group, a solvent control group, and 5 treated groups. The statistical protocol calls for: 1) pooling the control groups for comparison to the treated groups, and 2) comparing the treated groups to the solvent control alone. The first thing we do is run a Shapiro-Wilk test for normality and Bartlett's test for homogeneity of variance, then move on to ANOVA and Bonferroni. This is done using the entire data set with pooled controls. For the next analysis (to solvent control only), the negative control data is removed from the data set, so now we've got the solvent control and 5 treated groups. This analysis goes directly to an ANOVA, with no Shapiro-Wilk or Bartlett's test to check for normality and homogeneity. The reasoning is that if the data set is normal and homogeneous with the negative control data included, then it is not necessary to check it again after this data is removed from the set; it will be normal and homogeneous without it too. Is this legit? I have my doubts, but the powers that be are telling me, "Ahhh... no problem". Opinions? Thanks.
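For anyone who wants to try the two analyses side by side, here is a minimal sketch in Python with SciPy. It is not the protocol from the study: the data are made up, the group sizes and means are assumptions, the helper name `check_and_anova` is hypothetical, and the Bonferroni step is left out. It only shows that the same checks can be rerun on the solvent-only data set at almost no extra cost.

```python
import numpy as np
from scipy import stats

def check_and_anova(groups):
    """Shapiro-Wilk per group, Bartlett across groups, then one-way ANOVA."""
    shapiro_p = [float(stats.shapiro(g).pvalue) for g in groups]
    return {
        "min_shapiro_p": min(shapiro_p),  # smallest per-group normality p-value
        "bartlett_p": float(stats.bartlett(*groups).pvalue),
        "anova_p": float(stats.f_oneway(*groups).pvalue),
    }

# Made-up data (assumption): 10 observations per group.
rng = np.random.default_rng(0)
negative = rng.normal(10.0, 1.0, 10)
solvent = rng.normal(10.0, 1.0, 10)
treated = [rng.normal(10.0 + 0.3 * k, 1.0, 10) for k in range(1, 6)]

# Analysis 1: controls pooled into one group, plus the 5 treated groups.
pooled_layout = [np.concatenate([negative, solvent])] + treated
# Analysis 2: negative control removed; solvent control plus the 5 treated groups.
solvent_layout = [solvent] + treated

pooled_result = check_and_anova(pooled_layout)
solvent_result = check_and_anova(solvent_layout)
print(pooled_result)
print(solvent_result)
```

The two result dictionaries will generally differ, because the control group in the second layout is a different (smaller) set of values than the pooled control group in the first.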
 

JohnM

TS Contributor
#2
Should be OK.

ANOVA is a pretty robust procedure, unless both the normality and homogeneity of variance assumptions are violated. If it's just one, then it's usually not a problem - if it's both, then you need to transform data, look for outliers, etc.
 
#3
John

John:

Thanks. I pretty much figured ANOVA would be OK with it. I guess my main concern had to do with the logic behind it. I mean, just because a data set is shown to be normal and homogeneous, is it a valid leap (statistically speaking, not intuitively) to conclude that subsets of it will also be normally distributed and homogeneous? It seems to me that once you start removing data points or groups of data points, you've created a whole new set of values with its own characteristics and qualities. Or am I being too picky?
 

JohnM

TS Contributor
#4
The whole data set itself, ignoring "treatment group membership," does not need to be from a normally distributed population - it's the groups...

If the whole data set was assumed to come from the same normal population, then we would very rarely see significant effects...
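A small illustration of this point, as a sketch only: two artificial groups are built from evenly spaced normal quantiles, so each one looks as normal as a sample of that size can, but their means are far apart. The means (0 and 10), the group size of 30, and the helper name `ideal_normal_sample` are all assumptions chosen for the example.

```python
from statistics import NormalDist
from scipy import stats

def ideal_normal_sample(mean, sd, n):
    """Return n evenly spaced quantiles of a normal distribution."""
    dist = NormalDist(mean, sd)
    return [dist.inv_cdf((i + 0.5) / n) for i in range(n)]

group_a = ideal_normal_sample(0.0, 1.0, 30)   # e.g. a control group
group_b = ideal_normal_sample(10.0, 1.0, 30)  # e.g. a strongly affected group

# Shapiro-Wilk on each group separately, then on everything lumped together.
p_a = stats.shapiro(group_a).pvalue
p_b = stats.shapiro(group_b).pvalue
p_pooled = stats.shapiro(group_a + group_b).pvalue

print(f"group A p = {p_a:.3f}, group B p = {p_b:.3f}, pooled p = {p_pooled:.2e}")
```

Each group passes on its own, while the lumped data set is bimodal and fails badly, which is why the test has to respect group membership.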
 
#5
John

John:

Excuse my dunce-ness. So... when we're testing for normality (e.g. Shapiro-Wilk), we're really checking that the individual groups are normally distributed, not the whole set of data points from ALL the groups? And if the critical value is exceeded, it means that (at least) one of the GROUPS is not normally distributed?

Does the same hold true for homogeneity of variance?
 

JohnM

TS Contributor
#6
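Each group should be tested separately for normality. Running a single test on all of the raw values lumped together, ignoring group membership, doesn't tell you what you need to know, because the group means may differ. (If you want one test, run it on the ANOVA residuals, i.e. each value minus its own group mean.)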

For homogeneity of variance, the variances for each group should be approximately equal.
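One way to put both checks into practice, sketched under assumptions: normality is checked once on the residuals (each value minus its own group mean), and equality of variances is checked with Levene's test, which is a commonly used alternative to Bartlett's test and is less sensitive to non-normality. The function name `residual_normality_and_levene` and the three small example groups are made up for illustration.

```python
import numpy as np
from scipy import stats

def residual_normality_and_levene(groups):
    """Shapiro-Wilk on within-group residuals, plus Levene's test across groups."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    # Subtract each group's own mean so differences between groups drop out.
    residuals = np.concatenate([g - g.mean() for g in groups])
    return {
        "residual_shapiro_p": float(stats.shapiro(residuals).pvalue),
        "levene_p": float(stats.levene(*groups, center="median").pvalue),
    }

# Hypothetical measurements for three groups.
example_groups = [
    [4.1, 5.0, 5.6, 4.7, 5.3, 4.9],
    [6.2, 7.1, 6.8, 7.5, 6.5, 7.0],
    [9.0, 8.4, 9.6, 8.8, 9.3, 9.1],
]
result = residual_normality_and_levene(example_groups)
print(result)
```

A small p-value from either test flags a possible violation; with groups this small the tests have little power, so a plot of the residuals is worth a look as well.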
 

mcw

New Member
#7
Hi

Just to tag onto this thread... My data are ordinal, so I plan to use a non-parametric test (the Wilcoxon signed-rank test), but I am aware that the homogeneity of variance assumption should not be violated. What is the best test for checking homogeneity of variance across two groups (i.e. pre- and post-test)? I'm not sure, but I think that Levene's test can be computed as part of a two-sample t-test in SPSS. Is there a better method?

Thanks in advance.
 

mcw

New Member
#8
Tests for homogeneity of variance

Ok. Sorry. Just found a reference saying that these particular non-parametric tests don't in fact need to meet the equal variances assumption, so I will presume that to be the case... Thanks again.
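For reference, a minimal sketch of the paired test itself in Python with SciPy, run directly on the pre/post scores with no variance check beforehand. The ten pairs of ordinal scores are invented for the example.

```python
from scipy import stats

# Hypothetical ordinal scores (1-5 scale) for the same 10 subjects.
pre = [2, 3, 1, 4, 2, 3, 2, 1, 3, 2]
post = [3, 5, 2, 5, 4, 4, 4, 3, 4, 5]

# Wilcoxon signed-rank test on the paired differences (pre - post).
result = stats.wilcoxon(pre, post)
print(result.statistic, result.pvalue)
```

With ordinal data the differences are often tied, and SciPy may fall back to a normal approximation for the p-value in that case.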