Bonferroni Correction

#1
I have been using SPSS for about four months. I am not a statistician. I continue to learn something new every minute that I am working with this program. I am looking at scores for achievement levels based on race. When I run the data, here are my steps:
  1. Analyze
  2. multivariate
  3. DV= Reading, Math Score
  4. IV = Race, Gender, Special Education Status, Dummy Race W, Dummy Race AA
  5. Post hoc = Bonferroni Correction

    Once I run the data, I get a significance level of .188 for reading and .052 for math on the pairwise comparison for race. If I continue and run a Bonferroni correction, I get .001 for reading and .000 for math, which of course shows significance.

    But when I run the Dummy Race W and Dummy Race AA, I get a significance level of .027 for reading and .007 for math on the pairwise comparison. These scores are based on estimated marginal means.

    I guess my questions would be: can I run a Bonferroni Correction if no significance is noted on the initial analysis for race? Is it acceptable to divide (disaggregate) the race data to a more specific level, of White and African American, where an initial run demonstrates significance? Are those levels of significance acceptable? When should I use a Bonferroni Correction and when should I not use it?

    If anyone has an answer or assistance, I am wide open for it. Thanks!!
 
#2
Hey,

I'm not too much of a statistician myself. But from what I know, post-hoc tests are only conducted if you find a significant effect in an omnibus test (in this case, the ANOVA). The very purpose of doing post-hoc testing is to see where the differences lie, because an ANOVA tells us there is a difference somewhere but we don't know exactly where. So we do follow-up analyses or post-hoc tests.

The Bonferroni correction is just a method of controlling the familywise error rate, since the error rate is inflated when one does multiple comparisons (if there are only 2 levels, then there is only 1 pairwise comparison so it's fine, but as the number of levels increases so does the number of pairwise comparisons). Also, not sure if you are aware of the different corrections for multiple comparisons. Some consider Bonferroni to be too conservative.
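
To make the arithmetic behind that concrete, here is a minimal Python sketch (not SPSS output). It assumes independent tests for the familywise error formula, and the raw p-values at the bottom are made up purely for illustration.

```python
from math import comb

def n_pairwise(levels):
    """Number of pairwise comparisons among `levels` groups."""
    return comb(levels, 2)

def familywise_error(alpha, m):
    """Chance of at least one false positive across m independent tests."""
    return 1 - (1 - alpha) ** m

def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# More levels means more comparisons, a higher familywise error rate,
# and a stricter per-test threshold (alpha / m).
for levels in (2, 3, 5):
    m = n_pairwise(levels)
    print(levels, m, round(familywise_error(0.05, m), 3), 0.05 / m)

# Hypothetical raw p-values for three pairwise comparisons
print(bonferroni([0.010, 0.040, 0.300]))
```

Note that an adjusted p-value can never be smaller than the raw one, so a Bonferroni correction can only make a result look less significant, never more.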

Hope that helped !!
 
#3
Thank you for your reply. You have confirmed what others have told me. I was not a believer, but here I am. The second part of my question was that I created these dummy variables, and when I compare White students to African American students, I come up with the significance that I noted above. Do you see any problems or irregularities with this procedure?

And I know that the Bonferroni Correction is a conservative procedure, but I need people to believe that these differences do not occur by chance. So I chose the most conservative procedure I could find.

Again, thank you for your response.
 
#4
I can't quite gather what your analysis structure is. Is it a 5-way MANOVA? How many levels are there of each factor? Consult Andy Field's "Discovering Statistics Using SPSS" book. Very clear and engaging.
 
#5
Just ordered the book! Thanks for the tip..... I am looking at two dependent variables and six independent variables; DV math and reading scores. IV Black-White, Male/Female and Special Ed/Non Special Ed.
 
#6
I'm not too familiar with MANOVAs so don't want to mislead you. But I think you have 3 independent variables (or factors) with each having two levels.
 

spunky

Can't make spagetti
#8
I guess my questions would be: can I run a Bonferroni Correction if no significance is noted on the initial analysis for race? Is it acceptable to divide (disaggregate) the race data to a more specific level, of White and African American, where an initial run demonstrates significance? Are those levels of significance acceptable? When should I use a Bonferroni Correction and when should I not use it?
you can certainly do it, but it feels to me as if you're "fishing for significance", as we say in the business. you're profiting from the fact that the initial analysis (which i presume, as rohanp16 said, is a 3-way MANOVA) has less statistical power than taking just a sub-section of your data and running a far simpler analysis on it.

so sure you can do it, but if i were to read your paper or manuscript or something i would immediately say "well... and why did the author ignore all the other data?"
 
#9
Spunky,

You are correct. That is the very thing I am trying to avoid. I would like to look at this data in every direction to give an honest assessment of what is happening with these test scores. It is crazy to see students score below one group every year without wondering what is happening in the school system? Without showing significance, no one cares, they believe that the status quo is acceptable when it is not......

A Bonferroni Correction is the most conservative method of significance is it not? Would this not give more integrity to the process if it were used? I am trying to dig as deep into the data as possible without others coming to a conclusion, "oh well, he can make his stats say anything that he may want."

Thank you for your suggestion, it was exactly what I needed to hear!
 

spunky

#10
ok, so let's take it step by step...

you open your file, you run your MANOVA and do you get significance? what do good ol' pillai's trace, wilks' lambda, roy's root and the hotelling-lawley trace look like?

then (if you follow the often-criticised default SPSS method of breaking the MANOVA into several univariate ANOVAs), what happens? what do your significant findings look like?
thing is i am getting a little bit confused on how you described things earlier so i just wanna make sure my advice is as accurate as i can..
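
For anyone wondering how the four multivariate statistics named above relate to each other, here is a rough Python sketch. It assumes you already have the eigenvalues of E^-1 H (H being the hypothesis SSCP matrix and E the error SSCP matrix); the eigenvalues used here are made up, and Roy's root is taken as the largest eigenvalue itself, although some texts report lambda / (1 + lambda) instead.

```python
from math import prod

def manova_statistics(eigenvalues):
    """Four classic MANOVA statistics from the eigenvalues of E^-1 H."""
    return {
        "pillai_trace": sum(l / (1 + l) for l in eigenvalues),
        "wilks_lambda": prod(1 / (1 + l) for l in eigenvalues),
        "hotelling_lawley_trace": sum(eigenvalues),
        "roy_largest_root": max(eigenvalues),
    }

# Hypothetical eigenvalues for two dependent variables
stats = manova_statistics([0.25, 0.0])
for name, value in stats.items():
    print(name, round(value, 4))
```

Larger Pillai, Hotelling-Lawley and Roy values point toward an effect, while for Wilks' lambda it is the other way around: values near 1 mean the factor explains very little.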
 
#11
I am showing you the results that I received after running the MANOVA.

I did not put in the dummy variables.

After looking at the data again, I think I am ok with just running the MANOVA, and conducting a Bonferroni as well without any issues of integrity.

Please do not judge the tables....I have issues with making them..:) Is this enough information?
 

spunky

#12
After looking at the data again, I think I am ok with just running the MANOVA, and conducting a Bonferroni as well without any issues of integrity.
well... i really dunno about that. none of your multivariate statistical tests are significant (and roy's largest root doesn't really count in this case because SPSS handles it as a lower bound on the significance, which isn't particularly useful if you ask me)

you do seem to have some sort of effect for caucasians versus african americans there... but my concern is that your effect size is really small... and yeah, i'm aware of the drawbacks that partial eta-squared has but even considering them, it's just really, really small...
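
As a reminder of what partial eta-squared measures, here is a small Python sketch. The sums of squares, F value and degrees of freedom are invented for illustration and are not taken from the attached tables.

```python
def partial_eta_squared(ss_effect, ss_error):
    """Share of (effect + error) variability attributed to the effect."""
    return ss_effect / (ss_effect + ss_error)

def partial_eta_squared_from_f(f_value, df_effect, df_error):
    """Same quantity recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# Hypothetical ANOVA row: SS_effect = 50, SS_error = 9950, df = (1, 995),
# which gives F = (50 / 1) / (9950 / 995) = 5
print(partial_eta_squared(50, 9950))
print(partial_eta_squared_from_f(5, 1, 995))
```

Both calls describe the same made-up effect, so they should agree. A value that small means the factor accounts for a tiny fraction of the variability even if the F test is significant.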

what's your sample size, just out of curiosity?
 
#13
Can't I use pairwise comparisons?

I have attached my sample size and pairwise comparisons for my dummy variables for White and African American.......

I am a lot discouraged........ :(
 
#14
My original work was with one school district, but I have the data for the entire state......I think this is what you are talking about. I have included the Multivariate test and Sample Size........

I believe this will show significance........
 

spunky

#15
oh god... it's a little bit more complicated than i thought. anyways, here's my recommendation.

you simply have WAY too much statistical power. your sample size is more than 4000 and, with those numbers, anything can come out as significant. that's why your effect sizes are so tiny and i couldn't fathom why until i saw that last table you attached. but what's going on here is that, with such a huge sample size, even slight, very, very small differences come out as statistically significant.
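
A quick way to see this is to hold a small effect fixed and only change the sample size. The Python sketch below uses a normal approximation for a two-group comparison with equal group sizes; the standardized difference of 0.1 and the group sizes are assumptions chosen for illustration, not values from this data set.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z_p_value(d, n_per_group):
    """Two-sided p-value for a standardized mean difference d
    between two equal-sized groups (normal approximation)."""
    z = d * sqrt(n_per_group / 2)
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Same tiny effect, increasingly large samples
for n in (50, 500, 5000):
    print(n, two_sample_z_p_value(0.1, n))
```

The effect never changes, yet the p-value drops from clearly non-significant at 50 per group to far below .001 at 5000 per group. That is why the effect size has to be read alongside the p-value.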

another big problem that you have is unequal sample sizes. if you look at the number of white/caucasian students versus the number of african american students, you can see that you have more than 3 times as many caucasian students as african american students. that could be a reason why you're finding statistical significance there.

all in all, my conclusion is that it is very, very difficult to tell whether there is something going on or not. your statistically significant findings could very well be nothing more than a mathematical artifact of such a big sample and how unequal your sample sizes are in the comparisons you are making. your effect size should be able to point you in that direction, since less than 0.005% of the variance in your results is explained by race.

now, you also should consider why you are doing this analysis. if this is some sort of school project or homework, you should be able to get away with it and just comment on the differences. but if this is potentially intended to be published or used as an official report of sorts, i'm pretty sure your reviewers are going to give you trouble because of the flaws in your design. do keep in mind that, regardless of what this is for, you should mention both the big sample size and the very unequal sample sizes as potential shortcomings.
 
#16
Ok, to recap.......You feel I do not have significance with the first sample size.

The second sample you feel is too large......

Can I use the pairwise comparison in the first group?

Doesn't the SPSS program take into account the differences in sample size?

I appreciate your assistance with this. It allows me to think aloud with these types of problems.......
 

spunky

#17
The second sample you feel is too large......

Can I use the pairwise comparison in the first group?

Doesn't the SPSS program take into account the differences in sample size?
big sample sizes are generally a good thing, but in your case you have very, very small effect sizes so my intent is to ask you to critically appraise your SPSS results. if so little of the variance is being explained by group membership, what would you say is more likely... that there really is a difference among them or that you just have such a huge sample size (5027 if i am correct) that even very small changes in there are statistically significant?

you are correct. SPSS does take into account sample size differences, but unfortunately i've seen people believe that SPSS somehow fixes the problem when, in reality, it just masks it. through the use of type-3 sums of squares (as opposed to type-2) and because SPSS can't explore the data for you, it weights all of your cell means the same way, regardless of the variability in them. every time you have unequal sample sizes (unless the inequalities are very small or very scattered around) you (or anyone), as a responsible researcher, should have a closer look at the data and be very careful in the analysis. in your case, with regards to your racial groups, there are more than 3 times as many caucasian students as african american students in sample 1... just think this one through with me logically... wouldn't you expect those to be different just by such a big difference among group numbers? you have over 250 more white students than black students in just one group. what if that black-students group happens to include one or two extreme points that move the mean differences one way or another when compared to white students? because you don't have as many people, you don't have as many data points that would counterbalance that. or, maybe, you do have a difference which is even greater than what you have now... the problem is you don't really know.
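
Two of the points above can be sketched in a few lines of Python: how equal weighting of cell means differs from weighting by cell size, and how much one extreme score moves the mean of a small group compared with a large one. All cell means, group sizes and scores here are hypothetical.

```python
def unweighted_marginal_mean(cell_means):
    """Every cell mean counts equally, whatever the cell size."""
    return sum(cell_means) / len(cell_means)

def weighted_marginal_mean(cell_means, cell_sizes):
    """Cell means weighted by the number of observations in each cell."""
    total = sum(m * n for m, n in zip(cell_means, cell_sizes))
    return total / sum(cell_sizes)

def mean_shift_from_one_point(n, group_mean, extreme_value):
    """How far a group's mean moves when one extreme score is added."""
    return (extreme_value - group_mean) / (n + 1)

# Hypothetical cells: means 60 and 70 with very unequal sizes
print(unweighted_marginal_mean([60, 70]))
print(weighted_marginal_mean([60, 70], [300, 20]))

# One score of 100 added to groups of 100 and 350 students (mean 50)
print(mean_shift_from_one_point(100, 50, 100))
print(mean_shift_from_one_point(350, 50, 100))
```

With unequal cells the two kinds of marginal mean can disagree noticeably, and the same extreme score shifts the smaller group's mean several times more than the larger group's.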

you can go ahead and do the pairwise comparisons with bonferroni adjustment if you want. to me, that does not solve the situation that your results could very well be nothing more than a statistical artifact. i guess you can go ahead with your analysis but just keep in mind that being a responsible researcher means mentioning drawbacks such as this sample situation. i don't know what this particular analysis you are doing is for, but i would feel very worried if people were to quote your results and say something like: "see? white students outperformed black students!!! (i can see that because the difference of means for white students is positive) we have the data to show that!!! let's do this or that based on that data!!" when, in fact, you could well only have unequal (and huge) sample sizes and, hence, there may or may not be a difference.

just try to be careful with what you say here... or i guess if you are planning to submit this for a review, don't be surprised if they mention something along these lines and be prepared to justify why your research, in spite of this, can still make a contribution.