Thread: Bonferroni correction in correlations: is it necessary?

  1. #1
    Hi all!
    I am new here and a beginner in statistics, so please help with this one:

    My objective is to classify texts into two classes, fiction and non-fiction, i.e. a binary (yes/no) dependent variable (DV). I use a large number of independent variables (IVs): 100 continuous IVs that describe textual characteristics, e.g. sentence and word length. I have a training set of 400 texts and use discriminant analysis.

    As a preliminary step, and in order to reduce the number of IVs entered in the discriminant analysis, I investigated the correlation of each IV with the DV and reported the biserial correlation coefficient and its p-value. I then decided to exclude from further investigation the IVs with a p-value > .05. Here lies my problem.
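    A minimal sketch of this kind of screening step, for illustration only. It assumes the point-biserial coefficient (the Pearson correlation between a continuous IV and a 0/1 DV) and a large-sample normal approximation for the p-value; the feature names and toy data are hypothetical, not taken from the original analysis.

```python
import math
from statistics import NormalDist


def point_biserial(x, y):
    """Pearson correlation between a continuous list x and a 0/1 list y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def approx_p_value(r, n):
    """Two-sided p-value for H0: rho = 0.

    Uses a normal approximation to the t statistic, which is an
    assumption that is only reasonable for fairly large n, and it
    requires |r| < 1.
    """
    t = r * math.sqrt((n - 2) / (1 - r * r))
    return 2 * (1 - NormalDist().cdf(abs(t)))


def screen(features, labels, alpha=0.05):
    """Return the names of features whose correlation with labels has p <= alpha."""
    kept = []
    for name, values in features.items():
        r = point_biserial(values, labels)
        if approx_p_value(r, len(labels)) <= alpha:
            kept.append(name)
    return kept


# Hypothetical toy data: 40 texts, alternating non-fiction (0) and fiction (1).
labels = [i % 2 for i in range(40)]
features = {
    # Shifts with the class label, so it should survive the screen.
    "sentence_length": [10 + 3 * label + (i % 5) for i, label in enumerate(labels)],
    # Unrelated to the class label by construction.
    "noise": [i % 5 for i in range(40)],
}
kept = screen(features, labels)
print(kept)
```

    Passing a smaller `alpha` to `screen` is all it takes to apply a stricter cut-off.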

    My tutor thinks that this p-value is prone to type I errors, since I report 100 correlations (each IV with the DV), and suggested using a cut-off point of 0.05/100 = 0.0005 (Bonferroni correction).
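    The arithmetic behind that cut-off can be sketched as follows. The function names are made up for illustration; the only inputs are the nominal alpha and the number of tests.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test cut-off that keeps the family-wise alpha at most `alpha`."""
    return alpha / n_tests


def passes_bonferroni(p_values, alpha=0.05):
    """Flag which p-values fall at or below the Bonferroni-corrected cut-off."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p <= threshold for p in p_values]


# 100 correlations tested at a nominal alpha of 0.05.
threshold = bonferroni_threshold(0.05, 100)
print(threshold)
```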

    My question is: Is Bonferroni correction necessary in correlations?

    My opinion is that this is too strict a criterion in the above problem. All my IVs are theoretically justified, it's not like I've been fishing for variables out there. My classification model is not directly affected, since this is only a preliminary step to reduce the number of variables entered in discriminant analysis.

    So, do I have to use the Bonferroni correction to report the p-value of the correlations? From what I've read so far, Bonferroni correction is important sometimes in biomedical research and in different statistical tests (e.g. t-test). Is it really crucial for my problem?

    What do you think? Any ideas would be much appreciated. Thank you!

  2. #2
    CowboyBear
    Hi there! Interesting question and an interesting project.

    The Bonferroni correction issue is a very contentious one. As far as I know it is as appropriate to correlations as it is to studies of group differences etc - but is it appropriate at all? If you run a search on Google Scholar you will probably find a lot of articles offering various sides of the debate.

    A (very) quick summary thereof as I understand the pros and cons:

    Advantage: Use of the Bonferroni correction can avoid situations such as an author examining 20 relationships, finding one that is significant at the 0.05 level, and claiming this indicates an important relationship (although an alpha level of 0.05 implies a one in 20 chance of a Type 1 error on each test, so given no actual relationships in the population one would still expect about one "significant" correlation). This doesn't stop unscrupulous authors from running the above analysis and then pretending that the "significant" relationship was the only one they were looking at all along.
    In general the correction is a (yes, very) strict way to help avoid Type 1 errors.
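    To put rough numbers on the 20-relationship example, here is a small sketch. It assumes all null hypotheses are true and, for the family-wise rate, that the tests are independent (which correlated text features would not strictly satisfy). The function names are illustrative.

```python
def expected_false_positives(alpha, n_tests):
    """Expected number of 'significant' results when every null is true."""
    return alpha * n_tests


def familywise_error_rate(alpha, n_tests):
    """Chance of at least one Type 1 error, assuming independent tests."""
    return 1 - (1 - alpha) ** n_tests


n_tests = 20
uncorrected = familywise_error_rate(0.05, n_tests)
corrected = familywise_error_rate(0.05 / n_tests, n_tests)

print(expected_false_positives(0.05, n_tests))
print(round(uncorrected, 3), round(corrected, 3))
```

    Under these assumptions the uncorrected family-wise rate is well over one half, while the Bonferroni-corrected rate stays just under 0.05.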

    Disadvantages:
    - Use of the correction creates a paradox wherein if the above example involved 20 different studies each looking at one of the examined correlations, it'd be perfectly acceptable for each author to not use a Bonferroni correction (indeed, how could they?) - but if one author examines all the relationships at once, he/she is meant to use the correction!
    - Use of the correction may exacerbate the problem of publication bias: it causes more studies to have "insignificant" findings and therefore not be published, resulting in a greater literature bias towards larger effects and inaccurate meta-analytic findings.
    - Much reduced statistical power
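    The loss of power in the last point can be illustrated with a rough calculation. This sketch assumes the Fisher z approximation for a two-sided test of a zero correlation, which is only a rough guide when one variable is binary. The effect size r = 0.15 is a hypothetical value, and n = 400 matches the training set size mentioned in the first post.

```python
import math
from statistics import NormalDist


def approx_power(r, n, alpha):
    """Approximate power of a two-sided test of H0: rho = 0.

    Assumes Fisher's z transform: atanh(r_hat) is roughly normal with
    mean atanh(r) and standard deviation 1 / sqrt(n - 3).
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = math.atanh(abs(r)) * math.sqrt(n - 3)
    # Probability of landing in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


r, n = 0.15, 400  # hypothetical true correlation, sample size from the thread
power_uncorrected = approx_power(r, n, alpha=0.05)
power_bonferroni = approx_power(r, n, alpha=0.05 / 100)

print(round(power_uncorrected, 2), round(power_bonferroni, 2))
```

    Under these assumptions a modest true correlation that would usually be detected at the 0.05 level is missed more often than not at the corrected level.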

    A couple of useful articles arguing against the Bonferroni correction:
    Nakagawa, 2004
    Perneger, 1998

    Overall, the question is: is avoiding occasional Type 1 errors really worth a much inflated Type 2 error rate? Using the correction helps ensure no Type 1 error is made - but also means you have a much greater chance of ignoring a relationship that is real (and maybe even important!). Imho the potential publication bias resulting from Bonferroni corrections making findings "insignificant", and the resulting bias of effect sizes in the literature as a whole, is a far more worrisome problem than single studies making possible Type 1 errors. (Admittedly I may not quite be NPOV here!)

    However... This may all be a bit academic, because actually I don't think looking at the bivariate zero-order correlations is the right way to select IVs to generate your final discriminant function. Why not use the discriminant analysis itself to decide which IVs to keep? As far as I know you can run DAs with all the usual types of statistical variable retention criteria - backward, forward, and stepwise entry, with a choice of alpha levels. This way you can select IVs that offer a statistically significant contribution to discriminating between the DV groups in the multivariate model. I suppose one could theoretically use a Bonferroni correction here to make the entry/removal criteria more stringent (can you? would you? anybody?) but I've never seen it done - Bonferroni corrections seem to come up more in bivariate analyses.

    Hope this helps. By the way - what is the actual theoretical bent of the analysis? I'm sure you're not actually trying to develop a measure to help librarians decide which section books belong in.
    Last edited by CowboyBear; 07-19-2009 at 09:19 PM. Reason: Terms cleanup, formatting, added refs.

  3. #3
    Thanks a million for such a thorough reply!
    I finally decided to drop the Bonferroni correction. I had also tried entering all the IVs in discriminant analysis, but the resulting model could not be well justified intuitively or theoretically. That's why I decided to use the preliminary step and choose IVs that are both statistically meaningful and theoretically better explained.
    And no, the objective was not to make the lives of librarians easier.
    It's mainly a psycho-linguistic inquiry about surface text features and writers' choices when writing both fiction and non-fiction...

    Again thanks CowboyBear!

  4. #4
    I wonder why you don't use logistic regression (LR). The advantage of LR is that its assumptions are not as strong as those of DA (e.g. normality), and software tools often implement variable selection based on criteria such as AIC or BIC, which you can use for LR models. I don't know if they are also available for DA.
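    As a rough illustration of AIC-based comparison for logistic regression, here is a self-contained sketch. It fits a one-predictor model by plain gradient ascent and compares AIC values. The simulated data, predictor names, step size and iteration count are all assumptions made for the example; in practice a statistics package would do the fitting.

```python
import math
import random


def sigmoid(z):
    return 1 / (1 + math.exp(-z))


def fit_logistic(xs, ys, steps=1000, lr=0.1):
    """Fit P(y=1) = sigmoid(b0 + b1*x) by gradient ascent on the mean log-likelihood."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            residual = y - sigmoid(b0 + b1 * x)
            g0 += residual
            g1 += residual * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1


def log_likelihood(xs, ys, b0, b1):
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(b0 + b1 * x)
        total += math.log(p) if y == 1 else math.log(1 - p)
    return total


def aic(log_lik, n_params):
    """Akaike information criterion: lower is better."""
    return 2 * n_params - 2 * log_lik


# Simulated data (hypothetical): 200 texts, balanced classes.
rng = random.Random(0)
ys = [i % 2 for i in range(200)]
informative = [rng.gauss(y, 1.0) for y in ys]  # mean shifts with the class
noise = [rng.gauss(0.0, 1.0) for _ in ys]      # unrelated to the class

# Intercept-only model: predicts the overall class proportion for everyone.
p_bar = sum(ys) / len(ys)
null_ll = sum(math.log(p_bar) if y == 1 else math.log(1 - p_bar) for y in ys)
aic_null = aic(null_ll, n_params=1)

aic_informative = aic(log_likelihood(informative, ys, *fit_logistic(informative, ys)), 2)
aic_noise = aic(log_likelihood(noise, ys, *fit_logistic(noise, ys)), 2)

print(round(aic_null, 1), round(aic_informative, 1), round(aic_noise, 1))
```

    A predictor would be kept only if adding it lowers the AIC. Forward or backward selection repeats this comparison across candidate predictors.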

  5. #5
    CowboyBear

    Originally posted by giordano:
    "I wonder why you don't use logistic regression (LR). The advantage of LR is that its assumptions are not as strong as those of DA (e.g. normality), and software tools often implement variable selection based on criteria such as AIC or BIC, which you can use for LR models. I don't know if they are also available for DA."
    This is a good point. The DA will be more statistically powerful IF its plethora of distributional assumptions are satisfied, but otherwise binary logistic regression will do the same thing while making fewer demands on the data.

    Discriminant analysis assumptions here:
    http://faculty.chass.ncsu.edu/garson...rim.htm#assume
