
Thread: Multiple testing: Rate of false positives

  1. #1

    Multiple testing: Rate of false positives

    Hello all,

    I hope you're well. I have an elementary question that I hope someone can help me with.

    I am reading this protocol:

    Love MI, Anders S, Kim V and Huber W. RNA-Seq workflow: gene-level exploratory analyses and differential expression [version 1; referees: 2 approved]. F1000Research 2015 4:1070.

    There, I have reached this paragraph:

    "In high-throughput biology, we are careful to not use the p values directly as evidence against the null, but to correct for multiple testing. What would happen if we were to simply threshold the p values at a low value, say 0.05? There are 5722 genes with a p value below 0.05 among our 29391 genes [This result comes from an analysis they did previously], for which the test succeeded in reporting a p value.

    Now, assume for a moment that the null hypothesis is true for all genes, i.e., no gene is affected by the treatment with dexamethasone. Then, by the definition of the p value, we expect up to 5% of the genes to have a p value below 0.05. This amounts to 1470 genes. If we just considered the list of genes with a p value below 0.05 as differentially expressed, this list should therefore be expected to contain up to 1470/5722 = 26% false positives."

    What I don't understand is the last result: how does 1470 (expected p < 0.05) / 5722 (obtained p < 0.05) define the false positive fraction? I would have thought, intuitively, that the false positives should be the other 74% "non-expected" ones.

    Can anyone help me understand this, please? I would be very thankful.


  2. #2
    hlsmith

    Re: Multiple testing: Rate of false positives

    Yeah, we should all give up, because these terms are always couched in misuse and confusing definitions.

    I understand what you are saying. The way they define things, 1470 genes would fall below 0.05 purely by chance, i.e., they would be false positives. So if that is what you expect, then (5722 - 1470) / 5722, about 74%, are your true significant findings. But they also say to assume the null is actually true for all genes, so is the counterpart the false positives? The writing is just confusing enough that I don't know.
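
    For what it's worth, the arithmetic in the quoted paragraph can be reproduced with a few lines of Python. This is only a sketch using the three numbers quoted above; the variable names are my own, and it assumes (as the quote does) that the null is true for every gene, which makes the result an upper bound rather than an estimate.

```python
# Numbers quoted from the workflow paragraph.
n_tested = 29391   # genes for which the test reported a p value
n_below = 5722     # genes observed with p < 0.05
alpha = 0.05       # p value threshold

# Assumption from the quote: the null is true for all genes, so up to
# alpha * n_tested genes are expected below the threshold by chance alone.
expected_false = alpha * n_tested

# Upper bound on the share of the "significant" list that is false positives.
false_fraction = expected_false / n_below

print(round(expected_false))        # 1470
print(round(false_fraction * 100))  # 26
```

    So the 1470 is the count expected by chance, and dividing it by the 5722 genes on the list gives the share of that list that could be false positives. The remaining roughly 74% is the part of the list not accounted for by chance under that assumption.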
