Multiple hypothesis testing

noetsi

Fortran must die
#1
I am reading one of those books that makes my head hurt, because it raises many issues and disagrees with what to me are established views on a topic (that is, things I think are settled).

"A concern is sometimes expressed that if you test a large number of hypotheses, then you're bound to reject some [even if they are true].....From our data analysis perspective, however, we are not concerned about multiple comparisons [and thus about corrections like Bonferroni]. For one thing, we almost never expect any of our 'point null hypotheses' (that is, hypotheses that a parameter equals zero, or that two parameters are equal) to be true, and so we are not particularly worried about the possibility of rejecting them too often.....There is no need to correct for the multiplicity of tests if we accept that they will be mistaken on occasion."
That just seems wrong:p
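
To be clear about why the standard advice worries me here, a quick simulation sketch of my own (not from the book): run 20 tests of true null hypotheses at the 5 percent level and see how often at least one gets rejected, with and without a Bonferroni correction.

```python
import numpy as np

rng = np.random.default_rng(42)
n_sims, n_tests, alpha = 10_000, 20, 0.05

# Under a true null, p-values are uniform on (0, 1)
p = rng.uniform(size=(n_sims, n_tests))

# Familywise error rate: probability of at least one rejection
fwer_uncorrected = (p < alpha).any(axis=1).mean()
fwer_bonferroni = (p < alpha / n_tests).any(axis=1).mean()

print(f"Uncorrected: {fwer_uncorrected:.3f}")  # ~0.64, i.e. 1 - 0.95**20
print(f"Bonferroni:  {fwer_bonferroni:.3f}")   # ~0.05
```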

They go on to comment on something I had long wondered about.

"The second problem [with statistical significance] is that changes in statistical significance are not themselves significant. By this, we are not merely making the commonplace observation any particular threshold is arbitrary[so a 5 percent significance level is really not that different than a 4.9 percent level].....Rather we are pointing out that even large changes in significance levels can correspond to small, nonsignficant changes in the underlying variable."
 

Miner

TS Contributor
#2
They go on to comment on something I had long wondered about.

"The second problem [with statistical significance] is that changes in statistical significance are not themselves significant. By this, we are not merely making the commonplace observation any particular threshold is arbitrary[so a 5 percent significance level is really not that different than a 4.9 percent level].....Rather we are pointing out that even large changes in significance levels can correspond to small, nonsignficant changes in the underlying variable."
I would phrase this differently. Statistical significance does not link directly to the effect size. Therefore, you could have a p-value of 0.049 with a very large effect (due to small sample size or large variation within groups), and another p-value of 0.001 with a very small effect (due to large sample sizes and small variation within groups).
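
A pair of made-up scenarios using only summary statistics shows the disconnect (a sketch with invented numbers, not real data):

```python
from scipy.stats import ttest_ind_from_stats

# A: big effect (9 units), tiny samples -> p just under .05
tA, pA = ttest_ind_from_stats(mean1=9.0, std1=10.0, nobs1=11,
                              mean2=0.0, std2=10.0, nobs2=11)

# B: tiny effect (0.5 units), huge samples -> p well under .001
tB, pB = ttest_ind_from_stats(mean1=0.5, std1=10.0, nobs1=10_000,
                              mean2=0.0, std2=10.0, nobs2=10_000)

print(f"A: effect = 9.0, p = {pA:.3f}")   # ~0.048
print(f"B: effect = 0.5, p = {pB:.4f}")   # ~0.0004
```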
 

noetsi

Fortran must die
#3
That makes sense, Miner, although I am not sure the authors agree or are making that point. The sense I get is that they see significance levels generally as unimportant. It's the same issue of whether, assuming statistical power is not a problem, a lower p-value means something is more certain - so that with a p-value of .02 you are much more certain the null should be rejected than with .04, or, worse, that it says anything about the relative impact or importance of the effect.

Having said that, following a suggestion I found in the literature, I use the size of the Wald statistic in logistic regression to rank the relative impact of predictors, so I violate my own rule (the Wald statistic being what is used for the statistical test of the predictors, that is).
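
For what it's worth, here is a rough sketch of what that looks like on simulated data (made-up variable names and effects, nothing from my actual models): fit a logistic regression and rank the predictors by the absolute value of the Wald z statistic (coefficient over standard error).

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 1000
X = pd.DataFrame({"x1": rng.normal(size=n),
                  "x2": rng.normal(size=n),
                  "x3": rng.normal(size=n)})

# True model: x1 has a big effect, x2 a small one, x3 is pure noise
logit_p = 1 / (1 + np.exp(-(0.2 + 1.0 * X.x1 + 0.4 * X.x2)))
y = rng.binomial(1, logit_p)

fit = sm.Logit(y, sm.add_constant(X)).fit(disp=0)
wald_z = (fit.params / fit.bse).drop("const").abs()
print(wald_z.sort_values(ascending=False))  # ranks x1 > x2 > x3, as built in
```

Whether |z| is a good measure of "impact" is debatable, of course - it mixes effect size with how precisely the effect is estimated.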
 

hlsmith

Not a robit
#4
I will just chime in on the latter quote; in particular, it is saying that p-values don't tell you the direction or magnitude of effects.
 

CowboyBear

Super Moderator
#5
That just seems wrong:p
There's lots of debate about whether corrections for familywise Type 1 error rate such as the Bonferroni are necessary (e.g., Google Scholar search). I don't think it's a particularly resolvable debate - or one where you can say that one position is "wrong". I say that because ultimately deciding whether a correction is necessary depends on what risks you are willing to take when publishing findings (or more specifically, the relative costs of Type 1 and Type 2 errors, and by extension what risk of committing each you're willing to live with). It's kinda like asking if someone is "wrong" to try sky-diving - it depends entirely on how much you value the fun of jumping out of a plane vs. the risk of ending up squashed on the ground.
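
To put rough numbers on that trade-off, here's a toy simulation of my own (20 z-tests, 5 of them with real effects - all the numbers are invented): the correction buys a much lower familywise Type 1 rate and pays for it in missed real effects.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
n_sims, n_tests, n_real, alpha = 5_000, 20, 5, 0.05

means = np.zeros(n_tests)
means[:n_real] = 2.5                      # 5 real effects, in z units
z = rng.normal(loc=means, size=(n_sims, n_tests))
p = 2 * stats.norm.sf(np.abs(z))

for name, cut in [("uncorrected", alpha), ("Bonferroni", alpha / n_tests)]:
    rej = p < cut
    fwer = rej[:, n_real:].any(axis=1).mean()   # any false positive among the 15 nulls
    power = rej[:, :n_real].mean()              # share of real effects detected
    print(f"{name:<11}  FWER = {fwer:.3f}, power = {power:.3f}")
```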

Can you cite the source when giving quotes please?
 

noetsi

Fortran must die
#6
There's lots of debate about whether corrections for familywise Type 1 error rate such as the Bonferroni are necessary (e.g., Google Scholar search). I don't think it's a particularly resolvable debate - or one where you can say that one position is "wrong". I say that because ultimately deciding whether a correction is necessary depends on what risks you are willing to take when publishing findings (or more specifically, the relative costs of Type 1 and Type 2 errors, and by extension what risk of committing each you're willing to live with). It's kinda like asking if someone is "wrong" to try sky-diving - it depends entirely on how much you value the fun of jumping out of a plane vs. the risk of ending up squashed on the ground.

Can you cite the source when giving quotes please?
It's not from an online source. It is from the book "Data Analysis Using Regression and Multilevel/Hierarchical Models" by Andrew Gelman and Jennifer Hill. I usually don't cite sources that are not online because I don't think anyone is going to get a book to look at the comments.

It is in one of the early chapters, although I did not write the page number down, so I am not sure where specifically it is located. BTW, although I have just started reading it, I think it's excellent, even if pieces go way over my head (probably won't be an issue for more sophisticated readers here).

I do accept the logic that corrections are necessary - which seems to me to be the theoretical position of most authors (certainly I had not run into a previous writer who suggested they were not desirable). Thus my comment that not doing them is wrong.
 

hlsmith

Not a robit
#7
I didn't want to do this, because the last thing you need is more information to reference on how you think all stats is lumpy and not standardized or unified in ideology, but at the link below (posted yesterday) is Gelman talking about p-values for over an hour. Knock yourself out:


http://www.econtalk.org/
 

CowboyBear

Super Moderator
#8
I do accept the logic that corrections are necessary - which seems to me to be the theoretical position of most authors (certainly I had not run into a previous writer who suggested they were not desirable).
Well... click the Google Scholar search link I showed you. Lots of people disagree that corrections are necessary. It's a contentious issue.
 

Miner

TS Contributor
#9
Another thing to consider is the author. Andrew Gelman is a Bayesian, so he doesn't buy into the concept of p-values in the first place. Second, he focuses on what he terms Type M (errors in magnitude) and Type S (errors in sign or direction) errors. Both of these will influence his views on frequentist statistics. This isn't to say he's wrong. I've found him to be very insightful on many topics, but he is definitely approaching it from a non-frequentist perspective.
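
For anyone unfamiliar with those terms, a minimal simulation sketch (toy numbers of my own) of what they look like in an underpowered study: the estimates that happen to reach significance exaggerate the true effect (Type M) and sometimes get its sign wrong (Type S).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
true_effect, se = 2.0, 8.0                    # a badly underpowered design
est = rng.normal(true_effect, se, size=100_000)

sig = np.abs(est / se) > stats.norm.ppf(0.975)   # "significant" at the 5% level

print(f"P(significant):        {sig.mean():.3f}")                      # low power
print(f"Type S error rate:     {(est[sig] < 0).mean():.3f}")           # wrong sign
print(f"Type M (exaggeration): {np.abs(est[sig]).mean() / true_effect:.1f}x")
```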