# Bonferroni correction in multivariate regression

#### Karabiner

##### TS Contributor
If we accept the idea for a moment that true hypotheses "b=0" exist,
then a significant F test in multiple regression suggests that at least
1 regression coefficient deviates from zero. If we have a model with
12 coefficients and 3 of them are "significant", then 2 of them could
be false positives (the actual probability that there are false positives
depends on additional factors, but basically is > 0). So why don't we
have to adjust for multiple testing? Somewhere it was asked why we
have to adjust in ANOVA post hoc tests but not in the case of regression
with dummy coding.

With kind regards

K.
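To put a rough number on the worry above, here is a minimal sketch of the family-wise error rate for 12 coefficient tests, with and without a Bonferroni adjustment. It assumes all 12 nulls are true and that the tests are independent, which is usually not the case for coefficients in one regression, so treat the figures as an illustration only. The function names are made up for this example.

```python
def familywise_error_rate(alpha: float, m: int) -> float:
    """P(at least one false positive) over m independent tests when every null is true."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni_alpha(alpha: float, m: int) -> float:
    """Per-test significance level under the Bonferroni correction."""
    return alpha / m


m = 12        # number of coefficients tested (from the example above)
alpha = 0.05  # conventional per-test level

# Assumption: independent tests, all nulls true.
fwer_unadjusted = familywise_error_rate(alpha, m)
fwer_bonferroni = familywise_error_rate(bonferroni_alpha(alpha, m), m)

print(f"Unadjusted FWER for {m} tests: {fwer_unadjusted:.3f}")
print(f"Bonferroni per-test alpha:    {bonferroni_alpha(alpha, m):.5f}")
print(f"FWER after Bonferroni:        {fwer_bonferroni:.3f}")
```

Under these assumptions the unadjusted chance of at least one false positive is close to one half, while the Bonferroni-adjusted rate stays at or below the nominal 0.05.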

#### rogojel

##### TS Contributor
Hi Karabiner,
exactly! That is what is bugging me too.

BTW I am just reading a book http://www.amazon.de/Data-Mining-Bu...=UTF8&qid=1417335786&sr=8-1&keywords=Ledolter

and it has a whole chapter about this problem. The author basically suggests using lasso regression to select parameters. The problem is that the lasso-selected parameters might have a high p-value if the model is checked traditionally - I am not sure what to do then.

I already eliminated the non-significant terms to get a model with low p-values and this seemed to work, but I believe the other alternative is also acceptable, i.e. to keep all the terms and screw the p-values.

regards

#### noetsi

##### No cake for spunky
Think of what a slope of 0 means substantively. It means that empirically (and in practice this commonly means correlations, not experimental tests) one variable has absolutely no relationship with another variable. It seems unlikely that a researcher would include many, if any, variables that are not somehow in the same domain in their analysis, because including totally nonsensical variables serves no research purpose and violates the concept of parsimony. Realistically, the IV and DV will commonly be in the same dimension even if the author feels one will not drive the other. So some correlation is always likely, even if not very much.

Also, by sheer random chance it is likely that there will be some spurious correlation between variables (even if you had a population, but even more so if you have a sample). This might be pure noise, but it would likely occur. This ignores, of course, measurement error and movement over time, which make the problem worse.

So slopes of exactly zero are probably unrealistic - certainly in correlational studies, which I imagine most regression addresses.


#### Injektilo

##### New Member
> @injektilo - I do not see the practical difference. If I cannot reject the null hypothesis, that would mean, in practice, that I have no basis, for example, to request a new investment in a plant for some machine that will control that factor. If I can prove the null hypothesis false, I have the needed arguments to request such an investment.
>
> regards
> rogojel

The difference appears like splitting hairs, but it is nonetheless important. The difference is that you have to conclude that you don't have enough information to say the null hypothesis is false, rather than conclude by saying the null hypothesis is true. It's like saying that because you don't have enough evidence to say that something is not equal to 0, then it must be equal to 0. It's a leap in logic.

#### CB

##### Super Moderator
> The difference is that you have to conclude that you don't have enough information to say the null hypothesis is false

Yep. The logical argument:

I don't have enough information to reject the null hypothesis. That is, this test statistic would be quite probable if the null hypothesis was true (p > 0.05).
Therefore the null hypothesis is probably true.

Is obviously a fallacy. But what about the other possibility? I.e., What happens if the p value is statistically significant? Then you have this logical argument:

This test statistic would be improbable if the null hypothesis was true (p < 0.05)
Therefore the null hypothesis is probably false

But that's a logical fallacy too... (probabilistic modus tollens).
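One way to see why "improbable under the null" does not by itself give "null probably false" is to run the numbers through Bayes' rule. The sketch below is only an illustration: it assumes a Bayesian framing in which the null has a prior probability, and the prior and power values are hypothetical, chosen for the example rather than taken from any study.

```python
def prob_null_given_significant(prior_null: float, alpha: float, power: float) -> float:
    """P(H0 true | test is significant), by Bayes' rule.

    prior_null: assumed prior probability that H0 is true (hypothetical input)
    alpha:      P(significant | H0 true)
    power:      P(significant | H0 false)
    """
    false_positive = alpha * prior_null
    true_positive = power * (1.0 - prior_null)
    return false_positive / (false_positive + true_positive)


# Hypothetical scenario: most tested nulls are true and power is modest.
skeptical = prob_null_given_significant(prior_null=0.9, alpha=0.05, power=0.5)

# Hypothetical scenario: nulls and alternatives equally likely, good power.
favourable = prob_null_given_significant(prior_null=0.5, alpha=0.05, power=0.8)

print(f"P(H0 | significant), skeptical prior:  {skeptical:.3f}")
print(f"P(H0 | significant), favourable prior: {favourable:.3f}")
```

In the first scenario the null remains nearly as likely as not even after a significant result, which is the point of the fallacy: the conclusion depends on inputs that the p value alone does not supply.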

#### noetsi

##### No cake for spunky
I don't know about Bayesian approaches, but I was always taught you could never conclude the null was true. It is either rejected or not rejected. That is why researchers set up the alternate hypothesis to test what they really believe is true. They are hoping to reject the null, because if they don't they have learned a lot less than if they do.

Not good statistics, but good if you want to get published, since not finding something is a much harder sell than finding something.

#### CB

##### Super Moderator
> I don't know about Bayesian approaches, but I was always taught you could never conclude the null was true. It is either rejected or not rejected.

It depends a little on the framework. In Fisherian NHST, and the "hybrid" method most people learn now, you can't conclude the null is true.

But in Neyman-Pearson NHST, you can "accept" the null hypothesis if p > alpha. That said, one isn't strictly saying that the null hypothesis is true, just that you have evidence to justify acting as if it were true (i.e., you have grounds for a decision to guide behaviour).

In some Bayesian tests - e.g., some of the Bayes Factor tests being developed at the moment - it is possible to provide evidence to support a point null hypothesis. This is useful if you're doing something like testing for extrasensory perception, where the exact null hypothesis being tested might actually be true. But in Bayesian estimation more generally you typically specify a continuous prior probability distribution for the estimated parameters, which means you're implicitly saying that the probability that the null hypothesis is exactly true is zero.
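As a rough illustration of how a Bayes factor can favour a point null, here is a minimal sketch for a guessing task of the ESP kind. It assumes a binomial outcome, a point null of p = 0.5, and a uniform prior on p under the alternative; that prior is a textbook simplification chosen for this example, not one of the specific tests referred to above, and the counts are hypothetical.

```python
from math import comb


def bf01_binomial_uniform(k: int, n: int) -> float:
    """Bayes factor for H0: p = 0.5 against H1: p ~ Uniform(0, 1).

    Under H0 the likelihood of k hits in n trials is C(n, k) * 0.5**n.
    Under H1 with a uniform prior, the marginal likelihood is 1 / (n + 1).
    Values above 1 favour the null, values below 1 favour the alternative.
    """
    return (n + 1) * comb(n, k) / 2 ** n


# Hypothetical data: 52 hits in 100 guesses, close to chance.
bf_near_chance = bf01_binomial_uniform(k=52, n=100)

# Hypothetical data: 70 hits in 100 guesses, far from chance.
bf_far_from_chance = bf01_binomial_uniform(k=70, n=100)

print(f"BF01 for 52/100 hits: {bf_near_chance:.2f}")
print(f"BF01 for 70/100 hits: {bf_far_from_chance:.4f}")
```

With near-chance data the factor comes out well above 1, i.e. the data are several times more probable under the point null than under this particular alternative, which is something a non-significant p value cannot express.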

> That is why researchers set up the alternate hypothesis to test what they really believe is true.

In obscure tidbits of information for the day: Another way to use NHST is the "strong" form. I.e., you have a theory that makes a quantitative prediction about what the exact true value of the parameter should be (which might not be zero). You then specify this value as the null hypothesis, and see if you can find evidence to reject it. This use of NHST fits more with a Popperian approach to science (i.e., you specify a theory and then try to falsify it). I've never seen it done in practice - theories in psych are almost never specific enough to predict the exact value of a parameter. But apparently this approach has been used in physics.
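A minimal sketch of the "strong" form: the null is set to the value a theory predicts, and the test asks whether the data contradict it. Everything concrete here is hypothetical (the predicted value, the measurements, and the assumption that the measurement standard deviation is known, which is what makes a simple z test applicable).

```python
from math import sqrt
from statistics import NormalDist, mean


def two_sided_z_test(sample: list[float], mu0: float, sigma: float) -> tuple[float, float]:
    """Test H0: mean == mu0, assuming a known measurement standard deviation sigma."""
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical: a theory predicts 9.81, and the instrument noise is assumed known.
predicted_value = 9.81
known_sigma = 0.02
measurements = [9.79, 9.83, 9.80, 9.82, 9.81, 9.78, 9.84, 9.80]

z, p = two_sided_z_test(measurements, predicted_value, known_sigma)
print(f"z = {z:.3f}, two-sided p = {p:.3f}")
```

Here a small p value would count against the theory itself, rather than against a "no effect" straw man, which is why this use lines up with the falsification idea described above.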

#### noetsi

##### No cake for spunky
In classes in social science it was drilled into my head over and over again that you can only reject the null. In no research that I have been involved in, all in the social sciences, would you reasonably know an exact value for the null.