Bonferroni test

#1
In my example, I have about 50 statistical analyses, so is it feasible to use the Bonferroni correction in this case? 0.05/50 will be a very small value (see the sketch below), and it will be impossible for one algorithm to significantly outperform the other. Thanks
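A minimal sketch of that arithmetic in Python (numbers from my example):

Code:
# Bonferroni correction: divide the familywise alpha by the number of tests.
alpha = 0.05
m = 50  # roughly 50 statistical analyses

alpha_per_test = alpha / m
print(alpha_per_test)  # 0.001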
 

Karabiner

TS Contributor
#2
In my example, I have about 50 statistical analyses, so is it feasible to use the Bonferroni correction in this case?
Well, it depends.
0.05/50 will be a very small value
It depends. In some genome studies, for example, this would be a big value.
and it will be impossible for one algorithm to significantly outperform the other. Thanks
Would this be a problem for you?

Maybe some information about your study (topic, research questions, study design, sample size, practical and/or theoretical relevance) would be useful.

With kind regards

Karabiner
 
#3
Well, it depends.

It depends. In some genome studies, for example, this would be a big value.

Would this be a problem for you?

Maybe some information about your study (topic, research questions, study design, sample size, practical and/or theoretical relevance) would be useful.

With kind regards

Karabiner
Thanks for your reply.

Yes, it would be a problem. For example, if algorithm A performs significantly better than B with a p-value of 0.0001, that suggests algorithm A is quite a bit better than B, but after the Bonferroni correction we would conclude that no algorithm performed better than the other.
 

Karabiner

TS Contributor
#5
Yes, it would be a problem. For example, if algorithm A performs significantly better than B with a p-value of 0.0001, that suggests algorithm A is quite a bit better than B,
No, it just tells you that you can reject the null hypothesis "the difference between A and B is exactly 0.00000000000000000000000000".
Small p-values do not indicate a large effect. Usually, they are due to large sample sizes.
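A small simulated illustration (a sketch, assuming numpy/scipy; the numbers are made up):

Code:
# A negligible true difference still yields a tiny p-value once the
# sample is large enough. Simulated data, illustrative only.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 1_000_000
a = rng.normal(loc=0.00, scale=1.0, size=n)
b = rng.normal(loc=0.01, scale=1.0, size=n)  # effect of 0.01 SD: trivial

t_stat, p = stats.ttest_ind(a, b)
print(p)  # tiny p-value, yet the effect is practically meaningless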

Maybe some information about your study (topic, research questions, study design, sample size, practical and/or theoretical relevance) would be useful.

With kind regards

Karabiner
 

ondansetron

TS Contributor
#6
No, it just tells you that you can reject the null hypothesis "the difference between A and B is exactly 0.00000000000000000000000000".
Small p-values do not indicate a large effect. Usually, they are due to large sample sizes.

Maybe some information about your study (topic, research questions, study design, sample size, practical and/or theoretical relevance) would be useful.

With kind regards

Karabiner
Just reposting this because anyone who reads this thread in the future should see the emphasis that p-values tell you nothing about "A is quite a bit better than B."
 

ondansetron

TS Contributor
#8
Indeed, but at least it shows that A and B have a significant difference
This is a pretty useless thing to explain, in general. P-values have limited information to convey and it's a misconception that "significance" is some targeted endpoint with tons of value.

It also sounds from your OP like your goal is to have something come out significant, since your concern is that [one won't be able to outperform the other] if you use a smaller alpha level per test. This should not be your goal.
 
#9
This is a pretty useless thing to explain, in general. P-values have limited information to convey and it's a misconception that "significance" is some targeted endpoint with tons of value.

It also sounds from your OP like your goal is to have something come out significant, since your concern is that [one won't be able to outperform the other] if you use a smaller alpha level per test. This should not be your goal.
If not p-values, then what is the alternative? How can we test for significant differences?
 

Karabiner

TS Contributor
#10
Due to the nearly complete lack of information about the study, we don't know the research design, not even the measurement level of the dependent variable, or why and for what purpose the study is undertaken. It is difficult to suggest solutions when the problem is described so poorly.

Maybe you can perform all comparisons in one analysis (perhaps repeated-measures ANOVA, mixed ANOVA, or multilevel modeling, if the dependent variable is interval scaled), and attach 95% confidence intervals to the estimated parameters. Such confidence intervals will give you an impression of how reliable the estimates are.
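For instance (a hedged sketch, assuming an interval-scaled outcome in long format; the file and column names are hypothetical, not from this thread):

Code:
# One model over all conditions instead of many separate tests.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("results.csv")  # hypothetical long-format results

model = smf.ols("score ~ C(algorithm) + C(dataset)", data=df).fit()
print(model.conf_int(alpha=0.05))  # 95% CIs for the estimated parameters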

What you then consider a "significant" difference (in the sense of important/relevant/remarkable..., I suppose?) will be up to your own judgement. No statistical procedure can take this task off your hands.

With kind regards

Karabiner
 
#13
Are you evaluating two treatments or are you evaluating two algorithms?
Algorithms. I am working on software development effort estimation, where different algorithms such as linear regression and support vector regression are applied to make predictions. The data generated are not normally distributed, so I performed the Wilcoxon test to get p-values.
Three algorithms are compared with each other to assess their predictive accuracy. These comparisons are repeated for 4 different datasets. So for each dataset there are 3 comparisons, but overall there are 12, i.e. 3 pairwise comparisons * 4 datasets.
Now, is it possible to divide 0.05 by 3 (the comparisons within each dataset) rather than by 12 (the comparisons across all datasets and algorithms)? A sketch of the two options is below.
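(A sketch, assuming numpy/statsmodels; the p-values here are placeholders, not my actual results:)

Code:
# 4 datasets x 3 pairwise Wilcoxon comparisons = 12 tests in total.
# Placeholder p-values; each entry would come from something like
# scipy.stats.wilcoxon(errors_a, errors_b).pvalue
import numpy as np
from statsmodels.stats.multitest import multipletests

p_values = np.array([[0.0001, 0.030, 0.20],
                     [0.0004, 0.010, 0.45],
                     [0.0020, 0.040, 0.60],
                     [0.0100, 0.020, 0.08]])

# Option 1: correct within each dataset (alpha / 3)
for d, row in enumerate(p_values):
    reject, _, _, _ = multipletests(row, alpha=0.05, method="bonferroni")
    print("dataset", d, "reject:", reject)

# Option 2: one family of all 12 tests (alpha / 12)
reject_all, _, _, _ = multipletests(p_values.ravel(), alpha=0.05,
                                    method="bonferroni")
print("all 12 tests, reject:", reject_all)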
Thanks for understanding
 
#14
Suppose you have two treatments, one algorithm that computes the median, and another algorithm that computes the mean. Do you say that if the mean algorithm computes a smaller p-value (in comparing the two treatments), you have then "shown" that the mean algorithm is "better"?

I hope you agree that this is absurd.

Usually one creates a model, and the model should fit the data.

Then you choose an estimator that is appropriate, e.g. least squares or maximum likelihood.

Then you choose an algorithm that can compute the estimator.

Of course you can call all three steps an "algorithm", but the data must still fit the model and the estimator must be relevant.
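A minimal sketch of the distinction, using a linear model (simulated data, assuming numpy):

Code:
# Step 1: the model -- y = X b + noise, and the model should fit the data.
# Step 2: the estimator -- here, least squares.
# Step 3: the algorithm that computes the estimator -- here, an SVD solver.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
b_true = np.array([1.5, -0.7])
y = X @ b_true + rng.normal(scale=0.1, size=100)

b_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(b_hat)  # close to b_true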

- - -

Besides, four different data sets are not much. And you can't really define a population from which the data sets are drawn. So what are you doing inference about?