What is n in the Bonferroni test?

#1
I want to compare the prediction accuracy of 4 algorithms: linear regression, support vector regression, bagging, and multilayer perceptron. Each algorithm will be compared with the other three. I have to evaluate their performance on 4 different datasets, so in total 3 comparisons * 4 datasets = 12 experiments per algorithm. I have to perform the Bonferroni correction after the p-values are calculated, so my question is: for 0.05/n, what will n be? Will it be 3 (because each algorithm is compared with the other 3 algorithms), or should it be 12, for all comparisons across all datasets? Thanks
 

obh

Active Member
#2
Hi Jave,

n is the number of tests you perform.
I think it is better to use the Sidak correction (similar, but more accurate and with a bit more power).
 
#3
obh said:
> n is the number of tests you perform.
> I think it is better you use the Sidak Correction (similar but more accurate and a bit more power)
Thanks obh, I will definitely look into it for my next project. Currently, I need to use Bonferroni as it is a requirement. My question is: do I need to divide 0.05 by 3 or by 12, based on the situation I explained in my post? Regards
 

obh

Active Member
#4
Hi Jave,

What test do you want to run?
How many tests?

For example, if you compare 4 algorithms: A, B, C, D:

If you run the following tests:
A-B (for example, a t-test comparing A's average with B's average)
A-C
A-D
B-C
B-D
C-D

In this example, you run 6 comparisons, so n = 6.
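The pairwise counting above can be sketched in a few lines of Python (variable names are just for illustration):

```python
from itertools import combinations

algorithms = ["A", "B", "C", "D"]

# All unordered pairs: C(4, 2) = 6 comparisons
pairs = list(combinations(algorithms, 2))
n = len(pairs)

alpha = 0.05
per_test_alpha = alpha / n  # Bonferroni-corrected per-test threshold

print(pairs)           # [('A', 'B'), ('A', 'C'), ('A', 'D'), ('B', 'C'), ('B', 'D'), ('C', 'D')]
print(n)               # 6
print(per_test_alpha)  # ≈ 0.00833
```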
 
#5
obh said:
> For example, if you compare 4 algorithms A, B, C, D and run all pairwise tests (A-B, A-C, A-D, B-C, B-D, C-D), you run 6 comparisons, so n = 6.
Thanks again for your help.

I run A with B, A with C, and A with D for dataset 1.
Then A with B, A with C, and A with D for dataset 2, and similarly for all 4 datasets. So for a particular algorithm A, I will have 3 * 4 = 12 experiments, meaning n will be 12, right?
At first I thought n would be 3, because for each dataset A is compared with 3 algorithms.
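The counting described above, written out in Python (labels are hypothetical):

```python
baseline = "A"
rivals = ["B", "C", "D"]
datasets = ["dataset1", "dataset2", "dataset3", "dataset4"]

# One test per rival algorithm per dataset
tests = [(baseline, rival, ds) for ds in datasets for rival in rivals]
n = len(tests)
print(n)  # 3 rivals * 4 datasets = 12
```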
 

obh

Active Member
#6
Correct.

What test do you run for each pair?

If, for example, you use a significance level of 0.05 in each test,
then in each individual test (pair) the allowed probability of a type I error is 0.05, but the overall probability of at least one type I error across all the tests is larger.
The probability of not getting a type I error in one test is (1 - 0.05) = 0.95.
The probability of not getting a type I error in any of the 12 tests is 0.95^12 (the worst case, where H0 is true in all the tests).
So the overall type I error probability is α' = 1 - (1 - 0.05)^12 = 0.4596.

If you use α=1−(1−0.05)^(1/12)=0.004265
The overall α'=0.05

Bonferroni was a bit lazy (just joking) and instead of the exact calculation used the approximation 0.05/12 = 0.004167.
He didn't have a computer, so the approximation is much easier to calculate by hand.
This value is a bit smaller than the one I calculated.

But even though "Bonferroni" is generally what people call the method, I would still use the accurate calculation.
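The three numbers above can be reproduced directly (a minimal sketch in Python):

```python
alpha = 0.05
n = 12

# Family-wise error rate if every test uses alpha = 0.05
fwer = 1 - (1 - alpha) ** n          # ≈ 0.4596

# Sidak: exact per-test level that keeps the overall rate at 0.05
sidak = 1 - (1 - alpha) ** (1 / n)   # ≈ 0.004265

# Bonferroni: the simple approximation, slightly stricter than Sidak
bonferroni = alpha / n               # ≈ 0.004167

print(fwer, sidak, bonferroni)
```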
http://www.statskingdom.com/doc_anova.html#sidak