# Conditional logistic regression, multiple testing.

#### Nisse

##### New Member
Hi everyone!

I have 10 exposure variables that I would like to analyze univariately. I've fitted a conditional logistic regression model to each exposure separately in R, then applied the summary function to each model to get the test statistic, confidence interval and p-value. The summary function prints three test statistics and three corresponding p-values for each exposure. The output for 2 of the 10 exposures looks like this:

Call:
coxph(formula = Surv(rep(1, 136L), case) ~ exposure1 + strata(nam),
method = "exact")

n= 120, number of events= 33
(6 observations deleted due to missingness)

coef exp(coef) se(coef) z Pr(>|z|)
-0.1238 0.772 0.4223 -0.281 0.7681

exp(coef) exp(-coef) lower .95 upper .95
0.772 1.126 0.4781 2.432

Rsquare= 0.001 (max possible= 0.496 )
Likelihood ratio test= 0.08 on 1 df, p=0.7684
Wald test = 0.08 on 1 df, p=0.7681
Score (logrank) test = 0.08 on 1 df, p=0.7843

Call:
coxph(formula = Surv(rep(1, 136L), case) ~ exposure2 + strata(nam),
method = "exact")

n= 119, number of events= 31
(7 observations deleted due to missingness)

coef exp(coef) se(coef) z Pr(>|z|)
0.1356 1.2461 0.3897 0.327 0.842

exp(coef) exp(-coef) lower .95 upper .95
1.2461 0.8025 0.6293 2.538

Rsquare= 0.001 (max possible= 0.481 )
Likelihood ratio test= 0.11 on 1 df, p=0.7442
Wald test = 0.11 on 1 df, p=0.8421
Score (logrank) test = 0.11 on 1 df, p=0.7532
...

And so on...
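The workflow described above, one conditional logistic model per exposure with the Wald p-value pulled from summary(), can be sketched in R roughly as follows. The data here are simulated stand-ins; the variable names (`nam`, `case`, `exposure1`..`exposure10`) follow the thread, but the values and the matched-set structure are assumptions for illustration:

```r
library(survival)

set.seed(1)
# Simulated stand-in for the real data: 40 matched sets of 3 subjects,
# one case per set, and 10 made-up continuous exposures.
dat <- data.frame(
  nam  = rep(1:40, each = 3),
  case = rep(c(1, 0, 0), times = 40)
)
for (v in paste0("exposure", 1:10)) dat[[v]] <- rnorm(nrow(dat))

# Fit one conditional logistic model (coxph trick) per exposure and pull
# out the coefficient and its Wald p-value from the summary object.
fit_one <- function(var, dat) {
  f   <- as.formula(paste0("Surv(rep(1, nrow(dat)), case) ~ ", var,
                           " + strata(nam)"))
  fit <- coxph(f, data = dat, method = "exact")
  s   <- summary(fit)
  c(coef = s$coefficients[1, "coef"],
    p    = s$coefficients[1, "Pr(>|z|)"])
}

results <- t(sapply(paste0("exposure", 1:10), fit_one, dat = dat))
```

`results` is then a 10-row matrix (one row per exposure) that makes the comparison across exposures straightforward.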

I want to compare the p-values (and other values) across the 10 exposure variables. My question is: do I need to adjust the p-values for multiple testing?

I don't fully understand when to use a multiple-testing correction. I've read that one should use it when several tests are conducted, to control the type I error rate. Should I use it in this case?

I would very much appreciate any help or guidance.

#### hlsmith

##### Less is more. Stay pure. Stay poor.
I wouldn't typically use it here, as long as you had a reason to examine all of the exposures. If it was more of a fishing expedition and you did not have reason to suspect them, then a correction may be warranted. I typically use a correction when I am examining different subgroups within a group: tall versus average, tall versus short, average versus short.

You are just planning to compare them superficially. You can't draw too many conclusions, since you did not test them jointly (conditioning on one another), so you don't know their mutual relationship with the outcome. Also, with only 33 events you would have sparse data in some subgroups, and your model may not be adequately powered for that. Good job using exact methods with your small sample, though. Lastly, we can see from the model coefficients and likelihood-ratio tests that these variables are not that informative. Keep in mind, too, that you may have a different amount of missing data in each model (the second model has one fewer person, but two fewer events?).

#### Nisse

##### New Member

Yes, I didn't think so either. But let's say it was more of a fishing expedition and I wanted to correct with Bonferroni. How would I do it?
I'm a beginner with multiple testing, and my literature only mentions the Bonferroni correction briefly. It only says that one should correct when several tests are conducted: if, for example, one performs 100 hypothesis tests, then each adjusted p-value would be (p-value × 100).
In this case the summary function prints 3 test statistics for every exposure; does that mean I should multiply every p-value by 3?
Maybe I'm wrong, but like I said, I'm a beginner.

Yes, I noticed it too; I think something went wrong in R when I printed it.


#### hlsmith

##### Less is more. Stay pure. Stay poor.
I am not familiar with this output, but you should only have one p-value per variable. So yes, you would multiply each p-value by the number of tested variables. Some of these would then exceed 1, which is beyond the bounds of a p-value, so you would report them as >0.9999 (or cap them at 1). Also, if you opt to correct the p-values, the confidence intervals above would no longer match, since they are unadjusted.

Perhaps there is an adjustment option in the program that you can use.
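R does in fact ship such an option: `p.adjust` in the stats package applies Bonferroni (and other) corrections to a vector of p-values. A minimal sketch, using the two Wald p-values from the output above plus eight placeholder values standing in for the remaining models:

```r
# One Wald p-value per exposure model. The first two come from the output
# quoted earlier in the thread; the other eight are made-up placeholders.
p_wald <- c(0.7681, 0.8421, 0.51, 0.12, 0.33, 0.05, 0.91, 0.27, 0.64, 0.08)

# Bonferroni: multiply each p-value by the number of tests (here 10);
# p.adjust automatically caps the result at 1.
p_bonf <- p.adjust(p_wald, method = "bonferroni")

# Holm's step-down method is a less conservative alternative that still
# controls the family-wise error rate.
p_holm <- p.adjust(p_wald, method = "holm")
```

Note that the number of tests is the number of hypotheses (here 10 exposures), not the number of statistics printed per model.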

#### Nisse

##### New Member
"Multiply the p-value by the number of tested variables", thats the thing i don't really understand. In each model there is only one exposure variable tested. I run a simple logistic regression model for each of the 10 exposure variables univariate, separately.
So i get 10 models total with only one exposure variable in each model. But i get 3 tests with 3 corresponding p-values for each model/variable;

Likelihood ratio test= 0.08 on 1 df, p=0.7684
Wald test = 0.08 on 1 df, p=0.7681
Score (logrank) test = 0.08 on 1 df, p=0.7843

So I get 10 of these in total. I'm interested in the Wald-test p-value (the middle one) for every exposure variable. So should I multiply the 10 Wald-test p-values by 3 or by 10?

#### hlsmith

##### Less is more. Stay pure. Stay poor.
These three p-values are all tests of overall model fit. The traditional measure of interest is the coefficient-level one:

z Pr(>|z|) = 0.842.
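With a single predictor these coincide: the overall Wald test and the per-coefficient Pr(>|z|) are the same test, which is why they match in each of the 10 models. A quick check in R, on simulated matched data (names and values are assumptions for illustration):

```r
library(survival)

set.seed(2)
# Simulated 1:2 matched data: 40 sets ('nam'), one case per set.
dat <- data.frame(nam       = rep(1:40, each = 3),
                  case      = rep(c(1, 0, 0), times = 40),
                  exposure1 = rnorm(120))

fit <- coxph(Surv(rep(1, 120), case) ~ exposure1 + strata(nam),
             data = dat, method = "exact")
s <- summary(fit)

# Per-coefficient Wald p-value from the coefficient table...
p_coef <- unname(s$coefficients[1, "Pr(>|z|)"])
# ...and the overall Wald-test p-value for the model.
p_wald <- unname(s$waldtest["pvalue"])
```

So multiplying by 3 would be triple-counting the same hypothesis; the relevant count is the 10 separate exposures.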

#### Nisse

##### New Member
Yes, that is true. The p-value from the Wald test always has the same value as Pr(>|z|), at least in all of my 10 models.
But should I multiply every p-value (Pr(>|z|)) by 3 or by 10?
In a solution to an identical exercise they present p-values for every exposure and also the adjusted p-values.
Unfortunately, I don't have their data or code, so I can't check how they calculated it.
I would not have thought that I needed to adjust the p-values if I had not seen the solution to this other exercise. I can't find similar examples.


#### hlsmith

##### Less is more. Stay pure. Stay poor.
The whole correction is optional, given a person's situation. If you are confused, just don't correct, but state explicitly in your results that you did not correct.

#### hlsmith

##### Less is more. Stay pure. Stay poor.
I just wanted to go back to the fishing-expedition scenario. Say you find an association between two variables: that association may be significant just by chance, or there could also be an unknown confounder. In the latter case, if we do not condition on the confounder, we will have a biased interpretation of the variable of interest. These are two reasons why examining variables without a rationale can produce potentially spurious results.