Adjust for which confounders

#1
Hello!

I have done a logistic regression to compare pregnancy rates amongst various ordinal categories of a specific blood hormone measurement. From the previous literature, I know that there are at least 5 potential confounding factors and I have data for all of these...

When I look at these confounding variables stratified by each hormone level category, 4 out of 5 are distributed differently amongst groups...so, I'm not sure if I should adjust for all 5 potential confounders or only for the 4 which are different?


Thanks a lot!
 

hlsmith

Less is more. Stay pure. Stay poor.
#2
What do the associated p-values look like for the interaction terms? So you are saying one of the potential confounders does not seem to have an impact in your full model?
 
#3
Thank you for your response!

From your question I noticed that I wasn't clear in my explanation, sorry for that. Let me clarify:

- Before applying the logistic regression, I ran a Kruskal-Wallis test plus post-hoc analysis to see whether any of the 5 potential confounding variables were heterogeneously distributed amongst the ordinal intervals of the hormone blood level. Four out of 5 of these potential confounding variables were indeed unequally distributed amongst the groups;

- Afterwards, I performed a logistic regression to compare pregnancy rates (the binary outcome) according to the ordinal interval of hormone blood values. I used the highest interval as the reference category, since the objective of my study was to see if there is a "too low" hormonal threshold that can be as detrimental to pregnancy as "too high" levels.

- I have tried the logistic regression not controlling for confounders, controlling for all potential confounders, and controlling only for the 4 unequally distributed confounders. My results are always the same (which is nice for my thesis :)): the first and last hormonal levels are equally detrimental to pregnancy achievement (p=0.20). The p-values of the intermediate levels, compared to the last interval, range from 0.001 to 0.04.


My problem is more of a theoretical one. When I was taught statistics, I learnt that confounders are looked for during the statistical analysis and adjusted for when found. In the medical field, though, it is current practice to adjust "ab initio" for potential confounders which have a physiological rationale behind them, regardless of whether they are homogeneously distributed amongst the groups being compared. The researchers I work with have conflicting opinions on what is better: (a) to adjust for all potential confounders, which might cause loss of significance due to over-stratification, or (b) to adjust only for confounders which are unequally distributed across the groups being compared. I was curious to see what your input would be.
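One way to make the (a) versus (b) question concrete is to compare a crude odds ratio with one adjusted for the covariate in question and see how far the estimate moves. The sketch below is only an illustration: it swaps the logistic model for a Mantel-Haenszel stratified odds ratio so that it runs with the standard library alone, and the counts are hypothetical (they are not the study data). They were chosen so that the stratifying factor, number of embryos, has the same 18% share in both progesterone groups.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for one 2x2 table.

    a, b = pregnant / not pregnant in the exposed group
    c, d = pregnant / not pregnant in the reference group
    """
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel common odds ratio across a list of 2x2 tables."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den


# HYPOTHETICAL counts, one tuple (a, b, c, d) per stratum.
STRATA = [
    (30, 70, 15, 85),  # one embryo transferred
    (10, 12, 6, 16),   # two embryos transferred (22 of 122 per group, about 18%)
]

# Collapsing over the strata gives the unadjusted (crude) table.
pooled = tuple(sum(column) for column in zip(*STRATA))

crude_or = odds_ratio(*pooled)
adjusted_or = mantel_haenszel_or(STRATA)
relative_change = abs(adjusted_or - crude_or) / crude_or

print(f"crude OR    = {crude_or:.3f}")
print(f"adjusted OR = {adjusted_or:.3f}")
print(f"relative change = {relative_change:.1%}")
```

With real data the same comparison can be made between the logistic models with and without the covariate; how large a change counts as meaningful is a judgment call that this sketch does not settle.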


Thank you a lot!
 

hlsmith

#4
A little more clarification: these potentially confounding variables are ordinal? And are their Type I and Type III effects coming up significant?

Might help to actually post your results (de-identified).

Also, are they still potential confounders, but just underpowered in your study?
 
#5
Here goes:

I'm comparing pregnancy rates (binomial variable: pregnant versus non-pregnant) amongst 6 ordinal, regularly spaced levels of progesterone prior to ovulation (levels 1 to 6). Previous studies have shown that age (which I divided into ordinal groups of 3-year intervals: <25, 25-28, 28-31, 31-34 and over 34), the concentration of 2 other hormones (FSH and Estradiol, both of which I divided into ordinal groups of equal width), the stage of the embryo (day 3 or day 5 of development) and the number of embryos (1 or 2, which, although numeric, I treated as non-continuous and ordinal) also affect pregnancy outcome (so they act as confounders).

When I evaluated the confounders across the 6 levels of progesterone, age (continuous variable evaluated using Kruskal-Wallis), levels of FSH (continuous variable evaluated using Kruskal-Wallis), levels of Estradiol (continuous variable evaluated using Kruskal-Wallis) and the proportion of day 5 embryos (Pearson chi2 on the dichotomous day 3 vs day 5 variable) were different amongst the 6 levels of progesterone. On the other hand, the proportion of patients with 2 embryos was not (Pearson chi2; only a minority, about 18%, of pregnancies involved more than one baby. This is in an IVF clinic, so there are a lot of twins!).

This is what I get if I control for all variables:

Progesterone | Odds Ratio   chi2   P>chi2   [95% Conf. Interval]
-------------+--------------------------------------------------
     level 1 |   2.985859   1.68   0.1949   0.525242   16.973782
     level 2 |   2.300867   4.13   0.0420   1.006566    5.259458
     level 3 |   2.779715   6.48   0.0109   1.222276    6.321657
     level 4 |   2.506451   5.30   0.0213   1.115158    5.633550
     level 5 |   4.046512   6.61   0.0102   1.275204   12.840501
     level 6 |   1.000000      .        .          .           .
----------------------------------------------------------------
Score test for trend of odds: chi2(1) = 2.28
                              Pr>chi2 = 0.1311


Hence, the difference from level 6 is significant for every level except level 1, although the lower confidence limit for level 2 is very close to 1.

If I don't control for the number of embryos, the conclusions are the same, but the odds ratios and confidence intervals look better, e.g. the lower limit for level 2 moves from 1.01 to 1.23 (my theory is that since twins are a "rare" event of 18%, the over-stratification "ruins" everything).




Progesterone | Odds Ratio   chi2   P>chi2   [95% Conf. Interval]
-------------+--------------------------------------------------
     level 1 |   4.139169   2.85   0.0916   0.689299   24.855299
     level 2 |   2.753272   6.64   0.0100   1.232656    6.149733
     level 3 |   2.688303   6.80   0.0091   1.239542    5.830358
     level 4 |   2.948541   7.59   0.0059   1.315549    6.608568
     level 5 |   4.279392   6.75   0.0094   1.294115   14.151133
     level 6 |   1.000000      .        .          .           .
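To put the two tables side by side level by level, the sketch below takes the odds ratios and 95% limits exactly as posted and computes, for each level, the relative width of the interval (upper limit divided by lower limit) and the standard error it implies on the log-odds scale. It assumes the limits are symmetric around the odds ratio on the log scale, which the posted numbers appear consistent with (the geometric mean of each pair of limits reproduces the odds ratio); the names `FULL`, `REDUCED` and `summarize` are made up for this illustration.

```python
import math

# (odds ratio, lower limit, upper limit) copied from the two tables above.
# Level 6 is the reference category and is omitted.
FULL = {  # adjusted for all five covariates
    1: (2.985859, 0.525242, 16.973782),
    2: (2.300867, 1.006566, 5.259458),
    3: (2.779715, 1.222276, 6.321657),
    4: (2.506451, 1.115158, 5.633550),
    5: (4.046512, 1.275204, 12.840501),
}
REDUCED = {  # number of embryos left out of the model
    1: (4.139169, 0.689299, 24.855299),
    2: (2.753272, 1.232656, 6.149733),
    3: (2.688303, 1.239542, 5.830358),
    4: (2.948541, 1.315549, 6.608568),
    5: (4.279392, 1.294115, 14.151133),
}

Z95 = 1.959964  # 97.5th percentile of the standard normal distribution


def summarize(odds_ratio, lower, upper):
    """Describe one interval; assumes it is symmetric on the log scale."""
    return {
        "odds_ratio": odds_ratio,
        "rel_width": upper / lower,
        "se_log": (math.log(upper) - math.log(lower)) / (2 * Z95),
        # Should be close to odds_ratio if the assumption holds.
        "geo_mean": math.sqrt(lower * upper),
    }


rows = {
    level: (summarize(*FULL[level]), summarize(*REDUCED[level]))
    for level in sorted(FULL)
}

for level, (full, reduced) in rows.items():
    print(
        f"level {level}: "
        f"width full={full['rel_width']:.2f} reduced={reduced['rel_width']:.2f}  "
        f"SE(log OR) full={full['se_log']:.3f} reduced={reduced['se_log']:.3f}"
    )
```

A smaller relative width means a more precise estimate for that level, so the printout shows where dropping the covariate actually tightens the interval and where it does not.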


My objective is to show that low progesterone (level 1) is as detrimental as high progesterone (level 6 - the reference group).

Now, the question is: since, in my sample, the proportion of twins is "only" 18% and homogeneous amongst groups, can I leave it out of the model, even though I know it is a potential confounder?


Again, thanks a lot! Let me know if I wasn't clear.