Also, I meant for the title to say "Test for linear trend in odds ratio in logistic regression"
I often see epidemiological studies where a continuous variable has been divided into quantiles and the odds ratio for each quantile, compared with a reference quantile, is reported. In many of these studies the authors report a test for linear trend in the odds ratios, often termed a "P-trend." While most studies don't indicate what methods they used to determine this value, I've seen some studies create a new variable that is the median value of each quantile and use this variable in the regression to determine the "P-trend." For example, the five-level categorical variable would be (5 4 3 2 1) and the new variable would be (56.1 45.6 23.3 12.6 7.5). I haven't found any papers describing this method, so I'm not convinced it's appropriate, but I wanted to see if anyone else had come across this or could offer any input.
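For what it's worth, here is a minimal sketch of the median-score construction described above, using the example numbers from the post (the subject categories are made up for illustration):

```python
# Sketch of the median-score approach: each quintile category is replaced
# by the median of the continuous variable within that quintile, and the
# resulting numeric variable is entered into the logistic regression in
# place of the category dummies. Medians taken from the example above.
quintile_medians = {5: 56.1, 4: 45.6, 3: 23.3, 2: 12.6, 1: 7.5}

def median_score(quintile):
    """Map a quintile category (1-5) to its within-quintile median."""
    return quintile_medians[quintile]

# Hypothetical categorical codes for a few subjects -> numeric trend variable
categories = [1, 3, 5, 2]
trend_var = [median_score(q) for q in categories]
print(trend_var)  # [7.5, 23.3, 56.1, 12.6]
```

The P-trend is then read off the coefficient of this single numeric variable in the fitted model.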
Good question. I commonly see this as well and had wondered what test they were running.
I know you can plot these, as well as splines, but I don't know what they use for the test. A workaround I have used is pairwise comparisons or a trend test in an effect statement for a categorical variable, but neither is likely what they are doing.
I've also read about comparing the deviance between the model with the variable as a categorical predictor and the model with the variable as a continuous predictor, but I've yet to use this method. I find it irritating that authors don't describe their methods in detail, which seems fairly common.
Saw this last night when I was researching your other question on interaction of binary and 4 category variables.
"The problem with leaving it as categorical is that you'll get a test with 4df that will have less power. In epidemiology it's common to perform a 'test for trend' by treating an ordinal covariate as continuous, but then to report the estimates and CIs for the model with it categorical." – onestop Jan 16 '11 at 14:07
Located halfway down the page at: http://stats.stackexchange.com/quest...gorical-factor
Not completely sure what procedures they are referencing in particular.
Thanks for the link. It sounds similar to the test in "Statistics for Epidemiology" by Nicholas Jewell, in which you run the logistic regression model but exclude the predictor from the class statement and use the P-value for the predictor as a test for linear trend. I'm not quite sure how exactly that corresponds to testing for a linear trend; Jewell notes on page 227 that it "almost exactly replicate[s] the (linear) test for trend Chi-square statistic."
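For anyone not using SAS, here is a rough Python sketch of that idea: enter the ordinal predictor as a plain numeric variable (i.e. not as a class/categorical term) and read off the Wald P-value for its coefficient. The data and effect sizes below are simulated purely for illustration, and the small Newton-Raphson routine is just a stand-in for whatever logistic regression procedure you normally use:

```python
# Wald "test for trend": fit logit(p) = b0 + b1*level with level entered
# as a numeric score, then test b1 = 0. Simulated data for illustration.
import math
import numpy as np

def fit_logistic(X, y, n_iter=50):
    """Newton-Raphson fit; returns coefficients and their standard errors."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        H = X.T @ (X * (p * (1 - p))[:, None])   # Fisher information
        beta = beta + np.linalg.solve(H, X.T @ (y - p))
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    cov = np.linalg.inv(X.T @ (X * (p * (1 - p))[:, None]))
    return beta, np.sqrt(np.diag(cov))

rng = np.random.default_rng(1)
level = rng.integers(1, 6, size=2000).astype(float)  # ordinal exposure 1..5
logit = -1.5 + 0.4 * level                           # true effect is linear
y = (rng.random(2000) < 1 / (1 + np.exp(-logit))).astype(float)

X = np.column_stack([np.ones_like(level), level])
beta, se = fit_logistic(X, y)
z = beta[1] / se[1]                                  # Wald statistic
p_trend = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
print(round(beta[1], 2), p_trend < 0.05)
```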
Will check that reference out.
Apparently the equivalent Chi-squared trend test he refers to is on page 166. In plain-text notation it is

X^2 = n * [ n*sum(x_k*d_k) - D*sum(x_k*n_k) ]^2 / { D*(n - D) * [ n*sum(x_k^2*n_k) - (sum(x_k*n_k))^2 ] }

where n_k is the number of observations (the row total) at exposure level k, d_k is the number of diseased at the k-th level of exposure, x_k is the score for the k-th level, n - D is the number of non-diseased, D the number of diseased, and n the sample size. It has 1 degree of freedom.
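That statistic is simple enough to compute directly from a 2 x K table. A plain-Python sketch (the example counts below are made up):

```python
# Chi-squared (1 df) test for trend from a 2 x K table, in the form above.
def trend_chi2(d, n_k, x):
    """d   -- diseased count at each exposure level
    n_k -- total observations (row total) at each exposure level
    x   -- numeric score for each exposure level
    """
    n = sum(n_k)   # overall sample size
    D = sum(d)     # total number of diseased
    sum_xd = sum(xi * di for xi, di in zip(x, d))
    sum_xn = sum(xi * ni for xi, ni in zip(x, n_k))
    sum_x2n = sum(xi ** 2 * ni for xi, ni in zip(x, n_k))
    num = n * (n * sum_xd - D * sum_xn) ** 2
    den = D * (n - D) * (n * sum_x2n - sum_xn ** 2)
    return num / den

# With only two exposure levels this reduces to the ordinary Pearson
# chi-square for the 2 x 2 table:
print(round(trend_chi2([10, 20], [50, 50], [0, 1]), 4))  # 4.7619
```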
Last edited by Disvengeance; 08-15-2014 at 01:22 PM.
I am not really sure what the original poster asked about, but to test that linear relation I would use a likelihood ratio test.
If a model has, for example, an income variable with 5 levels, then a constrained model could be:
log(p/(1-p)) = b0 + b1*income
that is, include income as a "regression variable". The logit would then be constrained to be linear in income. An alternative, unconstrained model would be
log(p/(1-p)) = b0 + b1*D1 + b2*D2 + b3*D3 + b4*D4
where the D's are dummies for the income categories. In this model the separate b's allow the effect to deviate from a straight line. So a likelihood ratio test could be:
-2*[logL(constrained) - logL(unconstrained)] is chi-squared with degrees of freedom equal to the difference in the number of estimated parameters, where logL means the log-likelihood of the respective model.
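To make that concrete, here is a sketch on simulated data (the Newton-Raphson helper and the effect sizes are my own illustration; in practice you would use a packaged logistic regression routine):

```python
# Likelihood ratio test: income entered as a single numeric score
# (constrained) versus four dummies for the five levels (unconstrained).
import numpy as np

def fit_logistic(X, y, n_iter=50):
    """Newton-Raphson fit; returns (coefficients, maximised log-likelihood)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        H = X.T @ (X * (p * (1 - p))[:, None])   # Fisher information
        beta = beta + np.linalg.solve(H, X.T @ (y - p))
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    return beta, float(np.sum(y * np.log(p) + (1 - y) * np.log(1 - p)))

rng = np.random.default_rng(0)
income = rng.integers(1, 6, size=2000)               # income levels 1..5
logit = -2.0 + 0.5 * income                          # true effect is linear
y = (rng.random(2000) < 1 / (1 + np.exp(-logit))).astype(float)

ones = np.ones(2000)
X_lin = np.column_stack([ones, income.astype(float)])              # constrained
dummies = np.column_stack([(income == k).astype(float) for k in range(2, 6)])
X_cat = np.column_stack([ones, dummies])                           # unconstrained

_, ll_lin = fit_logistic(X_lin, y)
_, ll_cat = fit_logistic(X_cat, y)
lr = -2 * (ll_lin - ll_cat)   # chi-squared with 5 - 2 = 3 df under linearity
print(round(lr, 2))           # small values mean linearity is not rejected
```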
I think it is reasonable to check whether there is a linear relation in the restricted model rather than just assume it. But it seems, from the links hlsmith provided, that this can cause some irritation.
Maybe the formula Disvengeance gave above is a special case of the likelihood ratio test; I haven't worked through that.
I have read a little about the likelihood ratio test for a linear trend, though I don't see it used that often, or authors are just not explicitly stating it. Is there a way to use such a test to determine whether there is a linear relationship within levels of another variable in the regression model? In the example you provided, is it possible to use the likelihood ratio test to determine whether there is a linear trend in income in both males and females if there is an indicator variable for sex in the model? Another thing: why not just use the test when the variable is included in the model as a categorical predictor?
Yeah, I know this is an old post, but the gender example in post #10 seems like it could be examined as an interaction. So perhaps if the variable was deemed linear, you could follow up with another -2LL test adding the interaction term between the linear term (not in a class statement) and gender.
If the chi-sq is significant then adding the interaction term helped.
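A sketch of that interaction idea (simulated data and hypothetical effect sizes; the small fitting routine is just a stand-in for a packaged logistic regression):

```python
# Compare a main-effects model (linear income + sex) against the same
# model plus an income*sex interaction, via a -2 log-likelihood test (1 df).
import numpy as np

def fit_loglik(X, y, n_iter=50):
    """Newton-Raphson logistic fit; returns the maximised log-likelihood."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        H = X.T @ (X * (p * (1 - p))[:, None])
        beta = beta + np.linalg.solve(H, X.T @ (y - p))
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    return float(np.sum(y * np.log(p) + (1 - y) * np.log(1 - p)))

rng = np.random.default_rng(2)
income = rng.integers(1, 6, size=2000).astype(float)  # levels 1..5
sex = rng.integers(0, 2, size=2000).astype(float)     # 0/1 indicator
logit = -2.0 + 0.3 * income + 0.5 * sex + 0.4 * income * sex
y = (rng.random(2000) < 1 / (1 + np.exp(-logit))).astype(float)

ones = np.ones_like(income)
X_main = np.column_stack([ones, income, sex])
X_int = np.column_stack([ones, income, sex, income * sex])

lr = -2 * (fit_loglik(X_main, y) - fit_loglik(X_int, y))  # chi-squared, 1 df
print(round(lr, 2))  # compare against 3.84 for significance at the 5% level
```

A significant result says the slope of the income trend differs by sex, which is one way to get at the within-sex trend question in post #10.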
I have an idea how else to do this with coding, but need to look something up.