Graphically check linearity in logit regression

What is the link test in SAS, and what do you mean by RHS?
The basic idea of a linktest is that if your model is properly specified, then regressing your dependent variable on the model's prediction and its square should show that the squared term has no explanatory power.
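
To make the idea concrete, here is a rough sketch of a link test in plain Python. Everything in it is an assumption for illustration: the data are simulated, the helper names (`sigmoid`, `solve`, `info`, `fit_logit`) are made up, and the hand-rolled Newton-Raphson fit only stands in for whatever your package does. The fitted model deliberately leaves out a squared term, so the squared prediction should pick it up.

```python
import math
import random

def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def info(X, y, beta):
    """Score vector and information matrix of the logistic log-likelihood."""
    p = len(beta)
    grad = [0.0] * p
    H = [[0.0] * p for _ in range(p)]
    for row, yi in zip(X, y):
        mu = sigmoid(sum(b * v for b, v in zip(beta, row)))
        for j in range(p):
            grad[j] += (yi - mu) * row[j]
            for k in range(p):
                H[j][k] += mu * (1.0 - mu) * row[j] * row[k]
    return grad, H

def fit_logit(X, y, iters=25):
    """Newton-Raphson logistic fit; returns (coefficients, standard errors)."""
    beta = [0.0] * len(X[0])
    for _ in range(iters):
        grad, H = info(X, y, beta)
        beta = [b + s for b, s in zip(beta, solve(H, grad))]
    _, H = info(X, y, beta)
    se = []
    for j in range(len(beta)):
        unit = [1.0 if k == j else 0.0 for k in range(len(beta))]
        se.append(math.sqrt(solve(H, unit)[j]))  # j-th diagonal of inverse information
    return beta, se

# Simulated data (assumption): the true logit is quadratic in x.
random.seed(1)
x = [random.uniform(-2, 2) for _ in range(2000)]
y = [1 if random.random() < sigmoid(-1 + xi + xi ** 2) else 0 for xi in x]

# Step 1: fit the (misspecified) model with x only and keep its linear prediction.
beta, _ = fit_logit([[1.0, xi] for xi in x], y)
hat = [beta[0] + beta[1] * xi for xi in x]

# Step 2: regress the outcome on the prediction and its square.
link_beta, link_se = fit_logit([[1.0, h, h * h] for h in hat], y)
z_hatsq = link_beta[2] / link_se[2]
print("z statistic for squared prediction:", round(z_hatsq, 2))
```

A large |z| on the squared prediction is the sign of a specification problem. Note that the test does not say which right-hand-side variable is responsible.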


I am using SAS for my logistic regression analysis and don't know the code for the boxtid test. I already did the link test and it shows that I have a specification error in my model, but I do not know which RHS variable has the issue or which variables I should interact. Any help?

A bit out of context given that, as I understand, this is the subforum for Stata and a thread about graphing linear relationships :p Anyway, the best thing you can do is to follow the theory on your subject matter.
Hi all, I am fitting a random-effects logistic regression and want to check the linearity assumption. Is it possible to perform the Box-Tidwell test on independent variables that are already log-transformed? I originally log-transformed my independent variables because they had skewed distributions. Thank you all in advance.


Well, I took some time to investigate this topic last night, and I found it to be very interesting.

A proposed method to graphically check the linear relationship between the expected values (log odds, AKA logit) and a continuous independent covariate is the following:

-Fit a loess model of the binary outcome on the continuous predictor to get smoothed probabilities. So instead of a logistic model, use a loess model.

-Convert the smoothed probabilities into log odds (i.e., log[p/(1-p)]).

-Now plot the log odds on the y-axis against the raw values of the continuous independent variable on the x-axis.

-Examine the plot for a linear relationship. You can adjust the smoothing parameter used in the loess model in the earlier step to clean up the shape of the plotted line.

-If there is no breach of the linearity assumption, run the usual logistic model with the original terms.
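
The steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the data are simulated with a truly linear logit, and `loess_fit` is a hand-rolled local linear smoother with tricube weights that stands in for a real loess routine (for example PROC LOESS in SAS or lowess in Stata). The names `tricube`, `loess_fit`, `logit`, and `frac` are hypothetical.

```python
import math
import random

def tricube(u):
    """Tricube kernel weight for a scaled distance u."""
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0

def loess_fit(x, y, x0, frac=0.3):
    """Local linear (loess-style) estimate of P(y = 1 | x = x0)."""
    k = max(2, int(math.ceil(frac * len(x))))
    dists = sorted(abs(xi - x0) for xi in x)
    h = dists[k - 1] * (1 + 1e-9) + 1e-12  # bandwidth: reach of the k nearest points
    w = [tricube(abs(xi - x0) / h) for xi in x]
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx if sxx > 0 else 0.0
    return ybar + slope * (x0 - xbar)

def logit(p, eps=1e-3):
    """Log odds, with p clipped away from 0 and 1 so the log stays finite."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))

# Simulated data (assumption): true logit is 0.5 + 1.0 * x, i.e. linear in x.
random.seed(0)
x = [random.uniform(-2, 2) for _ in range(2000)]
y = [1 if random.random() < 1 / (1 + math.exp(-(0.5 + xi))) else 0 for xi in x]

# Smooth, convert to log odds, and tabulate against the raw predictor.
grid = [-1.5 + 0.25 * i for i in range(13)]
smoothed_logit = [logit(loess_fit(x, y, g)) for g in grid]
for g, l in zip(grid, smoothed_logit):
    print(f"x = {g:5.2f}   smoothed log odds = {l:6.2f}")

# Rough numeric summary only; the plot itself is the real check.
gbar = sum(grid) / len(grid)
lbar = sum(smoothed_logit) / len(smoothed_logit)
slope = (sum((g - gbar) * (l - lbar) for g, l in zip(grid, smoothed_logit))
         / sum((g - gbar) ** 2 for g in grid))
print("least-squares slope of smoothed log odds on x:", round(slope, 2))
```

Plotting `grid` against `smoothed_logit` with any plotting tool gives the picture described above. Changing `frac` plays the role of adjusting the smoother.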

Side note: the smoothed log odds can be a little wobbly at the extreme values of the continuous independent variable used in the plot, which can be due to data sparsity. You can largely disregard this wobbliness; just keep it in mind if you make generalizations (this reminded me of very low or high propensity scores). If the plot does not show a linear relationship, consider including a spline term in the logistic regression or using a generalized additive model. Another option is to transform the continuous variable, then refit and replot the relationship.

Also, if the line has a distinct change or if there is a suspected interaction, this process can be used to examine an interaction between a continuous IV and a categorical IV. I believe you could just split your dataset by the levels of the categorical variable, fit a separate loess model for each level, and compare the resulting lines in a plot. If there is an interaction, you can potentially add it to the logistic model along with the continuous or spline version of the continuous variable (if appropriate), but remember to include the main effect terms for the interaction in the model as well, as in traditional logistic regression. A final option may be to compare the -2 log L values of the candidate models, provided they are nested.
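
For the last option, comparing -2 log L between nested models is a likelihood ratio test: the drop in -2 log L is referred to a chi-square distribution with degrees of freedom equal to the number of added parameters. A minimal sketch in plain Python follows. The helper name `chi2_sf` is made up, and the two -2 log L values are hypothetical numbers for illustration only; in practice you would read them from your software's output.

```python
import math

def chi2_sf(stat, df):
    """Upper-tail chi-square probability for integer df (closed-form recurrence)."""
    half = stat / 2.0
    if df % 2 == 1:
        q, a = math.erfc(math.sqrt(half)), 0.5   # exact tail for df = 1
    else:
        q, a = math.exp(-half), 1.0              # exact tail for df = 2
    while a < df / 2.0 - 1e-9:                   # step df up by 2 each pass
        q += half ** a * math.exp(-half) / math.gamma(a + 1.0)
        a += 1.0
    return q

# Hypothetical values (assumption): -2 log L without and with the interaction term.
neg2ll_main = 512.4
neg2ll_interaction = 505.1
added_params = 1  # e.g. one interaction term for a two-level categorical variable

lr_stat = neg2ll_main - neg2ll_interaction
p_value = chi2_sf(lr_stat, added_params)
print("LR statistic:", round(lr_stat, 2), " p-value:", round(p_value, 4))
```

A small p-value suggests the larger model fits better. The comparison is only valid when the smaller model is a special case of the larger one and both are fit to the same observations.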

noetsi, I found some SAS code to do this at the following link: