Hello, and thank you for reading this post.
I am a medical doctor with limited statistical knowledge, so bear with me...
I am trying to answer the question: Does dispatching a critical care team (CCT) to out-of-hospital cardiac arrest improve survival? I have a dataset of 2000 cases and survival rates are 7.5% overall.
In reality, both 'CCT' dispatch and 'survival' are influenced by additional continuous and categorical variables:
Age
Location of arrest
Witnessed arrest
Bystander CPR
Initial cardiac rhythm
Ambulance response time
To control for these variables, we tried case-control matching but lost too much data. I had a junior colleague run a multiple logistic regression in Stata, and the results seem intuitively correct.
I have two questions regarding multiple logistic regression and interactions:
1. As I am only interested in the effects of 'CCT' on 'survival', do I need to check for and include interactions between the other variables? For example, I know that 'bystander CPR' has an effect on 'initial cardiac rhythm'. Will adding an interaction variable for 'bystander CPR' and 'initial cardiac rhythm' have an effect on the odds ratio of 'CCT' for 'survival'?
2. I guess I will have to include interactions between 'CCT' and variables I suspect to influence CCT dispatch, such as 'age' and 'location of arrest'? Do I do this one variable at a time and then check after each addition to see if the interaction actually improves the model?
Any comments are much appreciated. I hope that I am not too far off target with all of this!
Regards,
Johannes
I forgot to clarify, both 'CCT' and 'survival' are dichotomous variables.
Hello!
You do have quite a complex model here (considering the number of controls). Assuming that your primary task is to examine the impact (constant effect) of CCT on survival (likelihood of survival), you do not need to include any interactions. There always has to be theory to support interactions. Besides, if they are not of interest to you, you do not need them. If you do include interactions, then the main effects become "conditional" effects: the effect of, say, X on Y is then the effect when Z equals zero (or at the mean of Z, if Z has been mean-centered).
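To make the point about conditional effects concrete, here is a minimal Stata sketch. The variable names (survival, cct, age) are hypothetical placeholders rather than names from the actual dataset, and age 70 is an arbitrary value chosen only for illustration.

```stata
* Hypothetical variable names: survival (0/1), cct (0/1), age (years)
* The ## operator adds both main effects and their interaction
logit survival i.cct##c.age

* With the interaction in the model, the coefficient on 1.cct alone is the
* log-odds effect of CCT when age == 0, not an overall effect.
* Odds ratio for CCT at a chosen age (70 is an arbitrary example):
lincom 1.cct + 70*1.cct#c.age, or
```

If age were mean-centered before fitting, the coefficient on 1.cct would instead describe the effect of CCT at the average age.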
Also, to control for other variables, you simply include all of them in the model before the variable of interest. Say you have X1, X2, X3 as controls and X4 (CCT); then your logit model would be: logit Y X1 X2 X3 X4(CCT). If the effect of X4 is significant, then you are good to go. If not, then it is just not a significant predictor of survival (or the model is misspecified).
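As a rough sketch of what that could look like with the variables listed in the question: all variable names below are assumptions made for illustration and would need to be replaced with the names in the real dataset.

```stata
* Hypothetical variable names; rename to match the real dataset.
* i. marks categorical predictors, c. marks continuous ones.
* The /// line continuation works in a do-file, not at the command prompt.
logit survival i.location i.witnessed i.bystander_cpr i.rhythm ///
    c.age c.response_time i.cct
```

Using the i. prefix lets Stata create the indicator variables for multi-category predictors such as location or initial rhythm automatically.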
Hope this helps
P.S. Keep in mind that in Stata you can obtain either coefficients (log odds) or odds ratios. The latter would be easier for you to interpret.
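For reference, a short sketch of how the two kinds of output are usually requested. The variable names are hypothetical, and the model is shortened to keep the example small.

```stata
* Hypothetical variable names: survival (0/1), cct (0/1), age (years)

* Reports coefficients on the log-odds scale
logit survival i.cct c.age

* Same model, reported as odds ratios
logit survival i.cct c.age, or

* logistic fits the same model and reports odds ratios by default
logistic survival i.cct c.age
```

An odds ratio is the exponential of the corresponding coefficient, so both forms describe the same fitted model.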
jujufish (03-07-2015)
Thanks!
Regarding the interactions: I went ahead in the meantime and tested them anyway, as it wasn't as complicated as I thought. When I tested the model for specification error (_hatsq), it initially showed significant error. I went through the theoretically plausible interactions and identified two major ones, which moved the _hatsq p-value from p<0.05 to around 0.3. So, as far as I understand, the model with interactions is the better one (the Hosmer-Lemeshow goodness-of-fit test also shows no evidence of poor fit).
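For anyone following along, the checks described above could be sketched in Stata roughly as follows. The variable names and the particular interaction are assumptions for illustration only, not the two interactions actually identified.

```stata
* Hypothetical variable names; the interaction shown is only an example.

* Model without interactions
logit survival i.cct c.age i.bystander_cpr i.rhythm
estimates store base

* Model with one theoretically motivated interaction
logit survival i.cct c.age i.bystander_cpr##i.rhythm
estimates store inter

* Hosmer-Lemeshow goodness-of-fit test using 10 groups
estat gof, group(10)

* Link test: a significant _hatsq term suggests misspecification
linktest

* Likelihood-ratio test of the added interaction terms (nested models)
lrtest base inter
```

The likelihood-ratio test is included as one additional way to judge whether an interaction improves the model; it was not mentioned in the description above.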
Regarding controls: is the order of the controls important? You suggested logit Y X1 X2 X3 X4(CCT). Is there any difference between that and logit Y X4(CCT) X1 X2 X3?
Using ORs as output at the moment.