Noob question

#1
Why, when modeling, do some variables have significant p-values yet are not included in the model?

I was running a Cox regression model, and the table of variables not included in the model has some with p < 0.05.

Collinearity or what?
 

hlsmith

Omega Contributor
#2
Perhaps. Or perhaps the variable is immutable to intervention. Or perhaps a huge sample size is creating significance in the face of a very small effect size.
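
To illustrate the sample-size point, here is a minimal sketch using only the Python standard library. It assumes a two-sample z-test with equal group sizes and a known standard deviation of 1; the effect size of 0.05 and the sample sizes are made up for illustration, and `two_sided_p` is a hypothetical helper name.

```python
import math

def two_sided_p(effect, n_per_group, sd=1.0):
    """Two-sided p-value of a two-sample z-test (equal group sizes, known sd)."""
    se = sd * math.sqrt(2.0 / n_per_group)   # standard error of the difference in means
    z = effect / se
    # 2 * (1 - Phi(|z|)) written with the complementary error function
    return math.erfc(abs(z) / math.sqrt(2.0))

# Same tiny effect (0.05 sd), increasingly large samples (illustrative values)
for n in (50, 500, 5000, 50000):
    print(n, two_sided_p(0.05, n))
```

The effect stays the same size throughout; only the p-value shrinks as n grows, which is why a "significant" term can still be too small to be worth keeping.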
 

hlsmith

Omega Contributor
#4
Well, if a sample is small enough it may not be reflective of the population under investigation, and you might get near-complete separation (e.g., all people with an event are male, when this may be an artifact).

What is the sample, how many events, and how many covariates including dummies?
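
A quick way to spot this kind of problem is to cross-tabulate each categorical covariate against the event and look for empty cells. The sketch below is only an illustration: the data are made up, and `empty_cells` is a hypothetical helper, not part of any statistics package.

```python
from collections import Counter

def empty_cells(covariate, event):
    """Return (level, outcome) combinations that never occur in the data."""
    table = Counter(zip(covariate, event))   # counts for each cell of the cross-tab
    levels = sorted(set(covariate))
    outcomes = sorted(set(event))
    return [(lv, out) for lv in levels for out in outcomes if table[(lv, out)] == 0]

# Made-up data: every event (1) happens to be male
sex = ["M"] * 6 + ["F"] * 6
event = [1, 1, 1, 0, 0, 0] + [0] * 6

print(empty_cells(sex, event))
```

An empty cell (here, no females with an event) is a warning sign that the estimate for that covariate may be unstable in a small sample.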
 

Miner

TS Contributor
#5
It could depend on the field and application of the model. For example, in industrial statistics, statistically significant terms with small effect sizes may be omitted in order to simplify the model making it easier to apply. In a field such as that, the emphasis is not on theory, it is totally on application (i.e., does it work?).
 

hlsmith

Omega Contributor
#6
I don't believe it can create significance, but you may get some collinearity or suppression. The issue is that the model may become unstable with too many variables, or you could get overparameterization, resulting in overfitting of the model.

If you are reviewing a Cox regression within a paper, the authors should state their inclusion/exclusion criteria. I typically use a predefined level of significance (e.g., 0.05) and also add a line about clinical significance (in order to control for specific variables of interest). It all depends on their goals.
 
#7
Well, if a sample is small enough it may not be reflective of the population under investigation, and you might get near-complete separation (e.g., all people with an event are male, when this may be an artifact).

What is the sample, how many events, and how many covariates including dummies?
Sample 52, binary event, lots of covariates! I tried reducing their number, but with a sample size of 52 there was not much room for maneuvering.

Roughly how many covariates could I use without the model becoming unstable? I know it depends on many things, but can we put a rough range on it?
 

hlsmith

Omega Contributor
#8
In multiple logistic regression, the usual rule of thumb is 1 variable for every 10-20 individuals in the smaller of the two binary outcome groups. So if you had n = 52, with 15 events and 37 nonevents, you could have one predictor. If a predictor is categorical and has more than 2 groups, it is coded as dummy variables and counts as k-1 variables (with k = the number of groups, so 3 categories would = 2 variables).
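
As a rough sketch of that arithmetic (a rule of thumb only, not a guarantee of stability), the helper names `max_parameters` and `parameters_used` below are hypothetical, and the example predictors are made up:

```python
def max_parameters(n_events, n_nonevents, per_variable=10):
    """Rule-of-thumb cap: one parameter per `per_variable` subjects in the smaller outcome group."""
    return min(n_events, n_nonevents) // per_variable

def parameters_used(predictors):
    """Count model parameters; a categorical predictor with k groups costs k - 1 dummies.

    `predictors` maps a name to its number of groups, or None for a continuous predictor.
    """
    return sum(1 if k is None else k - 1 for k in predictors.values())

# The n = 52 example from the post: 15 events, 37 nonevents
print(max_parameters(15, 37, per_variable=10))   # lenient end of the 10-20 range
print(max_parameters(15, 37, per_variable=20))   # strict end of the 10-20 range

# Hypothetical predictors: one continuous, one with 3 categories
print(parameters_used({"age": None, "stage": 3}))
```

Comparing `parameters_used(...)` against `max_parameters(...)` shows quickly how little room a sample of this size leaves.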

That is for logistic. I would assume similar rules may be postulated for Cox, but I don't know them offhand. You may get just a little more wiggle room since you have time to event, but not much. If you find some general rules, I would be interested in hearing about them.