Am I introducing a selection bias in my prognostic biomarker study?

#1
Hi all,

I'd be glad to get advice from someone who is better than me at biostatistics, especially regarding selection bias in survival studies.

I have 300 patients with a specific disease, an available biopsy at study inclusion, and prospective follow-up.
I want to study whether the expression of 3 biomarkers is an independent prognostic factor for survival, i.e., independent of the already-known prognostic factors in this disease: patient age and disease stage.
I plan to fit a Cox proportional hazards model for survival with the following explanatory variables: patient age, disease stage, and my 3 biomarkers as measured in the patient's biopsy.
I would need around 50 events (10 events per variable), as I have 5 variables in the model.
Now, I can't study the whole cohort (it's too expensive); I can only study 100 patients. If I randomly select 100 patients from the cohort, I would have around 30 events, which is not enough to fit the model.
So I want to preferentially select patients with an event, that is, patients who have died. I would select 50 patients who died and 50 patients who are still alive with comparable follow-up times.
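
The arithmetic behind these numbers can be sketched in a few lines of Python. The 10-events-per-variable rule and the counts come from the post; the 30% event rate is an assumption inferred from "around 30 events" in 100 random patients, stage is counted as a single variable as in the post, and the variable names are purely illustrative.

```python
# Back-of-the-envelope check of the event counts quoted above.
n_variables = 5            # age, stage, and 3 biomarkers
events_per_variable = 10   # rule of thumb used in the post
events_needed = n_variables * events_per_variable

cohort_size = 300
subsample_size = 100
assumed_event_rate = 0.30  # assumption: ~30 events per 100 random patients

expected_events_random = round(subsample_size * assumed_event_rate)
expected_events_cohort = round(cohort_size * assumed_event_rate)
shortfall = events_needed - expected_events_random

print(f"events needed: {events_needed}")
print(f"expected events in a random sample of {subsample_size}: {expected_events_random}")
print(f"expected events in the full cohort: {expected_events_cohort}")
print(f"shortfall with random sampling: {shortfall}")
```

Under that assumed event rate, the full cohort would contain enough deaths to pick 50 from, which is what makes the proposed enrichment feasible in the first place.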

Is the methodology correct?
Am I introducing a bias in this study that would make my results irreproducible?
(If yes, I would be happy to find a solution!)

Thanks for your help!
 

hlsmith

Omega Contributor
#4
So say you randomly select from the deceased patients: could a competing event precede the event of interest during the study follow-up period?
 

hlsmith

Omega Contributor
#6
That probably affects things. I would look up how case-control designs are used in survival analysis with competing events, and see what others have done.
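
As a minimal illustration of what handling a competing event can involve, here is one common approach (a cause-specific analysis), in which a competing event ends follow-up but is not counted as the event of interest. The status codes, the example data, and the function name are hypothetical and not taken from this thread.

```python
# Hypothetical status codes (assumption, not from the thread):
# 0 = censored, 1 = event of interest, 2 = competing event
CENSORED, EVENT_OF_INTEREST, COMPETING_EVENT = 0, 1, 2

def cause_specific_indicator(status, cause=EVENT_OF_INTEREST):
    """Return 1 where the observed status equals `cause`, else 0.

    In a cause-specific analysis, every other status (including the
    competing event) is treated as censoring at the observed time.
    """
    return [1 if s == cause else 0 for s in status]

status = [1, 0, 2, 1, 2, 0]  # made-up example data
print(cause_specific_indicator(status))                   # indicator for the event of interest
print(cause_specific_indicator(status, COMPETING_EVENT))  # indicator for the competing event
```

Other approaches, such as subdistribution-hazard (Fine-Gray) models, treat follow-up after a competing event differently, so the right coding depends on the question being asked.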


My feeling is that you can analyze case-control samples with proportional hazards models and the estimates are fine, though you probably need to correct the model intercept (in a Cox model, the baseline hazard), since you are forcing a false outcome prevalence (say 50%) on the model. Your issue is that you also have to address competing events. I have not done that personally, but I believe that if there is a competing event you have to let the model know, and their follow-up time gets weighted differently.
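
To illustrate the "false prevalence" point, here is a minimal sketch of inverse-probability-of-sampling weights for the design described in the first post. It assumes about 90 deaths among the 300 patients (inferred from roughly 30 events per 100 random patients) and simple random sampling within each group, which ignores the follow-up matching of the surviving patients. All names are illustrative.

```python
# Assumed cohort composition (inferred, not stated in the thread).
cohort = {"dead": 90, "alive": 210}
sampled = {"dead": 50, "alive": 50}

# Sampling probability and inverse-probability weight per group.
sampling_prob = {g: sampled[g] / cohort[g] for g in cohort}
weights = {g: 1.0 / sampling_prob[g] for g in cohort}

# Event fraction in the enriched sample, before and after weighting.
naive_prevalence = sampled["dead"] / sum(sampled.values())
weighted_dead = sampled["dead"] * weights["dead"]
weighted_total = sum(sampled[g] * weights[g] for g in cohort)
weighted_prevalence = weighted_dead / weighted_total

print(f"weights: {weights}")
print(f"event fraction in the sample: {naive_prevalence:.2f}")
print(f"weighted event fraction: {weighted_prevalence:.2f}")
```

Weights like these could in principle be supplied to a weighted Cox fit, but whether that is appropriate here depends on how the surviving patients are actually selected.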