No comparator group

Dom

New Member
#1
Dear all, I am very new to statistics but keen to learn, as I am starting research.

I have a data set covering only those who died or were removed from a much larger group, and I have a lot of data about them (i.e. age, BMI, sex, 4 symptom types, distance from hospital, smoker/ex-smoker/never-smoker). I probably have around 80-90% of this data for the 120 or so people in this group.

I do not have the data on those who went on to have curative treatment, other than the total number, which is perhaps around 800 (900+ with the drop-out group).

I would like to describe this (drop-out) cohort in as much depth as possible. I can of course do a univariate description with simple mean, mode, IQR etc., but I wondered whether there is more I can achieve without all the data from the group that progressed through treatment, e.g. multiple regression?
Many thanks for any advice.
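
For the univariate description mentioned above, a minimal Python sketch might look like the following. The records, field names and values are made up for illustration and are not from the actual data set; `None` is assumed to mark a value that was never recorded, so each summary also reports how many values were missing.

```python
import statistics
from collections import Counter

# Hypothetical records for the drop-out group (illustrative values only).
# None marks a value that was not recorded in the retrospective notes.
patients = [
    {"age": 61, "bmi": 24.5, "smoking": "ex"},
    {"age": 70, "bmi": None, "smoking": "never"},
    {"age": 55, "bmi": 31.0, "smoking": None},
    {"age": 66, "bmi": 27.2, "smoking": "smoker"},
    {"age": 73, "bmi": 22.8, "smoking": "ex"},
]


def summarise_numeric(records, field):
    """Mean, median and quartiles of one numeric field, ignoring missing values."""
    values = [r[field] for r in records if r[field] is not None]
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {
        "n": len(values),
        "missing": len(records) - len(values),
        "mean": statistics.mean(values),
        "median": median,
        "iqr": (q1, q3),
    }


def count_categories(records, field):
    """Frequency of each recorded category in one field, ignoring missing values."""
    return Counter(r[field] for r in records if r[field] is not None)


print(summarise_numeric(patients, "age"))
print(summarise_numeric(patients, "bmi"))
print(count_categories(patients, "smoking"))
```

Reporting the number of missing values next to each summary makes it clear how much of the group each figure is based on.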
 

hlsmith

Not a robit
#2
You have data for people with a positive death status. For these patients you have basic characteristic data as well. For a subset of these people you know whether they had a curative treatment. But overall some individuals "dropped out" and you want to examine ????

I follow some of what you are doing but not the subset and why you don't have complete data. Please provide some more information.

It is probably too early to say, but if you have time to event perhaps a survival regression may come into play.
 

Dom

New Member
#3
Many thanks for the reply. Sorry, let me try to be more clear. I have, let's say, 1000 patients, of whom 100 died. For the 900 who are alive (they survived long enough to receive curative treatment), I have data on when they received the curative treatment, their sex and age, and little more. For the 100 who died, I have the dates they were put on and removed from the waiting list for curative treatment (removed because they were either too ill or had died).

Also, for the 100 (with, as you say, a positive death status) I have their age, sex, weight, BMI, categorical data for 4 common symptoms, which are binary (they have the symptom or don't have it), distance from hospital, smoking status and grip strength.

The reason, unfortunately, that I don't have all data for everyone is that this is a retrospective review, and in the past not all data points were recorded well. Annoyingly, the gaps are dotted about, so I might have more or less complete data on one subject bar smoking data, but for another I'll be missing several pieces of data.

I hope this explains a little more?

Would survival regression be another name for a Kaplan-Meier curve?
 

Dom

New Member
#5
Fundamentally I want to ask whether there is anything that contributed to the death of those who did not make it to the curative treatment, given the data I have on each individual.

I guess for that to be truly possible I would need to compare each subject, on each of the data points, against a control subject who went on to receive the curative treatment? It may be possible to gather the data on 100 of the 900 who received treatment, but how to choose those is perhaps not straightforward (and it would be time consuming).

Is there a statistical test I could apply to each of the variables to say how likely it is that an individual in the group who did not receive treatment had that factor? E.g. those who died were ? likely to have been smokers or lived ? miles from the hospital, with confidence intervals and a p value for this likelihood.
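
As one illustration of the kind of interval asked about here, a proportion within the drop-out group (say, the share who were smokers) can be given a confidence interval on its own. The sketch below uses the Wilson score interval, which is one common choice and is swapped in here as an example rather than taken from the thread; the counts are hypothetical. Note that an interval like this only describes the group itself: a p value for whether the factor is linked to dropping out would still need a comparison value or comparison group.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95% coverage)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


# Hypothetical counts: 40 smokers among the 95 people whose smoking status was recorded.
low, high = wilson_interval(40, 95)
print(f"Proportion of smokers: {40 / 95:.2f} (95% CI {low:.2f} to {high:.2f})")
```

The denominator is the number of people with that variable recorded, not the whole group, because of the missing data.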
 

hlsmith

Not a robit
#6
This question is still not completely well-defined, but perhaps look at survival analysis with your outcome being receipt of treatment and death being a competing event. Ideally you use all of your data, but random sampling can be a method to acquire a smaller and representative subsample.
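
To make the competing-event idea a little more concrete, here is a rough Python sketch of the nonparametric cumulative incidence estimate (the Aalen-Johansen form), written with the standard library only. The event coding (0 = still waiting/censored, 1 = received treatment, 2 = died or removed) and the waiting times are assumptions made up for the example, not values from the data set described in this thread.

```python
def cumulative_incidence(times, events, causes=(1, 2)):
    """Cumulative incidence curves for competing events.

    times  : time on the waiting list for each patient
    events : 0 = censored, otherwise the code of the event that ended the wait
    Returns {cause: [(time, cumulative incidence), ...]}.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    event_free = 1.0  # probability of having had no event of any kind so far
    cif = {c: 0.0 for c in causes}
    curves = {c: [] for c in causes}
    i = 0
    while i < len(data):
        t = data[i][0]
        counts = {c: 0 for c in causes}
        j = i
        # Collect everyone whose follow-up ends at exactly time t.
        while j < len(data) and data[j][0] == t:
            if data[j][1] in counts:
                counts[data[j][1]] += 1
            j += 1
        total = sum(counts.values())
        if total:
            for c in causes:
                cif[c] += event_free * counts[c] / at_risk
                curves[c].append((t, cif[c]))
            event_free *= 1 - total / at_risk
        at_risk -= j - i  # events and censored patients all leave the risk set
        i = j
    return curves


# Hypothetical waiting times in days (illustrative only).
times = [30, 45, 45, 60, 80, 90, 120, 150]
events = [1, 2, 1, 1, 0, 2, 1, 1]
curves = cumulative_incidence(times, events)
print("Treatment:", curves[1][-1])
print("Death/removal:", curves[2][-1])
```

Unlike a plain Kaplan-Meier curve for death, this does not treat patients who received treatment as if they could still die on the waiting list later.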