Disease progression analysis

#1
Hi

In a study of age-related cancer progression, I am investigating patients' ages at different cancer stages (stages 1-4). Cancer stage is defined by increasing tumor size in centimeters. The data look something like this:

Patient   Stage 1 (<1 cm)   Stage 2 (1-2 cm)   Stage 3 (2-3 cm)   Stage 4 (>3 cm)
1         35 years          37 years           43 years           44 years
2         41 years          -                  -                  -
3         -                 -                  48 years           55 years
4         30 years          -                  -                  -
5         -                 32 years           -                  -
6         25 years          -                  48 years           -
7         29 years          -                  -                  -
8         -                 36 years           41 years           45 years

As you can see above, age increases along with cancer stage/tumor size. However, for some patients the ages at earlier stages are unknown because they were diagnosed at a later stage, and some patients have not yet progressed to the later cancer stages.

I want to compare patients in stages 2-4 with patients in stage 1. Which test could I use? Am I correct to use a paired test for a nominal outcome?
 

hlsmith

Less is more. Stay pure. Stay poor.
#3
I would just start with a scatter plot of size vs age and visualize. This may influence how you move forward. Of note, these types of studies are crazy tricky given time-related biases and censoring issues (e.g., left censoring, right censoring, competing risks, lead-time bias, length bias), not to mention survival bias.
 
#4
Compare with regard to what?
What is your research question?

Dear Karabiner

I wish to compare the age at diagnosis of patients in stages 2-4 with that of patients in stage 1, at a 5% significance level (95% confidence).
My hypothesis is that age at diagnosis depends on cancer stage, and that patients in late disease stages are diagnosed later in life than patients in earlier stages.

Kind regards, Anders
 

Karabiner

TS Contributor
#5
So, patients 1, 2, 4, 6, 7 belong to group "first diagnosis at stage 1",
patients 5 and 8 belong to "first diagnosis at stage 2", and patient 3
belongs to "stage 3"?

If sample size is sufficient, you could perform t-tests for pairwise
comparisons of stage 1 with stage 2, stage 3, and stage 4, respectively.

If your supervisor demands something more sophisticated, then
maybe a one-way analysis of variance with factor "stage at first
diagnosis" (4 levels) and linear contrasts for the 3 comparisons.

Alternatively, you could perform 3 Mann-Whitney U tests.

With kind regards

Karabiner
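As a rough illustration of the Mann-Whitney option, here is a minimal sketch in plain Python. It assumes the grouping by stage at first diagnosis described in this post and uses only the eight example patients, none of whom was first diagnosed at stage 4. The function names are made up for the example, and the p-value is an exact one-sided permutation p-value, which is only practical for very small groups.

```python
from itertools import combinations

# Age at first diagnosis, grouped by stage at first diagnosis
# (patients 1, 2, 4, 6, 7 / patients 5, 8 / patient 3, from the example table).
first_dx = {1: [35, 41, 30, 25, 29], 2: [32, 36], 3: [48]}

def mann_whitney_u(x, y):
    """Count (x, y) pairs where y exceeds x; ties count one half."""
    return sum((b > a) + 0.5 * (b == a) for a in x for b in y)

def exact_p_greater(x, y):
    """Exact one-sided p-value: share of relabellings with U >= observed."""
    pooled = x + y
    observed = mann_whitney_u(x, y)
    hits = total = 0
    for idx in combinations(range(len(pooled)), len(y)):
        chosen = set(idx)
        ys = [pooled[i] for i in idx]
        xs = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        hits += mann_whitney_u(xs, ys) >= observed
    return hits / total

u_12 = mann_whitney_u(first_dx[1], first_dx[2])
p_12 = exact_p_greater(first_dx[1], first_dx[2])
p_13 = exact_p_greater(first_dx[1], first_dx[3])
print(u_12, round(p_12, 3), round(p_13, 3))
```

With only one patient in the stage 3 group, the smallest attainable one-sided p-value is 1/6, so the example table is far too small for a real analysis; the sketch only shows the mechanics.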
 

katxt

Well-Known Member
#6
One snag with a t-type test, say for age between stages 1 and 2, is that some of the data are paired and some are not.
A possible way round this is to do a Monte Carlo bootstrap test, where you resample the patients with replacement and get the distribution of the differences in stage means. You should probably also consider some adjustment of the critical p-values to allow for the multiple comparisons.
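A minimal sketch of that resampling idea, in plain Python, might look like the following. It is one possible reading of the suggestion, not a definitive method: whole patients are resampled with replacement, the stage means use whatever ages happen to be recorded, and resamples in which a stage has no recorded age are skipped. The function names, the seed, the number of resamples, and the placement of patient 6's second age under stage 3 are all assumptions of this example.

```python
import random
from statistics import mean

# One row per patient: ages at stages 1-4, None where unknown.
# Patient 6's second age is assumed to belong to stage 3.
ages = {
    1: [35, 37, 43, 44], 2: [41, None, None, None],
    3: [None, None, 48, 55], 4: [30, None, None, None],
    5: [None, 32, None, None], 6: [25, None, 48, None],
    7: [29, None, None, None], 8: [None, 36, 41, 45],
}

def stage_mean_diff(rows, a, b):
    """Mean age at stage b minus mean age at stage a; None if a stage is empty."""
    xa = [r[a - 1] for r in rows if r[a - 1] is not None]
    xb = [r[b - 1] for r in rows if r[b - 1] is not None]
    if not xa or not xb:
        return None
    return mean(xb) - mean(xa)

def bootstrap_diffs(rows, a, b, n_boot=5000, seed=0):
    """Resample patients with replacement; return the sorted mean differences."""
    rng = random.Random(seed)
    out = []
    for _ in range(n_boot):
        sample = rng.choices(rows, k=len(rows))
        d = stage_mean_diff(sample, a, b)
        if d is not None:  # skip resamples where one stage has no ages
            out.append(d)
    return sorted(out)

rows = list(ages.values())
observed = stage_mean_diff(rows, 1, 2)
diffs = bootstrap_diffs(rows, 1, 2)
lo = diffs[int(0.025 * len(diffs))]
hi = diffs[int(0.975 * len(diffs)) - 1]
print(observed, lo, hi)  # observed difference and a rough 95% percentile interval
```

For the three comparisons against stage 1, a simple Bonferroni adjustment would use 0.05 / 3 per comparison, which here means widening each interval from 95% to roughly 98.3%.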
 

katxt

Well-Known Member
#7
Thinking more about this: the data are paired, even though some values are missing. Also, within a patient the age at a later stage is always greater than the age at an earlier stage. This means that the within-patient differences are always positive, even though you may not know what the difference is for every patient. So it follows that the paired difference is greater than zero by construction, and a test of it against zero will always come out significant.
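To make this concrete, here is a small sketch in plain Python that computes the within-patient age differences between consecutive recorded stages for the example table. The function name is made up for the example, and patient 6's second age is assumed to belong to stage 3, although that choice does not change the differences.

```python
# One row per patient: ages at stages 1-4, None where unknown.
ages = {
    1: [35, 37, 43, 44], 2: [41, None, None, None],
    3: [None, None, 48, 55], 4: [30, None, None, None],
    5: [None, 32, None, None], 6: [25, None, 48, None],
    7: [29, None, None, None], 8: [None, 36, 41, 45],
}

def within_patient_diffs(row):
    """Age differences between consecutive recorded stages for one patient."""
    seen = [a for a in row if a is not None]
    return [later - earlier for earlier, later in zip(seen, seen[1:])]

# Keep only patients with at least two recorded stages.
paired = {}
for patient, row in ages.items():
    d = within_patient_diffs(row)
    if d:
        paired[patient] = d

all_positive = all(x > 0 for d in paired.values() for x in d)
print(paired, all_positive)
```

Only four of the eight example patients contribute any paired difference, and every difference is positive simply because patients get older as the tumor grows.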