Kaplan-Meier survival analysis or a simple t-test? Is orthodoxy necessary?

#1
So...

In a two-arm clinical trial with approximately 100 participants per arm and remission rates around 50% (the RR is significantly different both in a direct comparison and with the 95% CI crossing a pre-defined non-inferiority limit), I have compared the risk of relapse depending on the treatment received. That risk is very similar irrespective of treatment group.

Comparing those who do relapse within a three-month period (Group 1: 19/43; Group 2: 25/55), participants in the first treatment arm have significantly fewer days to relapse when running an independent-samples t-test. The mean times to relapse are 23 and 41 days, respectively.
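In code terms, that comparison amounts to something like the sketch below; the days-to-relapse values are invented placeholders, not the trial data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Invented days-to-relapse among the relapsers only
# (19 patients in group 1, 25 in group 2; not the actual trial data)
days_group1 = rng.normal(23, 15, size=19).clip(1, 90)
days_group2 = rng.normal(41, 20, size=25).clip(1, 90)

# Welch's independent-samples t-test (no equal-variance assumption)
result = stats.ttest_ind(days_group1, days_group2, equal_var=False)
print(f"t = {result.statistic:.2f}, p = {result.pvalue:.3f}")
```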

The study was not powered for this comparison and the numbers are small, but an almost three-week difference in time to relapse - if true - is clinically relevant (the patients have received antidepressant treatment, and the mechanisms underlying relapse are complex).

However, the proper thing to do is of course a Kaplan-Meier analysis. Such an analysis unequivocally fails to reject the null hypothesis of similar survival in the two groups.
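For reference, a minimal sketch of such an analysis, assuming the Python lifelines package and a per-patient table of follow-up days, relapse indicator and treatment arm (all values below are invented):

```python
import pandas as pd
from lifelines import KaplanMeierFitter
from lifelines.statistics import logrank_test

# Hypothetical data: one row per remitter, with follow-up time in days,
# whether a relapse was observed (0 = censored at 90 days), and treatment arm.
df = pd.DataFrame({
    "days":    [12, 90, 35, 90, 64, 90, 21, 47, 90, 90],
    "relapse": [1,  0,  1,  0,  1,  0,  1,  1,  0,  0],
    "arm":     ["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"],
})

a, b = df[df.arm == "A"], df[df.arm == "B"]

# Kaplan-Meier curves per arm; censored patients contribute their follow-up time
kmf = KaplanMeierFitter()
kmf.fit(a["days"], event_observed=a["relapse"], label="Arm A")
ax = kmf.plot_survival_function(ci_show=True)
kmf.fit(b["days"], event_observed=b["relapse"], label="Arm B")
kmf.plot_survival_function(ax=ax, ci_show=True)

# Log-rank test comparing the two survival curves
result = logrank_test(a["days"], b["days"],
                      event_observed_A=a["relapse"],
                      event_observed_B=b["relapse"])
print(result.p_value)
```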

Not being a statistician, merely a humble researcher, I would appreciate some input on the issue.

I understand that in some circumstances the exclusion of right-censored cases (I hope I got the lingo right there) amounts to excluding data that will eventually take on a specific value - everyone, I assume, will eventually die. In my current analysis that is not so: more individuals will likely relapse, but in theory it is possible that all remaining remitters stay well ad infinitum. Does this have any bearing?

By mistake I at first ran the analysis without censored cases, i.e. the data set contained only those remitters who relapsed. Depending on the analysis method (which, as I gather, puts different emphasis on different parts of the survival curve?), the difference in survival is borderline significant, with p-values ranging from roughly 0.02 to 0.07.
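Those different methods are presumably weighted variants of the log-rank test (e.g. Gehan-Breslow-Wilcoxon weights early events more heavily). A sketch, assuming a recent lifelines version that accepts the `weightings` argument (data invented):

```python
from lifelines.statistics import logrank_test

# Made-up example data: follow-up days and relapse indicator per arm
days_a,    days_b    = [12, 35, 64, 21, 90, 90], [18, 47, 90, 90, 90, 72]
relapse_a, relapse_b = [1,  1,  1,  1,  0,  0 ], [1,  1,  0,  0,  0,  1 ]

# Standard log-rank: equal weight across the whole follow-up period
standard = logrank_test(days_a, days_b,
                        event_observed_A=relapse_a, event_observed_B=relapse_b)

# Gehan-Breslow-Wilcoxon weighting: emphasises early events
wilcoxon = logrank_test(days_a, days_b,
                        event_observed_A=relapse_a, event_observed_B=relapse_b,
                        weightings="wilcoxon")

print(standard.p_value, wilcoxon.p_value)
```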
To me it is not clear why taking the censored individuals into consideration, which the analysis obviously does, is relevant. It follows that if fewer of our patients had remitted - say only those who later relapsed - we would claim that the relapse time did differ and was in fact shorter in patients receiving treatment X. Why should the size of the non-relapsing proportion (which might even be identical between groups) influence the comparison of values in the group of relapsers?

Personally, I lean towards the t-test, which asks "among those remitters who do relapse, is there a difference in the number of days they stay in remission?"
I will report the results from both the survival analysis and the t-test, so readers can draw their own conclusions. Many readers are, I guess, even less proficient than myself in interpreting statistical outcomes, so I would greatly appreciate any explanation as to why using a t-test here is fundamentally wrong (if that is the case), or why a Kaplan-Meier survival analysis (or perhaps some other related test) is the self-evident choice here.

I hope I have been able to make my points clear and look forward to some feedback.
 

Karabiner

TS Contributor
#2
So you have nearly identical relapse rates at the end of your observation period,
a result based on an analysis which was pre-specified; and within the 3-month period
those who relapsed did so earlier in one of the groups, a result based on an analysis
which was not pre-specified. If the idea for comparing means of time to relapse
came to you before you had eyeballed the data, the result is maybe not entirely based
on p-hacking. You could report the first result as your study result, and the second as a
suggestion for further investigations. I suppose that it will never be replicated anyway.

With kind regards

Karabiner
 

Miner

TS Contributor
#3
By excluding censored results you are, in effect, stating that 100% of the population that you are studying will follow the curve of the non-censored group. In reality, they do not represent the entire population. They only represent a portion of that population.

This is similar to the effect of non-response bias in surveys. The responders do not accurately represent the entire population.
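A small simulation makes the point concrete: if the true mean time to relapse in the whole population were, say, 120 days, the mean computed only among those who happen to relapse within a 90-day window is far shorter, because it describes only that early-relapsing portion (numbers and distribution invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
FOLLOW_UP = 90  # days of observation; longer times are right-censored

# Hypothetical population: true mean time to relapse is 120 days (exponential)
true_times = rng.exponential(scale=120, size=100_000)

relapsed = true_times <= FOLLOW_UP            # events observed within follow-up
observed_mean = true_times[relapsed].mean()   # mean among relapsers only

print(f"True mean time to relapse:           {true_times.mean():6.1f} days")
print(f"Mean among observed relapsers only:  {observed_mean:6.1f} days")
print(f"Proportion relapsing within 90 days: {relapsed.mean():.2f}")
```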
 

hlsmith

Less is more. Stay pure. Stay poor.
#4
I would look to the KM curve and its 95% CIs to make conclusions about separation. @Miner makes a good point. I will note that you usually lose information when running cross-sectional analyses on time-to-event data!
 
#5
By excluding censored results you are, in effect, stating that 100% of the population that you are studying will follow the curve of the non-censored group. In reality, they do not represent the entire population. They only represent a portion of that population.

This is similar to the effect of non-response bias in surveys. The responders do not accurately represent the entire population.
Ah, that is a very important point. I had not really thought of it in those terms, but it is of course valid.

But suppose, as a thought experiment, that the event which defines them - experiencing a relapse - would never occur in the censored group even with life-long follow-up (unlikely, but actually true for a large proportion of the patients). Would excluding the censored cases then be reasonable?

Is there an underlying assumption - survival kind of implies it - that given enough time survival will reach 0%?

What I think is confusing me is the "interaction" between the values being compared between relapsers in the two treatment groups (days until relapse, days staying well) and the proportion of patients who do relapse. From a purely clinical perspective it is one thing to know that the risk of relapsing is the same irrespective of treatment, but quite another to know that there is a difference of 2.5 weeks in time to relapse when looking only at those patients who do relapse.

Assuming that the difference we see in our admittedly modest sample is real, it wouldn't really matter whether the proportion of patients who relapse is 10%, 50% or 90%. But it would give very different statistical outcomes.

Am I thinking backwards here?
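To make that interplay concrete, here is a toy simulation (all numbers invented): the relapsers-only t-test is untouched by how many censored non-relapsers are added to each arm, while the log-rank comparison of the full survival curves changes as the relapse proportion changes, even though the relapsers themselves are identical:

```python
import numpy as np
from scipy import stats
from lifelines.statistics import logrank_test

rng = np.random.default_rng(42)
FOLLOW_UP = 90

# Fixed, invented relapse times for 20 relapsers per arm
# (arm A relapses earlier on average than arm B)
relapse_a = np.clip(rng.normal(23, 12, 20), 1, FOLLOW_UP - 1)
relapse_b = np.clip(rng.normal(41, 12, 20), 1, FOLLOW_UP - 1)

# The relapsers-only t-test ignores censored patients entirely,
# so its result cannot depend on how many non-relapsers there are.
print("t-test p:", stats.ttest_ind(relapse_a, relapse_b, equal_var=False).pvalue)

# Add the same number of censored non-relapsers to each arm, so the relapse
# proportion varies (100%, 50%, 10%) while the relapsers stay identical.
for n_censored in (0, 20, 180):
    dur_a = np.concatenate([relapse_a, np.full(n_censored, FOLLOW_UP)])
    dur_b = np.concatenate([relapse_b, np.full(n_censored, FOLLOW_UP)])
    evt_a = np.concatenate([np.ones(20), np.zeros(n_censored)])
    evt_b = np.concatenate([np.ones(20), np.zeros(n_censored)])
    res = logrank_test(dur_a, dur_b,
                       event_observed_A=evt_a, event_observed_B=evt_b)
    print(f"{n_censored} censored per arm -> log-rank p = {res.p_value:.3f}")
```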
 
#6
So you have nearly identical relapse rates at the end of your observation period,
a result based on an analysis which was pre-specified; and within the 3-month period
those who relapsed did so earlier in one of the groups, a result based on an analysis
which was not pre-specified. If the idea for comparing means of time to relapse
came to you before you had eyeballed the data, the result is maybe not entirely based
on p-hacking. You could report the first result as your study result, and the second as a
suggestion for further investigations. I suppose that it will never be replicated anyway.

With kind regards

Karabiner
No, not really. I was unclear.

Primary outcome was remission. Secondary outcomes included relapse risk and time to relapse. The analyses were not pre-specified for either secondary outcome.

I have never worked with survival analysis data and ran the pairwise comparison intuitively: "How many days until they relapse? I'll calculate it and compare it." Simple as that. When writing the first draft of the manuscript I had no KM analysis at all, but a co-author prompted its use.

So, p-hacking? Well, the next clinical trial design will likely be better with regard to pre-specifying the analyses to be done. I really do hope the data are replicated - we ourselves have data for longer time periods and will hopefully conduct a very similar study in a related population, and I hope other groups will look at the relapse times. The corresponding studies performed by the industry, with its vested interests, have used a very different design - one I assume was chosen to be maximally likely to yield profit, whereas our study had a less financially motivated goal.

In my answer to Miner I have tried to explain what I find difficult to grasp when it comes to how the ratio of non-censored to censored cases impacts the result.