Case cohort sampling

Dear all,
I would like to perform a case-cohort study, starting from a cohort of about 2,800 individuals. Of those, 50 develop the outcome of interest during follow-up. In selecting the subcohort I would like to prioritize the presence of a covariate Z that will be needed for a later effect modification analysis. Normally, with a case-cohort design I would create the subcohort through random sampling (stratified on the covariate Z) and then add the cases that were not included in the subcohort. My problem is that this gives no control over the final sample size, which depends on how many cases happen to fall in the subcohort, whereas I need to end up with an exact number (say 300 individuals) in order to plan for the costs of exposure measurement.

This is the distribution of cases by levels of the covariate Z:

Z0: 8 cases / 356 non-cases
Z1: 42 cases / 2,455 non-cases

Any help would be appreciated.
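
To make the problem concrete, here is a rough sketch of how the final sample size varies for a given subcohort. It assumes simple random sampling without replacement within each stratum of Z, so the number of cases captured in each stratum is hypergeometric. The subcohort sizes (60 from Z0, 195 from Z1) are hypothetical values chosen only for illustration.

```python
from math import comb

# Cohort counts by stratum, taken from the table above: (cases, non-cases).
STRATA = {"Z0": (8, 356), "Z1": (42, 2455)}


def hypergeom_pmf(k, population, successes, draws):
    """P(exactly k cases among `draws` people sampled without replacement)."""
    return (comb(successes, k) * comb(population - successes, draws - k)
            / comb(population, draws))


def final_size_distribution(subcohort_sizes):
    """Return {final sample size: probability} for a stratified subcohort."""
    total_cases = sum(cases for cases, _ in STRATA.values())
    subcohort_total = sum(subcohort_sizes.values())

    # Distribution of the number of cases captured by the subcohort,
    # built by convolving the per-stratum hypergeometric distributions.
    dist = {0: 1.0}
    for z, (cases, noncases) in STRATA.items():
        population = cases + noncases
        draws = subcohort_sizes[z]
        updated = {}
        for k_prev, p_prev in dist.items():
            for k in range(min(cases, draws) + 1):
                p = hypergeom_pmf(k, population, cases, draws)
                updated[k_prev + k] = updated.get(k_prev + k, 0.0) + p_prev * p
        dist = updated

    # Final size = subcohort + cases that were not already in the subcohort.
    return {subcohort_total + total_cases - k: p for k, p in dist.items()}


# Hypothetical allocation that oversamples the smaller Z0 stratum.
sizes = final_size_distribution({"Z0": 60, "Z1": 195})
expected = sum(n * p for n, p in sizes.items())
print(f"expected final size: {expected:.1f}")
print(f"P(final size is exactly 300): {sizes.get(300, 0.0):.3f}")
```

With this allocation the expected final size is about 300.4, but any single draw can land a few individuals above or below that, which is exactly the budgeting problem described above.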


I haven't done one of these, but why not keep all cases and randomly sample non-cases? Then, if needed, correct imbalances in the estimates (predicted probabilities) via weights?
Hi hlsmith,
In a case-cohort setup you are supposed to sample cases and non-cases randomly (the subcohort) and then add the remaining cases. Participants are then weighted according to their probability of being sampled. I have done these before, but my approach does not allow the sample size to be determined accurately beforehand.
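
For reference, the weighting step could look something like the sketch below. It assumes one common choice: cases get weight 1 because all of them end up in the analysis, and sampled non-cases get the inverse of the sampling fraction in their stratum of Z. The sampled counts (89 and 161) are made up for illustration, and other weighting schemes exist.

```python
# Non-cases in the full cohort, by stratum of Z (from the original post).
COHORT_NONCASES = {"Z0": 356, "Z1": 2455}

# Every case is included in the analysis, so its sampling probability is 1.
CASE_WEIGHT = 1.0


def noncase_weights(sampled_noncases):
    """Inverse sampling fraction for the non-cases in each stratum."""
    weights = {}
    for z, n_cohort in COHORT_NONCASES.items():
        n_sampled = sampled_noncases[z]
        if not 0 < n_sampled <= n_cohort:
            raise ValueError(f"invalid number of sampled non-cases in {z}")
        weights[z] = n_cohort / n_sampled
    return weights


# Hypothetical realized subcohort: 89 non-cases from Z0 and 161 from Z1.
example_weights = noncase_weights({"Z0": 89, "Z1": 161})
for z, w in example_weights.items():
    print(f"{z}: each sampled non-case stands for {w:.2f} cohort non-cases")
```

The weighted non-cases in each stratum add back up to the cohort totals, which is a quick sanity check on the weights.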


Why can't you accurately determine the sample size beforehand? You seem to know the descriptive stats for the sample? Also, with 50 cases an examination of effect modification will be difficult to power; you will need to hope for a very large and apparent effect.
Leaving aside the issue of power and the role of the Z covariate, I don't see how one can accurately determine the sample size if sampling of the subcohort includes a random fraction of cases and the remaining cases need to be added to the analysis. Could anyone provide a practical example?
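
One possible practical example, offered as a sketch rather than a definitive answer: since case status is already known for the whole cohort (see the table in the first post), the total can be fixed by keeping all 50 cases and drawing exactly 250 non-cases, stratified on Z. This is closer to the "keep all cases, sample non-cases" suggestion than to the classical design, because the random draw is restricted to non-cases. The participant IDs, the 40% share for Z0, and the function name are all hypothetical.

```python
import random

# Counts from the original post.
CASES = {"Z0": 8, "Z1": 42}
NONCASES = {"Z0": 356, "Z1": 2455}


def draw_fixed_size_sample(total_size, noncase_share_z0, seed=0):
    """Keep every case, then fill the remaining slots with stratified non-cases."""
    rng = random.Random(seed)
    n_noncases = total_size - sum(CASES.values())
    if n_noncases < 0:
        raise ValueError("total_size is smaller than the number of cases")

    # Split the non-case slots between strata; Z1 takes whatever Z0 leaves.
    n_z0 = round(n_noncases * noncase_share_z0)
    allocation = {"Z0": n_z0, "Z1": n_noncases - n_z0}

    sampled = {}
    for z, n_draw in allocation.items():
        if n_draw > NONCASES[z]:
            raise ValueError(f"not enough non-cases in stratum {z}")
        # Placeholder IDs standing in for the real non-case participants.
        ids = [f"{z}-noncase-{i}" for i in range(NONCASES[z])]
        sampled[z] = rng.sample(ids, n_draw)
    return allocation, sampled


allocation, sampled = draw_fixed_size_sample(total_size=300, noncase_share_z0=0.4)
final_size = sum(CASES.values()) + sum(len(ids) for ids in sampled.values())
print(allocation, final_size)
```

Here the final size is fixed by construction, and the sampled non-cases would then be weighted by the inverse of their stratum-specific sampling fraction. Whether this restricted draw is acceptable for the planned analysis is a design decision this sketch does not settle.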