# Incidence rate in a cohort with small sample size

I have a small cohort of people with a specific disease (n=300) and am studying the incidence of a related disease (which is fairly rare and normally occurs only as a consequence of this main disease). I have longitudinal data for this cohort over 4 years, and I am fine with the study design in terms of which statistics to use etc. (I know there is a sample size constraint, but I cannot do more about that beyond addressing it as a limitation). My question is: knowing that this cohort is not exactly representative of the population (but is probably the best I can get), how correct is it to report the incidence rate of the 'related disease' in my population? Is it valid information, or should I just report the frequency of new cases per year? And if reporting the incidence rate is still valid, how should I approach it? Standardizing? Calculating the person-time incidence rate? I guess it is a matter of sample size as well...

So all persons in your sample have disease X, and now you want to determine the incidence rate of disease Z in this sample. Yeah, you can totally do this. One caveat: you can only generalize to other samples of persons with disease X that resemble yours. You would not be able to generalize to the general population.

Have you thought about running a proportional hazards regression? With it you would be able to say that by time period 1, 2, 3, etc., a given proportion of the sample has developed condition Z. Though I am not sure whether everyone in your sample is at the same disease state for X. Using the regression would also allow you to control for other covariates.
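To illustrate the "proportion with Z by time 1, 2, 3" idea, here is a minimal sketch. Note that it is a plain Kaplan-Meier style cumulative incidence, not the proportional hazards regression itself (so no covariates), it assumes there are no competing risks, and the toy data are entirely made up.

```python
def cumulative_incidence(times, events):
    """Kaplan-Meier estimate of 1 - S(t) at each distinct event time.

    times  -- follow-up in years per person (until Z or until censoring)
    events -- 1 if the person developed Z at that time, 0 if censored
    """
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    surv = 1.0
    curve = []
    for t in event_times:
        # People still under observation just before time t
        at_risk = sum(1 for ti in times if ti >= t)
        # New cases of Z occurring exactly at time t
        cases = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        surv *= 1 - cases / at_risk
        curve.append((t, 1 - surv))
    return curve


# Hypothetical toy data: 10 people followed for up to 4 years
times = [1, 2, 2, 3, 4, 4, 4, 4, 4, 4]
events = [1, 0, 1, 0, 0, 0, 0, 0, 0, 1]

curve = cumulative_incidence(times, events)
for year, proportion in curve:
    print(f"by year {year}: {proportion:.3f} of the cohort has developed Z")
```

For the regression with covariates you would want a proper survival analysis package rather than hand-rolled code like this.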

Though you could pick a time frame and just crudely report your incidence as well.
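A crude person-time rate could be sketched roughly as below. The counts are hypothetical (not from your data), and the interval is a simple log-scale approximation that gets shaky when the number of cases is very small, so treat it as a rough guide rather than an exact method.

```python
import math


def incidence_rate(cases, person_years, per=1000, z=1.96):
    """Crude incidence rate per `per` person-years with an approximate 95% CI.

    The interval uses the normal approximation on the log scale,
    SE(log rate) = 1 / sqrt(cases), so it needs cases > 0.
    """
    if cases <= 0 or person_years <= 0:
        raise ValueError("need at least one case and positive person-time")
    rate = cases / person_years
    se_log = 1 / math.sqrt(cases)
    lower = rate * math.exp(-z * se_log)
    upper = rate * math.exp(z * se_log)
    return rate * per, lower * per, upper * per


# Hypothetical numbers: 12 new cases of Z over 1100 person-years of follow-up
rate, lower, upper = incidence_rate(cases=12, person_years=1100)
print(f"{rate:.1f} per 1000 person-years (95% CI {lower:.1f} to {upper:.1f})")
```

With only a handful of cases the interval comes out wide, which is the sample size limitation showing up in numbers.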

Thanks, but can I standardize my incidence rate against one reported in the literature that is based on a much bigger sample size?

That gets a little convoluted. I would first try to find an example of someone else doing something similar to that approach.

What is your sample size? If it is really small, it may be hard to draw conclusions from the standardization with confidence, since due to sampling variability you could easily have a sample that is not very reflective of the population. Unless your sample was taken at random, which I imagine it was not.
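To make the sampling variability point concrete, here is a rough sketch of indirect standardization: apply the stratum-specific rates from the larger reference study to your own person-time to get an expected count, then compare it with what you observed. The age bands, reference rates and counts are all invented for illustration, and the interval is again a log-scale approximation.

```python
import math


def standardized_incidence_ratio(observed, person_years, reference_rates, z=1.96):
    """Observed / expected cases, with an approximate 95% CI.

    person_years    -- your cohort's person-years per stratum
    reference_rates -- reference study's rate per person-year, same strata
    """
    expected = sum(person_years[s] * reference_rates[s] for s in person_years)
    sir = observed / expected
    se_log = 1 / math.sqrt(observed)  # approximation, needs observed > 0
    lower = sir * math.exp(-z * se_log)
    upper = sir * math.exp(z * se_log)
    return sir, lower, upper, expected


# Hypothetical strata and numbers, for illustration only
person_years = {"<50": 400, "50-64": 450, "65+": 250}
reference_rates = {"<50": 0.004, "50-64": 0.008, "65+": 0.016}

sir, lower, upper, expected = standardized_incidence_ratio(
    observed=12, person_years=person_years, reference_rates=reference_rates
)
print(f"expected {expected:.1f} cases, SIR {sir:.2f} ({lower:.2f} to {upper:.2f})")
```

In this made-up example the interval straddles 1, so the data could not tell a higher rate from a lower one, which is the kind of thing a small number of cases tends to do.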

Mine is 300; the previous study I found had about 4000. Is it better to just report the two incidence rates and note that they cannot be properly compared because of the sample size constraints?

Thanks

So you are trying to compare your findings with other published results, or to interpret them against those results?