clustered data

Hello all,

I have a binary outcome (success/failure) measured on 20 subjects. Half of them gave 2 samples and the others just one, so I have 30 data points.

All data points, without exception, were successes (1), so my point estimate of the success rate is clearly 100%. Now I want to calculate a CI for the success rate, mainly to see what the lower limit is: I want to be able to say that with 95% confidence the success rate is higher than... (80%, 85%, whatever comes up).

The problem, as you can see, is the clustering. I can't use n = 30 because that ignores the correlation between samples from the same subject, but it feels like a waste to use n = 20. Is there a way to calculate the variance of a single sample proportion while taking the clustering into account?

Thanks !


New Member
I don't believe you can calculate fit statistics (confidence limits) when all your outcomes are successes. Anyway, have you considered using a generalized linear mixed model?

Yes, I thought about it, but SAS says that the outcome variable should be binary, not a constant! :-(

I can compute a CI when all my outcomes are successes using the common formulas, such as Clopper-Pearson and others; I already did it. However, this way I use n = 20 while I have 30 data points, and I find that a waste of information.
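
For readers following along, here is a minimal sketch of the exact calculation mentioned above. When every one of n trials is a success, the Clopper-Pearson lower limit has a closed form, because it solves p^n = tail probability. The function name `exact_lower_bound` is made up for illustration, and the sketch treats the n observations as independent, which is exactly the assumption in question for n = 30.

```python
def exact_lower_bound(n, alpha=0.05, two_sided=False):
    """Clopper-Pearson lower limit when all n trials are successes.

    With x = n successes the exact lower limit solves p**n = tail,
    so p = tail**(1/n). The upper limit is 1.
    Assumes the n observations are independent.
    """
    tail = alpha / 2 if two_sided else alpha
    return tail ** (1.0 / n)


# n = 20 counts each subject once; n = 30 ignores the clustering.
for n in (20, 30):
    one = exact_lower_bound(n)
    two = exact_lower_bound(n, two_sided=True)
    print(f"n={n}: one-sided lower bound {one:.3f}, two-sided lower limit {two:.3f}")
```

Under these assumptions the one-sided 95% lower bound is about 0.861 for n = 20 and about 0.905 for n = 30, so the two candidate sample sizes bracket the answer fairly tightly.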


New Member
Hmm, I forgot about the exact methods. Still, if you use any continuous approximation I think you will end up with a variance estimate of 0 and hence cannot compute confidence limits.
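
To illustrate the point about a zero variance, here is a small sketch using the usual normal-approximation (Wald) interval as a stand-in for "any continuous approximation". The function name `wald_interval` is hypothetical, and the clustering is ignored here; the only purpose is to show the interval collapsing when the observed proportion is 1.

```python
import math


def wald_interval(successes, n, z=1.959964):
    """Normal-approximation (Wald) interval for a single proportion."""
    p_hat = successes / n
    # Estimated standard error; this is 0 whenever p_hat is 0 or 1.
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * se, p_hat + z * se


lower, upper = wald_interval(30, 30)
print(lower, upper)  # both limits equal the point estimate
```

With 30 successes out of 30 the estimated standard error is exactly 0, so the interval degenerates to the single point 1.0 regardless of n.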

One (not very elegant) method could be to fit a GLMM to all 30 results (10 paired) with 1 of them set to a failure. If the resulting confidence limits are narrower than those from Clopper-Pearson with 20 successes, you can state that the confidence interval for your actual data, which has no failures, is narrower than the one found in the GLMM.

I am sorry I can't be of any more help.

On the contrary, Morten, you are of great help!

I found a reference in a book with an equation for this clustered problem, and guess what: you were right :) I got a variance of 0 because they used a continuous approximation...

I like your idea about the GLMM and will give it a try. I have nothing to lose: if it doesn't work, I'll take the conservative route and use n = 20.

Thanks, your help was significant! :)