GzLM for counts or ?

Hi everyone,

I'm trying to figure out how to run an appropriate comparison of clutch sizes in birds. Ultimately, I want to compare clutch size of the same species between 2 sites, with years at each site classified into 2 levels (good, poor).
First I need to sort out if there were differences among years in each site to be able to determine which years were good and which were poor. So, for this, the question is: is clutch size different among years? If so, which ones differ?

I have run this before for the first site, where the birds only ever lay 1 or 2 eggs. I ran a GzLM with a Poisson family in R, but it was WAY underdispersed. So I converted the counts into binomial data and ran a GzLM with several variants of the binomial family in R (negative binomial fit the data best). I anticipated doing the same for the second site; however, a handful of nests there contained 3 eggs (11 of 1428 nests, and these were largely from one year, which had 8 such nests out of 201).
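As a side note on why the Poisson fit looks so underdispersed: when every nest holds 1 or 2 eggs and a proportion p of nests hold 2, the mean is 1 + p and the variance is p(1 - p), so the variance-to-mean ratio stays far below the value of 1 that a Poisson model expects. Here is a minimal sketch of that check and of the 1/2-to-0/1 recoding, in Python rather than R and using made-up numbers (the 60/40 split and the function names are assumptions for illustration, not the real data):

```python
from statistics import mean, pvariance

def dispersion_ratio(counts):
    """Variance-to-mean ratio; a Poisson variable has a ratio near 1."""
    return pvariance(counts) / mean(counts)

def to_binary(counts):
    """Recode 1-egg nests as 0 and 2-egg nests as 1 (hypothetical helper)."""
    if any(c not in (1, 2) for c in counts):
        raise ValueError("recoding is only defined for clutches of 1 or 2 eggs")
    return [c - 1 for c in counts]

# Made-up clutch sizes for illustration only (not the real data).
clutches = [1] * 60 + [2] * 40

ratio = dispersion_ratio(clutches)
two_egg = to_binary(clutches)

print(f"variance/mean = {ratio:.3f}")
print(f"proportion of 2-egg nests = {mean(two_egg):.2f}")
```

The recoding deliberately raises an error on a 3-egg nest, since a 0/1 response has no natural slot for it; that is exactly the problem the second site poses.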

How can I make these comparisons? I tried running a GzLM with family = poisson, and it's underdispersed again, this time with the 3-egg nests exerting a lot of leverage. I'm wary of dropping the 3-egg nests and doing as I did with the first site (family = binomial) because they're true values, but I'm not sure how else to deal with the data.

Does anyone have any tips? I've asked my supervisors for their opinions, and as excellent as they are in every other way, they can't help here.

Another related question: can one use a Mann-Whitney test to compare count data between sites? I also need to pool the data from each site across all years to compare the overall means. I know count or binomial data cannot be used in an ANOVA, since it assumes the dependent variable is continuous, but is this true of a Mann-Whitney (I'm assuming here that a t-test is out due to non-normality)?

Thank you for ANY suggestions anyone might have! I'm stuck!



I've decided to ditch the 3-egg nests entirely and just run a glm with a binomial family. However, I'm now trying to make a comparison between 2 sites, so my question has shifted:

Can I use a t-test with binomial data? Or should it still be a glm with a binomial family?
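For context on what the two-site comparison amounts to once clutches are recoded as 1 egg vs 2 eggs: it is a comparison of two proportions. Below is a rough sketch of a pooled two-proportion z-test in Python, using invented counts (the site totals and the function name are assumptions, not the real data). It is a large-sample approximation that ignores any year effects; a binomial glm with site as a predictor addresses the same question and can also include year.

```python
from math import erfc, sqrt

def two_proportion_z(successes_a, n_a, successes_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled proportion under the null hypothesis of equal proportions.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided tail area of the standard normal: 2 * (1 - Phi(|z|)).
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value

# Invented counts of 2-egg nests out of all nests at each site.
z, p_value = two_proportion_z(120, 300, 150, 300)
print(f"z = {z:.2f}, two-sided p = {p_value:.4f}")
```

With large samples a t-test on the 0/1 values tends to give a similar answer, but the proportion-based test or the binomial glm matches the structure of the data more directly.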