Is this group of data different from the rest: From ANOVA to...

Hi! We have performed an experiment to see if semi-wild pigs have any food preferences (specifically, for the type of nutrient supplement we add to the food). The different foods were placed in the pigs' home range, each at its own location. After each trial we counted the number of food pellets that were eaten. At first I thought this would be easy, because we just want to see whether any of these foods is significantly different from the others, with a straightforward ANOVA. However, then I read about count data, so maybe I need a Poisson distribution? Also, I have the total number of pellets, so I could use the percentage eaten instead of count data if that helps. And I'm not really using the extra information I get from randomizing the position of the food (i.e. moving the food types around for each trial), or the fact that each individual does this 5 times. Possibly a mixed/random-effects model could include this, no?

Any tips on what I should do? I'm using R. Any help would be very much appreciated!!
I'm moving mostly sideways, but wouldn't the repeated measures on the same individual mean I should use a repeated-measures model instead of a random-effects model?


Active Member
I think the thing here is that the experiment is 'doubly repeated'. There is presumably going to be some correlation between different 'trials' on the same pig, similar to time effects in the usual repeated-measures setting, and also some correlation between the amounts of each feed eaten on the same day (a pig, no matter how wild, can only eat so much food).

I think the food position issue would be modeled by an interaction between position and food id, praying that it will not be significant. If it is, then you will probably have to break up the analysis by food position. If the interaction is not significant, then you can include food position as a fixed-effect adjustment.

Poisson is probably a good way to go, looking at your data. The repeated measures would usually be handled by generalized estimating equations ('GEE'). There are typically special correlation structures for doubly repeated measures, but I don't think I have used them in a while, if ever. You will probably want to use over-dispersed Poisson, since Poisson itself is almost never a good fit and tends to give p-values that are way too small.
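
To make the over-dispersed Poisson suggestion concrete, here is a minimal sketch in base R. It runs on simulated stand-in data, so every number and the data frame name pigs are made up; the column names follow the ones used elsewhere in the thread. It deliberately ignores the repeated-measures correlation, so it only illustrates how the dispersion is estimated and how food position and trial can enter as fixed-effect adjustments. It is not a finished analysis.

```r
# Simulated stand-in data; all names and numbers here are hypothetical.
set.seed(1)
pigs <- expand.grid(animalID = factor(1:8),
                    trial    = factor(1:5),
                    foodID   = factor(c("A", "B", "C")))
# Illustrative random position for every row
pigs$foodPosition <- factor(sample(1:3, nrow(pigs), replace = TRUE))
# Over-dispersed counts: negative binomial with food-specific means
mu <- unname(c(A = 20, B = 30, C = 25)[as.character(pigs$foodID)])
pigs$pelletsEaten <- rnbinom(nrow(pigs), mu = mu, size = 2)

# Quasi-Poisson GLM: Poisson mean model, but the dispersion is estimated
fit <- glm(pelletsEaten ~ foodID + foodPosition + trial,
           family = quasipoisson(link = "log"), data = pigs)

# A dispersion well above 1 means plain Poisson p-values would be too small
dispersion <- summary(fit)$dispersion
print(dispersion)

# Sequential F-tests, which are the usual choice when dispersion is estimated
print(anova(fit, test = "F"))
```

For the GEE route, one option people often reach for is geeglm() from the geepack package with id = animalID, but treat that as a pointer to look into rather than a recipe, since the doubly repeated structure needs more thought than a single id variable.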

Good luck bringing home that bacon!
Thank you for taking the time! It's been many years since my last stats course, but I really want to do this in a scientifically correct way.

From what I can see from histograms of the count data grouped in different ways, the distributions look quite similar. The biggest difference shows up when separating by trial: the first trial has far more eaten than the rest (that's not biologically interesting, but it is probably something that needs to be accounted for in the model). I'm really just interested in the effect of foodID. But because of this, would it be better to avoid GEE, since the first trial for each individual will make the variances bigger than necessary?

What if I account for the repeated measures and fixed effects in a GLMM, would that be appropriate?

Suggesting this formula:
pelletsEaten ~ foodID + (1 | foodID:foodPosition) + (1 | foodID:trial) + (1 | foodID:animalID), family = quasipoisson
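
One practical note on a formula like this: glmer() in lme4 does not accept quasi families, so family = quasipoisson will be rejected there. A common workaround for over-dispersion in a Poisson GLMM is an observation-level random effect. The sketch below shows that idea on simulated data, assuming lme4 is installed. The random-effects structure (a pig intercept plus an observation-level term, with trial and position as fixed effects) is a swapped-in simplification rather than the structure proposed above, and pigs, obsID, and all the numbers are hypothetical.

```r
# Simulated stand-in data; all names and numbers here are hypothetical.
set.seed(2)
pigs <- expand.grid(animalID = factor(1:10),
                    trial    = factor(1:5),
                    foodID   = factor(c("A", "B", "C")))
pigs$foodPosition <- factor(sample(1:3, nrow(pigs), replace = TRUE))
pig_effect <- rnorm(10, sd = 0.3)                       # pig-level variation
base_mu <- unname(c(A = 20, B = 30, C = 25)[as.character(pigs$foodID)])
mu <- base_mu * exp(pig_effect[as.integer(pigs$animalID)])
pigs$pelletsEaten <- rnbinom(nrow(pigs), mu = mu, size = 3)

# One level per row: this random effect soaks up the extra-Poisson variation
pigs$obsID <- factor(seq_len(nrow(pigs)))

if (requireNamespace("lme4", quietly = TRUE)) {
  fit_glmm <- lme4::glmer(
    pelletsEaten ~ foodID + foodPosition + trial +
      (1 | animalID) + (1 | obsID),
    family = poisson, data = pigs)
  print(summary(fit_glmm)$coefficients)   # fixed effects on the log scale
}
```

Putting trial in as a fixed effect is one way to absorb the "first trial eats much more" pattern without letting it inflate the variance attributed to food type.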
I just thought of something that will make everything much simpler! Instead of using the individual (repeated) measures, I could average them and use their means instead. Wouldn't that make it possible just to use a GLM??


Active Member
Yeah, I was thinking that would be a good idea. Do you mean average or sum across trials? This makes the analysis a lot easier. I think it is probably OK.
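
A rough sketch of the "collapse the trials first" idea in base R, again on simulated data. Summing rather than averaging keeps the response an integer count, which is what a Poisson-type model expects. The column pelletsOffered and the value 50 are hypothetical stand-ins for the known total number of pellets, and keeping animalID in the model as a blocking factor is an assumption on my part, since each pig still contributes one total per food.

```r
# Simulated stand-in data; all names and numbers here are hypothetical.
set.seed(3)
pigs <- expand.grid(animalID = factor(1:8),
                    trial    = 1:5,
                    foodID   = factor(c("A", "B", "C")))
pigs$pelletsOffered <- 50
p <- unname(c(A = 0.4, B = 0.6, C = 0.5)[as.character(pigs$foodID)])
pigs$pelletsEaten <- rbinom(nrow(pigs), size = pigs$pelletsOffered, prob = p)

# Collapse the 5 trials: one row per pig and food type
totals <- aggregate(cbind(pelletsEaten, pelletsOffered) ~ animalID + foodID,
                    data = pigs, FUN = sum)

# Option 1: summed counts, over-dispersed Poisson
fit_counts <- glm(pelletsEaten ~ foodID + animalID,
                  family = quasipoisson, data = totals)

# Option 2: proportion eaten, using eaten vs. not-eaten pellets
fit_prop <- glm(cbind(pelletsEaten, pelletsOffered - pelletsEaten) ~
                  foodID + animalID,
                family = quasibinomial, data = totals)

print(anova(fit_counts, test = "F"))
print(anova(fit_prop, test = "F"))
```

The trade-off is that collapsing throws away the trial and position information, so it only makes sense if position was balanced across food types over the five trials.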