Replication in linear regression. Generalized Estimating Equations (GEE) a good option?

#1
Or is it the only option?

I've been collecting hosts in 3 different habitats, about 20 hosts per sampling trip. For each individual host I've recorded the prevalence of a parasite (1 or 0) and its abundance (number of parasites per host). I'd like to know whether:

  1. Abundance/prevalence (DV) is the same across the 3 habitats (IV).
  2. Other variables (temperature, soil type, etc.) (IV) affect the overall prevalence/abundance (DV) of the parasite.

I'd say I need to use a linear regression, but I'm afraid that the hosts collected on the same trip may not be independent.

Since Generalized Estimating Equations (GEE) estimate effects at the population level, I was thinking of applying them, but I'd like to confirm that with you.

Should I use GEE ('geeglm' {geepack}), with each sampling trip as the clustering variable?
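
For reference, the setup asked about above could be sketched roughly as follows. This is only an illustration on simulated data, not the real data set: the data frame `hosts`, its columns (`habitat`, `trip_id`, `prevalence`, `abundance`), the effect sizes, and the choice of an exchangeable working correlation are all assumptions made for the example. It also uses binomial and Poisson families rather than a Gaussian one, since the responses are 0/1 and counts.

```r
library(geepack)

set.seed(1)

# Simulated stand-in for the field data (hypothetical values):
# 3 habitats x 10 trips x 20 hosts = 600 hosts.
hosts <- expand.grid(host = 1:20, trip_no = 1:10, habitat = c("A", "B", "C"))
hosts$habitat <- factor(hosts$habitat)

# One integer id per sampling trip (trips are nested within habitats).
hosts$trip_id <- as.integer(factor(paste(hosts$habitat, hosts$trip_no, sep = "_")))

# A shared trip-level effect makes hosts from the same trip correlated.
trip_effect <- rnorm(max(hosts$trip_id), sd = 0.5)
eta <- -0.5 + 0.4 * (hosts$habitat == "B") + trip_effect[hosts$trip_id]
hosts$prevalence <- rbinom(nrow(hosts), 1, plogis(eta))
hosts$abundance  <- rpois(nrow(hosts), exp(eta))

# geeglm assumes that rows belonging to the same cluster are contiguous.
hosts <- hosts[order(hosts$trip_id), ]

# Prevalence (0/1): logistic GEE, clustered by sampling trip.
fit_prev <- geeglm(prevalence ~ habitat, id = trip_id, data = hosts,
                   family = binomial, corstr = "exchangeable")

# Abundance (counts): Poisson GEE, clustered by sampling trip.
fit_abund <- geeglm(abundance ~ habitat, id = trip_id, data = hosts,
                    family = poisson, corstr = "exchangeable")

# Coefficients with robust (sandwich) standard errors and Wald tests.
print(coef(summary(fit_prev)))
print(coef(summary(fit_abund)))
```

Other covariates such as temperature or soil type would simply be added to the right-hand side of the formula. With only about 30 trips, the sandwich standard errors rest on a fairly small number of clusters, so the results are worth treating with some caution.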


More info:

3 habitats; about 10 sampling trips to each; about 20 hosts per trip. About 600 hosts in total.

After fitting the linear regression ('glm'), the residuals are not normally distributed ('shapiro.test'). The Durbin-Watson test ('durbinWatsonTest' {car}) and the Breusch-Pagan test ('bptest' {lmtest}) both give p-values > 0.05.

I'm using R, so any answer tailored to it would be more than welcome.
 
#2
Re: Replication in linear regression. Generalized Estimating Equations (GEE) a good option?

I'm curious why nobody has answered.

Is it because the question is not clear enough? Because it is very specific? The format is incorrect?

I tried to give as much info as possible, but feel free to ask me.

Thanks.
 

Jake

Cookie Scientist
#3
Re: Replication in linear regression. Generalized Estimating Equations (GEE) a good option?

GEE could work, but given my understanding of your data, something like a zero-inflated Poisson model with random effects might be the most appropriate choice. An example of this kind of model in R can be found here:
https://groups.nceas.ucsb.edu/non-linear-modeling/projects/owls/WRITEUP/owls.pdf
Or, for a book-length treatment, Zuur et al. have a book on zero-inflated models and GLMMs. I assume they also cover models that combine the two (which is what the model I mentioned above is), but I don't know for sure, as I don't have access to this book. I do have a different book by Zuur et al. and it's excellent, so I suspect this one is good as well:
http://www.highstat.com/book4.htm
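
As a rough idea of what such a model can look like in R, here is a minimal sketch on simulated data. It assumes the glmmTMB package, which is one option among several and not necessarily what the linked write-up uses. The data frame `hosts`, its columns, and the simulated effect sizes are all made up for the example.

```r
library(glmmTMB)

set.seed(2)

# Simulated stand-in for the field data (hypothetical values):
# 3 habitats x 10 trips x 20 hosts = 600 hosts.
hosts <- expand.grid(host = 1:20, trip_no = 1:10, habitat = c("A", "B", "C"))
hosts$habitat <- factor(hosts$habitat)
hosts$trip <- factor(paste(hosts$habitat, hosts$trip_no, sep = "_"))

# Counts with a trip-level random effect plus extra ("structural") zeros.
trip_effect <- rnorm(nlevels(hosts$trip), sd = 0.4)
mu <- exp(0.8 + 0.5 * (hosts$habitat == "B") + trip_effect[as.integer(hosts$trip)])
structural_zero <- rbinom(nrow(hosts), 1, 0.3)
hosts$abundance <- ifelse(structural_zero == 1, 0L, rpois(nrow(hosts), mu))

# Zero-inflated Poisson GLMM: random intercept per sampling trip,
# and a single constant zero-inflation probability (ziformula = ~ 1).
fit_zip <- glmmTMB(abundance ~ habitat + (1 | trip),
                   ziformula = ~ 1, family = poisson, data = hosts)

# Plain Poisson GLMM with the same random effect, for comparison.
fit_pois <- glmmTMB(abundance ~ habitat + (1 | trip),
                    family = poisson, data = hosts)

print(AIC(fit_pois, fit_zip))
print(summary(fit_zip))
```

Comparing the two fits gives a first indication of whether the zero-inflation part is needed for the abundance counts. Prevalence, being 0/1, would instead call for a binomial model with the same random effect.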
 
#4
Re: Replication in linear regression. Generalized Estimating Equations (GEE) a good option?

Thanks, Jake.

I'm not familiar with zero-inflated Poisson models, but I'll see whether they are suitable for my data.