Protocol comparison: repeated measures ANOVA and mixed linear models

#1
Hello!
This is my first experience with repeated measures ANOVA and mixed linear models. Since I did not find an example that uses a design similar to my experiment, I hope I can get some help. I collected 3 soil samples (SAMPLE_ID) in each of 4 areas (SITE), totaling 12 samples. My objective is to show the effect of a modified protocol (TREATMENT) for soil samples, considering the variability between the different areas. Thus, each sample from each area was processed with both the original protocol and the modified protocol. I have been thinking of using TREATMENT as a fixed factor and SITE as a random factor because, although I chose my sites to cover various types of soils, I am not particularly interested in the sites themselves. However, I do not know how to write the model so that it accounts for my repeated measures (SAMPLE_ID). I would appreciate any advice. If there is anything that I should clarify, please let me know! Thanks!
 

hlsmith

Omega Contributor
#7
I was imagining an agriculture (farming) context here. I think you just need to explain the background in more detail. So you take soil samples from these areas and then do something with them elsewhere, like in a lab?
 
#8
Yes, in a lab. I am a molecular biologist and I have been working with tropical soil samples from which nucleic acids are very hard to extract. So, I applied two DNA extraction protocols to each sample: the most widely used protocol for soil samples and a modified protocol that I developed. That's the reason I decided to use soils from four sites: I wanted to make sure my protocol is capable of improving DNA quantity and quality for most soil types. Thank you for your help @hlsmith. I really appreciate it.
 

hlsmith

Omega Contributor
#9
Yeah, the number of samples and groups is small, but if the effects are large enough you may still have some power. Mixed models (multilevel models [MLM]) are usually preferred over repeated measures ANOVA these days (e.g., they handle missing data more gracefully). You could run the model controlling for soil groups and, as you thought, have treatment as a fixed effect. So it would be similar to a non-MLM model, but you are accounting for samples being clustered (grouped) within soil location, so the MLM accounts for this additional source of variability.

Sorry for the confusion earlier; everything about your description reminded me of an agricultural plot context. That, and Sir Ronald Fisher developed many of these related approaches based on land plot examples.
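
A minimal sketch of the kind of model described above, offered as one possible reading rather than a definitive specification. It assumes the lme4 package, the column names used elsewhere in this thread (dna_concentration, treatment, site, sample_id), and that each sample_id label is unique across sites. The data are simulated purely so the code runs; the effect sizes are invented and say nothing about the real experiment.

```r
library(lme4)

set.seed(1)  # simulated data only, NOT real measurements

# 4 sites x 3 samples per site, each sample extracted with both protocols
dna <- expand.grid(
  treatment = c("original", "modified"),
  sample    = 1:3,
  site      = paste0("site", 1:4)
)
dna$treatment <- factor(dna$treatment, levels = c("original", "modified"))
dna$site      <- factor(dna$site)
# One unique label per physical sample (12 levels in total)
dna$sample_id <- factor(paste(dna$site, dna$sample, sep = "_"))

# Invented site, sample, and treatment effects plus residual noise
site_eff   <- rnorm(4,  sd = 5)[as.integer(dna$site)]
sample_eff <- rnorm(12, sd = 3)[as.integer(dna$sample_id)]
dna$dna_concentration <- 20 + 8 * (dna$treatment == "modified") +
  site_eff + sample_eff + rnorm(nrow(dna), sd = 2)

# treatment: fixed effect
# (1 | site): random intercept per site
# (1 | sample_id): random intercept per sample, which pairs the two
#                  protocol measurements taken on the same sample
fit <- lmer(dna_concentration ~ treatment + (1 | site) + (1 | sample_id),
            data = dna)
summary(fit)
```

With only four sites the site variance is estimated from very little information, so a singular-fit message would not be surprising on data of this size.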
 
#10
Thanks for the reply @hlsmith! It was very useful. I started with two different models (below), but I'm still struggling to understand how to treat site as a random effect and cluster the samples within the site. I have read several tutorials, but the difference between certain symbols (like |, / and :) in lmer is still unclear to me in this case. Could you please give me an example of how the model you described could be written?

aov(dna_concentration ~ treatment * site + Error(sample_id), data = dna)
lmer(dna_concentration ~ treatment * site + (1| sample_id), data = dna)
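
For reference, a small sketch of how the symbols asked about above behave in lme4 random-effect terms. In `(1 | g)`, the part left of `|` is what varies (here just the intercept) and the part right of it is the grouping factor; `a:b` means the combinations of levels of a and b; `a/b` is shorthand for `a + a:b`, i.e. b nested within a. The formula below is an illustrative assumption, not a recommendation for this data set, and `findbars()` is used only to display how lme4 expands the nested term.

```r
library(lme4)

# Hypothetical formula: samples nested within sites
f_nested <- dna_concentration ~ treatment + (1 | site/sample_id)

# findbars() extracts the random-effect terms from a formula and expands
# the "/" shorthand into separate grouping terms
bars <- findbars(f_nested)
print(bars)
```

The nested term should come back as two separate random-intercept terms, one grouped by site and one grouped by the site-by-sample combination.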
 

hlsmith

Omega Contributor
#11
Are you trying to fit a factorial model by using "*" in the models? If so, does that still leave the base terms in the model?

So is it equivalent to: dna_concentration ~ treatment + site + treatment*site?

P.S. Yes, the lmer coding is confusing; I have only used it a couple of times.
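
On the "*" question: in R's formula syntax, `a * b` is shorthand for `a + b + a:b`, so the main effects do stay in the model. A quick way to check this with base R alone (no data required) is to compare the term labels that `terms()` derives from each formula. The variable names are the ones used in this thread.

```r
# Term labels R derives from the "*" shorthand
star <- attr(terms(dna_concentration ~ treatment * site), "term.labels")

# Term labels from the fully written-out version
explicit <- attr(terms(dna_concentration ~ treatment + site + treatment:site),
                 "term.labels")

print(star)
identical(star, explicit)  # TRUE when both formulas define the same terms
```

Both formulas give the terms treatment, site, and treatment:site.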