Generalized Least Squares (and ANOVA type III)

Dear all,

I ran a type III ANOVA on a model of this kind: gls(y~x1+x2+x3). I found that both x2 and x3 are significant. However, if I drop x3 from the analysis (gls(y~x1+x2)), then no independent variable is significant. If I drop x2 instead (gls(y~x1+x3)), x3 is significant.

So, basically, the effect of x2 on y is influenced by x3. I think this is the correct interpretation. My problem is that x3 has two states, one with 53 individuals and the other with only 6. I suspect that having just 6 members in one of the x3 groups could be distorting the analysis.

What are your thoughts on this?

Thanks in advance.


Omega Contributor
Please provide more information on the sample size and how these variables are formatted. Have you explored any mediation or interaction models?

By Type III, you mean significance after controlling for other variables? Is there any background knowledge to help guide your analyses, including your own hypotheses?


Hello and thank you for answering.
By Type III I mean I tested for the presence of a main effect after the other main effect and interaction.

My sample size is n=59. x1 is a continuous variable (body mass); x2 (diet) and x3 (activity pattern) are categorical. x2 has 3 groups and x3 has 2 groups. I'm testing whether body mass, diet or activity pattern has any effect on cerebellum size. This had not been statistically tested before. When I plot the data, the diet categories don't look very different from each other, but that's not what the ANOVA says when the model contains x3.


Omega Contributor
Well, body mass doesn't seem to be a driver, so you may consider removing it from the model, after making sure that its effect size (though not significant) isn't clinically meaningful. You are currently skating the line of overfitting the model.

Also, do you or do you not have an interaction term in the model?

Having only six subjects in one sub-group is alarming; are those the active or the inactive subjects? Also, it would be interesting to simply tabulate the average size in each of the groups and then the sub-groups: so, diet1, diet2, diet3, then activediet1, activediet2, activediet3, inactivediet1,..., inactivediet3, along with their standard deviations.
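
As a rough sketch of that tabulation (in Python, standard library only): the numbers below are made up purely for illustration and are not your data, and the labels diet1..diet3 and active/inactive are placeholders for whatever your categories really are.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical records: (diet, activity, cerebellum_size).
# These values are invented for illustration only.
records = [
    ("diet1", "active", 10.0), ("diet1", "active", 11.0), ("diet1", "active", 12.0),
    ("diet1", "inactive", 9.0), ("diet1", "inactive", 11.0),
    ("diet2", "active", 8.0), ("diet2", "active", 10.0),
    ("diet2", "inactive", 7.5),
    ("diet3", "active", 12.0), ("diet3", "active", 14.0), ("diet3", "active", 13.0),
]

def summarize(records, key):
    """Group records by key(record) and return {group: (n, mean, sd)}."""
    groups = defaultdict(list)
    for rec in records:
        groups[key(rec)].append(rec[2])
    summary = {}
    for name, values in groups.items():
        # The sample SD is undefined for a single observation.
        sd = stdev(values) if len(values) > 1 else float("nan")
        summary[name] = (len(values), mean(values), sd)
    return summary

by_diet = summarize(records, key=lambda r: r[0])          # diet1, diet2, diet3
by_cell = summarize(records, key=lambda r: (r[0], r[1]))  # diet x activity cells

for cell, (n, m, sd) in sorted(by_cell.items()):
    print(cell, "n =", n, "mean =", round(m, 2), "sd =", round(sd, 2))
```

Printing n next to each mean matters here: a cell with one observation has no standard deviation, and a cell with no observations will not show up in the table at all, which is worth knowing before trusting any adjusted comparison.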

You can also determine their partial R-square contributions; that would be of interest as well.
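
For reference, one common definition of the partial R-square for a term compares the residual sum of squares of the full model with that of the model refitted without the term. The symbols below are my own notation, and this is the ordinary least squares version; for a gls fit with a non-trivial correlation structure R-square is not uniquely defined, so treat it as an approximation.

```latex
R^2_{\text{partial}}(x_j) =
  \frac{SSE_{\text{reduced}} - SSE_{\text{full}}}{SSE_{\text{reduced}}}
```

Here SSE_full is the residual sum of squares with all predictors and SSE_reduced is the residual sum of squares after dropping x_j, so the ratio is the share of the otherwise unexplained variation that x_j accounts for.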

Lastly, I also wonder if you have a physiological rationale for the analyses. Do you have a biologically plausible mechanism driving your hypothesis? I know that atrophy of the cerebellum typically results in less activity and perhaps muscle wasting.


Hi again,

The answer is no. I didn't include any interaction terms. When I drop body mass, both diet and activity pattern are significant.

Activity pattern refers to diurnal vs. nocturnal, so I have 3 types of diet and 2 types of activity pattern. The sub-group with just 6 members has 4 carnivore/nocturnal members and 2 herbivore/nocturnal members. No specimen is both omnivore and nocturnal. The idea behind this is that more visual animals (I would say predators and diurnal species) would have a larger cerebellum, since it drives the ocular muscles. But if no such trend is observed, there are actually several ways to explain its absence.

So, do you think I should drop the activity pattern variable? At least until I obtain more data for the nocturnal group?

Thank you for your help.


Omega Contributor
I would suspend all of the analyses until you can get more data. That way you avoid re-running the analysis every time a few observations come in and stopping as soon as it reaches significance, when, given sampling variability, a few more observations beyond that point might have made the result non-significant again.
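
To show why that matters, here is a minimal simulation sketch (Python, standard library only). It is not your model: it assumes a plain two-sided z-test on normal data with known SD = 1 and no true effect at all, and the sample sizes and number of looks are arbitrary choices of mine.

```python
import math
import random
from statistics import NormalDist

def p_value(sample):
    """Two-sided z-test p-value for H0: mean = 0, assuming known SD = 1."""
    z = sum(sample) / math.sqrt(len(sample))
    return 2 * (1 - NormalDist().cdf(abs(z)))

def false_positive_rates(n_sims=2000, start=10, step=5, n_max=60,
                         alpha=0.05, seed=1):
    """Simulate data with NO true effect and compare two strategies.

    Returns (rate_fixed, rate_peeking):
      rate_fixed   - test once, at the planned sample size n_max
      rate_peeking - test at every interim look, stop at the first p < alpha
    """
    rng = random.Random(seed)
    fixed_hits = 0
    peeking_hits = 0
    for _ in range(n_sims):
        data = [rng.gauss(0, 1) for _ in range(n_max)]
        if p_value(data) < alpha:
            fixed_hits += 1
        # Re-test after each new batch of observations arrives.
        looks = range(start, n_max + 1, step)
        if any(p_value(data[:n]) < alpha for n in looks):
            peeking_hits += 1
    return fixed_hits / n_sims, peeking_hits / n_sims

rate_fixed, rate_peeking = false_positive_rates()
print(f"fixed n: {rate_fixed:.3f}   stop at first significance: {rate_peeking:.3f}")
```

With a single planned test the false positive rate should sit near the nominal 5%, while testing repeatedly and stopping at the first significant result pushes it well above that, even though nothing is going on in the simulated data.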

I had to analyze some genetic data a while back in which only 6 people had these two polymorphisms. The analyses kept leaving me apprehensive, because you have to ask yourself whether these 6 people represent all people like them in the world. It's not quite the same thing, but you can flip a coin and get 6 heads in a row simply because the law of large numbers hasn't had a chance to play out. Likewise, these six may not be a simple random sample that generalizes, so the analyses are more hypothesis-generating than hypothesis-testing.
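
To put a number on the coin example, assuming a fair coin and independent flips:

```latex
P(\text{6 heads in 6 flips}) = \left(\tfrac{1}{2}\right)^{6} = \tfrac{1}{64} \approx 0.016
```

That is rare for any one run of six flips, but across many such runs it will turn up regularly, which is why six observations on their own say little about the long-run behavior.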

So are you catching the animals and killing them and measuring their brains or using a scan? Perhaps you are only catching the inferior animals?