Model selection on Orthogonal data



kalle2
01-17-2011, 04:49 PM
Hi,

I have a question about whether model selection is really necessary when the data are orthogonal, i.e. when there is no correlation between the explanatory variables.

Crawley (The R Book, 2007) says on p. 328: “ANOVA tables are often published containing a mixture of significant and non-significant effects. This is not a problem in orthogonal designs, because sums of squares can be unequivocally attributed to each factor and interaction term. But as soon as there are missing values or unequal weights, then it is impossible to tell how the parameter estimates and standard errors of the significant terms would have been altered if the non-significant terms had been deleted.”

Does this mean that if I have a balanced dataset, with no missing values and no correlation between my explanatory variables, I can draw reliable conclusions about the effects of my variables based solely on the maximal model, i.e. without doing any model selection to derive the minimal adequate model?

Thank you

/K

SE_Lazic
01-25-2011, 05:16 PM
You can obtain reliable conclusions with either the maximal or the minimal adequate model. If you have a non-significant interaction (in a 2x2 design, for example), then removing the interaction term shouldn't make a huge difference to the estimates of the remaining parameters. You gain a residual degree of freedom by removing the interaction term, and if your sample size is small (e.g. n=3 in each cell), this may be an important increase in power.
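To make the point about dropping the interaction concrete, here is a rough sketch in plain Python (no libraries). The data are made up purely for illustration, and the names `cells` and `anova_2x2` are hypothetical, not from any package. It assumes a balanced 2x2 design with n=3 per cell and uses the textbook sum-of-squares formulas, which only hold when the design is balanced. In that case the main-effect sums of squares are identical in both models; the interaction sum of squares and its one degree of freedom simply move into the residual.

```python
# Hypothetical balanced 2x2 data, n = 3 per cell (values invented for illustration).
cells = {
    ("a1", "b1"): [4.1, 5.0, 4.6],
    ("a1", "b2"): [6.2, 5.8, 6.9],
    ("a2", "b1"): [5.1, 4.7, 5.5],
    ("a2", "b2"): [7.4, 6.6, 7.0],
}


def mean(xs):
    return sum(xs) / len(xs)


def anova_2x2(cells):
    """Sums of squares for a BALANCED two-factor design (formulas assume equal n)."""
    a_levels = sorted({a for a, _ in cells})
    b_levels = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))          # replicates per cell
    all_y = [y for ys in cells.values() for y in ys]
    grand = mean(all_y)

    # Marginal and cell means.
    a_mean = {a: mean([y for (ai, _), ys in cells.items() if ai == a for y in ys])
              for a in a_levels}
    b_mean = {b: mean([y for (_, bi), ys in cells.items() if bi == b for y in ys])
              for b in b_levels}
    cell_mean = {key: mean(ys) for key, ys in cells.items()}

    # Maximal model: A + B + A:B.
    ss_a = n * len(b_levels) * sum((a_mean[a] - grand) ** 2 for a in a_levels)
    ss_b = n * len(a_levels) * sum((b_mean[b] - grand) ** 2 for b in b_levels)
    ss_ab = n * sum((cell_mean[(a, b)] - a_mean[a] - b_mean[b] + grand) ** 2
                    for a in a_levels for b in b_levels)
    ss_error = sum((y - cell_mean[key]) ** 2 for key, ys in cells.items() for y in ys)
    df_error = len(all_y) - len(a_levels) * len(b_levels)

    # Additive model: A + B only. In a balanced design the fitted value is
    # a_mean + b_mean - grand, so the residuals can be computed directly.
    ss_resid_additive = sum((y - (a_mean[a] + b_mean[b] - grand)) ** 2
                            for (a, b), ys in cells.items() for y in ys)
    df_resid_additive = df_error + (len(a_levels) - 1) * (len(b_levels) - 1)

    return {
        "ss_a": ss_a, "ss_b": ss_b, "ss_ab": ss_ab,
        "ss_error": ss_error, "df_error": df_error,
        "ss_resid_additive": ss_resid_additive,
        "df_resid_additive": df_resid_additive,
        "ss_total": sum((y - grand) ** 2 for y in all_y),
        # F for factor A under each model: same numerator, different error term.
        "f_a_maximal": ss_a / (ss_error / df_error),
        "f_a_additive": ss_a / (ss_resid_additive / df_resid_additive),
    }


result = anova_2x2(cells)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

The residual sum of squares of the additive model equals `ss_ab + ss_error`, and the residual degrees of freedom go from 8 to 9. Whether the F statistic for a main effect goes up or down after dropping the interaction depends on how large `ss_ab` is relative to the error mean square. With unbalanced data none of these shortcuts apply and the main-effect sums of squares would depend on which other terms are in the model.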

It sounds like you would like to use the maximal model and are wondering whether that is justified. Is this correct? This is not uncommon, especially if the interaction is of theoretical interest and you want to report the result (of course, you could also compare models with and without the interaction directly). Model selection isn't a requirement, and in designed experiments (at least in the nonclinical biomedical field) it is rare.