# Thread: Interaction effects, distributions not normal- is ANOVA justified??

1. ## Interaction effects, distributions not normal- is ANOVA justified??

Hi!

I have a 2*4*6 factorial design. The sample distributions in most cases are NOT normal (tested with the Kolmogorov-Smirnov test). Non-parametric tests are usually recommended in such cases, but my question is this: what test should I use to test for interaction effects?

Thank you!

2. ## Re: Interaction effects, distributions not normal- is ANOVA justified??

Remember that the important thing is not whether the raw data are normally distributed. Rather, the important thing is whether the residuals of the model are normally distributed. So, the first thing you need to do is examine the distribution of your response variable, not your independent variables. Once you get that normally distributed (through a link function or, less preferably, a data transformation), you should be able to tell what your next step should be. I would not go the non-parametric route yet!
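To make the raw-data versus residuals distinction concrete, here is a minimal sketch in plain Python. It assumes the full factorial model (all interactions included), whose fitted value for each observation is simply its cell mean, so the residual is the observation minus its cell mean. The data, the cell labels, and the helper names `cell_mean_residuals` and `qq_correlation` are made up for illustration; the Q-Q correlation is only a rough numeric stand-in for looking at a Q-Q plot.

```python
import random
from collections import defaultdict
from statistics import NormalDist, mean


def cell_mean_residuals(rows):
    """rows: list of (cell_key, y). Return y minus the mean of y's own cell."""
    by_cell = defaultdict(list)
    for cell, y in rows:
        by_cell[cell].append(y)
    cell_means = {cell: mean(ys) for cell, ys in by_cell.items()}
    return [y - cell_means[cell] for cell, y in rows]


def qq_correlation(values):
    """Correlation between the sorted values and standard normal quantiles.

    Values close to 1 mean the points of a normal Q-Q plot lie near a line.
    """
    n = len(values)
    xs = sorted(values)
    qs = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    mx, mq = mean(xs), mean(qs)
    sxq = sum((x - mx) * (q - mq) for x, q in zip(xs, qs))
    sxx = sum((x - mx) ** 2 for x in xs)
    sqq = sum((q - mq) ** 2 for q in qs)
    return sxq / (sxx * sqq) ** 0.5


random.seed(1)
# Hypothetical data: two cells with very different means,
# but normally distributed errors inside each cell.
rows = [(("a1", "b1", "c1"), random.gauss(0, 1)) for _ in range(200)]
rows += [(("a2", "b1", "c1"), random.gauss(8, 1)) for _ in range(200)]

raw_r = qq_correlation([y for _, y in rows])        # pooled raw response
res_r = qq_correlation(cell_mean_residuals(rows))   # pooled residuals

print(f"Q-Q correlation, raw response: {raw_r:.3f}")
print(f"Q-Q correlation, residuals:    {res_r:.3f}")
```

In this made-up example the pooled raw response is bimodal and looks clearly non-normal, while the residuals look fine, because the group differences are part of the model rather than part of the error.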

3. ## Re: Interaction effects, distributions not normal- is ANOVA justified??

OK, that's something I wanted to hear. But now I'm confused. What I did, and thought I was supposed to do, was test whether, for example, each of the 2 independent groups had normally distributed values on the dependent variable. So what is the difference between that and examining the distribution of my response variable? And isn't the distribution of residuals something different from both of those things?

4. ## Re: Interaction effects, distributions not normal- is ANOVA justified??

OK, I tried what you said. I used linear regression to get the standardized residuals and then used the Kolmogorov-Smirnov test to see whether the distribution of residuals in each group differs from normal. I still got the same results as before: most of the distributions are not normal. I don't really know how to use a link function or a data transformation to get normal distributions. Could you please explain what my next step should be? Or do you have any suggestions about what procedure other than ANOVA I could use to test the interaction effects?

6. ## Re: Interaction effects, distributions not normal- is ANOVA justified??

The sample sizes are between 50 and 100. Only one group has n=210. I should also mention that the variances are equal.

And most of my Q-Q plots look like the one on the image I attached.

7. ## Re: Interaction effects, distributions not normal- is ANOVA justified??

You have a design with 2*4*6 = 48 cells.

How many observations do you have in total and in each of the cells? I suggest you give a few values of “n” from the 48 cells.

Even if you had perfectly normally distributed random error terms (which is almost the same as “residuals”) but also an imbalanced design (an unequal “n” across cells), it is questionable whether it is meaningful to estimate interactions.

Maybe someone else has some comments on this?

9. ## Re: Interaction effects, distributions not normal- is ANOVA justified??

Oh, yeah, sorry. Yes, the n's in each of the 48 cells are quite small (<10). I see your point. The interactions are, by the way, all non-significant. I wonder what would happen if I combined some groups, thus decreasing the number of cells and increasing the sample size in each cell. But somehow I think that is not advisable and should have been decided before conducting the experiment.

10. ## Re: Interaction effects, distributions not normal- is ANOVA justified??

Specifically talking about the sample size:

How large is the sample in total?

If you pick out arbitrarily (or even better randomly) 5 to 10 cells, how many observations do you have in each of these cells?

11. ## Re: Interaction effects, distributions not normal- is ANOVA justified??

Is it the same data set you are using in this thread:

Is that another project, another variable, or did you reformulate the problem of what the dependent variable would be?

12. ## Re: Interaction effects, distributions not normal- is ANOVA justified??

The dependent variable is different but the independent variables are the same. So yes, it's the same data set, just a different dependent variable.

13. ## Re: Interaction effects, distributions not normal- is ANOVA justified??

Originally Posted by GretaGarbo
Specifically talking about the sample size:
...
How many in the cells?

14. ## Re: Interaction effects, distributions not normal- is ANOVA justified??

Oh, sorry, I didn't see that. Altogether the whole sample has N=400. Individual cells range from the highest n=64 to the lowest n=1, but most of them are between 5 and 20. I think that's what you were asking.

15. ## Re: Interaction effects, distributions not normal- is ANOVA justified??

I was asking how many observations you have in each cell to get some idea of how unbalanced your design is. It is a matter of the “quality” of the estimates. If the design is exactly balanced it will be “orthogonal”, so including or excluding an interaction term will not influence the estimates of the remaining terms. (The question, in a way, is whether you should include or exclude higher-order interactions.)
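A quick way to see how unbalanced a design is would be to tabulate the observations per cell, including empty cells. The sketch below is only an illustration in plain Python: the helper names `cell_counts` and `is_balanced` and the toy data are hypothetical, and the factor levels are coded as integers for a 2*4*6 layout.

```python
from collections import Counter
from itertools import product


def cell_counts(observations, levels):
    """Count observations per cell; cells with no data are reported as 0.

    observations: iterable of (a, b, c) level tuples, one per observation.
    levels: one iterable of possible levels per factor.
    """
    counts = Counter(observations)
    return {cell: counts.get(cell, 0) for cell in product(*levels)}


def is_balanced(counts):
    """A design is exactly balanced when every cell has the same n."""
    return len(set(counts.values())) == 1


# Hypothetical 2*4*6 design with integer-coded factor levels.
levels = (range(2), range(4), range(6))

# Toy data: 5 observations in every cell, then the same data with
# 3 observations dropped from the last cell.
balanced = [cell for cell in product(*levels) for _ in range(5)]
unbalanced = balanced[:-3]

for name, data in (("balanced", balanced), ("unbalanced", unbalanced)):
    counts = cell_counts(data, levels)
    print(name, "cells:", len(counts),
          "min n:", min(counts.values()),
          "max n:", max(counts.values()),
          "balanced:", is_balanced(counts))
```

The wider the gap between the smallest and largest cell, the further the design is from the orthogonal case described above.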

I have the impression that different users have different opinions about this. I don't want to suggest arbitrary recommendations since I have not seen the data.

16. ## Re: Interaction effects, distributions not normal- is ANOVA justified??

If it is exactly balanced it will be “orthogonal” so including or excluding an interaction term will not influence the estimates of the remaining.
This is a good point. You have a huge range of sample sizes across the cells. As GretaGarbo said, you need to be careful about this.

Another point is about the plot you provided. What does your residual plot for your full model look like?

But, more to the heart of the matter: we don't know what your data look like, so, as Greta says, I think many of us will be cautious about giving specific recommendations. That said, I think most of us agree that models with three-way interactions are a bit tricky. They have to be based on huge sample sizes, but even then interpretation can be difficult. Though I don't entirely recommend this, one option is to drop the highest-order interaction from the model (the three-way term) and then conduct a model selection approach on all lower-order models, with the following as your most parameterized (or global) model:

A + B + C + AxB + AxC + BxC

And then these models:

A + B + C + AxC + BxC
A + B + C + AxB + BxC
A + B + C + AxB + AxC
A + B + C + BxC
A + B + C + AxC
A + B + C + AxB
A + B + C
B + C + BxC
B + C
A + C + AxC
A + C
A + B + AxB
A + B
C
B
A
(null)

Inference, I think, would be a bit easier and would still be strong. Of course, there are trade-offs with this approach. The only major downside I see is that model selection approaches are generally not philosophically compatible with a designed experiment. But I'm not sure if your factorial design was an experiment or observational study, etc. People, of course, have different opinions on this, so take what I say with a few handfuls of salt.
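For reference, a candidate set like this can be generated rather than typed by hand. The sketch below is an assumption about the intended rule, not the poster's own procedure: it keeps a two-way interaction only when both of its main effects are in the model (the marginality principle) and leaves out the three-way term. The function name `hierarchical_models` is made up for illustration.

```python
from itertools import combinations

FACTORS = ("A", "B", "C")


def hierarchical_models(factors=FACTORS):
    """List all models with main effects and two-way interactions only,
    where an interaction appears only if both its main effects do."""
    models = []
    for k in range(len(factors) + 1):
        for mains in combinations(factors, k):
            # Two-way interactions that are allowed given these main effects.
            pairs = list(combinations(mains, 2))
            for j in range(len(pairs) + 1):
                for chosen in combinations(pairs, j):
                    terms = list(mains) + [f"{a}x{b}" for a, b in chosen]
                    models.append(" + ".join(terms) if terms else "(null)")
    return models


models = hierarchical_models()
print(len(models), "candidate models")
for m in models:
    print(m)
```

Under that rule, three factors give 18 candidate models, from the null model up to the global model with all three two-way interactions. Generating the set this way is one guard against duplicated or missing entries.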
