Poll: ANOVA and Missing Data... what to do?

Student shows up with an unbalanced factorial ANOVA design. You (being awesome)...

  • Treat the unbalancedness as missing data and model it

    Votes: 0 0.0%
  • Screw missing data. SPSS uses Type III Sums of Squares. STICK.TO.IT.

    Votes: 0 0.0%
  • Do people still care about unbalanced designs in factorial ANOVAs?

    Votes: 4 100.0%


Smelly poop man with doo doo pants.
basic set up. student shows up at your office with the all-too-common problem of having an unbalanced factorial ANOVA design and asks you what to do.

here are your options:

- "I just learnt about missing data methods in SEM (Full Information Maximum Likelihood and Multiple Imputation). Since the General Linear Model can be subsumed under the general framework of Latent Variable Models, i'll just treat that unbalanced stuff as missing data and model the *heck* out of it!"

- "Missing data methods? Screw that! SPSS defaults to Type III sums of squares. STICK TO IT, YOU FANCY PANTS".

- "Unbalanced designs in ANOVA? Do people still *care* about that stuff? :p"


Cookie Scientist
12/05 12:15 Jake: your poll is about an unbalanced design, but your thread title talks about missing data. i guess that's a big hint about how you would approach this one
12/05 12:16 Jake: do people still care about unbalanced factorial designs or have we reached the 1960s yet
12/05 12:17 spunky: so what do people do in the real world? stick to SPSS's default of Type III sums of squares and move on?
12/05 12:18 spunky: and, even more importantly, if people stick to Type III SSq is that the best approach to this?
12/05 12:19 Jake: most people just stick to type 3. i guess your concern is that the type 3 estimates are slightly less efficient in the unbalanced case?
12/05 12:21 Jake: in theory you could use type 3 estimates and adjust your contrast codes so that they provide estimates equivalent to the type 2 ones. but i've never seen anyone actually do this
12/05 12:21 spunky: something along those lines, true. the other thing is that i see people regularly using these fancy-dandy missing data methods with uber-complicated designs and i can barely find anything related to much more pedestrian methods like regression/ANOVA (particularly ANOVA where i would say having an unbalanced design is more the rule rather than the exception)
12/05 12:22 Dason: a factorial design is uber-complicated?
12/05 12:25 spunky: no. a factorial design is simple. i'm talking about more in the context of SEM where i see these methods used routinely but they don't get used in easy designs (ANOVA/regression)
12/05 12:25 Jake: i see no reason to use missing data methods just because you have an unbalanced design... in fact that seems pretty suspect to me
12/05 12:25 spunky: sorry, i kinda meant to imply ANOVA/regression (or the general linear model) = simple. weirdo multivariate designs with latent variables = complicated
12/05 12:25 Dason: I agree Jake
12/05 12:25 spunky: @Jake why would it be suspect?
12/05 12:26 Dason: give a good reason for doing it in the first place? Because it makes it easier to get the estimates by hand?
12/05 12:26 Jake: because you don't actually have missing data?
12/05 12:27 Jake: if you call an unbalanced design missing data, isn't that kind of like saying "well i recruited 70 subjects, but i wanted to get 100... so i have 30 missing data points!"
12/05 12:27 Dason: I agree
12/05 12:28 spunky: uhm. good point. what if you have attrition in a repeated-measures ANOVA context?
12/05 12:28 spunky: go you mixed models. i answered my own question
12/05 12:29 Jake: haha well yes i would use a mixed model, but i think it would at least be sensible to use missing data methods in that case
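
For anyone following along, here is a minimal sketch of why the choice of sums of squares matters once cell sizes are unequal. It is not from the chat: the data are made up and the function names are just for illustration. In a full factorial model with no empty cells, the Type III main-effect test compares unweighted marginal means (averages of cell means), while a Type I test with that factor entered first compares marginal means weighted by cell size. Type II uses yet another weighting scheme and is not computed here. In a balanced design all of these coincide.

```python
# Hypothetical 2x2 factorial data (factor A x factor B); all values are made up.
# Cell sizes are deliberately unequal to mimic an unbalanced design.
cells = {
    ("a1", "b1"): [4.0, 5.0, 6.0, 5.0, 5.0, 5.0],        # n = 6, mean 5
    ("a1", "b2"): [9.0, 11.0],                            # n = 2, mean 10
    ("a2", "b1"): [6.0, 8.0],                             # n = 2, mean 7
    ("a2", "b2"): [11.0, 12.0, 13.0, 12.0, 12.0, 12.0],   # n = 6, mean 12
}


def mean(values):
    return sum(values) / len(values)


def weighted_marginal_mean(a_level):
    """Pool every observation at this level of A, so big cells count more."""
    pooled = [y for (a, _b), ys in cells.items() if a == a_level for y in ys]
    return mean(pooled)


def unweighted_marginal_mean(a_level):
    """Average the cell means, so every cell counts equally."""
    cell_means = [mean(ys) for (a, _b), ys in cells.items() if a == a_level]
    return mean(cell_means)


# "Effect of A" under the two weighting schemes.
weighted_diff = weighted_marginal_mean("a2") - weighted_marginal_mean("a1")
unweighted_diff = unweighted_marginal_mean("a2") - unweighted_marginal_mean("a1")

print("A effect, weighted marginal means:  ", weighted_diff)
print("A effect, unweighted marginal means:", unweighted_diff)
```

With these numbers the two differences come out to 4.5 and 2.0, so the "effect of A" depends on which question you ask. That is a choice about the hypothesis being tested, not a missing data problem.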