# Thread: ANOVA with multiple imputed Data

1. ## ANOVA with multiple imputed Data

Hello

I used multiple imputation on my data to get a complete data set. I want to do an ANOVA now. Does anybody know how to do that correctly?

SPSS calculates an ANOVA for every single imputed dataset but does not pool the results. Some of my imputations are significant (e.g. p = 0.04) and some aren't (e.g. p = 0.07).

There is a small amount of literature about pooling multiply imputed data, but I don't understand it (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4029775/).

Froop

2. ## Re: ANOVA with multiple imputed Data

The process is actually much easier than you probably think, based on Rubin's approach.

You average the estimates from the analyses run on each imputed dataset, and that gives you the pooled estimate across imputations.

Then you build the SE from the within- and between-imputation variability of those analyses. The section in your link about single pooling covers this. So the estimate is super easy to get, and the SE is created from both sources of variability. This makes the SE a little larger, since it also accounts for the slight variability between imputations, which exists because the imputed values are probability based.
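For reference, the pooling described above is usually written as follows. This is a sketch of Rubin's rules for a single scalar estimate; the notation is an assumption on my part ($m$ imputations, $\hat{Q}_i$ the estimate from imputation $i$, $U_i$ its squared standard error) and may not match the linked paper symbol for symbol.

$$
\begin{aligned}
\bar{Q} &= \frac{1}{m}\sum_{i=1}^{m}\hat{Q}_i && \text{(pooled estimate)}\\
\bar{U} &= \frac{1}{m}\sum_{i=1}^{m}U_i && \text{(within-imputation variance)}\\
B &= \frac{1}{m-1}\sum_{i=1}^{m}\left(\hat{Q}_i-\bar{Q}\right)^2 && \text{(between-imputation variance)}\\
T &= \bar{U} + \left(1+\frac{1}{m}\right)B, \qquad \mathrm{SE} = \sqrt{T} && \text{(total variance and pooled SE)}
\end{aligned}
$$

Because $B \ge 0$, the pooled SE can never be smaller than the one implied by the average within-imputation variance alone.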

3. ## Re: ANOVA with multiple imputed Data

SE = standard error?

So I just calculate the average of "everything"?^^

Thanks!

4. ## Re: ANOVA with multiple imputed Data

Originally Posted by froop91
SE = standard error?

So I just calculate the average of "everything"?^^

Thanks!
Not everything. The only things you can "average" (as in just taking the mean) are the parameter estimates you obtain (regression coefficients, correlation coefficients, etc.). That would be 'Q' in Eq (1) of your link. Then you need to calculate the within- and between-imputation variance, get the test statistic manually, etc.

Overall it is quite a drag to do. But I just wanted to remind you that this is not as easy as just "taking the average of everything" or "averaging the datasets and running the analysis on it". It's a little more complicated than that.
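A minimal sketch of that manual calculation for one scalar estimate, assuming Rubin's rules with the classic large-sample degrees of freedom. The function name `pool_rubin` and the example numbers are hypothetical, not taken from the thread or the paper. An ANOVA F-test involves several parameters at once, so it needs more than this scalar version.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool one scalar estimate across m imputations (Rubin's rules)."""
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need matching lists from at least two imputations")

    q_bar = mean(estimates)                     # pooled point estimate
    u_bar = mean(se ** 2 for se in std_errors)  # within-imputation variance
    b = variance(estimates)                     # between-imputation variance (m - 1 denominator)
    total = u_bar + (1 + 1 / m) * b             # total variance
    se = sqrt(total)

    # Classic degrees of freedom: (m - 1) * (1 + 1/r)^2,
    # where r is the relative increase in variance due to missingness.
    if b == 0:
        df = float("inf")
    elif u_bar == 0:
        df = m - 1
    else:
        r = (1 + 1 / m) * b / u_bar
        df = (m - 1) * (1 + 1 / r) ** 2

    return {
        "estimate": q_bar,
        "se": se,
        "t": q_bar / se if se > 0 else float("nan"),
        "df": df,
    }


# Hypothetical coefficient and SE from three imputed-data analyses.
result = pool_rubin([0.50, 0.60, 0.55], [0.20, 0.22, 0.21])
print(result)
```

The resulting `t` would be compared against a t distribution with `df` degrees of freedom, which is the "get the test statistic manually" step.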

5. ## Re: ANOVA with multiple imputed Data

OK, then I have to look into how to calculate the within- and between-imputation variance.

thx a lot

6. ## Re: ANOVA with multiple imputed Data

Just to be sure, let me try to make an example:

Original Data:

Participant 1: 5 4 3 -
Participant 2: - 2 1 1
Participant 3: 1 2 4 -

Imputation 1:

Participant 1: 5 4 3 2
Participant 2: 2 2 1 1
Participant 3: 1 2 4 3

Imputation 2:

Participant 1: 5 4 3 1
Participant 2: 2 2 1 1
Participant 3: 1 2 4 1

So I average the Imputations

Participant 1: 5 4 3 1.5
Participant 2: 2 2 1 1
Participant 3: 1 2 4 2

And based on this averaged imputation sheet I calculate the within and between variance (by hand)? If this is correct, how can I tell SPSS to average the imputations? As far as I know, SPSS keeps all imputations separate and only gives pooled results for some procedures like "frequency", but doesn't pool the data itself.

Is it better to export my data to Excel to be able to calculate properly, or can SPSS do all the calculations?

Thanks for the help and sorry for the question, but I am still confused about the topic.

7. ## Re: ANOVA with multiple imputed Data

You average your estimates (not the imputed data themselves). I am guessing you don't have that many imputed sets, so just put the estimates in a new data frame and ask SPSS to average them.

As Spunky reiterated, the variance part requires the formula in the paper. Yes, SE = standard error.
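To make the distinction concrete with the toy data from the previous post, here is a minimal sketch. It assumes, purely for illustration, that the quantity of interest is the mean of the fourth measurement (not an ANOVA), and it compares pooling the per-imputation estimates with averaging the imputed datasets first. The variable names are mine.

```python
from math import sqrt
from statistics import mean, variance

# The two imputed datasets from the example (rows = participants).
imputations = [
    [[5, 4, 3, 2], [2, 2, 1, 1], [1, 2, 4, 3]],
    [[5, 4, 3, 1], [2, 2, 1, 1], [1, 2, 4, 1]],
]
col = 3  # fourth measurement, the only one with imputed values for participants 1 and 3

# Correct route: analyse each imputed dataset separately, then pool.
estimates, within = [], []
for data in imputations:
    values = [row[col] for row in data]
    estimates.append(mean(values))                 # estimate from this imputation
    within.append(variance(values) / len(values))  # its squared standard error

m = len(imputations)
q_bar = mean(estimates)          # pooled estimate
u_bar = mean(within)             # within-imputation variance
b = variance(estimates)          # between-imputation variance
pooled_se = sqrt(u_bar + (1 + 1 / m) * b)

# Naive route: average the imputed datasets cell by cell, then analyse once.
n = len(imputations[0])
averaged = [mean(data[i][col] for data in imputations) for i in range(n)]
naive_estimate = mean(averaged)
naive_se = sqrt(variance(averaged) / n)

print(q_bar, pooled_se)
print(naive_estimate, naive_se)
```

Both routes give the same point estimate here, but the averaged dataset hides the disagreement between the imputations, so its standard error comes out much smaller than the pooled one.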

8. ## Re: ANOVA with multiple imputed Data

Just post the output for the m analyses and make Spunky do it for you!

9. ## Re: ANOVA with multiple imputed Data

I haven't done it with R yet, but I used SAS (i.e., PROC MIANALYZE) and it is as easy as inputting values.

10. ## Re: ANOVA with multiple imputed Data

I use the mice package in R. But I know Stata also has good missing-data handling capabilities, so go with whichever one you think is easier for you, I guess.

Originally Posted by froop91
Would be the easiest but its my exam project so I have to do it^^
If this is an exam project, didn't they teach you in school how to do it before they let you do it yourself? I'm just wondering if maybe you have something in your notes on how to do this stuff, and then you won't need to switch software or anything.

11. ## Re: ANOVA with multiple imputed Data

Originally Posted by spunky
If this is an exam project, didn't they teach you in school how to do it before they let you do it yourself? I'm just wondering if maybe you have something in your notes on how to do this stuff, and then you won't need to switch software or anything.
The answer is always yes, despite what many students tell you, barring any crappy for-profit schools and some community colleges where I have seen this happen. That's the minority, though. On occasion, I've seen professors assign a project with the intention of students completing parts as the material is covered in class.

12. ## Re: ANOVA with multiple imputed Data

Originally Posted by ondansetron
The answer is always yes, despite what many students tell you, barring any crappy for-profit schools and some community colleges where I have seen this happen. That's the minority, though. On occasion, I've seen professors assign a project with the intention of students completing parts as the material is covered in class.
Well, when I’ve taught or TA’d I’ve seen one of two things happening, depending on the type of project.

One is you give the students a dataset with the issues/kinks covered in class so you can see if they’re able to recognize them and address them. The other is you let students do their own project with their own datasets and then the kinks and peculiarities of the dataset reveal themselves as the project goes along. When you find yourself in the latter situation is when the students may struggle a little bit more because you can’t possibly cover every single data issue in an introductory class (like how to handle missing data or what to do if you have a truncated variable, etc.) and they get lost trying to figure things out themselves. So I feel like whereas in scenario #1 you just tell the person “go look it up on your notes” in scenario #2, as an instructor, it’s more like “wow, good job for recognizing this as a problem and trying to fix it yourself”. I tend to work in the latter scenario (people are more interested in analyzing their own data than whatever you can give them) and a lot of the material that’s covered in my classes has now changed because of it. But it obviously demands more of you as an instructor because you need to look after as many datasets as people are in your class.

13. ## Re: ANOVA with multiple imputed Data

Originally Posted by spunky
Well, when I’ve taught or TA’d I’ve seen one of two things happening, depending on the type of project.

One is you give the students a dataset with the issues/kinks covered in class so you can see if they’re able to recognize them and address them. The other is you let students do their own project with their own datasets and then the kinks and peculiarities of the dataset reveal themselves as the project goes along. When you find yourself in the latter situation is when the students may struggle a little bit more because you can’t possibly cover every single data issue in an introductory class (like how to handle missing data or what to do if you have a truncated variable, etc.) and they get lost trying to figure things out themselves. So I feel like whereas in scenario #1 you just tell the person “go look it up on your notes” in scenario #2, as an instructor, it’s more like “wow, good job for recognizing this as a problem and trying to fix it yourself”. I tend to work in the latter scenario (people are more interested in analyzing their own data than whatever you can give them) and a lot of the material that’s covered in my classes has now changed because of it. But it obviously demands more of you as an instructor because you need to look after as many datasets as people are in your class.
We gave the illusion of choice in our class. Students could pick any data set they desired, so long as it was from our pool of 4-6 pre-approved sets ... it helped us focus the scope to what we had taught. Somehow students always came into the TA lab hours saying "We didn't do this in class!" Then, I would show them in their notebook or the course notes where we did it. You're a bit more bold since you let them pick any data set they want.

14. ## Re: ANOVA with multiple imputed Data

If all else fails, you can run it as a General Linear Model using the original data; then imputation is not needed.

15. ## Re: ANOVA with multiple imputed Data

Originally Posted by katxt
If all else fails, you can run it as a General Linear Model using the original data; then imputation is not needed.
This is me being curious here. When you say "run it as a GLM" do you mean re-phrasing the problem as an unbalanced groups design (where the missingness implies factors with an unequal number of participants) or do you mean running a linear mixed effects model that can naturally handle unbalanced groups?