
Thread: ANOVA with multiple imputed Data

  1. #1

    ANOVA with multiple imputed Data




    Hello

    I used multiple imputation on my data to get a complete data set, and now I want to run an ANOVA. Does anybody know how to do that correctly?

    SPSS calculates an ANOVA for every single imputed data set but does not pool the results. Some of my imputed data sets give a significant result (e.g. p = 0.04) and some don't (e.g. p = 0.07).

    There is a small literature on pooling results from multiply imputed data, but I don't understand it (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4029775/).

    Thanks in advance!

    Froop

  2. #2
    hlsmith

    Re: ANOVA with multiple imputed Data

    The process, which is based on Rubin's approach, is actually much easier than you probably think.

    You run the analysis on each imputed data set and average the resulting estimates. That gets you the estimate across imputations.

    Then you combine the SEs from those analyses using the within- and between-imputation variability. The section in your link about single pooling covers this. So the estimate is super easy to get, and the SE is then built from the within- and between-imputation variance. This makes the SE a little larger, since it also takes into account the slight variability between imputations, which is there because the imputation is probability based.
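
    For reference, the pooling described above can be written out as follows. This is only a sketch of the standard form of Rubin's rules for a single parameter, with m imputations, estimates Q-hat and their squared standard errors U; the symbols U, B and T are notation assumed here and may be labelled differently in the linked paper.

```latex
\begin{align*}
\bar{Q} &= \frac{1}{m}\sum_{i=1}^{m}\hat{Q}_i
  && \text{pooled estimate} \\
\bar{U} &= \frac{1}{m}\sum_{i=1}^{m} U_i
  && \text{within-imputation variance} \\
B &= \frac{1}{m-1}\sum_{i=1}^{m}\bigl(\hat{Q}_i-\bar{Q}\bigr)^{2}
  && \text{between-imputation variance} \\
T &= \bar{U} + \Bigl(1+\frac{1}{m}\Bigr)B, \qquad \mathrm{SE} = \sqrt{T}
  && \text{total variance and pooled SE}
\end{align*}
```

    Because B is never negative, T is at least as large as the average within-imputation variance, which is why the pooled SE comes out a little larger.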
    Stop cowardice, ban guns!

  3. #3

    Re: ANOVA with multiple imputed Data

    SE = standard error?

    So I just calculate the average of "everything"?^^

    Thanks!

  4. #4
    spunky

    Re: ANOVA with multiple imputed Data

    Originally posted by froop91:
    SE = standard error?

    So I just calculate the average of "everything"?^^

    Thanks!
    Not everything. The only things you can "average" (as in just taking the mean) are the parameter estimates you obtain (regression coefficients, correlation coefficients, etc.). That would be 'Q' in Eq. (1) of your link. Then you need to calculate the within- and between-imputation variance, get the test statistic manually, etc.

    Overall it is quite a drag to do. I just wanted to remind you that this is not as easy as "taking the average of everything" or "averaging the datasets and running the analysis on it". It's a little more complicated than that.
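
    To make the "average the estimates, then combine the variances" step concrete, here is a minimal Python sketch for a single parameter. The function name `pool_scalar` and all numbers are made up for illustration, and the degrees of freedom use Rubin's classic large-sample formula, which is an assumption here and may differ from the small-sample version you end up needing.

```python
from math import inf, sqrt
from statistics import mean, variance


def pool_scalar(estimates, variances):
    """Pool one parameter across m imputations with Rubin's rules.

    estimates: the m point estimates (one per imputed data set)
    variances: the m squared standard errors of those estimates
    """
    m = len(estimates)
    q_bar = mean(estimates)          # pooled estimate: plain average
    u_bar = mean(variances)          # within-imputation variance
    b = variance(estimates)          # between-imputation variance (m - 1 denominator)
    total = u_bar + (1 + 1 / m) * b  # total variance
    r = (1 + 1 / m) * b / u_bar      # relative increase in variance due to missingness
    # Rubin's large-sample degrees of freedom; infinite if the estimates never vary.
    df = (m - 1) * (1 + 1 / r) ** 2 if r > 0 else inf
    se = sqrt(total)
    return {"estimate": q_bar, "se": se, "df": df, "t": q_bar / se}


# Hypothetical coefficient and SE from five imputed data sets.
estimates = [0.52, 0.47, 0.55, 0.49, 0.50]
std_errors = [0.20, 0.21, 0.19, 0.20, 0.22]

pooled = pool_scalar(estimates, [se ** 2 for se in std_errors])
print(pooled)
```

    Note that the function takes squared standard errors, not the standard errors themselves: it is the variances that get averaged, and the square root is only taken at the end.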
    for all your psychometric needs! https://psychometroscar.wordpress.com/about/

  5. #5

    Re: ANOVA with multiple imputed Data

    OK, then I have to look into how to calculate the within- and between-imputation variance.

    Thanks a lot

  6. #6

    Re: ANOVA with multiple imputed Data

    Just to be sure, I'll try to make an example:



    Original Data:

    Participant 1: 5 4 3 -
    Participant 2: - 2 1 1
    Participant 3: 1 2 4 -


    Imputation 1:

    Participant 1: 5 4 3 2
    Participant 2: 2 2 1 1
    Participant 3: 1 2 4 3


    Imputation 2:

    Participant 1: 5 4 3 1
    Participant 2: 2 2 1 1
    Participant 3: 1 2 4 1


    So I average the Imputations


    Participant 1: 5 4 3 1.5
    Participant 2: 2 2 1 1
    Participant 3: 1 2 4 2


    And based on this averaged imputation sheet I calculate the within and between variance (by hand)? If this is correct, how can I tell SPSS to average the imputations? As far as I know, SPSS keeps all imputations separate and only gives pooled results for some procedures like "frequency", but doesn't pool the data itself.

    Is it better to export my data to Excel to do the calculations properly, or can SPSS do all the calculations?

    Thanks for the help, and sorry for the question, but I am still confused about the topic.
    Last edited by froop91; 09-07-2017 at 04:31 AM.

  7. #7
    hlsmith

    Re: ANOVA with multiple imputed Data

    You average your estimates. I am guessing you don't have that many imputed sets, so just put the estimates in a new data frame and ask SPSS to average them.

    As Spunky reiterated, the variance part requires the formula in the paper. Yes, SE = standard error.

  8. #8
    spunky

    Re: ANOVA with multiple imputed Data

    Originally posted by froop91:
    Just to be sure i try to make an example:



    Original Data:

    Participant 1: 5 4 3 -
    Participant 2: - 2 1 1
    Participant 3: 1 2 4 -


    Imputation 1:

    Participant 1: 5 4 3 2
    Participant 2: 2 2 1 1
    Participant 3: 1 2 4 3


    Imputation 2:

    Participant 1: 5 4 3 1
    Participant 2: 2 2 1 1
    Participant 3: 1 2 4 1


    So I average the Imputations


    Participant 1: 5 4 3 1.5
    Participant 2: 2 2 1 1
    Participant 3: 1 2 4 2


    And based on this averaged imputation sheet I calculate within and between variance (by hand)? If this is correct, how can I tell SPSS to average the imputations?
    NO! This is *exactly* what we warned you not to do! What you should be doing looks more like:


    Original Data:

    Participant 1: 5 4 3 -
    Participant 2: - 2 1 1
    Participant 3: 1 2 4 -


    Imputation 1: <---- RUN ANOVA HERE, GET PARAMETER ESTIMATES (WE'LL CALL THEM Q1)

    Participant 1: 5 4 3 2
    Participant 2: 2 2 1 1
    Participant 3: 1 2 4 3


    Imputation 2: <---- RUN ANOVA HERE, GET PARAMETER ESTIMATES (WE'LL CALL THEM Q2)

    Participant 1: 5 4 3 1
    Participant 2: 2 2 1 1
    Participant 3: 1 2 4 1


    Now you have two vectors of parameter estimates, Q1 and Q2. You average Q1 and Q2 to get the parameter estimates on which you will do your hypothesis tests, you pool the variances and standard errors of Q1 and Q2 to get the correct within- and between-imputation variance, and finally you get the F-statistic that you want. Everything is done by hand, following Eq. (1)-(6) of the document you linked.
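
    When several parameters are tested jointly, as an omnibus ANOVA F-test does, the pooling takes a multivariate form. What follows is a sketch of the commonly used pooled Wald-type statistic (often called D1), not a transcription of the paper's equations. The symbols k (number of parameters tested), Q_0 (their null value, usually zero) and r are assumptions of this sketch. The denominator degrees of freedom are deliberately left out, so take those from the linked paper.

```latex
\begin{align*}
\bar{Q} &= \frac{1}{m}\sum_{i=1}^{m}\hat{Q}_i, \qquad
\bar{U} = \frac{1}{m}\sum_{i=1}^{m} U_i, \qquad
B = \frac{1}{m-1}\sum_{i=1}^{m}
    \bigl(\hat{Q}_i-\bar{Q}\bigr)\bigl(\hat{Q}_i-\bar{Q}\bigr)^{\top} \\
r &= \Bigl(1+\frac{1}{m}\Bigr)\frac{\operatorname{tr}\bigl(B\,\bar{U}^{-1}\bigr)}{k} \\
D_1 &= \frac{\bigl(\bar{Q}-Q_0\bigr)^{\top}\,\bar{U}^{-1}\,\bigl(\bar{Q}-Q_0\bigr)}{k\,(1+r)}
\end{align*}
```

    Here each Q-hat is the vector of k estimates from one imputed data set and each U is its k-by-k covariance matrix. D1 is referred to an F distribution with k numerator degrees of freedom. With k = 1 it reduces to the squared pooled estimate divided by the total variance.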

    Notice that, as shown in the example of the article you linked, you'll need to reframe the ANOVA as a multiple regression, so you'll need to ask SPSS for the regression equation to get the regression coefficients and R-squared (whose F-test is statistically equivalent to the F-test you get by taking ratios of mean squares).

    Here's my honest opinion. If you're dealing with missing data, switch software programs. SPSS makes things so unnecessarily complicated that it almost makes you wonder why they bothered only giving you half of the missing data routine.
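
    The "analyse each imputation, then pool the results" workflow can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the paper's procedure: the data are invented, there are only two groups (so the ANOVA F-test is equivalent to the squared pooled-variance t-test and a single parameter is enough), and the helper name `mean_difference` is hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def mean_difference(group_a, group_b):
    """Analyse ONE completed data set: mean difference and its squared SE."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return mean(group_a) - mean(group_b), pooled_var * (1 / n_a + 1 / n_b)


# Hypothetical completed data sets as (group A scores, group B scores).
# Only the last value of each group was imputed, so only it varies.
imputed_sets = [
    ([5, 4, 3, 2], [2, 2, 1, 1]),
    ([5, 4, 3, 1], [2, 2, 1, 2]),
    ([5, 4, 3, 3], [2, 2, 1, 1]),
]

# Step 1: run the same analysis on every imputed data set separately.
results = [mean_difference(a, b) for a, b in imputed_sets]
estimates = [q for q, _ in results]
variances = [u for _, u in results]

# Step 2: pool the RESULTS (never the data) with Rubin's rules.
m = len(imputed_sets)
q_bar = mean(estimates)              # pooled estimate
u_bar = mean(variances)              # within-imputation variance
b = variance(estimates)              # between-imputation variance
total_var = u_bar + (1 + 1 / m) * b  # total variance

# Step 3: pooled test statistic; with a single parameter, F = t squared.
t_stat = q_bar / sqrt(total_var)
f_stat = t_stat ** 2
print(f"pooled difference = {q_bar:.3f}, SE = {sqrt(total_var):.3f}, F = {f_stat:.3f}")
```

    The point of the sketch is the order of operations: the imputed values are never averaged across data sets. Only the estimates and their variances are combined.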

  9. #9
    hlsmith

    Re: ANOVA with multiple imputed Data

    Just post the output for the m analyses and make Spunky do it for you!

  10. #10

    Re: ANOVA with multiple imputed Data

    Originally posted by spunky:

    Here's my honest opinion. If you're dealing with missing data, switch software programs. SPSS makes things so unnecessarily complicated that it almost makes you wonder why they bothered only giving you half of the missing data routine.
    Which one do you prefer then? Stata? R?

    You average your estimates, I am guessing you don't have that many impute sets, so just put them in a new data frame and ask SPSS to average.
    Unfortunately I have 20 Imputations :/


    Actually I am trying to test 2 effects on 3 outcomes at 3 brands (MANOVA), but I think I will just run several ANOVAs.

    If I just run an ANOVA on one of the parts, SPSS gives me the following:
    https://www.imageupload.co.uk/image/DFuV


    Just post the output for the m analyses and make Spunky do it for you!
    That would be the easiest, but it's my exam project so I have to do it myself ^^
    Maybe if I switch to a program that does it for me it will get easier...

    Usually I'm not that bad at math, but these Eq. (1)-(6) look kind of difficult... I don't know why... Maybe if someone could give me a small worked example, I could do the rest.


    Thanks a lot, guys
    Last edited by froop91; 09-08-2017 at 03:38 AM.

  11. #11
    hlsmith

    Re: ANOVA with multiple imputed Data

    I haven't done it with R yet, but I used SAS (i.e., PROC MIANALYZE) and it is as easy as inputting values.
    Last edited by hlsmith; 09-09-2017 at 07:30 PM. Reason: Change imputing to inputting, typo

  12. #12
    spunky

    Re: ANOVA with multiple imputed Data

    I use the mice package in R. But I know Stata also has good missing-data handling capabilities, so go with whichever one you think is easier for you.

    Originally posted by froop91:
    Would be the easiest but its my exam project so I have to do it^^
    If this is an exam project, didn't they teach you in school how to do it before they let you do it yourself? I'm just wondering if maybe you have something in your notes on how to do this stuff, and then you won't need to switch software or anything.

  13. #13

    Re: ANOVA with multiple imputed Data

    Originally posted by spunky:
    If this is an exam project, didn't they teach you in school how to do it before they let you do it yourself? I'm just wondering if maybe you have something in your notes on how to do this stuff, and then you won't need to switch software or anything.
    The answer is always yes, despite what many students tell you, barring any crappy for-profit schools and some community colleges where I have seen this happen. That's the minority, though. On occasion, I've seen professors assign a project with the intention of students completing parts as the material is covered in class.

  14. #14
    spunky

    Re: ANOVA with multiple imputed Data

    Originally posted by ondansetron:
    The answer is always yes, despite what many students tell you, barring any crappy for-profit schools and some community colleges where I have seen this happen. That's the minority, though. On occasion, I've seen professors assign a project with the intention of students completing parts as the material is covered in class.
    Well, when I’ve taught or TA’d I’ve seen one of two things happening, depending on the type of project.

    One is you give the students a dataset with the issues/kinks covered in class so you can see if they’re able to recognize them and address them. The other is you let students do their own project with their own datasets and then the kinks and peculiarities of the dataset reveal themselves as the project goes along. When you find yourself in the latter situation is when the students may struggle a little bit more because you can’t possibly cover every single data issue in an introductory class (like how to handle missing data or what to do if you have a truncated variable, etc.) and they get lost trying to figure things out themselves. So I feel like whereas in scenario #1 you just tell the person “go look it up on your notes” in scenario #2, as an instructor, it’s more like “wow, good job for recognizing this as a problem and trying to fix it yourself”. I tend to work in the latter scenario (people are more interested in analyzing their own data than whatever you can give them) and a lot of the material that’s covered in my classes has now changed because of it. But it obviously demands more of you as an instructor because you need to look after as many datasets as people are in your class.

  15. #15

    Re: ANOVA with multiple imputed Data


    Originally posted by spunky:
    Well, when I’ve taught or TA’d I’ve seen one of two things happening, depending on the type of project.

    One is you give the students a dataset with the issues/kinks covered in class so you can see if they’re able to recognize them and address them. The other is you let students do their own project with their own datasets and then the kinks and peculiarities of the dataset reveal themselves as the project goes along. When you find yourself in the latter situation is when the students may struggle a little bit more because you can’t possibly cover every single data issue in an introductory class (like how to handle missing data or what to do if you have a truncated variable, etc.) and they get lost trying to figure things out themselves. So I feel like whereas in scenario #1 you just tell the person “go look it up on your notes” in scenario #2, as an instructor, it’s more like “wow, good job for recognizing this as a problem and trying to fix it yourself”. I tend to work in the latter scenario (people are more interested in analyzing their own data than whatever you can give them) and a lot of the material that’s covered in my classes has now changed because of it. But it obviously demands more of you as an instructor because you need to look after as many datasets as people are in your class.
    We gave the illusion of choice in our class. Students could pick any data set they desired, so long as it was from our pool of 4-6 pre-approved sets ... it helped us focus the scope to what we had taught. Somehow students always came into the TA lab hours saying "We didn't do this in class!" Then, I would show them in their notebook or the course notes where we did it. You're a bit more bold since you let them pick any data set they want.
