
Thread: Trend analysis / regression model with missing data

  1. #1
    Points: 2,462, Level: 30
    Level completed: 8%, Points required for next Level: 138

    Posts
    200
    Thanks
    20
    Thanked 48 Times in 43 Posts

    Trend analysis / regression model with missing data




    Hi,

    I want to fit a Poisson GLM to count data in order to analyze trends. However, some counts are missing (i.e. there are missing values in the outcome variable) in the data frame. Is there a recommended method for dealing with this? I know that multiple imputation methods exist to fill the gaps in the data, but is this the most recommended approach if I subsequently want to apply a regression model, e.g. for trend analysis?

    Thanks

  2. #2
    Fortran must die
    Points: 58,790, Level: 100
    Level completed: 0%, Points required for next Level: 0
    noetsi's Avatar
    Posts
    6,532
    Thanks
    692
    Thanked 915 Times in 874 Posts

    Re: Trend analysis / regression model with missing data

    I know nothing about Poisson GLMs. But, with the caveat that statisticians disagree about everything, I think that multiple imputation is the state of the art for missing data that will be used in regression, or at least it is among the most recommended methods. It is anything but simple for non-interval data.
    "Very few theories have been abandoned because they were found to be invalid on the basis of empirical evidence...." Spanos, 1995

  3. The Following User Says Thank You to noetsi For This Useful Post:

    mmercker (01-26-2016)

  4. #3
    Omega Contributor
    Points: 38,303, Level: 100
    Level completed: 0%, Points required for next Level: 0
    hlsmith's Avatar
    Location
    Not Ames, IA
    Posts
    6,993
    Thanks
    397
    Thanked 1,185 Times in 1,146 Posts

    Re: Trend analysis / regression model with missing data

    Agreed that multiple imputation is a preferred method. But what do you know about your missingness patterns? Also, some multilevel approaches for clustered data with repeated measures run into this type of issue and can still function with missing data.
    Stop cowardice, ban guns!

  5. The Following User Says Thank You to hlsmith For This Useful Post:

    mmercker (01-26-2016)

  6. #4
    Points: 2,462, Level: 30
    Level completed: 8%, Points required for next Level: 138

    Posts
    200
    Thanks
    20
    Thanked 48 Times in 43 Posts

    Re: Trend analysis / regression model with missing data

    Thank you both. I found an interesting paper on regression analysis based on imputed datasets. The authors present a slightly modified approach called "multiple imputation, then deletion" (MID):

    https://www.utexas.edu/lbj/sites/def...%20Y's.pdf

    As far as I understand, this approach aims to keep the imputed values from biasing the regression parameters.

  7. #5
    Omega Contributor
    Points: 38,303, Level: 100
    Level completed: 0%, Points required for next Level: 0
    hlsmith's Avatar
    Location
    Not Ames, IA
    Posts
    6,993
    Thanks
    397
    Thanked 1,185 Times in 1,146 Posts

    Re: Trend analysis / regression model with missing data

    I only skimmed the paper very briefly. I am not sure you need to use MID, and it may depend on the type of missingness that you have. Also, as the authors themselves note, MI converges to MID as the number of imputations approaches infinity. It has been 9 years since this paper, and current processing power lets us run many more imputations easily, which may remove the need to go the MID route.


    Do you know the mechanism behind your missingness? That is the most important thing to direct you. Also, what is your sample size and how much data is missing?

  8. The Following User Says Thank You to hlsmith For This Useful Post:

    mmercker (01-28-2016)

  9. #6
    Omega Contributor
    Points: 38,303, Level: 100
    Level completed: 0%, Points required for next Level: 0
    hlsmith's Avatar
    Location
    Not Ames, IA
    Posts
    6,993
    Thanks
    397
    Thanked 1,185 Times in 1,146 Posts

    Re: Trend analysis / regression model with missing data

    This isn't about count data, but I believe the key parts carry over. It is an article in JAMA about mixed models and missingness.


    http://ovidsp.tx.ovid.com/sp-3.18.0b...a478366865a856

  10. #7
    Points: 2,462, Level: 30
    Level completed: 8%, Points required for next Level: 138

    Posts
    200
    Thanks
    20
    Thanked 48 Times in 43 Posts

    Re: Trend analysis / regression model with missing data

    Thank you for the interesting information. The type of missingness is "missing at random" (MAR), i.e., the missingness can be related to fully available covariates. Furthermore, approx. 20% of the data are missing, and I have approx. 1,000 data points for each of 10 different sites (thus, 10,000 data points nested within sites).

    Unfortunately I can't open the link you provided ("ovid login failure"). Could you please send me the title and authors of the publication? Thanks!

    The key thing I want to understand is how imputed values are correctly handled within regression analysis, since 1.) they do not really add information to the data but only additional data points, so the result should be some kind of "pseudoreplication", and 2.) imputed values come with uncertainty, and the question is how to propagate this uncertainty to the SEs of the regression coefficients.
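
    On point 1, a small simulation can illustrate the idea. This is only a minimal sketch under stated assumptions: the counts follow a Poisson log-linear trend in a single time covariate `t`, the chance that a count is missing depends only on `t` (MAR given a fully observed covariate that is in the model), and the site structure is ignored. All names (`poisson_draw`, `fit_poisson_trend`, `simulate`) and all parameter values are hypothetical. Under these assumptions, a fit on the rows with observed counts alone should recover roughly the same trend as a fit on the full data.

```python
import math
import random


def poisson_draw(rng, lam):
    """Draw one Poisson variate with Knuth's method (adequate for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def fit_poisson_trend(t, y, iters=25):
    """Fit log E[y] = b0 + b1 * t by Newton-Raphson; return (b0, b1)."""
    b0 = math.log(max(sum(y) / len(y), 1e-8))  # start at the log of the mean count
    b1 = 0.0
    for _ in range(iters):
        mu = [math.exp(b0 + b1 * ti) for ti in t]
        # Score vector of the Poisson log-likelihood
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(ti * (yi - mi) for ti, yi, mi in zip(t, y, mu))
        # 2x2 Fisher information matrix
        h00 = sum(mu)
        h01 = sum(ti * mi for ti, mi in zip(t, mu))
        h11 = sum(ti * ti * mi for ti, mi in zip(t, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: beta += H^{-1} g
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def simulate(seed=1, n=5000, b0=1.0, b1=0.5):
    """Return (slope on full data, slope on complete cases, fraction missing)."""
    rng = random.Random(seed)
    t = [rng.random() for _ in range(n)]
    y = [poisson_draw(rng, math.exp(b0 + b1 * ti)) for ti in t]
    # MAR: the chance of a missing count rises with t (0.1 to 0.3, about 20% overall)
    observed = [rng.random() >= 0.1 + 0.2 * ti for ti in t]
    t_cc = [ti for ti, keep in zip(t, observed) if keep]
    y_cc = [yi for yi, keep in zip(y, observed) if keep]
    slope_full = fit_poisson_trend(t, y)[1]
    slope_cc = fit_poisson_trend(t_cc, y_cc)[1]
    return slope_full, slope_cc, 1 - len(y_cc) / n


if __name__ == "__main__":
    print(simulate())
```

    If the missingness also depended on something outside the model, or on the unobserved count itself, this complete-case fit would not be expected to behave as well.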

  11. #8
    Omega Contributor
    Points: 38,303, Level: 100
    Level completed: 0%, Points required for next Level: 0
    hlsmith's Avatar
    Location
    Not Ames, IA
    Posts
    6,993
    Thanks
    397
    Thanked 1,185 Times in 1,146 Posts

    Re: Trend analysis / regression model with missing data

    Michelle A. Detry, PhD; Yan Ma, PhD. "Analyzing Repeated Measurements Using Mixed Models." JAMA Guide to Statistics and Methods (Review). JAMA, January 2016, Volume 315, Issue 4, pp. 407–408.


    Well, if data are systematically missing (non-ignorable), then you need them back to have a chance at an unbiased estimate. Next, multiple imputation does, as I believe you mentioned, provide variability that gets absorbed into the SE. This accounts for our uncertainty about these imputed values.
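
    The usual way this variability enters the SE is Rubin's rules: fit the model on each of the m imputed datasets, average the coefficient estimates, and add the between-imputation variance (inflated by 1 + 1/m) to the average within-imputation variance. A minimal sketch for a single coefficient follows; the function name `pool_rubin` and the example numbers are made up for illustration and are not from any real fit.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool one coefficient across m imputed datasets using Rubin's rules.

    estimates  -- the coefficient estimate from each imputed dataset
    std_errors -- the matching standard errors
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need at least two imputations and matching lengths")
    pooled = mean(estimates)
    within = mean([se ** 2 for se in std_errors])  # average sampling variance
    between = variance(estimates)                  # sample variance across imputations
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)


if __name__ == "__main__":
    # Hypothetical trend coefficients from five imputed datasets
    est, se = pool_rubin([0.10, 0.12, 0.08, 0.11, 0.09], [0.02] * 5)
    print(est, se)
```

    In this made-up example the pooled SE comes out larger than the 0.02 reported by each single fit, which is the imputation uncertainty showing up in the SE.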

  12. #9
    Fortran must die
    Points: 58,790, Level: 100
    Level completed: 0%, Points required for next Level: 0
    noetsi's Avatar
    Posts
    6,532
    Thanks
    692
    Thanked 915 Times in 874 Posts

    Re: Trend analysis / regression model with missing data


    If your data are missing not at random (MNAR), you're basically out of luck. Multiple imputation and all similar approaches only work if data are missing at random (MAR). About the best thing you can do with MNAR data is a form of sensitivity analysis. Allison wrote an article on this; I will try to find it...
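
    One common form of such a sensitivity analysis is delta adjustment: take values imputed under MAR, scale them by a range of factors that represent possible departures from MAR, and check how much the result moves. The sketch below is only an illustration of that general idea and is not necessarily the method in Allison's article. The function name, the use of the overall mean count as the summary, and the numbers are all hypothetical.

```python
def delta_adjusted_means(observed, imputed_mar, deltas):
    """Return {delta: overall mean count} after scaling MAR-imputed counts by delta.

    observed    -- counts that were actually recorded
    imputed_mar -- values imputed for the missing counts under a MAR model
    deltas      -- multipliers; 1.0 reproduces the MAR result
    """
    n = len(observed) + len(imputed_mar)
    results = {}
    for delta in deltas:
        shifted_total = sum(delta * value for value in imputed_mar)
        results[delta] = (sum(observed) + shifted_total) / n
    return results


if __name__ == "__main__":
    # Hypothetical data: four observed counts, two imputed counts
    print(delta_adjusted_means([3, 5, 4, 6], [4.0, 5.0], (0.5, 1.0, 1.5)))
```

    If the conclusion only changes for implausibly extreme values of delta, that is some reassurance. In a real analysis the summary would be the refitted trend coefficient rather than a simple mean.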
