Thread: Comparing fit of GLM to OLS regression

  1. #1

    Hello talkstat.

    First post in this magnificent forum!

    In my master's thesis I'm estimating health care expenses for patients who have experienced an occupational injury, using administrative data.

    The dependent variable is the total health care expenses for a given individual in the year after the injury.

    Because the dependent variable is typically right-skewed, I have estimated two models. The first is a GLM with a log link and a gamma distribution for the dependent variable. The mean expenses are estimated at around DKK 16,000 (approximately $2,200) for the treatment group (people who have experienced an occupational injury) and around DKK 5,000 for the control group.
    As an alternative to the GLM, I have also estimated the expenses using an OLS regression with a log-transformed dependent variable. After applying Duan's smearing factor and retransforming the estimates, I obtain predicted expenses of DKK 15,000 and DKK 4,000 for the treated and control groups, respectively.
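
For readers unfamiliar with the retransformation step, here is a minimal sketch of Duan's smearing estimator. It assumes simulated log-normal data and a single treatment dummy, so the log-scale OLS fit reduces to group means; the data, the variable names, and the pooled smearing factor (which presumes homoskedastic errors) are illustrative assumptions, not the poster's actual model.

```python
import math
import random

random.seed(0)

# Simulated, purely illustrative data: log-normal expenses for a control (0)
# and a treated (1) group. None of these numbers come from the thread.
groups = [0] * 500 + [1] * 500
expenses = [math.exp(random.gauss(8.0 + 1.2 * g, 1.0)) for g in groups]

# OLS of log(expenses) on a single treatment dummy reduces to the group
# means of the logged outcome.
log_y = [math.log(y) for y in expenses]
mean_log = {
    g: sum(v for v, gi in zip(log_y, groups) if gi == g) / groups.count(g)
    for g in (0, 1)
}

# Duan's smearing factor: the average of the exponentiated log-scale residuals.
residuals = [v - mean_log[g] for v, g in zip(log_y, groups)]
smearing = sum(math.exp(r) for r in residuals) / len(residuals)

# Naive back-transform versus smearing-corrected prediction on the raw scale.
naive = {g: math.exp(mean_log[g]) for g in (0, 1)}
predicted = {g: naive[g] * smearing for g in (0, 1)}

for g in (0, 1):
    print(g, round(naive[g]), round(predicted[g]))
```

The smearing factor is always at least 1, so the corrected prediction sits above the naive `exp()` of the fitted log value. If the residual variance differs between groups, a separate factor per group may be more appropriate.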

    My question is: how can I compare the fit of the two models and choose the "best" one, or at least the better one? I'm interested in both graphical methods and formal tests.

    Steinberg

  2. #2
    kiton

    There are several ways to explore the model fit. Easiest would be to compare the AIC and BIC -- smaller values indicate a "better" fit. Next, you can examine the R^2 -- higher values are desirable. Finally, you can predict the residuals and run a Q-Q plot to compare their distribution (you can top it with some formal test, say, Jarque-Bera and see which model's residuals have a smaller chi-square statistic). Additionally, I'd consider comparing the standard errors to see which model provides more efficient ones.
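
One caveat when applying the AIC/BIC suggestion to this particular pair of models: the log-OLS likelihood is a density for log(y), while the gamma GLM likelihood is a density for y, so the two are not on the same scale as reported. A standard change-of-variables adjustment subtracts the sum of log(y) from the log-scale log-likelihood. The sketch below assumes simulated data and an intercept-only normal model for log(y); all names are hypothetical.

```python
import math
import random

random.seed(1)

# Illustrative right-skewed outcome; not the poster's data.
y = [random.gammavariate(2.0, 4000.0) for _ in range(1000)]
log_y = [math.log(v) for v in y]
n = len(y)

# Intercept-only normal model for log(y), fitted by maximum likelihood.
mu = sum(log_y) / n
sigma2 = sum((v - mu) ** 2 for v in log_y) / n
loglik_log_scale = -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)

# Change of variables: density of y equals density of log(y) times 1/y,
# so the log-likelihood on the y scale loses sum(log(y)).
loglik_y_scale = loglik_log_scale - sum(log_y)

k = 2  # estimated parameters: mean and variance
aic_log_scale = 2 * k - 2 * loglik_log_scale
aic_y_scale = 2 * k - 2 * loglik_y_scale

print(round(aic_log_scale, 1), round(aic_y_scale, 1))
```

Only `aic_y_scale` would be comparable with the AIC of a model fitted directly to y, such as the gamma GLM.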

    On a side note, have you checked whether your models satisfy the required assumptions? I am asking because if, say, the OLS assumptions are not met (at minimum: normality of residuals, no heteroskedasticity, and no multicollinearity), then what is the purpose of comparing its estimates with those of other estimators?

  3. #3

    Thanks, kiton!

    I've checked to make sure the assumptions of the OLS regression hold. They do.

    I'll give your advice a go!

  4. #4

    Is the Jarque-Bera test valid when I have specified a gamma distribution for the dependent variable?

  5. #5
    Dason

    No. The Jarque-Bera test is a test of normality.
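
To illustrate the point, here is a minimal sketch of the Jarque-Bera statistic, JB = n/6 * (S^2 + (K - 3)^2 / 4), where S is the sample skewness and K the sample kurtosis. The simulated samples and the helper name `jarque_bera` are assumptions for illustration only.

```python
import math
import random

def jarque_bera(x):
    """Jarque-Bera statistic from sample skewness and kurtosis."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    return n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)

random.seed(2)
normal_sample = [random.gauss(0.0, 1.0) for _ in range(2000)]
gamma_sample = [random.gammavariate(2.0, 1.0) for _ in range(2000)]

print(jarque_bera(normal_sample))
print(jarque_bera(gamma_sample))
```

The statistic only measures departure from normal skewness and kurtosis, so correctly gamma-distributed data produce a large value even though the gamma model is exactly right for them.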

  6. #6

    All right.

    So I'm left with AIC?

    Yet I can't just pick a model by looking only at the AIC, right? I mean, when I compute the "relative likelihood" (the exponential of half the difference in AIC between the two models), I get a really (I mean REALLY) low value.
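
For reference, the relative likelihood of model i is exp((AIC_min - AIC_i) / 2). A minimal sketch with made-up AIC values (the function name and the numbers are hypothetical):

```python
import math

def relative_likelihood(aic_values):
    """Relative likelihood of each model versus the lowest-AIC model."""
    best = min(aic_values)
    return [math.exp((best - a) / 2.0) for a in aic_values]

# Made-up AIC values: the best model, one 4 units worse, one 100 units worse.
rl = relative_likelihood([1000.0, 1004.0, 1100.0])
print(rl)
```

A very small value is what the formula gives whenever the AIC gap is large: a gap of 100 already puts the relative likelihood below 1e-20. The number is only meaningful if both AICs are computed for the same response scale.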

    What about the assumption of the variance being equal to the squared mean? When I square the mean of the predicted values, I don't even get close to the variance. Is that an argument against the GLM?
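
A note on this check: the gamma family assumes the variance is proportional to the squared mean, Var(y) = phi * mu^2, with a dispersion parameter phi that need not be 1, so the variance will generally not equal the squared mean. What can be checked is whether the ratio variance / mean^2 is roughly constant across groups with different means. The sketch below assumes simulated gamma data with a shape of 2 (so phi = 0.5) and made-up group means.

```python
import random

random.seed(3)

shape = 2.0  # implies dispersion phi = 1 / shape = 0.5
group_means = [2000.0, 5000.0, 16000.0]  # made-up mean expenses

ratios = []
for mu in group_means:
    # random.gammavariate(alpha, beta) has mean alpha * beta.
    sample = [random.gammavariate(shape, mu / shape) for _ in range(5000)]
    m = sum(sample) / len(sample)
    var = sum((v - m) ** 2 for v in sample) / (len(sample) - 1)
    ratios.append(var / m ** 2)
    print(round(m), round(var), round(var / m ** 2, 3))
```

If the ratio drifts systematically with the mean in real data, that would point toward a different variance function rather than simply ruling out a GLM.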

  7. #7
    kiton

    I may be missing something here with the distributions' nuances, but let me elaborate on the following. So, I estimated two simple models using (A) Gaussian family and identity link (default), and (B) Gamma family and log-link -- while the estimated coefficients differ substantially, the residual distribution seems to be identical. The JB test results capture only a minor difference in the chi-squared statistic between the two. As such, is it not plausible to approximate the model fit of a GLM with gamma-log with its residual distribution?

  8. #8

    Originally posted by kiton:
    I may be missing something here with the distributions' nuances, but let me elaborate on the following. So, I estimated two simple models using (A) Gaussian family and identity link (default), and (B) Gamma family and log-link -- while the estimated coefficients differ substantially, the residual distribution seems to be identical. The JB test results capture only a minor difference in the chi-squared statistic between the two. As such, is it not plausible to approximate the model fit of a GLM with gamma-log with its residual distribution?
    But I've not used a Gaussian regression; I've used an ordinary least squares regression. From what I understand, the reason there is no R-squared statistic for a GLM is that it may be non-linear.

    What about the assumption that the variance is equal to the predicted value squared? What if it does not hold?

  9. #9

    Shouldn't something be said about the assumption that the variance is equal to the predicted value squared? What happens if it doesn't hold?
