
Thread: How to run and interpret a quadratic variable in ordered logit

  1. #1

    Hello everyone,

    I've run an ordered logit model (in Stata 12) with a quadratic age variable (8 age groups) on a dependent variable with 5 categories (self-reported health, from poor to excellent). I then ran the margins command following the instructions Bukharin gave me a while back:
    http://www.talkstats.com/showthread....e-Binomial-Reg.

    What I'm getting for outcome 1 (poor health) is this:

                 |            Delta-method
                 |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             _at |
              1  |   .0823576   .0106364     7.74   0.000     .0615106    .1032046
              2  |   .0705586   .0074272     9.50   0.000     .0560015    .0851157
              3  |   .0602709   .0049174    12.26   0.000     .0506329    .0699089
              4  |   .0513465   .0031103    16.51   0.000     .0452505    .0574425
              5  |   .0436402   .0021147    20.64   0.000     .0394955    .0477849
              6  |   .0370132   .0019898    18.60   0.000     .0331131    .0409132
              7  |   .0313351   .0023151    13.54   0.000     .0267976    .0358726
              8  |   .0264858   .0026728     9.91   0.000     .0212472    .0317244

    I'm not sure how to interpret this: the youngest group (45-49 years of age) has the highest probability of poor health, whereas the oldest (80+) has the lowest. This seems backwards to me, but it's probably because I'm not clear on how to read the output. Perhaps they are cumulative probabilities; if that's the case, I'm also not sure how to interpret them.

    Not sure if this could be an issue, but the age variable also includes group 0 (younger than 45). That group was not included in the regression or in the marginal analysis (I asked for categories 1/8, not 0).

    Anywho,
    Thanks for your help
    Sean
    Last edited by seandb; 10-20-2012 at 12:23 PM. Reason: More details

  2. #2
    bukharin

    I would start by cross-tabulating age group vs self rated health, and then running a simple model with age as the only predictor. Do the results agree (more or less)?
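
    The check suggested above might look roughly like the following in Stata. This is only a sketch: the variable names XAge and XSelfReportedHealth and the `if XAge>0` restriction are borrowed from later posts in this thread, and any survey weights or bootstrap settings are left out.

```stata
* Sketch only; variable names are taken from later posts in this thread.
* Row percentages show how the health distribution shifts across age groups.
tabulate XAge XSelfReportedHealth if XAge>0, row

* Simple ordered logit with age as the only (linear) predictor.
ologit XSelfReportedHealth XAge if XAge>0
```

    If the share in "poor" rises with age in the cross-tab, the age coefficient from the simple model would be expected to be negative (an odds ratio below 1).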

  3. #3

    Quote, originally posted by bukharin:
    I would start by cross-tabulating age group vs self rated health, and then running a simple model with age as the only predictor. Do the results agree (more or less)?

    Here are the cross-tab results. I would have thought that the margins results would go the other way. The ones I posted earlier were for poor health status. I think my problem is that I'm not sure what the margins are telling me, so I don't know if I'm interpreting them correctly. Would you be able to do me a huge favour and walk me through an example interpretation of the margins output?

      Age group of the |                 Self-Reported Health                  |
           respondent. |  ...poor?   ...fair?   ...good?  ...very g  ...excell |     Total
    -------------------+-------------------------------------------------------+----------
              45 to 49 |        69        245        653        590        381 |     1,938
              50 to 54 |        84        294        643        601        359 |     1,981
              55 to 59 |       114        299        628        578        356 |     1,975
              60 to 64 |        91        274        529        526        347 |     1,767
              65 to 69 |        69        234        462        393        227 |     1,385
              70 to 74 |        50        194        380        281        155 |     1,060
              75 to 80 |        65        206        313        206         91 |       881
    80 years and older |        82        261        374        238         90 |     1,045
    -------------------+-------------------------------------------------------+----------
                 Total |       624      2,007      3,982      3,413      2,006 |    12,032

    As another example, here are the margins for the highest category, excellent health (the probabilities begin going the other way for very good and excellent health).

    ------------------------------------------------------------------------------
                 |            Delta-method
                 |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             _at |
              1  |    .135346   .0077956    17.36   0.000      .120067     .150625
              2  |   .1561369   .0052982    29.47   0.000     .1457527    .1665211
              3  |   .1792657   .0043016    41.67   0.000     .1708349    .1876966
              4  |    .204763   .0073369    27.91   0.000     .1903831     .219143
              5  |   .2326016   .0126392    18.40   0.000     .2078291     .257374
              6  |   .2626901   .0190511    13.79   0.000     .2253506    .3000297
              7  |   .2948708   .0262087    11.25   0.000     .2435026     .346239
              8  |   .3289206   .0338698     9.71   0.000     .2625369    .3953042
    ------------------------------------------------------------------------------

    Here's a regression with just age (linear) on self reported health status:

    Survey: Ordered logistic regression             Number of obs      =     12032
                                                    Population size    =  12032.46
                                                    Replications       =       500
                                                    Wald chi2(1)       =    166.67
                                                    Prob > chi2        =    0.0000

    -------------------------------------------------------------------------------------
                        |   Observed   Bstrap *
    XSelfReportedHealth | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
    --------------------+----------------------------------------------------------------
                   XAge |   .8870974   .0082319   -12.91   0.000     .8711089    .9033793
    --------------------+----------------------------------------------------------------
                  /cut1 |  -3.466362   .0655787   -52.86   0.000    -3.594894    -3.33783
                  /cut2 |  -1.824878    .045673   -39.96   0.000    -1.914395   -1.735361
                  /cut3 |  -.2673545   .0422428    -6.33   0.000    -.3501488   -.1845601
                  /cut4 |   1.107109    .044474    24.89   0.000     1.019942    1.194277
    -------------------------------------------------------------------------------------

    Thanks,
    Sean

  4. #4
    bukharin

    It looks to me like it's working really nicely with these data. Here is what I get from running a simple linear model with your data:
    Code: 
    clear
    set more off
    
    input age count1 count2 count3 count4 count5
    45 69 245 653 590 381 
    50 84 294 643 601 359 
    55 114 299 628 578 356 
    60 91 274 529 526 347 
    65 69 234 462 393 227 
    70 50 194 380 281 155 
    75 65 206 313 206 91  
    80 82 261 374 238 90
    end
    
    * empirically observed proportions
    egen total=rowtotal(count*)
    foreach cat of numlist 1/5 {
    	gen obs`cat'=count`cat' / total
    }
    tempfile observed
    save `observed'
    
    * now reshape for analysis
    reshape long count, i(age) j(health)
    
    lab define health 1 "poor" 2 "fair" 3 "good" 4 "very good" 5 "excellent"
    lab val health health
    tab age health [fw=count], row
    
    * ordinal logit model
    ologit health age [fw=count]
    estimates store mymodel
    
    * obtain predicted probabilities of each level of health by age
    * (-parmest- is a user-written command: ssc install parmest)
    tempfile predicted
    
    foreach health of numlist 1/5 {
    	estimates restore mymodel
    	margins, predict(outcome(`health')) at(age=(45(5)80)) post
    	preserve
    	parmest, norestore
    	gen health=`health'
    	gen age=5 * _n + 40
    	capture append using `predicted'
    	save `predicted', replace
    	restore
    }
    
    * now plot predicted and observed probabilities against age
    use `predicted', clear
    keep health age estimate
    reshape wide estimate, i(age) j(health)
    
    * merge in observed probabilities
    merge 1:1 age using `observed'
    
    twoway scatter obs* age, mstyle(p1 p2 p3 p4 p5) || ///
    	line estimate* age, sort lstyle(p1 p2 p3 p4 p5) ///
    	title(Observed vs predicted probability of health status) ///
    	legend(title(Health status) ///
    		order(1 "poor" 2 "fair" 3 "good" 4 "very good" 5 "excellent")) ///
    	xtitle(Age) ytitle(Probability)

  5. #5

    Thanks Bukharin. I'm still not sure how to interpret the margins, though. In the first margins example (poor health status, outcome 1), the 45-49 age group (1) has a probability of .0823576 while the 80+ age group (8) has a probability of .0264858. How would I interpret this? Is it that the 45-49 age group has a higher probability of being in the next group (fair health)?

    If that's correct, then for the highest outcome (5, excellent health status), the probability for the 45-49 age group (1) is .135346 while the probability for the 80+ age group (8) is .3289206. I'm not sure what that means, though.

    Take care,
    Sean

  6. #6
    bukharin

    Your interpretation of the -margins- output is correct but I worry that you have some problem with your model. You can see that the coefficient for XAge in your model is positive - so as people get older they tend to move up health categories. When I ran the simple linear model I got a negative coefficient for age which is more what you'd expect:
    Code: 
    . ologit health age [fw=count], nolog
    
    Ordered logistic regression                       Number of obs   =      12032
                                                      LR chi2(1)      =     199.98
                                                      Prob > chi2     =     0.0000
    Log likelihood = -17638.075                       Pseudo R2       =     0.0056
    
    ------------------------------------------------------------------------------
          health |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             age |  -.0211823   .0015016   -14.11   0.000    -.0241254   -.0182391
    -------------+----------------------------------------------------------------
           /cut1 |   -4.18985   .1005756                     -4.386974   -3.992725
           /cut2 |  -2.546833   .0934131                     -2.729919   -2.363747
           /cut3 |  -1.055638   .0907687                     -1.233542   -.8777349
           /cut4 |    .368005   .0909177                      .1898095    .5462004
    ------------------------------------------------------------------------------
    I'm a little puzzled by your "Population size = 12032.46" - is the model a little more complex than you've described?

  7. #7

    Hey Bukharin,

    I think it looks positive because it's an odds ratio and not the coefficient (an odds ratio below 1 corresponds to a negative coefficient). The population size is off because of the bootstrapped estimates. Here's what I get without the bootstrap or odds ratios:

    ologit XSelfReportedHealth XAge if XAge>0

    Iteration 0: log likelihood = -17738.065
    Iteration 1: log likelihood = -17638.188
    Iteration 2: log likelihood = -17638.075
    Iteration 3: log likelihood = -17638.075

    Ordered logistic regression                       Number of obs   =      12032
                                                      LR chi2(1)      =     199.98
                                                      Prob > chi2     =     0.0000
    Log likelihood = -17638.075                       Pseudo R2       =     0.0056

    -------------------------------------------------------------------------------------
    XSelfReportedHealth |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    --------------------+----------------------------------------------------------------
                   XAge |  -.1059113   .0075081   -14.11   0.000     -.120627   -.0911956
    --------------------+----------------------------------------------------------------
                  /cut1 |   -3.34256   .0519516                     -3.444383   -3.240736
                  /cut2 |  -1.699542   .0378382                     -1.773704   -1.625381
                  /cut3 |  -.2083479   .0341976                      -.275374   -.1413219
                  /cut4 |   1.215295   .0368326                      1.143105    1.287486
    -------------------------------------------------------------------------------------
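
    As a rough hand check on the direction of the effect, the probability of poor health can be worked out from this output. The sketch below assumes Stata's usual ologit parameterization, in which the cutpoints (/cut1 to /cut4, written here as \kappa_j) enter as \kappa_j - \beta x; the numbers are rounded.

```latex
% Assumes Stata's ologit parameterization; \kappa_j are /cut1../cut4, \beta is the XAge coefficient.
\begin{align*}
\Pr(y \le j \mid x) &= \frac{1}{1 + \exp\{-(\kappa_j - \beta x)\}}, \qquad j = 1, \dots, 4 \\
\Pr(y = 1 \mid x) &= \Pr(y \le 1 \mid x), \qquad
\Pr(y = 5 \mid x) = 1 - \Pr(y \le 4 \mid x) \\[4pt]
\text{XAge} = 1: \quad \Pr(\text{poor}) &= \frac{1}{1 + \exp\{-(-3.34256 + 1 \times 0.1059113)\}} \approx 0.038 \\
\text{XAge} = 8: \quad \Pr(\text{poor}) &= \frac{1}{1 + \exp\{-(-3.34256 + 8 \times 0.1059113)\}} \approx 0.076
\end{align*}
```

    Under these assumptions the probability of poor health rises with age, in line with the cross-tab (69/1,938 is about 0.036 for ages 45 to 49 and 82/1,045 is about 0.078 for 80 and older). Each value is the probability of that single category, not a cumulative probability.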

  8. #8
    bukharin

    Sorry, you're right - I didn't see that you'd requested odds ratios.

    In any case that simple model should definitely show people shifting to lower health categories as they get older - what do you get when running -margins- directly after the above model? Please post both your -margins- command and its output.
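
    For reference, the request above might translate into something like the following. This is a sketch, not a confirmed version of the commands actually run: it assumes the health variable is coded 1 (poor) to 5 (excellent) and that XAge takes the values 1 to 8 for the age groups in the model, as described earlier in the thread.

```stata
* Sketch only: refit the simple model from post #7, then request the
* predicted probability of a single outcome at each age group.
ologit XSelfReportedHealth XAge if XAge>0

* outcome(1) = poor health (assumes the variable is coded 1..5)
margins, predict(outcome(1)) at(XAge=(1(1)8))

* outcome(5) = excellent health
margins, predict(outcome(5)) at(XAge=(1(1)8))
```

    With a negative age coefficient, the margins for outcome 1 would be expected to rise across age groups and those for outcome 5 to fall.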
