
Thread: Poisson regression for count data

  1. #1
    askazy


    I need to fit a model for the severity of bicycle accidents as a function of other explanatory variables. The variables are:

    X1: helmet use, a categorical variable (1 = Yes, 0 = No)

    X2: the speed at which the cyclist was riding, a categorical variable (1 = 10−15, 2 = 15−20, 3 = 20−25) in miles per hour

    X3: the severity of the accident, recorded as counts at two levels (not severe, severe)

    The data are as follows:



    I was thinking of using a Poisson regression model, but in this case we have two counts (not severe and severe). Can anyone suggest an approach?
    Last edited by askazy; 07-10-2016 at 04:34 PM.

  2. #2
    kiton



    Hello there!

    Indeed, accidents are a form of count data. Two estimators are typically used in such cases: (1) Poisson and (2) negative binomial (NB). NB is commonly chosen on the assumption that it handles over-dispersion in the outcome variable. However, over-dispersion is more complicated than that, and NB may not address it at all. Poisson, on the other hand, has the limitation that we cannot always assume the data are Poisson distributed, but this can usually be handled by using robust (or, when necessary, clustered) standard errors.
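One rough way to look at over-dispersion after fitting a Poisson model is the Pearson dispersion statistic (Pearson chi-square divided by the residual degrees of freedom). The sketch below is only an illustration: the function name `pearson_dispersion` and all the numbers are hypothetical, and the fitted values would normally come from whatever software fits the model.

```python
def pearson_dispersion(observed, fitted, n_params):
    """Pearson chi-square divided by the residual degrees of freedom.

    Values close to 1 are what a Poisson model expects; values well above 1
    hint at over-dispersion. This is a rough diagnostic, not a formal test.
    """
    if len(observed) != len(fitted):
        raise ValueError("observed and fitted must have the same length")
    dof = len(observed) - n_params
    if dof <= 0:
        raise ValueError("need more observations than parameters")
    chi2 = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    return chi2 / dof


# Hypothetical counts and fitted means for six helmet-by-speed cells
# (made up for illustration; not the data from this thread).
observed = [12, 7, 3, 15, 9, 4]
fitted = [10.0, 8.0, 4.0, 14.0, 10.0, 4.0]

# n_params=4 assumes an intercept, one helmet dummy and two speed dummies.
ratio = pearson_dispersion(observed, fitted, n_params=4)
print(f"Pearson dispersion: {ratio:.3f}")
```

A ratio far from 1 is a reason to compare the Poisson results (with robust standard errors) against NB, not proof that either model is right.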

    Having said that, I'd recommend estimating your model with both estimators and examining the robustness of the results (along with the appropriate model fit statistics). Now, in terms of the model itself, one way would be to run two separate models, one for severe and one for non-severe cases. Another way could be the following. First, create a new binary variable, where 1 = severe case and 0 = non-severe. Second, create a single accident-count outcome by combining (stacking) the two existing counts. Then estimate a single model using both estimators. I am a Stata guy, so I cannot help further with R, sorry.
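The restructuring described above can be sketched as follows. This assumes the data start with one row per helmet/speed cell and two count columns, and it reads "combining" as stacking the two counts into one outcome column. The column names and all counts are hypothetical, not the data from the first post.

```python
# Hypothetical wide-format data: one row per helmet/speed cell, with
# separate counts of non-severe and severe accidents (made-up numbers).
wide = [
    {"helmet": 1, "speed": 1, "not_severe": 20, "severe": 3},
    {"helmet": 1, "speed": 2, "not_severe": 15, "severe": 5},
    {"helmet": 0, "speed": 1, "not_severe": 18, "severe": 7},
    {"helmet": 0, "speed": 2, "not_severe": 12, "severe": 11},
]


def to_long(rows):
    """Stack the two count columns into a single 'count' outcome and add a
    binary 'severe' indicator (1 = severe, 0 = non-severe)."""
    long_rows = []
    for row in rows:
        for severe_flag, column in ((0, "not_severe"), (1, "severe")):
            long_rows.append(
                {
                    "helmet": row["helmet"],
                    "speed": row["speed"],
                    "severe": severe_flag,
                    "count": row[column],
                }
            )
    return long_rows


long_data = to_long(wide)
for record in long_data:
    print(record)
```

Each original row becomes two rows, so a single count model can then use `count` as the outcome with `helmet`, `speed` and `severe` (and, if wanted, their interactions) as predictors.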

    Don't forget that with these estimators the raw regression coefficient is on the log scale, so it is not directly informative about the size of the effect on the outcome. I therefore advise using the option that displays incidence-rate ratios (IRRs), i.e. the exponentiated coefficients. An IRR is the factor by which the expected rate is multiplied for a one-unit change in the predictor, and (IRR − 1) × 100 gives the percent change in the rate.
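As a small illustration of how a log-scale coefficient maps to an IRR and a percent change, here is a sketch in plain Python. The helper names and the coefficient value are hypothetical and do not come from any fitted model in this thread.

```python
import math


def incidence_rate_ratio(coef):
    """Exponentiate a log-scale Poisson/NB coefficient to get the IRR."""
    return math.exp(coef)


def percent_change(coef):
    """Percent change in the expected rate for a one-unit predictor change."""
    return (math.exp(coef) - 1.0) * 100.0


# Hypothetical coefficient for helmet use (illustrative value only).
beta_helmet = -0.35

irr = incidence_rate_ratio(beta_helmet)
pct = percent_change(beta_helmet)
print(f"IRR = {irr:.3f}, percent change in rate = {pct:.1f}%")
```

An IRR below 1 means a lower expected rate, an IRR above 1 a higher one, and an IRR of exactly 1 means no change.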
    Last edited by kiton; 07-10-2016 at 10:27 AM.
