# Thread: low frequency independent variable in logistic regression

1. ## low frequency independent variable in logistic regression

I'm running a logistic regression and I think there's a variable that I'm not treating properly.

My outcome is whether there was a traffic accident on the highway that day (1) or not (0). I have a binary rain variable: precipitation (1) or no precipitation (0).

Only 30 days have a value of 1 for that variable; however, on 28 of those 30 days there is an accident. My coefficient for that variable is very low, though. I think this is because about 90% of the time I have a 0 for weather, and there are still many accidents. The regression is showing a weak correlation because I have roughly 9 times as many no-weather days, on which there may or may not be an accident.

I'm guessing that I have to treat this variable differently since it's so low frequency.
Should I model accidents (1/0) with that single variable, using only the days when it rained, and then take the resulting coefficient and plug it manually into my regression with multiple variables?

2. ## Re: low frequency independent variable in logistic regression

How many observations do you have?

You should look into using the Firth Correction or exact logistic regression, perhaps.
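For readers who have not met the Firth correction: it penalizes the logistic likelihood so that estimates stay finite and less biased when a predictor is rare or nearly separates the outcome. Below is a minimal sketch in plain Python, restricted to an intercept plus one predictor. The function name `firth_logit_one_predictor` is hypothetical, and the data are rebuilt from counts quoted in this thread (28 accidents on 30 rain days, 112 accidents on the other days, N=355), assuming those figures all describe the same data set. It is an illustration of the idea, not a replacement for a tested statistical package.

```python
import math

def firth_logit_one_predictor(x, y, max_iter=100, tol=1e-10):
    """Firth-penalized logistic fit for the design [1, x] (hypothetical helper)."""
    b0, b1 = 0.0, 0.0
    for _ in range(max_iter):
        p = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
        w = [pi * (1.0 - pi) for pi in p]
        # Fisher information X'WX for the two-column design [1, x]
        i00 = sum(w)
        i01 = sum(wi * xi for wi, xi in zip(w, x))
        i11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = i00 * i11 - i01 * i01
        v00, v01, v11 = i11 / det, -i01 / det, i00 / det  # inverse of X'WX
        # Hat-matrix diagonal: h_i = w_i * [1, x_i] V [1, x_i]'
        h = [wi * (v00 + 2.0 * v01 * xi + v11 * xi * xi) for wi, xi in zip(w, x)]
        # Firth-modified score residuals: y - p + h * (0.5 - p)
        r = [yi - pi + hi * (0.5 - pi) for yi, pi, hi in zip(y, p, h)]
        u0 = sum(r)
        u1 = sum(ri * xi for ri, xi in zip(r, x))
        # Scoring step: (X'WX)^{-1} times the modified score
        s0 = v00 * u0 + v01 * u1
        s1 = v01 * u0 + v11 * u1
        b0, b1 = b0 + s0, b1 + s1
        if max(abs(s0), abs(s1)) < tol:
            break
    return b0, b1

# Per-day rows rebuilt from the counts quoted in the thread (an assumption).
rain = [1] * 30 + [0] * 325
accident = [1] * 28 + [0] * 2 + [1] * 112 + [0] * 213

b0, b1 = firth_logit_one_predictor(rain, accident)
print("Firth log-odds coefficient for rain:", b1)
print("Firth odds ratio for rain:", math.exp(b1))
```

With a single binary predictor the model is saturated, and in that case the Firth estimate is the same as adding 0.5 to each cell of the 2x2 table before computing the odds ratio, which gives a quick way to check the result. With several covariates no such shortcut applies, and a packaged implementation is the safer choice.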

P.S., sweet account name!

3. ## Re: low frequency independent variable in logistic regression

> Originally Posted by hlsmith
> How many observations do you have?
>
> You should look into using the Firth Correction or exact logistic regression, perhaps.
>
> P.S., sweet account name!

N=355

I'm not familiar with those, but will Google. Thank you ^2!!

4. ## Re: low frequency independent variable in logistic regression

What program are you using? Some will spit out a warning if your data are too sparse for the model to converge, something like "quasi-complete separation".

5. ## Re: low frequency independent variable in logistic regression

> Originally Posted by hlsmith
> What program are you using? Some will spit out a warning if your data are too sparse for the model to converge, something like "quasi-complete separation".

Hi, I am using Stata, but will likely transition to SAS EG. I haven't noticed any warning.

Having 112 accidents on days with a 0 for weather is negating the fact that on 28 of the 30 days when weather is 1, accident is also 1. Maybe it would help if this weren't a binary variable and were instead the amount of rain/snow?
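As a sanity check on the numbers above, the unadjusted association can be computed directly from the 2x2 table, since the log odds ratio equals the coefficient a logistic regression would give with the binary weather variable as its only predictor. This sketch assumes that N=355, the 30 weather days, and the 112 accidents on non-weather days all refer to the same data set, which implies 213 non-weather days without an accident. The variable names are illustrative.

```python
import math

# Assumed 2x2 table built from the counts in this thread.
rain_acc, rain_none = 28, 2          # weather = 1: accident / no accident
dry_acc = 112                        # weather = 0: accident
dry_none = 355 - 30 - dry_acc        # weather = 0: no accident (derived)

# No cell is zero, so the table is not completely or quasi-completely separated.
no_zero_cell = min(rain_acc, rain_none, dry_acc, dry_none) > 0

# Crude odds ratio and its log (the single-predictor logit coefficient).
odds_ratio = (rain_acc * dry_none) / (rain_none * dry_acc)
log_or = math.log(odds_ratio)

# Wald standard error and approximate 95% interval on the odds-ratio scale.
se = math.sqrt(1 / rain_acc + 1 / rain_none + 1 / dry_acc + 1 / dry_none)
ci_low = math.exp(log_or - 1.96 * se)
ci_high = math.exp(log_or + 1.96 * se)

print("no zero cell:", no_zero_cell)
print("odds ratio:", odds_ratio, "log odds ratio:", log_or)
print("approx. 95% CI for odds ratio:", ci_low, ci_high)
```

If the coefficient from the fitted model is far from this log odds ratio, the gap may come from the other covariates in the model or from how the variable is coded rather than from the low frequency alone. The wide interval reflects the fact that only 2 weather days had no accident.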

6. ## Re: low frequency independent variable in logistic regression

Would you have just two quantitative values or would more observations have a value?

7. ## Re: low frequency independent variable in logistic regression

> Originally Posted by hlsmith
> Would you have just two quantitative values or would more observations have a value?

If I went to actual precip. levels? I'd just use NOAA data, which are in inches to 1-2 decimals.
