# Large logit coefficient - what it means

#### neil1690

##### New Member
I am constructing a logit model with 6 explanators, trying to predict outcomes of hockey matches using odds (the variable I am referring to below) and other variables.
I'm just looking for some information on why a logit model may produce a very large coefficient (13.3186), how one should deal with it, and what it means. It was also statistically significant: std. error 5.48536, Wald 5.892. All other variables were insignificant.
I realise I have given a minimal amount of info here, but if anyone could point me in the direction of this, or help, it would be great. I have exhausted other sources in relation to answering this. Thanks.

#### noetsi

##### No cake for spunky
One reason it can be large is that the ML estimates don't exist, that is, the iterations failed to converge. However, you should get a warning if this occurred.

#### neil1690

##### New Member
Thanks for the reply. I didn't get a warning to that effect; the iterations were completed. I'm wondering whether the variable in question is simply much more strongly correlated with the dependent than the others, which were all insignificant.

#### noetsi

##### No cake for spunky
One reason that you get large coefficients, I now know, is partial (quasi-complete) or complete data separation. But, as noted, you will get an error message in most software. Another reason might be the unit of the IV. For example, if you measure change in income as the DV and your unit for the IV is decade, you will get a huge coefficient. Obviously that is a silly example, but you might look at what the unit of your IV is.

If the IV in question was the only major driver, it may be that it leads to a significant change.

#### hlsmith

##### Less is more. Stay pure. Stay poor.
Key question, when looking back at the data does the coefficient make sense, but is just surprising?

Does it make sense in that the value could actually be achieved? Was the IV continuous? If not, can you create an unadjusted 2 x 2 table and see whether it always or never occurs with your dependent? If it is continuous, calculate the mean or median of the variable for the two groups (stratified by the dependent variable). Do these data seem appropriate given the generated coefficient?

Lastly, what is your odds ratio looking like? Is the confidence interval about right, or does it reach toward infinity?

Also, what is the model fit like, and how many values do you have in each group of the dependent variable and how many IV are you using?
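The stratified comparison suggested above can be sketched in a few lines. This is only an illustration: `summarize_by_outcome` is a hypothetical helper name, and the values of `x` and `y` are made-up placeholders, not the hockey data from this thread.

```python
from statistics import mean, median

def summarize_by_outcome(x, y):
    """Group a continuous predictor x by a binary outcome y (0/1)
    and report the size, mean and median of each group."""
    groups = {0: [], 1: []}
    for xi, yi in zip(x, y):
        groups[yi].append(xi)
    return {
        g: {"n": len(v), "mean": mean(v), "median": median(v)}
        for g, v in groups.items()
    }

# Hypothetical placeholder data, for illustration only.
x = [0.45, 0.50, 0.55, 0.60, 0.62, 0.58]
y = [0, 0, 0, 1, 1, 1]

summary = summarize_by_outcome(x, y)
for outcome, stats in summary.items():
    print(outcome, stats)
```

If the two groups barely differ on the predictor, a very large odds ratio per full unit is worth a second look at how the predictor is scaled.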

#### noetsi

##### No cake for spunky
Huge standard errors are also often a sign that there is a problem with that variable.

#### neil1690

##### New Member
Would SPSS give an error message if a separation occurred? The unit of this independent variable is the probability implied by the bookmaker's odds, i.e. decimal odds of 5 imply a probability of 0.2. It is in this form (0.2) that the independent variable has been used. But I see what you mean with that example.
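For reference, the conversion described here is just the reciprocal of the decimal odds. A minimal sketch, assuming no adjustment for the bookmaker's margin (`implied_probability` is a hypothetical helper name):

```python
def implied_probability(decimal_odds: float) -> float:
    """Probability implied by decimal odds, with no correction
    for the bookmaker's margin (an assumption of this sketch)."""
    if decimal_odds < 1.0:
        raise ValueError("decimal odds must be at least 1")
    return 1.0 / decimal_odds

print(implied_probability(5.0))  # decimal odds of 5 imply 0.2
```

Because the result lives on a 0 to 1 scale, a one-unit change in this predictor means going from an impossible outcome to a certain one, which is worth keeping in mind when reading the coefficient.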

#### hlsmith

##### Less is more. Stay pure. Stay poor.
So do you think the beta coefficient is incorrect? If this is logistic regression, does the Hosmer-Lemeshow test show appropriate fit? Have you stratified the probabilities by dependent group, and is there a big difference that explains your results?

#### neil1690

##### New Member
Sorry, I only managed a quick reply earlier; I had to go somewhere just after I read the replies!
The coefficient has the correct sign. In the context of the model I would expect a greater probability of a 'one' in the dependent as this variable increases. But, yeah, the magnitude is baffling.
Well, exp(13.3186) = 605863.43, with Confidence Interval - LOWER 12.982, UPPER 28276381559.
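These figures can be roughly reproduced from the coefficient and standard error quoted earlier in the thread. The sketch below assumes a standard Wald test and a 95% Wald interval on the log-odds scale, which is the usual construction but is not confirmed here as exactly what SPSS did. The results land close to, not exactly on, the reported values, presumably because the displayed coefficient is rounded.

```python
import math

b = 13.3186    # coefficient as quoted in the thread
se = 5.48536   # standard error as quoted in the thread

# Wald chi-square statistic: (estimate / standard error) squared.
wald = (b / se) ** 2

# 95% Wald interval on the log-odds scale, then exponentiated.
z = 1.959964   # 97.5th percentile of the standard normal
odds_ratio = math.exp(b)
lower = math.exp(b - z * se)
upper = math.exp(b + z * se)

print(f"Wald = {wald:.3f}")
print(f"exp(B) = {odds_ratio:.1f}")
print(f"95% CI = ({lower:.3f}, {upper:.0f})")
```

The interval spans many orders of magnitude, which says that the size of the effect per full unit of the predictor is estimated very imprecisely, even though the interval excludes 1.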
Hosmer Lemeshow shows: sig. 0.774, indicating a good fit. Haven't stratified it yet, I'll have a go soon.

#### Dason

Can you tell us more about the predictor that gives you this beta? What are the possible range of values it takes on?

#### noetsi

##### No cake for spunky
"Would SPSS give an error message if a separation occurred?"
Yes, it gives the following:

Warnings
|-----------------------------------------------------------------------------------------|
|The parameter covariance matrix cannot be computed. Remaining statistics will be omitted.|

The link below may be useful on this issue (I ran into major separation and quasi-separation issues recently, so I have been working through it). From what you have said, you don't have this problem, because SPSS, unlike SAS, will not even generate estimates when separation occurs.

http://www.ats.ucla.edu/stat/mult_pkg/faq/general/complete_separation_logit_models.htm
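As a rough manual check alongside the software warning, separation on a single continuous predictor can be spotted by comparing the ranges of the predictor in the two outcome groups. This is a sketch under stated assumptions: `separation_status` is a hypothetical helper, the data are made-up placeholders, and it only examines one predictor at a time, so it would miss separation produced by a combination of predictors.

```python
def separation_status(x, y):
    """Classify separation of a binary outcome y (0/1) by one predictor x.

    Returns "complete" if the two groups do not overlap at all,
    "quasi-complete" if they only touch at a single boundary value,
    and "none" otherwise.
    """
    x0 = [xi for xi, yi in zip(x, y) if yi == 0]
    x1 = [xi for xi, yi in zip(x, y) if yi == 1]
    if not x0 or not x1:
        raise ValueError("both outcome groups must be non-empty")
    if max(x0) < min(x1) or max(x1) < min(x0):
        return "complete"
    if max(x0) == min(x1) or max(x1) == min(x0):
        return "quasi-complete"
    return "none"

# Hypothetical placeholder data, for illustration only.
print(separation_status([0.42, 0.50, 0.55, 0.60], [0, 0, 1, 1]))  # complete
print(separation_status([0.42, 0.55, 0.50, 0.60], [0, 0, 1, 1]))  # none
```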

#### neil1690

##### New Member
No problem. The predictor is a probability ranging from a minimum of 0.418 to a maximum of 0.657, with a mean of 0.55, over 100 observations. Split by the dependent, the mean is 0.537 when the outcome is 0 and 0.562 when it is 1.
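With these numbers, the coefficient can be re-expressed over changes the predictor can actually make, since the reported value describes a move from 0 to 1 that never occurs in the data. A small sketch using only figures quoted in this thread (`odds_ratio_for_change` is a hypothetical helper name):

```python
import math

b = 13.3186                     # coefficient per 1.0 change in probability
x_min, x_max = 0.418, 0.657     # observed range of the predictor
mean_0, mean_1 = 0.537, 0.562   # predictor means for outcome 0 and 1

def odds_ratio_for_change(coef, delta):
    """Odds ratio implied by changing the predictor by delta."""
    return math.exp(coef * delta)

per_point = odds_ratio_for_change(b, 0.01)             # one percentage point
across_range = odds_ratio_for_change(b, x_max - x_min)
between_means = odds_ratio_for_change(b, mean_1 - mean_0)

print(f"per 0.01 increase:     {per_point:.3f}")
print(f"across observed range: {across_range:.1f}")
print(f"between group means:   {between_means:.3f}")
```

On this reading the odds rise by roughly 14% per percentage point of implied probability, which is far less startling than exp(B) on the raw scale. Multiplying the predictor by 100 before fitting would give the same model with a coefficient 100 times smaller.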

#### neil1690

##### New Member
I haven't been given that message, and the estimates were generated OK. Cheers for the link! Very useful.