# Logit v. Probit: A fight to the death

#### Dason

So by log link you mean:

$$\log(p_i) = \beta_0 + \beta_1X_{1i} + \beta_2X_{2i} + ... + \beta_nX_{ni}$$

Which would give

$$p_i = \exp(\beta_0 + \beta_1X_{1i} + \beta_2X_{2i} + ... + \beta_nX_{ni})$$

which is unbounded above. This may or may not be a problem depending on the parameters and what you're looking to do with the model but it is something to keep in mind.
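To make the boundedness point concrete, here is a minimal sketch comparing the inverse of the log link with the inverse of the logit link. The values of the linear predictor are hypothetical, chosen only for illustration.

```python
import math

def inv_log_link(eta):
    """Inverse of the log link: p = exp(eta). Not bounded above by 1."""
    return math.exp(eta)

def inv_logit_link(eta):
    """Inverse of the logit link: p = 1 / (1 + exp(-eta)). Always in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical values of the linear predictor beta_0 + beta_1 * X_1 + ...
etas = [-5.0, -2.0, -0.5, 0.0, 0.5]

for eta in etas:
    print(
        f"eta={eta:+.1f}  "
        f"log link p={inv_log_link(eta):.4f}  "
        f"logit link p={inv_logit_link(eta):.4f}"
    )
```

When the linear predictor is very negative (a rare outcome) the two inverse links give nearly the same probability. Once it rises above 0, the log link returns a "probability" greater than 1.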

##### Ninja say what!?!
Ahhh. Very good point Dason! Most of the time, I have outcomes that are very rare. I don't ever run into the problem that $p_i$ is greater than 1. But that would explain why it is not as popular, as it would be a nuisance to deal with.

#### Dason

Like I said - it may or may not be a problem. We often deal with cases where our predictions could be an impossible event. If you're doing a simple linear regression where the outcome is known to be a positive quantity then the regression line can go negative - this probably doesn't bug you too much unless your regression line is close to 0 and your prediction intervals have a significant portion on the negative side.

#### vinux

##### Dark Knight
From an academic point of view there is merit in looking at the details. Why should one be limited to probit or logit? One could also go for other CDFs, like the CDF of the Laplace or skew normal distribution, the Linear Probability Model (LPM, which is not bad if we are studying behaviour around the mean), or log(log), etc., as the index/link function. Estimation may be difficult.

If the data are generated the probit way (by simulation), then in estimation probit will definitely be a good fit. So academically there is no point arguing about which one is better. But in practice it matters.
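As a rough illustration of the "other CDF" idea, the sketch below evaluates several candidate inverse links at the same values of the index. It assumes that "log(log)" refers to the complementary log-log link, and the function names are made up for this example. The skew normal is left out because its CDF has no simple closed form.

```python
import math
from statistics import NormalDist

def logistic_cdf(eta):
    """Inverse logit link."""
    return 1.0 / (1.0 + math.exp(-eta))

def normal_cdf(eta):
    """Inverse probit link (standard normal CDF)."""
    return NormalDist().cdf(eta)

def laplace_cdf(eta):
    """Standard Laplace CDF (location 0, scale 1)."""
    if eta < 0:
        return 0.5 * math.exp(eta)
    return 1.0 - 0.5 * math.exp(-eta)

def cloglog_inverse(eta):
    """Inverse complementary log-log link; asymmetric around eta = 0."""
    return 1.0 - math.exp(-math.exp(eta))

def linear_probability(eta):
    """Identity link (LPM): no squashing, so values can leave [0, 1]."""
    return eta

INVERSE_LINKS = {
    "logit": logistic_cdf,
    "probit": normal_cdf,
    "laplace": laplace_cdf,
    "cloglog": cloglog_inverse,
    "lpm": linear_probability,
}

for eta in (-3.0, -1.0, 0.0, 1.0, 3.0):
    row = "  ".join(f"{name}={f(eta):.3f}" for name, f in INVERSE_LINKS.items())
    print(f"eta={eta:+.1f}  {row}")
```

Every CDF-based choice maps the index into [0, 1] with a different sigmoid shape. Only the LPM column can fall outside that range.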

#### Dason

> If the data are generated the probit way (by simulation), then in estimation probit will definitely be a good fit. So academically there is no point arguing about which one is better. But in practice it matters.

And my concern is partially what we do in practice. But I don't think that the academic part can be disregarded. The academic findings drive what is done in practice (it might take years for it to be common practice but the methods come from somewhere). I was interested in interpretations and the reason to use one model over the other. Sure we can do some simulation studies saying "well if the data was generated this way then this method is better for this situation but if it was generated this way..." and that's all well and good but I wanted to explore the other reasons for choosing a model.

We have to choose one model or the other (or do we?) so I was interested in justifications for one over the other.

> one could also go for other CDFs

Now if we're thinking about purely academic thought experiments I wonder how well the Poisson inverse CDF would do as the link function for an integer valued covariate in the binomial situation...

#### noetsi

##### No cake for spunky
My position on this is pretty simple. In the real world of business and government offices, if you can't explain what you are doing very quickly it won't be paid any attention. Period. Sadly, I have practical experience of that. So you have to do this calculus. Suppose method Y is worse than method X, but X will be rejected out of hand by management and thus entirely ignored. The question you have to ask yourself is whether it is better to report Y (which is presumably better than nothing, often by quite a bit) and have it accepted, or to go with X and get nothing.

I know people here believe that you can explain complex statistics to management and get them to accept it in line agencies, but that is not my experience, nor that of any other analyst I have ever known who works there. Academics and business/government are very different worlds. You would be unbelievably lucky even to be given the time to explain the more complex numbers - especially by senior managers who actually make the decisions.

#### Dason

Do you have experience in government or just business?

#### Jake

I don't think anyone doubts that managers have a hard time understanding statistical concepts. A cursory look at this forum reveals that even many junior researchers don't understand basic statistical concepts. What I find a little more puzzling is why on Earth these managers expect you to be able to explain statistical methods to them quickly or easily, or why they feel that a method is worthless if they can't quickly grasp it. I mean... do they pay you to be a teacher of statistics, or to be a statistician? If the latter, then it seems clear to me that what they should care about is simply whether or not you have done your job right. It must be frustrating to work somewhere where this is not the case.

#### noetsi

##### No cake for spunky
> Do you have experience in government or just business?

Both. The only real difference, in analysis, is that you have to produce results a lot faster in business. The basic realities I note are pretty much the same in both. The same goes for the views of others I have talked to and observed, that is, others doing analysis.

I work as a data analyst now at the Florida Department of Education. Before that I worked as a data analyst at Kaiser, Wellcare, and the TN Department of Transportation (the last four jobs after my departure from the wondrous world of academics).

#### bryangoodrich

##### Probably A Mammal
> I know people here believe that you can explain complex statistics to management and get them to accept it in line agencies, but that is not my experience, nor that of any other analyst I have ever known who works there. Academics and business/government are very different worlds. You would be unbelievably lucky even to be given the time to explain the more complex numbers - especially by senior managers who actually make the decisions.

That is why you have to be able to take complex ideas and put them into terms they can understand. Rarely do statisticians report the statistics. You save that for publication. You present lay ideas to lay people. You can do the more complex model. It is just incumbent on the statistician to be able to communicate the results in a way that management can understand. Certainly odds ratios and RR are a good device for making that transition, but that should not deter one from using a probit or some other GLM. It may come down to simply answering the question they sought to ask, while the process remains a black box to them. That is often how an analyst's job is. Don't even try to open Pandora's box for them. Leave them ignorant. Give them what they want to the extent they can understand it.

#### noetsi

##### No cake for spunky
> I don't think anyone doubts that managers have a hard time understanding statistical concepts. A cursory look at this forum reveals that even many junior researchers don't understand basic statistical concepts. What I find a little more puzzling is why on Earth these managers expect you to be able to explain statistical methods to them quickly or easily, or why they feel that a method is worthless if they can't quickly grasp it. I mean... do they pay you to be a teacher of statistics, or to be a statistician? If the latter, then it seems clear to me that what they should care about is simply whether or not you have done your job right. It must be frustrating to work somewhere where this is not the case.

Why they pay me, a lot of money in the private sector, has often puzzled me. I think the answer to your question lies in the nature of management. They make the decisions, and if they simply take your word for it (given that trust in statistics is not all that high to start with) they aren't making the decisions. You are. If you don't trust your own judgement as a manager, you don't go that far - it's not a niche given to doubts about your ability or humility generally. They also have to sell their decisions and if they can't explain (quickly and intuitively) what they are doing that is a lot harder.

They have a lot more confidence, and interest, in finance than statistics - which I suspect is extremely rare outside a few medical firms in the US. Now it is true that I spent a lot more time in academics than the business world and am not in the same world of understanding of statistics as people here. But I am pretty confident from my observations and talking to others with more experience that anything remotely complex statistically would go nowhere. I discussed an ANOVA project once with a senior analyst, it took ten minutes for him to decide it (a simple project compared to what is discussed here) would go nowhere.

One reason analysts are hired with specialized knowledge or degrees is political. My boss (a really smart man with a PhD in economics) hired me to do fairly basic analysis at a large health care firm because he could use the fact that I had a doctorate to sell what I ran for the company. Never mind that my degree had nothing to do with statistics, that I was only modestly trained in it, and that they ignored everything I actually ran. The doctorate mattered.

Don't go expecting that business or government is rational as academics see it....

#### Dason

> Don't go expecting that business or government is rational as academics see it....

At the same time don't go assuming that your experience encompasses all of the private sector and/or government (not that I'm saying you do - but your claims do typically encompass "the private sector"). I know plenty of people that have worked in both business and the government that have had experiences quite the opposite of yours (and some that have had experiences similar to yours).

#### spunky

##### Can't make spagetti
> I know plenty of people that have worked in both business and the government that have had experiences quite the opposite of yours (and some that have had experiences similar to yours).

i did an internship at ETS (Educational Testing Service, the guys who administer the SATs, GREs, TOEFL, etc.) and my experience was very, very different from noetsi's. everybody there understood statistics well beyond the usual MLR/ANOVA and i was the one who had to do a lot of catching up so i could contribute at least something to the projects i was assigned to... so i guess it depends on who you work with and where you do it...

#### CB

##### Super Moderator
> i did an internship at ETS (Educational Testing Service, the guys who administer the SATs, GREs, TOEFL, etc.) and my experience was very, very different from noetsi's. everybody there understood statistics well beyond the usual MLR/ANOVA and i was the one who had to do a lot of catching up so i could contribute at least something to the projects i was assigned to... so i guess it depends on who you work with and where you do it...

ETS has sure had some rather competent statisticians over the years... the names Jöreskog and Lord come to mind!

I agree that this issue depends a lot on individual contexts. The private sector is catching some stick here, but academics can be pretty standoffish about unfamiliar techniques too. I once had a reviewer describe an article in which I used Spearman's rho and logistic regression as using "rather unique" analyses and asking me to add a bunch of explanatory notes so people would understand these dreadfully new and scary techniques... :O

Anyway, I agree with those pointing out that we should be able to explain the techniques we're using, but the reality is that the pre-existing knowledge of the intended audience for our research does constrain the degree to which this is possible. You can't be talking to a person who doesn't understand the concept of a standard deviation and realistically expect that you'll be able to explain the probit model to them in a satisfactory way (unless you have several months spare to talk about it!)

#### vinux

##### Dark Knight
I actually got bored with typical questions in the forum. This thread is interesting.

I worked 5 years in analytics. I developed statistical models for a few BIG firms in the US, some of them as part of Basel II norms. I have used probit, logit, GAM, etc.

Let me start with Dason. I am not against probit. In my experience both probit and logit gave similar results (KS, concordance, etc.). But there are cases where probit performed better than logit. The difference was very small.
One main reason I liked logit is that the estimation is straightforward (not a black box). Have you tried to estimate the standard errors of the beta estimates using probit? The logit SE estimates are similar to those in regression, and this gives a lot of simplicity.
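One way to see why the two models tend to give similar fits is to compare the two sigmoid curves directly. The sketch below uses the conventional rescaling constant 1.702, which is an assumption brought in for illustration and is not something mentioned in the thread. It measures the largest gap between the logistic CDF and the rescaled normal CDF over a grid.

```python
import math
from statistics import NormalDist

SCALE = 1.702  # conventional constant used to align the logistic with the normal

def logistic_cdf(x):
    return 1.0 / (1.0 + math.exp(-x))

std_normal = NormalDist()

# Evaluate both curves on a fine grid from -8 to 8.
grid = [i / 100 for i in range(-800, 801)]
max_gap = max(abs(logistic_cdf(x) - std_normal.cdf(x / SCALE)) for x in grid)

print(f"largest gap between the two curves on the grid: {max_gap:.4f}")
```

After rescaling, the curves differ by less than one percentage point everywhere on the grid, so large differences in fit would mostly have to come from the tails.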

Using the latent variable approach we could explain index models. Instead of probit, use the CDF of the skew normal (I guess this makes a lot of sense). By assuming another CDF as the index function we assume a different sigmoid curve. If one can understand probit they will understand this too.

> Now if we're thinking about purely academic thought experiments I wonder how well the Poisson inverse CDF would do as the link function for an integer valued covariate in the binomial situation...

I had mentioned in the first post that the support is R. It is also possible to assume another range, e.g. the uniform. This is the linear probability model (but the probability can go outside the (0,1) range).

There are some R&D type analytics companies that experiment with different alternative models. They explore the academic literature. But in general, in practice, I found interpretation is one of the key elements in model development.

#### spunky

##### Can't make spagetti
> But in general, in practice, I found interpretation is one of the key elements in model development.

now, when you say "interpretation" do you mean that we, as statisticians/methodologists/quantitative analysts, are able to interpret and communicate it, or do you mean anyone should be able to interpret it?

i am trying to reconcile this apparent dichotomy that a few of our members have commented on with regards to whether statistical results should be a hermeneutical exercise of the experts or whether we should make things so simple that anyone can grasp them. i mean, i don't think anyone in his or her right mind should develop a statistical model that he or she cannot interpret, but is it good practice to run models that are not accurate simply on the basis of interpretability for the lay person? once again, we come back to Einstein's adage: "Make things as simple as possible, but not simpler"

#### vinux

##### Dark Knight
> now, when you say "interpretation" do you mean that we, as statisticians/methodologists/quantitative analysts, are able to interpret and communicate it, or do you mean anyone should be able to interpret it?
>
> i am trying to reconcile this apparent dichotomy that a few of our members have commented on with regards to whether statistical results should be a hermeneutical exercise of the experts or whether we should make things so simple that anyone can grasp them. i mean, i don't think anyone in his or her right mind should develop a statistical model that he or she cannot interpret, but is it good practice to run models that are not accurate simply on the basis of interpretability for the lay person? once again, we come back to Einstein's adage: "Make things as simple as possible, but not simpler"

The focus here is probit vs logit.

#### noetsi

##### No cake for spunky
ETS is not a regular business. They do testing and by the nature of testing they have expertise (and respect for expertise) far beyond the norm. There are other agencies and businesses in similar worlds who I am sure are the same. I speak of the 99 percent of regular organizations that are not involved in such work. I would truly love to believe that respect for statistics is common in the US. Nothing I have ever seen suggests that, and none of the more experienced people I have spoken to believe it.

I am currently working on a three year assessment for a large federal agency. We are using things like ordered and multinomial logit and structured qualitative analysis. I looked at other states and the most "sophisticated" thing they were doing was chi square - and they did that wrong. Looking at some of the Six Sigma work makes me want to gag - given the way statistics are being misused (or things labeled statistics that are not).

Again I hope I am wrong - statistics should be central to the US economic recovery. But I don't believe it's very likely.

#### Dason

> One main reason I liked logit is that the estimation is straightforward (not a black box). Have you tried to estimate the standard errors of the beta estimates using probit? The logit SE estimates are similar to those in regression, and this gives a lot of simplicity.

The estimation process for the logit and the probit models is essentially identical isn't it? Is there some closed form solution for the logistic model that I don't know of? Either way we can get the estimates for the probit model - it might take a little bit more work to implement an algorithm if you're doing things from a frequentist perspective but it's the same algorithm for both links. So if you're calling the probit estimation "black box" wouldn't that make the logistic estimation "black box" as well?
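As a sketch of the "same algorithm for both links" point, here is a small Fisher scoring routine in which the only thing that changes between logit and probit is the pair of functions (inverse link and its derivative). The toy data, the function name `fit_binary_glm` and the restriction to one covariate plus an intercept are all assumptions made to keep the example short. This is not a reference implementation.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def _logistic_cdf(eta):
    return 1.0 / (1.0 + math.exp(-eta))

def _logistic_pdf(eta):
    p = _logistic_cdf(eta)
    return p * (1.0 - p)

# Each link supplies the inverse link (a CDF) and its derivative (a density).
LINKS = {
    "logit": (_logistic_cdf, _logistic_pdf),
    "probit": (_STD_NORMAL.cdf, _STD_NORMAL.pdf),
}

def fit_binary_glm(x, y, link, n_iter=50, tol=1e-10):
    """Fisher scoring for P(y=1) = F(b0 + b1*x); returns (b0, b1, se0, se1)."""
    cdf, pdf = LINKS[link]
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        s0 = s1 = 0.0          # score vector
        a = b = c = 0.0        # information matrix [[a, b], [b, c]] = X'WX
        for xi, yi in zip(x, y):
            eta = b0 + b1 * xi
            mu = cdf(eta)
            d = pdf(eta)                   # d mu / d eta
            var = mu * (1.0 - mu)
            w = d * d / var                # GLM working weight
            r = d * (yi - mu) / var
            s0 += r
            s1 += r * xi
            a += w
            b += w * xi
            c += w * xi * xi
        det = a * c - b * b
        step0 = (c * s0 - b * s1) / det    # information^{-1} @ score
        step1 = (a * s1 - b * s0) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < tol:
            break
    # Standard errors from the inverse information of the last iteration.
    return b0, b1, math.sqrt(c / det), math.sqrt(a / det)

# Hypothetical toy data (not perfectly separable, so the MLE exists).
X_DATA = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
Y_DATA = [0, 0, 1, 0, 0, 1, 0, 1, 1, 1]

for link_name in LINKS:
    b0, b1, se0, se1 = fit_binary_glm(X_DATA, Y_DATA, link_name)
    print(f"{link_name:6s} b0={b0:+.3f} (SE {se0:.3f})  b1={b1:+.3f} (SE {se1:.3f})")
```

The loop body is identical for both links, and the standard errors come from the same inverse information matrix in each case. The coefficients differ mainly by a scale factor because the two latent distributions have different variances.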

I have done the standard error estimation for the probit model, both in a frequentist and a Bayesian setting. There's more math to be done in the frequentist setting to get some of the algorithms to work well but hey nobody ever said it would be simple. There is actually a really nice MCMC algorithm to get samples from the posterior for the Bayesian setting which makes getting standard errors a breeze (although I don't see why we would need the standard errors too much in this case since we just have the posterior distribution)

#### vinux

##### Dark Knight
I am not considering Bayesian here, because the approach is different. I agree with all the points you mention about Bayesian. I am also not considering the bootstrap.

> The estimation process for the logit and the probit models is essentially identical isn't it? Is there some closed form solution for the logistic model that I don't know of? Either way we can get the estimates for the probit model - it might take a little bit more work to implement an algorithm if you're doing things from a frequentist perspective but it's the same algorithm for both links. So if you're calling the probit estimation "black box" wouldn't that make the logistic estimation "black box" as well?

Check Nelder's or McCulloch's GLM book (I don't remember which one); they explain the asymptotic distribution for the logit. The variance matrix of beta has the form of the covariance from the weighted least squares method. I tried it for probit (a long time back) and was not able to solve it. Since I know more about logit, and everybody in the business understands what odds are, I used to prefer logit.
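For reference, the weighted least squares form mentioned here can be written for a general binary-response link, under standard GLM theory. In the notation below, $F$ is the CDF used as the inverse link and $f$ is its density. These symbols are introduced for this note and are not from the thread. With $\eta_i = \beta_0 + \beta_1X_{1i} + \beta_2X_{2i} + ... + \beta_nX_{ni}$ and $p_i = F(\eta_i)$:

$$\widehat{\mathrm{Var}}(\hat\beta) \approx (X^\top W X)^{-1}, \qquad W = \mathrm{diag}(w_i), \qquad w_i = \frac{f(\eta_i)^2}{p_i(1 - p_i)}$$

For the logit, $f(\eta_i) = p_i(1 - p_i)$, so the weight reduces to $w_i = p_i(1 - p_i)$. For the probit, $w_i = \phi(\eta_i)^2 / \left[\Phi(\eta_i)(1 - \Phi(\eta_i))\right]$, which is less tidy but has the same structure.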

I would love to see more discussion on the core topic. This would give more takeaways and less fighting.

@Dason, why the last part, "A fight to the death"? Any reason?