1. ## Probit Regression?

So I was working on a logistic regression with a binary DV, but it now turns out that the DV is actually a probability, taking values between 0 and 1 (obviously).

Should I switch my focus to probit regression now? I am finding some material, but not a whole lot.

A push in the right direction would be greatly appreciated!

2. ## Re: Probit Regression?

When you say that the response is probabilities do you mean that it's actually a proportion? Or are we talking about a continuous outcome that could take any value between 0 and 1?

3. ## The Following User Says Thank You to Dason For This Useful Post:

Autobot (08-16-2012)

4. ## Re: Probit Regression?

You should get the same substantive results with logistic and probit regression.
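A quick numerical check of why the two models tend to agree: the sketch below compares the inverse probit link (the standard normal CDF) with the inverse logit link after rescaling the logit's argument. The constant 1.702 is a commonly quoted rescaling factor and is an assumption of this sketch, not something taken from the thread.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, i.e. the inverse of the probit link."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def logistic_cdf(z):
    """Logistic CDF, i.e. the inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-z))

# Assumed rescaling constant: logit-scale values are roughly 1.7 times
# probit-scale values.
SCALE = 1.702

# Compare the two curves on a grid from -6 to 6.
grid = [i / 100.0 for i in range(-600, 601)]
max_gap = max(abs(normal_cdf(z) - logistic_cdf(SCALE * z)) for z in grid)
print(f"largest gap between the two curves: {max_gap:.4f}")
```

The two curves differ by less than one percentage point everywhere on the grid, so the fitted probabilities usually look alike even though the coefficients are on different scales.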

6. ## Re: Probit Regression?

You can model proportions using a GLM with a logit link. See e.g. here. I can't think of any reason you couldn't also do it with a probit link (I just tried and it worked), but I generally prefer the logistic model because the coefficients are so much easier to interpret.
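As a rough illustration of fitting a logit-link model to a proportion-valued response, here is a minimal pure-Python sketch. The data are made up, the single predictor `x` and the function name `fit_fractional_logit` are hypothetical, and the fit uses plain Newton steps on the binomial score equations; in practice one would normally call a GLM routine from a statistics package instead.

```python
import math

def inv_logit(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_fractional_logit(x, y, iterations=25):
    """Fit mean(y) = inv_logit(b0 + b1 * x) by Newton's method on the
    binomial score equations. y may be any value in [0, 1]."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        # Score vector (g0, g1) and information matrix [[h00, h01], [h01, h11]].
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = inv_logit(b0 + b1 * xi)
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information matrix) * delta = score.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Made-up illustrative data: y is a proportion, not a 0/1 outcome.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [0.05, 0.10, 0.30, 0.45, 0.80, 0.90]

b0, b1 = fit_fractional_logit(x, y)
fitted = [inv_logit(b0 + b1 * xi) for xi in x]
print("intercept:", round(b0, 3), "slope:", round(b1, 3))
```

Nothing in the fitting step requires `y` to be 0 or 1. A probit link would swap the inverse link for the normal CDF, but it would need a different update formula, because the simple score used here is specific to the logit link.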

8. ## Re: Probit Regression?

My DV is the probability that a failure was caused by a certain ingredient. The DV (dependent variable) is masked just because of the way this place takes in data, so I have to use a probability instead of a pass/fail scenario. Also, the probabilities do not sum to 1. I do not have the data yet, but it looks like logistic might still be the way to go?

9. ## Re: Probit Regression?

Or maybe I should do a beta regression? And to handle 0's I could transform the DV (dependent variable) as follows:

[latex]x' = \frac{x(N-1)+s}{N}[/latex]

where N=sample size
s=arbitrary number in (0,1), I am choosing 0.5

http://psychology3.anu.edu.au/people...erkuilen06.pdf
page 61 on right top side.
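A minimal sketch of that transformation, x' = (x(N-1) + s)/N, in Python. The function name `squeeze_unit_interval` and the sample size of 50 are hypothetical, chosen only for illustration.

```python
def squeeze_unit_interval(x, n, s=0.5):
    """Map x in [0, 1] into the open interval (0, 1) via (x * (n - 1) + s) / n.

    n is the sample size and s is the arbitrary constant in (0, 1).
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie in [0, 1]")
    if n < 1 or not 0.0 < s < 1.0:
        raise ValueError("need n >= 1 and 0 < s < 1")
    return (x * (n - 1) + s) / n

n = 50  # hypothetical sample size
for x in (0.0, 0.25, 1.0):
    print(x, "->", squeeze_unit_interval(x, n))
```

With s = 0.5 the endpoints 0 and 1 are pulled inward by the same amount (to 0.01 and 0.99 when N = 50), and 0.5 maps to itself.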

10. ## Re: Probit Regression?

If your DV is expressed as a probability (from 0 to 1, but with essentially infinitely many values in between, so it's not a binary variable), why can't you simply use OLS, which is much easier to interpret and run diagnostics on?

11. ## Re: Probit Regression?

Logit is about 0/1 values. But it can also be about a proportion, for example 15 out of 20 (that is, 15/20). But then that is basically a sum of 0/1 variables.

If it is values that can take any value between 0 and 1, then a beta-distribution might be useful.

I must admit I don’t understand this so I might have misunderstood this post:
Originally Posted by Autobot
The DV is masked just b/c
It brings to mind “The Ecologists” posting instruction:
"So don't use instant-messaging [SMS] shortcuts. Spelling "you" as "u" makes you look like an semi-literate dud who just saved two entire keystrokes."
I don’t want to go that far, but if you want to be understood and answered, don’t use abbreviations.

13. ## Re: Probit Regression?

Originally Posted by GretaGarbo
"So don't use instant-messaging [SMS] shortcuts. Spelling "you" as "u" makes you look like an semi-literate dud who just saved two entire keystrokes."
should be a semi-literate not an. Pot calling kettle black. Hopefully my posts are more understandable now

14. ## Re: Probit Regression?

Originally Posted by Autobot
should be a semi-literate not an. Pot calling kettle black. Hopefully my posts are more understandable now
Greta was just quoting this thread on posting guidelines (sure, there are mistakes in it, but the point still stands). She hates abbreviations though. English isn't everybody's first language here (Greta is included in this), so she was just trying to make you more conscious of that.

16. ## Re: Probit Regression?

I'm sure Greta meant "semi-literate dude" as a term of endearment

17. ## Re: Probit Regression?

The only reason it kind of bothered me is that as soon as somebody sidetracks a thread it becomes defunct, and then I get no more useful posts. But as Dason points out, I will make sure to make everything crystal clear for those that might not understand English so well. I do appreciate everyone's help though. I am looking to do the beta regression with the adjustment to the 0's so I can keep my range as (0,1) instead of [0,1].

18. ## Re: Probit Regression?

Well, as long as the conversation is sidetracked, I'll help you with the latex tags you used above. You used latex rather than tex or MATH. I prefer MATH, as tex would not have displayed your info correctly.

So...

[MATH]x' = \frac{x(N-1)+s}{N}[/MATH]

Gives you this...

Sorry to further sidetrack, but this may be helpful to you in posting here; I know it was for me.

OK let's get this thread back to its original intent everyone

20. ## Re: Probit Regression?

As Dason pointed out, I was literally quoting “The Ecologists” forum guidelines (not “posting instructions”, sorry for that). I am not a native English speaker and I guess that The Ecologist is not either. (And yes, I also noticed that “an”, but I didn’t want to change the quotation.) And besides, I said that I did not want to go that far. I interpret it as: don’t shorten words when it is not necessary.

But that is the formulation given in “How to post”. It has been there for years. Anybody who has read the guidelines and is good at English could have suggested a correction.

@Smoothjohn, I did not say “semi-literate”. It was the forum guidelines.

Originally Posted by Dason
She hates abbreviations though.
No, I don’t hate abbreviations. I just feel sorry for those who post and are not understood. This is an international site. What is obvious for some in one country might not be understandable in other countries, for example for our friends in India and Nigeria. What is obvious for the psychometrician might be unknown for chemometricians.

We can talk about glm, hglm, gee, gllamm, gam, gmm, gamlss, glmm and the pros and cons of each of them. I am sure that Dason understands this and could give a lecture about each of them, but (and this is my point) would the hundreds of readers here understand all of that? And it only takes one abbreviation to lose the reader.

@Autobot. Now you can ask yourself which you prefer: to be given a suggestion in broken English [you seem to have accepted the idea of the beta distribution] and be told that I had not understood your abbreviation, or to be left without a suggestion?

This is a suggestion for the improvement of this community, for Autobot and other writers: If you want to be understood and answered, don’t use abbreviations!

22. ## Re: Probit Regression?

Originally Posted by Autobot
s=arbitrary number in (0,1), I am choosing 0.5
This is why I don't like this approach - you need to choose an arbitrary number to get your model to run. If you change the arbitrary number then you get slightly different results. Using a generalised linear model (GLM - see Greta I'm listening!) you do not need to do this kind of fudge.
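To make the sensitivity concrete, the sketch below shows where an observed 0 ends up for a few choices of s, and what that value looks like on the logit scale. The logit scale is used here on the assumption that a beta regression would model the mean through a logit link. The sample size of 50 and the helper names are hypothetical.

```python
import math

def transform(x, n, s):
    """The adjustment x' = (x * (n - 1) + s) / n quoted above."""
    return (x * (n - 1) + s) / n

def logit(p):
    return math.log(p / (1.0 - p))

n = 50  # hypothetical sample size
results = {}
for s in (0.1, 0.5, 0.9):
    x0 = transform(0.0, n, s)   # where an observed 0 ends up
    results[s] = logit(x0)      # its position on the logit scale
    print(f"s={s}: 0 -> {x0:.4f}, logit = {results[s]:.3f}")
```

On the raw scale the three transformed zeros look almost identical, but on the logit scale they are more than two units apart, which is how a different choice of s can move the fitted coefficients when the data contain many zeros.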

Originally Posted by GretaGarbo
Logit is about 0/1 values.
Only by convention. It's just another data transformation. It happens to be very useful for modelling binary proportions, which is why it's used for that; but as far as I know there is absolutely no statistical or mathematical reason why it shouldn't be used for arbitrary proportions between 0 and 1.

Of course, you can always try a couple of different models and then choose the one that provides the best fit for your data.
