# Thread: OLS regression versus Binary regression

1. ## OLS regression versus Binary regression

Hello,

I am currently working on my master's thesis and I have a question regarding the use of two types of regression. First, I will explain the situation:

In one hypothesis, I analyze whether people who participate in on-the-job training (training = variable 1) score high on 'perceived organizational support' (POS = variable 2).

The variable POS is created by combining three separate variables (converted to z-scores), of which two are measured on a 5-point Likert scale and one is dichotomous (yes or no). Thus, the combined variable is continuous.

Because the combined variable POS consists of three variables, of which two can only take values between 1 and 5 and one only the values 1 or 2, POS also has a small range. Descriptive statistics show that POS ranges from -1,27 to 1,87. They also show that the values are not equally distributed. For example, the value -1,27 has a frequency of 94.
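To make the construction above concrete, here is a minimal sketch on simulated data. The sample size, the item names, and the choice to average the three z-scores are all assumptions for illustration; the post does not say how the items were combined. It shows how such a composite ends up with a narrow range and only a limited set of distinct values.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500  # hypothetical sample size

# Hypothetical raw items: two 5-point Likert items and one yes/no item coded 1/2.
likert_a = rng.integers(1, 6, size=n)
likert_b = rng.integers(1, 6, size=n)
yes_no = rng.integers(1, 3, size=n)


def zscore(x):
    """Standardize a variable to mean 0 and standard deviation 1."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std(ddof=1)


# Assumption: POS is the mean of the three z-scores.
pos = (zscore(likert_a) + zscore(likert_b) + zscore(yes_no)) / 3

# Round before counting so floating-point noise does not split identical scores.
values, counts = np.unique(np.round(pos, 6), return_counts=True)

print("range:", pos.min(), "to", pos.max())
print("number of distinct values:", len(values))
print("largest frequency of a single value:", counts.max())
```

With 5 x 5 x 2 possible answer patterns, the composite can take at most 50 distinct values, so a frequency table like the one described above is a quick way to see how discrete the variable really is.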

Since POS (the dependent variable) is continuous, I used an OLS regression analysis. However, my supervisor advised me to also use a binary logistic regression analysis. He advised me to code the upper 50% of POS values as 1 (the high scores) and the lower 50% as 0 (the low scores). In doing so, it becomes clear whether, as expected, the people who train are the people with a POS level in the upper 50%.
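The dichotomization described above can be sketched as follows. The data are made up, the function names `median_split` and `odds_ratio` are hypothetical, and splitting at the median with ties going to the low group is an assumption. With training as the only (binary) predictor, the maximum-likelihood logistic regression reduces to the 2x2 table: the slope is the log of the sample odds ratio and the intercept is the log odds of a high score among the untrained.

```python
import math

import numpy as np


def median_split(pos):
    """Code scores above the median as 1 (high) and all others as 0 (low)."""
    pos = np.asarray(pos, dtype=float)
    return (pos > np.median(pos)).astype(int)


def odds_ratio(train, high):
    """Odds of a high score for trained versus untrained people.

    Assumes all four cells of the 2x2 table are non-empty.
    """
    train = np.asarray(train)
    high = np.asarray(high)
    a = np.sum((train == 1) & (high == 1))  # trained, high POS
    b = np.sum((train == 1) & (high == 0))  # trained, low POS
    c = np.sum((train == 0) & (high == 1))  # untrained, high POS
    d = np.sum((train == 0) & (high == 0))  # untrained, low POS
    return (a * d) / (b * c)


# Made-up example data: 4 trained and 4 untrained respondents.
train = np.array([1, 1, 1, 1, 0, 0, 0, 0])
pos = np.array([0.9, 1.2, 0.4, -0.5, -1.27, -1.27, 0.1, 0.6])

high = median_split(pos)
or_hat = odds_ratio(train, high)

# Logistic regression coefficients implied by the 2x2 table.
slope = math.log(or_hat)
p_high_untrained = high[train == 0].mean()
intercept = math.log(p_high_untrained / (1 - p_high_untrained))

print("high:", high.tolist())
print("odds ratio:", or_hat, "slope:", slope, "intercept:", intercept)
```

One thing to watch: when many respondents share the same POS value, a split at the median will not give exactly 50/50 groups, because everyone tied at the cutoff lands on the same side.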

My questions are as follows:

1. What are the downsides or risks of using an OLS regression in this situation?
2. What are the upsides of the binary logistic regression as described above for potentially dealing with the downsides of an OLS regression?

2. ## Re: OLS regression versus Binary regression

Someone correct me if I'm wrong, but:

OLS places no bounds on the predicted value of your dependent variable (POS); it can be any number. So, if you use OLS, the downside is the possibility of an estimated POS value outside the range of -1,27 to 1,87, which according to your information should not be possible.
The advantage of using binary logistic regression after separating your observed POS values into an upper and a lower part is that you calculate the probability of someone with (or without) training belonging to the positive part (upper 50%) or to the more negative part (lower 50%). Thus, you no longer have a problem with values that could fall outside your range.
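Whether out-of-range predictions actually occur can be checked directly. The sketch below uses made-up data and assumes a model with only an intercept and the training dummy. In that special case the OLS fitted values are just the two group means, so they cannot leave the observed range; the concern becomes relevant once continuous covariates are added to the model.

```python
import numpy as np

# Made-up example data: training dummy and POS scores.
train = np.array([1, 1, 1, 1, 0, 0, 0, 0], dtype=float)
pos = np.array([0.9, 1.2, 0.4, -0.5, -1.27, -1.27, 0.1, 0.6])

# Design matrix with an intercept column and the training dummy.
X = np.column_stack([np.ones_like(train), train])

# Ordinary least squares fit.
coef, *_ = np.linalg.lstsq(X, pos, rcond=None)
fitted = X @ coef

mean_untrained = pos[train == 0].mean()
mean_trained = pos[train == 1].mean()

# Count fitted values outside the observed range of POS.
outside = int(np.sum((fitted < pos.min()) | (fitted > pos.max())))

print("intercept:", coef[0], "(mean POS, untrained:", mean_untrained, ")")
print("intercept + slope:", coef[0] + coef[1], "(mean POS, trained:", mean_trained, ")")
print("fitted values outside observed range:", outside)
```

The same count of out-of-range fitted values can be run on the real model with all covariates included to see whether the problem arises in practice.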
