# Probit/Logit independent variables

#### Athinagoras

##### New Member
Hello,

My DV is a binary variable (Yes/No) and I am using a logit/probit regression analysis.

As for my independent variables, there are three I'm primarily interested in. They are all ordinal categorical variables, each of which takes the form:

Strongly Agree
Agree
Indifferent
Disagree
Strongly Disagree

What's the best way to include these variables in the model?

I could assign arbitrary ordered values to each answer (e.g. -10 for "Strongly Disagree", -5 for "Disagree", 0 for "Indifferent", etc.).

I could create dummies for each category and put them into the regression.

Or, finally, I could create one dummy for each variable that splits the categories into two groups and put those into the regression, e.g. dummy=1 if ("Str. Agree" or "Agree") and dummy=0 if ("Indif." or "Disagree" or "Str. Disagree").

Which of the three would you suggest, or do you have any other alternatives?
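To make the three options concrete, here is a rough sketch of how each coding would look for a single answer. The function names, the choice of "Indifferent" as the reference level, and the values 5 and 10 for the agreeing answers are just assumptions for illustration, not something fixed by the data.

```python
# Response levels in order; the numeric scores are an arbitrary, equally spaced choice.
LEVELS = ["Strongly Disagree", "Disagree", "Indifferent", "Agree", "Strongly Agree"]
SCORES = dict(zip(LEVELS, [-10, -5, 0, 5, 10]))


def as_score(answer):
    """Option 1: a single numeric regressor with assumed ordered values."""
    return SCORES[answer]


def as_dummies(answer, reference="Indifferent"):
    """Option 2: one 0/1 indicator per level, leaving out a reference level."""
    return {level: int(answer == level) for level in LEVELS if level != reference}


def as_binary(answer):
    """Option 3: a single dummy, 1 for any agreement and 0 otherwise."""
    return int(answer in ("Agree", "Strongly Agree"))


answers = ["Agree", "Strongly Disagree", "Indifferent"]
rows = [(as_score(a), as_dummies(a), as_binary(a)) for a in answers]
```

Option 1 adds one regressor per variable, option 2 adds four (five levels minus the reference), and option 3 adds one but discards the distinction between levels on the same side of the split.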

I appreciate your time.

#### threestars

##### New Member
So there is no right answer here.

I would say think about the process you are trying to model. You could assign values to your scale from 0 to 5 or from -20 to 20. Either could be appropriate, but you have to think about what you are modeling. For instance, I have seen studies on wars that categorize wars along some casualty scale:

No casualties
0-50 Casualties
50-100 Casualties
100-1000 Casualties
1000+

In this case it would make sense to have the scale be something like 0, 25, 75, 500, 2000, or any variation of that. But if the scale were something else, such as degree of support for a presidential candidate, it could be 0, 1, 2, 3, 4, 5. Basically, you need to think about the concept you are measuring and ask yourself: what should the gaps be? This decision should not be arbitrary.
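As a small sketch of that idea, the band labels and representative values below are the ones from the casualty example; the value for the open-ended top band is necessarily a judgment call, and the dictionary names are just placeholders.

```python
# Representative value per casualty band (roughly the band midpoint).
# The score for "1000+" has no midpoint, so 2000 is an assumed stand-in.
CASUALTY_SCORE = {
    "No casualties": 0,
    "0-50 Casualties": 25,
    "50-100 Casualties": 75,
    "100-1000 Casualties": 500,
    "1000+": 2000,
}

# Gaps between adjacent levels: far from equal, unlike a 0, 1, 2, 3, 4 coding.
values = list(CASUALTY_SCORE.values())
gaps = [later - earlier for earlier, later in zip(values, values[1:])]
```

Here `gaps` comes out very uneven, which is the point: coding these bands as 0 through 4 would treat a step from the first band to the second as equal to a step from the fourth to the fifth.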

You've also raised the possibility of dummying out your scale (i.e. creating a dummy variable for each level of the independent variable, leaving one level out as the reference). This is a good idea if you think that the effect of the IV is not constant (i.e. linear) across all of your levels. If you were to dummy out a scale that goes from 0 to 2, you would be allowing the effect of going from 0 to 1 to differ from the effect of going from 1 to 2. This is actually a great way to do your analysis because relationships typically are not linear. But it comes with the caveat that the coefficients are a bit more difficult to interpret and present to a reader, client, etc.
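To spell out the difference for the 0 to 2 scale in a logit model, here is a sketch in my own notation (the symbols $s$, $D_1$, $D_2$, $\beta$ and $\gamma$ are not from any particular textbook). Treating the scale as a single numeric variable $s \in \{0, 1, 2\}$ gives

$$
\log\frac{P(y=1)}{1-P(y=1)} = \beta_0 + \beta_1 s ,
$$

so each one-step move changes the log-odds by the same amount $\beta_1$. With dummies $D_1$ (level 1) and $D_2$ (level 2), and level 0 as the reference,

$$
\log\frac{P(y=1)}{1-P(y=1)} = \beta_0 + \gamma_1 D_1 + \gamma_2 D_2 ,
$$

so going from 0 to 1 changes the log-odds by $\gamma_1$, while going from 1 to 2 changes it by $\gamma_2 - \gamma_1$. The numeric coding is the special case $\gamma_2 = 2\gamma_1$, which is why the dummy version is more flexible but costs an extra parameter per additional level.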

Hope this helps.

#### Athinagoras

##### New Member
Thanks, this makes perfect sense.

We eventually included a dummy=1 if ("Str. Agree", "Agree") and 0 otherwise. This raises the risk of bias, since the dummy=0 category contains not only the "Indifferent" people but also the "Don't Know" and "Did not answer" responses.

How do you suggest treating this, given that deleting these observations loses a lot of information and makes our small sample even smaller?