# Thread: More than two values for a dummy variable (regression)?

1. ## More than two values for a dummy variable (regression)?

How do I handle a dummy variable with more than two values when doing regression? I know that if there are 2 values, e.g. yes and no, then yes is 1 and no is 0. But how do I deal with more than 2 values, e.g. what brand of laptop someone uses: Acer, Toshiba, Apple, Dell, HP, Others? Do I put Acer as 1, Toshiba as 2, Apple as 3, Dell as 4, HP as 5 and Others as 6?

3. ## Re: More than two values for a dummy variable (regression)?

You create n-1 dummy variables where n is the number of levels of the categorical variable. So for your example, you'll have 5 dummy variables. Depending on the interpretation you can use different coding schemes. Here is a very good discussion on them: http://www.ats.ucla.edu/stat/sas/web...r5/sasreg5.htm
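To make the n-1 idea concrete, here is a minimal sketch of reference coding for the laptop example, in plain Python. The function name `reference_code` and the choice of "Others" as the reference level are assumptions made for illustration; any level could serve as the reference.

```python
# The six levels from the question.
LEVELS = ["Acer", "Toshiba", "Apple", "Dell", "HP", "Others"]

def reference_code(brand, levels=LEVELS, reference="Others"):
    """Return n-1 indicator values for `brand`.

    The reference level (an arbitrary choice here) maps to all zeros;
    every other level gets a single 1 in its own position.
    """
    if brand not in levels:
        raise ValueError(f"unknown brand: {brand}")
    return [1 if brand == level else 0 for level in levels if level != reference]

# Each of the 6 brands gets its own distinct pattern of 5 indicators.
for brand in LEVELS:
    print(f"{brand:8s} {reference_code(brand)}")
```

With this coding each regression coefficient compares one brand against the reference brand, which is why coding the brands as 1 to 6 in a single variable is not equivalent.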

Another way is to keep a single variable and code each category by its proportion, or by the probit (inverse normal CDF) of that proportion.

5. ## Re: More than two values for a dummy variable (regression)?

Wait, I don't understand. Why should I have (n-1) variables instead of (n/2)?

6. ## Re: More than two values for a dummy variable (regression)?

*(n/2) rounded up

7. ## Re: More than two values for a dummy variable (regression)?

OK, here is an exercise for you: explain how you would denote the 6 categories of your example with 3 dummies.

8. ## Re: More than two values for a dummy variable (regression)?

Sure.

variable a: 0 if Acer, 1 if Toshiba.
variable b: 0 if Apple, 1 if Dell
variable c: 0 if HP, 1 if Others

so x = A a + B b + C c

capitals: constants to be determined by regression.

9. ## Re: More than two values for a dummy variable (regression)?

Ok. Now if you want to denote that a computer is an Apple what will your three variables look like? (0, 0, 0). If you want to denote that a computer is an HP what will your three variables look like? (0, 0, 0).

Do you see the problem?
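The collision can be checked mechanically. This is a small sketch of the three-variable scheme from the previous post, assuming (as in the reply above) that a variable is 0 whenever the brand is not the one it flags with a 1; the function name `three_dummy_code` is made up for illustration.

```python
LEVELS = ["Acer", "Toshiba", "Apple", "Dell", "HP", "Others"]

def three_dummy_code(brand):
    """Code a brand with the proposed variables (a, b, c).

    Assumption: each variable defaults to 0 unless the brand is the
    one that sets it to 1 (Toshiba, Dell, Others respectively).
    """
    a = 1 if brand == "Toshiba" else 0
    b = 1 if brand == "Dell" else 0
    c = 1 if brand == "Others" else 0
    return (a, b, c)

# Group brands by the code they receive.
codes = {}
for brand in LEVELS:
    codes.setdefault(three_dummy_code(brand), []).append(brand)

# Keep only the codes shared by more than one brand.
collisions = {code: brands for code, brands in codes.items() if len(brands) > 1}
print(collisions)
```

Acer, Apple and HP all end up as (0, 0, 0), so the regression has no way to tell them apart.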

11. ## Re: More than two values for a dummy variable (regression)?

*sheepish*

12. ## Re: More than two values for a dummy variable (regression)?

Haha. Don't worry. Dummy variables definitely take some getting used to. And note that reference coding isn't the only way to create the dummy variables.

13. ## Re: More than two values for a dummy variable (regression)?

Just to add one more point. You've used the equation: x = A a + B b + C c

Your equation doesn't contain an intercept. When the intercept is missing, you need n dummy variables, not n-1. The intercept acts as the reference category and represents the excluded category, but when you omit the intercept you must include all the categories as dummies.
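A quick way to see why you can't have both an intercept and all n dummies: in every row the n dummies sum to exactly 1, which is the intercept column, so the columns are perfectly collinear. The sketch below checks this on a made-up sample; the `sample` list and the helper `full_dummies` are hypothetical and only serve the illustration.

```python
LEVELS = ["Acer", "Toshiba", "Apple", "Dell", "HP", "Others"]

# Hypothetical observations, only for illustration.
sample = ["Acer", "Dell", "HP", "Apple", "Others", "Toshiba", "Dell"]

def full_dummies(brand):
    """Return all n indicators for `brand` (no level dropped)."""
    return [1 if brand == level else 0 for level in LEVELS]

# Design rows with an intercept column (the leading 1) plus all n dummies.
rows_with_intercept = [[1] + full_dummies(brand) for brand in sample]

# In every row the intercept equals the sum of the dummy columns, so one
# column is an exact linear combination of the others.
collinear = all(row[0] == sum(row[1:]) for row in rows_with_intercept)
print(collinear)
```

Dropping either the intercept or one dummy removes the redundancy, which matches the n versus n-1 rule above.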

14. ## Re: More than two values for a dummy variable (regression)?

Good catch. I wasn't paying much attention to that.
