How do I handle a dummy variable with more than two values when doing regression? I know that if there are 2 values, e.g. yes and no, then yes is 1 and no is 0. But how do I handle more than 2 values, e.g. which brand of laptop someone uses: Acer, Toshiba, Apple, Dell, HP, Others? Do I put Acer as 1, Toshiba as 2, Apple as 3, Dell as 4, HP as 5 and Others as 6?
You create n-1 dummy variables, where n is the number of levels of the categorical variable. So for your example, you'll have 5 dummy variables. Coding the brands 1 to 6 in a single variable would treat them as ordered and equally spaced, which makes no sense for brands. Depending on the interpretation you want, you can use different coding schemes. Here is a very good discussion of them: http://www.ats.ucla.edu/stat/sas/web...r5/sasreg5.htm
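To make the n-1 idea concrete, here is a minimal sketch of reference coding for the laptop example. The helper name `reference_code` and the choice of "Others" as the reference level are assumptions made for illustration only; any level could serve as the reference.

```python
brands = ["Acer", "Toshiba", "Apple", "Dell", "HP", "Others"]

def reference_code(value, levels, reference):
    """Return n-1 indicators (0/1), one per non-reference level."""
    if value not in levels:
        raise ValueError(f"unknown level: {value!r}")
    return [int(value == level) for level in levels if level != reference]

# "Others" is the reference level here (an illustrative assumption):
# it is the one brand coded as all zeros.
codes = {b: reference_code(b, brands, reference="Others") for b in brands}

for brand, row in codes.items():
    print(f"{brand:8s} {row}")
```

Each brand gets its own pattern of five 0/1 values, and the reference brand is the only one that is all zeros.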
Another way is to keep a single variable and code each category by its proportion, or by a probit transform of that proportion.
Last edited by jrai; 01-31-2012 at 01:00 PM.
Wait, I don't understand. Why should I have (n-1) variables instead of (n/2)?
*(n/2) rounded up
Ok, here is an exercise for you. Explain how you would denote the 6 categories of your example with 3 dummies.
variable a: 0 if Acer, 1 if Toshiba.
variable b: 0 if Apple, 1 if Dell
variable c: 0 if HP, 1 if Others
so x = A a + B b + C c
capitals: constants to be determined by regression.
Ok. Now if you want to denote that a computer is an Apple what will your three variables look like? (0, 0, 0). If you want to denote that a computer is an HP what will your three variables look like? (0, 0, 0).
Do you see the problem?
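The problem can be checked mechanically. The sketch below codes each brand with the three proposed variables, assuming (as the reply above does) that a variable is 0 for any brand it does not flag as 1. The function name `three_dummy_code` is hypothetical.

```python
from collections import defaultdict

brands = ["Acer", "Toshiba", "Apple", "Dell", "HP", "Others"]

def three_dummy_code(brand):
    """Code a brand with the proposed variables a, b, c.

    Assumption: each variable is 0 unless the brand is the one it marks as 1.
    """
    a = int(brand == "Toshiba")
    b = int(brand == "Dell")
    c = int(brand == "Others")
    return (a, b, c)

# Group brands by the code they receive.
collisions = defaultdict(list)
for brand in brands:
    collisions[three_dummy_code(brand)].append(brand)

for code, group in collisions.items():
    print(code, group)
```

Acer, Apple and HP all end up with the same code, so the regression cannot tell them apart.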
Haha. Don't worry. Dummy variables definitely take some getting used to. And note that reference coding isn't the only way to create the dummy variables.
Just to add one more point. You've used the equation: x = A a + B b + C c
Your equation doesn't contain an intercept. When the intercept is missing, you need n dummy variables, not n-1. The intercept acts as the reference category, i.e. it represents the excluded category, so when you omit the intercept you must include all the categories as dummies.
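One way to see the intercept point is to compare the rank of the design matrix with its number of columns. This is only a sketch: the `rank` helper is a small hand-written Gaussian elimination (not a library routine), and each design matrix has just one row per brand.

```python
from fractions import Fraction

def rank(matrix):
    """Rank of a matrix via Gaussian elimination with exact fractions."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    found = 0  # number of pivots found so far
    for col in range(n_cols):
        pivot = next((r for r in range(found, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for r in range(len(rows)):
            if r != found and rows[r][col] != 0:
                factor = rows[r][col] / rows[found][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[found])]
        found += 1
    return found

n = 6  # six laptop brands
all_dummies = [[int(i == j) for j in range(n)] for i in range(n)]

with_intercept_all = [[1] + row for row in all_dummies]       # intercept + 6 dummies
with_intercept_ref = [[1] + row[:-1] for row in all_dummies]  # intercept + 5 dummies
no_intercept_all = all_dummies                                # 6 dummies, no intercept

for name, m in [("intercept + n dummies", with_intercept_all),
                ("intercept + n-1 dummies", with_intercept_ref),
                ("no intercept, n dummies", no_intercept_all)]:
    print(f"{name}: columns={len(m[0])}, rank={rank(m)}")
```

With an intercept and all n dummies there are 7 columns but only rank 6, so the coefficients are not uniquely determined. Dropping either one dummy or the intercept leaves 6 columns of rank 6.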
Good catch. I wasn't paying much attention to that.