Thread: More than two values for a dummy variable (regression)?

  1. #1
    Points: 851, Level: 15
    Level completed: 51%, Points required for next Level: 49

    Posts
    61
    Thanks
    27
    Thanked 1 Time in 1 Post

    More than two values for a dummy variable (regression)?



    How do I handle a dummy variable with more than two values when doing regression? I know that if a dummy variable has 2 values, e.g. yes and no, then yes is 1 and no is 0. But how do I deal with more than 2 values, e.g. what brand of laptop someone uses: Acer, Toshiba, Apple, Dell, HP, Others? Do I put Acer as 1, Toshiba as 2, Apple as 3, Dell as 4, HP as 5 and Others as 6?

  2. #2
    Points: 851, Level: 15
    Level completed: 51%, Points required for next Level: 49

    Posts
    61
    Thanks
    27
    Thanked 1 Time in 1 Post

    Re: More than two values for a dummy variable (regression)?

    please help?

  3. #3
    Points: 2,626, Level: 31
    Level completed: 18%, Points required for next Level: 124

    Location
    Dallas, TX
    Posts
    311
    Thanks
    12
    Thanked 94 Times in 93 Posts

    Re: More than two values for a dummy variable (regression)?

    You create n-1 dummy variables where n is the number of levels of the categorical variable. So for your example, you'll have 5 dummy variables. Depending on the interpretation you can use different coding schemes. Here is a very good discussion on them: http://www.ats.ucla.edu/stat/sas/web...r5/sasreg5.htm
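
To make the n-1 rule concrete, here is a minimal sketch of reference coding for the six laptop brands, in plain Python. The function name `reference_code` and the choice of "Others" as the reference level are assumptions made for illustration; any level could serve as the reference.

```python
# Six levels of the categorical variable from the original question.
BRANDS = ["Acer", "Toshiba", "Apple", "Dell", "HP", "Others"]


def reference_code(brand, levels=BRANDS, reference="Others"):
    """Return n-1 dummies (0/1), one per non-reference level.

    The reference level is coded as all zeros; every other level
    gets a single 1 in its own column.
    """
    if brand not in levels:
        raise ValueError(f"unknown level: {brand!r}")
    return [1 if brand == level else 0 for level in levels if level != reference]


# One row of five dummies per brand.
codes = {brand: reference_code(brand) for brand in BRANDS}
for brand, row in codes.items():
    print(f"{brand:8s} {row}")
```

Each brand gets its own distinct pattern of five 0/1 values, which is what lets the regression estimate a separate effect for each level relative to the reference.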

    Another way is to keep a single variable and replace each category with a proportion computed for that category, or with the probit (inverse normal CDF) transform of that proportion.
    Last edited by jrai; 01-31-2012 at 01:00 PM.

  4. The Following User Says Thank You to jrai For This Useful Post:

    david_q (01-30-2012)

  5. #4
    Points: 851, Level: 15
    Level completed: 51%, Points required for next Level: 49

    Posts
    61
    Thanks
    27
    Thanked 1 Time in 1 Post

    Re: More than two values for a dummy variable (regression)?

    Wait, I don't understand. Why should I have (n-1) variables instead of (n/2)?

  6. #5
    Points: 851, Level: 15
    Level completed: 51%, Points required for next Level: 49

    Posts
    61
    Thanks
    27
    Thanked 1 Time in 1 Post

    Re: More than two values for a dummy variable (regression)?

    *(n/2) rounded up

  7. #6
    Points: 2,626, Level: 31
    Level completed: 18%, Points required for next Level: 124

    Location
    Dallas, TX
    Posts
    311
    Thanks
    12
    Thanked 94 Times in 93 Posts

    Re: More than two values for a dummy variable (regression)?

    Ok, here is an exercise for you. Explain how you will denote the 6 categories of your example with 3 dummies.

  8. #7
    Points: 851, Level: 15
    Level completed: 51%, Points required for next Level: 49

    Posts
    61
    Thanks
    27
    Thanked 1 Time in 1 Post

    Re: More than two values for a dummy variable (regression)?

    Sure.

    variable a: 0 if Acer, 1 if Toshiba.
    variable b: 0 if Apple, 1 if Dell
    variable c: 0 if HP, 1 if Others

    so x = A a + B b + C c

    capitals: constants to be determined by regression.

  9. #8
    RotParaTon
    Points: 46,287, Level: 100
    Level completed: 0%, Points required for next Level: 0
    Awards: Discussion Ender, Posting Award, Frequent Poster, Community Award, Master Tagger
    Dason
    Location
    Ames, IA
    Posts
    9,083
    Thanks
    211
    Thanked 1,609 Times in 1,379 Posts

    Re: More than two values for a dummy variable (regression)?

    Ok. Now if you want to denote that a computer is an Apple what will your three variables look like? (0, 0, 0). If you want to denote that a computer is an HP what will your three variables look like? (0, 0, 0).

    Do you see the problem?
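
The collision can be checked mechanically. The sketch below encodes the proposed three-variable scheme, assuming (as the reply above does) that each variable is 0 whenever the brand is not the one that sets it to 1. The function name `three_dummy_code` is made up for illustration.

```python
from collections import defaultdict

BRANDS = ["Acer", "Toshiba", "Apple", "Dell", "HP", "Others"]


def three_dummy_code(brand):
    """Proposed scheme: a=1 for Toshiba, b=1 for Dell, c=1 for Others.

    Assumption: a variable is 0 for every brand other than its "1" brand.
    """
    a = 1 if brand == "Toshiba" else 0
    b = 1 if brand == "Dell" else 0
    c = 1 if brand == "Others" else 0
    return (a, b, c)


# Group brands by the code they receive to expose the collisions.
by_code = defaultdict(list)
for brand in BRANDS:
    by_code[three_dummy_code(brand)].append(brand)

for code, brands in by_code.items():
    print(code, brands)
```

Acer, Apple and HP all land on (0, 0, 0), so the regression has no way to tell them apart.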

  10. The Following User Says Thank You to Dason For This Useful Post:

    david_q (01-31-2012)

  11. #9
    Points: 851, Level: 15
    Level completed: 51%, Points required for next Level: 49

    Posts
    61
    Thanks
    27
    Thanked 1 Time in 1 Post

    Re: More than two values for a dummy variable (regression)?

    *sheepish*

  12. #10
    RotParaTon
    Points: 46,287, Level: 100
    Level completed: 0%, Points required for next Level: 0
    Awards: Discussion Ender, Posting Award, Frequent Poster, Community Award, Master Tagger
    Dason
    Location
    Ames, IA
    Posts
    9,083
    Thanks
    211
    Thanked 1,609 Times in 1,379 Posts

    Re: More than two values for a dummy variable (regression)?

    Haha. Don't worry. Dummy variables definitely take some getting used to. And note that reference coding isn't the only way to create the dummy variables.
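
As one example of an alternative, here is a rough sketch of effect (sum-to-zero) coding, which is not spelled out in this thread and is included only as an illustration. It still uses n-1 columns, but the reference level is coded as -1 in every column instead of 0. The function name `effect_code` and the choice of "Others" as the reference are assumptions.

```python
BRANDS = ["Acer", "Toshiba", "Apple", "Dell", "HP", "Others"]


def effect_code(brand, levels=BRANDS, reference="Others"):
    """Return n-1 effect-coded columns for one brand.

    Non-reference levels get a single 1 in their own column;
    the reference level gets -1 in every column.
    """
    if brand not in levels:
        raise ValueError(f"unknown level: {brand!r}")
    columns = [level for level in levels if level != reference]
    if brand == reference:
        return [-1] * len(columns)
    return [1 if brand == level else 0 for level in columns]


rows = [effect_code(brand) for brand in BRANDS]
# With one row per level, every column sums to zero.
column_sums = [sum(col) for col in zip(*rows)]
print(column_sums)
```

With this coding the coefficients are read as deviations from the average of the level means rather than as differences from one reference level.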

  13. #11
    Points: 2,626, Level: 31
    Level completed: 18%, Points required for next Level: 124

    Location
    Dallas, TX
    Posts
    311
    Thanks
    12
    Thanked 94 Times in 93 Posts

    Re: More than two values for a dummy variable (regression)?

    Just to add one more point. You've used the equation: x = A a + B b + C c

    Your equation doesn't contain an intercept. When the intercept is missing, you need n dummy variables, not n-1. The intercept acts as the reference category and represents the excluded category, so when you omit the intercept you must include all the categories as dummies.
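
One way to see the intercept point is to compare the rank of the design matrix with its number of columns. The sketch below uses a small hand-written rank routine (exact arithmetic via `fractions`) and a made-up layout of two observations per brand; the helper names `matrix_rank` and `design` are assumptions for illustration only.

```python
from fractions import Fraction

BRANDS = ["Acer", "Toshiba", "Apple", "Dell", "HP", "Others"]


def matrix_rank(matrix):
    """Rank via forward Gaussian elimination with exact fractions."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    pivots = 0
    for col in range(n_cols):
        # Find a row at or below the current pivot position with a nonzero entry.
        pivot = next((r for r in range(pivots, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[pivots], rows[pivot] = rows[pivot], rows[pivots]
        for r in range(pivots + 1, len(rows)):
            factor = rows[r][col] / rows[pivots][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[pivots])]
        pivots += 1
    return pivots


def design(dummy_levels, intercept):
    """Design matrix with two hypothetical observations per brand."""
    matrix = []
    for brand in BRANDS * 2:
        row = [1] if intercept else []
        row += [1 if brand == level else 0 for level in dummy_levels]
        matrix.append(row)
    return matrix


no_intercept_n = design(BRANDS, intercept=False)       # n dummies, no intercept
intercept_n = design(BRANDS, intercept=True)           # intercept + n dummies
intercept_n_minus_1 = design(BRANDS[:-1], intercept=True)  # intercept + n-1 dummies

for name, m in [("n dummies, no intercept", no_intercept_n),
                ("intercept + n dummies", intercept_n),
                ("intercept + n-1 dummies", intercept_n_minus_1)]:
    print(name, "columns:", len(m[0]), "rank:", matrix_rank(m))
```

When the rank equals the number of columns, the coefficients are uniquely determined. With an intercept plus all n dummies the dummy columns add up to the intercept column, so the rank falls one short and one column has to go.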

  14. #12
    RotParaTon
    Points: 46,287, Level: 100
    Level completed: 0%, Points required for next Level: 0
    Awards: Discussion Ender, Posting Award, Frequent Poster, Community Award, Master Tagger
    Dason
    Location
    Ames, IA
    Posts
    9,083
    Thanks
    211
    Thanked 1,609 Times in 1,379 Posts

    Re: More than two values for a dummy variable (regression)?


    Good catch. I wasn't paying much attention to that.
