Regression Analysis w/Dummy Variable

I am trying to run a regression on a dataset. The dataset has the regions listed as 1, 2, 3, and 4. To run a regression, do I need to recode these regions to 0, 1, 2 and then run the regression? If so, where in the regression output will it tell me the difference between the regions? Should I be running a different analysis instead of regression? My main point is: I am trying to determine whether the region someone lives in affects how many times they went to an orthopedic doctor...

If you want to use a regression and this is a nominal variable that takes on one of four values, you can recode your data so that there are now three 0/1 dummy variables. For example, for the region variable, suppose your data looks like this: 1 1 2 2 3 3 4 4. You would create three variables of length 8; the first would be 1 1 0 0 0 0 0 0; the second would be similar but correspond to the 2's in the region variable (0 0 1 1 0 0 0 0); and the third would correspond to the 3's.

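Then, each of these three new variables is an indicator for its corresponding region. The intercept absorbs the remaining category (region 4 in this example), which becomes the reference group, and each dummy's coefficient is the estimated difference between its region and that reference group.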
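To make the recoding concrete, here is a minimal sketch in plain Python. The region values are the ones from the example above; the `visits` numbers are invented purely for illustration, and the small `ols` helper is a hand-rolled least-squares solver used only to keep the example self-contained (in practice you would let your statistics package do the fitting).

```python
def ols(X, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        A[c] = [v / A[c][c] for v in A[c]]
        for r in range(k):
            if r != c:
                A[r] = [vr - A[r][c] * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]

region = [1, 1, 2, 2, 3, 3, 4, 4]   # region codes from the example
visits = [2, 4, 5, 7, 1, 3, 6, 8]   # hypothetical outcome, made up for this sketch

# Three 0/1 dummy variables; region 4 gets no dummy and is the reference group
d1 = [1.0 if r == 1 else 0.0 for r in region]
d2 = [1.0 if r == 2 else 0.0 for r in region]
d3 = [1.0 if r == 3 else 0.0 for r in region]

# Design matrix: a column of ones for the intercept, then the dummies
X = [[1.0, a, b, c] for a, b, c in zip(d1, d2, d3)]
intercept, b1, b2, b3 = ols(X, visits)

print("intercept (mean of region 4):", round(intercept, 6))
print("region 1 vs region 4:", round(b1, 6))
print("region 2 vs region 4:", round(b2, 6))
print("region 3 vs region 4:", round(b3, 6))
```

With these made-up numbers the group means are 3, 6, 2 and 7, so, up to floating-point rounding, the intercept comes out as 7 (the region 4 mean) and the three coefficients as -4, -1 and -5 (each region's mean minus the region 4 mean).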


What software are you using? I should add to Janus's comment that the p-value reported for each dummy variable only tests the difference between that group and the reference group (the one without a dummy). It will not tell you the difference between, say, groups 1 and 2 (if you encode dummy variables for those).
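One way to get at a comparison between two non-reference groups is to change which group is the reference and refit. The sketch below illustrates this in plain Python, reusing the example region codes; the `visits` values, the `ols` helper and the `fit` function are all hypothetical and exist only for this illustration. It computes point estimates only; the p-value for the new comparison would come from your software's output for the refitted model.

```python
def ols(X, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        A[c] = [v / A[c][c] for v in A[c]]
        for r in range(k):
            if r != c:
                A[r] = [vr - A[r][c] * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]

region = [1, 1, 2, 2, 3, 3, 4, 4]   # region codes from the example
visits = [2, 4, 5, 7, 1, 3, 6, 8]   # hypothetical outcome, made up for this sketch

def fit(reference):
    """Fit with one dummy per region except `reference`; return named coefficients."""
    levels = [g for g in sorted(set(region)) if g != reference]
    X = [[1.0] + [1.0 if r == g else 0.0 for g in levels] for r in region]
    return dict(zip(["intercept"] + levels, ols(X, visits)))

ref4 = fit(reference=4)   # coefficients compare regions 1, 2, 3 to region 4
ref2 = fit(reference=2)   # coefficients compare regions 1, 3, 4 to region 2

# Region 1 vs region 2, obtained two ways
diff_from_ref4 = ref4[1] - ref4[2]   # difference of two coefficients
diff_from_ref2 = ref2[1]             # read directly off the refitted model

print(round(diff_from_ref4, 6), round(diff_from_ref2, 6))
```

Both routes give the same estimated difference between regions 1 and 2. The refit has the practical advantage that the comparison shows up as an ordinary coefficient, with its own standard error and p-value, in the regression output.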