Multiple Regression and Likert Scale Data

With multiple regression, is it necessary to recode independent variables that are measured using Likert Scale responses into dummy variables (with values of 1 or 0)?

Background: I am testing hypotheses concerning consumer purchasing patterns. A survey was used to collect the necessary data for the various independent variables. The questions, and consequently the data collected, are structured as five-point Likert scale responses (e.g. (1) no importance to (5) utmost importance; (1) strongly agree to (5) strongly disagree).

Sample Data:

Dependent Variable: Have you purchased products from ACME retailer?
Possible responses: (1) yes, (0) no

There are several independent variables; this is a sample of just one: PRICE.
Survey question: How important was price in your purchase consideration? Possible responses:
(1) no importance
(2) little importance
(3) moderate importance
(4) great importance
(5) utmost importance

If it is necessary to recode using dummy variables, it is my understanding that 4 dummy variables are needed, with one reference category.

Dummy 1: no importance=1, otherwise=0
Dummy 2: little importance=1, otherwise=0
Dummy 3: moderate importance=1, otherwise=0
Dummy 4: great importance=1, otherwise=0
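For illustration only, the coding scheme above can be sketched outside SPSS in a few lines of Python. This assumes "utmost importance" (5) is the reference category, as the four dummies listed imply; the function name `dummy_code` and the sample responses are made up for the example.

```python
# Labels of the 5-point PRICE item, keyed by the survey code.
LEVELS = {
    1: "no importance",
    2: "little importance",
    3: "moderate importance",
    4: "great importance",
    5: "utmost importance",
}
REFERENCE = 5  # assumption: "utmost importance" is the reference category


def dummy_code(response, reference=REFERENCE):
    """Return the four 0/1 dummies for one response, skipping the reference level."""
    if response not in LEVELS:
        raise ValueError(f"unexpected response code: {response}")
    return [1 if response == level else 0 for level in LEVELS if level != reference]


# Made-up responses, for illustration only.
responses = [1, 4, 5, 3]
for r in responses:
    print(r, LEVELS[r], dummy_code(r))
```

A respondent in the reference category gets all four dummies equal to 0, which is why only four columns are needed for five response options.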

With SPSS (version 16), can this be done by clicking “Transform”, “Recode into Different Variables” and then creating the above dummy variables?

I am pretty confident in running the regression and then interpreting the results, but I am having difficulty in actually coding the data for multiple regression. Any assistance would be greatly appreciated.


TS Contributor
In theory you are right: if your independent variable is categorical, you need to use dummy variables. According to your description, your dependent variable is dichotomous, so you need a logistic regression model rather than ordinary multiple regression. If you use SPSS, the "Binary Logistic" dialog has a "Categorical" button. After you enter your variables, you can click it and define which variables are categorical. That saves you the trouble of building the dummy variables yourself (although, to be honest, when there are not too many of them I usually prefer doing it manually!).
The problem arises when you have many categorical independent variables, each with many categories. If you build dummies (or let SPSS do it for you), you will end up with 4 variables for each IV. That means that if you have p IVs, you will end up with 4p new IVs. If 4p > n, where n is the number of respondents, I think you have a problem, because there are more predictors than observations. And even if not, the more variables you have, the harder it will be to get meaningful results. I suggest that you try to collapse your scale from five options to three.
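To make the arithmetic concrete, here is a small Python sketch. The number of predictors (`p = 10`) is hypothetical, and the particular way of collapsing five options into three (1-2, 3, 4-5) is just one plausible assumption, not the only choice.

```python
def dummy_count(n_predictors, n_categories):
    """Number of dummy columns: each predictor needs (categories - 1) dummies."""
    return n_predictors * (n_categories - 1)


# Assumed collapsing of the 5-point scale into 3 points:
# 1-2 -> 1 (low), 3 -> 2 (moderate), 4-5 -> 3 (high).
COLLAPSE = {1: 1, 2: 1, 3: 2, 4: 3, 5: 3}


def collapse(response):
    """Map a 5-point response code to the assumed 3-point code."""
    return COLLAPSE[response]


p = 10  # hypothetical number of Likert-type predictors
print(dummy_count(p, 5))  # 40 dummies with the 5-point scale
print(dummy_count(p, 3))  # 20 dummies after collapsing to 3 points
print([collapse(r) for r in [1, 2, 3, 4, 5]])  # [1, 1, 2, 3, 3]
```

Collapsing halves the number of dummy columns here, at the cost of no longer distinguishing, for example, "great" from "utmost" importance.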
I'll be happy to hear other opinions; I think this is an important issue.

Hi. With regard to the number of independent variables, a rule of thumb, if I am not mistaken, is to have at least 20 cases per independent variable. This is the effective sample size per independent variable, meaning that there should be at least 20 cases in the smallest category of your variable. If you cannot reach this size, it is indeed wise to recode your variable, as WeeG suggested, so that you meet this requirement.
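A quick way to check this rule of thumb is to count the cases in each category and flag the ones that fall short. The Python sketch below uses made-up responses, and the helper name `sparse_categories` is only illustrative.

```python
from collections import Counter

MIN_CASES = 20  # the rule of thumb quoted above, not a hard requirement


def sparse_categories(responses, levels=(1, 2, 3, 4, 5), min_cases=MIN_CASES):
    """Return {level: count} for every level with fewer than min_cases cases."""
    counts = Counter(responses)
    return {level: counts[level] for level in levels if counts[level] < min_cases}


# Made-up PRICE responses: few people chose the two lowest options.
responses = [1] * 5 + [2] * 12 + [3] * 40 + [4] * 55 + [5] * 30
print(sparse_categories(responses))  # {1: 5, 2: 12}
```

In this invented example, categories 1 and 2 are too thin, which would be a reason to merge them with a neighbouring category.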

With regard to the dummies, I do think it is important to let SPSS create them. In the first place, this allows you to specify a reference category. For example, if you take the first category as the reference, the regression coefficients will indicate the change in each of the other categories relative to this reference category. You can freely choose this category.

Second, and more importantly, when you work with categorical variables, the entire group of dummies should be significant; otherwise you should not proceed to interpret the results, if I am not mistaken. SPSS reports this overall significance level on the row for the variable as a whole. It is harder to verify if you recode the dummies manually, because the output then shows only a separate test for each dummy.
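As a rough illustration of testing the whole group of dummies at once, the Python sketch below runs a likelihood-ratio test for a single categorical predictor. Note that this is a likelihood-ratio test, not the Wald test SPSS prints, and the two can differ. It relies on the fact that, with one categorical predictor, the fitted probabilities of the logistic model are simply the observed purchase proportions per category. The counts in `table` are invented, and the closed-form p-value only works for an even number of degrees of freedom (here 4).

```python
import math


def log_lik(successes, total):
    """Binomial log-likelihood at the observed proportion (0 * log 0 counts as 0)."""
    ll = 0.0
    for count in (successes, total - successes):
        if count > 0:
            ll += count * math.log(count / total)
    return ll


def lr_test(table):
    """table maps category -> (purchasers, respondents).

    Returns (statistic, df, p_value) for the test that all dummies are jointly zero.
    """
    full = sum(log_lik(y, n) for y, n in table.values())  # one proportion per category
    total_y = sum(y for y, _ in table.values())
    total_n = sum(n for _, n in table.values())
    null = log_lik(total_y, total_n)  # intercept-only model
    stat = max(0.0, 2.0 * (full - null))
    df = len(table) - 1
    if df % 2 != 0:
        raise ValueError("closed-form p-value below assumes an even df")
    # Chi-square survival function for even df: exp(-x/2) * sum((x/2)^i / i!).
    half = stat / 2.0
    p_value = math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))
    return stat, df, p_value


# Invented counts: (purchasers, respondents) for PRICE categories 1..5.
table = {1: (4, 20), 2: (9, 30), 3: (20, 40), 4: (33, 50), 5: (24, 30)}
stat, df, p_value = lr_test(table)
print(f"LR statistic = {stat:.2f}, df = {df}, p = {p_value:.4f}")
```

With several predictors in the model, the same idea applies, but the two log-likelihoods would have to come from fitted models with and without the block of dummies.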

I hope this was of some help.

Best regards and success,