# Thread: Expert Help Required - Multiple Regression with Nominal and Scale Variables

1. ## Expert Help Required - Multiple Regression with Nominal and Scale Variables

Hi all,

I'm having huge issues with the assumptions of multiple regression and determining whether I am running my analysis correctly.

First of all, I have eight different predictors and one outcome variable. The idea is to determine whether each variable is a significant predictor of the outcome variable and their contributions to the overall regression model. Four of the variables are scale so no problems there. Two variables are dichotomous. However, the remaining two categorical variables have three levels.

I believe I am correct when I say that it is not possible to run a multiple regression with categorical variables with three levels. I have looked at videos with dummy coding variables but I am unsure whether it is possible to do so. In the case that I do go with dummy coding, is it possible to do dummy coding with other variables as predictors?

Also, my predictors with three levels are very disproportionate in terms of participant numbers. For instance, one variable had 36% of participants at level 1, 52% at level 2 and 12% at level 3. Is it still possible to include them as predictors in a multiple regression analysis despite the disproportionate numbers?

2. ## Re: Expert Help Required - Multiple Regression with Nominal and Scale Variables

It is possible, and commonly done, to have a categorical predictor variable with 3 or more levels in regression. You simply use dummy variables. You have k-1 dummies where k is the number of levels of the original variable. So if you have 3 levels there will be two dummies. This works fine with any other predictor.
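To make the k-1 rule concrete, here is a minimal sketch in plain Python. The function name `dummy_code`, the level labels and the choice of "low" as the reference level are all made up for illustration; SPSS or any statistics package will build the same columns for you.

```python
def dummy_code(values, reference):
    """Return k-1 indicator (0/1) columns for a categorical variable with k levels.

    The reference level gets no column of its own: a case in the
    reference level is simply 0 on every dummy.
    """
    levels = sorted(set(values))
    if reference not in levels:
        raise ValueError(f"reference level {reference!r} not found in data")
    kept = [level for level in levels if level != reference]
    return {
        f"is_{level}": [1 if v == level else 0 for v in values]
        for level in kept
    }


# Hypothetical three-level predictor (illustrative data only).
group = ["low", "mid", "high", "mid", "low", "mid"]

dummies = dummy_code(group, reference="low")
for name, column in dummies.items():
    print(name, column)
```

With three levels this gives two columns, `is_high` and `is_mid`. Both go into the regression next to the other predictors, and each slope is read as the difference from the reference level.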

There is nothing theoretically wrong with dummy variables that have a higher percentage of responses in some levels than in others. If more than about 90 percent of cases fall in one level (the threshold might be higher; it's been a while), there is a danger that the slope of the dummy may be attenuated (too small), but you should be OK with 12 percent. Note that this does not actually violate an assumption of regression; it's a problem that has been detected in practice, tied to reduced variation.

3. ## The Following User Says Thank You to noetsi For This Useful Post:

Aegislash (02-27-2017)

4. ## Re: Expert Help Required - Multiple Regression with Nominal and Scale Variables

Thanks very much for the reply. So I've used dummy coded variables and everything appears to be in order. The dummy coded variables remain significant predictors when I add them to the regression model. Is it acceptable to include them alongside the other scale predictors in the regression model or would I have to include them in a separate block in SPSS? I have no theoretical reasons to include them elsewhere, though the statistics books that I am reading don't specify the correct course of action.

Another couple of silly questions have come up in the past few hours and I feel it would be unfair on other posters if I were to start another new thread. They are both related to multiple regression and Hayes' PROCESS tool. First of all, I realised that I coded one of my dichotomous yes/no variables as Yes = 0 and No = 1. The variable is coming up as a significant predictor, though the Beta is in a negative direction. Should I have coded it the other way around? The variable is asking whether a person smokes or not, and theoretically it makes sense if it predicts the outcome in a positive direction as opposed to a negative direction.

Also, I have seen a number of studies that ran correlations to check for potential covariates prior to running a mediation analysis using Hayes' PROCESS tool. Basically, they run correlations using categorical variables that have more than two levels, such as nationality, and determine their association with the predictor, mediator and outcome variables, which are scale variables. Then there are correlations reported between two categorical variables as well. Is this acceptable and is there a name for this kind of analysis? I am concerned that I would be violating the assumptions of Pearson's correlations but I've seen it in a few papers now.

5. ## Re: Expert Help Required - Multiple Regression with Nominal and Scale Variables

> Is it acceptable to include them alongside the other scale predictors in the regression model or would I have to include them in a separate block in SPSS? I have no theoretical reasons to include them elsewhere, though the statistics books that I am reading don't specify the correct course of action.

Yes, include them alongside the other predictors in the same block. That is the way you should do it. They are no different from any other variable.

I don't know Hayes' PROCESS tool. The sign is arbitrary with a dummy variable. What you code as 1 and 0 will determine the sign [that is, if you change what was 1 to now be 0, the sign will reverse]. I am not aware of any theoretical justification for coding a particular level as 1 rather than 0. You just need to remember what you coded as 1 and 0 when commenting on it. So if theoretically male should show a positive sign, remember whether you coded 1 as male or not when commenting on the sign you did find.
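A quick way to see the sign reversal is to fit the same one-predictor regression twice with the coding swapped. This is only a sketch: the smoking responses and outcome scores below are invented for illustration, and `ols_slope` is a hand-rolled helper for simple regression, not anything from SPSS or PROCESS.

```python
def ols_slope(x, y):
    """Least-squares slope for a simple regression of y on a single predictor x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Made-up data: smokers score higher on the outcome in this toy sample.
smokes = ["yes", "yes", "no", "no", "yes", "no"]
outcome = [7.0, 8.0, 4.0, 5.0, 9.0, 3.0]

# Coding used in the question: Yes = 0, No = 1.
yes_is_0 = [0 if s == "yes" else 1 for s in smokes]
# Reversed coding: Yes = 1, No = 0.
yes_is_1 = [1 if s == "yes" else 0 for s in smokes]

slope_a = ols_slope(yes_is_0, outcome)
slope_b = ols_slope(yes_is_1, outcome)

print("Yes = 0, No = 1:", slope_a)
print("Yes = 1, No = 0:", slope_b)
```

The two slopes have the same size and opposite signs, because each is just the difference between the two group means taken in a different order. The fit of the model and the significance test do not change; only the wording of the interpretation does.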

I can't comment on the last question as I do not know what the PROCESS tool is and don't do the kind of research you are asking about. I know regression (although I am not as expert as others here), but not the substantive topic you are addressing.


7. ## Re: Expert Help Required - Multiple Regression with Nominal and Scale Variables

That's perfect and thanks, I will include them in the same block in that instance. Haha don't worry about that last part, you've been very helpful!
