# Distribution of DV in multiple OLS regression

#### jango

##### New Member
Hello,

I am running a multiple linear OLS regression on a DV with the following frequency distribution. The DV is the mean of 5 Likert-scale items (thus pseudo-metric, as is common in the social sciences).

I know that the DV does not have to be normally distributed in multiple linear OLS regression, but that the residuals should be. I checked the normality of the residuals, and everything seems fine from this perspective (the P-P and Q-Q plots look fine, and the Shapiro-Wilk test on the standardized residuals is not significant).
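A minimal sketch of this kind of check in Python, run on simulated data rather than the data from this thread. The predictor `x`, the coefficients, and the noise level are assumptions chosen only so that the DV comes out trimodal while the errors are normal by construction.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 130  # roughly the sample size mentioned later in the thread

# Hypothetical predictor with three groups; this alone makes the DV trimodal.
x = rng.integers(0, 3, size=n).astype(float)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.3, size=n)  # normal errors by construction

# OLS fit (intercept + slope) via least squares.
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta

# Shapiro-Wilk on the raw DV and on the residuals.
w_dv, p_dv = stats.shapiro(y)
w_res, p_res = stats.shapiro(residuals)
print(f"DV:        W={w_dv:.3f}, p={p_dv:.4g}")
print(f"Residuals: W={w_res:.3f}, p={p_res:.4g}")
```

In this setup the DV itself is clearly non-normal, while the residuals come from a normal error term, which is the distinction the assumption is about.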

I would like to run a multiple linear OLS regression with this DV. Do you think this is justifiable? I do not want to run an ordinal regression for various reasons.


#### obh

##### Well-Known Member
Hi Jango,

Is it the average of several questions that measure the same area?
Do you mean the residuals are normally distributed?

#### jango

##### New Member
Hi obh,

> Is it the average of several questions that measure the same area?

Yes. The items measure different facets of an overarching construct (reflective construct).

> Do you mean the residuals are normally distributed?

Yes. The residuals are normally distributed.

#### Karabiner

##### TS Contributor
This looks decidedly non-normal to me (there are 3 peaks). You could have a look at a Q-Q plot in order to get an additional impression.

But your sample size is large enough, so that non-normality won't matter anyway.

With kind regards

Karabiner

#### obh

##### Well-Known Member
So you can't use ordinal regression because you are analyzing averages, which are not discrete?

As far as I know, the normality assumption applies only to the residuals.

The DV doesn't look normally distributed, but it is quite symmetrical, so even if this were the distribution of the residuals, I probably wouldn't rule out regression.
Just for fun, did you try a normality test? What p-value do you get in the following Shapiro-Wilk test? http://www.statskingdom.com/320ShapiroWilk.html

#### jango

##### New Member
> This looks decidedly non-normal to me (there are 3 peaks). You could have a look at a Q-Q plot in order to get an additional impression.

Hi Karabiner, thanks for your answer. I thought the normality assumption applies only to the residuals?

> But your sample size is large enough, so that non-normality won't matter anyway.

Sample size is about 130.

> So you can't use ordinal regression because you are analyzing averages, which are not discrete?

Well, I guess that's open for debate. Researchers in my field interpret the mean of multiple Likert-scale items as continuous.

> The DV doesn't look normally distributed, but it is quite symmetrical, so even if this were the distribution of the residuals, I probably wouldn't rule out regression.
> Just for fun, did you try a normality test? What p-value do you get in the following Shapiro-Wilk test? http://www.statskingdom.com/320ShapiroWilk.html

The graph shows the plain and simple frequency distribution of my dependent variable. Thanks for the link, I'll try the test later.

#### obh

##### Well-Known Member
> Well, I guess that's open for debate. Researchers in my field interpret the mean of multiple Likert-scale items as continuous.

I assume the debate is about single Likert-scale items, but you use an average of Likert-scale items.

#### jango

##### New Member
> I assume the debate is about single Likert-scale items, but you use an average of Likert-scale items.

Ah, yes. Thanks for the clarification.

#### Karabiner

##### TS Contributor
> Hi Karabiner, thanks for your answer. I thought the normality assumption applies only to the residuals?

Yes, I misunderstood your description. I thought it showed the residuals, not the DV.

#### jango

##### New Member
Thanks for all your answers.

I also checked the scatter plot of the standardized residuals against the standardized predicted values. Three diagonal lines are visible in the plot. I guess that's due to the three peaks in the frequency distribution of my DV. Is this problematic?
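A small sketch of where such diagonal lines come from, again on simulated data and not the data from this thread. The three-valued DV and the predictor `x` are assumptions made purely for illustration. Because a residual is the observed value minus the fitted value, all cases sharing the same observed DV value lie on one straight line with slope -1 in a raw residual-versus-fitted plot (standardizing both axes rescales the slope but keeps the lines straight).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 130

# Hypothetical DV restricted to three values, mimicking three dominant peaks.
x = rng.normal(size=n)
latent = 3.0 + 0.5 * x + rng.normal(0.0, 0.5, size=n)
y = np.clip(np.round(latent), 2, 4)

# OLS fit (intercept + slope) via least squares.
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ beta
residuals = y - fitted

# Within each observed DV value, residual + fitted equals that value,
# so those points fall on the line residual = value - fitted.
for value in np.unique(y):
    mask = y == value
    assert np.allclose(residuals[mask] + fitted[mask], value)
    print(f"DV = {value:.0f}: {mask.sum()} points on residual = {value:.0f} - fitted")
```

So the bands are a geometric consequence of a DV with few dominant values. Whether they matter for a given analysis still depends on the usual checks, such as whether the residual spread changes across the fitted values.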

I read the following paper and blog post on this, and the authors do not seem to mention any problems in this respect:
