# Box-Cox - nonparametric?

#### Odilo

##### New Member
Hi,

I have a major problem with the SPSS analysis I would like to conduct... Please, if anyone has an idea, it's very much appreciated.

Some background information:

The plan was to do a multiple regression analysis with four independent variables and one dummy variable (capturing a seasonal effect). The dependent variable is composed of the season 1 outcome and the season 2 outcome of the variable.
When looking at the dependent variable, it becomes clear that this composition of the variable causes a non-normal distribution (when conducting a t-test, the means of the two seasons are significantly different, which is good for my analysis... but bad for the regression).
The dependent variable has some outliers, but the main problem is that the distribution shows two peaks - a bimodal distribution.
As the ordinary transformations (log, inverse, square, etc.) cannot correct the bimodal distribution, I thought about a Box-Cox transformation. Unfortunately I have no clear idea how that works, or what preconditions my variables need to fulfil.
And then I thought about a non-parametric regression, but again, I am not sure if that's the right approach.
Is there a clear guideline about what to do when the dependent variable is bimodal? Does anyone have an idea about what might be the most elegant approach? Or what does not make sense at all?

#### Dason

The assumption of normality is on the error term, not on the dependent variable itself. If you include the dummy variable in the model (which you said you were going to do), that should take care of the bimodality due to season.
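To make that concrete, the regression described in the question could be written as below. The notation is assumed here for illustration and does not come from the original post: $x_{i1},\dots,x_{i4}$ are the four independent variables and $D_i$ is the season dummy (0 for season 1, 1 for season 2).

$$
y_i = \beta_0 + \sum_{j=1}^{4} \beta_j x_{ij} + \gamma D_i + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \sigma^2)
$$

Under this model the pooled $y$ is a mixture of two groups whose means differ by $\gamma$ (for comparable values of the other predictors), so a histogram of $y$ can show two peaks even when $\varepsilon_i$ is perfectly normal. The normality assumption concerns only $\varepsilon_i$.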

#### Odilo

##### New Member
Hey Dason,

Thank you for your answer. I have trouble understanding what you mean. So when my dummy variable takes care of the bimodality, should I take a look at the two samples individually? Does the error term of the y variable of season 1 / 2 need to be normally distributed?

#### Dason

It means the only distributional assumption in the model is on the error term, and you can't assess that until you actually fit the model. You're worried about the bimodality due to the seasonality, but if you include a term for seasonality then you're essentially modeling that, and the error terms should look fine.

So fit your model and then look at the distribution of the residuals.
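The advice above can be illustrated with a small simulation. This is only a sketch under made-up assumptions: the data are synthetic, the coefficients, sample size and season effect are invented, and Python with NumPy stands in for SPSS. It builds a dependent variable that is bimodal purely because of a season shift, fits ordinary least squares with the season dummy included, and compares the shape of the raw variable with the shape of the residuals using excess kurtosis (close to 0 for a normal distribution, clearly negative for two well-separated peaks).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500  # observations per season (invented for this sketch)

# Season dummy: 0 for season 1, 1 for season 2.
season = np.repeat([0.0, 1.0], n)

# Four synthetic independent variables and hypothetical coefficients.
x = rng.normal(size=(2 * n, 4))
beta = np.array([0.5, -0.3, 0.2, 0.1])
season_effect = 6.0  # large shift between seasons creates the two peaks

# Errors are normal by construction; y is bimodal only because of season.
y = 10.0 + x @ beta + season_effect * season + rng.normal(scale=1.0, size=2 * n)


def excess_kurtosis(v):
    """Sample excess kurtosis: about 0 for normal data, negative for bimodal data."""
    z = (v - v.mean()) / v.std()
    return float(np.mean(z**4) - 3.0)


# Design matrix: intercept, four predictors, season dummy.
X = np.column_stack([np.ones(2 * n), x, season])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ coef

kurt_y = excess_kurtosis(y)
kurt_resid = excess_kurtosis(residuals)

print(f"estimated season effect: {coef[-1]:.2f}")
print(f"excess kurtosis of y:         {kurt_y:.2f}")
print(f"excess kurtosis of residuals: {kurt_resid:.2f}")
```

With real data the same check amounts to saving the residuals from the fitted regression and inspecting their histogram or a normal Q-Q plot, instead of judging the distribution of the dependent variable before fitting.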