# Multiple Regression

#### SAMAR86

##### New Member
When we fit a multiple linear regression, how can I know which explanatory variables I should include in the model and which ones I should exclude? How do I identify the ones that could affect the model negatively?

#### GretaGarbo

##### Human
It is based on your subject matter knowledge. If you know very little about your subject, your model might be "strange" (or crazy). But we must all start somewhere.

#### Camelia

##### Member
Research studies usually have explanatory variables that can be numerous for different reasons (mediating, exploratory, theoretical, etc.) and it is not easy to decide which variables to include and which to exclude from a final model.
To select important variables, it is necessary to include the theoretically relevant variables that address your hypotheses. It is also important to add enough variables for good predictive power. On the other hand, the model must be kept simple, so we must find a balance between these two goals. If we add too many variables, we run the risk of multicollinearity.
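One common way to check for multicollinearity numerically is the variance inflation factor (VIF): regress each explanatory variable on all the others and compute 1 / (1 - R²). Below is a minimal sketch using only NumPy. The function name `vif` and the simulated variables `x1`, `x2`, `x3` are made up for illustration, and the often-quoted warning thresholds of about 5 or 10 are rules of thumb, not strict cutoffs.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (rows = observations)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        y = X[:, j]                                  # treat column j as the response
        others = np.delete(X, j, axis=1)             # all remaining predictors
        A = np.column_stack([np.ones(n), others])    # add an intercept
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - resid.var() / y.var()             # R^2 of column j on the others
        out[j] = 1.0 / (1.0 - r2)
    return out

# Simulated data (illustrative only): x3 is almost a copy of x1
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + 0.1 * rng.normal(size=200)

vifs = vif(np.column_stack([x1, x2, x3]))
print(dict(zip(["x1", "x2", "x3"], vifs.round(1))))
```

In this simulated case, `x1` and `x3` should show large VIF values while `x2` stays close to 1, which suggests dropping or combining one of the two collinear variables. Note that VIF only detects collinearity; deciding whether a variable is a confounder still depends on subject matter knowledge.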

#### SAMAR86

##### New Member
Exactly. I want to know how to identify the variables that are collinear and/or are confounders.

#### Camelia

##### Member
I am really not sure, but I would say that in this case the solution rests on using software. When we do not know much about the theory or about collinearity, we can apply a hierarchical analysis: I think you have to perform a sequence of regressions step by step, adding or removing (sets of) IVs. You can let the software choose automatically for you. When it comes to software, there are three methods: forward selection, backward elimination, and stepwise regression. You can also do it manually.
If you decide to apply forward selection: