- Thread starter SAMAR86

To select important variables, include the theoretically relevant variables needed to address your hypotheses. It is also important to add enough variables for good predictive power. At the same time, the model must be kept simple, so we have to find a balance between these two goals. If we add too many predictors, we run the risk of multicollinearity.

If you decide to apply forward selection:

Start with the ‘empty model’ Ŷ = a. After that, follow these steps:

1. Add each unused IV (independent variable), one at a time, to the model. Compute its sr² (squared semipartial correlation).

2. Take the largest sr². Test its significance, that is, test whether adding the associated predictor to the model significantly increases the variance in Y accounted for. If the p-value is below the cutscore: add Xi to the model and return to Step 1. If the p-value is above the cutscore: stop the algorithm.
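
The two steps above can be sketched in Python roughly as follows. This is only a minimal sketch, assuming NumPy and SciPy are available. The names `r_squared` and `forward_selection` and the `alpha` parameter (standing in for the cutscore) are my own illustrative choices, not part of any library. It relies on the fact that the sr² of a candidate predictor equals the increase in R² when that predictor is added, and it tests that increase with the usual F-change test with 1 numerator degree of freedom.

```python
import numpy as np
from scipy import stats


def r_squared(X, y):
    """R^2 of an ordinary least squares fit of y on the columns of X plus an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    ss_res = resid @ resid
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def forward_selection(X, y, alpha=0.05):
    """Return the column indices of X chosen by forward selection (hypothetical helper)."""
    n, k = X.shape
    selected = []
    r2_current = 0.0  # the empty model Y-hat = a accounts for no variance in Y

    while len(selected) < k:
        # Step 1: add each unused IV one at a time and compute its sr^2,
        # i.e. the gain in R^2 over the current model.
        candidates = [j for j in range(k) if j not in selected]
        sr2 = {j: r_squared(X[:, selected + [j]], y) - r2_current
               for j in candidates}

        # Step 2: take the largest sr^2 and test it with an F-change test.
        best = max(sr2, key=sr2.get)
        r2_new = r2_current + sr2[best]
        df_resid = n - (len(selected) + 1) - 1  # n - predictors - intercept
        if df_resid <= 0:
            break  # not enough observations left to test another predictor
        f_stat = sr2[best] / ((1.0 - r2_new) / df_resid)
        p_value = stats.f.sf(f_stat, 1, df_resid)

        if p_value >= alpha:
            break  # p-value above the cutscore: stop the algorithm
        selected.append(best)  # p-value below the cutscore: keep Xi, repeat
        r2_current = r2_new

    return selected
```

To try it, pass a 2-D array `X` with one column per candidate IV and a 1-D array `y`, for example `forward_selection(X, y, alpha=0.05)`. The result is the list of selected column indices in the order they entered the model.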