Appropriate Regression Method?


New Member
Hi guys,

I'm relatively new to statistics, and I'm searching for an answer as to which regression method would be most appropriate for the problem I'm facing. My task is to use a number of continuous predictor variables (potentially up to 20) to predict a continuous outcome variable. However, I don't know which variables are significant or which will be useful, so I assume I must go through some kind of step-by-step process to find the best variables to use. Is there a method that would be most appropriate for this? Perhaps relatively easy-to-use software that accommodates this method?

Also, when entering these variables into software, is it necessary to test the relationships between individual predictor variables (should they be added, subtracted, multiplied, raised to higher orders, etc.)?

Thanks in advance,



Omega Contributor
On the surface, it sounds like you are looking at multiple linear regression. Four typical approaches:

Stepwise (people have issues with it, saying it is too automated).

Do a bunch of univariate comparisons first and, if significant, use them as candidate covariates in the model (people dislike this approach, grumbling about collinearity and the inability to examine moderation among covariates).

Enter all variables and see what goes down.

Lastly, enter variables based on literature review and clinical significance.

Ideally, a fusion of the second and last options is the accepted approach, while also testing for collinearity and mediation and checking the model assumptions.
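
As a rough illustration of the collinearity check mentioned above, here is a minimal sketch in plain Python that flags pairs of candidate predictors with a high pairwise correlation. The toy data, the variable names (x1, x2, x3), and the 0.9 cutoff are all made up for the example; pairwise correlation is only one simple screen, and it will not catch collinearity that involves three or more variables at once.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def collinear_pairs(predictors, threshold=0.9):
    """Return (name1, name2, r) for every pair with |r| >= threshold."""
    flagged = []
    for (n1, x1), (n2, x2) in combinations(predictors.items(), 2):
        r = pearson(x1, x2)
        if abs(r) >= threshold:
            flagged.append((n1, n2, round(r, 3)))
    return flagged


# Hypothetical toy data: x2 is nearly a multiple of x1, x3 is unrelated.
predictors = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "x2": [2.1, 3.9, 6.2, 7.8, 10.1, 12.0],
    "x3": [5.0, 1.0, 4.0, 2.0, 6.0, 3.0],
}

flagged = collinear_pairs(predictors)
print(flagged)  # only the (x1, x2) pair should be flagged
```

If a pair is flagged, one common option is to keep only one of the two variables (or combine them) before fitting the model.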


New Member
Thanks hlsmith,

I've been looking into stepwise and the alternative, potentially better methods such as lasso or least angle regression, as well as Box-Cox transformations. I'm looking for a bit of insight as to when you would enter relationships between predictor variables into a regression expression, i.e. a*b, a/b, a-b, a:b, etc.

Thanks again!
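
For anyone following along, here is a minimal sketch of what entering a product term such as a*b amounts to in practice: the product is added as an extra column, and the fit is compared with and without it. Everything here is an assumption for illustration: the toy data are constructed so that the outcome really does depend on a times b, and the small `solve` and `ols` helpers are hand-rolled stand-ins for a proper statistics package. (In R's formula syntax, `a:b` is the interaction term alone, while `a*b` expands to `a + b + a:b`.)

```python
def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def ols(columns, y):
    """Least-squares fit with an intercept; returns (coefficients, RSS)."""
    X = [[1.0] + list(row) for row in zip(*columns)]
    p = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    beta = solve(XtX, Xty)
    rss = sum((yi - sum(bk * v for bk, v in zip(beta, r))) ** 2
              for r, yi in zip(X, y))
    return beta, rss


# Hypothetical toy data built so that y = 1 + 2a + 3b + 0.5ab exactly.
a = [1, 2, 3, 4, 1, 2, 3, 4]
b = [1, 1, 2, 2, 3, 3, 4, 4]
y = [1 + 2 * ai + 3 * bi + 0.5 * ai * bi for ai, bi in zip(a, b)]
ab = [ai * bi for ai, bi in zip(a, b)]  # the interaction (product) column

_, rss_main = ols([a, b], y)            # main effects only
beta_int, rss_int = ols([a, b, ab], y)  # main effects plus interaction

print("RSS without interaction:", rss_main)
print("RSS with interaction:   ", rss_int)
print("coefficients (intercept, a, b, a*b):", beta_int)
```

On real data, the drop in residual sum of squares would not be this clean, and whether the extra term is worth keeping would normally be judged with a formal test or cross-validation rather than by comparing the raw RSS.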