Hello everyone, I am new here (and to stats in general).
Anyway, I am working on a project where I want to predict the value of an outcome, and I have a few variables known beforehand that might be related. This is how I planned to go about finding a formula that will more or less predict that outcome. Is it correct?
List potentially influential variables
Gather samples (I can do as many as I like)
Calculate the Pearson correlation between each variable and the outcome
Discard all obviously unrelated variables
Regress the outcome linearly on the first variable, then regress the residuals from that fit on the next variable, and so on
Compare the full regression against a regression excluding each variable to test its importance
Is this how I should go about doing it? How do I know that a multiple linear regression is the way to go? And is there any free software out there that would make my life easier?
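To make the plan concrete, here is a rough sketch of the correlation step and the full-versus-reduced comparison, assuming Python with NumPy (one free option). The data are synthetic and the names (x1, x2, x3, pearson_r, fit_rss) are made up for illustration; this is not meant as the definitive way to do it.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 samples, 3 candidate variables.
# By construction, x3 has no effect on the outcome.
n = 200
X = rng.normal(size=(n, 3))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.5, size=n)
names = ["x1", "x2", "x3"]


def pearson_r(x, y):
    """Pearson correlation between one variable and the outcome."""
    return float(np.corrcoef(x, y)[0, 1])


def fit_rss(X, y):
    """Least-squares fit with an intercept.

    Returns the coefficients (intercept first) and the residual sum of squares.
    """
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return coef, float(resid @ resid)


# Correlation of each variable with the outcome.
correlations = {name: pearson_r(X[:, j], y) for j, name in enumerate(names)}

# Full regression on all variables at once.
full_coef, full_rss = fit_rss(X, y)

# Refit with each variable left out and record how much the fit worsens.
rss_increase = {}
for j, name in enumerate(names):
    reduced = np.delete(X, j, axis=1)
    _, reduced_rss = fit_rss(reduced, y)
    rss_increase[name] = reduced_rss - full_rss

for name in names:
    print(name, "r =", round(correlations[name], 3),
          "RSS increase when dropped =", round(rss_increase[name], 3))
```

A variable whose removal barely changes the residual sum of squares contributes little to the fit. Turning that difference into a formal test (for example an F-test) would need one more step that is not shown here.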