How to find the best regression model

Q&A

New Member
#1
Hi all,

I would like to know how to find the best regression model.
I have several models with different variables, and I would like to know which of them is the "best" to use.
Is the lowest AIC value enough to determine the "best" model?
 

fed2

Active Member
#2
Yeah, I think AIC is 'lower is better', isn't it? If you're relying on these criteria to determine the 'best model', though, you're pretty much stumbling in the dark for the most part, IMHO.
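To make the "lower is better" reading concrete, here is a minimal sketch that compares candidate models fitted to the same data. Everything in it is an assumption for illustration: the simulated data, the helper names `fit_simple_ols` and `gaussian_aic`, and the use of the Gaussian form AIC = n·ln(RSS/n) + 2k with constants shared by all models dropped.

```python
import math
import random

def fit_simple_ols(x, y):
    """Least-squares fit of y = a + b*x; returns the residual sum of squares."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

def gaussian_aic(rss, n, k):
    """AIC for a Gaussian linear model, without constants common to all models.
    k counts every estimated parameter, including the error variance."""
    return n * math.log(rss / n) + 2 * k

# Simulated data (hypothetical): y depends on x1 only, x2 is pure noise.
random.seed(0)
n = 200
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
y = [2 + 3 * v + random.gauss(0, 1) for v in x1]

# Intercept-only model: the fitted value is just the mean of y.
mean_y = sum(y) / n
rss_null = sum((yi - mean_y) ** 2 for yi in y)

aic = {
    "y ~ 1": gaussian_aic(rss_null, n, k=2),
    "y ~ x1": gaussian_aic(fit_simple_ols(x1, y), n, k=3),
    "y ~ x2": gaussian_aic(fit_simple_ols(x2, y), n, k=3),
}
best = min(aic, key=aic.get)

# Report each model with its AIC and its distance from the lowest AIC.
for name, value in sorted(aic.items(), key=lambda kv: kv[1]):
    print(f"{name:8s} AIC = {value:9.2f}  delta = {value - aic[best]:8.2f}")
```

The ranking is only meaningful because all three models are fitted to the same `y`. A small delta between two models means AIC does not separate them clearly.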
 

Q&A

New Member
#6
Hello,

@Karabiner, @fed2 thanks guys!

And if I have two regression models with the same variables but different data (a sample and a sub-sample of that sample), can they be compared with AIC?
 

hlsmith

Less is more. Stay pure. Stay poor.
#7
Best model selection is based on context knowledge. If a person were deviating from that practice, using 3 data splits would be the standard. Train: where you fit all candidate models. Validate: where you see which model is best on a holdout set. Test: where you fit the best model and acquire final estimates.
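A rough sketch of the three-split idea, under stated assumptions: the data are simulated, the 60/20/20 proportions are arbitrary, the helpers `fit_line`, `mse` and `column` are hypothetical, and the third split is used here only to score the chosen model on data it has never seen.

```python
import random

def fit_line(x, y):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    return my - b * mx, b

def mse(params, x, y):
    """Mean squared prediction error of a fitted line on the given data."""
    a, b = params
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y)) / len(y)

def column(data, name):
    return [row[name] for row in data]

# Simulated data (hypothetical): y depends on x1, x2 is unrelated.
random.seed(1)
rows = []
for _ in range(300):
    x1, x2 = random.gauss(0, 1), random.gauss(0, 1)
    rows.append({"x1": x1, "x2": x2, "y": 1 + 2 * x1 + random.gauss(0, 1)})

# Shuffle once, then cut into three non-overlapping splits.
random.shuffle(rows)
train, valid, test = rows[:180], rows[180:240], rows[240:]

# Train: fit every candidate model.
candidates = ["x1", "x2"]
fits = {c: fit_line(column(train, c), column(train, "y")) for c in candidates}

# Validate: pick the candidate with the lowest holdout error.
valid_mse = {c: mse(fits[c], column(valid, c), column(valid, "y"))
             for c in candidates}
best = min(valid_mse, key=valid_mse.get)

# Test: score the chosen model once on untouched data.
test_mse = mse(fits[best], column(test, best), column(test, "y"))
print(f"chosen predictor: {best}, test MSE = {test_mse:.3f}")
```

The test split is touched only once, after the choice is made, so its error is not flattered by the selection step.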
 

Q&A

New Member
#8
Yes, I have several models with different variables fitted to the same data, and I used AIC to find the best one.
But now I also have several models with different variables fitted to another data set, and I used AIC again to find the best model for that data.
How can I compare the best model for data set A with the best model for data set B and decide which is better?
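One thing worth checking before comparing AIC across data sets: the value grows with the number of observations, so a sub-sample will usually show a lower AIC than the full sample even when the model and the data-generating process are identical. The sketch below illustrates this with simulated data. The helper names `fit_rss` and `full_gaussian_aic` are made up for the example, and the AIC uses the full Gaussian log-likelihood.

```python
import math
import random

def fit_rss(x, y):
    """Least-squares fit of y = a + b*x; returns the residual sum of squares."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    a = my - b * mx
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

def full_gaussian_aic(rss, n, k):
    """AIC = -2 * max log-likelihood + 2k for a Gaussian linear model,
    where -2 * max log-likelihood = n * (ln(2*pi*rss/n) + 1)."""
    return n * (math.log(2 * math.pi * rss / n) + 1) + 2 * k

# Simulated data (hypothetical): one process, one model, two sample sizes.
random.seed(2)
n_full, n_sub = 400, 100
x = [random.gauss(0, 1) for _ in range(n_full)]
y = [1 + 2 * xi + random.gauss(0, 1) for xi in x]

# k = 3: intercept, slope and error variance.
aic_full = full_gaussian_aic(fit_rss(x, y), n_full, k=3)
aic_sub = full_gaussian_aic(fit_rss(x[:n_sub], y[:n_sub]), n_sub, k=3)

print(f"full sample  (n={n_full}): AIC = {aic_full:.1f}")
print(f"sub-sample   (n={n_sub}): AIC = {aic_sub:.1f}")
```

The gap between the two numbers here comes almost entirely from the sample size, not from one fit being better, which is why AIC is normally only compared between models fitted to exactly the same observations.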
 

Q&A

New Member
#10
Can you pool the data or do they contain non-overlapping sets of variables?

These are sub-samples taken at different points in time. For each sample, I need to assess whether the significant variables are the same regardless of sample and time.
The models for the two sub-samples may end up with different variables.
But what criterion should be used to choose the "best" model, i.e. to see whether the model for sub-sample A is better than the model for sub-sample B?
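If the two sub-samples can be pooled and share the same specification, one possible way to ask "are the coefficients the same in A and B?" is a Chow-type F statistic, which compares one pooled fit against separate fits. This is a sketch of that swapped-in technique, not something proposed in the thread. The data are simulated, `fit_rss` is a hypothetical helper, and the test assumes the same variables and a similar error variance in both sub-samples.

```python
import random

def fit_rss(x, y):
    """Least-squares fit of y = a + b*x; returns the residual sum of squares."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    a = my - b * mx
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

# Simulated sub-samples (hypothetical): the slope changes from 1 to 3 over time.
random.seed(3)
n = 200
xa = [random.gauss(0, 1) for _ in range(n)]
ya = [1 + 1 * xi + random.gauss(0, 1) for xi in xa]
xb = [random.gauss(0, 1) for _ in range(n)]
yb = [1 + 3 * xi + random.gauss(0, 1) for xi in xb]

k = 2  # coefficients per model: intercept and slope

# Restricted model: one set of coefficients for both sub-samples.
rss_pooled = fit_rss(xa + xb, ya + yb)

# Unrestricted model: each sub-sample gets its own coefficients.
rss_separate = fit_rss(xa, ya) + fit_rss(xb, yb)

# Chow F statistic with (k, n_a + n_b - 2k) degrees of freedom.
df_resid = len(ya) + len(yb) - 2 * k
f_stat = ((rss_pooled - rss_separate) / k) / (rss_separate / df_resid)

print(f"F = {f_stat:.1f} on ({k}, {df_resid}) degrees of freedom")
```

A large F relative to the F(k, n_a + n_b - 2k) distribution suggests the coefficients differ between the two periods. It answers "are the relationships the same?" rather than "which sub-sample has the better model?".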