In addition to looking at AIC, I would do a test for model comparison. In R:

Code:

`anova(reduced_model, full_model)`

A significant result suggests that the full model provides a better fit to the data than the reduced model. Lower AIC is better.
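For nested models fitted with `lm()`, `anova()` runs the F test of the extra terms, and `AIC()` reports both models side by side. A minimal sketch on simulated data (the variable names and data are invented for illustration):

```r
set.seed(42)

# Simulated data: y depends on x1 and x2; x3 is pure noise
n  <- 200
d  <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
d$y <- 1 + 2 * d$x1 - 1.5 * d$x2 + rnorm(n)

reduced_model <- lm(y ~ x1 + x2, data = d)
full_model    <- lm(y ~ x1 + x2 + x3, data = d)

# F test of the extra term; a small p-value favors the full model
print(anova(reduced_model, full_model))

# AIC for both models; lower is better
print(AIC(reduced_model, full_model))
```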

But I am looking for a reduced model because a model with several variables does not appeal to me in practice.

What should I do?

In industrial statistics, we often have to balance the benefits of a more complete model against the cost of the added complexity. Depending on the application and the cost of the added complexity, we go with the simpler model. Other times, we have to use the more complete model because more precision in prediction is required.

Look at the prediction intervals of the reduced model. Does that model predict "close enough" for your needs?
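One way to check "close enough" is to compute prediction intervals from the reduced model at points of interest and compare their widths against a tolerance set by the application. A sketch with made-up data and a made-up tolerance:

```r
set.seed(1)
n <- 100
x <- runif(n, 0, 10)
y <- 3 + 0.5 * x + rnorm(n, sd = 0.8)
reduced_model <- lm(y ~ x)

# 95% prediction intervals at a few new points
new_data <- data.frame(x = c(2, 5, 8))
pred <- predict(reduced_model, newdata = new_data,
                interval = "prediction", level = 0.95)
print(pred)

# Width of each interval, compared against an application-specific
# tolerance (the value 4 here is purely illustrative)
widths <- pred[, "upr"] - pred[, "lwr"]
tolerance <- 4
print(widths <= tolerance)
```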

But if the full model contains a lot of variables, isn't there a way to remove some of them?

P.S. This concept falls under the bias/variance trade-off: fewer variables mean more potential bias, while more variables (given a finite sample) mean greater standard errors.

If you have a loss function or value you are trying to optimize (e.g., MSE or accuracy), a random holdout set is the best approach to constructing the best subset that generalizes.
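The holdout idea can be sketched as follows: fit candidate subsets on a training split, then pick whichever has the lower MSE on the held-out rows. The data and split fraction here are invented for illustration:

```r
set.seed(7)
n <- 300
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
d$y <- 1 + 2 * d$x1 - d$x2 + rnorm(n)

# Random 70/30 train/holdout split
idx   <- sample(seq_len(n), size = 0.7 * n)
train <- d[idx, ]
hold  <- d[-idx, ]

# Holdout mean squared error for a fitted model
mse <- function(model) mean((hold$y - predict(model, newdata = hold))^2)

full    <- lm(y ~ x1 + x2 + x3, data = train)
reduced <- lm(y ~ x1 + x2,      data = train)

# Prefer the subset with the lower holdout MSE
print(c(full = mse(full), reduced = mse(reduced)))
```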

Yes, that's right, I need a reduced model. So consider the case where we have three models, M1, M2, and M3.

M1 has 6 variables and the smallest AIC (5676).

I would like a model with fewer variables. If I remove the variable with the highest p-value in the model (say, p = 0.004), my AIC rises to 5701. What should I do, given that I still want to reduce M1 because 6 variables is a lot?

Could I look at the confidence interval instead? If the interval is wide, can I remove the variable even if the AIC is larger afterwards?
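The trade-off described above (a higher AIC in exchange for one fewer variable) can be inspected directly in R: `drop1()` shows the AIC change from dropping each single term, and `update()` refits without a chosen term. A sketch with invented data standing in for the 6-variable M1:

```r
set.seed(3)
n <- 250
d <- data.frame(matrix(rnorm(n * 6), ncol = 6))
names(d) <- paste0("x", 1:6)
# Coefficients shrink toward zero, so later terms matter less
d$y <- 1 + d$x1 + 0.8 * d$x2 + 0.5 * d$x3 +
       0.3 * d$x4 + 0.2 * d$x5 + 0.1 * d$x6 + rnorm(n)

m1 <- lm(y ~ ., data = d)

# AIC change from dropping each term, one at a time
print(drop1(m1))

# Refit without the weakest term and compare AIC directly
m_reduced <- update(m1, . ~ . - x6)
print(AIC(m1, m_reduced))
```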

There are two unrelated questions being asked. Your question about AIC is directed toward finding the optimal "statistical" model. Your question about removing terms is directed toward how far from that optimum you can simplify the model, for purely practical reasons, and still have a model that works. The answer to the second question implies that you will ignore AIC and focus on whether the predictive ability of successively simpler models is acceptable. The best way to accomplish that is to evaluate the prediction intervals of each simplified model and determine whether it predicts "good enough" for your purpose.

No. Confidence intervals indicate the uncertainty in the mean response around the regression line, while prediction intervals indicate the uncertainty of individual predictions made with the regression equation. See this explanation.
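The distinction shows up numerically: at the same point, the prediction interval is always wider than the confidence interval, because it adds the residual variation of a single new observation on top of the uncertainty in the mean. A quick sketch on simulated data:

```r
set.seed(9)
x <- runif(50, 0, 10)
y <- 2 + x + rnorm(50)
m <- lm(y ~ x)
nd <- data.frame(x = 5)

conf_int <- predict(m, nd, interval = "confidence")  # uncertainty in the mean response
pred_int <- predict(m, nd, interval = "prediction")  # uncertainty in a new observation
print(conf_int)
print(pred_int)

# The prediction interval is strictly wider than the confidence interval
print((pred_int[, "upr"] - pred_int[, "lwr"]) >
      (conf_int[, "upr"] - conf_int[, "lwr"]))
```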
