(1) True/False: Models selected by automated variable selection techniques do not need to be validated since they are ‘optimal’ models.

(2) Compute the Akaike Information Criterion (AIC) value for the linear regression model

Y = b0 + b1*X1 + b2*X2 + b3*X3.

The regression model was fitted on a sample of 250 observations and yielded a likelihood value of 0.18.

(a) 9.49

(b) 11.43

(c) 25.52

(d) 15.55

(3) Compute the Bayesian Information Criterion (BIC) value for the linear regression model

Y = b0 + b1*X1 + b2*X2 + b3*X3.

The regression model was fitted on a sample of 250 observations and yielded a likelihood value of 0.18.

(a) 9.49

(b) 11.43

(c) 25.52

(d) 15.55

(4) True/False: Consider a categorical predictor variable that has three levels denoted by 1, 2, and 3. We can include this categorical predictor variable in a regression model using this specification, where X1 is a dummy variable for level 1, X2 is a dummy variable for level 2, and X3 is a dummy variable for level 3.

Y = b0 + b1*X1 + b2*X2 + b3*X3

True

False

(5) True/False: The model Y = b0 + exp(b1*X1) + e can be transformed to a linear model.

True

False

(6) True/False: A variable transformation can be used as a remedial measure for heteroscedasticity.

True

False

(7) When comparing models of different sizes (i.e. a different number of predictor variables), we can use which metrics?

a. R-Squared and Adjusted R-Squared

b. R-Squared and Mallow’s Cp

c. AIC and R-Squared

d. AIC and BIC

(8) True/False: When using Mallow’s Cp for model selection, we should choose the model with the largest Cp value.

True

False

(9) True/False: Consider the case where the response variable Y is constrained to the interval [0,1]. In this case one can fit a linear regression model to Y without any transformation to Y.

True

False

(10) True/False: Consider the case where the response variable Y takes only two values: 0 and 1. A linear regression model can be fit to this data.

True

False

Answers to What I tried:

1) False

2) I tried the formula nlog(RSS/N) + 2k, not working

3) I tried -2ln(likelihood) + ln(N)*K, not working

4) TRUE

5) True

6) False

7) D

8) False - we need smallest

9) False - not sure why

10) FALSE - Logistic, not linear