Hey all,
I am reading about shrinkage estimators. I am sure most of you are familiar with the concept. In any case, I will start with a brief introduction, summarising the text from the R package "shrink".
When model selection is used, typically when many candidate variables are present, some of the regression coefficients in the final multivariable model can be inflated for several reasons. Post-estimation shrinkage is used to correct for the overestimation of regression coefficients caused by variable selection. Shrinkage can be either:
a) Global shrinkage: this modifies all regression coefficients by the same factor.
b) Parameterwise shrinkage: this modifies different coefficients by different amounts.
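In symbols, a rough sketch of the two options. The notation ($\hat\beta_j$ for the unshrunken coefficient of covariate $j$, $c$ for a single global factor, $c_j$ for a covariate-specific factor) is my own shorthand, not taken from the package text:

```latex
\hat\beta_j^{\text{global}} = c\,\hat\beta_j,
\qquad
\hat\beta_j^{\text{parameterwise}} = c_j\,\hat\beta_j,
\qquad j = 1, \dots, p
```

Here $j$ runs over the covariates only; I am assuming the intercept is handled separately rather than multiplied by the factor.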
Here is a worked example, again from the "shrink" vignette:
Code:
    install.packages("shrink")  # only needed once
    library(shrink)

    ## Simulate data with a binary response (y) and two covariates (x1 and x2).
    set.seed(888)  # for replication
    intercept <- 1
    beta <- c(0.5, 1.2)
    n <- 200
    x1 <- rnorm(n, mean = 1, sd = 1)
    x2 <- rbinom(n, size = 1, prob = 0.3)
    linpred <- intercept + x1 * beta[1] + x2 * beta[2]
    prob <- exp(linpred) / (1 + exp(linpred))
    runis <- runif(n, min = 0, max = 1)
    ytest <- ifelse(test = runis < prob, yes = 1, no = 0)
    simdat <- data.frame(cbind(y = ifelse(runis < prob, 1, 0), x1, x2))

    ## Run logistic regression
    fit <- glm(y ~ x1 + x2, family = binomial, data = simdat, x = TRUE)
    summary(fit)
    j1 <- coef(fit)  # store the coefficients

    ## Assess the shrinkage factors
    j2 <- shrink(fit, type = "global", method = "dfbeta")         # global
    j3 <- shrink(fit, type = "parameterwise", method = "dfbeta")  # parameterwise
And here is a comparison of the model coefficients from the different methods:
Code:
    k <- rbind(j1, j2[[3]], j3[[3]])
    row.names(k) <- c("Regression Coefficients (Uncorrected)",
                      "Shrunken Coefficients (Global)",
                      "Shrunken Coefficients (Parameter Wise)")

    ## Comparing coefficients
    > k
                                           (Intercept)        x1       x2
    Regression Coefficients (Uncorrected)    0.6411413 0.7774600 1.867610
    Shrunken Coefficients (Global)           0.6934747 0.7150984 1.717806
    Shrunken Coefficients (Parameter Wise)   0.6907555 0.7262782 1.661713
Here you can see from the results that the parameter estimates for x1 and x2 are overestimated, and the shrunken coefficients are slightly smaller.
I am fine with the theory but have a few questions.
1) With the shrinkage estimators, the intercept gets inflated while the coefficients for x1 and x2 get shrunken. Why?
2) As the text says, "Post-estimation shrinkage is used to correct for the overestimation of regression coefficients caused by variable selection". The simulated example has only 2 covariates, so there is not much variable selection going on here. What is the shrinkage estimator doing here, then? Can you use shrinkage estimators for any regression model (not only when performing variable selection)?
I have probably read thousands of medical papers that use variable selection methods to come up with a risk factor model. Yet I haven't seen any of them examine the shrinkage factors. What is their practical applicability?
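On question 1, here is a minimal base-R sketch of one plausible mechanism. It is an assumption on my part, not the documented internals of "shrink": the slopes are multiplied by a global factor, and the intercept is then re-estimated with the shrunken linear predictor held fixed as an offset. The factor 0.92 is hypothetical (roughly the ratio of shrunken to unshrunken slopes in the table above), and the names `c_global`, `lp_shrunk` and `refit` are mine.

```r
## Same simulated data as in the example above (base R only)
set.seed(888)
n  <- 200
x1 <- rnorm(n, mean = 1, sd = 1)
x2 <- rbinom(n, size = 1, prob = 0.3)
linpred <- 1 + 0.5 * x1 + 1.2 * x2
prob  <- exp(linpred) / (1 + exp(linpred))
runis <- runif(n, min = 0, max = 1)
y <- ifelse(runis < prob, 1, 0)

## Unshrunken logistic regression
fit <- glm(y ~ x1 + x2, family = binomial)
b   <- coef(fit)

## Hypothetical global shrinkage factor (an assumption, not estimated here)
c_global <- 0.92

## Shrink the slopes only, then build the shrunken linear predictor
slopes_shrunk <- c_global * b[c("x1", "x2")]
lp_shrunk     <- drop(cbind(x1, x2) %*% slopes_shrunk)

## Re-estimate the intercept with the shrunken slopes held fixed as an offset
refit      <- glm(y ~ 1, family = binomial, offset = lp_shrunk)
int_shrunk <- coef(refit)[["(Intercept)"]]

## Side-by-side comparison
rbind(unshrunken = b,
      shrunken   = c(int_shrunk, slopes_shrunk))
```

If this is roughly what happens, the intercept is not being "shrunk" at all. It is refitted so that the average predicted probability still matches the observed event rate, so when positive slopes on mostly positive covariates are pulled towards zero, the intercept has to move up to compensate.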
Many thanks
Please Join the thread.
Oh Thou Perelman! Poincare's was for you and Riemann's is for me.
I have never heard of this, which is not surprising. Perhaps it can be used in cases of overfitting.
Stop cowardice, ban guns!