- Thread starter sak
- Start date
- Tags distribution of coeff

Hi Sak, you're asking about generating a full histogram of the distribution of the parameter estimate. I've been dealing with this with respect to the difference between confidence and prediction intervals.

I am interested in this issue as well. What I understand is that if you're only interested in the distribution of the mean and mean error, then you simply assume that the SE is the standard deviation and generate the distribution accordingly. For 95% prediction intervals, it's a bit more of a pain (as I understand it) since you need to add the error variance to the unexplained variance in the model. Ugh.

Any comments on this from other contributors? Given a fake data set, how would you....

(1) Generate a histogram of the parameter estimates given the mean and standard error provided in the model output?

(2) Generate a histogram of the underlying "prediction" distribution at a given x, given the model output?

To get a conversation started, here's a fake data set:

Does this make sense? My first thought for a histogram of the mean parameter estimate is:

I would be interested in your thoughts...sak, I hope this directly relates to your original post.

I am interested in this issue as well. What I understand is that if you're only interested in the distribution of the mean and mean error, then you simply assume that the SE is the standard deviation and generate the distribution accordingly. For 95% prediction intervals, it's a bit more of a pain (as I understand it) since you need to add the error variance to the unexplained variance in the model. Ugh.

Any comments on this from other contributors? Given a fake data set, how would you....

(1) Generate a histogram of the parameter estimates given the mean and standard error provided in the model output?

(2) Generate a histogram of the underlying "prediction" distribution at a given x, given the model output?

To get a conversation started, here's a fake data set:

Code:

```
set.seed(100)
fake.data <- data.frame(x=seq(1:20), y=rnorm(20, 10,2)*seq(1:20))
model<-lm(y~x, data=fake.data)
```

Code:

```
set.seed(100)
hist(rnorm(10000, mean=11.0147, sd=0.9272),freq=F)
```

Last edited:

Hi Sak, you're asking about generating a full histogram of the distribution of the parameter estimate.

Code:

```
library(mvtnorm)
reps <- 10000
Beta0.est <- double(reps)
Beta1.est <- double(reps)
Beta2.est <- double(reps)
R2.est <- double(reps)
pop.cor.mat <- matrix(c(1, 0, 0, 0, 1, 0, 0, 0, 1), nrow = 3, byrow = T)
for (i in 1:reps){
samp <- data.frame(rmvnorm(100, c(0,0,0), pop.cor.mat))
samp.lm <- lm(X1 ~ X2 + X3, data = samp)
Beta0.est[i] <- samp.lm$coefficients[1]
Beta1.est[i] <- samp.lm$coefficients[2]
Beta2.est[i] <- samp.lm$coefficients[3]
R2.est[i] <- summary(samp.lm)$r.squared
}
Reg.Results <- data.frame(Beta0.est,Beta1.est,Beta2.est,R2.est)
hist(Beta0.est, breaks = 100, col = "grey", las = 1, main = "Regression Simulation", xlab = "Estimated Regression Coefficient Beta0", ylab = "Frequency", cex.lab = 0.7, cex.axis = 0.7, cex.main = 0.8)
```