GretaGarbo (05-30-2016)
If we have stochastic regressors, we are drawing random pairs $(y_i,\vec{x}_i)$ for a bunch of $i$, the so-called random sample, from a fixed but unknown probability distribution of $(y,\vec{x})$. Theoretically speaking, the random sample allows us to learn about, or estimate, some parameters of that joint distribution.
If we have fixed regressors, theoretically speaking, we can only infer certain parameters of the $k$ conditional distributions $y\mid x_i$ for $i=1,2,\dots,k$, where each $x_i$ is fixed rather than a random variable. More specifically, stochastic regressors allow us to estimate some parameters of the entire distribution of $(y,\vec{x})$, while fixed regressors only let us estimate certain parameters of the conditional distributions $y\mid x_i$.
The consequence is that results obtained with fixed regressors cannot be generalized to the whole distribution. For example, if we only had $x=1,2,3,\dots,99$ in the sample as fixed regressors, we cannot infer anything about $x=100$ or $x=99.9$, whereas with stochastic regressors we can.
Am I right? This is actually a rather difficult question, as many textbooks only talk about the differences in the mathematical derivations but avoid discussing how far the results can be generalized in theory. I have sought help from my stats professor, but he doesn't know the answer.
GretaGarbo (05-30-2016)
I think I understand your question in spite of your interesting notation. It can be shown that all results on estimation, testing, and prediction for the regression model Y_i = B_0 + B_1*X_i + e_i still apply if the following conditions hold:
(1) The conditional distributions of the Y_i given X_i, are normal and independent, with conditional means B_0 + B_1*X_i and conditional variance Sigma^2. And,
(2) The X_i are independent random variables whose probability distribution, say g(X_i), does not involve the parameters B_0, B_1, and Sigma^2.
These conditions require only that the regression model above is appropriate for each conditional distribution of Y_i, and that the probability distribution of the X_i does not involve the regression parameters. If these conditions are met, all the results on estimation, testing, and prediction that hold when the X_i are fixed still hold even though the X_i are random (or stochastic) variables.
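One way to see why condition (2) matters is to write out the joint likelihood. This is only a sketch, not a full proof; $f$ is my own label for the conditional density of $Y_i$ given $X_i$, $g$ is the density of $X_i$ from condition (2), and $\sigma^2$ stands for Sigma^2:

$$
L(B_0, B_1, \sigma^2) = \prod_{i=1}^{n} f(y_i \mid x_i;\, B_0, B_1, \sigma^2)\, g(x_i)
$$

$$
\log L(B_0, B_1, \sigma^2) = \sum_{i=1}^{n} \log f(y_i \mid x_i;\, B_0, B_1, \sigma^2) + \sum_{i=1}^{n} \log g(x_i)
$$

Because the second sum does not involve B_0, B_1, or Sigma^2, maximizing the joint likelihood over these parameters gives the same estimates as maximizing the first sum alone, and the first sum is exactly the log-likelihood one would write down with the X_i treated as fixed.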
GretaGarbo (05-30-2016)
Does this result also hold in the small-sample case, or can it be shown only for the large-sample case?
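A quick way to probe the small-sample part of this is a simulation. The sketch below is my own illustration, not something taken from a textbook: it assumes conditions (1) and (2) hold, with normal errors and X_i drawn uniformly on [0, 1] independently of the parameters, and it estimates how often the usual 95% t-interval for the slope covers the true value with only 10 observations. The function name, the parameter values, and the uniform choice for X are all assumptions made for the example.

```python
import math
import random

N = 10          # deliberately small sample size
T_CRIT = 2.306  # 97.5th percentile of Student's t with N - 2 = 8 degrees of freedom


def slope_ci_coverage(random_x, reps=4000, b0=1.0, b1=2.0, sigma=1.0, seed=0):
    """Share of simulated samples whose 95% t-interval for the slope contains b1."""
    rng = random.Random(seed)
    x_fixed = [i / (N - 1) for i in range(N)]  # equally spaced design on [0, 1]
    hits = 0
    for _ in range(reps):
        # Stochastic design: redraw X every replication. Fixed design: reuse x_fixed.
        x = [rng.uniform(0.0, 1.0) for _ in range(N)] if random_x else x_fixed
        y = [b0 + b1 * xi + rng.gauss(0.0, sigma) for xi in x]

        x_bar = sum(x) / N
        y_bar = sum(y) / N
        sxx = sum((xi - x_bar) ** 2 for xi in x)
        sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

        b1_hat = sxy / sxx
        b0_hat = y_bar - b1_hat * x_bar
        sse = sum((yi - b0_hat - b1_hat * xi) ** 2 for xi, yi in zip(x, y))
        se_b1 = math.sqrt(sse / (N - 2) / sxx)

        if abs(b1_hat - b1) <= T_CRIT * se_b1:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    print("fixed X :", slope_ci_coverage(random_x=False))
    print("random X:", slope_ci_coverage(random_x=True))
```

If both printed proportions come out close to 0.95, that is consistent with the fixed-X interval remaining valid for random X in small samples under normal errors. It only checks this one setup, of course, and says nothing about non-normal conditional distributions.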
Can the result be extended to the exponential family (when the sample is large)?
For example, when:
"The conditional distributions of the Y_i given X_i, are independent and the distribution belongs to the exponential family (like normal, binomial, Poisson, gamma etc.)" ?
Many thanks. In the context of experimentation where one would choose the X_i values (i.e. fixed regressors), do you think the results can still be generalized to X values that are not tested in the experiment (since now X_i are not randomly drawn)?
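To make the question concrete, here is a small sketch of what predicting at an untested value looks like with a fixed design of x = 1, 2, ..., 99. It is an illustration under a strong assumption, namely that the straight-line model with constant error variance keeps holding just outside the tested range, which the data themselves cannot confirm. The true parameter values, the seed, and the helper names `fit_line` and `predict_with_se` are made up for the example.

```python
import math
import random


def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x, keeping what prediction needs."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    return {"b0": b0, "b1": b1, "s2": sse / (n - 2), "n": n, "x_bar": x_bar, "sxx": sxx}


def predict_with_se(fit, x_new):
    """Point prediction and standard error for one new observation at x_new."""
    pred = fit["b0"] + fit["b1"] * x_new
    leverage = 1 / fit["n"] + (x_new - fit["x_bar"]) ** 2 / fit["sxx"]
    se = math.sqrt(fit["s2"] * (1 + leverage))
    return pred, se


# Fixed design chosen by the experimenter; only y is random.
rng = random.Random(1)
x = list(range(1, 100))
TRUE_B0, TRUE_B1, SIGMA = 5.0, 0.5, 1.0  # hypothetical values for the simulation
y = [TRUE_B0 + TRUE_B1 * xi + rng.gauss(0.0, SIGMA) for xi in x]

fit = fit_line(x, y)
for x_new in (50, 99.9, 100):
    pred, se = predict_with_se(fit, x_new)
    print(f"x = {x_new}: prediction {pred:.2f}, standard error {se:.3f}")
```

The formula gives a prediction at 99.9 or 100 just as it does at 50, with a somewhat larger standard error further from the centre of the design. Whether that prediction means anything depends on the model assumption rather than on whether the X_i were chosen or randomly drawn.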