Hi,
I'm hoping someone out there can help me with what is probably a very simple question.
For a stats assignment, I've created a linear regression model that predicts the sale price of houses from their capital value. We were given a large amount of data and got Minitab to generate the appropriate statistics, residual analysis, etc.
The final part of the question asks us to evaluate our model (i.e. how good is it at making predictions). The linear regression equation is:
Price = 10708 + 0.992 Capital Value
And the associated Minitab output is:
Predictor        Coef      SE Coef   T       P
Constant         10708     5203      2.06    0.041
Capital Value    0.99234   0.03232   30.71   0.000
S = 28223.4   R-Sq = 82.8%   R-Sq(adj) = 82.7%
We have a nice high R-Sq value and a reasonably low S value (given that the house prices we are looking at range from under $100,000 to over $300,000).
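As a rough check on how S compares with the overall spread of the sale prices, the output above is enough to back out the sample standard deviation of the prices, because R-Sq(adj) = 1 - S^2 / s_y^2. The following is only a sketch in Python: the function and variable names are my own, and the result is only as precise as the rounded 82.7%.

```python
import math

# Values copied from the Minitab output above.
S = 28223.4        # residual standard deviation
R_SQ_ADJ = 0.827   # R-Sq(adj), rounded to three figures in the output


def implied_response_sd(s: float, r_sq_adj: float) -> float:
    """Back out the sample SD of the response from S and adjusted R-squared.

    Uses R-Sq(adj) = 1 - S^2 / s_y^2, where s_y is the usual sample
    standard deviation (n - 1 divisor) of the sale prices.
    """
    return s / math.sqrt(1.0 - r_sq_adj)


sd_y = implied_response_sd(S, R_SQ_ADJ)
print(f"Implied SD of sale prices: about {sd_y:,.0f}")
print(f"S as a fraction of that SD: {S / sd_y:.2f}")
```

On these numbers S comes out at well under half of the raw spread in prices, which is one way of putting a figure on "reasonably low".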
So far so good. I can say that the slope is positive and has a value of 0.992. The question then asks if the standard deviation of a prediction is likely to be smaller than the standard deviation of the responses.
I have no idea what this means.
I googled and found the following description of "standard deviation of prediction":
The standard deviations of the predicted values of the estimated regression function depend on the standard deviation of the random errors in the data, the experimental design used to collect the data and fit the model, and the values of the predictor variables used to obtain the predicted values. These standard deviations are not simple quantities that can be read off of the output summarizing the fit of the model, but they can often be obtained from the software used to fit the model.
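For a simple linear regression like this one, the standard textbook formulas behind that description are s * sqrt(1/n + (x0 - xbar)^2 / Sxx) for the fitted value at x0, and s * sqrt(1 + 1/n + (x0 - xbar)^2 / Sxx) for a single new observation at x0. Here is a minimal sketch in Python. S and the slope's standard error come from the Minitab output, but the sample size, the mean capital value and the house being predicted are made-up placeholders, not values from the assignment data.

```python
import math

# From the Minitab output.
S = 28223.4          # residual standard deviation
SE_SLOPE = 0.03232   # standard error of the Capital Value coefficient


def prediction_sds(x0, n, x_bar, s=S, se_slope=SE_SLOPE):
    """Return (SD of the fitted value, SD of a new observation) at x0.

    Since SE(slope) = s / sqrt(Sxx), the term (x0 - x_bar)^2 / Sxx can be
    written as ((x0 - x_bar) * se_slope / s)^2 without knowing Sxx directly.
    """
    leverage = 1.0 / n + ((x0 - x_bar) * se_slope / s) ** 2
    sd_fit = s * math.sqrt(leverage)        # uncertainty in the fitted line
    sd_new = s * math.sqrt(1.0 + leverage)  # adds the scatter of one house
    return sd_fit, sd_new


# Hypothetical inputs: n, x_bar and x0 are NOT taken from the real data.
sd_fit, sd_new = prediction_sds(x0=200_000, n=200, x_bar=150_000)
print(f"SD of fitted value:    {sd_fit:,.0f}")
print(f"SD of new observation: {sd_new:,.0f}")
```

The sketch shows the three ingredients named in the quote: the error standard deviation (s), the design (n and the spread of the capital values through Sxx), and the predictor value being predicted at (x0).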
From the question, it does not look like they are asking us to find either value; they just want us to say whether the standard deviation of a prediction would be smaller than the standard deviation of the responses. The problem is that I can't seem to find a definition of "standard deviation of the responses" to compare with "standard deviation of a prediction".
Given that our model makes its predictions from a straight line, would we not expect the standard deviation of a prediction to always be smaller than the standard deviation of the observed prices?
I'm so lost and am hoping that someone out there can clarify this for me.
Looking forward to any (and all) suggestions.
Thanks
Dzeni