# distance from expected in terms of standard deviations

#### GQElvie

##### New Member
Assume I am predicting home runs for a player based on past experience. (Assume all players bat the same number of times, so we can work with total home runs rather than home run rate.) I have a linear regression model fitted on one period of time, and I then run 100 players from another period through it. These 100 players' average actual HRs is 20 and their mean predicted is also 20. So, for example, player 1 was predicted to hit 17 but actually hit 21, etc.

Assume, however, that the variability differs between the actuals and the predictions: the standard deviation of the 100 actuals is 4 and that of the 100 predicted values is 5.

Getting back to the player who was predicted to hit 17 but actually hit 21: how far off was he in terms of standard deviations? We want to see how rare this is and refer it to the normal curve, just as we would refer an IQ of 130 to the normal curve with z = 2 and about 2.5% of the area to its right in the right tail.

From everything I have read, standard error comes into play, but that divides the standard deviation by the square root of n, so in essence the resulting number of standard errors is inflated, so to speak. My knowledge base or gut feeling, for lack of a better expression (I have a master's in math), would say to EITHER take (21-17)/4 = 1 SD to the right of the mean or (21-17)/5 = 0.8 SD to the right of the mean. Is either one right? Again, incorporating standard error seems to muddy the waters, inflating the value so much that it could not be used on the curve.

Hope my question makes sense. I can elaborate if need be.
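To make the arithmetic in the question concrete, here is a minimal Python sketch of the two candidate calculations and the right-tail area each would imply on a standard normal curve. It does not settle which denominator (if either) is appropriate; it only reproduces the numbers from the post. The helper name `upper_tail` is made up for illustration, and the IQ scale of mean 100 and SD 15 is an assumption (the post only mentions an IQ of 130 sitting about 2 SDs out).

```python
import math


def upper_tail(z: float) -> float:
    """Area to the right of z under the standard normal curve."""
    # 1 - Phi(z) written with the complementary error function.
    return 0.5 * math.erfc(z / math.sqrt(2))


# Numbers taken from the question.
actual, predicted = 21, 17
sd_actuals = 4      # SD of the 100 actual HR totals
sd_predicted = 5    # SD of the 100 predicted HR totals

miss = actual - predicted  # the player beat his prediction by 4 HR

# Candidate 1: scale the miss by the SD of the actuals.
z_actual_sd = miss / sd_actuals        # 1.0
# Candidate 2: scale the miss by the SD of the predictions.
z_pred_sd = miss / sd_predicted        # 0.8

# Standard-error version the post worries about: dividing the SD by sqrt(n)
# shrinks the denominator, so the resulting value is much larger.
n = 100
z_standard_error = miss / (sd_actuals / math.sqrt(n))  # 10.0

# IQ comparison, ASSUMING the usual scale of mean 100 and SD 15.
z_iq = (130 - 100) / 15                # 2.0

for label, z in [
    ("miss / SD of actuals", z_actual_sd),
    ("miss / SD of predicted", z_pred_sd),
    ("miss / standard error", z_standard_error),
    ("IQ 130 (assumed scale)", z_iq),
]:
    print(f"{label:24s} z = {z:5.2f}  right-tail area = {upper_tail(z):.4g}")
```

With these inputs the two candidates give right-tail areas of roughly 0.16 and 0.21, while the standard-error version lands 10 units out, which is the inflation described in the post.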