1. ## F-Statistic

I'm trying to understand the intuition behind the F-Statistic. In ISLR:

The numerator is described as the "between variation". To me it looks like the average explained variance per predictor. Why does that speak to Between Variation?

The denominator looks like an estimate of the error variance of the true regression and is described as the "within variation". Why is it the "within variation"?

Finally, why does the ratio of between to within variation support the alternative hypothesis that at least one beta coefficient does not equal zero?

2. ## Re: F-Statistic

It might be useful (I'm not sure) to think of the F statistic in terms of R^2 where:

F = [R^2/p]/[(1-R^2)/(N-p-1)].
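As a quick sanity check on that formula, here is a minimal numpy sketch (toy data, illustrative coefficients) that fits an OLS model, computes R^2, and shows that the R^2 form of F agrees with the usual sums-of-squares form:

```python
import numpy as np

# Toy data: n observations, p predictors (all values illustrative)
rng = np.random.default_rng(0)
n, p = 100, 3
X = rng.normal(size=(n, p))
y = 2.0 + X @ np.array([1.0, 0.0, -0.5]) + rng.normal(size=n)

# OLS fit via least squares on [1, X]
A = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
y_hat = A @ beta

# R^2 = 1 - RSS/TSS
rss = np.sum((y - y_hat) ** 2)
tss = np.sum((y - y.mean()) ** 2)
r2 = 1 - rss / tss

# F from the R^2 form: F = (R^2/p) / ((1 - R^2)/(n - p - 1))
F = (r2 / p) / ((1 - r2) / (n - p - 1))

# Same statistic from sums of squares: ((TSS - RSS)/p) / (RSS/(n - p - 1))
F_ss = ((tss - rss) / p) / (rss / (n - p - 1))
print(F, F_ss)
```

The two forms are algebraically identical: divide the numerator and denominator of the sums-of-squares form by TSS and the R^2 version drops out.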

Do you understand the interpretation of what the value of R^2 means?

The null hypothesis states (in English) that you are not going to get any help at all in predicting Y (the D.V.) from any of the X's (the I.V.'s). That is, you may as well just use the mean of Y (Y_bar) as the best predictor of Y, because Y_bar is the OLS estimate of Y when no predictors are used.

The alternative hypothesis states (in English) that you will get some help from at least one (perhaps more) of the X's in predicting Y.

The (F) ratio does not automatically support the alternative hypothesis or, conversely, automatically lead to rejecting the null hypothesis.

3. ## Re: F-Statistic

Yes, I get the R^2. It's the percentage of the variance in Y that is explained.

I can't make the connection between the ratio of numerator to denominator and the probability that at least one predictor explains the variance in Y.

4. ## Re: F-Statistic

The "between group" and "within group" nomenclature is most often used in the context of ANOVAs, where the former measures the variation of group means from the overall mean, and the latter measures the "natural variation," or "error," of the y's around their respective group means.

The form of the numerator and denominator is connected to the F-distribution, which describes the ratio of two independent chi-square random variables, each divided by its respective degrees of freedom. If (as under the null hypothesis) the beta coefficients are truly all zero, and the normality, independence, and homoskedasticity assumptions are met, then

(TSS - RSS)/sigma^2 ~ chi-square with p degrees of freedom,
RSS/sigma^2 ~ chi-square with n - p - 1 degrees of freedom,

and the two are independent. So the F-statistic has the form F = [(TSS - RSS)/p] / [RSS/(n - p - 1)] after the sigma^2 terms have cancelled out. A high F-statistic, which would be observed with extremely low probability under the null hypothesis, would provide evidence against that hypothesis.
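A short simulation (numpy only, illustrative degrees of freedom) can make this concrete: drawing two independent chi-squares, each divided by its degrees of freedom, and taking their ratio reproduces the behavior of an F variable, e.g. its theoretical mean k/(k - 2):

```python
import numpy as np

# Simulate the F ratio as (chi2_p / p) / (chi2_k / k) with independent
# numerator and denominator, as under the null hypothesis.
rng = np.random.default_rng(1)
p, k = 3, 96            # numerator / denominator degrees of freedom (illustrative)
n_sim = 200_000

num = rng.chisquare(p, n_sim) / p
den = rng.chisquare(k, n_sim) / k
f_sim = num / den

# The theoretical mean of an F(p, k) variable is k / (k - 2) for k > 2
print(f_sim.mean(), k / (k - 2))
```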

5. ## Re: F-Statistic

Originally Posted by droma3
Yes, I get the R^2. It's the percentage of the variance in Y that is explained.

I can't make the connection between the ratio of numerator to denominator and the probability that at least one predictor explains the variance in Y.
I do not know where you are getting the idea that you need to establish a connection between the F-statistic and the probability that at least one predictor explains the variance in Y.

That said, I would suggest that the idea you need to understand is that the sampling distribution associated with a statistic is derived assuming that the Null Hypothesis is true. As the previous poster (ab-stats) stated, a low probability (e.g., p < 0.05) associated with the F-statistic (again, assuming the Null Hypothesis to be true) would lead a researcher to reject the Null Hypothesis in favor of the Alternative Hypothesis that at least one predictor explains the variance in Y. But it is an omnibus test, i.e., "I cannot tell which one of the predictors it is," and that is why we have indices such as semi-partial correlation coefficients, which tell the researcher the unique contribution a particular predictor makes to the regression model as if it were entered last into the model.

It appears to me that your connection "problem" lies not with Type I error but rather with the complement of Type II error, which of course would be the power of the F-test.

Note: the square root of R^2 is the correlation between Y and the predicted values of Y (Y_hat) and the square root of 1- R^2 is the correlation between Y and the error terms of the regression model.
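That note can be checked numerically. A minimal sketch with toy data (illustrative sizes and coefficients): corr(Y, Y_hat) should equal sqrt(R^2), and corr(Y, residuals) should equal sqrt(1 - R^2):

```python
import numpy as np

# Toy OLS fit; check corr(y, y_hat) = sqrt(R^2) and
# corr(y, residuals) = sqrt(1 - R^2).
rng = np.random.default_rng(2)
n, p = 200, 2
X = rng.normal(size=(n, p))
y = 1.0 + X @ np.array([0.8, -0.3]) + rng.normal(size=n)

A = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
y_hat = A @ beta
resid = y - y_hat

r2 = 1 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
corr_fit = np.corrcoef(y, y_hat)[0, 1]
corr_err = np.corrcoef(y, resid)[0, 1]
print(corr_fit, np.sqrt(r2))
print(corr_err, np.sqrt(1 - r2))
```

The second identity holds because, in OLS with an intercept, the fitted values and residuals are orthogonal, so cov(y, e) = var(e).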

In summary, to address your "connection" problem, you need to consider a non-central F-distribution with a certain value of the so-called "non-centrality parameter," which shifts the central F-distribution (the one that assumes the Null Hypothesis to be true) to the right. You can then calculate, using numerical integration, the probability with which you are seeking to make "the connection."
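As a sketch of that power calculation, assuming scipy is available and using Cohen's convention for the non-centrality parameter (lambda = f^2 * (u + v + 1), with effect size f^2 = R^2/(1 - R^2); the R^2 value and sample size here are purely illustrative, not from the thread):

```python
from scipy import stats

# Power of the overall F-test via the non-central F distribution.
n, p = 100, 3                     # illustrative sample size and predictor count
r2_alt = 0.13                     # hypothesized R^2 under the alternative
u, v = p, n - p - 1               # numerator / denominator degrees of freedom
f2 = r2_alt / (1 - r2_alt)        # Cohen's effect size f^2
nc = f2 * (u + v + 1)             # non-centrality parameter (Cohen's convention)

# Critical value from the central F (null) distribution at alpha = 0.05
f_crit = stats.f.ppf(0.95, u, v)

# Power: probability the non-central F exceeds the critical value
power = stats.ncf.sf(f_crit, u, v, nc)
print(power)
```

`scipy.stats.ncf.sf` does the numerical integration over the non-central F density, which is exactly the shifted-distribution probability described above.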

I hope this post helps.
