What does intercept mean in regression analysis?

#1
I have searched the web, but I just can't understand it.
To understand it better, I worked through an example myself. Can somebody explain this to me?
> summary(lm(Weight.of.wheat ~ No.of.species))

Call:
lm(formula = Weight.of.wheat ~ No.of.species)

Residuals:
    Min      1Q  Median      3Q     Max
-2122.2 -1563.6  -274.5  1297.2  4454.3

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)    3113.60    1127.40   2.762  0.00823 **
No.of.species    18.22      93.49   0.195  0.84632
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1757 on 46 degrees of freedom
Multiple R-squared: 0.0008252, Adjusted R-squared: -0.0209
F-statistic: 0.03799 on 1 and 46 DF, p-value: 0.8463

In this analysis, what does the intercept mean? The output shows no significant relationship between weight of wheat and number of species, but the intercept is significant. What does that tell me about my data?
Thank you
 

Karabiner

TS Contributor
#2
The regression weight for number_of_species is 18.22. If number_of_species = 0, then

weight_of_wheat = intercept + 18.22 * 0 = intercept.

In the present case, the intercept is the predicted weight of wheat when the number of species is zero.
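A minimal R sketch of that point. The poster's data are not shown in the thread, so the values below are simulated and purely illustrative; only the variable names are taken from the original call.

```r
# Simulated data (an assumption): NOT the poster's actual data set.
set.seed(1)
No.of.species   <- sample(0:20, 48, replace = TRUE)
Weight.of.wheat <- 3000 + 20 * No.of.species + rnorm(48, sd = 1500)

fit <- lm(Weight.of.wheat ~ No.of.species)

# Predicted weight when the number of species is zero ...
pred_at_zero <- predict(fit, newdata = data.frame(No.of.species = 0))

# ... is the same number as the estimated intercept.
same <- all.equal(unname(pred_at_zero), unname(coef(fit)["(Intercept)"]))
same
```

Here `same` should come out as TRUE, because plugging 0 into the fitted equation leaves only the intercept term.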

HTH

P.
 
#3
Hi

I was going to write a few things but this is quite a good basic description of the intercept and how it should be interpreted.

http://blog.minitab.com/blog/adventures-in-statistics/regression-analysis-how-to-interpret-the-constant-y-intercept

As it says, much of the time it is fairly meaningless (other than in a theoretical way, in that it is the point where the fitted line crosses the Y axis, and including it makes the residuals sum to zero). However, if your data set happens to cover the multidimensional point where all IVs are zero, then the intercept is meaningful. Also remember not to use your model to predict values outside the range of your data.
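A rough way to check this in R, sketched on simulated data (the names `x` and `y` and all values are hypothetical, not from the thread). It asks whether zero lies inside the observed range of the predictor, and confirms that the residuals of a model with an intercept sum to zero.

```r
# Hypothetical predictor that never takes the value 0.
set.seed(2)
x <- sample(5:20, 48, replace = TRUE)
y <- 3000 + 20 * x + rnorm(48, sd = 1500)

fit <- lm(y ~ x)

# Is x = 0 inside the range of the observed data?
x_range       <- range(x)
zero_in_range <- x_range[1] <= 0 && 0 <= x_range[2]
zero_in_range  # if FALSE, the intercept is an extrapolation below min(x)

# With an intercept in the model, the residuals sum to zero
# (up to floating-point error).
resid_sum <- sum(residuals(fit))
resid_sum
```

If `zero_in_range` is FALSE, the intercept describes a point the data never covered, so it is best read as a fitting constant rather than a prediction.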
 

Dragan

Super Moderator
#4
"But the intercept is significant. What does that tell about my data?"
The t-statistic (and its p-value) reported for the Intercept in your output tests the null hypothesis that the intercept, i.e. the expected value of the dependent variable (Y) when No.of.species is zero, equals zero - which is not very interesting for your case. (It would be a test on the grand mean of Y only if the predictor were centered at its mean.) Further, including the intercept ensures that the mean of the predicted values of Y (the YHats) is equal to the mean of the actual values of the dependent variable (the values of Y).
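The statement about the mean of the YHats can be checked with a short R sketch. The data are simulated (an assumption, since the original data are not posted), and the no-intercept model is included only for contrast.

```r
# Simulated, illustrative data: not the poster's data set.
set.seed(3)
x <- sample(0:20, 48, replace = TRUE)
y <- 3000 + 20 * x + rnorm(48, sd = 1500)

fit_with    <- lm(y ~ x)      # model with an intercept
fit_without <- lm(y ~ 0 + x)  # same model forced through the origin

# With an intercept, the mean of the fitted values equals the mean of y.
means_match <- all.equal(mean(fitted(fit_with)), mean(y))
means_match

# Without an intercept, that equality is generally lost.
gap_without <- mean(fitted(fit_without)) - mean(y)
gap_without
```

`means_match` should be TRUE, while `gap_without` will generally be non-zero unless the true line happens to pass through the origin.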

Furthermore, and in the usual General Linear Model jargon (Type III sums of squares), if you take the computational formula for the Sum of Squares for Y i.e.,

SumY^2 - (SumY)^2/N

then the value of SumY^2 is referred to as "Total" and the value of (SumY)^2/N is referred to as the "Intercept". The value of SumY^2 - (SumY)^2/N is referred to as the "Corrected Total." Note that in some textbooks the value of (SumY)^2/N is referred to as a "Correction Factor", as it is also used in the computational formula for the Between Sums of Squares, which is thus referred to as the "Corrected Model" Type III sum of squares.
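The three quantities can be computed directly. Below is a small R sketch on made-up values of Y (the vector `y` is hypothetical); it also checks that the "Corrected Total" is the familiar sum of squared deviations from the mean.

```r
# Hypothetical response values, for illustration only.
set.seed(4)
y <- rnorm(48, mean = 3200, sd = 1700)
N <- length(y)

total           <- sum(y^2)             # "Total":           SumY^2
intercept_ss    <- sum(y)^2 / N         # "Intercept":       (SumY)^2/N
corrected_total <- total - intercept_ss # "Corrected Total"

# The corrected total equals the sum of squared deviations from the mean.
ss_check <- all.equal(corrected_total, sum((y - mean(y))^2))
ss_check

# The "Intercept" term can equivalently be written as N * mean(Y)^2.
int_check <- all.equal(intercept_ss, N * mean(y)^2)
int_check
```

Both checks should return TRUE; they are just the algebraic identities behind the computational formula above.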