# Thread: Constant of 1: Why?

1. ## Constant of 1: Why?

I know that in regression models (well, any linear model I would assume; correct me if I'm wrong) there's a constant of 1 added to the model. I've done a regression with matrix algebra and you have to put a vector of ones in the matrix alongside the predictor.

Why? What is it doing? I also know that if you add -1 to the formula in R it removes this constant, which means there's no intercept. Why would you want a model with no intercept?

I don't have the theory background, so I need the explicit version; mathematical notation alone doesn't really work as an explanation for me.

2. ## Re: Constant of 1: Why?

I think one relatively sound reason is that you want the line/plane to pass through the origin because of the physical nature of the problem in the real world (or some other plausible constraint). The general advice is to be cautious with this kind of modelling (because it is a sub-model of the general one) and to avoid it unless there is a good reason to do so.
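For reference, here is how a through-the-origin fit looks in R. This is just a sketch with made-up data; the two formula spellings shown are equivalent ways of dropping the intercept:

```r
# Hypothetical data where y really is proportional to x
set.seed(1)
x <- 1:20
y <- 3 * x + rnorm(20)

# Two equivalent ways to drop the intercept in R's formula interface
fit1 <- lm(y ~ x - 1)
fit2 <- lm(y ~ 0 + x)

coef(fit1)                      # a single slope, no "(Intercept)" entry
all.equal(coef(fit1), coef(fit2))
```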


4. ## Re: Constant of 1: Why?

I wrote a statisticspedia article about this but it seems that SP is dead. My argument is that even if we have reason to believe the outcome should be 0 when all of our predictors are 0, unless we're 100% sure our model is correct we're most likely hurting ourselves by omitting the intercept.

Really, most of the time we're trying to get a local approximation to the truth when we do regression. A linear model can provide a good local approximation even if we think the truth is slightly more complicated. We shouldn't extrapolate too far outside the range of our covariates. So if 0 is outside that range, why are we forcing information about 0 into our model? And if 0 is inside the range, why not just let the data speak for themselves?

Also, if you don't include an intercept you're allowing the possibility that the model you come up with ends up being worse than just predicting the mean of Y for any input. If we go through the process of building a model and end up doing worse than saying "predict mean(Y) no matter what the covariates are", then I'd say we didn't do a good job of building a model.

You can always play around with this stuff too.

Code:
x <- rep(c(102:106), 10)
# Theoretically when x = 0 then y = 0
y <- -x*(x - 200) + rnorm(length(x), 0, 10)
plot(x, y)

o.without <- lm(y ~ x - 1)
# Oh look x is highly significant.
summary(o.without)
plot(x, resid(o.without))

o.with <- lm(y ~ x)
summary(o.with)
plot(x, resid(o.with))

o.mean <- lm(y ~ 1)
summary(o.mean)
plot(x, resid(o.mean))

# Compare the residual standard errors
summary(o.with)$sigma
summary(o.mean)$sigma
summary(o.without)$sigma

plot(x,y)
abline(o.without, col = "red")
abline(o.with, col = "purple")
abline(o.mean, col = "black")
Just look at the many ways the no-intercept model is horrible, even in this situation where y should be 0 when x = 0.
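One more symptom worth checking for yourself: R computes R-squared differently for no-intercept models (the baseline becomes 0 rather than mean(y)), so the no-intercept fit can report a spectacular R-squared while being clearly the worse model. A sketch with the same kind of simulated data as above:

```r
set.seed(42)
x <- rep(102:106, 10)
y <- -x * (x - 200) + rnorm(length(x), 0, 10)

o.without <- lm(y ~ x - 1)
o.with    <- lm(y ~ x)

# No-intercept R-squared is measured against a baseline of 0,
# not mean(y), so it is inflated and not comparable
summary(o.without)$r.squared
summary(o.with)$r.squared

# The residual standard error tells the honest story
summary(o.without)$sigma
summary(o.with)$sigma
```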


6. ## Re: Constant of 1: Why?

Nice demo; that makes sense. I'm getting the feeling I shouldn't ever do this with the stuff I do. Maybe a statistician would, but I can't think of a time when this would be useful. Further, explaining the results, even if the model were good, would seem difficult at best.

7. ## Re: Constant of 1: Why?

What is the 1 actually doing algebraically? Where is it in y = mx + b? I know the outcome: it gives the line its intercept (in an intercept-only model this is just the mean score). But what is it doing in that good old 9th-grade linear equation?

8. ## Re: Constant of 1: Why?

It's allowing for the b in your model.

Let's say y = {2, 1, 4}, x = {1, 2, 3} and we wanted to fit the model

$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$

Notice that we can write this as

$y_i = \beta_0 \cdot 1 + \beta_1 \cdot x_i + \varepsilon_i$

so that we have the form y = parameter*variable + other_parameter*other_variable + ... + error,

but in this case the first "variable" is just always 1.

A linear model has the form $y = X\beta + \varepsilon$, where $X$ is a design matrix, $\beta$ is a vector of parameters, and $\varepsilon$ is a vector of error terms.

Well, we can reformulate our example in matrix terms as

$\begin{pmatrix} 2 \\ 1 \\ 4 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{pmatrix} \begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix} + \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \end{pmatrix}$

Going through with the matrix multiplication gets us right back to $y_i = \beta_0 \cdot 1 + \beta_1 x_i + \varepsilon_i$, which is where we want to be.
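You can also ask R to show you the design matrix it builds, and see the column of ones appear and disappear. A quick sketch with the same toy data:

```r
y <- c(2, 1, 4)
x <- c(1, 2, 3)

# The design matrix R constructs for y ~ x: a column of ones, then x
model.matrix(y ~ x)

# With - 1 in the formula, the column of ones disappears
model.matrix(y ~ x - 1)
```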


10. ## Re: Constant of 1: Why?

Oh, I get it now. Otherwise the matrix algebra only gives one term (I assume the beta weight; I'll confirm with the matrix example I worked out).

11. ## Re: Constant of 1: Why?

Well, not the beta weight, but it only returns one value. Here's a script in case anyone has never done simple regression with matrix algebra and wants to give it a try:

Code:
##############################################
#     FORMULA FOR REGRESSION PARAMETERS      #
##############################################
#           b = (X'X)^-1 (X'y)               #
##############################################

#DATA
midterm <- c(5,7,7,7,9)
final <- c(4,5,6,8,10)
(SUM <- summary(lm(final~midterm)))
#==============================================
#ASSIGN DATA TO LETTERS TO FIT MATRIX NOTATION
x <- midterm
y <- final
#==============================================
#CONVERT VECTOR x TO MATRIX X WITH PARAMETER
X <- as.matrix(c(rep(1,length(x)),x))
dim(X)<-c(5,2)
X
#==============================================
#DOING THE (X'X) PORTION
M <- t(X) %*% X
M2 <- crossprod(X) #Fast way to do same as M
#==============================================
#DOING THE (M)^-1 PORTION  (THE INVERSE)
Min <- solve(M)
#==============================================
#DOING THE (X'y)
Xp <- t(X) %*% y
Xp2 <- crossprod(X,y)
#==============================================
#DOING THE MATRIX MULTIPLICATION
(b <- Min %*% Xp)
#The upper is the intercept and the lower is the slope
#..............................................
SUM #compare to b

#======================================
#      WHAT IT ALL BOILS DOWN TO
#======================================
#FASTEST WAY WITH MATRIX MULTIPLICATION
(b <- solve(crossprod(X))%*%crossprod(X,y))

#========================================
#PREDICTING
#========================================
#HOW TO PREDICT FOR A MIDTERM SCORE OF 7
xi <- c(1,7) #THE 1 MULTIPLIES THE INTERCEPT
xi %*% b
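Following up on the script above, you can see directly what happens if you leave the column of ones out of X: the same formula returns a single value, the slope, matching R's no-intercept fit. A sketch reusing the same midterm/final data:

```r
midterm <- c(5, 7, 7, 7, 9)
final <- c(4, 5, 6, 8, 10)

# Design matrix WITHOUT the column of ones: just the predictor
X0 <- matrix(midterm, ncol = 1)

# The same formula b = (X'X)^-1 (X'y) now returns only one value
(b0 <- solve(crossprod(X0)) %*% crossprod(X0, final))

# It matches R's no-intercept fit
coef(lm(final ~ midterm - 1))
```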