Regression coefficient for reference groups -- yay or nay?

Hi all,

When I'm working on a variable with 5 different categories and one is the reference group in a logistic regression, I only have 4 coefficients as the reference group doesn't have one, right? Just checking that a regression coefficient of zero isn't a thing...

Asking for a homework assignment!




Ambassador to the humans
Correct. The reason is that we want our models to be identifiable. Basically, we shouldn't be able to have different sets of coefficients that give exactly the same predictions. If every group had its own coefficient and we also had an intercept in the model, it wouldn't be identifiable. To illustrate, imagine taking all of the group coefficients and adding 5 to each of them, then reducing the intercept by 5. The predictions for every group stay the same, but we've changed every parameter in the model. If we can change all of the parameters without affecting any of the predictions, how could we do any sort of meaningful inference about whether a coefficient is different from zero? What would the coefficients even mean in that situation? The answers aren't very straightforward, so traditionally we just force one group to be the reference group and all of those problems go away. Equivalently, you can think of the reference group's coefficient as fixed at zero rather than estimated: the intercept carries that group's log-odds, and the other 4 coefficients are differences from it. There are, of course, other approaches to solving that particular problem.
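
If it helps to see the "add 5, subtract 5" argument in numbers, here is a minimal sketch in Python with NumPy. Everything in it is made up for illustration: the `design_matrix` helper, the group labels, and the coefficient values are assumptions, not output from a fitted model. It only works with the linear predictor (the log-odds), which is where the identifiability problem lives.

```python
import numpy as np

def design_matrix(groups, n_levels, drop_reference):
    """Intercept column plus one 0/1 dummy column per group level (hypothetical helper)."""
    dummies = np.eye(n_levels)[groups]      # one-hot row for each observation
    if drop_reference:
        dummies = dummies[:, 1:]            # level 0 becomes the reference group
    return np.column_stack([np.ones(len(groups)), dummies])

# Made-up data: 10 observations spread over 5 categories
groups = np.array([0, 1, 2, 3, 4, 0, 1, 2, 3, 4])

X_full = design_matrix(groups, 5, drop_reference=False)  # intercept + 5 dummies
X_ref = design_matrix(groups, 5, drop_reference=True)    # intercept + 4 dummies

# Made-up coefficients: intercept first, then one per group
beta = np.array([0.5, 0.0, 1.0, -1.0, 2.0, 0.3])

# Add 5 to every group coefficient and take 5 off the intercept
shifted = beta.copy()
shifted[0] -= 5.0
shifted[1:] += 5.0

# Both parameter sets give identical log-odds for every observation
same_predictions = np.allclose(X_full @ beta, X_full @ shifted)

# The full design has 6 columns but only 5 independent ones;
# dropping the reference dummy leaves 5 columns, all independent
rank_full = np.linalg.matrix_rank(X_full)
rank_ref = np.linalg.matrix_rank(X_ref)

print(same_predictions, X_full.shape[1], rank_full, X_ref.shape[1], rank_ref)
```

The rank check is the linear-algebra version of the same point: with an intercept and all 5 dummies, one column is redundant, so only 5 parameters can be pinned down. Dropping one dummy (the reference group) is one common way to remove the redundancy.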