Must categorical variables always be dummy coded in linear regression?

#1
My question is: must we always dummy code categorical variables in linear regression? What if I wanted to understand the overall effect of a categorical variable? Couldn't I just include it in my regression without dummy coding it? And what if I had ordered categorical data? Could I include it as one variable without dummy coding and get an overall sense of the estimated relationship?

Alternatively, what if I wanted to control for the variable without needing to know the relationship among its various indicators? For instance, let's say I'm running a regression on house sales and I want to control for house color. I don't really care which color sells at a higher value, but I would like to know whether house color is significant overall. Couldn't I just include the categorical variable without dummy coding?
Thanks
 

trinker

ggplot2orBust
#2
How could you include it without dummy coding it? Have you ever worked through a regression by hand, with matrix algebra? If not, this is a valuable exercise. You'll see that it needs numbers. Dummy coding is merely a way of tricking the regression into accepting categorical variables (well, maybe it's more sophisticated than tricking).
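To make the "it needs numbers" point concrete, here is a minimal Python sketch of dummy (treatment) coding done by hand. The house colors and the helper name `dummy_code` are made up for illustration, and taking the first level in sorted order as the reference category is just an assumption of this sketch.

```python
def dummy_code(labels):
    """Return (levels, rows): one design-matrix row per label.

    Each row is [1, d_1, ..., d_{k-1}]: an intercept column followed by
    one 0/1 indicator per non-reference level. The reference level is
    the first one in sorted order (an arbitrary choice for this sketch).
    """
    levels = sorted(set(labels))
    non_reference = levels[1:]
    rows = [[1] + [1 if label == level else 0 for level in non_reference]
            for label in labels]
    return levels, rows


# Hypothetical data: the color of five houses.
colors = ["red", "blue", "white", "blue", "red"]
levels, X = dummy_code(colors)

print("levels:", levels)  # the first level listed is the reference
for color, row in zip(colors, X):
    print(color, row)
```

A variable with k levels turns into k - 1 indicator columns, and those numeric columns are what the matrix algebra actually works with.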
 

Karabiner

TS Contributor
#3
"What if I had ordered categorical data"
OK, that means something like "very small - small - bigger - very big".
To handle this more easily, people might use numbers, e.g.
very small = 1, small = 12, bigger = 1000, very big = 100000000.
Or anything else (e.g. 1, 2, 3, 4), as long as the ranking is preserved.
Of course, some people confuse such rank labels with numbers that
can be treated as an interval scale (so in a regression they would
use 1, 12, 1000 and 100000000 as if they were real measurements),
but don't do that yourself.
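As a rough illustration of why this matters, the sketch below fits a simple least-squares line to the same outcome under two codings that preserve the same ranking. The data values and the helper name `r_squared` are invented for this example; only the two sets of codes come from the text above.

```python
def r_squared(x, y):
    """R^2 of the simple least-squares line y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy ** 2 / (sxx * syy)


# Hypothetical data: two observations per size category.
sizes = ["very small", "very small", "small", "small",
         "bigger", "bigger", "very big", "very big"]
y = [10, 12, 19, 21, 30, 32, 41, 39]

# Two codings with the same ranking but very different spacing.
coding_a = {"very small": 1, "small": 2, "bigger": 3, "very big": 4}
coding_b = {"very small": 1, "small": 12, "bigger": 1000,
            "very big": 100000000}

r2_a = r_squared([coding_a[s] for s in sizes], y)
r2_b = r_squared([coding_b[s] for s in sizes], y)

print(f"R^2 with codes 1, 2, 3, 4:             {r2_a:.3f}")
print(f"R^2 with codes 1, 12, 1000, 100000000: {r2_b:.3f}")
```

The ranking is identical in both cases, yet the fitted line and its R^2 change with the spacing of the codes, which is exactly the information an ordinal variable does not carry.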

With kind regards

K.