I believe the answer to your confusion lies in the distinction between the sample and the population. In the population the dependent variable is approximately normal, whereas in a sample you are unlikely to see a normal distribution. Capital Y refers to the population, whereas lowercase y refers to a sample. You have a capital Y, so you're referring to the population: Y ~ N(Xb, sigma)
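
If it helps to see this, here is a rough simulation sketch in Python (standard library only). Everything in it is my own assumption for illustration: the parameter values, the helper names simulate_y, skewness and residual_skewness, and reading sigma as a standard deviation. It draws Y from N(Xb, sigma) and checks how far the skewness of the errors Y - Xb strays from 0 (the value for a normal distribution) in small versus large samples.

```python
import random
import statistics

# Hypothetical population parameters, chosen only for illustration.
INTERCEPT, SLOPE, SIGMA = 2.0, 0.5, 1.0  # sigma treated as a standard deviation

rng = random.Random(42)  # fixed seed so the run is repeatable


def simulate_y(x_values):
    """Draw one Y per x from Y ~ N(INTERCEPT + SLOPE * x, SIGMA)."""
    return [rng.gauss(INTERCEPT + SLOPE * x, SIGMA) for x in x_values]


def skewness(values):
    """Sample skewness; close to 0 for a symmetric shape such as the normal."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


def residual_skewness(n):
    """Skewness of the errors Y - Xb in one simulated sample of size n."""
    x = [rng.uniform(0, 10) for _ in range(n)]
    y = simulate_y(x)
    errors = [yi - (INTERCEPT + SLOPE * xi) for xi, yi in zip(x, y)]
    return skewness(errors)


# Repeat the experiment many times for a small and a large sample size.
small = [residual_skewness(15) for _ in range(200)]
large = [residual_skewness(1500) for _ in range(200)]

# How much the skewness bounces around 0 from sample to sample.
small_spread = statistics.pstdev(small)
large_spread = statistics.pstdev(large)

print("spread of skewness, n = 15:  ", small_spread)
print("spread of skewness, n = 1500:", large_spread)
```

The population model is exactly normal in every run; only the sample size changes. The small samples should give skewness values scattered much more widely around 0, which is why a single small sample can look non-normal even when the population assumption holds.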

I think this will clear up your confusion, but if not, let us know.

trinker