yours is a clear example of what is called the "unit of analysis problem" in the social/behavioural/health sciences. i would advise against the averaging procedure because you'd fall into what is called an ecological fallacy: drawing conclusions about individuals from relationships observed between group averages. it's a mathematical fact that the variance of group means is considerably smaller than the variance of individuals, so correlations computed on the averages are biased (typically inflated) as estimates of the individual-level correlation.
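if you want to see the effect of averaging for yourself, here is a minimal simulation sketch. everything in it is made up for illustration (200 hypothetical teachers, 25 students each, a shared teacher effect with sd 1 and student-level noise with sd 2), so treat the numbers as a demonstration of the mechanism, not as anything about your data.

```python
import random
import statistics


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate(n_teachers=200, n_students=25, seed=0):
    """Simulate two student-level scores that share only a teacher effect."""
    rng = random.Random(seed)
    x_all, y_all, x_means, y_means = [], [], [], []
    for _ in range(n_teachers):
        u = rng.gauss(0, 1)  # hypothetical teacher effect, shared by x and y
        xs = [u + rng.gauss(0, 2) for _ in range(n_students)]
        ys = [u + rng.gauss(0, 2) for _ in range(n_students)]
        x_all.extend(xs)
        y_all.extend(ys)
        x_means.append(statistics.fmean(xs))  # the "averaging procedure"
        y_means.append(statistics.fmean(ys))
    return {
        "var_individuals": statistics.variance(x_all),
        "var_means": statistics.variance(x_means),
        "r_individuals": pearson(x_all, y_all),
        "r_means": pearson(x_means, y_means),
    }


result = simulate()
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

under these assumptions the individual-level correlation should sit around 0.2 in expectation, while the correlation between teacher means should be far higher, simply because averaging removes most of the student-level noise.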

the most appropriate way of dealing with this kind of situation, where your data are nested, is through a specific extension of the general linear model called hierarchical linear models or multilevel modeling. if you read the "Level" section of the article you'll see the same situation you're dealing with here: pupils nested within classes (or teachers, in your case). so you'd have a 2-level model where students are your level-one units and teachers your level-two units, with the students' scores modeled at level one. and even if you have unequal class sizes or missing data, multilevel models can handle that quite nicely...
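to make the 2-level structure concrete, a random-intercept model could be written roughly as below. the symbols are my own assumptions, not something from your post: y_ij is the score of student i taught by teacher j, x_ij is some student-level predictor, and w_j is some teacher-level predictor.

```latex
% level 1: students within teachers
y_{ij} = \beta_{0j} + \beta_{1} x_{ij} + e_{ij},
\qquad e_{ij} \sim N(0, \sigma^{2})

% level 2: teachers
\beta_{0j} = \gamma_{00} + \gamma_{01} w_{j} + u_{0j},
\qquad u_{0j} \sim N(0, \tau_{00})

% combined model, substituting level 2 into level 1
y_{ij} = \gamma_{00} + \gamma_{01} w_{j} + \beta_{1} x_{ij} + u_{0j} + e_{ij}
```

the teacher-level residual u_0j is what captures the fact that students of the same teacher resemble each other, which is exactly the information the averaging procedure throws away.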

hope it helps!