Can fixed effects models deal with nested data?


I'm reading a paper now:

English teacher candidates developing dialogically organized instructional practices (2013)
[there are five authors, so I'm not going to give the full citation]

The authors state (LINK to a photo of that page):

To address the likely selection bias that results from nesting of lessons within teachers, we used fixed-effect models to isolate the within teacher variance in classroom instruction.
This is nested data (level 1: lesson plan; level 2: teacher). My question: is this statement sensible? I thought a random effects model would address the nested design. Their regression tables show OLS models and then, in the right-hand column, fixed effects coefficients. Here is an image of the table:

Am I incorrect to assume they were attempting to use a multilevel model? If they use fixed effects, are they really addressing the nesting? From my meta-analysis course, it seems a fixed effects model assumes equal variance across teachers. Maybe they're using weighted least squares?

If I need to provide more info please ask. I can't upload the entire paper without breaking ethical sharing.


So I've gathered from my reading that the authors did not approach the nested data appropriately. The fixed effects model does not deal with the likely correlated error terms. I don't think the authors used any sort of multilevel analysis; instead they seem to have used some sort of ANOVA. I gathered this from my understanding of fixed effects in meta-analysis and from this Raudenbush chapter (last page). Unfortunately, they are not transparent in their methods. If anyone would care to take a further look, PM me and I'll send you the paper that way.


Less is more. Stay pure. Stay poor.
Without reviewing any of the posted materials: could it be that they included a variable for the clusters as an independent variable and assumed it controlled for the cluster effects (and called that fixed effects)? If so, this may just be a poorly worded description of the concept. Or they simply misused the concepts.


Cookie Scientist
Have not looked at the paper that you mention, but based on the brief info you gave, it looks to me like the authors' use of a fixed effects model is entirely appropriate here. The whole purpose of fixed effects models is to handle nesting/clustering within data -- it is their raison d'être. They handle the nesting in a similar (but not equivalent) way to how random effects models do, namely, by adjusting the predicted values of the model separately for each cluster. In the case of fixed effects models, this is achieved by literally adding a categorical predictor (or predictors) into the model to make conditional predictions for each cluster.
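To make the "categorical predictor" idea concrete, here is a minimal sketch on simulated data. Everything in it is an assumption made for illustration (the function names, cluster sizes, and slopes are invented and have nothing to do with the paper's data). It fits the same slope three ways: pooled OLS that ignores teachers, OLS with one dummy column per teacher, and OLS on teacher-demeaned data. The last two should agree, since a dummy per cluster leaves only within-cluster variation to estimate the slope from.

```python
import numpy as np

def simulate(n_clusters=30, n_per=20, true_within=1.0, true_between=-2.0, seed=0):
    """Simulate lessons nested in teachers (hypothetical setup).

    Cluster intercepts are correlated with the cluster mean of x, so the
    between-cluster slope differs from the within-cluster slope.
    """
    rng = np.random.default_rng(seed)
    cluster = np.repeat(np.arange(n_clusters), n_per)
    cluster_mean_x = rng.normal(0.0, 2.0, n_clusters)
    x = cluster_mean_x[cluster] + rng.normal(0.0, 1.0, cluster.size)
    intercept = (true_between - true_within) * cluster_mean_x
    intercept = intercept + rng.normal(0.0, 0.5, n_clusters)
    y = intercept[cluster] + true_within * x + rng.normal(0.0, 1.0, cluster.size)
    return cluster, x, y

def pooled_ols_slope(x, y):
    """Slope from ordinary least squares that ignores the clustering."""
    X = np.column_stack([np.ones_like(x), x])
    return float(np.linalg.lstsq(X, y, rcond=None)[0][1])

def dummy_variable_slope(cluster, x, y):
    """Slope from OLS with one indicator column per cluster (no global intercept)."""
    dummies = (cluster[:, None] == np.unique(cluster)[None, :]).astype(float)
    X = np.column_stack([dummies, x])
    return float(np.linalg.lstsq(X, y, rcond=None)[0][-1])

def demeaned_slope(cluster, x, y):
    """Slope from regressing cluster-demeaned y on cluster-demeaned x."""
    counts = np.bincount(cluster)
    x_c = x - (np.bincount(cluster, weights=x) / counts)[cluster]
    y_c = y - (np.bincount(cluster, weights=y) / counts)[cluster]
    return float(x_c @ y_c / (x_c @ x_c))

cluster, x, y = simulate()
pooled = pooled_ols_slope(x, y)
lsdv = dummy_variable_slope(cluster, x, y)
within = demeaned_slope(cluster, x, y)

print("pooled OLS slope:       ", round(pooled, 3))
print("dummy-variable slope:   ", round(lsdv, 3))
print("demeaned (within) slope:", round(within, 3))
```

In this simulation the pooled slope is pulled toward the between-teacher relationship, while the dummy-variable and demeaned fits both recover the within-teacher slope. A mixed model would use both sources of variation, which is why it matters whether the two slopes differ.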

I think the following forthcoming paper does a great job talking about these issues, and also discussing why clever use of random effects models is still pretty much always your best bet (although, again, there is nothing incorrect per se about using fixed effects models, either in the particular case that you mention or in general):


Chat on this subject from the chatbox:

Date	User Name	Chats
3/10/13 2:48 PM	Jake	okay, read the thread and posted in it
3/10/13 3:01 PM	trinker	Thank you both
3/10/13 3:04 PM	trinker	I'm trying to connect this to what I already know via meta analysis. Does the fixed effect assume equal variances whereas the random effects does not?
3/10/13 3:07 PM	Jake	they both assume equal within-cluster variance of y if that's what you mean
3/10/13 3:09 PM	Jake	one big difference is that fixed effects models analyze ONLY the within-cluster effects, whereas mixed models look at variance both within and between clusters. this is what gives mixed models their greater efficiency, however, it also means that you have to be careful to make sure the within-cluster effects are not too different from the between-cluster effects
3/10/13 3:09 PM	Jake	the paper i linked to in the thread talks about this issue in detail. its really quite a good paper
3/10/13 3:14 PM	Dason	random effects also puts a distributional assumption on cluster effects whereas fixed effects doesn't make any distributional assumption for those
3/10/13 3:15 PM	Jake	right. another source of efficiency for mixed models
3/10/13 3:18 PM	Dason	I don't work with mixed models as much as I think you do - I'm wondering how often, in practice, people check the distributional assumption of their random effects
3/10/13 3:18 PM	Dason	because I know there are complications associated with that
3/10/13 3:20 PM	Jake	i dont think too often really. in textbooks they usually go over checking that, but in practice i think people usually skip it. i usually skip it to be honest
3/10/13 3:20 PM	Dason	what do the textbooks offer as a way to check that assumption?
3/10/13 3:21 PM	Jake	e.g., assessing the alpha on a normal QQ plot
3/10/13 3:21 PM	Dason	using the blups as a plugin estimate I'm assuming
3/10/13 3:21 PM	Jake	right, by alphas i mean BLUPs
3/10/13 3:22 PM	Dason	because I've seen simulation evidence that even when you have normal errors and normal random effects that the qqplot doesn't quite look as nice as we would like
3/10/13 3:22 PM	Dason	one of the other grad students in the department I think is working on ways around this problem
3/10/13 3:23 PM	Jake	i find that the qqplot of residuals almost always looks funky, but i assume it is because really it comes from a mixture of many different normals
3/10/13 3:23 PM	Jake	as for qqplot of random effects, like i said, i usually don't even look at it =\
3/10/13 3:24 PM	Jake	my assumption is that the normality of random effects assumption matters less and less as number of clusters increases, similar to normality of errors in OLS
3/10/13 3:25 PM	Dason	Like I said I haven't done too much with this but it seems like most of the theory is derived under the assumption that we know the variance terms. In which case blups really are our best estimates and if we do know them and all the effects are normal then the distribution of the blups is normal... but in practice we estimate everything and it becomes a huge horrible nonlinear problem...
3/10/13 3:25 PM	Jake	i dont actually know that to be true though
3/10/13 3:26 PM	Dason	@Jake - That seems reasonable - I'm guessing the variance term estimates are consistent even if we make an incorrect distributional assumption and typically we just want that variance estimate to do actual inference
3/10/13 3:27 PM	Jake	yes typically. although there are some interesting cases where we want to do inference on the variance estimates too
3/10/13 3:27 PM	Jake	i like to talk about those cases when i present these models to my colleagues. its a whole new type of testing that we arent used to
3/10/13 3:27 PM	Jake	in my field that is
3/10/13 3:28 PM	Dason	I'm guessing the scaled variance estimates are asymptotically chi-squared as the number of clusters increases regardless of the distributional assumptions
3/10/13 3:29 PM	Jake	meh, sure ;)
3/10/13 3:29 PM	Dason	then again it's probably trickier than I imagine since we have to estimate a lot of quantities that we typically take as 'given' when deriving most of that theory...
3/10/13 3:30 PM	removed	removed
3/10/13 3:30 PM	Jake	yes removed i think so
3/10/13 3:31 PM	Dason	@removed - that's typically the case
3/10/13 3:31 PM	Jake	hglm?
3/10/13 3:31 PM	Dason	but it doesn't have to be that way.
3/10/13 3:31 PM	removed	removed
3/10/13 3:32 PM	Dason	well you can fit other models
3/10/13 3:32 PM	Dason	I think the hard part is trying to justify why you fit a different model
3/10/13 3:32 PM	removed	removed
3/10/13 3:33 PM	Jake	its hard for me to imagine a lot of cases where you have good reasons for strongly believing the random effects should be something other than normal
3/10/13 3:33 PM	Dason	especially since the main advantage of fitting the mixed model is that you're no longer treating everything as independent and are adequately modeling the covariance.
3/10/13 3:33 PM	Dason	I can think of cases where using something else might be reasonable
3/10/13 3:34 PM	Dason	I've even fit cases where we used something else
3/10/13 3:34 PM	Jake	like what
3/10/13 3:34 PM	Dason	gamma random effects for poisson response
3/10/13 3:35 PM	Jake	i see
3/10/13 3:36 PM	Dason	I can't remember the exact context but we had multiple counts from certain machines... or maybe it was missiles or something. It was real data though.
3/10/13 3:36 PM	Jake	poisson response usually uses log link function right?
3/10/13 3:36 PM	Dason	Well this was just a hierarchical model - no real need to fit a link function since we didn't have other covariates
3/10/13 3:37 PM	removed	removed
3/10/13 3:37 PM	Jake	okay, sure, but you could imagine that with a log link function you wouldnt need to assume a truncated distribution for the random effects, right?
3/10/13 3:37 PM	Dason	but sure we could have used a log link and fit normal random effects
3/10/13 3:38 PM	Jake	so with an appropriate choice of link function isnt it probably the case that a normal distribution of random effects is not much of a stretch?
3/10/13 3:39 PM	Dason	maybe
3/10/13 3:39 PM	Dason	for the most part the log-normal and the gamma can look fairly similar
3/10/13 3:40 PM	Dason	but there are some gammas that can't be considered to look similar to a log normal
3/10/13 3:40 PM	Jake	but those gammas probably are plausible distributions of poisson parameters, no?
3/10/13 3:40 PM	Jake	aren't* i mean
3/10/13 3:41 PM	Dason	They could be - I don't see why not.
3/10/13 3:41 PM	Jake	i guess i dont know what kind of gamma shape youre thinkin of
3/10/13 3:42 PM	removed	removed
3/10/13 3:44 PM	removed	removed
3/10/13 3:49 PM	Jake	removed i looked on amazon and didnt see the book you mentioned. got link?
3/10/13 3:49 PM	Jake	found it -- glm with random effects
3/10/13 3:52 PM	removed	removed
3/10/13 3:53 PM	removed	removed
3/10/13 3:55 PM	Dason	@Jake - a gamma with alpha <= 1 isn't fit *too* well by a log normal. Probably adequately enough but it depends on what you're doing.
3/10/13 4:02 PM	trinker	Folks had to step out for an hour. Reading through your responses now.
3/10/13 4:09 PM	removed	removed
3/10/13 4:11 PM	trinker	@bg I want to sometimes test a package on various versions of R or see what a user experiences on that OS.
3/10/13 4:16 PM	Jake	dason, okay sure, there are some gamma shapes a log-normal can't emulate too great and probably vice versa. but in terms of having a better statistical model, is there any a priori, theoretical reason to believe that the model based on gamma random effects is in any way *better* than the log-normal model?
3/10/13 4:17 PM	Jake	removed, i often feel the same way about researchers wanting to have their pudding and eat it too when it comes to choosing fixed vs. random effects
3/10/13 4:18 PM	Jake	mainly the reason i am harping on this dason is that very often when i present these models to audiences in my field who are not familiar with them, they bring up the issue of assuming normal random effects and seem to want to imply that it is a big limitation or restriction
3/10/13 4:19 PM	Jake	and i think, even setting aside the fact that we dont *have* to make that assumption, that really it is not a big limitation at all, in fact usually the assumption of normal random effects seems quite reasonable
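
The gamma versus log-normal point from the chat can be checked numerically. The sketch below is only an illustration under assumptions of my own: the log-normal is matched to the gamma by mean and variance (the chat does not say how the two would be compared), the scale is fixed at 1, and the function names are invented. It compares the two densities near zero for a gamma with shape 0.5, and near the mode for a gamma with shape 5.

```python
import math

def gamma_pdf(x, shape, scale):
    """Density of a gamma distribution with the given shape and scale at x > 0."""
    return x ** (shape - 1) * math.exp(-x / scale) / (math.gamma(shape) * scale ** shape)

def lognormal_pdf(x, mu, sigma):
    """Density of a log-normal distribution with log-scale parameters mu, sigma at x > 0."""
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))

def matched_lognormal(shape, scale):
    """Log-normal parameters with the same mean and variance as the gamma.

    Gamma mean is shape*scale and variance is shape*scale**2, so the squared
    coefficient of variation is 1/shape (moment matching is an assumption here).
    """
    sigma2 = math.log(1.0 + 1.0 / shape)
    mu = math.log(shape * scale) - sigma2 / 2.0
    return mu, math.sqrt(sigma2)

# Shape below 1: the gamma density blows up at zero, the log-normal goes to zero.
mu_small, sigma_small = matched_lognormal(0.5, 1.0)
ratio_near_zero = gamma_pdf(1e-4, 0.5, 1.0) / lognormal_pdf(1e-4, mu_small, sigma_small)

# Shape well above 1: the two densities are of similar size near the mode.
mu_large, sigma_large = matched_lognormal(5.0, 1.0)
ratio_near_mode = gamma_pdf(4.0, 5.0, 1.0) / lognormal_pdf(4.0, mu_large, sigma_large)

print("gamma/log-normal density ratio at x=1e-4, shape 0.5:", ratio_near_zero)
print("gamma/log-normal density ratio at x=4, shape 5:    ", ratio_near_mode)
```

The mismatch for shape at or below 1 comes from behaviour at the origin: the gamma density stays positive (shape 1) or diverges (shape below 1) as x goes to zero, while any log-normal density goes to zero there. Whether that matters depends on how much of the data sits near zero.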