Guidelines on Unequal Sample Sizes in Multilevel Models

One of the advantages of multilevel models is their tolerance of heterogeneous variances between groups (or between points in time, for multilevel models of change). And one of the main problems with unequal group sizes is that the imbalance can create heterogeneous variance.

I suspect this means that multilevel models are more tolerant of differences in sample sizes than, say, ANOVAs. However, does anyone know _how_ tolerant they are? I've looked around, but can't find good guidelines or suggestions for handling unequal sample sizes in multilevel models.

For example, how unequal is too unequal? Does it affect other assumptions or tests? E.g., if more of the variance (homogeneous or heterogeneous) comes from one group than from the other, does that affect how the results should be interpreted?
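
To make the question concrete, here is a minimal sketch of the kind of imbalance I have in mind. It uses plain Python, and the group sizes (200 and 8) and the common standard deviation of 1 are made-up values for illustration only. Both groups are drawn from the same distribution, so any gap between their sample variances comes from sampling noise rather than from true heterogeneity.

```python
import random
import statistics


def simulate_unbalanced(n_large=200, n_small=8, sd=1.0, seed=1):
    """Draw two groups with the same true SD but very different sizes.

    All defaults are hypothetical values chosen for illustration.
    """
    rng = random.Random(seed)
    large = [rng.gauss(0.0, sd) for _ in range(n_large)]
    small = [rng.gauss(0.0, sd) for _ in range(n_small)]
    return large, small


large, small = simulate_unbalanced()

# Unbiased sample variances (n - 1 in the denominator).
var_large = statistics.variance(large)
var_small = statistics.variance(small)

print(f"n = {len(large)}, sample variance = {var_large:.2f}")
print(f"n = {len(small)}, sample variance = {var_small:.2f}")
print(f"size ratio = {len(large) / len(small):.0f}:1")
```

Changing `n_small` or `seed` shows how much the small group's sample variance can move around even though the true variances are equal, which is part of why I am unsure where "too unequal" begins.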