Analyzing longitudinal unbalanced data - mixed effect quantile regression

Hi Guys,

For my master's thesis I have to analyze a large longitudinal data set in which company data are remeasured every year. Not all companies have the same number of measurements, which makes the data unbalanced.

Since longitudinal data come with within-subject dependence, I cannot use ordinary OLS regressions. Therefore, I was thinking of running mixed-effects linear and logistic regressions.

However, for my additional analysis I was planning to perform a quantile regression. The problem is that, because the data are longitudinal, I will have to do a mixed-effects quantile regression, and I have no idea how that works.

My question is whether any of you has experience with this method and could enlighten me as to how best to approach it.

Thank you in advance!


Not a robit
Have you ever conducted a quantile regression before? That would be a first step. I have not run a repeated-measures quantile regression before, but I know it is an option. You have multiple observations per company, and this is why you want to do MLM, correct?

What do you mean by "not all companies have the same number of measurements"?

Yes, correct, that's why I want to do a mixed-effects/multilevel model. By "not all companies have the same number of measurements/observations" I mean the following: my dataset has 6000 companies, some of which have measurements in, for example, 2001, 2002 and 2003, whilst others have measurements in several or even every year between 2001 and 2018. In other words, some companies have 3 observations while others have 5, 8 or 17. This makes the data unbalanced.
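To make "unbalanced" concrete, here is a minimal Python sketch that counts observations per company and checks whether every company has the same number. The company IDs, years, and function names are made up for illustration and have nothing to do with the actual dataset.

```python
from collections import Counter

# Hypothetical (company_id, year) records; purely illustrative values
records = [
    ("A", 2001), ("A", 2002), ("A", 2003),
    ("B", 2001), ("B", 2002), ("B", 2003), ("B", 2004), ("B", 2005),
    ("C", 2010),
]

def observations_per_company(rows):
    """Count how many yearly measurements each company has."""
    return Counter(company for company, _year in rows)

def is_balanced(rows):
    """A panel is balanced when every company has the same number of observations."""
    counts = observations_per_company(rows)
    return len(set(counts.values())) <= 1

counts = observations_per_company(records)
print(dict(counts))
print(is_balanced(records))
```

With differing counts per company, `is_balanced` returns `False`, which is the situation described above.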

To be fair, I haven't performed an ordinary quantile regression before, but I have looked into it and so far I get it conceptually. However, the mixed-effects version seems a lot more complex, and I don't really know how to tackle it. Would you recommend that I first run an ordinary quantile regression? And do you maybe have an idea of how to approach the mixed-effects quantile regression?



Not a robit
Well, a general rule says you may need quite a few groups to conduct MLM, say 70. You have many more. Next, I can never recall how many within-group points are recommended, but I would guess it is comparable to linear regression.

A question for you: say you are able to run the MLM quantile regression model, what is your hypothesis? I ask because you have multiple time points and multiple groups. Yes, you have quite a bit of data, but the number of possible test comparisons is very high, and the results would likely need to be severely adjusted for false discovery.
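For reference, one common false-discovery adjustment is the Benjamini-Hochberg procedure. Below is a minimal Python sketch; the function name and the p-values are illustrative assumptions, not results from any analysis in this thread.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected
    under the Benjamini-Hochberg step-up procedure at level alpha."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    # Reject every hypothesis whose rank is at most max_k
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected

# Made-up p-values from five hypothetical comparisons
p_values = [0.045, 0.001, 0.035, 0.2, 0.008]
print(benjamini_hochberg(p_values))
```

In this made-up example four of the five p-values are below 0.05, but only two survive the adjustment, which is the kind of shrinkage to expect when many comparisons are tested.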

I have not looked into running MLM quantile regression for repeated measures before. I can try to help where possible, but that may be the extent of it. I can't think of anyone regularly on this forum who would have used that approach. I have used quantile regression and also MLM, but not together. Just curious: why do you want to use quantile regression in lieu of linear? What program do you plan to use? If you have any online resources, feel free to post links, and if I have time I can try to skim them. I will be traveling for the next 10 days, though, so I may be slow to give feedback.

Thanks again for your reply, very much appreciated!

To answer your question, I want to use the quantile regression to complement my basic analysis, for which I am indeed using mixed-effects regressions, both linear and logistic. The reason for the quantile regression is to look at different quantiles of my dependent variable, firm performance. Getting information on mean firm performance is one thing, but getting additional information on high- or low-performing firms specifically adds enormous value to my analysis. I haven't formed my precise hypothesis yet, but I could imagine it being something like: "the effect of faultlines (my independent variable) on firm performance may vary depending on an organization's level of performance".
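To illustrate what "looking at quantiles instead of the mean" amounts to: a quantile is the value that minimizes the so-called check (pinball) loss, and quantile regression minimizes that same loss with a regression line in place of a single constant. Below is a minimal Python sketch for the constant-only case; the performance figures and function names are made up for illustration.

```python
def pinball_loss(residual, tau):
    """Check loss: weights positive residuals by tau and negative ones by (1 - tau)."""
    return tau * residual if residual >= 0 else (tau - 1) * residual

def best_constant(values, tau):
    """Return the observed value that minimizes the total check loss.
    Searching only the observed values is enough, because a minimizer
    always exists among the data points."""
    return min(values, key=lambda c: sum(pinball_loss(v - c, tau) for v in values))

# Hypothetical firm performance figures; illustrative only
performance = [2, 4, 5, 7, 9, 12, 15, 21, 40]

for tau in (0.25, 0.5, 0.75):
    print(tau, best_constant(performance, tau))
```

Changing `tau` moves the fitted value from the low performers to the high performers. A quantile regression does the same thing while also letting the effect of a predictor differ at each `tau`.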

If it turns out that this analysis really is far too complicated, I will have no choice but to leave it out. However, I am willing to put time and effort into making it work, and I still have 2 months left until my deadline :)

For these analyses, I am planning to use Stata. I've heard that Stata has a plugin available that allows for a mixed effect quantile regression to be conducted. Hopefully, I'm not wrong haha.

I have indeed found some literature on this. However, for me, it is extremely difficult to interpret these since I don't have a mathematical background. I'll attach the literature to my post.

Thanks again for your efforts, very kind and highly appreciated!!



Not a robit
If I get a chance I will check out the attachments. Yes, I have seen quantile-related approaches in Stata, since historically economists have used it.

Good luck, and update this thread along the way. I may need to figure it out myself one day. PS: I'm not a Stata user; I'm mostly in SAS and R.

It's been a month now and I've learned a lot of new things. In Stata I've tried working with xtqreg, a command that can seemingly handle longitudinal data. The only thing I don't really understand is how I'm supposed to specify a random intercept model with it. I've attached my output to this post. Do you think this output makes sense?

Attachments: Quantile 1.png, Quantile 25.png, Quantile 75.png, Quantile 9.png