Strong linear correlation in residual plot of LME - Reasons and solutions?

Hi everyone,

I am currently analysing a linear mixed model (4 fixed effects and 1 random effect) and observed a strong linear correlation in the residual plot (see below). Does anyone have suggestions on how to adapt the model (transforming variables, introducing link functions, etc.)? Also, I don't really understand how this can happen...

Really appreciate your help!




Not a robit
Can you tell us more about your model, including context and variables? What program are you using, and did it produce this graph or did you?
I am using MATLAB with a self-written script. The context and variables are unfortunately quite complicated, and explaining them in detail would open up more questions than it would close, I am afraid. In short, I am predicting a clinical outcome from continuous electrophysiological predictor variables, with repeated measures for each patient (the random effect). I hope that helps a little.
It's strange, all right. You would expect the linear part just to be absorbed into the model. You can get plots like this if you have inadvertently set the constant to 0 or some other fixed value, for example if you have forced a fit through the origin when it didn't really want to go there.
Or left it out of the model altogether?
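To illustrate the point about the constant, here is a minimal sketch in plain Python (not the poster's MATLAB script). All numbers are made up for illustration: a simple one-predictor regression with a true intercept of 40 is fitted once through the origin and once with an intercept, and the correlation between fitted values and residuals is compared.

```python
import random
from statistics import mean


def pearson(a, b):
    """Pearson correlation of two equally long lists."""
    ma, mb = mean(a), mean(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = (sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b)) ** 0.5
    return num / den


rng = random.Random(0)
# Hypothetical data: true intercept 40, true slope 2, small noise.
x = [rng.uniform(0, 10) for _ in range(200)]
y = [40 + 2 * xi + rng.gauss(0, 1) for xi in x]

# Fit forced through the origin (constant fixed at 0).
slope0 = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)
fitted0 = [slope0 * xi for xi in x]
resid0 = [yi - fi for yi, fi in zip(y, fitted0)]

# Ordinary least squares fit with an intercept.
mx, my = mean(x), mean(y)
slope1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum(
    (xi - mx) ** 2 for xi in x
)
intercept1 = my - slope1 * mx
fitted1 = [intercept1 + slope1 * xi for xi in x]
resid1 = [yi - fi for yi, fi in zip(y, fitted1)]

r_no_const = pearson(fitted0, resid0)
r_with_const = pearson(fitted1, resid1)
print("residual vs fitted correlation, no constant:  ", round(r_no_const, 3))
print("residual vs fitted correlation, with constant:", round(r_with_const, 3))
```

In this toy setup the fit without a constant leaves a strong linear trend in the residuals, while the fit with a constant leaves none by construction. Whether this is what happened in the model discussed here cannot be told from the thread alone.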
Thanks a lot for your answer, katxt! I think you are right, and I might have found the problem. I chose a mixed model with 'patient' as a random effect to account for within-patient repeated measures in the training cohort, but then predicted the outcome on an independent cohort of new patients. The model contains no information about the random effect for these new patients, so I am predicting from the fixed effects only and leaving the random effect out. When I look at the residuals of the original training model (which includes the random effect as well), the residual plot looks fine, without any pattern. While I still don't understand how this can introduce such a strong effect, I have the feeling it might have something to do with the fact that the number of measurements correlates with the outcome (patients with a strong response had fewer measurements than weak responders). Does that make sense, and if so, any suggestions on how to handle it?

Again, thanks a lot for your help!
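The effect described above can be reproduced in a small simulation. This is only a sketch under stated assumptions, in plain Python rather than MATLAB: it assumes a random-intercept model, uses the true (hypothetical) parameter values instead of fitting anything, and assumes the residuals are plotted against the observed outcome, which the thread does not confirm. The variable names and all numbers are invented.

```python
import random
from statistics import mean


def pearson(a, b):
    """Pearson correlation of two equally long lists."""
    ma, mb = mean(a), mean(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = (sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b)) ** 0.5
    return num / den


rng = random.Random(1)
observed, marginal_pred, conditional_pred = [], [], []

# Hypothetical cohort: 50 patients, 5 repeated measures each.
for patient in range(50):
    u = rng.gauss(0, 10)  # patient-specific random intercept (large variance)
    for _ in range(5):
        x = rng.uniform(0, 10)
        y = 5 + 1.0 * x + u + rng.gauss(0, 1)
        observed.append(y)
        marginal_pred.append(5 + 1.0 * x)         # fixed effects only
        conditional_pred.append(5 + 1.0 * x + u)  # fixed effects + random intercept

resid_marginal = [y - p for y, p in zip(observed, marginal_pred)]
resid_conditional = [y - p for y, p in zip(observed, conditional_pred)]

r_marginal = pearson(observed, resid_marginal)
r_conditional = pearson(observed, resid_conditional)
print("residual vs outcome, fixed effects only: ", round(r_marginal, 3))
print("residual vs outcome, with random effect:", round(r_conditional, 3))
```

When the between-patient variance is large relative to what the fixed effects explain, the fixed-effects-only residuals are dominated by the patient intercepts and therefore track the outcome almost linearly, whereas the residuals that include the random effect do not.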


Not a robit
Great. My follow-up was going to be about the disconnect between the relatively high R^2 and the shape of the residuals. But it seems it may be resolved :)
The more I look at it the less likely it seems. If only the outcomes of the new cohort were about 40 higher, all would be well. Try adding a constant to the model when you try the new cohort. See what happens.
Is it just a coincidence that the spread of residuals is about the same as the spread of Outcomes? kat
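Both checks suggested here (a constant shift, and the spread of residuals versus the spread of outcomes) can be computed directly from the observed and predicted values. The sketch below is plain Python with a hypothetical helper name, `residual_summary`, and made-up numbers; it is not taken from the poster's script.

```python
from statistics import mean, stdev


def residual_summary(observed, predicted):
    """Return (mean residual, sd of residuals / sd of observed outcomes)."""
    resid = [y - p for y, p in zip(observed, predicted)]
    shift = mean(resid)                        # constant offset of the predictions
    spread_ratio = stdev(resid) / stdev(observed)
    return shift, spread_ratio


# Hypothetical new-cohort data: predictions barely vary and sit too low.
observed = [10, 20, 30, 40, 50]
predicted = [-11, -10, -9, -10, -10]

shift, spread_ratio = residual_summary(observed, predicted)
print("mean residual (constant to add):", shift)
print("sd(residuals) / sd(outcomes):   ", round(spread_ratio, 3))
```

A mean residual far from zero points to a missing or wrong constant. A spread ratio close to 1 means the predictions explain almost none of the variation in the outcome, so the residuals are essentially the outcomes minus a constant.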