Normality / QQ-Plot in Generalized Linear Models (GLM)

#1
Hi,

sorry if this is a frequent question, but I did not find an appropriate answer:

If I have a GLM with a non-normally distributed outcome (such as a Poisson- or binomially distributed one), does it make sense to check normality of the residuals?

Intuitively I would say no, since I explicitly assume that the outcome is not normally distributed. However, several books suggest and use QQ-plots or histograms to check normality for GLMs. What is the motivation here?

Wouldn't it make more sense, e.g. in the case of a Poisson model, to use other plots that can check whether the residuals are Poisson distributed?

Thanks
 

Dason

Ambassador to the humans
#2
It doesn't really make sense to check for normality. It can make sense to check the residuals to examine the linearity assumption. There are also residual plots you can make to check the assumptions the model makes, but it typically isn't quite as simple as plotting the raw residuals against the predicted values, as we can do with a linear model.
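To make the point about raw residuals concrete, here is a minimal sketch of two scaled residuals commonly used for Poisson models in place of raw ones: Pearson and deviance residuals. The observed counts and fitted means are hypothetical values made up for illustration, and the function names are my own rather than any library's API.

```python
import math

def pearson_residual(y, mu):
    """Pearson residual for a Poisson model: (y - mu) / sqrt(Var(y)), with Var(y) = mu."""
    return (y - mu) / math.sqrt(mu)

def deviance_residual(y, mu):
    """Signed square root of one observation's contribution to the Poisson deviance."""
    # y * log(y / mu) is taken as 0 when y == 0 (its limiting value).
    log_term = y * math.log(y / mu) if y > 0 else 0.0
    unit_deviance = 2.0 * (log_term - (y - mu))
    # Guard against tiny negative values caused by floating-point rounding.
    return math.copysign(math.sqrt(max(unit_deviance, 0.0)), y - mu)

# Hypothetical observed counts and fitted means, for illustration only.
observed = [0, 1, 4, 7, 3]
fitted = [0.8, 1.5, 3.2, 6.1, 3.0]

for y, mu in zip(observed, fitted):
    print(y, mu, round(pearson_residual(y, mu), 3), round(deviance_residual(y, mu), 3))
```

Plotting either of these against the fitted values or the linear predictor is one way to look for a misspecified mean structure. With small counts they will still show banding and skewness, so they should not be expected to look normal.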
 

noetsi

Fortran must die
#3
I agree with Dason. If normality is not assumed, why check for it? It's like checking for linearity with ordinal data.
 

noetsi

Fortran must die
#5
But in that case aren't you checking for a Poisson distribution rather than a normal one? QQ plots can be used for a wide range of distributions, not just the normal.
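As a rough sketch of a QQ plot against a non-normal reference, the code below pairs sorted counts with Poisson quantiles. It assumes every observation shares one common mean (as in an intercept-only model); in a GLM with covariates each observation has its own mean, so this simple version would not apply directly. The sample values and the helper names poisson_quantile and poisson_qq_pairs are hypothetical.

```python
import math

def poisson_quantile(p, lam):
    """Smallest integer k with P(X <= k) >= p for X ~ Poisson(lam), 0 < p < 1."""
    # Cap the search so rounding in the accumulated CDF cannot cause an endless loop.
    max_k = int(lam + 20 * math.sqrt(lam)) + 50
    k = 0
    pmf = math.exp(-lam)
    cdf = pmf
    while cdf < p and k < max_k:
        k += 1
        pmf *= lam / k  # recurrence: P(X = k) = P(X = k - 1) * lam / k
        cdf += pmf
    return k

def poisson_qq_pairs(sample, lam):
    """Return (theoretical quantile, observed value) pairs for a Poisson QQ plot."""
    n = len(sample)
    ordered = sorted(sample)
    # Plotting positions (i + 0.5) / n for i = 0, ..., n - 1.
    probs = [(i + 0.5) / n for i in range(n)]
    return [(poisson_quantile(p, lam), obs) for p, obs in zip(probs, ordered)]

# Hypothetical counts, assumed to share a single mean.
counts = [2, 0, 1, 3, 1, 0, 2, 5, 1, 2]
lam_hat = sum(counts) / len(counts)  # sample mean as the Poisson mean estimate

for theoretical, observed in poisson_qq_pairs(counts, lam_hat):
    print(theoretical, observed)
```

If the Poisson assumption holds, the pairs should fall roughly along the line y = x, although with discrete data many points coincide.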
 
#7
Hi, I asked the authors of one of the books that suggest a normal QQ-plot for Poisson models. Their argument is that for Poisson models with a high mean, or binomial models with p=0.5 and a high N, we have asymptotic normality, so in these cases the data should be approximately normally distributed, which makes sense. Of course, in all other cases (Poisson lambda close to zero, or N=2 in a binomial model) the normal approximation is poor and normal QQ-plots do not make much sense.
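To put a rough number on how the approximation depends on the mean, here is a small sketch that computes the largest gap between the Poisson CDF and the CDF of a normal with the same mean and variance, using a continuity correction. The lambda values are arbitrary choices for illustration and max_cdf_gap is a hypothetical helper name.

```python
import math

def normal_cdf(x, mean, sd):
    """CDF of a normal distribution with the given mean and standard deviation."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def max_cdf_gap(lam):
    """Largest |Poisson CDF - normal CDF| over the integers, with continuity correction."""
    sd = math.sqrt(lam)  # a Poisson variable has variance equal to its mean
    upper = int(lam + 10 * sd) + 10  # far enough into the upper tail
    pmf = math.exp(-lam)
    cdf = pmf
    gap = abs(cdf - normal_cdf(0.5, lam, sd))
    for k in range(1, upper + 1):
        pmf *= lam / k  # recurrence: P(X = k) = P(X = k - 1) * lam / k
        cdf += pmf
        gap = max(gap, abs(cdf - normal_cdf(k + 0.5, lam, sd)))
    return gap

# Arbitrary means chosen for illustration.
for lam in (0.5, 5.0, 100.0):
    print(lam, round(max_cdf_gap(lam), 4))
```

The gap shrinks as the mean grows, which matches the asymptotic argument, and it is large when the mean is close to zero.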
 

CowboyBear

Super Moderator
#9
Hi, I asked the authors of one of the books that suggest a normal QQ-plot for Poisson models. Their argument is that for Poisson models with a high mean, or binomial models with p=0.5 and a high N, we have asymptotic normality, so in these cases the data should be approximately normally distributed, which makes sense. Of course, in all other cases (Poisson lambda close to zero, or N=2 in a binomial model) the normal approximation is poor and normal QQ-plots do not make much sense.
Wow. I have real concerns when people who say things like this write textbooks. Yes, sure, the Poisson distribution converges to the normal as the Poisson mean/variance parameter goes to infinity. But so what? If you've specified a Poisson GLM, your model assumes that the conditional distribution of the response is Poisson, not normal. Testing whether it's normal is utterly pointless, even if in some cases the two are similar - just test the assumptions of the actual model! I feel like the more likely explanation here is that the author had a vague idea that producing a normal QQ-plot is just what you do when you run a regression-type model, and is now confabulating some post hoc justification for giving bad advice.
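One example of checking an assumption the Poisson model actually makes is the mean-variance relationship: the conditional variance should equal the conditional mean. A minimal sketch of the Pearson dispersion statistic is below. The data, the fitted means, and the parameter count are hypothetical, and pearson_dispersion is an illustrative name rather than a library function.

```python
def pearson_dispersion(observed, fitted, n_params):
    """Pearson chi-square statistic divided by the residual degrees of freedom."""
    # Each term is a squared Pearson residual: (y - mu)^2 / Var(y), with Var(y) = mu.
    chi_square = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    return chi_square / (len(observed) - n_params)

# Hypothetical counts and fitted means from a model with two estimated parameters.
observed = [0, 1, 4, 7, 3, 2, 9, 1]
fitted = [0.8, 1.5, 3.2, 6.1, 3.0, 2.4, 7.5, 1.2]

print(round(pearson_dispersion(observed, fitted, n_params=2), 3))
```

Values near 1 are consistent with the Poisson variance assumption, while values well above 1 suggest overdispersion.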