The title says it all: I would really like to know what a Generalized Linear Model is and how it differs from a General Linear Model. I have some data that I want to analyse (Q-Q plots show the data are not normally distributed), which I assumed would have to be analysed with non-parametric methods. Since I have two factors and want to see whether the interaction between them is significant, I couldn't use Kruskal-Wallis in SPSS, since it only allows one factor. Someone told me that I could use a Generalized Linear Model if I had more factors and wanted to check for interaction, but I want to know what that model really is and whether there are other "better" options than that!

A general linear model is a special case of a generalized linear model, but not vice versa. A generalized linear model can use any of several response distributions, whereas the general linear model works off the Gaussian distribution.

This is not a correct assumption. Linear models assume that the errors are normally distributed, i.e. the response around its predicted value, not the raw data themselves. To check this you must first run the model and then look at a QQ-plot of the residuals (the estimated error terms) to detect non-normality; only then do you make decisions about parametric vs. non-parametric.
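To make the residual check concrete, here is a minimal sketch in Python rather than SPSS. The data are simulated, and the names `qq_points` and `qq_correlation` are made up for illustration. It fits one mean per group (the simplest possible linear model) and compares how closely the raw data and the residuals follow a normal QQ line.

```python
import random
from statistics import NormalDist, mean

random.seed(1)

# Hypothetical data: two groups with different means, normal errors within each.
groups = {
    "A": [random.gauss(0.0, 1.0) for _ in range(200)],
    "B": [random.gauss(8.0, 1.0) for _ in range(200)],
}

def qq_points(values):
    """Return (theoretical normal quantile, standardized sample value) pairs."""
    xs = sorted(values)
    n = len(xs)
    m = mean(xs)
    sd = (sum((x - m) ** 2 for x in xs) / (n - 1)) ** 0.5
    std_normal = NormalDist()
    return [(std_normal.inv_cdf((i + 0.5) / n), (x - m) / sd)
            for i, x in enumerate(xs)]

def qq_correlation(values):
    """Correlation of the QQ points; values near 1 mean the plot is close to a line."""
    pts = qq_points(values)
    tx = [p[0] for p in pts]
    sx = [p[1] for p in pts]
    mt, ms = mean(tx), mean(sx)
    num = sum((a - mt) * (b - ms) for a, b in zip(tx, sx))
    den = (sum((a - mt) ** 2 for a in tx) * sum((b - ms) ** 2 for b in sx)) ** 0.5
    return num / den

# Raw response, pooled over both groups (bimodal, so it looks non-normal).
raw = [y for ys in groups.values() for y in ys]
# "Running the model" here means fitting one mean per group and subtracting it.
residuals = [y - mean(ys) for ys in groups.values() for y in ys]

raw_r = qq_correlation(raw)
resid_r = qq_correlation(residuals)
print(f"raw data QQ correlation: {raw_r:.3f}")
print(f"residual QQ correlation: {resid_r:.3f}")
```

The pooled raw data look clearly non-normal only because the two groups have different means; the residuals, which are what the model makes an assumption about, sit close to the line.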

If your data do come from a normal population distribution, then the parametric test will most likely give you the greatest power. Non-parametric tests are more flexible and make fewer assumptions, but they also give you less power to reject the null hypothesis.


Ok, I thought I had checked a Q-Q plot of the residuals, not sure though... How do I do that in SPSS?

Ok, maybe I should tell more about my data. It's count data on river dolphins. However, they are not evenly distributed along the area I surveyed; there are some specific habitats that seem to be more "preferred". And since I surveyed a big area, which I divided into habitats, I got many zero counts in my data. The survey was done as transects of 1-3 km along the area, and because of that I decided that, to analyse the data correctly, I should convert the counts into dolphins per km and THEN check the data for normality etc. to know whether I should continue with parametric or non-parametric analysis. I'm a beginner in statistics and find it really hard, so I don't know if I'm doing it all wrong...

You'll know you have a QQ-plot of the residuals if you made it after you ran the model. Is this the case? I think in SPSS you have to ask for the residuals to be saved, and those are what you make a QQ-plot of.

This is usually the best place to start. "What do you have?" is the most important question to ask before you ask what to do about it.

This ecological data screams to me that you may want to investigate a Generalized Linear Model, probably with a Poisson and/or negative binomial distribution, and likely one that accounts for the zero-inflated data. I suggest you look at this LINK, as it is one of my favorites for how-tos in different stats programs, including SPSS.
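As a rough illustration of what a Poisson model does with transects of different lengths, here is a minimal Python sketch. The transect numbers are invented and the function names are hypothetical. It assumes count ~ Poisson(rate_habitat * length), which is a Poisson GLM with a log link and log(length) as an offset. With habitat as the only factor, the maximum-likelihood rate has a closed form: total dolphins divided by total km. The sketch does not handle zero inflation or overdispersion.

```python
import math

# Hypothetical transects: (habitat, length in km, dolphins counted). Invented numbers.
transects = [
    ("confluence", 1.0, 4),
    ("confluence", 3.0, 9),
    ("confluence", 2.0, 0),
    ("main_channel", 1.0, 0),
    ("main_channel", 2.0, 1),
    ("main_channel", 3.0, 0),
]

def poisson_rates(rows):
    """MLE of dolphins/km per habitat for count ~ Poisson(rate_habitat * km).

    For a model with only the habitat factor and a log(km) offset, the
    solution is total count / total km within each habitat.
    """
    totals = {}
    for habitat, km, count in rows:
        c, k = totals.get(habitat, (0, 0.0))
        totals[habitat] = (c + count, k + km)
    return {h: c / k for h, (c, k) in totals.items()}

def mean_density(rows):
    """Unweighted mean of per-transect densities (dividing by km first)."""
    per = {}
    for habitat, km, count in rows:
        per.setdefault(habitat, []).append(count / km)
    return {h: sum(v) / len(v) for h, v in per.items()}

def poisson_loglik(rows, rates):
    """Poisson log-likelihood of the counts for the given per-km rates."""
    ll = 0.0
    for habitat, km, count in rows:
        mu = rates[habitat] * km  # expected count on this transect
        ll += count * math.log(mu) - mu - math.lgamma(count + 1)
    return ll

rates = poisson_rates(transects)
density = mean_density(transects)
for h in rates:
    print(f"{h}: offset-model rate {rates[h]:.3f}/km, mean density {density[h]:.3f}/km")
```

The two estimates can differ because dividing by length first gives a 1 km transect the same weight as a 3 km one, while the count model keeps the integer counts (zeros included) and weights by effort.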

I'm no expert in stats myself, but I have had a great deal of education around it and I still don't know if I'm doing it all wrong. Stats isn't really about right or wrong. Often there are many approaches and tests that will work with your data. This quote:
Essentially, all models are wrong, but some are useful
by George E. P. Box helps keep perspective on model selection.

We have at least two ecologists who are contributors here on talkstats (unfortunately for you, not for them, I think both are in the jungle right now), and hopefully one of them or someone more knowledgeable will give you further direction.

A generalized linear model allows a conditional distribution for the response other than the normal distribution, and it relates the mean of the response to the predictors through a link function. So for a given value of the covariates, in a general linear model we assume the response is normally distributed; with a generalized linear model we allow it to be any one of a certain family of distributions (Poisson, binomial, negative binomial, gamma, or even normal). A lognormal response is usually handled by log-transforming the response and fitting a normal model instead.
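Written out, the difference looks roughly like this. The notation is my own: $x_i$ is the vector of predictors for observation $i$, $\beta$ the coefficients, $\mu_i$ the conditional mean and $g$ the link function.

```latex
\begin{align*}
\text{General linear model:}\quad
  & y_i \mid x_i \sim \mathcal{N}(\mu_i, \sigma^2),
  & \mu_i &= x_i^\top \beta \\
\text{Generalized linear model:}\quad
  & y_i \mid x_i \sim \text{exponential-family distribution with mean } \mu_i,
  & g(\mu_i) &= x_i^\top \beta \\
\text{Example (Poisson, log link):}\quad
  & y_i \mid x_i \sim \operatorname{Poisson}(\mu_i),
  & \log \mu_i &= x_i^\top \beta
\end{align*}
```

Choosing the normal distribution with the identity link $g(\mu) = \mu$ in the second line gives back the first line, which is why the general linear model is a special case.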

Without knowing more about your data it's hard to speculate which distribution would be appropriate though.


Yeah ok, I can attach part of my data so that you understand. The factors would be "lokal" (site) and "habitat", and all the columns whose names contain "Inia", "Sotalia" or "Dolphins" are the dependent variables. Those are in the unit dolphins/km, since I was told that I had to standardize the data in order to analyse it. Before that I had the unit "dolphins/transect", but since the transects are of different lengths (the "length" column), that would not be correct.

So, I wanted to first check whether there is any significant difference between habitats (I have six different habitat types) with respect to the dolphins found. If so, then I would like to know how to perform some kind of post-hoc test to find out WHICH habitats are different. I know how to do this if the data (or actually the residuals) are normally distributed, since I could then simply use the General Linear Model in SPSS, check the box for post-hoc and get a Tukey test. However, as far as I can tell this is not the case, and thus I have to do non-parametric tests instead, which I know nothing about, but I've heard that the Generalized Linear Model in SPSS would be the way to go. However, there are no post-hoc tests there, so I could not, for example, analyse WHICH habitats are different.
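One possible way to ask "which habitats differ" in a count model is a set of pairwise comparisons with a multiplicity correction. Here is a minimal Python sketch with invented totals. It is not the SPSS procedure. It assumes Poisson counts with no overdispersion and no zero inflation, uses Wald z-tests on the log rate ratio with a Bonferroni adjustment, and the function name `pairwise_rate_tests` is hypothetical.

```python
import math
from itertools import combinations
from statistics import NormalDist

# Hypothetical totals per habitat: (dolphins counted, km surveyed). Invented numbers.
habitat_totals = {
    "confluence": (40, 12.0),
    "main_channel": (15, 20.0),
    "lake": (22, 10.0),
}

def pairwise_rate_tests(totals):
    """Wald z-tests on log rate ratios, Bonferroni-adjusted.

    Assumes Poisson counts without overdispersion and at least one dolphin
    counted in every habitat (the standard error uses 1/count).
    """
    pairs = list(combinations(sorted(totals), 2))
    results = []
    for a, b in pairs:
        (ya, ka), (yb, kb) = totals[a], totals[b]
        log_ratio = math.log(ya / ka) - math.log(yb / kb)
        se = math.sqrt(1 / ya + 1 / yb)  # approximate SE of the log rate ratio
        z = log_ratio / se
        p = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided p-value
        p_adj = min(1.0, p * len(pairs))  # Bonferroni correction
        results.append((a, b, math.exp(log_ratio), p_adj))
    return results

results = pairwise_rate_tests(habitat_totals)
for a, b, ratio, p_adj in results:
    print(f"{a} vs {b}: rate ratio {ratio:.2f}, Bonferroni p = {p_adj:.4f}")
```

With many zeros or overdispersed counts these standard errors would be too small, so a negative binomial or zero-inflated model would be the safer basis for the same kind of comparison.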

The same goes for my other factor, "lokal", since I want to check whether there is a significant difference between sites (lokal) when it comes to dolphins. Finally, I want to know whether the interaction between the factors is also significant, i.e. whether lokal has any effect on habitat or vice versa, or whether it doesn't matter at all.

Aaah, and one more thing that I cannot understand with this program (SPSS). I am pretty sure that I had tried "Analyze -> Descriptive Statistics -> Explore", chosen "habitat" as the factor and one of my dependent variables to check for normality via the Q-Q plot, and got one plot for the whole dataset. Now that I try to do the same, it seems to have decided to show me results for EACH category of my factor (habitat) and not for the whole dataset... what is going on??? I don't think I have done anything different from before, but I need to check the whole dataset for normality of the residuals, not each of the categories!!! Someone please help me before I go nuts!

