
Thread: GLM on non-linear/curvilinear data

  1. #1

    GLM on non-linear/curvilinear data




    Dear all! I’m sure one of you can help me with my non-linear data problem.

    I measured a fitness correlate of an animal (response variable) and want to investigate the effect of the following predictor variables: gender (male/female), temperature (9, 12, 15, 18, 21, 24 °C), population (four different populations), and parasite infection status (control, parasite exposed but not infected, parasite exposed and infected). The normal way would be to fit a GLM on the response variable with all predictor variables as main effects. Subsequently, I would check the residuals for normality (Q-Q plot), and if the residuals are approximately normally distributed I’m done (if not, I would Box-Cox transform the response variable and start from the beginning).

    But it is not that easy… At least sometimes the response variable seems not to be linear over temperature but follows a curve, with the highest fitness at 15 °C and lower fitness at lower and higher temperatures. To account for this, I want to include an additional quadratic term (temperature*temperature) in the model.
    Here are my questions:

    1. Can I just compare the p-values of temperature and temperature*temperature to figure out whether I have a linear or a quadratic relationship?

    2. Later I want to plot the fitness over temperature for all 24 combinations of the other predictor variables (for example, for control males of population XY). How do I know whether I should fit a straight line or a curve? In my data the relationship looks linear for some combinations and non-linear for others. But from the GLM I just get one p-value for temperature and one p-value for temperature*temperature…

    3. Do my residuals still have to be normally distributed, and can I still Box-Cox transform my response variable if the residuals are not normally distributed?

    Thanks to all of you!!!

    Fred.

  2. #2
    GretaGarbo

    Re: GLM on non-linear/curvilinear data

    Oh dear,

    It seems like no one is answering Fred's posts, but he also asked about this in June and July.

    It seems to be about how well some fish do after a (randomized?) experiment.

  3. #3
    TheEcologist

    Re: GLM on non-linear/curvilinear data

    Hi Fred,

    It's been a crazy busy year for me, so I'm not very active here, but since Greta made sure I noticed this I can't help but give it a shot.

    Looking at your previous questions and the one above, I can say that the GLMM was fine and that you didn't have to exclude all your "control fish from the model since they have missing values". You should just code the control fish as a separate category of the parasite variable, say "C", which means zero parasites. You see, it's not that you don't have data there: you know exactly how many parasites there were, right? Zero! That's not missing data!
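To illustrate the coding suggestion above, here is a minimal sketch in Python (not the R or SPSS used in this thread). The `dummy_code` helper and the example labels are hypothetical, made up for illustration. The point is only that control animals stay in the data as the reference level, with zeros in every indicator column, rather than being dropped as missing.

```python
def dummy_code(values, reference):
    """Treatment (dummy) coding: one 0/1 column per non-reference level."""
    levels = sorted(set(values) - {reference})
    rows = [[1 if value == level else 0 for level in levels] for value in values]
    return levels, rows


# Made-up infection status for five animals; "control" is the reference level.
status = ["control", "exposed", "infected", "control", "infected"]
levels, rows = dummy_code(status, reference="control")

for label, row in zip(status, rows):
    print(label, row)
```

Every control animal gets a row of zeros, so it contributes to the intercept and to all other effects in the model.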

    As for your questions:

    1) No, a p-value says little about the overall fit of the model; look at the adjusted R2 values (plain R2 never decreases when you add a term) or the AIC scores at least (a lower AIC is better). If these improve somewhat when you add the quadratic term, you have evidence for a quadratic relationship; if they improve hugely, you have strong evidence. Then you could also plot both models against your data and visually confirm the relationship. Finally, a GAM could be used to get an idea of the functional shape via a smoother.
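A minimal sketch of the AIC comparison described in 1), written in plain Python rather than R or SPSS. The fitness values are made up purely for illustration, and `fit_polynomial` and `gaussian_aic` are hypothetical helper names. The AIC is computed only up to an additive constant, which cancels when two models fitted to the same data are compared.

```python
import math


def fit_polynomial(x, y, degree):
    """Least-squares fit of y on 1, x, ..., x**degree via the normal equations."""
    n_coef = degree + 1
    xtx = [[sum(xi ** (r + c) for xi in x) for c in range(n_coef)]
           for r in range(n_coef)]
    xty = [sum((xi ** r) * yi for xi, yi in zip(x, y)) for r in range(n_coef)]
    # Augmented matrix, solved by Gaussian elimination with partial pivoting.
    a = [row[:] + [b] for row, b in zip(xtx, xty)]
    for col in range(n_coef):
        pivot = max(range(col, n_coef), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n_coef):
            factor = a[r][col] / a[col][col]
            for c in range(col, n_coef + 1):
                a[r][c] -= factor * a[col][c]
    coef = [0.0] * n_coef
    for r in reversed(range(n_coef)):
        known = sum(a[r][c] * coef[c] for c in range(r + 1, n_coef))
        coef[r] = (a[r][n_coef] - known) / a[r][r]
    return coef


def gaussian_aic(x, y, coef):
    """AIC for normal errors, up to a constant shared by all models on the same data."""
    n = len(y)
    rss = sum((yi - sum(b * xi ** p for p, b in enumerate(coef))) ** 2
              for xi, yi in zip(x, y))
    k = len(coef) + 1  # regression coefficients plus the residual variance
    return n * math.log(rss / n) + 2 * k


# Made-up data: two animals per temperature, fitness peaking near 15 °C.
temperature = [9, 9, 12, 12, 15, 15, 18, 18, 21, 21, 24, 24]
fitness = [2.1, 2.4, 3.6, 3.3, 4.2, 4.0, 3.9, 3.6, 2.9, 3.1, 1.8, 2.0]

# Centering temperature keeps the linear and squared terms less correlated.
mean_temperature = sum(temperature) / len(temperature)
centered = [t - mean_temperature for t in temperature]

linear = fit_polynomial(centered, fitness, degree=1)
quadratic = fit_polynomial(centered, fitness, degree=2)
aic_linear = gaussian_aic(centered, fitness, linear)
aic_quadratic = gaussian_aic(centered, fitness, quadratic)

print("AIC linear:", aic_linear)
print("AIC quadratic:", aic_quadratic)
```

A clearly lower AIC for the quadratic model would support keeping the squared term. A real analysis would also include the other predictors, which this sketch leaves out.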


    2) This question seems related to #1. Plot the scatter plot, look at the AIC values, and maybe fit a smoother with a GAM.

    3) You are using a GLM. This means that the error distribution does not necessarily need to be normal; that depends on exactly which model (family and link) you are using. Which has me wondering: what kind of GLM are you using? Or are you just using an LM?

    Hope this helps!
    The true ideals of great philosophies always seem to get lost somewhere along the road..

  4. #4

    Re: GLM on non-linear/curvilinear data

    Hi Greta and Ecologist,

    Thanks for your reply!

    Luckily, my current question is not directly linked to the ones I had in summer; those problems have been solved in the meantime… (-;

    I want to use a general linear model (GLM) for my analysis. As far as I understand, this is a type of generalized linear model (GZLM) with a Gaussian distribution and identity link. Could it be that the abbreviation GLM leads to confusion, since in R a GLM is computed with lm() and a GZLM with glm()?

    As far as I understood you, normally distributed residuals are also required if I add a quadratic term to the model, right?

    The GAM procedure seems interesting. I had never heard of it before. I did a short web search and found some information. Unfortunately, it does not seem to be implemented in SPSS. Is there nevertheless a way to fit a GAM in SPSS?

    Thanks,

    Fred.

  5. #5
    TheEcologist

    Re: GLM on non-linear/curvilinear data

    Quote Originally Posted by Fred.
    Hi Greta and Ecologist,
    I want to use a general linear model (GLM) for my analysis. As far as I understand, this is a type of generalized linear model (GZLM) with a Gaussian distribution and identity link. Could it be that the abbreviation GLM leads to confusion, since in R a GLM is computed with lm() and a GZLM with glm()?

    A generalized linear model with an identity link function and a normal (Gaussian) family is exactly equivalent to a (general) linear model. The unfortunate choice of abbreviations in statistical programs may contribute to the confusion, but in general GLM will be understood to mean a generalized linear model.

    But yes, linear models assume normally distributed residuals.

    SPSS does not seem to have GAMs, but alternatives may exist.
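On the residuals point, a rough sketch of what a Q-Q check computes, in plain Python with the standard library only. The residuals are made up for illustration, and `qq_points` and `pearson` are hypothetical helper names. The plotting positions (i - 0.5) / n are one common convention among several.

```python
from statistics import NormalDist, mean, stdev


def qq_points(residuals):
    """Pairs of (theoretical normal quantile, standardized residual) for a Q-Q plot."""
    n = len(residuals)
    centre, spread = mean(residuals), stdev(residuals)
    ordered = sorted((r - centre) / spread for r in residuals)
    theoretical = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))


def pearson(pairs):
    """Pearson correlation of the paired values."""
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Made-up residuals from a hypothetical fitted model.
residuals = [-0.31, 0.12, 0.05, -0.08, 0.22, -0.15,
             0.02, 0.27, -0.19, 0.09, -0.03, 0.01]

points = qq_points(residuals)
qq_correlation = pearson(points)
print("Q-Q correlation:", qq_correlation)
```

Points lying close to a straight line (a correlation near 1) are consistent with approximately normal residuals. This is a visual aid, not a formal test.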

  6. #6

    Re: GLM on non-linear/curvilinear data

    Quote Originally Posted by GretaGarbo
    Oh dear,

    It seems like no one is answering Fred's posts, but he also asked about this in June and July.

    It seems to be about how well some fish do after a (randomized?) experiment.

    Your post is fruitful for Fred.

  7. #7

    Re: GLM on non-linear/curvilinear data


    Hey again, and thank you so much for the help so far.

    I tried to find some advice on how to handle a quadratic term in a generalized linear (mixed) model, but information on that is quite scarce. That’s why I come back to you with some further questions. Hopefully, somebody here can help me again…
    At the moment I am fitting a model with the following variables:

    TEMPERATURE (9, 12, 15, 18, 21 and 24 °C), POPULATION (A, B, C and D), GENDER (male and female) and INFECTION_STATUS (control, exposed but not infected, infected)

    Since I expect quadratic curves with an optimum somewhere in the middle of my temperature scale, I also add the quadratic term TEMPERATURE_SQUARED to my model. In most cases both TEMPERATURE and TEMPERATURE_SQUARED are significant.
    Now, coming to interactions, I have some questions:

    - What would a significant interaction TEMPERATURE x TEMPERATURE_SQUARED tell me? I just think it would make no sense; that’s why I did not include this interaction in my model. Am I right?

    - What about interactions between TEMPERATURE and one of the other variables, for example POPULATION? Would I need to include TEMPERATURE x POPULATION, TEMPERATURE_SQUARED x POPULATION, or both in my model to figure out whether my populations react differently at different temperatures? What is the difference between the two?
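One way to see the difference between the two interactions is to look at the fitted curve for each population, b0 + b1*T + b2*T^2. TEMPERATURE x POPULATION lets b1 differ between populations, and TEMPERATURE_SQUARED x POPULATION lets the curvature b2 differ. The sketch below is in Python with hypothetical coefficients, made up for illustration and not estimated from any data. It shows how each interaction on its own moves the temperature optimum, -b1 / (2*b2).

```python
def optimum_temperature(linear_coef, quadratic_coef):
    """Vertex of b0 + b1*T + b2*T**2; it is a maximum only when b2 < 0."""
    if quadratic_coef >= 0:
        raise ValueError("no interior maximum: quadratic coefficient must be negative")
    return -linear_coef / (2.0 * quadratic_coef)


# Hypothetical coefficients for a reference population (uncentered temperature).
base_linear, base_quadratic = 1.5, -0.05
reference = optimum_temperature(base_linear, base_quadratic)

# TEMPERATURE x POPULATION only: another population differs in the linear term.
linear_only = optimum_temperature(base_linear + 0.3, base_quadratic)

# TEMPERATURE_SQUARED x POPULATION only: it differs in the curvature instead.
curvature_only = optimum_temperature(base_linear, base_quadratic - 0.01)

print(reference, linear_only, curvature_only)
```

In this made-up example the optimum sits at 15 °C for the reference population, moves to 18 °C when only the linear term changes, and to 12.5 °C when only the curvature changes (which also makes the curve narrower). A common recommendation is to keep TEMPERATURE x POPULATION in the model whenever TEMPERATURE_SQUARED x POPULATION is included, so that both the position and the shape of each population's curve can differ.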

    Hope you get my questions…!?

    Thanks again for your help!

    Fred.
