PDA

View Full Version : logistic regression in R

claire.saraux
06-23-2009, 11:13 AM
Hi,

I would like to compute a logistic regression in R.
I've got a three-state categorical variable taking 1, 2 and 3 as values. I would like to explain it with both continuous and categorial variables.
That's why I thought about logistic regression but I don't know how to handle that in R for a "trinomial" variable.
I found this polr function on internet but I don't know exactly what it is doing ? And it doesn't give me any P-values. I'm also surprised of these three intercepts it gets me...
Does anyone know what is polr used for ? And is that really what I need ?
Any idea on the subject would help me a lot !
Thanks,
Claire

vinux
06-23-2009, 11:23 AM
I've got a three-state categorical variable taking 1, 2 and 3 as values.

Is it your response variable? . if yes, is it ordinal?

claire.saraux
06-23-2009, 11:25 AM
Yes it is my response variable ! And yes again it is ordinal

vinux
06-23-2009, 11:28 AM
Then you could use ordinal logistic regression.
http://lib.stat.cmu.edu/S/Harrell/help/Design/html/lrm.html
(http://lib.stat.cmu.edu/S/Harrell/help/Design/html/predict.lrm.html)

claire.saraux
06-23-2009, 11:48 AM
well thank you very much!
a last question : how could I get a summary for this kind of model ? I can ask for estimates and stats of the model, but when I ask for a summary it answers me "adjustment values not defined here or with datadist "...

vinux
06-23-2009, 12:00 PM
May be this helps.
http://www.ats.ucla.edu/stat/R/dae/ologit.htm
Here apply is the response variable take 0,1,2

In your case it will have 1,2,3 . You will also get two intercept and beta coefficients.

claire.saraux
06-23-2009, 12:21 PM
indeed it helps !!
Thanks again !
and one more last question and then I will stop bothering you !
Is there any checking to do afterwards ? Like anything on residuals...
And it doesn't give me any AIC or log-likelihood ? Can I get them by any way so I can use them for model selection ?

vinux
06-23-2009, 12:34 PM
Not sure in R. But I am sure in SAS, it gives AIC and log likelihood.

claire.saraux
06-24-2009, 07:51 AM
I ask again about the AIC or log-likelihood. As I said R doesn't give me any of those and if I try to get them manually, it answers me that there is an error because the family is not in c("gaussian", "Gamma", "inverse.gaussian")
Does anyone know if I can do something to get them? Or have you guys any other ideas for model selection ?
Thanks
Claire

TheEcologist
06-24-2009, 08:24 AM
I ask again about the AIC or log-likelihood. As I said R doesn't give me any of those and if I try to get them manually, it answers me that there is an error because the family is not in c("gaussian", "Gamma", "inverse.gaussian")
Does anyone know if I can do something to get them? Or have you guys any other ideas for model selection ?
Thanks
Claire

Hi,

Post your code so I know what it is you are doing exactly. What models are you testing against each other?

Before you can 'get' an AIC value you need a likelihood function, that depends on how you define your error structure/ how your model is fitted (via MLE or SS).

claire.saraux
06-24-2009, 08:37 AM
Here is a part of my code with some explanation before :
My response variable is peak taking 0,1, 2, 3 and 4 as values
SSI and BC are continuous variables (structural size index and body condition)
Depart_rel is a positive integer (time spent at the colony before departure)
The 4 SST and SOI are climatic parameters averaged on different period...

First I would like to compare different models with the same number of parameters depending only on different SOI and SST variables
mod<-lrm(peak~SSI + BC + ***E + Depart_rel + factor(Tag_Year)+ SSTannee1 + SOIannee1 + SSTprint + SOIprint,data=returns1)
print(mod)
mod2<-lrm(pic~SS + BC + ***E + Depart_rel + factor(Tag_Year)+ SSTannee1 + SOIannee1 + SSTwint + SOIwint,data=returns1)
print(mod2)
mod3<-lrm(pic~SS + BC + ***E + Depart_rel + factor(Tag_Year)+ SSTannee1 + SOIannee1 + SSTterre + SOIterre,data=returns1)
print(mod3)

Looking only at the pseudo R&#178; mod2 seem to be the better one, I thus consider it as the maximal model and remove afterwards step by step the least significant terms until the model contains nothing but significant terms. I then chose my best model basing my selection on both parsimony and R&#178;. But I'm not that happy about this method...

mod2a<-lrm(pic~SS + BC + Depart_rel + factor(Tag_Year)+ SSTannee1 + SOIannee1 + SSTwint + SOIwint,data=returns1)
print(mod2a)
mod2b<-lrm(pic~SS + Depart_rel + factor(Tag_Year)+ SSTannee1 + SOIannee1 + SSTwint + SOIwint,data=returns1)
print(mod2b)
mod2c<-lrm(pic~SS + Depart_rel + factor(Tag_Year)+ SSTannee1 + SSTwint + SOIwint,data=returns1)
print(mod2c)
mod2d<-lrm(pic~SS + Depart_rel + factor(Tag_Year)+ SSTwint + SOIwint,data=returns1)
print(mod2d)

Hope that's clear enough and it can help you understand my problem !