Logistic regression or GLM?

#1
Hi everyone:

I have a question about using logistic regression or GLM with only categorical variables as independent variables.

The dependent variable y is binary coded, where 1 = male and 0 = female, and each value corresponds to one excrement sample. The independent variables are as follows: X1 = habitat type (NF, SF, CP, RM, CV), X2 = elevation (a, b, c, d), X3 = forest road (FoR, FaR, MPR) and X4 = region (A, B, C, D, E). I would like to know whether the distribution of males and females differs across these variables.
Is there any type of GLM that uses only categorical variables as independent variables, or would logistic regression be my best option?
Below is the dataset:

y X1 X2 X3 X4
1 NF c FoR B
0 NF c FoR B
0 NF b FoR B
1 SF b MPR B
0 SF a MPR C
0 SF a MPR D
1 CP a MPR D
0 SF a FaR A
1 CV a FaR A
0 CV a FaR A
0 SF a FaR A
0 SF a FaR D
1 SF b FaR E
0 CP a FaR A
0 CP a FaR A
1 SF a FaR B
1 CP a FaR A
0 CP a FaR A
1 NF c FoR B
1 NF d FoR B
1 CV a FaR E
0 SF a FaR E
1 CP b FaR A
1 CP a FaR A
1 CP a FaR A
1 RM a FaR A
1 CP a FaR A
0 CV a FaR A
0 SF a MPR D
1 NF c FoR B
1 NF d FoR B
0 SF a FaR E
0 CV a FaR A
1 CP a FaR A
1 CP a FaR A
1 CP a FaR A
1 RM a FaR C
1 NF a FoR B
1 NF d FoR B
0 NF c FoR B
1 SF a FaR B
1 CP a FaR A
0 CV a FaR A
0 NF a FoR C
0 SF a FaR C
0 CV a FaR E
0 CV a FaR E
0 CV a FaR A
0 CV a FaR C
0 SF a FoR C
1 CP a MPR D
1 SF a FaR E
1 CP b FaR A
1 CV a FaR A
1 CP b FaR A
0 CP a FaR A
0 RM a FaR C
1 SF a FaR C
1 NF d FoR B

Any thoughts gladly received, and thanks in advance ;)
 

ledzep

Point Mass at Zero
#2
GLM is broad and flexible, and can handle different data types.
Logistic regression is a special case of GLM: the binomial family with a logit link.
I think that answers your question.
 
#3
Thank you for your reply!

To narrow my question: I would like to know what kind of GLM with a logit link can take categorical variables as independent variables. The only examples I can find are GLMs with measurement variables as independent variables.
 

ledzep

Point Mass at Zero
#4
Your response is binary (0/1). Hence, you can use logistic regression for your analysis, with all your Xs as independent variables in the model.
Code:
## If you were using R, it would look like this:
## family = binomial tells glm() that the response is binary (0/1);
## factor() makes sure each X is treated as categorical.
model <- glm(y ~ factor(X1) + factor(X2) + factor(X3) + factor(X4), data = my.data, family = binomial)
GLM can fit a wide range of responses (Gaussian/normal, binary, count, ...). It uses a link function, which relates the mean of the response to the linear predictor. For the binomial family, the default (and canonical) link is the logit, which is the log-odds of the probability.
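
To make the two ideas above concrete, here is a minimal sketch of how R turns a categorical predictor into indicator columns and what the logit link does to a probability. The habitat levels are the ones listed in post #1; the probability 0.75 is just an illustrative value, not something estimated from the data.

```r
# Habitat types as listed in post #1; one row per level, for illustration only
X1 <- factor(c("NF", "SF", "CP", "RM", "CV"))

# A factor with k levels is expanded into k - 1 indicator (dummy) columns
# plus an intercept. With R's default treatment contrasts the first level
# in alphabetical order ("CP") becomes the baseline.
mm <- model.matrix(~ X1)
print(mm)

# The logit link maps a probability p to the log-odds log(p / (1 - p)).
p <- 0.75                    # illustrative value only
log_odds <- qlogis(p)        # same as log(p / (1 - p))
back <- plogis(log_odds)     # the inverse logit recovers p
print(c(log_odds = log_odds, back = back))
```

So a GLM with only categorical predictors is still an ordinary GLM; each coefficient is the difference in log-odds between a level and the baseline level.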

HTH
 

ledzep

Point Mass at Zero
#5
The dependent variable y is shown as binary coded data where 1=males and 0=females. Each corresponding to an excrement.
Wait, I think I know where your problem lies.
What is your response variable? Y (1/0)? But shouldn't gender be an independent variable instead? I wouldn't expect gender to be your response.
I think your response is excrement? How do you measure it?
What does Y (0/1) mean? Excretion = yes/no, or gender = M/F?
 
#7
Thank you for your response ledzep.

Yes, excrement is the dependent variable: 1 = M; 0 = F.

I've been studying R lately and was testing with the Poisson family. Something like this: fit <- glm(y1 ~ x1 + x2 + x3 + x4 + x5 + x6 + x7, family = poisson(link = "log"), data = waterpots)

Does it make sense?

I'll also try the binomial family.
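
For reference, a binomial fit on the data from post #1 might look like the sketch below. It is only a sketch under some assumptions: it uses just y and X1 (habitat type), copied from the posted table, and the names d, fit and tokens are made up for illustration. Since y is coded 0/1, the binomial family is the natural match for it.

```r
# y and X1 pairs copied from the table in post #1 (59 rows, in order)
tokens <- scan(text = "
1 NF 0 NF 0 NF 1 SF 0 SF 0 SF 1 CP 0 SF 1 CV 0 CV
0 SF 0 SF 1 SF 0 CP 0 CP 1 SF 1 CP 0 CP 1 NF 1 NF
1 CV 0 SF 1 CP 1 CP 1 CP 1 RM 1 CP 0 CV 0 SF 1 NF
1 NF 0 SF 0 CV 1 CP 1 CP 1 CP 1 RM 1 NF 1 NF 0 NF
1 SF 1 CP 0 CV 0 NF 0 SF 0 CV 0 CV 0 CV 0 CV 0 SF
1 CP 1 SF 1 CP 1 CV 1 CP 0 CP 0 RM 1 SF 1 NF
", what = "", quiet = TRUE)

# Odd positions hold y, even positions hold the habitat code
d <- data.frame(
  y  = as.numeric(tokens[c(TRUE, FALSE)]),
  X1 = factor(tokens[c(FALSE, TRUE)])
)

# Cross-tabulation: counts of females (0) and males (1) per habitat type
print(table(d$X1, d$y))

# Logistic regression with one categorical predictor
fit <- glm(y ~ X1, family = binomial, data = d)
print(summary(fit))

# Likelihood-ratio (deviance) test of whether habitat type matters at all
print(anova(fit, test = "Chisq"))
```

The same call extends to the other predictors by adding X2, X3 and X4 to the formula, although with only 59 rows and several sparse levels the full model may be hard to estimate.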