Hi everyone:
I have a question about using logistic regression, or another GLM, when all of the independent variables are categorical.
The dependent variable y is binary, coded 1 = male and 0 = female, and each row corresponds to one excrement sample. The independent variables are X1 = habitat type (NF, SF, CP, RM, CV), X2 = elevation (a, b, c, d), X3 = forest road (FoR, FaR, MPR) and X4 = region (A, B, C, D, E). I would like to know whether the distribution of males and females differs among the levels of these variables.
Is there a type of GLM that works with only categorical independent variables, or would logistic regression be my best option?
Below is the dataset:
y X1 X2 X3 X4
1 NF c FoR B
0 NF c FoR B
0 NF b FoR B
1 SF b MPR B
0 SF a MPR C
0 SF a MPR D
1 CP a MPR D
0 SF a FaR A
1 CV a FaR A
0 CV a FaR A
0 SF a FaR A
0 SF a FaR D
1 SF b FaR E
0 CP a FaR A
0 CP a FaR A
1 SF a FaR B
1 CP a FaR A
0 CP a FaR A
1 NF c FoR B
1 NF d FoR B
1 CV a FaR E
0 SF a FaR E
1 CP b FaR A
1 CP a FaR A
1 CP a FaR A
1 RM a FaR A
1 CP a FaR A
0 CV a FaR A
0 SF a MPR D
1 NF c FoR B
1 NF d FoR B
0 SF a FaR E
0 CV a FaR A
1 CP a FaR A
1 CP a FaR A
1 CP a FaR A
1 RM a FaR C
1 NF a FoR B
1 NF d FoR B
0 NF c FoR B
1 SF a FaR B
1 CP a FaR A
0 CV a FaR A
0 NF a FoR C
0 SF a FaR C
0 CV a FaR E
0 CV a FaR E
0 CV a FaR A
0 CV a FaR C
0 SF a FoR C
1 CP a MPR D
1 SF a FaR E
1 CP b FaR A
1 CV a FaR A
1 CP b FaR A
0 CP a FaR A
0 RM a FaR C
1 SF a FaR C
1 NF d FoR B
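In case it helps anyone answering, here is a minimal Python sketch (standard library only) that tabulates females and males within each level of each variable. The names `DATA`, `COLUMNS` and `crosstab` are my own, chosen for illustration. It only produces counts and does not fit any model.

```python
from collections import Counter

# Dataset copied from the post. Columns: y (1 = male, 0 = female),
# X1 habitat type, X2 elevation, X3 forest road, X4 region.
DATA = """\
1 NF c FoR B
0 NF c FoR B
0 NF b FoR B
1 SF b MPR B
0 SF a MPR C
0 SF a MPR D
1 CP a MPR D
0 SF a FaR A
1 CV a FaR A
0 CV a FaR A
0 SF a FaR A
0 SF a FaR D
1 SF b FaR E
0 CP a FaR A
0 CP a FaR A
1 SF a FaR B
1 CP a FaR A
0 CP a FaR A
1 NF c FoR B
1 NF d FoR B
1 CV a FaR E
0 SF a FaR E
1 CP b FaR A
1 CP a FaR A
1 CP a FaR A
1 RM a FaR A
1 CP a FaR A
0 CV a FaR A
0 SF a MPR D
1 NF c FoR B
1 NF d FoR B
0 SF a FaR E
0 CV a FaR A
1 CP a FaR A
1 CP a FaR A
1 CP a FaR A
1 RM a FaR C
1 NF a FoR B
1 NF d FoR B
0 NF c FoR B
1 SF a FaR B
1 CP a FaR A
0 CV a FaR A
0 NF a FoR C
0 SF a FaR C
0 CV a FaR E
0 CV a FaR E
0 CV a FaR A
0 CV a FaR C
0 SF a FoR C
1 CP a MPR D
1 SF a FaR E
1 CP b FaR A
1 CV a FaR A
1 CP b FaR A
0 CP a FaR A
0 RM a FaR C
1 SF a FaR C
1 NF d FoR B
"""

COLUMNS = ["y", "X1", "X2", "X3", "X4"]


def crosstab(rows, column):
    """Return {level: (n_females, n_males)} for the given column name."""
    idx = COLUMNS.index(column)
    # Count each (level, y) pair; y is kept as the string "0" or "1".
    counts = Counter((row[idx], row[0]) for row in rows)
    levels = sorted({row[idx] for row in rows})
    return {lvl: (counts[(lvl, "0")], counts[(lvl, "1")]) for lvl in levels}


# One list of five string fields per observation.
rows = [line.split() for line in DATA.strip().splitlines()]

for col in COLUMNS[1:]:
    print(col)
    for level, (females, males) in crosstab(rows, col).items():
        print(f"  {level:>3}  females={females:2d}  males={males:2d}")
```

Some levels have very few observations (RM, for example, appears in only three rows), which I assume could matter for whatever model is suggested.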
Any thoughts would be gladly received, and thanks in advance.