# missing data in regression model

#### Brendaa

##### New Member
Hi guys,

I'm making a regression model using the glm function, but I noticed that one level of my factor (Agriculture in Landuse) is always missing from the output of the model.

I attached the data I used and below is the code I use.
My goal is to find out how well a combination of Dry and Landuse (which has three levels: Agriculture, Pastoral and Protected) explains y (the distribution of a certain animal).

I followed a similar example in the Crawley R book, so I know that the model should give me a result for the different types of Landuse and the combination of Landuse and Wet, but as you can see, both LanduseAgriculture and Dry:LanduseAgriculture are missing!

I read somewhere that Agriculture might be absorbed into the intercept parameter, but this doesn't solve my problem because I still don't have a p-value for those two parameters.

Does anyone know why I don't get a result for Agriculture and Agriculture:Dry, and what I can do about it?

cheers,
Brenda

> y<-cbind(WD_Y, WD_N)
> pWD<-split(WD,Landuse)
> pDry<-split(Dry,Landuse)
> model<-glm(y~Dry*Landuse, binomial)
> summary(model)

Call:
glm(formula = y ~ Dry * Landuse, family = binomial)

Deviance Residuals:
Min 1Q Median 3Q Max
-3.5847 -1.2198 -0.9366 0.3011 5.6034

Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -21.57864 1964.72039 -0.011 0.991
Dry 0.07732 9686.42731 7.98e-06 1.000
LandusePastoral 17.29671 1964.72039 0.009 0.993
LanduseProtected 20.44246 1964.72039 0.010 0.992
Dry:LandusePastoral 0.78610 9686.42732 8.12e-05 1.000
Dry:LanduseProtected -4.46714 9686.42733 -4.61e-04 1.000

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 1001.17 on 244 degrees of freedom
Residual deviance: 566.87 on 239 degrees of freedom
AIC: 833.96

#### Dason

I already explained why the parameters aren't there. Why exactly do you want to test those specific parameters? What question are you trying to answer by testing them? That's the more relevant issue, and if the effect you're looking to test is estimable then we can figure it out.

#### Brendaa

##### New Member
Because I want to know what effect the different types of landuse (Agriculture, Pastoral and Protected) have on y (the distribution of a certain animal), and I also want to know the combined effect of the different types of landuse together with Dry (which tells me something about the greenness of the vegetation).

So it is not very useful that both of these parameters are now grouped together in the intercept...

#### Dason

Is Dry a categorical variable with two levels? If so, what are the levels?

#### Brendaa

##### New Member
No, Dry is a continuous variable. Landuse is categorical with three levels (Agriculture, Pastoral and Protected).
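For readers following along, here is a minimal sketch of why a three-level factor shows only two rows in the coefficient table. It assumes R's default treatment contrasts, and the short `Landuse` vector below is a made-up stand-in, not the attached data. The first level alphabetically (Agriculture) becomes the reference level, and `relevel()` can pick a different one.

```r
# Toy factor with the same three levels as in the thread (hypothetical values).
Landuse <- factor(c("Agriculture", "Pastoral", "Protected", "Pastoral"))

# Levels are sorted alphabetically, so Agriculture comes first.
levels(Landuse)

# With the default treatment contrasts, the first level gets no dummy column:
# only Pastoral and Protected appear, each measured relative to Agriculture.
contrasts(Landuse)

# Choosing another reference level changes which level is "missing".
Landuse2 <- relevel(Landuse, ref = "Pastoral")
colnames(contrasts(Landuse2))
```

Whichever level is the reference, the fitted model is the same; only the meaning of each printed coefficient changes.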

#### Mike White

##### TS Contributor
Could it be that Agriculture is not significant in the model, so it is not included in the output? The number of records for Agriculture is lower than for the other Landuse types, and the box plots and variances of Dry by Landuse show that Agriculture has a much lower variance for Dry than the other Landuse types.

Code:
table(Landuse)
#Landuse
#Agriculture    Pastoral   Protected
#         36         160          49

boxplot(Dry~Landuse)
pDry<-split(Dry,Landuse)
lapply(pDry, var)
#$Agriculture
#[1] 0.005513301
#
#$Pastoral
#[1] 0.08484931
#
#$Protected
#[1] 0.05503401

#### Dason

To estimate the mean for Agriculture at a given level of Dry: $$\mu + \text{Dry} \cdot x$$
To estimate the mean for Pastoral at a given level of Dry: $$\mu + \text{Pastoral} + x \cdot (\text{Dry} + \text{Dry:Pastoral})$$
To estimate the mean for Protected at a given level of Dry: $$\mu + \text{Protected} + x \cdot (\text{Dry} + \text{Dry:Protected})$$
where $$\mu$$ is the intercept.
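
The three expressions above can be sketched in R as follows. This is only an illustration under stated assumptions: the data are simulated (the attached dataset is not reproduced here), the number of trials per record and the "true" coefficients are arbitrary, and the reparameterised formula in `model2` is one possible way to print a separate intercept and slope for every Landuse level, not something suggested earlier in the thread.

```r
# Simulated stand-in for the attached data (hypothetical values only).
set.seed(1)
n <- 245
Landuse <- factor(sample(c("Agriculture", "Pastoral", "Protected"), n, replace = TRUE))
Dry <- runif(n)
trials <- 20  # assumed number of trials per record

# Arbitrary per-level intercepts and slopes on the logit scale.
true_int   <- c(Agriculture = -1,  Pastoral = 0, Protected = 0.5)
true_slope <- c(Agriculture = 0.5, Pastoral = 1, Protected = -1)
eta  <- true_int[as.character(Landuse)] + true_slope[as.character(Landuse)] * Dry
WD_Y <- rbinom(n, trials, plogis(eta))
WD_N <- trials - WD_Y

# Same model form as in the original post.
model <- glm(cbind(WD_Y, WD_N) ~ Dry * Landuse, family = binomial)
b <- coef(model)

# Agriculture is the reference level: its line is the intercept plus the Dry slope.
intercepts <- c(
  Agriculture = b[["(Intercept)"]],
  Pastoral    = b[["(Intercept)"]] + b[["LandusePastoral"]],
  Protected   = b[["(Intercept)"]] + b[["LanduseProtected"]]
)
slopes <- c(
  Agriculture = b[["Dry"]],
  Pastoral    = b[["Dry"]] + b[["Dry:LandusePastoral"]],
  Protected   = b[["Dry"]] + b[["Dry:LanduseProtected"]]
)

# Same fit, reparameterised so every level gets its own intercept and slope row.
model2 <- glm(cbind(WD_Y, WD_N) ~ 0 + Landuse + Landuse:Dry, family = binomial)
summary(model2)
```

In `model2` each p-value tests whether that level's own intercept or slope is zero on the logit scale, which is a different question from the "difference from Agriculture" tests in the original output.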