# Interpret GLM model in test of difference in proportions

#### davidecortellino

##### New Member
I'm facing the following situation: we ran a marketing campaign with three groups, two test groups and one control group. The two test groups received different coupons, worth 3 euros and 6 euros respectively. My understanding is that the best way to test the difference in proportions is a GLM with a binomial distribution.

The dataset has one row per customer, with one column for the group and one for the redemption result (1=yes, 0=no). A third column, "enviados", is just a counter, being "1" for all the observations.
Code:
> head(exp11)
  TRAN_DURING_CAMP_FLG enviados bono_recibido
1                    0        1    BONO6EUROS
2                    0        1    BONO6EUROS
3                    0        1    BONO6EUROS
4                    0        1    BONO6EUROS
5                    0        1    BONO6EUROS
6                    0        1    BONO6EUROS
I wanted to test whether there is a significant difference in the redemption rate between the groups. Here are the rates:

Code:
> head(testexp)
  bono_recibido TRAN_DURING_CAMP_FLG enviados      ratio
1     benchmark                 1888     9600 0.19666667
2    BONO3EUROS                  113     1316 0.08586626
3    BONO6EUROS                 5449    27227 0.20013222
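As a quick sanity check, the rates in the table can be recomputed from the counts, and an overall test of equal proportions can be run before fitting any model. This is only a sketch: the data frame is rebuilt by hand from the counts shown above, the column names are taken from the model formula that appears later in the post, and `prop.test` is used here as a simple cross-check, not as the method from the original analysis.

```r
# Counts copied from the summary table above (rebuilt by hand for illustration)
testexp <- data.frame(
  bono_recibido = c("benchmark", "BONO3EUROS", "BONO6EUROS"),
  TRAN_DURING_CAMP_FLG = c(1888, 113, 5449),  # customers who redeemed
  enviados = c(9600, 1316, 27227)             # customers in the group
)

# Redemption rate per group
testexp$ratio <- testexp$TRAN_DURING_CAMP_FLG / testexp$enviados
print(testexp)

# Overall chi-squared test of "all three proportions are equal"
overall <- prop.test(x = testexp$TRAN_DURING_CAMP_FLG, n = testexp$enviados)
print(overall)
```

A small p-value here only says that at least one group differs; it does not say which one, which is what the model contrasts and the post-hoc test are for.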
Indeed, the proportion redeemed differs between the groups ("ratio" column). To test whether the differences are significant, I fitted a GLM, computed the odds ratios and CIs, and ran a post-hoc test:

Call:
glm(formula = TRAN_DURING_CAMP_FLG ~ bono_recibido, family = "binomial", data = exp2)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-0.6683  -0.6683  -0.6683  -0.6618   2.2158

Coefficients:
                        Estimate Std. Error z value Pr(>|z|)
(Intercept)             -1.40726    0.02568 -54.805   <2e-16 ***
bono_recibidoBONO3EUROS -0.95793    0.10166  -9.423   <2e-16 ***
bono_recibidoBONO6EUROS  0.02179    0.02981   0.731    0.465
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

Null deviance: 37673  on 38142  degrees of freedom
Residual deviance: 37548  on 38140  degrees of freedom
AIC: 37554

Number of Fisher Scoring iterations: 4

> exp(coef(mod.bin)) #odds ratio
            (Intercept) bono_recibidoBONO3EUROS bono_recibidoBONO6EUROS
              0.2448133               0.3836877               1.0220305
> exp(confint(mod.bin)) # CI
Waiting for profiling to be done...
                            2.5 %    97.5 %
(Intercept)             0.2327371 0.2573840
bono_recibidoBONO3EUROS 0.3127931 0.4661529
bono_recibidoBONO6EUROS 0.9641824 1.0837123

> summary(glht(mod.bin, mcp(bono_recibido="Tukey")))

Simultaneous Tests for General Linear Hypotheses

Multiple Comparisons of Means: Tukey Contrasts

Fit: glm(formula = cbind(TRAN_DURING_CAMP_FLG, enviados - TRAN_DURING_CAMP_FLG) ~
bono_recibido, family = binomial, data = testexp)

Linear Hypotheses:
                             Estimate Std. Error z value Pr(>|z|)
BONO3EUROS - benchmark == 0  -0.95793    0.10169  -9.420   <1e-04 ***
BONO6EUROS - benchmark == 0   0.02179    0.02981   0.731    0.728
BONO6EUROS - BONO3EUROS == 0  0.97972    0.09955   9.841   <1e-04 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Adjusted p values reported -- single-step method)
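
The output above mixes a fit on the row-level 0/1 data with a fit on the aggregated counts. The two should give the same coefficient estimates, and the sketch below refits the aggregated version from the three rows of counts to show that. It assumes the counts from the summary table, sets the factor levels explicitly so that "benchmark" is the reference group, and uses Wald intervals from `confint.default` rather than the profile-likelihood intervals shown in the post, so the interval endpoints will differ slightly.

```r
# Aggregated counts, rebuilt by hand from the summary table (illustration only)
testexp <- data.frame(
  bono_recibido = factor(
    c("benchmark", "BONO3EUROS", "BONO6EUROS"),
    levels = c("benchmark", "BONO3EUROS", "BONO6EUROS")  # benchmark = reference
  ),
  TRAN_DURING_CAMP_FLG = c(1888, 113, 5449),
  enviados = c(9600, 1316, 27227)
)

# Binomial GLM on (successes, failures) per group
mod.agg <- glm(
  cbind(TRAN_DURING_CAMP_FLG, enviados - TRAN_DURING_CAMP_FLG) ~ bono_recibido,
  family = binomial, data = testexp
)

# Exponentiated coefficients with Wald 95% intervals.
# The (Intercept) row is the odds in the benchmark group;
# the other rows are odds ratios relative to benchmark.
or_table <- exp(cbind(OR = coef(mod.agg), confint.default(mod.agg)))
print(round(or_table, 4))
```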
I have some doubts about how to interpret the results of the test:

Looking at the GLM, it seems that the coefficient for the bono6euro stimulus is not significant, while those for the bono3euro stimulus and the control group (the intercept) are.

Question: Can I conclude that receiving the 3 euro stimulus significantly affects the chance that a customer redeems the coupon (negatively in this case, since the odds ratio of the 3 euro stimulus is lower than that of the intercept: 0.2448133*0.3836877=0.0939), while receiving the 6 euro stimulus does not significantly change the odds relative to receiving no stimulus at all?
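
To make the arithmetic in this question concrete, here is a small sketch using the exponentiated coefficients reported above. With treatment coding, the exponentiated intercept is the odds in the reference (benchmark) group, and each exponentiated group coefficient is an odds ratio relative to that group, so the product 0.2448133*0.3836877 is the odds of redemption in the 3 euro group. The variable names and the helper `odds_to_prob` are made up for illustration.

```r
# Values copied from exp(coef(mod.bin)) in the post
odds_benchmark <- 0.2448133  # exp(Intercept): odds in the benchmark group
or_3euros <- 0.3836877       # odds ratio, BONO3EUROS vs benchmark
or_6euros <- 1.0220305       # odds ratio, BONO6EUROS vs benchmark

# Odds in each test group = baseline odds * odds ratio
odds_3euros <- odds_benchmark * or_3euros
odds_6euros <- odds_benchmark * or_6euros

# Hypothetical helper: convert odds back to a probability
odds_to_prob <- function(odds) odds / (1 + odds)

rates <- c(
  benchmark  = odds_to_prob(odds_benchmark),
  BONO3EUROS = odds_to_prob(odds_3euros),
  BONO6EUROS = odds_to_prob(odds_6euros)
)
print(round(rates, 4))
```

The recovered probabilities should match the "ratio" column of the summary table up to rounding, which is a useful check that the coefficients are being read on the right scale.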
Looking at the deviances, the residual deviance is only slightly lower than the null deviance. I understand this to mean that the model explains very little of the variability in the dataset (less than 1%).

Question: Am I right to conclude that receiving a stimulus has low explanatory power for the redemption behaviour in the sample? That is, that neither of the two stimuli triggered greater spending in the test groups than in the control group, meaning the stimuli did not work properly?
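
The two deviances can be used in two different ways, and it may help to compute both. The sketch below takes the rounded deviances and degrees of freedom from the summary output and computes (a) a likelihood-ratio test of the model against the null model and (b) the proportion of deviance explained. The variable names are made up for illustration. Note that the two quantities answer different questions: with a 0/1 outcome and large samples, a group effect can be clearly detectable while the proportion of deviance explained stays small.

```r
# Values copied (rounded) from the summary output in the post
null_dev  <- 37673; null_df  <- 38142
resid_dev <- 37548; resid_df <- 38140

# (a) Likelihood-ratio test: does the group factor improve on the null model?
lr_stat <- null_dev - resid_dev
lr_df   <- null_df - resid_df
p_value <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

# (b) Proportion of deviance explained by the group factor
dev_explained <- 1 - resid_dev / null_dev

cat("LR statistic:", lr_stat, "on", lr_df, "df, p-value:", format(p_value), "\n")
cat("Deviance explained:", round(100 * dev_explained, 2), "%\n")
```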
The confidence intervals for the odds ratios do not include "0" and are narrow.

Question: Can I assume that the odds ratios are precise enough to tell me that, being <1, exposure to the stimulus is associated with lower odds of a purchase in the sample? Which would mean, again, that the stimuli used did not affect consumer behaviour?
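
One point worth checking here: on the odds-ratio scale the "no effect" value is 1, not 0, since an odds ratio of 1 means equal odds in both groups (0 is the no-effect value on the log-odds scale of the raw coefficients). The sketch below applies that check to the two intervals reported above; the matrix is typed in by hand from the `exp(confint(mod.bin))` output.

```r
# 95% intervals for the odds ratios, copied from the post
ci <- rbind(
  bono_recibidoBONO3EUROS = c(lower = 0.3127931, upper = 0.4661529),
  bono_recibidoBONO6EUROS = c(lower = 0.9641824, upper = 1.0837123)
)

# On the odds-ratio scale, check whether each interval contains 1
contains_one <- ci[, "lower"] <= 1 & ci[, "upper"] >= 1
print(contains_one)

# Equivalent check on the log-odds scale, where the null value is 0
log_ci <- log(ci)
contains_zero <- log_ci[, "lower"] <= 0 & log_ci[, "upper"] >= 0
print(contains_zero)
```

Read this way, the 3 euro interval lies entirely below 1, while the 6 euro interval straddles 1, which is consistent with the p-values in the coefficient table.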

#### consuli

##### Member
Maybe your post would be clearer if you

1. present the experiment and your working hypotheses,
2. present the data,
3. present the model building, y ~ x1 + x2 + x...,
4. present the model results, like summary() ...