I'm facing the following situation: we ran a marketing campaign with three groups, two test groups and one control group. Each test group was sent a different type of coupon: 3 euros vs. 6 euros. I understand that the best way to test the difference in proportions is a GLM with a binomial distribution.
The dataset is composed of two columns, one referring to the reference group and the other to the redemption result (1=yes, 0=no). The column "enviados" is just a counter, being "1" for all the observations.
I wanted to test whether there is a significant difference in the redemption rate between the groups. The rates are in the "ratio" column of the aggregated table (testexp) shown further down.
I have some doubts about how to interpret the results of the test (the full R output is reproduced after the questions):
Looking at the GLM output, the coefficient for the BONO6EUROS stimulus is not significant, while the BONO3EUROS coefficient and the intercept (the control group) are.
Question: Can I conclude that receiving the 3 euro stimulus significantly affects the chance that the customer redeems the coupon? (In this case negatively, since the odds in the 3 euro group, 0.2448133*0.3836877=0.0939, are lower than the control odds given by the intercept, 0.2448133.) And can I conclude that receiving the 6 euro stimulus does not significantly change the odds relative to receiving no stimulus at all?
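As a sanity check on the arithmetic above, the fitted odds can be converted back to redemption probabilities. This is a minimal sketch in base R; the names `odds_control`, `or_3eur` and `or_6eur` are hypothetical labels for the values printed by exp(coef(mod.bin)), not objects from the original session.

```r
# Values copied from the exp(coef(mod.bin)) output in this post
odds_control <- 0.2448133   # exp(intercept): odds of redemption in the control group
or_3eur      <- 0.3836877   # odds ratio, BONO3EUROS vs. control
or_6eur      <- 1.0220305   # odds ratio, BONO6EUROS vs. control

# Odds in each group = control odds * odds ratio for that group
odds <- c(benchmark  = odds_control,
          BONO3EUROS = odds_control * or_3eur,
          BONO6EUROS = odds_control * or_6eur)

# Convert odds to probabilities: p = odds / (1 + odds)
prob <- odds / (1 + odds)

print(round(odds, 4))
print(round(prob, 4))
```

Because the model has a single three-level factor, these probabilities should agree with the observed rates in the "ratio" column (roughly 0.197, 0.086 and 0.200).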
Looking at the deviances, the residual deviance (37548) is only slightly lower than the null deviance (37673). I understand this to mean that the model explains very little of the variability in the dataset (less than 1%).
Question: Am I right to conclude that having received a stimulus has low explanatory power for the redemption behaviour in the sample? Would that mean that receiving either of the two stimuli did not trigger any greater spending in the test groups than in the control group, i.e. that the stimuli did not work properly?
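The two quantities involved here can be computed directly from the reported deviances. The sketch below uses only numbers printed in the summary output; the variable names are hypothetical. For ungrouped 0/1 data, one minus the ratio of residual to null deviance equals McFadden's pseudo R-squared, and the drop in deviance can be compared to a chi-squared distribution as a likelihood-ratio test of the group factor.

```r
# Deviances and degrees of freedom copied from the summary output in this post
null_dev  <- 37673; null_df  <- 38142
resid_dev <- 37548; resid_df <- 38140

# Share of the null deviance removed by the group factor
# (McFadden's pseudo R-squared for ungrouped 0/1 data)
pseudo_r2 <- 1 - resid_dev / null_dev

# Likelihood-ratio test of the group factor: the drop in deviance is
# compared to a chi-squared distribution with (null_df - resid_df) df
lr_stat <- null_dev - resid_dev
lr_df   <- null_df - resid_df
lr_p    <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

cat("pseudo R2:", round(pseudo_r2, 4), "\n")
cat("LR statistic:", lr_stat, "on", lr_df, "df, p =", format.pval(lr_p), "\n")
```

A small pseudo R-squared and a very small likelihood-ratio p-value can occur together: the first describes how much individual-level variation the groups account for, the second whether the group rates differ at all.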
The confidence intervals for the odds ratios do not include "0" and are narrow.
Question: Can I assume that the odds ratios are precise enough to tell me that, being <1, exposure to the stimulus is associated with lower odds of a purchase in the sample? Would that mean, again, that the stimuli used did not affect consumer behaviour?
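One way to check the intervals against the no-effect value is sketched below. It assumes Wald intervals (estimate plus or minus about 1.96 standard errors on the log-odds scale), which only approximate the profile-likelihood intervals that confint() reports, so the bounds differ slightly. On the odds-ratio scale the no-effect value is 1, which corresponds to a coefficient of 0 on the log-odds scale. The variable names are hypothetical.

```r
# Estimates and standard errors copied from the coefficient table in this post
est <- c(BONO3EUROS = -0.95793, BONO6EUROS = 0.02179)   # log odds ratios vs. control
se  <- c(BONO3EUROS =  0.10166, BONO6EUROS = 0.02981)

z <- qnorm(0.975)   # about 1.96 for a 95% interval

# Wald interval on the log-odds scale, then exponentiate to the odds-ratio scale
or_ci <- cbind(OR    = exp(est),
               lower = exp(est - z * se),
               upper = exp(est + z * se))

# An odds ratio of 1 means "no difference from control"
contains_one <- or_ci[, "lower"] <= 1 & or_ci[, "upper"] >= 1

print(round(or_ci, 4))
print(contains_one)
```

Each stimulus gets its own check here, since the two odds ratios need not lie on the same side of 1.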
Code:
> head(exp11)
  TRAN_DURING_CAMP_FLG enviados bono_recibido
1                    0        1    BONO6EUROS
2                    0        1    BONO6EUROS
3                    0        1    BONO6EUROS
4                    0        1    BONO6EUROS
5                    0        1    BONO6EUROS
6                    0        1    BONO6EUROS
Code:
> head(testexp)
  bono_recibido TRAN_DURING_CAMP_FLG enviados      ratio
1     benchmark                 1888     9600 0.19666667
2    BONO3EUROS                  113     1316 0.08586626
3    BONO6EUROS                 5449    27227 0.20013222
Indeed, the proportion redeemed differs between the groups ("ratio" column). To test whether the differences are significant, I fitted a GLM, computed the odds ratios and their CIs, and ran a post-hoc test:
Call:
glm(formula = TRAN_DURING_CAMP_FLG ~ bono_recibido, family = "binomial", data = exp2)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-0.6683  -0.6683  -0.6683  -0.6618   2.2158

Coefficients:
                         Estimate Std. Error z value Pr(>|z|)
(Intercept)              -1.40726    0.02568 -54.805   <2e-16 ***
bono_recibidoBONO3EUROS  -0.95793    0.10166  -9.423   <2e-16 ***
bono_recibidoBONO6EUROS   0.02179    0.02981   0.731    0.465
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 37673  on 38142  degrees of freedom
Residual deviance: 37548  on 38140  degrees of freedom
AIC: 37554

Number of Fisher Scoring iterations: 4
> exp(coef(mod.bin)) # odds ratios
            (Intercept) bono_recibidoBONO3EUROS bono_recibidoBONO6EUROS
              0.2448133               0.3836877               1.0220305
> exp(confint(mod.bin)) # CI
Waiting for profiling to be done...
                            2.5 %    97.5 %
(Intercept)             0.2327371 0.2573840
bono_recibidoBONO3EUROS 0.3127931 0.4661529
bono_recibidoBONO6EUROS 0.9641824 1.0837123
> summary(glht(mod.bin, mcp(bono_recibido="Tukey")))

  Simultaneous Tests for General Linear Hypotheses

Multiple Comparisons of Means: Tukey Contrasts

Fit: glm(formula = cbind(TRAN_DURING_CAMP_FLG, enviados - TRAN_DURING_CAMP_FLG) ~
    bono_recibido, family = binomial, data = testexp)

Linear Hypotheses:
                             Estimate Std. Error z value Pr(>|z|)
BONO3EUROS - benchmark == 0  -0.95793    0.10169  -9.420   <1e-04 ***
BONO6EUROS - benchmark == 0   0.02179    0.02981   0.731    0.728
BONO6EUROS - BONO3EUROS == 0  0.97972    0.09955   9.841   <1e-04 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Adjusted p values reported -- single-step method)
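For anyone who wants to reproduce the fit without the raw data, the aggregated counts in the testexp table are enough. The following is a minimal sketch in base R that rebuilds that table by hand and fits the grouped binomial form shown in the "Fit:" line above; the object name `mod_grouped` is hypothetical, and the factor levels are set explicitly so that "benchmark" is the reference level regardless of locale.

```r
# Aggregated counts copied from the head(testexp) output in this post
grp_levels <- c("benchmark", "BONO3EUROS", "BONO6EUROS")
testexp <- data.frame(
  bono_recibido        = factor(grp_levels, levels = grp_levels),
  TRAN_DURING_CAMP_FLG = c(1888, 113, 5449),    # redemptions per group
  enviados             = c(9600, 1316, 27227)   # coupons sent per group
)

# Binomial GLM on grouped data: the response is cbind(successes, failures)
mod_grouped <- glm(
  cbind(TRAN_DURING_CAMP_FLG, enviados - TRAN_DURING_CAMP_FLG) ~ bono_recibido,
  family = binomial, data = testexp
)

print(round(coef(mod_grouped), 5))        # log-odds scale
print(round(exp(coef(mod_grouped)), 4))   # control odds and odds ratios
```

The coefficients should agree with the ungrouped fit. The deviances will not, because the grouped model is saturated (three groups, three parameters), so they should not be compared with the null and residual deviances reported above.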