# Thread: Hypothesis Tests With Prior Information

1. ## Hypothesis Tests With Prior Information

I have a hypothesis testing question.

Say you're trying to evaluate whether a change to an email newsletter design is successful. To do this, you can send one design to half of your list and the other design to the other half, then use a hypothesis test to determine whether the difference is significant. If you do this for a large number of emails and randomly select who receives each design every time, the better design should eventually be determinable.

But not all subscribers on your email list are the same. Some have better past engagement rates than others, and you have data on each user's past engagement. If you randomly split the list, you might end up with one half having a higher baseline engagement rate. By taking past information into account, you could instead divide the list into groups with similar engagement rates and then randomize which design each user receives within those groups. You might even build a model that predicts engagement for each user and compare predicted engagement rates to actual rates.
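The grouping idea described above is usually called stratified randomization. Here is a minimal sketch of it, assuming past engagement is available as a per-user number; the function and variable names are hypothetical, not from any library:

```python
import random

def stratified_split(users, engagement, n_tiers=4, seed=0):
    """Assign users to designs "A"/"B", randomizing within engagement tiers.

    users: list of user ids.
    engagement: dict mapping user id -> past engagement rate.
    Users are ranked by engagement, cut into n_tiers contiguous tiers,
    and each tier is shuffled and split roughly in half, so both designs
    see a similar mix of high- and low-engagement users.
    """
    rng = random.Random(seed)
    ranked = sorted(users, key=lambda u: engagement[u])
    tier_size = -(-len(ranked) // n_tiers)  # ceiling division
    assignment = {}
    for i in range(0, len(ranked), tier_size):
        tier = ranked[i:i + tier_size]
        rng.shuffle(tier)
        half = len(tier) // 2
        for u in tier[:half]:
            assignment[u] = "A"
        for u in tier[half:]:
            assignment[u] = "B"
    return assignment
```

Within each tier the two designs differ by at most one user, so the overall split can be off by at most `n_tiers` users, while a purely random split can drift much further.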

It seems that by taking into account past information, you might be able to determine the better email design faster than pure random selection. Is this ever true? If so, in what situations is it true? How do you know when to take into account prior information and when to ignore it? How does taking into account prior information impact hypothesis tests?

2. ## Re: Hypothesis Tests With Prior Information

You could use past engagement rate as a covariate in the analysis (ANCOVA is the typical way to do this). But if you actually do the randomization correctly, you shouldn't end up with one group having a significantly higher average engagement rate than the other.
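In regression form, ANCOVA here is just regressing the outcome on a treatment indicator plus the baseline covariate. A minimal sketch using only NumPy (all names are illustrative; a real analysis would also report standard errors and p-values):

```python
import numpy as np

def ancova_effect(outcome, treated, covariate):
    """Estimate the treatment effect on `outcome`, adjusting for a
    baseline covariate (here, past engagement rate).

    Fits outcome ~ intercept + treated + covariate by least squares
    and returns the coefficient on the treatment indicator.
    """
    y = np.asarray(outcome, dtype=float)
    X = np.column_stack([
        np.ones(len(y)),                      # intercept
        np.asarray(treated, dtype=float),     # 0/1 design indicator
        np.asarray(covariate, dtype=float),   # past engagement rate
    ])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]  # coefficient on the treatment indicator
```

Adjusting for a covariate that predicts the outcome soaks up some of the outcome variance, which typically tightens the estimate of the design effect even when the randomization itself was fine.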

4. ## Re: Hypothesis Tests With Prior Information

Thanks for the reply. I understand that in the long term, over multiple emails, each engagement level should be sampled evenly according to probability theory.

But in the short term, we may divide the engagement levels unevenly. Suppose we send one design to one hundred people and the other design to another hundred people. What if one group ends up with 15 high-engagement users while the other group ends up with only 5? That could bias the evaluation of the design's success. Isn't it better to take prior information into account to mitigate this problem? Is there really no value in prior information when evaluating which design is better?
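How often does a split as lopsided as 15 vs. 5 actually happen? A quick Monte-Carlo sketch, using the numbers from the example above (20 high-engagement users among 200, randomly split into two groups of 100; all parameter names are illustrative):

```python
import random

def imbalance_probability(n_users=200, n_high=20, gap=10,
                          trials=20000, seed=1):
    """Estimate how often a purely random 50/50 split leaves the
    high-engagement users at least `gap` apart between the two groups
    (e.g. 15 vs. 5 is a gap of 10), by repeated simulation.
    """
    rng = random.Random(seed)
    users = [1] * n_high + [0] * (n_users - n_high)  # 1 = high engagement
    hits = 0
    for _ in range(trials):
        rng.shuffle(users)
        in_a = sum(users[: n_users // 2])            # high-eng users in group A
        in_b = n_high - in_a                         # high-eng users in group B
        if abs(in_a - in_b) >= gap:
            hits += 1
    return hits / trials
```

Under these assumptions the split is hypergeometric, and a 15-vs.-5 (or worse) imbalance turns out to be uncommon but not negligible, which is exactly the short-run risk that stratifying on past engagement is meant to remove.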
