#### m_edmonson

##### New Member
I work for a company that sells products online. I am attempting to create a common-sense SOP for our email marketing campaigns and have plenty of access to data from our previous ones. I am attempting to use statistics to show whether the structure of an email title results in higher open rates.

Essentially, I have separated our previous email titles into three defined groups that have different characteristics. The group is my independent variable.

I have three dependent variables: the open rate, click rate, and conversion rate for each email in those groups.

That is basically where I am stuck. I have downloaded a program called PSPP, which is essentially a free clone of SPSS with much of the same functionality and feel. I do not know which type of analysis I need to use in order to accept or reject my hypothesis.

Any suggestions or starting points for me? Anything and everything is helpful.

#### hlsmith

##### Omega Contributor
If your dependent variable is binary (opened: yes, no), I would highly recommend logistic regression. The benefit of this approach is that you can enter multiple covariates (independent variables). For example, you could test whether title, region, time of day, etc. improve the odds of opening.

#### m_edmonson

##### New Member
Thanks, the dependent variable(s) will, unfortunately, not be binary. What I have are actual open rates, calculated as a percentage of the total emails that were sent out. The total number of emails sent varies and increases with each email (as more and more users sign up), and therefore I take the number of opened emails and divide it by the total number of emails sent, which gives me an open rate percentage. I just take that number (say it is 34%) and turn it into the straight numerical value of .34. I have about 60 of these values, which range from .28 to .39. The same goes for the "click rate", which is the number of people who clicked on a link inside the email divided by the number of people who actually opened it. Conversion rate is a similar calculation for the number of those people who followed the link and ultimately made a purchase as a result of the original email. Hope this clarifies exactly what I am doing here.
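To make the three definitions concrete, here is a minimal Python sketch of the arithmetic described above. The function name `campaign_rates` and the click and purchase counts are hypothetical, chosen only for illustration, and the conversion rate is assumed to be purchases divided by clicks, which is one reading of the description.

```python
def campaign_rates(sent, opened, clicked, purchased):
    """Compute the three rates for one email campaign from raw counts.

    Assumed definitions (the last one is an interpretation):
      open rate       = opened / sent
      click rate      = clicked / opened
      conversion rate = purchased / clicked
    """
    return {
        "open_rate": opened / sent,
        "click_rate": clicked / opened,
        "conversion_rate": purchased / clicked,
    }


# Hypothetical campaign: 100,000 emails sent, 34% of them opened.
example = campaign_rates(sent=100_000, opened=34_000, clicked=3_400, purchased=170)
print(example)
```

Keeping the raw counts alongside the rates is useful, because some analyses work on counts rather than on percentages.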

#### hlsmith

##### Omega Contributor
So you don't have individual data? You can't just look at every single recipient individually and say opened yes/no?

#### m_edmonson

##### New Member
Well, I could, but each email is sent to over 100,000 people and an average of about 35,000 people open it. I am talking about 60 or so different email titles, which I have sorted into three groups based on the type of content.

#### Englund

##### TS Contributor
I used to struggle with exactly the same problem at my last job. I concluded that building models based on different emails gives very rough results, at best. Time of day, season, and the type of customers the mail is sent to play a very important role, so comparing open rates et cetera between two different mails is difficult.

Therefore I came to the conclusion that the best way to compare different email titles is to do typical A/B tests. Randomly (stratification not necessary because of very large groups, but still preferred) assign customers to different groups and send them different email titles.
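A minimal sketch of such an A/B test in Python, using only the standard library. It assumes two title variants, simple (unstratified) random assignment, and a pooled two-proportion z-test on the open counts. The helper names `assign_groups` and `two_proportion_z_test` and the open counts are hypothetical, not taken from any real campaign.

```python
import math
import random


def assign_groups(customer_ids, seed=0):
    """Randomly split customers into two groups of (nearly) equal size."""
    ids = list(customer_ids)
    random.Random(seed).shuffle(ids)
    half = len(ids) // 2
    return ids[:half], ids[half:]


def two_proportion_z_test(opens_a, sent_a, opens_b, sent_b):
    """Two-sided z-test for a difference between two open rates."""
    rate_a = opens_a / sent_a
    rate_b = opens_b / sent_b
    # Pooled open rate under the null hypothesis of no difference.
    pooled = (opens_a + opens_b) / (sent_a + sent_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / sent_a + 1 / sent_b))
    z = (rate_a - rate_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical mailing list of 100,000 customers, half per title.
group_a, group_b = assign_groups(range(100_000))

# Hypothetical results: title A opened 17,500 times, title B 17,000 times.
z, p = two_proportion_z_test(17_500, len(group_a), 17_000, len(group_b))
print(f"z = {z:.2f}, p = {p:.4f}")
```

Because both titles go out at the same time to randomly chosen customers, time of day, season, and customer type affect both groups equally, which is the point of the design.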

My guess is that conversion and click rate will be more or less independent of the title, basically because once a customer has opened an email, the content is what matters. The same reasoning goes for conversion rate. These are at least the conclusions I reached when working on the same problem.

#### Karabiner

##### TS Contributor
> I have about 60 of these values which range from .28 to .39.

So you have n=60 "subjects" grouped into 3 categories, and for each subject you have 1 value. You could compare the 3 groups with respect to these values using a Kruskal-Wallis H-test, or pairwise using Mann-Whitney U-tests.
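As a rough sketch of the Kruskal-Wallis comparison for exactly three groups, here is a standard-library Python version. The function names and the open rates are made up for illustration (only four per group to keep it short). The p-value uses the chi-square approximation with 2 degrees of freedom, whose tail probability is simply exp(-H/2); that approximation is crude for groups this small and is better suited to around 20 emails per group.

```python
import math


def average_ranks(values):
    """Return 1-based ranks (ties get the mean rank) and the tie-group sizes."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sizes = []
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes


def kruskal_wallis_three_groups(g1, g2, g3):
    """Kruskal-Wallis H statistic and approximate p-value for three groups."""
    groups = [g1, g2, g3]
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks, tie_sizes = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12 / (n * (n + 1)) * total - 3 * (n + 1)
    # Standard correction for tied values (equals 1 when there are no ties).
    h /= 1 - sum(t ** 3 - t for t in tie_sizes) / (n ** 3 - n)
    # Chi-square survival function with 2 degrees of freedom.
    return h, math.exp(-h / 2)


# Made-up open rates for three title groups.
title_a = [0.28, 0.29, 0.30, 0.31]
title_b = [0.32, 0.33, 0.34, 0.35]
title_c = [0.36, 0.37, 0.38, 0.39]
h_stat, p_value = kruskal_wallis_three_groups(title_a, title_b, title_c)
print(f"H = {h_stat:.3f}, p = {p_value:.4f}")
```

In PSPP or SPSS the same test is available among the nonparametric tests, so the code is only meant to show what the test computes.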

Or maybe, since n > 50 (the often awkward distribution of the residuals from the analysis of proportions does not matter much in that case), you could even perform a one-way ANOVA with pairwise follow-up comparisons. Since n=60, you could even include 2 or 3 covariates (such as time of day sent, type of email database used, etc.). Keep in mind Englund's comments, though.
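For illustration, the one-way ANOVA F statistic can be computed by hand as in the Python sketch below. Note one substitution: instead of looking up the p-value in the F distribution, as a statistics package would, the sketch estimates it by randomly permuting the group labels, which needs only the standard library. The function names and the open rates are made up, and no covariates are included.

```python
import random


def f_statistic(groups):
    """One-way ANOVA F: between-group over within-group mean square."""
    values = [x for g in groups for x in g]
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def permutation_anova(groups, n_permutations=2000, seed=0):
    """Observed F and a permutation estimate of its p-value."""
    rng = random.Random(seed)
    observed = f_statistic(groups)
    pooled = [x for g in groups for x in g]
    sizes = [len(g) for g in groups]
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        # Cut the shuffled values back into groups of the original sizes.
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if f_statistic(shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_permutations + 1)


# Made-up open rates for three title groups.
open_rates = [
    [0.28, 0.29, 0.30, 0.31],
    [0.32, 0.33, 0.34, 0.35],
    [0.36, 0.37, 0.38, 0.39],
]
f_obs, p_perm = permutation_anova(open_rates)
print(f"F = {f_obs:.2f}, permutation p = {p_perm:.4f}")
```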

With kind regards

K.

#### m_edmonson

##### New Member
Okay, so what I am thinking now is that I need to separate this into several experiments. In the first, title content type is the independent variable, taking three different values across the 60 emails, and open rate is the dependent variable. For click rate, I will need to find a way to measure, let's say, email content (i.e. what kind of deals/offers are given and what kind of products are shown [A, B, C, or D]). Then ultimately for conversion rate, there would have to be some sort of relationship between the discount offered and website usability.

#### Englund

##### TS Contributor
My tip is: keep it simple. You could make it a huge project. The title is fairly straightforward to handle with simple A/B tests, but the content of a mail is where it gets messy. Let's say a standard mail contains 10 pictures and 12 sentences; there are practically infinitely many ways to set this up in the email. Your picture database has probably got hundreds of photos. It is too big a project to find the optimal photo at the optimal place in the mail.

I would say that finding good, clickable content is a continuous project that stretches over the entire life of a company. If you're selling clothes and you're having a trousers drive, should you have dark blue or light blue trousers in the main picture? Well, A/B test it! If dark blue wins, put it in the main picture in the upcoming trousers mails. Next mail, compare dark blue to yellow. And so on.

I certainly hope your company doesn't expect you to find an optimal solution in a month.