# Thread: 2 Sample T Test for Nonnormal data

1. ## 2 Sample T Test for Nonnormal data

Hi everyone!

I work in the advertising industry and I am in the process of creating a t-test calculator. The calculator will be used to test for statistical differences between two advertisements, two campaigns, two web pages, etc. I've made a click-through rate significance calculator (using a Bernoulli distribution) and a calculator for the average order value (normal distribution, so a straightforward 2-sample t-test). I'm trying to make a revenue per visit calculator now, but I am stuck on what to do!

The vast majority of visitors to a website will not purchase anything, hence they will have a revenue value equal to zero. Since most visitors will have revenue equal to zero, the distribution will be heavily skewed, with a large spike at zero. The sample sizes should be quite large (n>1000). I'm at a loss for how to formulate a hypothesis test for this metric. Any advice would be much appreciated! Thanks!

2. ## Re: 2 Sample T Test for Nonnormal data

Yes, your data are highly non-normal. The t-test assumes normality. Hence, you cannot use the t-test. Is this not what you wanted to hear?

It sounds like the Mann-Whitney test (http://en.wikipedia.org/wiki/Mann%E2%80%93Whitney_U) would be a fine non-parametric alternative for your case.

3. ## Re: 2 Sample T Test for Nonnormal data

Correct, I knew a standard t-test would not be appropriate given the non-normal distribution of the data, so I have been searching for a different hypothesis testing procedure that does not rely on a normality assumption. Thanks for suggesting the Mann-Whitney test. I'm not familiar with this test but I will look into it right now. Thanks again!

4. ## Re: 2 Sample T Test for Nonnormal data

"Yes, your data are highly non-normal. The t-test assumes normality. Hence, you cannot use the t-test."

With such huge sample sizes the central limit theorem should apply.
The U-test will surely suffer from similar problems as the t-test here.

Of course in both analyses the question is whether mean or rank sum/
median, respectively, are the appropriate statistics. I would accompany
the analysis with a simple Chi² test (visitor purchased yes/no) and a
subgroup analysis only with those who purchased, for completeness.

With kind regards

K.
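
A minimal sketch of what the large-sample argument above amounts to in practice: a Welch-type test on the raw revenue-per-visit values, zeros included. The function names, the simulated data, and the choice of a normal (rather than t) reference distribution are assumptions made for illustration; with n in the thousands the two reference distributions are practically identical.

```python
import math
import random
from statistics import NormalDist, fmean, variance


def welch_z_test(a, b):
    """Two-sided Welch-type test with a normal reference distribution.

    Intended for large samples only, where the CLT makes the difference
    in sample means approximately normal.
    """
    # Squared standard error of the difference in means (unequal variances).
    se2 = variance(a) / len(a) + variance(b) / len(b)
    z = (fmean(a) - fmean(b)) / math.sqrt(se2)
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value


def simulate_visits(n, conv_rate, mean_order, rng):
    """Hypothetical data: 0 for non-buyers, exponential order value otherwise."""
    return [rng.expovariate(1.0 / mean_order) if rng.random() < conv_rate else 0.0
            for _ in range(n)]


rng = random.Random(42)
ad_a = simulate_visits(5000, 0.03, 80.0, rng)  # assumed 3% conversion rate
ad_b = simulate_visits(5000, 0.03, 80.0, rng)
z, p = welch_z_test(ad_a, ad_b)
print(f"z = {z:.3f}, p = {p:.3f}")
```

The sketch says nothing about whether the mean is the right statistic to compare; it only shows that the calculation itself needs nothing beyond each group's mean, variance, and size.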

5. ## Re: 2 Sample T Test for Nonnormal data

Hi, thanks for your reply. I have two separate calculations that look at conversion rate (yes/no purchase) and Average Order Value (average revenue for purchasing only visitors) like you pointed out.

My concern with the Mann-Whitney test is that the vast majority of values will be 0. If the majority of values share the same rank, it seems like this will be a problem? If I choose not to consider the 0 values, then I will have the same calculation as average order value, which will be normally distributed but won't be the metric I would like to calculate, which is Average Revenue per Visitor.
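
To make the concern about ties concrete, here is a rough sketch of the Mann-Whitney U test using the normal approximation with the standard tie correction. Tied values all receive the average of the ranks they occupy, and the variance of U is reduced accordingly. The function name is hypothetical, no continuity correction is applied, and the code assumes the pooled data contain at least two distinct values.

```python
import math
from collections import Counter
from statistics import NormalDist


def mann_whitney_tie_corrected(a, b):
    """Two-sided Mann-Whitney U test, normal approximation with tie correction."""
    n1, n2 = len(a), len(b)
    n = n1 + n2
    counts = Counter(a)
    counts.update(b)

    # Give every distinct value the average (mid) rank of its tie group.
    midrank = {}
    start = 0
    for value in sorted(counts):
        t = counts[value]
        midrank[value] = start + (t + 1) / 2.0  # mean of ranks start+1 .. start+t
        start += t

    u = sum(midrank[x] for x in a) - n1 * (n1 + 1) / 2.0

    # Ties shrink the variance of U; with no ties tie_term is 0.
    tie_term = sum(t**3 - t for t in counts.values()) / (n * (n - 1))
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term)

    z = (u - n1 * n2 / 2.0) / math.sqrt(var_u)
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return u, z, p_value


# Mostly-zero toy data: the zeros form one large tie group.
group_a = [0.0] * 97 + [20.0, 35.0, 50.0]
group_b = [0.0] * 95 + [15.0, 25.0, 40.0, 60.0, 90.0]
print(mann_whitney_tie_corrected(group_a, group_b))
```

With this many zeros the test is driven largely by how many visitors purchased at all, which is worth keeping in mind when interpreting it as a statement about revenue per visitor.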

6. ## Re: 2 Sample T Test for Nonnormal data

As for the point about the central limit theorem, I don't think the observations could be considered "random". For example, if the values were truly random, then the likelihood of someone making a purchase of \$10,000 (far end of the right tail) should be the same as someone making a purchase value in the far left tail. However, the far left tail will be quite fat at 0, and the likelihood of a \$0 purchase vs. \$10,000 will be heavily, heavily skewed to \$0. Thus it seems that the CLT would be void here. Thoughts?

7. ## Re: 2 Sample T Test for Nonnormal data

You may try to google "zero-inflated" for some relevant issues here.

I second Karabiner's idea here. I think you need to clarify your goal: to compare which campaign is better, but in what sense, and measured by what variable? For example, if you look at the number of purchases only, then as Karabiner points out you just need to compare the proportions with a proportion test.

Empirically, if the number of purchases is not too small, you may try to focus on the revenue generated (the non-zero data) and plot it to see whether a certain parametric distribution fits it well. Then you may conduct an appropriate parametric test for the mean. Of course a non-parametric alternative may be appropriate as well, since the data should not be as highly skewed as before.

If the number of purchases is very small, I do not think you can do much to compare the "magnitude" of the purchases. Actually, your situation should be very common in insurance companies, where they face zero-inflated data a lot. In that case, unless you can rely on external sources for additional information about the distribution of revenue and use it in your analysis to estimate the mean, I think you can only look at those rare purchases case by case.
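
For the purchase yes/no comparison mentioned above, a two-proportion z-test can be sketched as follows. Its squared statistic equals the Pearson chi-square statistic for the 2x2 table without continuity correction. The function name and the example counts are made up for illustration.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for equal conversion rates, using the pooled proportion."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    # Standard error of the difference under the null of equal rates.
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    z = (p_a - p_b) / se
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: 200 of 5000 visitors bought via ad A, 150 of 5000 via ad B.
z, p = two_proportion_z_test(200, 5000, 150, 5000)
print(f"z = {z:.3f}, p = {p:.4f}")
```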

8. ## Re: 2 Sample T Test for Nonnormal data

"For example, if the values were truly random, then the likelihood of someone making a purchase of \$10,000 (far end of the right tail) should be the same as someone making a purchase value in the far left tail."

Sorry, that doesn't make sense. If the underlying distribution is non-uniform
(as is usually the case), e.g. heavily skewed, then a random sample from
that distribution will reflect that. Randomness of the sampling has nothing to
do with the shape of the distribution in the population.

With kind regards

K.

9. ## Re: 2 Sample T Test for Nonnormal data

Originally Posted by BGM

You may try to google "zero-inflated" for some relevant issues here.

I second Karabiner's idea here. I think you need to clarify your goal: to compare which campaign is better, but in what sense, and measured by what variable? For example, if you look at the number of purchases only, then as Karabiner points out you just need to compare the proportions with a proportion test.

Empirically, if the number of purchases is not too small, you may try to focus on the revenue generated (the non-zero data) and plot it to see whether a certain parametric distribution fits it well. Then you may conduct an appropriate parametric test for the mean. Of course a non-parametric alternative may be appropriate as well, since the data should not be as highly skewed as before.

If the number of purchases is very small, I do not think you can do much to compare the "magnitude" of the purchases. Actually, your situation should be very common in insurance companies, where they face zero-inflated data a lot. In that case, unless you can rely on external sources for additional information about the distribution of revenue and use it in your analysis to estimate the mean, I think you can only look at those rare purchases case by case.

Hi BGM and thanks for your reply. I have already created two calculations: one that captures likelihood to purchase (binary yes/no, this is the conversion rate) and one for only revenue-generating visits (this is the average order value). My quest is to create a third calculation, which is revenue per visitor (i.e. the average revenue of a visitor to the site via ad A versus the average revenue of a visitor to the site via ad B). This means I will need to include revenue for all visits, hence the heavily zero-inflated data set. Any ideas about hypothesis testing with zero-inflated data? I have yet to find anything via Google search about hypothesis testing. Thanks again!
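
One way to see how the three metrics relate is to write out the moments of the revenue per visitor. The notation here is an assumption introduced for illustration: $R$ is the revenue of one visit, $p$ the conversion rate, and $\mu$ and $\sigma^2$ the mean and variance of the order value among purchasing visits. By the laws of total expectation and total variance:

$$
E[R] = p\,\mu, \qquad \operatorname{Var}(R) = p\,\sigma^2 + p\,(1-p)\,\mu^2 .
$$

So the mean revenue per visitor is the conversion rate times the average order value, and its variance is finite whenever the order-value variance is, however many zeros the data contain.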

10. ## Re: 2 Sample T Test for Nonnormal data

Originally Posted by Karabiner
Sorry, that doesn't make sense. If the underlying distribution is non-uniform
(as is usually the case), e.g. heavily skewed, then a random sample from
that distribution will reflect that. Randomness of the sampling has nothing to
do with the shape of the distribution in the population.

With kind regards

K.

I guess my holdup with applying the CLT here is that I know as more and more observations are included in the sample, the data will not begin to take the form of a normal distribution whatsoever. I am not an expert on the CLT, perhaps you could chime in here?
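
A small simulation may help separate the two things being discussed here: the raw revenues stay heavily skewed no matter how many are collected, whereas the CLT is a statement about the distribution of the sample mean. The sketch below uses made-up parameters (3% conversion rate, exponential order values) and compares the skewness of one sample of raw revenues with the skewness of many simulated sample means.

```python
import random
from statistics import fmean, pstdev


def simulate_visits(n, rng, conv_rate=0.03, mean_order=80.0):
    """Hypothetical visit revenues: 0 for non-buyers, exponential otherwise."""
    return [rng.expovariate(1.0 / mean_order) if rng.random() < conv_rate else 0.0
            for _ in range(n)]


def skewness(xs):
    """Population skewness: 0 for a symmetric distribution such as the normal."""
    m, s = fmean(xs), pstdev(xs)
    return fmean([((x - m) / s) ** 3 for x in xs])


rng = random.Random(0)
raw = simulate_visits(2000, rng)                # raw revenues from one test
means = [fmean(simulate_visits(2000, rng))      # mean revenue per visit in each
         for _ in range(300)]                   # of 300 simulated repeat tests

raw_skew = skewness(raw)
mean_skew = skewness(means)
print(f"skewness of raw revenues: {raw_skew:.2f}")
print(f"skewness of sample means: {mean_skew:.2f}")
```

Under these assumptions the raw data should show a large positive skewness while the sample means should be far closer to symmetric, which is the sense in which the CLT applies.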

11. ## Re: 2 Sample T Test for Nonnormal data

Also, the CLT looks like it holds for the "means of an infinite number of samples from the population." I'm only looking at one sample, albeit a very large one. I suppose I could break the revenue per visitor down on a daily basis and record 30 days' worth of data to give me 30 sample means. The problem is that most ad tests will not run for 30 days. Ideally, it would be a test that only looks at one large sample, because we would like the testing to be an automatic significance calculation in Excel where all you have to do is paste in the revenue values for every visitor.
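
For what it is worth, a test on the difference in mean revenue per visitor can be run from a single sample per ad. One option that avoids a distributional assumption is a permutation test: shuffle the ad labels many times and see how often the shuffled difference in means is at least as large as the observed one. The sketch below is in Python rather than Excel, and the function name, defaults, and example data are assumptions for illustration.

```python
import random
from statistics import fmean


def permutation_test(a, b, n_perm=2000, seed=0):
    """Two-sided permutation test for a difference in means."""
    rng = random.Random(seed)
    observed = abs(fmean(a) - fmean(b))
    pooled = list(a) + list(b)
    n_a, n_b = len(a), len(b)
    total = sum(pooled)

    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)                 # random reassignment of ad labels
        sum_a = sum(pooled[:n_a])
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-12:        # small tolerance for float rounding
            hits += 1

    # Add-one correction keeps the estimated p-value strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical pasted-in revenues, one value per visitor.
ad_a = [0.0] * 970 + [40.0] * 30
ad_b = [0.0] * 960 + [45.0] * 40
print(permutation_test(ad_a, ad_b, n_perm=500))
```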
