The company I work for does a lot of testing on various subjects (marketing promotions, products, website changes, etc.) in order to determine which treatment is the most effective at driving various metrics (sales, for example).
Here is a hypothetical example: Management wants to compare the effectiveness of running a 20% off sale versus a 30% off sale in retail stores over the course of two weeks. Two groups of 50 stores are randomly selected and the test is executed. Group A (20% off) had an average of $500 of sales per store over the two weeks. Group B (30% off) had an average of $575 of sales per store over the two weeks. Now we try to determine whether the difference in sales between the two groups is statistically significant. So, we decide to use the trusty t test (assume all of its requirements are met).
My question revolves around the sample size and its relation to time. In executing a test like this, where we are looking at a total average over time, does each day constitute a "sample"? For example, in the scenario above, would my sample size for each group be 50 stores × 14 days = 700 (treating the sales for each store on each day as an observation), or is my sample size just the 50 stores that we are calculating the total average across? Obviously, the sample size affects the significance of the result.
I've posed this question to several people before and have not been able to get a definite answer. Does anyone have any input?
You could think of each sample of 50 stores as comprising 700 total observations if you want, but it would not be appropriate to let N=1400 when testing for group differences. To do that would be to ignore a major source of dependence in your data. The t test assumes independent errors, but obviously the 14 errors associated with a particular store would not be independent. So you would use N=100, where each store has only a single error and they are all presumably independent.
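To make the distinction concrete, here is a small sketch (using simulated data, since the real store-level sales are not given) that runs both versions of the test on the same hypothetical data: the appropriate store-level test on 50 + 50 two-week totals, and the inappropriate store-day test on 700 + 700 daily observations. The per-store baseline shift in the simulation is exactly the shared "store effect" that makes a store's 14 daily errors dependent. The function names and the noise parameters are illustrative assumptions, not anything from the original question.

```python
# Sketch: store-level t test (N = 100) vs. the inappropriate
# store-day t test (N = 1400) on simulated two-week sales data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_stores, n_days = 50, 14

def simulate_group(mean_daily_sales):
    """Simulate daily sales for one group of stores.

    Each store gets its own baseline shift (the shared 'store effect'),
    so that store's 14 daily deviations are NOT independent of each other.
    The standard deviations here are arbitrary illustrative choices.
    """
    store_effects = rng.normal(0, 8, size=n_stores)           # per-store baseline
    daily_noise = rng.normal(0, 5, size=(n_stores, n_days))   # day-to-day noise
    return mean_daily_sales + store_effects[:, None] + daily_noise

group_a = simulate_group(500 / 14)   # 20% off: ~$500 per store over two weeks
group_b = simulate_group(575 / 14)   # 30% off: ~$575 per store over two weeks

# Appropriate: one observation per store (its two-week total), N = 50 + 50.
t_store, p_store = stats.ttest_ind(group_a.sum(axis=1), group_b.sum(axis=1))

# Inappropriate: every store-day as an observation, N = 700 + 700.
t_day, p_day = stats.ttest_ind(group_a.ravel(), group_b.ravel())

print(f"store-level test: t = {t_store:.2f}, p = {p_store:.4g}")
print(f"store-day test:   t = {t_day:.2f}, p = {p_day:.4g}")
```

The store-day test typically reports a much more "significant" result, but only because it pretends to have 14 times as many independent observations as it really does.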
Jake, thanks for the reply. Let me make sure I'm understanding correctly. After reading your response I did some further research on "independent errors." An error is basically defined as the deviation of an individual value from the group mean (a concept related to standard deviation). So, relating this back to my example and original question: if one of the stores in the test happens to be a low-volume store whose daily sales are lower on average than the rest of the stores in the same test group, you are saying that its errors are a function of the store and are therefore not independent. So, in that case, it violates the "rules" of a t test to count the store 14 times in the sample size, as those errors would not be statistically independent of each other. Do I have that correct?
Right. The key concept with violations of independence is: if knowing the value of one error or group of errors allows you to make above-chance predictions about the value of another error or group of errors, then those errors are not independent. You can sometimes not worry too much about violations of normality or homogeneity of variance, but you generally don't want to mess around when it comes to nonindependence.
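If it helps to see why nonindependence is the violation you can't ignore, here is a hedged simulation sketch: generate many datasets under the null (both discounts have the same true effect) with a shared per-store effect, and count how often each test falsely rejects at α = 0.05. All the numbers (store-effect and noise standard deviations, number of simulations) are illustrative assumptions chosen to make the dependence strong.

```python
# Sketch: false-positive rates under the null hypothesis when errors
# within a store are dependent (shared per-store baseline).
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_stores, n_days, n_sims, alpha = 50, 14, 2000, 0.05

def simulate_group():
    """One group of stores, no true treatment effect (same mean everywhere)."""
    store_effects = rng.normal(0, 8, size=n_stores)
    daily_noise = rng.normal(0, 5, size=(n_stores, n_days))
    return 36.0 + store_effects[:, None] + daily_noise

rejections_store = 0   # correct test: one mean per store, N = 100
rejections_day = 0     # inappropriate test: every store-day, N = 1400
for _ in range(n_sims):
    a, b = simulate_group(), simulate_group()
    if stats.ttest_ind(a.mean(axis=1), b.mean(axis=1)).pvalue < alpha:
        rejections_store += 1
    if stats.ttest_ind(a.ravel(), b.ravel()).pvalue < alpha:
        rejections_day += 1

print(f"store-level false-positive rate: {rejections_store / n_sims:.3f}")
print(f"store-day false-positive rate:   {rejections_day / n_sims:.3f}")
```

The store-level test rejects about 5% of the time, as advertised; the store-day test rejects far more often than 5%, i.e., it routinely declares a difference that isn't there. That is why the dependence can't simply be waved away the way a modest normality violation sometimes can.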