Analysis: 2 groups with drastically different sample sizes

I want to compare 2 groups with very different sample sizes (Group 1: 8,000; Group 2: 300,000).

Which tests can I use that preserve statistical power? And how do I check whether the groups have equal or unequal variances?


Less is more. Stay pure. Stay poor.
What is the research question you are trying to answer? Typically, the more data the better. Are these sample sizes representative of the true population sizes, or is there possible sampling bias?


Well-Known Member
With groups that large, you are very likely to find a statistically significant difference, even though in practice the actual difference may not be of any particular interest.
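To make that concrete, here is a minimal sketch using made-up summary statistics (the means, SDs, and the helper name `two_sample_z` are all assumptions for illustration, not the poster's data). It uses a large-sample z approximation with unequal variances allowed, and shows that a gap of only 0.05 SD is already far past the usual significance thresholds at these sample sizes.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """Return (z, two-sided p) for a difference in means.

    Large-sample normal approximation; the two SDs may differ.
    """
    se = sqrt(sd1**2 / n1 + sd2**2 / n2)
    z = (mean1 - mean2) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Hypothetical numbers: the groups differ by only 0.05 SD.
z, p = two_sample_z(0.05, 1.0, 8_000, 0.0, 1.0, 300_000)
print(f"z = {z:.2f}, two-sided p = {p:.2g}")
```

Whether a 0.05 SD gap matters is a subject-matter question that the p-value cannot answer.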


TS Contributor
Do you really intend to perform a statistical test? With n=308,000?

If we consider only the smaller group, the standard error of an estimated mean would be
SD / sqrt(8000) ≈ SD / 89. For group comparisons, the standard error becomes virtually zero relative to SD.
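The arithmetic can be checked with a few lines of Python. This is only a sketch: it assumes both groups share the same SD (set to 1 here so the results read as fractions of SD), and the variable names are made up for illustration.

```python
from math import sqrt

n_small, n_large = 8_000, 300_000
sd = 1.0  # assumed common SD, so results are in SD units

# Standard error of the mean in the smaller group alone.
se_small = sd / sqrt(n_small)

# Standard error of the difference between the two group means.
se_diff = sd * sqrt(1 / n_small + 1 / n_large)

print(f"sqrt(8000)        = {sqrt(n_small):.1f}")
print(f"SE, smaller group = {se_small:.5f} SD")
print(f"SE, difference    = {se_diff:.5f} SD")
```

The standard error of the difference is only slightly larger than that of the smaller group alone, so the smaller group is what limits precision here.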


Well-Known Member
One approach is to define, before you start, an indifference zone, which says something like "I am not interested if the difference is less than +/-5 mm." Then find a 95% confidence interval for the difference using the data. Plot both the indifference zone and the CI on a number line and inspect the diagram.
Now a variety of conclusions are possible, such as "the difference is real and important" or "the difference is real but not important" or "the difference is real but we can't say if it is important or not" or even "the difference may be important but we can't even say if it is real." It depends on how the CI overlaps the indifference zone and whether it contains 0.
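The reading of the diagram can be sketched in code. Everything here is illustrative: the function names `diff_ci` and `classify`, the summary statistics, and the normal-approximation CI are assumptions, and the fifth outcome (CI contains 0 and lies wholly inside the zone) is an extra case not listed above.

```python
from math import sqrt
from statistics import NormalDist


def diff_ci(mean1, sd1, n1, mean2, sd2, n2, level=0.95):
    """Normal-approximation CI for mean1 - mean2 (unequal SDs allowed)."""
    diff = mean1 - mean2
    se = sqrt(sd1**2 / n1 + sd2**2 / n2)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return diff - z * se, diff + z * se


def classify(ci_low, ci_high, zone):
    """Compare a CI with the indifference zone [-zone, +zone]."""
    real = ci_low > 0 or ci_high < 0            # CI excludes 0
    outside = ci_low > zone or ci_high < -zone  # CI wholly outside the zone
    inside = ci_low >= -zone and ci_high <= zone  # CI wholly inside the zone
    if real and outside:
        return "real and important"
    if real and inside:
        return "real but not important"
    if real:
        return "real, importance undetermined"
    if inside:
        return "not shown to be real, and not important"
    return "may be important, but not shown to be real"


# Hypothetical measurements in mm, with a +/-5 mm indifference zone.
low, high = diff_ci(102.0, 20.0, 8_000, 100.0, 20.0, 300_000)
print(f"95% CI: ({low:.2f}, {high:.2f}) mm ->", classify(low, high, zone=5.0))
```

With samples this large the CI is narrow, so the outcome is usually one of the first two: the data can say not only whether the difference is real but also whether it clears the threshold you chose in advance.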