Does n Affect Effect Size?

trinker

ggplot2orBust
#1
We all know that p-values are affected by sample size. I've had profs claim that effect size is unaffected by sample size. But I'm thinking this may not be true. Let's take the standardized mean difference with the pooled sd in the denominator:

d = \frac{\bar{x}_1 - \bar{x}_2}{s_{pooled}}

where...

s_{pooled} = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}

and where...

s_i^2 = \frac{\sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2}{n_i - 1}

and also...

\bar{x}_i = \frac{\sum_{j=1}^{n_i} x_{ij}}{n_i}

I'm no mathematician, but there's a lot of n in those equations. Maybe they cancel each other out somehow, but it seems more logical to me that an effect size is indeed affected by n.
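Here's a quick R sketch of those formulas, just to make the n's explicit (the toy data and variable names are mine):

Code:
# Cohen's d with the pooled sd in the denominator (toy data)
set.seed(42)
x1 <- rnorm(30, mean = 0.25, sd = 1)   # group 1
x2 <- rnorm(40, mean = 0.00, sd = 1)   # group 2

n1 <- length(x1); n2 <- length(x2)
s1_sq <- sum((x1 - mean(x1))^2) / (n1 - 1)   # group 1 sample variance
s2_sq <- sum((x2 - mean(x2))^2) / (n2 - 1)   # group 2 sample variance
s_pooled <- sqrt(((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2))
d <- (mean(x1) - mean(x2)) / s_pooled        # n1 and n2 show up all over the place
d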

Thoughts?
 

Dason

Ambassador to the humans
#2
Note that a p-value is a random quantity. An effect size is not a random quantity. The sample estimate of the effect size is a random quantity but we can still make a distinction between the sample effect size and a p-value.

The expected value of a p-value (under any particular set of parameters that aren't in the null hypothesis) decreases as the sample size increases (for any good statistical test).

The expected value of the effect size isn't a function of n.

It's true that the variability of the sample effect size will decrease as n gets large, so in that sense you might consider the distribution of the sample effect size to be a function of n (and all the other junk that goes into it), but typically that's not exactly what we care about.
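Here's a rough simulation of what I mean (my own sketch, not a proof): the average sample d sits near the population value at every n, while its spread keeps shrinking.

Code:
# Sketch: the population effect is 0.25 by construction; watch the sample d as n grows
set.seed(1)
cohen_d <- function(x1, x2) {
  n1 <- length(x1); n2 <- length(x2)
  sp <- sqrt(((n1 - 1) * var(x1) + (n2 - 1) * var(x2)) / (n1 + n2 - 2))
  (mean(x1) - mean(x2)) / sp
}
sim_d <- function(n, reps = 2000) {
  replicate(reps, cohen_d(rnorm(n, 0.25, 1), rnorm(n, 0, 1)))
}
for (n in c(20L, 100L, 500L)) {
  d_hat <- sim_d(n)
  # mean(d_hat) stays around .25 (d has a small upward bias at small n, but it
  # doesn't grow with n); sd(d_hat) keeps dropping
  cat(sprintf("n per group = %3d   mean(d_hat) = %.3f   sd(d_hat) = %.3f\n",
              n, mean(d_hat), sd(d_hat)))
}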
 

trinker

ggplot2orBust
#3
This goes back to a question I asked about determining sample size by the researcher providing alpha = .05 and power = .8, and then, rather than getting the effect size from a pilot study or a lit review, the researcher picks some value as the minimal effect size they will accept; maybe they use d = .3. This seems absurd to me because if effect size is not affected by sample size, then no matter how large the sample, the effect will not change. So let's say that the effect size is .25 in the population, not .3 as the researcher had hoped. Regardless of how many people are sampled, the effect will still be .25. It seems crazy for the researcher to pull an effect size out of their posterior quarters for the sample size calculation.
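And for what it's worth, the assumed d does a lot of work in that calculation; a quick comparison (my numbers):

Code:
library(pwr)
# required n per group at alpha = .05, power = .80, two-sided two-sample t-test
pwr.t.test(d = .30, sig.level = .05, power = .80,
           type = "two.sample", alternative = "two.sided")  # roughly 176 per group
pwr.t.test(d = .25, sig.level = .05, power = .80,
           type = "two.sample", alternative = "two.sided")  # roughly 253 per group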
 

Dason

Ambassador to the humans
#4
It probably isn't just pulled out of nowhere though. I think it comes down to practical versus statistical significance.

If we're looking to see if a treatment has an effect, then all we might care about is statistical significance. But this typically isn't done in a vacuum, void of all other influences. It might be that we really only care about treatments that provide a practically significant increase in some value. Why might this be the case? Because sometimes it's very difficult to change protocol and to change the way things are done. It can be quite costly, so we need to be able to find a treatment that isn't just statistically significant - we need to find one that is worth the change to the new treatment.
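A toy example of the distinction (numbers are mine): with a big enough n, even a difference that's too small to justify changing anything comes out statistically significant.

Code:
set.seed(7)
# true difference is 0.05 sd -- probably too small to matter in practice
x1 <- rnorm(50000, mean = 0.05, sd = 1)
x2 <- rnorm(50000, mean = 0.00, sd = 1)
t.test(x1, x2)$p.value      # tiny p-value: statistically significant
mean(x1) - mean(x2)         # but the estimated difference is still only ~0.05 sd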
 

trinker

ggplot2orBust
#5
Let's posit this, then: the true effect in the population is .25, and the researcher sets their effect at .4, alpha at .05, and power at 1.00.

Code:
> library(pwr)
> pwr.t.test(d=.4, sig.level=.05, power=1.00, type="two.sample", alternative="two.sided")

     Two-sample t test power calculation 

              n = 10000000
              d = 0.4
      sig.level = 0.05
          power = 1
    alternative = two.sided

 NOTE: n is number in *each* group
And so I'm a good boy and get n = 10000000, and I'm excited because I'm sure I'll find an effect size of .4, but I do the study and alas the effect is only .25 :( It just feels to me like the researcher is trying to set something that's unsettable.
 

Dason

Ambassador to the humans
#6
It's not that they're saying "this is the effect size in the population" in those cases. It might be "this is the smallest effect size I actually care about so if we don't get an effect size of this size then ... oh well - it didn't really matter to us anyways".

Also you can't get a power of 1 with a t-test. That must be a rounding issue or something.
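You can see it by computing power for a fixed d as n grows (my own sketch): power creeps toward 1 but never quite reaches it, which is why asking for power = 1 can't really be solved.

Code:
library(pwr)
# power for d = .4 at alpha = .05 as n per group grows: approaches 1, never equals it
sapply(c(100, 200, 400, 800), function(n)
  pwr.t.test(n = n, d = .4, sig.level = .05,
             type = "two.sample", alternative = "two.sided")$power)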
 

trinker

ggplot2orBust
#7
Good call:

Code:
> pwr.t.test(d=.4, sig.level=.05, power=.999999, type="two.sample", alternative="two.sided")

     Two-sample t test power calculation 

              n = 564.3328
              d = 0.4
      sig.level = 0.05
          power = 0.999999
    alternative = two.sided

 NOTE: n is number in *each* group