Reconcile differences in sample size calculation for two proportions?


I'm reading John Lachin's Biostatistical Methods, 2nd edition. It gives a formula for the sample size of a two-proportion Z-test, which I coded in R below. The etas are the expected sample fractions in each group, pi1 and pi2 are the proportions of interest, and Z_alpha and Z_beta are the standard normal quantiles for alpha and beta respectively:
sample_size <- function(eta1,eta2,pi1,pi2,Z_alpha,Z_beta){




This is different compared to:
power.prop.test(n = NULL, p1 = .4, p2 = .28, sig.level = 0.05,power = .9,alternative = c("two.sided"))
This article explains that power.prop.test uses a normal approximation to the binomial distribution. When should one method be used over the other? I'm using prop.test, which computes a chi-square statistic. I know that if we square a standard normal we get a chi-squared with one degree of freedom. But I don't understand why I would use one method over the other. Additionally, why does prop.test give the option of a one-sided or two-sided alternative? Edit: alternative = "greater" or "less" is only used when testing a single proportion against a null value or when comparing two proportions; it is ignored with more than two groups. Makes sense.
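To make the "square a standard normal" point concrete, here is a minimal sketch comparing a hand-computed pooled two-sample Z statistic with prop.test. The counts 130/326 and 91/326 are hypothetical values chosen to give roughly 40% and 28%, and the continuity correction is switched off so the two statistics are directly comparable.

```r
# Hypothetical counts: 130/326 and 91/326 (roughly 40% and 28%)
x <- c(130, 91)
n <- c(326, 326)

p_hat  <- x / n
p_pool <- sum(x) / sum(n)                        # pooled proportion under H0
se     <- sqrt(p_pool * (1 - p_pool) * sum(1 / n))
z      <- (p_hat[1] - p_hat[2]) / se             # pooled two-sample Z statistic

# correct = FALSE turns off the Yates continuity correction
chisq <- prop.test(x, n, correct = FALSE)

z^2                       # Z statistic squared
unname(chisq$statistic)   # Pearson chi-square statistic from prop.test
2 * pnorm(-abs(z))        # two-sided p-value from the Z test
chisq$p.value             # p-value from the chi-square test
```

For two groups and a two-sided alternative, the two tests should agree, so the choice between them is mostly one of presentation. With the default correct = TRUE, prop.test applies a continuity correction and the statistics no longer match exactly.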


Ambassador to the humans
You could code up a simulation with the given sample sizes to figure out the power for each result you're getting
I see. I'm noticing what looks like a fundamental difference in the calculations, and I think each has its own justification. The one I coded seems to be a large-sample test with the Z statistic. On the other hand, power.prop.test appears to assume we are comparing frequencies in a contingency table, in other words a chi-square test of independence between rows and columns. I'm wondering what the benefits and drawbacks of each are. The power in both examples is 90%. Essentially, if I use a Z test the sample size is 652, whereas if I use a chi-square test the sample size is 326. The chi-square sample size calculation uses a normal approximation to the binomial.
# N=326
power.prop.test(n = NULL, p1 = .28, p2 = .4,
                sig.level = .05, power = .90, strict = TRUE,
                alternative = "two.sided")

# Counts must be successes, not group sizes: 130/326 is about 40%, 91/326 is about 28%
prop.test(x = c(130, 91), n = c(326, 326), conf.level = .95, alternative = "two.sided")

# N=652


Ambassador to the humans
I don't know. I don't typically like using these kinds of sample size calculators. Instead I just simulate and use that to estimate power over a range of sample sizes. That way I know the sample size I get will be accurate for the actual analysis I'll be doing.
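A minimal sketch of that simulation approach, assuming the planned analysis is prop.test without continuity correction on two equal-sized groups. The function name sim_power, the number of simulations, the seed, and the grid of sample sizes are all illustrative choices, not something taken from the thread.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# Estimate power by simulating binomial counts and running the planned test.
# sim_power is a hypothetical helper name.
sim_power <- function(n_per_group, p1, p2, alpha = 0.05, n_sims = 2000) {
  x1 <- rbinom(n_sims, n_per_group, p1)
  x2 <- rbinom(n_sims, n_per_group, p2)
  p_values <- mapply(function(a, b) {
    prop.test(c(a, b), c(n_per_group, n_per_group), correct = FALSE)$p.value
  }, x1, x2)
  mean(p_values < alpha)   # proportion of simulated trials that reject H0
}

# Estimated power over a small range of per-group sample sizes
sizes     <- c(200, 326, 450)
power_est <- sapply(sizes, sim_power, p1 = 0.4, p2 = 0.28)
data.frame(n_per_group = sizes, power = power_est)
```

If the real analysis uses something else (a continuity correction, unequal groups, a regression model), swapping that into the inner function keeps the power estimate tied to the analysis actually planned.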
That's fair. I suppose I can do that and check whether the power I get back lines up with the values I set. I guess my question boils down to why one would run a chi-square test as opposed to a Z test. I'll do more reading.


Active Member
I checked your example against SAS PROC POWER. It gives exactly 652, the same as your result (apparently Example 3.2 from your book).

proc power;
  twosamplefreq test=pchi
    groupproportions = (0.4 0.28)
    power = 0.9
    ntotal = .;   /* solve for the total sample size */
run;
This uses a 'normal approximation method', according to SAS.

Power analysis is not an exact science. No one agrees to within ±30%, so don't obsess over a few ordinary lives.
Oh, I didn't notice that. I'll take a closer look. Edit: Yes! The formula I coded gives the total N, whereas power.prop.test gives n per group. I thought I was losing my mind. Thanks.
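The total-versus-per-group relationship can be checked numerically. The sketch below assumes the usual pooled-variance normal-approximation formula for the total N with allocation fractions eta1 and eta2; it may not match the function coded in the original post line for line, and total_n is a hypothetical name.

```r
# Total sample size from the normal-approximation formula (assumed form).
# eta1 and eta2 are the allocation fractions and should sum to 1.
total_n <- function(eta1, eta2, pi1, pi2, alpha = 0.05, power = 0.90) {
  z_alpha <- qnorm(1 - alpha / 2)          # two-sided critical value
  z_beta  <- qnorm(power)
  pi_bar  <- eta1 * pi1 + eta2 * pi2       # pooled proportion under H0
  sd0 <- sqrt(pi_bar * (1 - pi_bar) * (1 / eta1 + 1 / eta2))
  sd1 <- sqrt(pi1 * (1 - pi1) / eta1 + pi2 * (1 - pi2) / eta2)
  ((z_alpha * sd0 + z_beta * sd1) / (pi1 - pi2))^2
}

n_total     <- total_n(0.5, 0.5, 0.4, 0.28)
n_per_group <- power.prop.test(p1 = 0.4, p2 = 0.28,
                               sig.level = 0.05, power = 0.90)$n

n_total                   # total N from the formula
2 * n_per_group           # power.prop.test reports n per group
2 * ceiling(n_per_group)  # round each group up, then double
```

With equal allocation the two calculations should agree up to the numerical tolerance of the root finder inside power.prop.test. Rounding the per-group n up to 326 and doubling gives the 652 reported in the thread.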