Take a sample or a hypothesized effect size and simulate data at varying proportion sizes until you start getting type II errors, that is, until the test fails to detect a difference that really is there. Then you may want to buffer the proportion up a little for caution's sake, in case your hypothesized effect was off.

I am not savvy in the automation department, but you could run a loop or macro of sorts to output the p-values for descending proportion sizes and plot them to see the point where your proportion size gets questionable. Once you have it working for your binary example, just switch from Bernoulli base data to random normal, modified normal, etc.
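For what it's worth, a loop like that might look roughly like the following in Python. Treat it as a sketch, not a recipe. Everything specific in it is an assumption made up for illustration: the pool of 5000 units per group, the 10% versus 13% success rates standing in for the hypothesized effect, the list of proportions, the 200 repetitions, and the choice of a pooled two-proportion z-test. The function names are hypothetical too.

```python
import math
import random
import statistics

def two_proportion_p_value(successes_a, successes_b, n):
    """Two-sided p-value of a pooled two-proportion z-test, n units per group."""
    pooled = (successes_a + successes_b) / (2 * n)
    se = math.sqrt(2 * pooled * (1 - pooled) / n)
    if se == 0:
        return 1.0  # no variation at all, so no evidence of a difference
    z = (successes_a / n - successes_b / n) / se
    # 2 * (1 - Phi(|z|)) written with the complementary error function
    return math.erfc(abs(z) / math.sqrt(2))

def simulate(proportion, rng, pool_size=5000, rate_a=0.10, rate_b=0.13,
             reps=200, alpha=0.05):
    """Simulate Bernoulli data at one sampling proportion (all defaults are assumed)."""
    n = max(2, int(pool_size * proportion))
    p_values = []
    for _ in range(reps):
        # Bernoulli draws; the two rates differ, so the effect is real by construction
        a = sum(rng.random() < rate_a for _ in range(n))
        b = sum(rng.random() < rate_b for _ in range(n))
        p_values.append(two_proportion_p_value(a, b, n))
    # Share of runs that missed the real difference (type II errors)
    miss_rate = sum(p >= alpha for p in p_values) / reps
    return n, statistics.median(p_values), miss_rate

rng = random.Random(0)  # fixed seed so the run is repeatable
proportions = [1.0, 0.5, 0.25, 0.1, 0.05, 0.02]  # descending, as suggested above
results = [simulate(p, rng) for p in proportions]

for p, (n, median_p, miss_rate) in zip(proportions, results):
    print(f"proportion={p:<5} n={n:<5} median p={median_p:.4f} miss rate={miss_rate:.2f}")
```

Reading down the printed table, the proportion where the miss rate starts climbing past what you can tolerate is the point where things get questionable, and you would pick something a bit above it. To move beyond the binary case, the Bernoulli draws could be swapped for something like `rng.gauss(mu, sigma)` and the z-test for a test suited to continuous data.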