Power calculation

skaur

New Member
#1
Hi Guys, really need your help please.

My supervisor has advised that I perform a power calculation. I performed a re-audit: in 2008/09 there were 55 subjects, so in 2014/15 I chose to look once again at 55 subjects. I compared demographics/comorbidities/medications etc., where answers were either yes/no, made comparisons using Pearson's chi-squared test, and said results were significant if p < 0.05.

However, how do I perform this "power" calculation? And what is its importance? Not really quite sure. Very new to statistics.

Many thanks in advance.:)
 

EdGr

New Member
#2
Your supervisor is basically asking if you had enough subjects for the comparison, to be sure you picked up any important effects. Often this request comes after the results are NOT significant (perhaps because of poor planning) in which case the researcher is left trying to argue that the number used was sufficient, so the failure to find an effect was not a fluke.

You say that many of your comparisons were significant at p < 0.05. Power is a bit less directly relevant then, but many statisticians (including me) would say it is good practice to know what effect size you had enough power for. I find myself starting to write more than would be appropriate for one of these short answers, since even basic power is a whole session in my class. The idea is to ask how likely you would be to get a significant result if some particular condition were true (say that the true rate of yes responses had declined from 60% to 40% between surveys). For that comparison you have about 56% power, so you would find significance 56% of the time if that was the real change. There are various online power calculators, but I suggest reading up on it first.
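Ed's 56% figure can be roughly reproduced with a short normal-approximation sketch using Cohen's arcsine effect size h for two proportions. This is only an approximation (not necessarily the exact method Ed used), the function names are illustrative, and z = 1.96 assumes a two-sided alpha of 0.05:

```python
import math

def norm_cdf(z):
    # Standard normal CDF, via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def power_two_props(p1, p2, n_per_group, z_crit=1.96):
    # Cohen's arcsine effect size h for two proportions.
    h = abs(2 * math.asin(math.sqrt(p1)) - 2 * math.asin(math.sqrt(p2)))
    # Approximate power of a two-sided test with equal group sizes.
    return norm_cdf(h * math.sqrt(n_per_group / 2) - z_crit)

# True rate falling from 60% to 40%, with 55 subjects per survey:
print(round(power_two_props(0.60, 0.40, 55), 2))  # ~0.56
```

So if the true yes-rate really dropped from 60% to 40%, a study this size would only find significance a bit more than half the time.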
 

skaur

New Member
#3
Thanks for your reply.
The comparison between the observed events was not significant.
In the audit there were 15/55 events.
In the re-audit there were 22/55 events.
χ² = 1.99, df = 1, p = 0.15
Can a power analysis be done?
 

EdGr

New Member
#4
Yes, but it would not be based on the observed results. You would want to specify what size of difference you would want to be sure of detecting. You had 27% events in the first audit, but that's just an observed result and may not reflect the true percentage of events at that time. Nor does the 40% the second time.

So suppose you were to say to me, independently of the results, that an absolute difference of 10% between the two audits would be enough for concern. Not 10% in the sample, but a real underlying difference in the process. Fifty-five subjects per group is far too few. For percentages in the range of what you are doing, you have less than 20% power (which means you would fail to find a significant effect more than 80% of the time!).

If you said that changes of 10% wouldn't concern you, but a real difference of 20% would, your numbers are still too small. You should probably have closer to 100 in each audit for that kind of difference. Your power is only about 56%.

Unfortunately, you can't increase the number in the earlier audit. Still, for a difference of 20%, getting about 400 subjects in the current audit would give 80% power. For a smaller effect (like 10%) you can't possibly have reasonable power with 55 in one group, regardless of the n in the other group.
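A sketch of that 55-versus-400 figure, using the idea (explained further down the thread) that with unequal groups, power is driven mostly by the harmonic mean of the two sample sizes. This is a normal-approximation estimate with illustrative function names, not necessarily the exact calculation Ed used:

```python
import math

def norm_cdf(z):
    # Standard normal CDF, via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def power_unequal_n(p1, p2, n1, n2, z_crit=1.96):
    # With unequal groups, power is driven by the harmonic mean of n1 and n2.
    n_h = 2 * n1 * n2 / (n1 + n2)
    # Cohen's arcsine effect size h for two proportions.
    h = abs(2 * math.asin(math.sqrt(p1)) - 2 * math.asin(math.sqrt(p2)))
    return norm_cdf(h * math.sqrt(n_h / 2) - z_crit)

# 27% vs 47% (a 20-point real difference), 55 subjects vs 400 subjects:
print(round(power_unequal_n(0.27, 0.47, 55, 400), 2))  # ~0.83
```

This approximation lands right around the 80% quoted above; different programs will differ by a few points depending on the exact method.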
 
#5
Many thanks for getting back to me. Just wondering if I could ask you a quick question please? The last paragraph mentions that to detect a difference of 20% I should have used about 400 subjects in the current audit to give 80% power. Just wondering how you calculated this? I need to reflect on this in my project to inform future studies.

When I used the following website http://homepage.stat.uiowa.edu/~rlenth/Power/ and performed a generic chi-square test, it stated I would need 434 subjects to obtain a power of 80%. Is this correct? (Where chi-square is 1.99, n = 110, df = 1, as per my earlier post.)

Thanks again
 

EdGr

New Member
#6
You're welcome.

I don't understand what you did in that online power program, and my version of Java blocks it as not meeting high safety standards, so I can't try it! How can you know chi-square before running the study? You don't care what the sample size would be to make an observed chi-square significant. You need a site that lets you enter the percentages of interest and the sample sizes, then outputs power. I think it will be hard to find one that tells you sample sizes for unequal groups. Be sure you have chi-square that compares groups, not goodness of fit. If all of this sounds like meaningless jargon to you, I think you need an actual consultant, not a web page (and not just TalkStat help).

First of all, recognize that when sample sizes are unequal, power is mostly related to the harmonic mean of the sample sizes.

The harmonic mean of x and y is found by averaging (in the normal way) 1/x and 1/y, then inverting the result.

So for 55 and 100 subjects, the harmonic mean is 71, almost in the middle of them.
But for 55 and 400 subjects, the harmonic mean is 97, only a bit higher.
For 55 and 10,000 subjects, the harmonic mean is 109. You can see that diminishing returns sets in. At some point the calculation says, "Hey you only have 55 in one group. That's the limiting factor."
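The diminishing-returns pattern is easy to check with a few lines (the function name here is illustrative):

```python
def harmonic_mean2(x, y):
    # Average the reciprocals of x and y, then invert.
    # For exactly two numbers this simplifies to 2*x*y / (x + y).
    return 2 * x * y / (x + y)

for n2 in (100, 400, 10000):
    print(n2, round(harmonic_mean2(55, n2)))  # 71, 97, 109
```

No matter how large the second group gets, the harmonic mean can never exceed twice the smaller group (2 × 55 = 110 here), which is exactly the "55 is the limiting factor" point.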

You may want to download something like G*Power and play with that. Be sure you understand what it is doing and that it makes sense.

I could show you the formula, but it's not simple.

Ed
 
#7
Hi There
Many thanks for getting back to me. I agree I think I will need to go back to my tutor and ask for advice.
That makes sense about the use of the harmonic mean. That means, as I was unable to change the sample size from the earlier audit, in the re-audit I should have collected data for 100 subjects to obtain a harmonic mean of 71? And would this be equivalent to a power of 0.71?

I continued the harmonic means:
2x55x100/155=70.9
2x55x110/165=73.3
2x55x120/175=75.4
2x55x130/185=77.3
2x55x135/190=78.1
2x55x140/195=78.9

Just wanted to ask one last question can the power be calculated for the study I did which was:

In the audit there were 15/55 events = 27%.
In the re-audit there were 22/55 events = 40%.
χ² = 1.99, df = 1, p = 0.15

In post #4, you mentioned that 55 subjects was far too few, and that for percentages in the range of what I was doing there was less than 20% power. Just wondering how you obtained this value?

I will download G*Power and see if it lets me enter the percentages of interest and the sample sizes, then outputs power.

Many thanks
 

EdGr

New Member
#8
You're welcome. The harmonic mean is the BASIS for power calculation. It is not the power itself. If you had a program that did power calculations for equal sample sizes, you could get a pretty good idea what power you would have using the harmonic mean as though it was your sample size in both groups. But you still need a program to do the power analysis!

I never saw your formula for harmonic mean before, but I looked it up and it is right (for exactly 2 numbers). Interesting!

Remember that power should be for an effect OF INTEREST. The observed effect (27% versus 40%) and the observed chi-square value have no intrinsic meaning. They are an observed result that is randomly different from the truth.

As I said before, I would pick a difference of interest, like 20% (say 27% versus 47%) and see how much power you had for that with 55 subjects per group.
 
#9
Of course it is good to do a power calculation. But it should be done before the experiment.

In the paper "The Abuse of Power" by Hoenig and Heisey, they say that it is essentially meaningless to do a power calculation afterwards based on the estimated data. If the p-value is low (i.e. "significant") then the power will always be high, and if the p-value is high the power will be low. The power calculation will not bring any new information.
 
#10
GretaGarbo makes a valid point, but that is exactly why I said that the power analysis must be based on an effect of interest, not the observed effect. I think that a properly done post hoc power analysis (or one done in advance but with predetermined sample sizes) can be useful. I often have my students report things like, "Given the available subjects (60 males and 40 females) we would have 80% power for a mean difference of ___." If I am reviewing an article with nonsignificant results, I ask the researchers for the same thing. Unfortunately, what they often try to do is power based on the observed difference. That's useless.
 
#11
Of course it is good to do a power calculation. But it should be done before the experiment.

In the paper "The Abuse of Power" by Hoenig and Heisey, they say that it is essentially meaningless to do a power calculation afterwards based on the estimated data. If the p-value is low (i.e. "significant") then the power will always be high, and if the p-value is high the power will be low. The power calculation will not bring any new information.
Power calculation before or after an event is the same. :eek:
 
#14
Well, since you have chosen to do the exact kind of post hoc power analysis that both GretaGarbo and I called useless, I guess you're on your own.
 
#15
Thanks for the reply. I agree with both of you that it does not serve a great purpose putting my post hoc calculation in. But a difference of interest of 26% (say 27% versus 53%) produces a power of 0.83 with 55 subjects per group. I think I should put this calculation in, as it would be useful for future studies.
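For anyone checking that figure: a simple normal-approximation sketch (arcsine effect size, illustrative function names, two-sided alpha of 0.05) lands in the same ballpark, at about 0.81; exact methods such as G*Power's can differ by a couple of points.

```python
import math

def norm_cdf(z):
    # Standard normal CDF, via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def power_two_props(p1, p2, n_per_group, z_crit=1.96):
    # Cohen's arcsine effect size h; two-sided test, equal group sizes.
    h = abs(2 * math.asin(math.sqrt(p1)) - 2 * math.asin(math.sqrt(p2)))
    return norm_cdf(h * math.sqrt(n_per_group / 2) - z_crit)

# 27% vs 53% with 55 subjects per group:
print(round(power_two_props(0.27, 0.53, 55), 2))  # ~0.81
```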
 
#16
I agree that the 27% versus 53% is much more useful. Just be sure you understand what it is telling you -- with 55 subjects per group, you can only reliably detect this rather large difference. As you saw, even for a difference as big as 27% versus 40%, you have too little power. Hopefully you can use these results to begin making some serious plans for future studies. Right NOW, figure out what effect size (say, what difference from 27% or 40%) you would like to be able to detect reliably. Get an appropriate sample size -- it might be 100, or 150. Then store that plan for the next time you want to do this analysis, say 2 years from now, and your pre-planning will give you enough subjects to do a useful test at that time.
 