# Thread: Choosing a sample size for a pilot study

1. ## Choosing a sample size for a pilot study

I am trying to figure out how to compute a sample size. Since this is a pilot-scale study, the results will likely not be definitive, but I want to ensure that I have a big enough sample to inform the parameters that I need for a larger follow-up study.

All patients suffer from a chronic degenerative disease that never improves. The level of functional impairment is readily determined by a simple and accurate test, so it is easily quantifiable. The experimental treatment we are testing has the potential to be the first to actually reverse the disease process and improve function, so there is no comparable treated population discussed in the literature that can help me estimate the standard deviation.

If the hypothesis is that the experimental treatment improves function in this patient population, what is the best test to use? My gut tells me to enroll 40 treatment-group patients so that I will have at least 30 evaluable subjects and use a one-sample Student's t-test, since I can't know the SD.

Any improvement on my thinking?

2. ## Re: Choosing a sample size for a pilot study

It depends primarily on the power you require for the study, although certain methods, such as regression, generally require a large sample.

I think you should use ANOVA and random assignment, but I am no expert in this area.

3. ## Re: Choosing a sample size for a pilot study

Adding to noetsi's comments: Gpower is a free program you can use. You give it the test, the desired power (usually set to .8), the effect size, and alpha (usually .05), and it returns the required sample size.
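To give a feel for how those inputs interact, here is a minimal sketch of the textbook normal-approximation formula for a one-sample (or paired) test, n = ((z_alpha + z_power) / d)^2. This is not Gpower's own algorithm. The function name `approx_sample_size` is made up for illustration, and the effect sizes in the usage note are assumptions, not values from this study. An exact calculation based on the t distribution will give slightly larger numbers.

```python
import math
from statistics import NormalDist


def approx_sample_size(effect_size, alpha=0.05, power=0.80, two_sided=True):
    """Approximate n for a one-sample test of a mean (normal approximation).

    effect_size is the assumed mean change divided by its SD (Cohen's d).
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z = NormalDist()  # standard normal distribution
    # Critical value for the chosen significance level.
    z_alpha = z.inv_cdf(1 - alpha / 2) if two_sided else z.inv_cdf(1 - alpha)
    # Quantile corresponding to the desired power.
    z_power = z.inv_cdf(power)
    return math.ceil(((z_alpha + z_power) / effect_size) ** 2)
```

With these assumed inputs, `approx_sample_size(0.5)` gives 32 and `approx_sample_size(0.2)` gives 197, which shows how quickly the required N grows as the assumed effect size shrinks.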

4. ## Re: Choosing a sample size for a pilot study

Yeah, Gpower is a great, free program. My statistics program swore by it (the chair actually tested its results against her own calculations and concluded they were accurate).

Of course you can also calculate power manually using non-central distributions (non-central t, chi-square, or F, depending on the test), but once you do that a single time you will always use Gpower in the future.

5. ## Re: Choosing a sample size for a pilot study

Originally Posted by noetsi
Of course you can also calculate power manually using non-central distributions (non-central t, chi-square, or F, depending on the test), but once you do that a single time you will always use Gpower in the future.
Speak for yourself

Gpower is great but I typically either do the calculations myself or via simulation.
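For anyone curious what the simulation route can look like, here is a minimal sketch for a one-sided, one-sample t-test, using only the Python standard library. It is not necessarily how anyone in this thread does it. The function names, the normally distributed outcome with SD 1, and the default settings are all assumptions for illustration. To avoid needing t tables, the critical value is itself estimated by simulating under the null hypothesis.

```python
import math
import random
import statistics


def t_statistic(sample):
    """One-sample t statistic for H0: mean change = 0."""
    n = len(sample)
    return statistics.fmean(sample) / (statistics.stdev(sample) / math.sqrt(n))


def simulate_power(n, effect_size, alpha=0.05, n_sims=4000, seed=1):
    """Estimate one-sided power of a one-sample t-test by simulation.

    effect_size is the assumed mean change in SD units (Cohen's d).
    """
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible

    def draw(mean):
        # One simulated study: n patients, outcome ~ Normal(mean, 1).
        return [rng.gauss(mean, 1.0) for _ in range(n)]

    # Empirical critical value: the (1 - alpha) quantile of t under H0.
    null_stats = sorted(t_statistic(draw(0.0)) for _ in range(n_sims))
    critical = null_stats[min(int((1 - alpha) * n_sims), n_sims - 1)]

    # Power: share of simulated studies under H1 whose t exceeds that value.
    hits = sum(t_statistic(draw(effect_size)) > critical for _ in range(n_sims))
    return hits / n_sims
```

Calling `simulate_power(30, 0.5)` estimates the power with 30 evaluable patients and an assumed effect of half a standard deviation. The appeal of simulation is that the `draw` step can be swapped for whatever messier data-generating process you believe in (skewed outcomes, dropouts, subgroups) without needing a closed-form formula.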

6. ## Re: Choosing a sample size for a pilot study

Gpower and OpenEpi can help you determine the sample size for a given power, but if you're doing repeated measurements, use a recursive Bayesian approach for the pilot study alongside ANCOVA if your N > 10 in both groups.

There's no "right" answer on the test, and because this is a pilot study, this is the BEST time to just get the data and run multiple types of tests for power (recursive Bayes and ANCOVA are the ones that come to mind, but t-tests should be fine if your hypothesis is reduction of morbidity or even reversal). The problem with ANCOVA is that follow-up may vary widely between patients depending on host factors. Minimizing that requires a very stringent treatment protocol, which is difficult AND expensive and easy for grant providers to reject, so it'll be a hard sell.

One last thing...I don't know what disease this is, but are you confident that you'll be able to identify >40 cases of the disease in question and perform the trial? Even in a full-blown trial with no pilot study, getting 40 patients to end point is resource-intensive. Good luck though!

To note: Pilot studies are never meant to derive an outcome of significance, or even to replicate it. They're designed to determine the strength of analysis by the case variability. This is not an inexpensive "try it before you buy it" trial; rather, it's a quality assurance stamp to make sure that a full expansion doesn't result in a lemon (to use industrial phrases).

7. ## Re: Choosing a sample size for a pilot study

Originally Posted by hodag

If the hypothesis is that the experimental treatment improves function in this patient population, what is the best test to use?

Any improvement on my thinking?
Just a quick word of caution: the hypothesis being tested is never that the treatment is an improvement. One always starts with the more conservative idea that the treatment makes no difference (the null hypothesis) and then shows that this conservative idea can reasonably be rejected.

regards
rogojel

8. ## Re: Choosing a sample size for a pilot study

Dason,

How do you perform the simulation? In R?

Do you by any chance have ready example code for a simple case (let's say a t-test or something like that)?

I think the biggest problem in a pilot study is that you have no information about the effect size or the performance of the two or more treatments. In a bigger study, you use the results of the pilot to estimate all the parameters in the sample size formulas, but in the pilot you don't always have them. What do you do then? You use data from pre-clinical studies (animals, unfortunately), but how do you set a sample size for these pre-clinical trials?

9. ## Re: Choosing a sample size for a pilot study

NN_STAT maybe that question deserves its own thread???

10. ## Re: Choosing a sample size for a pilot study

Thanks for all the replies. The reason I didn't go with the usual alpha = 0.05 and power = 0.8 is that we don't really know the effect size in the population and, even if we did, I have seen few real-world calculations that did not require a trial of at least several hundred patients. That is too large, and too expensive, for a pilot that is a first-in-human study, and animal models provide virtually no guidance in this application due to species-to-species differences that are highly important.

I will take a look at ANOVA and ANCOVA as alternatives. Part of the exercise is to figure out how large N should be, but my gut tells me that if we randomize 40 to treatment and 20 to a comparator group (who may cross over after the six month interim review) that will be enough to provide meaningful results. The other twist is that there are two underlying disease states that cause this condition and I want to have enough observations to do a sub-group analysis to answer the question of whether there is a significant difference in response depending on the underlying etiology (I would be shocked if there wasn't).

Originally Posted by Lowpro
One last thing...I don't know what disease this is, but are you confident that you'll be able to identify >40 cases of the disease in question and perform the trial? Even in a full-blown trial with no pilot study, getting 40 patients to end point is resource-intensive. Good luck though!

To note: Pilot studies are never meant to derive an outcome of significance, or even to replicate it. They're designed to determine the strength of analysis by the case variability. This is not an inexpensive "try it before you buy it" trial, rather it's a quality assurance stamp to make sure that a full expansion doesn't result in a lemon (to use industrial phrases)
I agree completely. No trial is cheap and this one won't be either. Because this is totally new ground that we are covering, it remains to be seen what is really going on with the treatment, and that equates to a lot of proteomic analysis just to determine which biomarkers are relevant for the future. The goal is to determine if there is enough of a hint of improved outcomes to warrant a larger study.

As for patient enrollment, I have been there, done that, and have the t-shirt from other trials that were painfully slow to enroll (i.e. a new therapy with only 3,000 US patients per year available for testing, with competitors creaming off some of those). Fortunately (or unfortunately if you are a patient) in this case the relevant population eligible for enrollment is at least a million and we plan enrollment at three large research centers, with more centers available if we need them.

11. ## Re: Choosing a sample size for a pilot study

Don't forget you will inevitably lose people during the trial, and what matters is how many you end up with, not the starting number.
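The arithmetic behind that is simple enough to sketch. This is only an illustration: the function name `patients_to_enroll` is made up, and the dropout rates are assumptions. A 25% loss happens to reproduce the "enroll 40 to get 30 evaluable" figure from the first post.

```python
import math


def patients_to_enroll(n_evaluable, dropout_rate):
    """Enrollment needed so that n_evaluable patients are expected to finish."""
    if not 0 <= dropout_rate < 1:
        raise ValueError("dropout_rate must be in [0, 1)")
    # Round before taking the ceiling so floating-point noise
    # (e.g. 30.000000000000004) does not add a spurious extra patient.
    return math.ceil(round(n_evaluable / (1 - dropout_rate), 9))
```

For example, `patients_to_enroll(30, 0.25)` gives 40, while an assumed 10% loss, `patients_to_enroll(30, 0.10)`, gives 34.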

13. ## Re: Choosing a sample size for a pilot study

The good news is that even with small N's (>= 10) and repeated measurements, you should be able to get enough statistical power thanks to mixed-model approaches. It works well for the internal assumptions. BUT again, because this is a pilot study, the results shouldn't be expected to replicate in a full trial, though the power itself should be fine if not better. It may also give you some insight into patient attrition and censoring (I'm not sure of the mortality of the disease in question).

A personal recommendation: after your pilot study (or even during it), get some feedback from metabolomic experts alongside your proteomic research. Inflammation response and allostatic load measurements always seem to be poorly studied alongside proteomic studies, which is why most biomarker research becomes less fruitful. I do a good bit of bioinformatics research these days and have seen the dearth of communication and research between proteomic/genomic and metabolomic research.

But again, good luck. Independent t-tests and MANCOVA seem par for the course, but I always suggest running recursive Bayes alongside them, even if not for publication: for any follow-up treatment study, MCMC techniques can give more information with smaller N's. Pilot studies are a great opportunity to apply all sorts of methods without having to commit to a particular result, which is another strength of Bayes. You don't have to publish all the results, but you keep them in your pocket going forward.

For the subgroup analysis you'll want roughly a two-fold increase in your N as a general principle, which may be VERY difficult for a pilot study given everything I've mentioned before. But go big or go home; I wish you the best of luck in writing that grant up.

15. ## Re: Choosing a sample size for a pilot study

Lowpro,

All good thoughts. Actually this project is all about using proteins for tweaking inflammation and allostatic loads, so when I say proteomics I have a lot of things mixed up in that term (but it is all protein, right?). However, you are correct that nobody in academia (or industry) talks to each other, which is why I exist: to span the gaps. So long as I am feeding investigators nice things they can publish on, they are happy to do the work.

On the patients, I expect to lose very few. This therapy will have demonstrable results in six months or not at all. It is a one-dose, day 0 intervention with follow-up visits at six months and one year. Given that the patients are sick, but not too sick, I would not expect to lose more than one patient to sudden death, which makes life easier from that standpoint.

17. ## Re: Choosing a sample size for a pilot study

I am curious why power does not matter in a pilot study. Is the only purpose of the study to work out the design-of-experiment elements (for example, how to address confounds), so that the results themselves don't matter? Because if the results do matter, then having very low power would tend to invalidate them. You might fail to reject the null when you absolutely should because of this.

18. ## Re: Choosing a sample size for a pilot study

Pilot studies are meant to clarify and modify the logistics of a full trial intervention, be it sample size and power, misclassification biases inherent in data acquisition, analysis methods etc. I've heard it called a feasibility study.

http://www.biomedcentral.com/1471-2288/10/1

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3081994/

It's not for hypothesis testing but more of a quality control stamp on methods and data acquisition before a larger study.