# Best way to calculate power, precision & sample size using complex pilot data set?

#### statsSeeker

##### New Member
I have a large and complex pilot data set of electromyographic (EMG) data collected from a human subject performing several different exercises; a single subject was used for all data collection in this pilot experiment. Each trial involves one particular exercise performed with one value of each of several parameters that were varied between trials (e.g. stance width, cadence/speed, load being lifted); each parameter took 2-4 levels across trials. In each recorded trial, the subject performed 3-5 repetitions of the exercise. The EMG data reported are the mean EMG signal and the integrated EMG signal for each repetition; standard deviations are calculated over sets of N repetitions.

To clarify the description above, here are the data for our subject performing a rowing exercise in 2 trials at 2 different cadences (Slow vs. Fast), with all other exercise parameters held the same. The values are integrated EMG (iEMG) data for one of the 16 muscles recorded:

| Repetition | Slow | Fast |
| --- | --- | --- |
| 1 | 0.1361 | 0.1021 |
| 2 | 0.1812 | 0.1374 |
| 3 | 0.2002 | 0.1412 |
| 4 | 0.2730 | 0.152 |
| 5 | 0.2765 | (no 5th repetition collected) |
| Mean +/- s.d. | 0.2134 +/- 0.0607 | 0.1332 +/- 0.0216 |
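
As a quick check on these numbers, here is a minimal Python sketch (standard library only) that recomputes the mean and sample standard deviation for each cadence and a pooled-SD standardized difference (Cohen's d). The function names `summarize` and `cohens_d` are illustrative, not from any package. Note that the standard deviations here describe repetition-to-repetition variation within one subject only, so an effect size built from them says nothing about variation between subjects.

```python
import math
import statistics

# iEMG values per repetition, copied from the table above.
slow = [0.1361, 0.1812, 0.2002, 0.2730, 0.2765]
fast = [0.1021, 0.1374, 0.1412, 0.152]  # no 5th repetition collected


def summarize(values):
    """Return (mean, sample standard deviation) of a list of repetitions."""
    return statistics.fmean(values), statistics.stdev(values)


def cohens_d(a, b):
    """Standardized mean difference using the pooled sample variance."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    pooled_var = ((len(a) - 1) * var_a + (len(b) - 1) * var_b) / (len(a) + len(b) - 2)
    return (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(pooled_var)


if __name__ == "__main__":
    for label, reps in (("Slow", slow), ("Fast", fast)):
        mean, sd = summarize(reps)
        print(f"{label}: {mean:.4f} +/- {sd:.4f} (N = {len(reps)})")
    print(f"Within-subject Cohen's d: {cohens_d(slow, fast):.2f}")
```

A large standardized difference computed this way would tend to make a standard power calculator return a very small sample size, because the calculator treats each repetition as if it were an independent subject.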

Regarding my pilot data set, for this single subject, I have data for:
~ 4 different exercises
~ several different levels of several different parameters, with one unique combination per trial (e.g. Heavy Load + Slow Cadence + Wide Stance Width, which can be compared with other trials by varying one parameter at a time, like comparing the above to Medium Load + Slow Cadence + Wide Stance Width or Heavy Load + Slow Cadence + Narrow Stance Width)
~ Concentric and Eccentric phases of each repetition
~ EMG data for each of 16 muscles recorded (2 sets of data: mean EMG signal and integrated EMG signal)
~ N repetitions per trial (again, N = 3 to 5)

Now, I need to use this pilot data set from one single subject performing all of these variations on these 4 exercises to determine the power, precision, and number of subjects needed to achieve statistical significance in a future experiment. So far, I have been trying to use online power calculators to compare two trials at a time (although I should eventually also be comparing 3 or more trials, e.g. all conditions held the same except for Light vs. Medium vs. Heavy load being lifted).

One concern I have is that in these preliminary calculations, I have been using N = 5 (or, where appropriate N = 4 etc.) for the sample size (where N is one repetition), so that I can use the standard deviation over the set of N repetitions within my power calculation. However, I am not sure how to reconcile the fact that I am currently working with N repetitions from a single subject, but that I need to calculate the number of *subjects*, rather than number of *repetitions*, for future experiments. Does anyone have suggestions about this?

Also, given that this pilot data set is so large, how can I estimate the power, precision and subject sample size without having to perform hundreds of pairwise (or groupwise) comparisons across all exercise, muscle, and parameter combinations? Is it going to be possible to represent this large pilot dataset with a small number of calculations? Should I perform a separate set of calculations for each of the 4 exercises I'm studying, or consider combining all exercises' data into a single, large data set?

This has been a frustrating process: the calculated power is often 100% (although some values are in the 80% range), and the calculated number of subjects is often around 1-3, which I believe is unlikely to be correct. (Studies of this type typically use anywhere from 5-6 to dozens of subjects.)

FYI, I have access to MATLAB and its Statistics Toolbox, if this would be useful.

If anyone can provide suggestions about the best method I should consider using to employ my large pilot single-subject data set to calculate the power and precision that are characteristic of this data set, and the subject sample size required to achieve statistical significance (with the usual default values for power and alpha), I would greatly appreciate your insights. Thank you!!

#### mmercker

##### Member
A common approach for power calculations with a complex design is simulation: generate many artificial data sets (Monte Carlo draws from probability distributions, with means and variances estimated from the real experimental data), apply your statistical test to each one, and count the fraction of data sets in which the test detects the effect you built into the artificial data. That fraction estimates the power.
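
To make the idea concrete, here is a minimal Python sketch of such a simulation for one Slow vs. Fast comparison, under assumptions that are not part of the original data: every subject performs both conditions, the analysis is a paired t-test on per-subject mean differences, and all noise is normal. The function `simulate_power` and its parameters are hypothetical names. The mean difference (0.08) and within-subject SD (0.05) are rough values in the range of the pilot numbers; the between-subject SD (0.08) is a pure placeholder, since it cannot be estimated from a single subject and would have to come from the literature or a multi-subject pilot.

```python
import math
import random
import statistics

# Two-sided 5% critical values of Student's t, keyed by degrees of freedom.
# Only df 1..15 are tabulated, so this sketch handles 2 to 16 subjects.
T_CRIT_05 = {
    1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
    6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228,
    11: 2.201, 12: 2.179, 13: 2.160, 14: 2.145, 15: 2.131,
}


def simulate_power(n_subjects, n_reps, mean_diff, sd_between, sd_within,
                   n_sims=2000, seed=0):
    """Estimate power of a paired t-test (alpha = 0.05) by Monte Carlo.

    mean_diff  : assumed population mean of (Slow - Fast)
    sd_between : assumed SD of the true difference across subjects
    sd_within  : assumed SD across repetitions within one subject/condition
    """
    rng = random.Random(seed)
    t_crit = T_CRIT_05[n_subjects - 1]
    significant = 0
    for _ in range(n_sims):
        diffs = []
        for _ in range(n_subjects):
            # Each subject gets their own true Slow - Fast difference.
            true_diff = rng.gauss(mean_diff, sd_between)
            slow = [rng.gauss(true_diff, sd_within) for _ in range(n_reps)]
            fast = [rng.gauss(0.0, sd_within) for _ in range(n_reps)]
            diffs.append(statistics.fmean(slow) - statistics.fmean(fast))
        # One-sample t statistic on the per-subject differences.
        se = statistics.stdev(diffs) / math.sqrt(n_subjects)
        if abs(statistics.fmean(diffs) / se) > t_crit:
            significant += 1
    return significant / n_sims


if __name__ == "__main__":
    for n in (3, 5, 8, 12):
        power = simulate_power(n_subjects=n, n_reps=4, mean_diff=0.08,
                               sd_between=0.08, sd_within=0.05)
        print(f"{n:2d} subjects: estimated power = {power:.2f}")
```

The unit of replication in this sketch is the subject, not the repetition: repetitions only reduce the noise in each subject's mean, while the number of subjects sets the degrees of freedom of the test. Rerunning it over a range of plausible `sd_between` values shows how sensitive the required number of subjects is to the one quantity a single-subject pilot cannot supply.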

#### statsSeeker

##### New Member
Thanks for your suggestion, mmercker. I will look into the possibility of using Monte Carlo simulations for this statistical analysis.