# Sample size needed for an observational correlation study?

#### warhunter

##### New Member
hi

I'm pretty helpless with this. Is any kind soul willing to guide me? I'm writing up a proposal for a correlation study.

The dependent variable and the independent variable are both continuous variables.

e.g.
tightness of skin (%)
thickness of skin (µm)

I know I should run a Pearson's correlation on the two variables, but I'm totally lost on the a priori tests to determine the sample size and power.

How would I determine the minimum sample size required and the power of the test? Someone please help.

#### noetsi

##### Fortran must die
You have to be careful about three distinct concepts tied to "required" sample size that often get mixed up in practice. The first is statistical power. Power is the probability of rejecting the null when it is false, so it is directly tied to the chance of a Type II error (not rejecting the null when you should). Other factors influence power too, such as the effect size and the significance level, but sample size is a critical element. When people in statistics talk about a required sample size, they usually mean the sample size needed for adequate power.
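For the Pearson correlation in the original question, a rough a priori power calculation can be sketched with the Fisher z approximation, $n \approx \left(\frac{z_{1-\alpha/2} + z_{1-\beta}}{\operatorname{atanh}(r)}\right)^2 + 3$. This is only an illustration: the function name `n_for_correlation` is made up here, and the expected correlation `r`, the two-sided `alpha` of 0.05, and the target power of 0.80 are assumptions you would have to justify in your proposal (from pilot data or earlier studies). Dedicated power software uses exact methods and may give a slightly different number.

```python
from math import atanh, ceil
from statistics import NormalDist


def n_for_correlation(r, alpha=0.05, power=0.80):
    """Approximate sample size to detect a true correlation r.

    Uses the Fisher z approximation for a two-sided test of rho = 0.
    r, alpha and power are planning assumptions, not facts from the data.
    """
    if not 0 < abs(r) < 1:
        raise ValueError("r must be nonzero and strictly between -1 and 1")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = std_normal.inv_cdf(power)          # quantile for the target power
    # Fisher's z-transform of r has standard error 1 / sqrt(n - 3)
    return ceil(((z_alpha + z_power) / atanh(abs(r))) ** 2 + 3)


# Hypothetical planning value: a "medium" correlation of 0.3
print(n_for_correlation(0.3))  # 85 under this approximation
```

The smaller the correlation you expect, the larger the sample this calls for, which is why the assumed `r` matters more than anything else in the calculation.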

The second is generalizability: how certain you can be that your results carry over from your sample to a larger population. As sample size increases, you can be more certain that they generalize, and the margin of error on your results becomes smaller. This requires that your sample be drawn randomly (a convenience sample, regardless of size, will not meet this requirement). It is not uncommon for statistical discussions to ignore generalizability and focus on power. In my experience this is especially true in medical fields, where the sample size may be adequate for power but is far too low (and not randomly selected) to generalize to large populations. At heart I think this reflects the difference between internal validity (which gets most of the emphasis in statistics) and external validity (which gets less).
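To see how the margin of error shrinks with sample size, here is a minimal sketch of an approximate confidence interval for a correlation, again using the Fisher z-transform. The helper name `correlation_ci` and the example values are hypothetical, and the interval assumes a random sample from a roughly bivariate normal population. It says nothing useful about a convenience sample, however large.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist


def correlation_ci(r, n, confidence=0.95):
    """Approximate confidence interval for a Pearson correlation.

    Assumes a random sample of size n; the interval is built on the
    Fisher z scale and transformed back to the correlation scale.
    """
    if n <= 3:
        raise ValueError("n must be greater than 3")
    if not -1 < r < 1:
        raise ValueError("r must be strictly between -1 and 1")
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = 1 / sqrt(n - 3)   # standard error on the Fisher z scale
    centre = atanh(r)      # Fisher z-transform of the observed r
    return tanh(centre - z_crit * se), tanh(centre + z_crit * se)


# Same observed correlation, increasing sample sizes: the interval narrows
for n in (20, 50, 200):
    low, high = correlation_ci(0.3, n)
    print(f"n={n}: {low:.2f} to {high:.2f}")
```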

Third, many authors argue that you need a minimum sample size to use a method at all. These are rules of thumb, and authors disagree on what they should be. They are usually proposed because many methods are large-sample methods: their properties are only guaranteed asymptotically (for example, the estimates are asymptotically unbiased), so the accuracy of the results with small samples is either unknown or questionable. In general, the larger your sample, the more robust your method is. Unlike power and generalizability, there is no agreement on what you need to address these issues (or none that I have found, anyhow).

Depending on which of these you mean, you will get different sample size requirements.