Hello,
I have a complete population consisting of 6587 individuals (genes) and for each individual I have a distance value. These distance values are not normally distributed (see attached density plot).
I'm interested in determining whether a sample of 348 individuals (which has a smaller mean distance value than the complete population) differs significantly from what would be expected by chance.
I have limited statistics background and would appreciate advice on what statistical test to use.
Ideas include:
two-sample t-test (probably not right, since it's not a normal distribution)
Wilcoxon rank sum test (buddy suggested this)
some other test that's easy to implement in R
custom test - one idea I had was to randomly pick 348 individuals (with replacement) from the population and compute their mean distance, then repeat this 1000 times (I think this is called bootstrapping). I did this, and I get an approximately normal distribution of means with mean 32000 (same as the population mean) and sd = 2000. The mean distance value for the actual sample of 348 individuals was 18000. So a 1-tailed p-value from a normal distribution with mean 32000 and sd = 2000 suggests this is significant.
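The resampling idea in the last item can be sketched roughly as follows. This is only an illustration in Python using the standard library, not the code that was actually run. The population below is synthetic placeholder data (a skewed distribution with mean near 32000), and the function names resample_means and one_tailed_p_values are hypothetical. Besides the normal-approximation p-value described above, the sketch also computes an empirical p-value (the proportion of resampled means at or below the observed mean), which does not rely on the normality assumption. The same steps should map onto sample() and mean() in R.

```python
import random
import statistics


def resample_means(population, sample_size, n_resamples, rng):
    """Return the mean of each random draw (with replacement) from population."""
    return [
        statistics.fmean(rng.choices(population, k=sample_size))
        for _ in range(n_resamples)
    ]


def one_tailed_p_values(resampled_means, observed_mean):
    """Lower-tail p-values: (normal approximation, empirical proportion)."""
    mu = statistics.fmean(resampled_means)
    sigma = statistics.stdev(resampled_means)
    # Normal approximation, as described in the post.
    p_normal = statistics.NormalDist(mu, sigma).cdf(observed_mean)
    # Empirical alternative: share of resampled means <= observed mean.
    # The +1 terms keep the estimate from ever being exactly zero.
    n_low = sum(m <= observed_mean for m in resampled_means)
    p_empirical = (n_low + 1) / (len(resampled_means) + 1)
    return p_normal, p_empirical


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# ASSUMPTION: placeholder data only, standing in for the real 6587 distances.
population = [rng.expovariate(1 / 32000) for _ in range(6587)]

means = resample_means(population, sample_size=348, n_resamples=1000, rng=rng)
p_normal, p_empirical = one_tailed_p_values(means, observed_mean=18000)
print("normal-approximation p:", p_normal)
print("empirical p:", p_empirical)
```

With 1000 resamples the empirical p-value cannot go below 1/1001, so more resamples would be needed to resolve very small p-values.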
Thanks in advance for any advice!