# A dumb question about univariate analysis (repost from applied statistics)

#### Kimboz

##### New Member
Hello
I'm quite dumb regarding statistics, but I really am looking for a simple answer to a simple question.
The question is: how do I explore data on a medium-sized set (roughly 50-60) of continuous variables measured in two groups of samples (one "control" and one "test")?

Let's imagine I have two wines which taste different, and I want to determine why they taste so different.

I take one glass of wine from each of three random bottles of each wine (let's assume the population of bottles of the same wine type is quite homogeneous, but not completely identical) and run each of the 6 samples (3 from the "control" wine and 3 from the "test" wine) through a GC-MS mass spectrometer, obtaining roughly 60 peaks, each one corresponding to a compound present in the wine. Each peak's area is measured, giving a good quantitative measurement of how much of that compound is in each of the 6 samples. Now I have the quantities detected for each compound (obviously each compound is measured on a different scale, and possibly in a different unit, which means that one could be around 100 mg/L and another could be in the vicinity of 3 mM).

Questions now:
- What statistical test is suitable for determining whether the measured compounds are present in different amounts in the "test" wine as compared to the "control" wine? That is, how can we compare the means of the two groups for each of the variables tested?
- Is there a free program (preferably with a spreadsheet interface) that I can use to do the test?
- How can I explore the interrelation between variables? For example: is there a simple way (software) to check whether increases in one compound are associated with decreases in another?

Thank you very much, and apologies in advance.
Andrea

#### Karabiner

##### TS Contributor
You can compare measured properties between the two
groups using the Mann-Whitney U test. You can
analyse associations between properties using the
Spearman rank correlation.
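
To make the two suggestions concrete, here is a minimal sketch in plain Python
(standard library only). The peak areas are made-up numbers, and the function
names `ranks`, `mann_whitney_u` and `spearman_rho` are just illustrative helpers,
not part of any particular package. A statistics package would normally do this
for you and also report p-values.

```python
def ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def mann_whitney_u(x, y):
    """Mann-Whitney U statistic (the smaller of U1 and U2) for two groups."""
    r = ranks(list(x) + list(y))
    rank_sum_x = sum(r[:len(x)])
    u1 = rank_sum_x - len(x) * (len(x) + 1) / 2
    u2 = len(x) * len(y) - u1
    return min(u1, u2)


def spearman_rho(a, b):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    ra, rb = ranks(a), ranks(b)
    n = len(a)
    mean_a, mean_b = sum(ra) / n, sum(rb) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(ra, rb))
    var_a = sum((p - mean_a) ** 2 for p in ra)
    var_b = sum((q - mean_b) ** 2 for q in rb)
    return cov / (var_a * var_b) ** 0.5


# Hypothetical peak areas for one compound (3 control, 3 test samples).
control = [10.1, 11.3, 9.8]
test = [14.2, 15.0, 13.7]
u = mann_whitney_u(control, test)

# Hypothetical second compound, measured in the same 6 samples.
compound_a = control + test
compound_b = [5.2, 4.9, 5.5, 3.1, 2.8, 3.4]
rho = spearman_rho(compound_a, compound_b)

print("U =", u)
print("Spearman rho =", rho)
```

In this toy data the two groups do not overlap at all, so U takes its most
extreme value, and the second compound falls exactly as the first rises, so
rho is at its negative limit. Both tests use only ranks, which is why the
different scales and units of the compounds do not matter here.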

The problem is the extremely small sample size, n1=n2=3.
I don't know whether the tests can yield statistically significant
results at all on the conventional p < 0.05 level, given a total
sample size of only 6. Moreover, if one performs such a huge
number of tests, conventionally one would have to contemplate
how to deal with the possible increase of false-positive results,
due to multiple testing. Usually, the significance level would have
to be more conservative (smaller than 0.05), but that would make
significant results impossible with only n=6.
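
The sample-size concern can be checked by brute force. The sketch below
(plain Python; `min_two_sided_p` is a hypothetical helper name, and it assumes
no tied values) lists every way the ranks 1..6 can be split between two groups
of 3 and counts how many splits are as extreme as complete separation. It also
shows a Bonferroni-style threshold, assuming roughly 60 tests as in the
question. Bonferroni is only one of several possible corrections.

```python
from itertools import combinations


def min_two_sided_p(n1, n2):
    """Smallest exact two-sided Mann-Whitney p-value, assuming no ties."""
    n = n1 + n2
    mean_u = n1 * n2 / 2
    # U statistic of group 1 for every possible assignment of ranks to it.
    us = [sum(c) - n1 * (n1 + 1) / 2
          for c in combinations(range(1, n + 1), n1)]
    most_extreme = max(abs(u - mean_u) for u in us)
    # Count the assignments that are at least as extreme as the extreme one.
    hits = sum(1 for u in us if abs(u - mean_u) >= most_extreme)
    return hits / len(us)


p_min = min_two_sided_p(3, 3)

# Bonferroni-corrected per-test level, assuming about 60 compounds.
n_tests = 60
bonferroni_alpha = 0.05 / n_tests

print("smallest possible two-sided p with n1 = n2 = 3:", p_min)
print("Bonferroni threshold for", n_tests, "tests:", bonferroni_alpha)
```

With 3 samples per group there are only 20 possible rank splits, and 2 of them
are maximally extreme, so the exact two-sided p-value can never drop below
2/20 = 0.1, even before any correction for multiple testing. With 4 samples
per group the same count gives 2/70, which is below 0.05 for a single test but
still far above a threshold corrected for 60 tests.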

With kind regards

K.

Dear Karabiner