# Thread: Grouping distributions together based on significance

1. ## Grouping distributions together based on significance

Hi, all:

My PhD research has wandered into an area that requires more statistical analysis than I am currently comfortable with, and so I'm looking for some guidance.

I have a small collection of items (~10) and a procedure that will measure how "good" each item is. However, for each item, the procedure doesn't give a single number as an answer, but rather a distribution of goodness values. I don't have any particular intuition about these distributions; for example, I can't assume that they have identical variances. The distributions are each sampled 100 times, but I could increase this if necessary.

Using Welch's t-test (the version of Student's t-test for unequal variances), I know I can compare the goodness distributions of two items, determine whether their means are significantly different, and then compare the means to order the two items. This is fine as far as it goes.
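
For concreteness, here is a minimal sketch of that pairwise comparison using the unequal-variance t-test. It assumes SciPy is available; the sample data, the item names `a1` and `a5`, and the 0.05 threshold are made up for illustration and are not real measurements.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic stand-ins for the 100 goodness samples of two items,
# deliberately given different spreads.
a1 = rng.normal(loc=0.9, scale=0.05, size=100)
a5 = rng.normal(loc=0.1, scale=0.20, size=100)

alpha = 0.05  # assumed significance level

# equal_var=False selects the unequal-variance (Welch) form of the test.
result = stats.ttest_ind(a1, a5, equal_var=False)
significantly_different = bool(result.pvalue < alpha)

# Only order the two items when the test says their means differ.
better = None
if significantly_different:
    better = "a1" if a1.mean() > a5.mean() else "a5"

print(f"t = {result.statistic:.2f}, p = {result.pvalue:.3g}, better = {better}")
```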

However, what I'd really like to do is to rank the set of items into a set of groups (or "equivalence classes"), where items in the same group do not have means that are significantly different from each other, and items between groups do have means that are significantly different.

For example, if the set of items is {a1, a2, a3, a4, a5, a6}, I'd like to be able to say something like the following:

Item a5 has mean goodness 0.1
Items {a2, a4} are equivalent and have mean goodness 0.4
Item a6 has mean goodness 0.7
Items {a3, a1} are equivalent and have mean goodness 0.9

Does this make sense, or ring any bells?

I know I can use an ANOVA test to tell me if the means of three or more distributions are significantly different, and came up with the following algorithm to find the groups:

1) Start with each item in its own group.
2) Pick two groups at random and test whether the items in the combined group would have significantly different means (via ANOVA or t-test). If the means are not significantly different, combine the two groups into a single group.
3) Repeat until no two groups can be combined any more.
4) Rank the groups using the pooled means of the goodness samples of all items in a group.
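
The four steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `group_items`, the 0.05 threshold, and the synthetic data are all hypothetical, and it uses SciPy's one-way ANOVA (`scipy.stats.f_oneway`), which assumes equal variances and so may not suit the real data. No correction for repeated testing is applied.

```python
import itertools
import random

import numpy as np
from scipy import stats


def group_items(samples, alpha=0.05, seed=0):
    """Greedy merging sketch. samples maps item name -> 1-D array of values."""
    rng = random.Random(seed)
    groups = [[name] for name in samples]  # step 1: one group per item
    merged = True
    while merged:  # step 3: repeat until no merge succeeds
        merged = False
        pairs = list(itertools.combinations(range(len(groups)), 2))
        rng.shuffle(pairs)  # step 2: try pairs of groups in random order
        for i, j in pairs:
            candidate = groups[i] + groups[j]
            # One-way ANOVA across every item in the would-be group.
            p = stats.f_oneway(*(samples[n] for n in candidate)).pvalue
            if p >= alpha:  # no significant difference found: merge
                groups = [g for k, g in enumerate(groups) if k not in (i, j)]
                groups.append(candidate)
                merged = True
                break

    def pooled_mean(group):
        return float(np.mean(np.concatenate([samples[n] for n in group])))

    # step 4: rank groups by the pooled mean of all their samples
    return sorted(((sorted(g), pooled_mean(g)) for g in groups),
                  key=lambda pair: pair[1])


# Synthetic data loosely following the example means in the post.
data_rng = np.random.default_rng(0)
true_means = {"a1": 0.9, "a2": 0.4, "a3": 0.9, "a4": 0.4, "a5": 0.1, "a6": 0.7}
demo = {n: data_rng.normal(m, 0.1, size=100) for n, m in true_means.items()}
for members, mean in group_items(demo):
    print(members, round(mean, 2))
```

Running it with different `seed` values is one quick way to check whether the grouping depends on the order in which pairs are tried.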

What do you think? Is this a known problem with a better solution in the literature? Is this a stable procedure, i.e., does the result depend on the order in which we select groups in step 2? Am I horribly misusing these tests?

Thanks in advance for reading through this. Any pointers at all will be helpful!

2. Sounds like you need to use a post hoc procedure like Tukey's HSD and friends.
That'll give you your groups.
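
A rough sketch of what that could look like, assuming SciPy 1.8 or later (which provides `scipy.stats.tukey_hsd`). The data, item names, and 0.05 threshold are invented for illustration. Note that Tukey's HSD assumes equal variances across items, which may not hold here.

```python
import numpy as np
from scipy import stats  # stats.tukey_hsd needs SciPy >= 1.8

rng = np.random.default_rng(1)

# Synthetic goodness samples, 100 per item (not real measurements).
true_means = {"a1": 0.9, "a2": 0.4, "a3": 0.9, "a4": 0.4, "a5": 0.1, "a6": 0.7}
names = list(true_means)
samples = [rng.normal(true_means[n], 0.1, size=100) for n in names]

alpha = 0.05  # assumed family-wise significance level

# All pairwise comparisons at once; res.pvalue is a k-by-k matrix.
res = stats.tukey_hsd(*samples)

# Pairs whose means are NOT significantly different are grouping candidates.
indistinct = [
    (names[i], names[j])
    for i in range(len(names))
    for j in range(i + 1, len(names))
    if res.pvalue[i, j] >= alpha
]
print(indistinct)
```

One caveat: "not significantly different" is not transitive, so the pairs that come out may form overlapping groups rather than clean equivalence classes.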

3. Huh, interesting. Very much what I was looking for. Thanks for the help!