Sorting data into groups according to variance in excel

#1
Hi everybody, i am new to this board, let's see if someone can help me.
I have a excel worksheet with a list of every movie that came out in 2008.

The list is ordered by income of every movie, for example:

Zorro ...................290,000

Benjamin Button .....400,000


And so on.

I have to divide the movies in four groups, and every group has to to have maximum omogeneity in the incomes. That is to say, the average variance of the incomes inside every group has to be minimum.

Somebody told me to use the Anova excel function from the add-ins.
So i used the "anova: single factor" function.
But the problem is that this function lets me measure the average variance INSIDE the group, and between the groups, but before i have to arbitrarily divide the groups by my own means.

Instead i need something that automatically creates the groups so that the various incomes of every group have the lowest statistical variance possible.

I am sorry for the mistakes, i am not a native english speaker.

Thank you in advance if anybody can help ; )
 

Mike White

TS Contributor
#2
You can do this using the Ward clustering method which minimises the within group variances and maximises the between group variance. You can easily do this in R but I think it is also available on some of the Excel addin pacakages. The output is a dendrogram (tree structure) which you can cut so as to produce 4 groups.