B_Miner

08-07-2010, 08:49 PM

I am trying to derive the EM algorithm (applied to clustering) for the case where there is one numeric variable (x1), assumed to be Gaussian distributed, and one nominal variable (x2) with three levels (a1, a2, a3). I am assuming there are two clusters (C_1 and C_2). I am also assuming that the variables are independent and that the examples (aka cases, aka records) are independent.
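One way to write this setup down, as a sketch only: the symbols below (mixing weights \pi_k, Gaussian parameters \mu_k and \sigma_k^2, level probabilities \theta_{kj}, and n for the number of examples) are notation assumed here for illustration, and the independence of x1 and x2 is taken to hold within each cluster.

```latex
% Observed-data likelihood for n independent examples (x_{i1}, x_{i2}),
% two clusters k = 1, 2 and three levels j = 1, 2, 3 of the nominal variable.
L(\Theta)
  = \prod_{i=1}^{n} \sum_{k=1}^{2}
    \pi_k \,
    \mathcal{N}\!\left(x_{i1} \mid \mu_k, \sigma_k^2\right)
    \prod_{j=1}^{3} \theta_{kj}^{\,\mathbb{1}[x_{i2} = a_j]},
\qquad
\sum_{k=1}^{2} \pi_k = 1,
\quad
\sum_{j=1}^{3} \theta_{kj} = 1 \ \text{for each } k .
```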

I am attaching my work because the question may be easier to follow that way.

Does anyone know how to set this up so that I can take the derivatives and derive the maximum likelihood estimates? A nudge in the right direction would be very much appreciated.
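For what it is worth, here is a minimal numerical sketch of what the resulting iteration might look like, assuming the standard closed-form updates for a mixture with one Gaussian and one categorical variable that are independent within each cluster. The function name `em_mixed`, the parameter names (`pi`, `mu`, `var`, `theta`), the integer coding of the levels a1, a2, a3 as 0, 1, 2, and the quantile-based initialization are all assumptions made for illustration. It is not the derivation itself.

```python
import numpy as np

def em_mixed(x1, x2, n_clusters=2, n_levels=3, n_iter=100):
    """EM sketch for a mixture with one Gaussian and one categorical variable.

    x1: numeric values; x2: integer level codes in 0..n_levels-1 (assumed coding).
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=int)
    n = x1.shape[0]
    onehot = np.eye(n_levels)[x2]                      # shape (n, n_levels)

    # Initialization (an arbitrary choice): equal weights, means at spread-out
    # quantiles of x1, pooled variance, uniform level probabilities.
    pi = np.full(n_clusters, 1.0 / n_clusters)
    mu = np.quantile(x1, (np.arange(n_clusters) + 0.5) / n_clusters)
    var = np.full(n_clusters, x1.var())
    theta = np.full((n_clusters, n_levels), 1.0 / n_levels)

    for _ in range(n_iter):
        # E-step: responsibility of each cluster for each example.
        dens = np.exp(-0.5 * (x1[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        joint = pi * dens * theta[:, x2].T             # shape (n, n_clusters)
        resp = joint / joint.sum(axis=1, keepdims=True)

        # M-step: responsibility-weighted maximum likelihood estimates.
        nk = resp.sum(axis=0)
        pi = nk / n
        mu = resp.T @ x1 / nk
        var = (resp * (x1[:, None] - mu) ** 2).sum(axis=0) / nk
        var = np.maximum(var, 1e-6)                    # guard against collapse
        theta = (resp.T @ onehot) / nk[:, None]

    return pi, mu, var, theta, resp
```

Each M-step line is just a weighted version of the usual estimate (weighted proportion, weighted mean, weighted variance, weighted level frequencies), with the E-step responsibilities as the weights.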

Thanks!

~B

