Hierarchical Cluster Analysis: which method to use?

Hi all,

I have a data set consisting of 32 records (linguistic constructions) and count data on how often each construction is used for each of 5 variables (semantic categories).

In order to assess the relationships between my records, I wanted to run a hierarchical cluster analysis. But which clustering method should I select in SPSS? I'm hoping to find two or three clear clusters, so my main interest is to maximise the distance between clusters.

thanks for your help!


Dark Knight
You can try "complete linkage clustering" or "average linkage clustering" (the latter is the same as the SPSS default, "between-groups linkage").
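
To get a feel for how the two criteria differ, here is a minimal sketch in plain Python (not SPSS syntax). The function name `agglomerate` and the toy counts are made up for illustration and are not your data; Euclidean distance on the raw counts is also just an assumption. Complete linkage scores two clusters by their farthest pair of members, average linkage by the mean distance over all between-cluster pairs.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length rows."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerate(rows, n_clusters, method="average"):
    """Naive agglomerative clustering (hypothetical helper).

    Starts with one cluster per row and repeatedly merges the two
    closest clusters until n_clusters remain. Returns lists of row indices.
    """
    clusters = [[i] for i in range(len(rows))]

    def cluster_distance(c1, c2):
        dists = [euclidean(rows[i], rows[j]) for i in c1 for j in c2]
        if method == "complete":
            return max(dists)               # farthest pair of members
        if method == "average":
            return sum(dists) / len(dists)  # mean over all between-cluster pairs
        raise ValueError("method must be 'complete' or 'average'")

    while len(clusters) > n_clusters:
        # Pick the pair of clusters (a < b) with the smallest linkage distance.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: cluster_distance(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]

    return [sorted(c) for c in clusters]


# Invented counts: 6 constructions x 5 semantic categories.
toy_counts = [
    [10, 2, 0, 1, 0],
    [12, 1, 0, 0, 1],
    [0, 9, 8, 1, 0],
    [1, 11, 7, 0, 0],
    [0, 0, 1, 10, 12],
    [1, 0, 0, 9, 14],
]

for method in ("complete", "average"):
    print(method, agglomerate(toy_counts, 3, method))
```

On well-separated data like this toy set the two methods agree; they tend to differ when clusters are elongated or overlap, so it is worth running both and comparing the dendrograms.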

Tnx Richie, I'll try that.

Another issue: of my 32 constructions, I have a priori knowledge that some of them are never used for particular semantic categories. Since this is fundamentally different from a zero count (where a construction might be used, but isn't), I have currently coded these as missing values.

This is where things go wrong. Apparently, the Hierarchical Clustering algorithm in SPSS excludes cases with missing values altogether, leaving insufficient data to perform the analysis. I have considered estimating missing values, but this is completely counterintuitive: if anything, they should be zero.

How should I go about this? I know that programmes like Clustan use algorithms that simply ignore cells with missing values, but my employer doesn't want to pay for another software package...

thanks again!
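
One possible workaround, sketched in plain Python rather than SPSS: compute the distance between two constructions over only the cells that are present in both, and rescale it so that pairs with fewer shared cells stay comparable. This is only an assumption about what "ignoring cells with missing values" could mean; I can't say it is what Clustan actually does. The function name `distance_ignoring_missing`, the use of `None` for a structurally impossible cell, and the example rows are all made up for illustration.

```python
import math


def distance_ignoring_missing(a, b):
    """Euclidean distance over cells present in both rows (hypothetical helper).

    Cells coded as None are skipped. The squared distance is rescaled by
    (total variables / shared variables) so that rows sharing few cells are
    not automatically "closer". Returns None if the rows share no cells.
    """
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return None
    squared = sum((x - y) ** 2 for x, y in shared)
    return math.sqrt(squared * len(a) / len(shared))


# Invented rows: None marks a category the construction can never be used for.
row_a = [3, None, 0, 4, 1]
row_b = [0, 2, 0, None, 1]

print(distance_ignoring_missing(row_a, row_b))
```

Applying this to every pair of rows gives a full distance matrix, which could then be passed to any clustering routine that accepts precomputed distances instead of raw data.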