04-13-2006, 06:21 PM
I'm trying to run a cluster analysis on monthly water quality data collected over an 8-year period at 48 sites. Each site has a set of measured parameters, and I want to cluster the sites based on those parameters. In my database, each record holds the monthly information for a single site. Because of that organization, I believe I need to restructure the data to force the cluster analysis to cluster sites instead of parameters. What I did was aggregate each site's data by the median (I chose the median because the data call for a nonparametric approach), then transpose the result so that each field contains one site's data and each record holds the medians for one parameter, and then run the cluster analysis. Is this a valid way to explore clusters? It seems like I'm eliminating variability by only examining the medians. Sorry for the confusing explanation; it may be best to explain visually.
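As a rough illustration of the restructuring described above, here is a minimal Python sketch of the two steps: taking the median of each parameter per site, then transposing so that each column is a site and each row is a parameter. The record layout, site names, parameter names, and values are all hypothetical and are not taken from the actual database. Which orientation a given clustering tool expects is also an assumption to check.

```python
from collections import defaultdict
from statistics import median

# Hypothetical monthly records: (site, month, {parameter: value}).
# These values are invented purely for illustration.
records = [
    ("site_A", "2001-01", {"pH": 7.1, "nitrate": 0.8}),
    ("site_A", "2001-02", {"pH": 7.3, "nitrate": 1.0}),
    ("site_A", "2001-03", {"pH": 7.2, "nitrate": 3.0}),
    ("site_B", "2001-01", {"pH": 6.4, "nitrate": 2.1}),
    ("site_B", "2001-02", {"pH": 6.6, "nitrate": 2.5}),
]


def site_medians(records):
    """Collapse monthly records to one median per (site, parameter)."""
    values = defaultdict(lambda: defaultdict(list))
    for site, _month, params in records:
        for name, value in params.items():
            values[site][name].append(value)
    return {
        site: {name: median(vals) for name, vals in by_param.items()}
        for site, by_param in values.items()
    }


medians = site_medians(records)
sites = sorted(medians)
parameters = sorted({p for row in medians.values() for p in row})

# One row per site, one column per parameter.
site_rows = [[medians[s][p] for p in parameters] for s in sites]

# Transposed layout: one row per parameter, one column per site.
transposed = [list(col) for col in zip(*site_rows)]

for name, row in zip(parameters, transposed):
    print(name, dict(zip(sites, row)))
```

Note that this step keeps only one number per site and parameter, so the month-to-month spread within a site is no longer visible to whatever clustering routine is run on the result.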