I have a number of continuous sociodemographic variables that I have condensed using PCA into 8 factors representing composites such as affluence and life stage.

Now I'd like to perform a cluster analysis using these factor scores together with a standardized, continuous raw test performance variable that has similar variance. My question is: is it OK to mix composite factors and raw (standardized) variables in a cluster analysis? Should I check for equal variance, etc.?
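To make the equal-variance part of the question concrete, here is a minimal sketch of the check I have in mind. It uses only synthetic stand-in data, not my real data, and the helper names (`column_variances`, `standardize_columns`) and the spreads in `factor_sds` are hypothetical. It compares per-column variances before and after z-scoring the eight factor-score columns plus the performance column.

```python
import random
import statistics

def column_variances(rows):
    """Return the population variance of each column in a list of rows."""
    return [statistics.pvariance(col) for col in zip(*rows)]

def standardize_columns(rows):
    """Z-score every column so each has mean 0 and variance 1."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    sds = [statistics.pstdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]

# Synthetic stand-in data (an assumption, NOT the real study data):
# 8 "factor score" columns with deliberately unequal spread,
# plus one already-standardized performance column.
random.seed(0)
factor_sds = [3.0, 2.5, 2.0, 1.5, 1.2, 1.0, 0.8, 0.5]  # hypothetical spreads
rows = [
    [random.gauss(0, sd) for sd in factor_sds] + [random.gauss(0, 1)]
    for _ in range(500)
]

before = column_variances(rows)
after = column_variances(standardize_columns(rows))
print("variances before:", [round(v, 2) for v in before])
print("variances after: ", [round(v, 2) for v in after])
```

The reason I ask about variance is that Euclidean-distance methods such as k-means give more weight to columns with larger variance, so unscaled factor scores could dominate the performance variable or be dominated by it.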

The standardized variable is essentially a dependent variable of these factors (which elsewhere in the study are treated as predictors), so I don't want to include it in the factor model itself. I've also run regression analyses with the factors and raw variables, but the cluster segmentation is what really interests me: "Are there clearly interpretable groupings of demographic factors and performance scores?" and "Can we use those groupings to create a typological model?"

I'm asking here because I know composite factor variables are commonly used in psychological research, for scales and to predict other continuous variables, but I'm not sure whether factor scores and raw variables are ever clustered together. An alternative way to cut down the number of demographic variables would be to use only the variables retained by a regression against the performance score and skip creating factors (which I have already done), but this is not as interesting.

Advice much appreciated.