What is the method to analyze characteristics of two groups?

Hello there,

I'm new to the statistics world and I hope that you can help me.

I have two groups of samples that are divided by one y (dependent) variable:

y = binary: is a good employee or not

Then I have ten variables (x), divided into demographic variables and profile test results. Examples: age, emotion level, whether he/she has kids, etc.

My question is: which is the best test to identify the best predictor variables? What is the best way to construct a predictive model? Is cluster analysis useful for that?

I would be grateful if you could help this newbie in statistics.


TS Contributor
I understand that you have a categorical dependent variable (DV) with two levels (good employee/bad employee) and several independent variables (IVs, i.e., predictors), and that you want to build a predictive model.
Since your DV is binary, you may want to look into binary logistic regression to assess how the IVs help you predict the positive outcome of the DV (i.e., being a good employee).
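
To make the idea concrete, here is a minimal sketch of a binary logistic regression fitted by plain gradient ascent, using only the Python standard library. The data are simulated, and the predictor names (`age_z` for a standardized age, `has_kids` as a 0/1 flag) are my assumptions for illustration, not your actual variables. A real analysis would more likely use a statistics package that also reports standard errors and p-values.

```python
import math
import random

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic(X, y, lr=0.5, n_iter=1500):
    """Return [intercept, b1, ..., bk], fitted by gradient ascent
    on the mean log-likelihood of a binary logistic model."""
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(n_iter):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = yi - p              # observed outcome minus predicted probability
            grad[0] += err
            for j in range(k):
                grad[j + 1] += err * xi[j]
        w = [wj + lr * gj / n for wj, gj in zip(w, grad)]
    return w

# Simulated, hypothetical data: 300 employees, two predictors.
rng = random.Random(0)
X, y = [], []
for _ in range(300):
    age_z = rng.gauss(0, 1)                       # standardized age (assumed)
    has_kids = 1.0 if rng.random() < 0.5 else 0.0
    p_good = sigmoid(1.5 * age_z)                 # assumed truth: only age matters
    X.append([age_z, has_kids])
    y.append(1 if rng.random() < p_good else 0)   # 1 = good employee

w = fit_logistic(X, y)
for name, coef in zip(["intercept", "age_z", "has_kids"], w):
    print(f"{name:>9}: coef = {coef:+.3f}, odds ratio = {math.exp(coef):.2f}")
```

An odds ratio well away from 1 marks a predictor that shifts the odds of being a good employee. An odds ratio near 1 suggests the predictor adds little once the others are in the model.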

I assume that you have little or no familiarity with logistic regression. I suggest reading some books that I have found useful for non-math/non-stat people (like me):
Logistic Regression: A Primer
Logistic Regression Using SAS: Theory and Application
Discovering Statistics Using R


Thanks for the reply.

At first I was thinking about using logistic regression. In fact, I started with that.
But recently the sales director of my company suggested that I analyse the two groups (DV) separately.

For example: analyse the good employees' sample and assume that all of its independent variable values are good. Then do the inverse with the other sample, assuming that its variable values are bad.

He suggested a methodology called "Correlation and Segmentation", and said that I can do that with Minitab.

But the correlations that I know are ones like Spearman, Pearson, etc., where I have to put the two groups into the analysis simultaneously. As for segmentation, I thought of cluster methodology, but I don't know if it can help.

Do you have any idea what he was talking about?


TS Contributor
He is probably referring to something known in the Six Sigma world (Minitab is a strong clue) as stratification and segmentation: look at the two groups and see whether there are significant differences in any of the underlying variables. I would still go with the logistic regression approach; statistically, this is much cleaner.
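
For what it is worth, the "look at the two groups" step can be sketched as one test per variable: a permutation test on the difference in means for a numeric variable, and a Pearson chi-square test on a 2x2 table for a yes/no variable. This uses only the Python standard library. All numbers below are invented for illustration, and the variable names (`age`, "has kids") are assumptions borrowed from the examples in the question.

```python
import math
import random

def avg(values):
    return sum(values) / len(values)

def permutation_p_value(a, b, n_perm=5000, seed=0):
    """Two-sided permutation test for a difference in means between a and b."""
    rng = random.Random(seed)
    observed = abs(avg(a) - avg(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)           # reassign group labels at random
        diff = abs(avg(pooled[:len(a)]) - avg(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one correction avoids p = 0

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]].
    Returns (statistic, p_value); with 1 degree of freedom the p-value
    equals erfc(sqrt(statistic / 2))."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return stat, math.erfc(math.sqrt(stat / 2))

# Invented example data, one list per group.
age_good = [34, 41, 38, 45, 39, 42, 36, 44]
age_bad = [27, 31, 25, 33, 29, 30, 28, 35]
p_age = permutation_p_value(age_good, age_bad)
print(f"age: mean good = {avg(age_good):.1f}, mean bad = {avg(age_bad):.1f}, p = {p_age:.4f}")

# Invented 2x2 counts: rows = good/bad, columns = has kids / no kids.
# With counts this small, the chi-square p-value is only a rough guide.
stat_kids, p_kids = chi_square_2x2(6, 2, 3, 5)
print(f"kids: chi-square = {stat_kids:.3f}, p = {p_kids:.3f}")
```

Testing each variable on its own ignores how the predictors overlap with one another, which is one reason the logistic regression is the cleaner route.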