I have been given an Excel model and my assigned task is (only) to fix it for problems that I might call "unintended consequences" of the underlying math. Currently, the model tries to predict how good a manager is likely to be, based on a weighted average of total scores from 5 categories. The 5 categories, in turn, each contain multiple variables related to goodness. The weights (‘importance factors’) of the categories have been pre-selected based on intuition as to what should produce a good manager; they have nothing to do with how well a category can discriminate between managers (for example, whether it has low variance among managers). Also, the variables within the categories (sometimes 5 variables, sometimes 7, 10, etc.) have been pre-selected and are not to be second-guessed. This is the way they want to do it for now, so please let that lie!

As will become obvious below, the numerical value of the resulting weighted average scores is pretty meaningless. But, in my mind, if one manager scores 60 and another 50, it is OK to conclude that the former is (likely to be) 20% better than the latter. This is what my end user cares about: who in the population is better (even if they are all bad, he will choose the better ones) and by how much (‘betterness’). Still, we should recognize that the betterness is a derived quantity, based on variables that measure only goodness. So that we may focus on ‘unintended consequences’ due to poor math only, I hope you can take all the above as a given! So, let’s assume that the weights (which have nothing to do with ability to discriminate, mind you) and the variables have been surmised divinely and are perfect!

The scores in the five categories are summed from the scores of multiple variables. The latter are always non-negative, and a higher score is always better than a lower score. Each is intended to represent some characteristic that is likely to produce a 'good manager'. So, none of these variables have explicit weighting or ‘importance’. Obviously, points from one variable are just as helpful as points from another, within a category. Currently, each variable can have a score of 0 or 1; 0, 1, or 2; or 0, 1, 2, or 3; actually, sometimes only 0 or 2, or 0 or 3. They think that the maximum amount (not average, mind you) of points allowed for each variable can account for the implied 'importance' of each variable. I wish they could all be put on the same scale (say, from 1 to 10), but my end user, at least for now, is inflexible on this too. Some of these variables are subjective things, and, in some cases, he only sees two results (hence 0 or 1, or 0 and 2, or 0 and 3), and in other cases, he thinks there could be 4 results (0, 1, 2, and 3). The model’s creators felt that, by allowing a maximum (not average, mind you) of 10 points in each category, intuitively spread across the variables, it would be as if they had pre-specified the importance of the variables as well. And, if this worked, the category weightings, which represent ‘importance’ at the higher level, would produce the intended results as well.
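To make sure I'm describing the setup clearly, here is a minimal sketch of the model as it stands. All category names, weights, and scores are invented for illustration; the real ones are pre-assigned by the model's creators:

```python
# Hypothetical sketch of the current scoring scheme. Each category holds
# a list of per-variable scores; the maximum per category is 10, spread
# across variables whose individual maxima differ (1, 2, or 3 points).
manager = {
    "leadership":    [2, 3, 1, 0, 2],          # five variables (invented)
    "communication": [1, 0, 3, 2],
    "planning":      [2, 2, 1, 1, 0, 1, 2],
    "execution":     [3, 0, 2],
    "development":   [1, 1, 2, 0, 3],
}

# Pre-selected category weights ('importance factors'); values invented here.
weights = {
    "leadership": 0.30, "communication": 0.20, "planning": 0.20,
    "execution": 0.20, "development": 0.10,
}

# Category totals are plain sums; the overall score is the weighted average.
category_totals = {cat: sum(scores) for cat, scores in manager.items()}
overall = sum(weights[cat] * category_totals[cat] for cat in weights)
print(category_totals, overall)
```

The point of the sketch: within a category, every point is interchangeable, so a variable's only lever on 'importance' is its maximum.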

My qualms with this model, as it exists, are that, though the max points in each category are the same, the average points may not be. Even if the averages were the same, the standard deviations may not be. Also, the ranges for the variables differ from variable to variable: some 0 to 1, some 0 to 3.

I have concluded, perhaps erroneously, that it would be better if they would pre-assign the weightings of each variable, just as they have already pre-assigned the weightings of each of the 5 categories. Then, if we want the weightings to truly represent importance, we need to divide each variable by its average value. But, even if I can’t get them to switch to explicit weightings, if I divided these variables by their average values, I figure that they would still get the ‘importance’ expected (which, right now, they think they are getting by specifying maximum points for each variable). What I don't like about this approach is that, if I change the score of one manager, it will slightly change the score of all managers, and my end user may not like that.
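Here is a minimal sketch of what I mean by dividing by the average, including the side effect I mentioned. The numbers are invented; `scores[m][v]` is manager m's raw score on variable v:

```python
# Three managers scored on three variables (invented data).
scores = [
    [2, 1, 0],   # manager A
    [1, 1, 3],   # manager B
    [0, 1, 3],   # manager C
]
n = len(scores)
n_vars = len(scores[0])

# Population mean of each variable, taken across managers.
means = [sum(row[v] for row in scores) / n for v in range(n_vars)]

# Mean-normalized: every variable now averages 1.0 across managers, so
# each contributes the same expected points regardless of its raw scale.
normalized = [[row[v] / means[v] for v in range(n_vars)] for row in scores]
totals = [sum(row) for row in normalized]

# The side effect: raising manager A's score on variable 2 changes that
# variable's mean, which rescales manager B's normalized score as well.
normalized_b_var2_before = 3 / means[2]
scores[0][2] = 3
means_after = [sum(row[v] for row in scores) / n for v in range(n_vars)]
normalized_b_var2_after = 3 / means_after[2]
print(totals, normalized_b_var2_before, normalized_b_var2_after)
```

So B's normalized score on that variable drops even though B's raw score never changed, which is exactly the ripple my end user may object to.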

I have also concluded, perhaps erroneously, that I should NOT attempt to control for standard deviation. I figure that, if a highly important variable (in terms of producing a good manager) had almost zero variance, then that variable, however important, would contribute almost nothing toward the ultimate goal of deciding who is the best manager (and by how much). Nevertheless, I have concluded that this is how it should be, since we (only) DERIVE the decisions as to who is a better manager (and how much better, in %) from assumptions about what produces a good manager. For example, if all managers score the same on a variable that is 99% important in predicting the goodness of a manager, the conclusion should be that all managers are virtually the same, with small distinctions being made from the other variable that is 1% important (and presumably has positive variance).
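That 99%/1% example can be put in numbers (all invented) to show what I mean:

```python
# A 99%-weighted variable with zero variance, versus a 1%-weighted
# variable with some spread. Three managers, invented scores.
important_constant = [5, 5, 5]        # everyone scores the same
minor_variable     = [0.2, 0.5, 0.1]  # the only source of differences

totals = [0.99 * a + 0.01 * b
          for a, b in zip(important_constant, minor_variable)]
print(totals)
```

All three totals land within a few thousandths of each other: the dominant variable shifts everyone equally, and only the 1% variable separates the managers. If that is the right conclusion (and I think it is), then standardizing away the variance would be a mistake.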

In order to power through this confusion, I figured there must be some analogy to how this would be done via linear regression. Also, someday, I hope to convince my end user to do this rigorously, rather than intuitively. I googled a bit and found stuff about standardized regression. It suggested that, with standardized regression, your multipliers really do represent relative importance, which would be great, though that importance may also reflect how good a discriminator the variable is, not just how important the variable is toward being a good manager. One URL talked about dividing by sigma; another talked about dividing by sigma after first subtracting the mean, a familiar concept. Are these two different things, or do you think the first was just being lazy with its wording? Either way, they are both different from my approach, which divides by the mean value but does not normalize for standard deviation.
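To make my question concrete, here is a small comparison of the two transformations I found (data invented). The two do produce different numbers, but the differences between managers come out identical, since the z-score version only adds a constant shift:

```python
# Divide-by-sigma versus the usual z-score (subtract mean, then divide
# by sigma), applied to one variable across four managers (invented).
x = [2.0, 4.0, 6.0, 8.0]
n = len(x)
mean = sum(x) / n
sigma = (sum((v - mean) ** 2 for v in x) / n) ** 0.5  # population sigma

scaled  = [v / sigma for v in x]           # divide by sigma only
zscored = [(v - mean) / sigma for v in x]  # subtract mean, then divide

# The gap between any two managers is the same under both versions:
gap_scaled  = scaled[3]  - scaled[0]
gap_zscored = zscored[3] - zscored[0]
print(gap_scaled, gap_zscored)
```

One thing I notice that seems relevant to my 'betterness in %' goal: dividing by sigma alone preserves ratios between managers (60 vs 50 stays 20%), while subtracting the mean does not, since z-scores can be zero or negative. Both differ from my divide-by-the-mean idea, which preserves ratios but does not equalize spread.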

Can anyone help me through this confusion, please? What I am permitted to do, behind the scenes, is cause the model to normalize, scale, or translate things. Sorry, this post was so long. Please ask questions if it is confusing.

Thanks

Dean
