Weighted arithmetic mean of percentages

Hi everyone,
I've got a basic question I can't quite wrap my head around.

I have the following occurrence values from a table:
Group 1 average 60% (n=150 from 250 cases)
Group 2 average 53.93% (n=192 out of 356 valid cases)
Total average is ~56.4% (n=342 of 606)

Since the sample sizes differ, a weighted average should apply, but that comes out to 56.59.
What am I missing? Where is the roughly 0.2 difference coming from? As I understand it, the two should be exactly the same, with the weighted average having the advantage of not depending on the number of cases/totals, since it is calculated from just the in-group occurrences times the means.

For clarity, by weighted mean I did ( n1 * x1 + n2 * x2 )/ (n1+n2),
which results in =(192*53.93+150*60)/ (192+150) = 56.59
I also let Excel calculate it with several more decimals, but no change there either.

I assume this has something to do with the fact that, as some sources on the weighted mean note, the weights need to sum to 1 for it to be correct. However, with "192" and "150" expressed as proportions of their total (0.561 and 0.439), which meet this criterion, the result is unchanged.
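To check the point about weights summing to 1, here is a minimal Python sketch using the numbers from the post. The helper name `weighted_mean` is my own, not from any library. It suggests that rescaling the weights cannot be the cause, because the normalizing constant cancels between numerator and denominator.

```python
def weighted_mean(values, weights):
    """Return sum(w * v) / sum(w) for paired values and weights."""
    total_weight = sum(weights)
    return sum(w * v for v, w in zip(values, weights)) / total_weight


means = [60.0, 53.93]   # group percentages from the table
counts = [150, 192]     # occurrences, used as weights in the post

# Weighted mean with the raw counts as weights
raw = weighted_mean(means, counts)

# Same weights rescaled so they sum to 1 (about 0.439 and 0.561)
total = sum(counts)
normalized = [c / total for c in counts]
rescaled = weighted_mean(means, normalized)

print(round(raw, 2), round(rescaled, 2))  # 56.59 56.59
```

Both calls give the same value, so the gap to 56.4 has to come from which numbers are used as weights, not from their scale.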


TS Contributor
\( \frac {x_1 + x_2} {n_1 + n_2}
= \frac {\displaystyle n_1 \frac {x_1} {n_1} + n_2 \frac {x_2} {n_2}} {n_1 + n_2}
= \frac {n_1} {n_1 + n_2} \frac {x_1} {n_1} + \frac {n_2} {n_1 + n_2} \frac {x_2} {n_2} \)
OK, let's say 60 and 53.93 are the means (and though they are percentages, let's treat them as unitless for the sake of clarity), and 192 and 150 are the numbers of occurrences.

That would result in (60+53.93) / (192+150) = something below 1, so it doesn't fit.

Then again, the percentages are ultimately just another way of displaying 150 and 192, which is probably where the problem lies. Still, they have to be weighted somehow, as the simple mean of about 56.97 doesn't account for the different group sizes. And for the solution:

For the first form of your formula (as long as there are no %s involved), the weighted mean is as follows:

(192+150)/(356+250) = 0.5643, or 56.4 per 100. Well, thanks! It seems I just mistook occurrences for sample sizes because of the percentage befuddlement.
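For anyone landing here later, a small Python sketch of the resolution, using only the counts quoted in the thread (the variable names are just illustrative). It compares the pooled proportion with the weighted mean of the group percentages under the two choices of weights.

```python
occurrences = [150, 192]   # numerators: cases where the event occurred
valid_cases = [250, 356]   # denominators: valid cases per group

# Group percentages, recomputed from the counts (60% and about 53.93%)
rates = [100 * x / n for x, n in zip(occurrences, valid_cases)]

# Pooled percentage: total occurrences over total valid cases
pooled = 100 * sum(occurrences) / sum(valid_cases)

# Weighted mean of the percentages, weighted by valid cases (the denominators)
by_cases = sum(n * r for n, r in zip(valid_cases, rates)) / sum(valid_cases)

# Weighted mean of the percentages, weighted by occurrences (the mix-up)
by_occurrences = sum(x * r for x, r in zip(occurrences, rates)) / sum(occurrences)

print(round(pooled, 2), round(by_cases, 2), round(by_occurrences, 2))
# 56.44 56.44 56.59
```

Weighting by the valid cases reproduces the pooled figure, while weighting by the occurrences gives the 56.59 from the first post.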