Mean and SD of two means and SD

#1
If we have two groups of males, group 1 (age >60) and group 2 (age <60), and each group has a mean and SD, can we calculate a mean and SD for all males?
The mean and SD are for glucose levels.
The only data provided are the mean ± SD for each of these groups, without individual values. The n of each group is available.
Is there a statistical way to calculate a mean ± SD for N (total males in both groups)? Thanks
 

Dason

Ambassador to the humans
#2
So you want the mean and standard deviation you would get if you combined both groups and treated them as a single data set?
 

Dragan

Super Moderator
#4
If the question is: Is it possible to compute the Mean and Standard Deviation of the combined data based on the Mean of group 1 (and STD of group 1) and the Mean of group 2 (and the STD of group 2), then the answer is yes - it certainly can be done.

Let M1, V1, and m be the mean, variance, and sample size of Group 1. Let M2, V2, and n be the mean, variance, and sample size of Group 2. Let Mo be the overall mean of the combined data and Vo be the variance of the combined data. It follows that:

Mo = (m/(m+n))*M1 + (n/(m+n))*M2

Vo = (m^2*V1 +n^2*V2 - n*V1 - n*V2 - m*V1 - m*V2 +m*n*V1 + m*n*V2 + m*n(M1 - M2)^2) / ((n+m-1)(n+m))

Obviously, just take the sqrt[Vo] to get the STDo.
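
As a rough illustration of the formulas above, here is a minimal Python sketch. The function name `combine_mean_sd` and the toy values in `g1` and `g2` are made up for illustration. It assumes the group SDs are sample SDs (n - 1 denominator) and uses the compact form Vo = ((m-1)*V1 + (n-1)*V2 + (m*n/(m+n))*(M1 - M2)^2) / (m+n-1), which is algebraically the same as the expanded expression for Vo.

```python
import math
import statistics

def combine_mean_sd(m, mean1, sd1, n, mean2, sd2):
    """Pool two groups' summary statistics into one mean and sample SD.

    Assumes sd1 and sd2 are sample SDs (n - 1 denominator).
    """
    total = m + n
    mean_o = (m * mean1 + n * mean2) / total
    # Within-group sums of squares plus the between-group term.
    ss = ((m - 1) * sd1 ** 2
          + (n - 1) * sd2 ** 2
          + m * n / total * (mean1 - mean2) ** 2)
    var_o = ss / (total - 1)
    return mean_o, math.sqrt(var_o)

# Toy data (made up) to compare against a direct calculation on the raw values.
g1 = [4.1, 5.0, 6.3, 5.2]
g2 = [3.9, 7.4, 5.5]
combined = combine_mean_sd(len(g1), statistics.mean(g1), statistics.stdev(g1),
                           len(g2), statistics.mean(g2), statistics.stdev(g2))
direct = (statistics.mean(g1 + g2), statistics.stdev(g1 + g2))
print(combined, direct)
```

The two printed pairs should agree up to floating-point rounding, since the combined formula is an exact identity rather than an approximation.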
 
#5
Amazing forum. My question is exactly to combine two groups and calculate their mean and SD by using the subgroup n, mean, and SD.

I do not have any statistical software other than GraphPad Prism. These are my numbers:
Group A, n=343:
5.2 ± 1.3
Group B, n=55:
5.2 ± 1.9

Can you help me calculate? Thank you so much, I am very grateful.
 

hlsmith

Omega Contributor
#6
You don't need anything except a pencil and paper. That is what I would use. Just plug in the numbers.


It appears to just be a weighted recalculation.
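
For reference, plugging the numbers from post #5 into the formulas from post #4 might look like the following. This is only a sketch: it assumes the reported SDs are sample SDs (n - 1 denominator), ignores rounding in the published values, and writes Vo in its compact form, which is algebraically equal to the expanded expression in post #4.

```latex
\begin{aligned}
M_o &= \frac{343(5.2) + 55(5.2)}{398} = 5.2 \\
V_o &= \frac{(343-1)(1.3)^2 + (55-1)(1.9)^2 + \frac{343 \cdot 55}{398}(5.2 - 5.2)^2}{398 - 1}
     = \frac{577.98 + 194.94 + 0}{397} \approx 1.947 \\
\mathrm{STD}_o &= \sqrt{V_o} \approx 1.40
\end{aligned}
```

Because the two group means are equal, the between-group term vanishes and the combined variance is just a weighted average of the two group variances, giving roughly 5.2 ± 1.4 for all 398 males.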
 

Dragan

Super Moderator
#10
Well FYI: Combined data descriptive statistics can also be obtained for higher moments beyond the first and second moments (mean, variance/standard deviation) such as the skew (third moment) and kurtosis (based on the fourth moment) - without having any knowledge of the underlying data sets that are being combined.
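
To illustrate the point for the third moment, here is a hedged Python sketch. It uses the standard pairwise-merge identities for sums of squared and cubed deviations, which may or may not be the exact formulas the post has in mind. The function names and the toy data are made up, and each group is assumed to be summarised as (n, mean, M2, M3), where M2 and M3 are the sums of squared and cubed deviations from the group mean.

```python
import math

def central_sums(data):
    """Return (n, mean, M2, M3): count, mean, sums of squared and cubed deviations."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data)
    m3 = sum((x - mean) ** 3 for x in data)
    return n, mean, m2, m3

def combine_sums(a, b):
    """Merge two (n, mean, M2, M3) summaries without touching the raw data."""
    na, mean_a, m2a, m3a = a
    nb, mean_b, m2b, m3b = b
    n = na + nb
    delta = mean_b - mean_a
    mean = mean_a + delta * nb / n
    m2 = m2a + m2b + delta ** 2 * na * nb / n
    m3 = (m3a + m3b
          + delta ** 3 * na * nb * (na - nb) / n ** 2
          + 3 * delta * (na * m2b - nb * m2a) / n)
    return n, mean, m2, m3

def skewness(summary):
    """Population-style skewness g1 = (M3/n) / (M2/n)**1.5."""
    n, _, m2, m3 = summary
    return math.sqrt(n) * m3 / m2 ** 1.5

# Toy data (made up): merge the summaries, then compare with the pooled raw data.
a = [1, 2, 2, 3, 9]
b = [4, 5, 5, 6, 7, 20]
merged = combine_sums(central_sums(a), central_sums(b))
direct = central_sums(a + b)
print(skewness(merged), skewness(direct))
```

If a source reports a sample variance V and a skewness g1 instead of M2 and M3, they can be recovered as M2 = (n - 1) * V and M3 = g1 * n * (M2 / n) ** 1.5 under the same definitions.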
 

Dragan

Super Moderator
#12
Is there any distribution assumption for all of these?
Yes, there are theoretical assumptions. For example, a Student t-distribution with k=4 degrees of freedom will not work for kurtosis, because the fourth moment does not exist in that case: you need at least k+1 degrees of freedom for the first k moments to be finite. The same idea applies to the (primary) Central Limit Theorem, i.e., it requires a finite population mean and a finite population variance for the sampling distribution of the mean(s) (a Cauchy distribution does not work with the CLT).
 

hlsmith

Omega Contributor
#13
Here is my vapid question: so the data have to be normal, or the sample large? I ask because I am guessing the OP is grabbing these numbers from an article where the authors ran a t-test on them. Does kurtosis not work in your example because the tails are long at the sample size you gave, so the distribution doesn't approximate the standard normal?
 

Dragan

Super Moderator
#14
Well, distributional assumptions do not really matter in a descriptive sense, i.e., if you have a number of data sets, the formulae still apply to compute the mean, variance/standard deviation, skew, and kurtosis, regardless of whether the data are drawn from a Cauchy distribution or any other distribution where the first k moments do not exist.

However, the theoretical assumptions would apply if we are discussing inferential matters, e.g., the sampling distribution of a mean, or of any other statistic whose population moment is not finite.
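
The descriptive point can be checked with a small simulation. The sketch below is only illustrative: the seed, the group sizes, and the helper name `cauchy_sample` are arbitrary choices. It draws two standard Cauchy samples (whose population mean and variance do not exist), combines their sample means and variances, and compares the result with the same statistics computed on the pooled raw data.

```python
import math
import random
import statistics

def cauchy_sample(size, rng):
    """Draw standard Cauchy values via the inverse CDF: tan(pi * (u - 0.5))."""
    return [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(size)]

rng = random.Random(0)  # arbitrary seed, only for reproducibility
g1 = cauchy_sample(343, rng)
g2 = cauchy_sample(55, rng)

m, n = len(g1), len(g2)
M1, M2 = statistics.mean(g1), statistics.mean(g2)
V1, V2 = statistics.variance(g1), statistics.variance(g2)

# Combined mean and variance from the group summaries only.
Mo = (m * M1 + n * M2) / (m + n)
Vo = ((m - 1) * V1 + (n - 1) * V2
      + m * n / (m + n) * (M1 - M2) ** 2) / (m + n - 1)

# Same statistics computed directly on the pooled raw data.
pooled = g1 + g2
mean_ok = math.isclose(Mo, statistics.mean(pooled), rel_tol=1e-9, abs_tol=1e-9)
var_ok = math.isclose(Vo, statistics.variance(pooled), rel_tol=1e-9)
print(mean_ok, var_ok)
```

The agreement is purely arithmetic. What fails for Cauchy data is the inferential step: the sample mean and variance do not settle toward any population value as the samples grow.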