Suppose we have two groups of males, group 1 aged >60 and group 2 aged <60. Each of these groups has a mean and SD; can we calculate a mean and SD for all males?
The mean and SD are for glucose levels.
The only data provided are the mean ± SD for each of these groups, without individual values. The n of each group is available.
Is there a statistical way to calculate a mean ± SD for N (total males in both groups)? Thanks
So you want the mean and standard deviation for if you were to combine both groups and consider them a single data set?
I don't have emotions and sometimes that makes me very sad.
docoftheworld (08-02-2016)
Yes, that appears to be their question. Can you use these parameters to construct the overall mean and std?
Stop cowardice, ban guns!
If the question is: Is it possible to compute the Mean and Standard Deviation of the combined data based on the Mean of group 1 (and STD of group 1) and the Mean of group 2 (and the STD of group 2), then the answer is yes - it certainly can be done.
Let M1, V1, and m be the mean, variance, and sample size of Group 1, and let M2, V2, and n be the mean, variance, and sample size of Group 2, where each variance is the sample variance (the square of the reported SD, computed with the usual sample-size-minus-one divisor). Let Mo be the overall mean of the combined data and Vo be the variance of the combined data. It follows that:
Mo = (m/(m+n))*M1 + (n/(m+n))*M2
Vo = (m^2*V1 + n^2*V2 - n*V1 - n*V2 - m*V1 - m*V2 + m*n*V1 + m*n*V2 + m*n*(M1 - M2)^2) / ((n+m-1)*(n+m))
Obviously, just take the sqrt[Vo] to get the STDo.
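For anyone who would rather check the arithmetic by machine, here is a minimal Python sketch of the two formulas above. The function name `combine_groups` and its argument names are made up for illustration, and it assumes each SD is a sample SD (n-1 divisor). The variance line is written in a more compact form that is algebraically the same as the expanded expression for Vo.

```python
import math


def combine_groups(m, mean1, sd1, n, mean2, sd2):
    """Combine two groups given the size, mean, and sample SD of each.

    Assumes sd1 and sd2 were computed with the n-1 divisor.
    Returns (overall mean, overall SD).
    """
    v1, v2 = sd1 ** 2, sd2 ** 2          # convert SDs to variances first
    total = m + n
    mean_o = (m * mean1 + n * mean2) / total
    # Within-group sums of squares plus the between-group term,
    # divided by (m + n - 1); equivalent to the expanded formula for Vo.
    var_o = ((m - 1) * v1 + (n - 1) * v2
             + m * n * (mean1 - mean2) ** 2 / total) / (total - 1)
    return mean_o, math.sqrt(var_o)
```

Calling `combine_groups(m, M1, SD1, n, M2, SD2)` returns the pair (Mo, STDo).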
Last edited by Dragan; 08-02-2016 at 11:31 AM.
docoftheworld (08-02-2016), hlsmith (08-02-2016)
Amazing forum. My question is exactly to combine two groups and calculate their mean and SD by using the subgroup n, mean, and SD.
I do not have any statistical software other than GraphPad Prism. These are my numbers:
Group A, n=343:
5.2 ± 1.3
Group B, n=55:
5.2 ± 1.9
Can you help me calculate? Thank you so much, I am very grateful.
You don't need anything except a pencil and paper. That is what I would use. Just plug in the numbers.
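Plugging the posted numbers into the formula by hand might look like the sketch below. It assumes the reported SDs are sample SDs (n-1 divisor) and uses the compact form of the Vo formula, which is algebraically the same as the expanded one. Because both group means are 5.2, the (M1 - M2)^2 term is zero.

```latex
\begin{align*}
M_o &= \frac{343(5.2) + 55(5.2)}{398} = 5.2 \\
V_1 &= 1.3^2 = 1.69, \qquad V_2 = 1.9^2 = 3.61 \\
V_o &= \frac{(343-1)(1.69) + (55-1)(3.61) + \frac{343 \cdot 55}{398}(5.2-5.2)^2}{398-1}
     = \frac{577.98 + 194.94}{397} \approx 1.947 \\
\mathrm{STD}_o &= \sqrt{V_o} \approx 1.40
\end{align*}
```

So the combined result comes out at roughly 5.2 ± 1.4.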
It appears to just be a weighted recalculation.
Got it.
Will do it.
So I should convert the SDs I have to variances and then do the calculations. Sorry, I am a complete beginner, so don't laugh.
Guys, here is a webpage that does this calculation with an online calculator. Maybe others will find it helpful.
http://www.statstodo.com/ComMeans_Pgm.php
Thanks so much for answering me.
Well FYI: Combined data descriptive statistics can also be obtained for higher moments beyond the first and second moments (mean, variance/standard deviation) such as the skew (third moment) and kurtosis (based on the fourth moment) - without having any knowledge of the underlying data sets that are being combined.
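As a rough illustration of the point about higher moments, the Python sketch below merges two group summaries using the standard pairwise-update formulas for sums of powered deviations. This is one known way to do it, not necessarily the derivation the poster had in mind. The names `combine_moments`, `summarize`, and `skew_kurtosis` are hypothetical, and the skew and kurtosis use the plain moment definitions (n divisor, no bias correction), so a published skew or kurtosis would first have to be converted back to the sums M3 and M4 under whatever definition the source used.

```python
def combine_moments(a, b):
    """Merge two summaries, each a tuple (n, mean, M2, M3, M4).

    M2, M3, M4 are the sums of (x - mean)**k over the group, k = 2, 3, 4.
    """
    na, mean_a, m2a, m3a, m4a = a
    nb, mean_b, m2b, m3b, m4b = b
    n = na + nb
    delta = mean_b - mean_a
    mean = mean_a + delta * nb / n
    m2 = m2a + m2b + delta**2 * na * nb / n
    m3 = (m3a + m3b
          + delta**3 * na * nb * (na - nb) / n**2
          + 3 * delta * (na * m2b - nb * m2a) / n)
    m4 = (m4a + m4b
          + delta**4 * na * nb * (na**2 - na * nb + nb**2) / n**3
          + 6 * delta**2 * (na**2 * m2b + nb**2 * m2a) / n**2
          + 4 * delta * (na * m3b - nb * m3a) / n)
    return n, mean, m2, m3, m4


def summarize(xs):
    """Compute (n, mean, M2, M3, M4) directly from raw values (for checking)."""
    n = len(xs)
    mean = sum(xs) / n
    return (n, mean) + tuple(sum((x - mean) ** k for x in xs) for k in (2, 3, 4))


def skew_kurtosis(summary):
    """Moment-based skew and kurtosis (n divisor, no bias correction)."""
    n, _, m2, m3, m4 = summary
    return (m3 / n) / (m2 / n) ** 1.5, (m4 / n) / (m2 / n) ** 2
```

When raw data are available, `combine_moments(summarize(a), summarize(b))` should agree with `summarize(a + b)` up to rounding error, which is an easy way to sanity-check the formulas.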
Is there any distribution assumption for all of these?
Yes, there are theoretical assumptions. For example, a Student t-distribution with k=4 degrees of freedom has no finite kurtosis, because the k-th moment of a t-distribution exists only when the degrees of freedom exceed k (so you need at least k+1 degrees of freedom for the first k moments to be finite). The same idea applies to the (primary) Central Limit Theorem, i.e., it requires a finite population mean and finite population variance for the sampling distribution of the mean (the Cauchy distribution does not work with the CLT).
Here is my vapid question: so it has to be normal or large? I ask because I am guessing the OP is grabbing these from an article where the authors ran a t-test on them. Does kurtosis in your example not work because the tails are long at the degrees of freedom you gave, so it doesn't approximate the standard normal?
Well, distributional assumptions do not really matter in a descriptive sense, i.e., if you have a number of data sets, the formulae still apply to compute the mean, variance/standard deviation, skew, and kurtosis, regardless of whether the data are drawn from a Cauchy distribution or any other distribution where the first k moments do not exist.
However, the theoretical assumptions would apply if we are discussing inferential matters, e.g., the sampling distribution of a mean or of any other moment, which requires that moment to be finite in the population.
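To make the descriptive point concrete, here is a small Python sketch under stated assumptions: it draws two samples from a standard Cauchy distribution (which has no finite mean or variance), summarizes each group, and checks that the combined-group formula still reproduces the mean and sample variance of the pooled data. The helper names, group sizes, and seed are arbitrary choices for illustration.

```python
import math
import random
import statistics


def cauchy_sample(size, rng):
    """Draw standard Cauchy values by inverse-CDF sampling."""
    return [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(size)]


def combined_mean_var(m, mean1, var1, n, mean2, var2):
    """Overall mean and sample variance from two groups' summaries."""
    total = m + n
    mean_o = (m * mean1 + n * mean2) / total
    var_o = ((m - 1) * var1 + (n - 1) * var2
             + m * n * (mean1 - mean2) ** 2 / total) / (total - 1)
    return mean_o, var_o


rng = random.Random(0)  # fixed seed so the run is repeatable
g1, g2 = cauchy_sample(200, rng), cauchy_sample(50, rng)

# Combine using only each group's n, mean, and sample variance.
from_summaries = combined_mean_var(
    len(g1), statistics.mean(g1), statistics.variance(g1),
    len(g2), statistics.mean(g2), statistics.variance(g2),
)
# Compute the same quantities directly from the pooled raw data.
from_raw = (statistics.mean(g1 + g2), statistics.variance(g1 + g2))

matches = all(
    math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-6)
    for a, b in zip(from_summaries, from_raw)
)
```

The two results agree because the formula is just algebra on the observed numbers. What the missing population moments take away is any guarantee that these sample values settle down as the samples grow.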