An average of standard deviations?

#21
Hi All,

I hope this question is related to this thread.

I have a dataset that describes values per local authority, for a number of indicators, with a confidence interval stated for each local authority's value. However, this dataset does not record standard deviations.

I am trying to obtain the 'overall' confidence interval, for each indicator, of a number of local authorities - an 'average' confidence interval if you will.

Is there a way to derive this overall confidence interval? Any help would be much appreciated.

Thanks!
 

Dason

Ambassador to the humans
#22
Do you know the sample size and confidence level? If so you can extract the standard deviation from the confidence interval (assuming these are confidence intervals for means).
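
As a rough illustration of that extraction, here is a minimal Python sketch. It assumes the interval is a symmetric confidence interval for a mean built with a normal critical value; for small samples a t critical value would be more appropriate. The function name `sd_from_ci` and the example numbers are hypothetical, not taken from the dataset discussed above.

```python
from math import sqrt
from statistics import NormalDist


def sd_from_ci(lower, upper, n, level=0.95):
    """Recover the sample SD from a CI for a mean (normal approximation)."""
    # Two-sided critical value, e.g. about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    # The half-width of the interval equals z * sd / sqrt(n).
    half_width = (upper - lower) / 2
    return half_width * sqrt(n) / z


# Hypothetical example: a 95% CI of (48.04, 51.96) from n = 100 observations.
sd = sd_from_ci(48.04, 51.96, n=100)
print(round(sd, 2))
```

Once each local authority's SD has been recovered this way, the summaries can be combined like any other set of group means and variances.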
 
#23
Hi Dragan,

Your answers are really helpful.
I have the same question about averaging standard deviations, but my samples have different weights.

A simplified version of my data is below:

n=1000 participants

1st part of the Questionnaire: Mean: 40, SD: 10, Weight: 8/96
2nd part of the Questionnaire: Mean: 50, SD: 15, Weight: 20/96
3rd part of the Questionnaire: Mean: 45, SD: 10, Weight: 68/96

I need to calculate the total score.

So what should I be doing?

Thanks
Christos
 
#24
Hello,

thank you this is very useful!!!

I have a question also....

if each mean has a very low standard deviation, but the means are all quite different from each other, then is it correct that the large difference between the means would not affect the standard deviation since only the standard deviations affect the final standard deviation, and therefore the standard deviation of the result would still be quite small?

Probably I am missing something here, it doesn't seem to make sense if a huge difference between all the means doesn't affect the overall standard deviation.

thanks very much!
 
#25
What you are providing is the variance for the combined data set (S^2), which is not an average of the two separate variances (1, 1).

In fact, you really don't even need the data to obtain your result (S^2=15.5). All that is needed are the two sample sizes (3, 3), the two means (3, 10), and the two variances (1, 1) of the individual data sets and then the variance for the combined data (Variance = 15.5) can be obtained as:

\( s^{2}=\frac{n_{x}^{2}s_{x}^{2}+n_{y}^{2}s_{y}^{2}-n_{y}s_{x}^{2}-n_{y}s_{y}^{2}-n_{x}s_{x}^{2}-n_{x}s_{y}^{2}+n_{y}n_{x}s_{x}^{2}+n_{y}n_{x}s_{y}^{2}+n_{x}n_{y}\left ( \bar{X}-\bar{Y} \right )^{2}}{\left (n_{x}+n_{y}-1 \right )\left ( n_{x}+n_{y} \right )} \)

where you can see in the far right-hand side of the numerator how the square of the difference between the two means will play a role in the computation of the variance for the combined data.
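
For readers who find the expanded numerator hard to check, the same expression can be written more compactly. Multiplying the top and bottom of the form below by \( n_{x}+n_{y} \) and expanding gives back the formula above term by term:

\( s^{2}=\frac{\left ( n_{x}-1 \right )s_{x}^{2}+\left ( n_{y}-1 \right )s_{y}^{2}+\frac{n_{x}n_{y}}{n_{x}+n_{y}}\left ( \bar{X}-\bar{Y} \right )^{2}}{n_{x}+n_{y}-1} \)

With the numbers from this example (sample sizes 3 and 3, means 3 and 10, variances 1 and 1) this gives \( \frac{2+2+\frac{9}{6}\cdot 49}{5}=\frac{77.5}{5}=15.5 \).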

Now, if you want the variance for three (or more) combined data sets, then all you need to do is just keep applying the equation I provided above separately as you combine the data sets one at a time...e.g. combine 1 & 2 and then (1, 2) & 3 ....and so on.... for any number of data sets. Obviously, as you progress you would also need the means of combined data -- which are, of course, easy to obtain.
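
The one-at-a-time combination described above can be sketched in Python as follows. This is only an illustration under stated assumptions: each data set is represented by a hypothetical `(n, mean, variance)` tuple, the variances are sample variances (divisor n - 1), and the function name `combine` is made up for this example. It uses an algebraically equivalent, shorter form of the formula above.

```python
from functools import reduce


def combine(a, b):
    """Merge two (n, mean, variance) summaries into one for the combined data."""
    n_a, mean_a, var_a = a
    n_b, mean_b, var_b = b
    n = n_a + n_b
    # Mean of the combined data: sample-size-weighted mean of the two means.
    mean = (n_a * mean_a + n_b * mean_b) / n
    # Within-group sums of squares plus the term driven by the gap in means.
    ss = (n_a - 1) * var_a + (n_b - 1) * var_b \
        + n_a * n_b / n * (mean_a - mean_b) ** 2
    return n, mean, ss / (n - 1)


# The two data sets from this thread: sizes (3, 3), means (3, 10), variances (1, 1).
groups = [(3, 3.0, 1.0), (3, 10.0, 1.0)]
n, mean, var = reduce(combine, groups)
print(n, mean, var)  # 6 6.5 15.5
```

Because `combine` returns a summary in the same shape it accepts, `reduce` applies it to any number of data sets in turn. Note that the merged summary carries the total sample size (here 6) into the next step, along with the combined mean.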

I would also note that the original poster is asking two different questions. The first question asks for an average of the three variances (or standard deviations) and the second question asks for the variance (or standard deviation) for the combined data....these are different calculations. My first post addresses the first question and this (second) post addresses the second question.
I have a question regarding the combination of three data sets (each with two replicates). If I combine sets 1 and 2 using the above formula to get combined set (1,2), and then combine (1,2) and 3, should the combined set (1,2) have two or four replicates?
 
#26
Hi Dragan,

I'm wondering if I can obtain a reference for the formula you've provided for "weighting the variances by their respective sample sizes"?
I'm trying to average two standard deviations from samples of different sizes. My sample sizes only differ by one unit (n-1 and n), so intuitively I figured I would just square and add the two standard deviations prior to taking the square root. But I do want to use an accurate method, so it would be a great help if you could provide me with the reference.

Thanks,
Nisha
 
#27
Just weight the variances by their respective sample sizes before taking the square root, like this:

\( s_{w}^{2}=\frac{\left ( n_{1}-1 \right )s_{1}^{2}+\left ( n_{2}-1 \right )s_{2}^{2}+\cdots +\left ( n_{k}-1 \right )s_{k}^{2}}{n_{1}+n_{2}+\cdots +n_{k}-k} \)
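
A minimal Python sketch of this weighted average follows. The function name `pooled_variance` and the example sizes and standard deviations are hypothetical, chosen only to show the arithmetic. Each variance is weighted by n - 1, as in the formula above.

```python
import math


def pooled_variance(sizes, sds):
    """Weighted average of k variances, each weighted by n_i - 1."""
    numerator = sum((n - 1) * s ** 2 for n, s in zip(sizes, sds))
    denominator = sum(sizes) - len(sizes)  # n_1 + ... + n_k - k
    return numerator / denominator


# Hypothetical example: two samples of size 10 and 20 with SDs 2 and 3.
s_w = math.sqrt(pooled_variance([10, 20], [2.0, 3.0]))
print(round(s_w, 3))
```

Unlike the variance of the combined data, this quantity ignores the group means entirely, so widely separated means do not inflate it.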
Can you provide a citation as to where this equation is derived?