Value that increases the Standard Deviation

#1
Hello,

I am puzzled by the following statement:

"In order to increase the standard deviation of a set of numbers, you must add a value that is more than one standard deviation away from the mean."

What is the proof of that? I know, of course, how the standard deviation is defined, but I seem to be missing that part. Any comments? Thanks!
 

BGM

TS Contributor
#2
For a given set of numbers \( \{x_1, x_2, \ldots, x_n\} \), the sample variance is given by

\( s_n^2 = \frac {1} {n-1} \sum_{i=1}^n x_i^2 - \frac {1} {n(n-1)} \left(\sum_{i=1}^n x_i\right)^2 \)
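
As a quick sanity check on this form of the sample variance, here is a minimal Python sketch. The helper name `sample_variance` and the values in `data` are made up for illustration; `statistics.variance` from the standard library is used only as a reference that divides by \( n - 1 \).

```python
import statistics

def sample_variance(xs):
    """Sample variance in the sum form: b/(n-1) - a^2/(n(n-1))."""
    n = len(xs)
    a = sum(xs)                  # sum of the values
    b = sum(x * x for x in xs)   # sum of the squared values
    return b / (n - 1) - a * a / (n * (n - 1))

data = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]   # made-up example values
print(sample_variance(data), statistics.variance(data))
```

The two printed numbers should agree up to floating-point rounding.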

With a new additional number \( x_{n+1} \), the sample variance becomes

\( s_{n+1}^2 = \frac {1} {n} \sum_{i=1}^{n+1} x_i^2 - \frac {1} {n(n+1)} \left(\sum_{i=1}^{n+1} x_i\right)^2 \)

\( = \frac {1} {n} \sum_{i=1}^{n} x_i^2 + \frac {x_{n+1}^2} {n}
- \frac {1} {n(n+1)} \left(\sum_{i=1}^n x_i\right)^2
- \frac {1} {n(n+1)} \left(2x_{n+1}\sum_{i=1}^n x_i + x_{n+1}^2\right)
\)

Then the difference is

\( s_{n+1}^2 - s_n^2 \)

\( = \frac {x_{n+1}^2} {n+1} - \frac {2x_{n+1}} {n(n+1)}\sum_{i=1}^n x_i
-\frac {1} {n(n-1)} \sum_{i=1}^n x_i^2
+ \frac {2} {n(n-1)(n+1)}\left(\sum_{i=1}^n x_i\right)^2 \)
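
The difference formula can be checked numerically. The sketch below is only an illustration: `variance_change`, `data` and `new_value` are hypothetical names and values, and the direct computation uses `statistics.variance` (divisor \( n - 1 \)).

```python
import statistics

def variance_change(xs, new):
    """s_{n+1}^2 - s_n^2 from the four-term expression, for a new value `new`."""
    n = len(xs)
    a = sum(xs)                  # sum of the original values
    b = sum(x * x for x in xs)   # sum of their squares
    return (new ** 2 / (n + 1)
            - 2 * new * a / (n * (n + 1))
            - b / (n * (n - 1))
            + 2 * a * a / (n * (n - 1) * (n + 1)))

data = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]   # made-up example values
new_value = 10.0                         # made-up additional number
direct = statistics.variance(data + [new_value]) - statistics.variance(data)
print(variance_change(data, new_value), direct)
```

If the algebra is right, both numbers match up to rounding, for any choice of `new_value`.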

and you see this is a quadratic expression in \( x_{n+1} \).

To shorten our notation, let \( a = \sum_{i=1}^n x_i = n\bar{x} \) and \( b = \sum_{i=1}^n x_i^2 \). Then we can solve the quadratic inequality

\( s_{n+1}^2 - s_n^2 > 0 \)

\( \iff x_{n+1}^2 - 2\bar{x}x_{n+1} - \frac {n+1} {n(n-1)} b
+ \frac {2} {n(n-1)}a^2 > 0 \)

\( \iff x_{n+1} < \bar{x} - \frac {\sqrt{bn^3 - (a^2 + 2b)n^2 + bn + a^2}} {n(n-1)} ~~\text{or}\)
\( x_{n+1} > \bar{x} + \frac {\sqrt{bn^3 - (a^2 + 2b)n^2 + bn + a^2}} {n(n-1)}\)

So the half-width of this band around the mean is not exactly one standard deviation, since

\( s_n = \sqrt{\frac {nb - a^2} {n(n-1)}} \)

Anyway, the answer changes if you, e.g., divide the sample variance by \( n \) rather than \( n - 1 \).
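
One way to probe the width empirically, without relying on any closed form, is to bisect for the smallest distance from the mean at which an appended value raises the standard deviation. This is a rough sketch under stated assumptions: `threshold_ratio` and `data` are hypothetical, the upper bracket of 10 standard deviations is assumed to be far enough out, and `statistics.stdev` / `statistics.pstdev` stand in for the \( n - 1 \) and \( n \) divisors.

```python
import statistics

def threshold_ratio(xs, sd_func, hi_mult=10.0, iters=60):
    """Smallest distance d (in units of the current SD) such that
    appending mean + d increases sd_func, found by bisection."""
    mean = statistics.mean(xs)
    sd = sd_func(xs)
    lo, hi = 0.0, hi_mult * sd   # SD drops at distance lo; assumed to rise at hi
    for _ in range(iters):
        mid = (lo + hi) / 2
        if sd_func(xs + [mean + mid]) > sd:
            hi = mid             # SD already increases here: threshold is closer
        else:
            lo = mid             # SD does not increase yet: threshold is further
    return hi / sd

data = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]          # made-up example values
print(threshold_ratio(data, statistics.stdev))   # divisor n - 1
print(threshold_ratio(data, statistics.pstdev))  # divisor n
```

For this data both ratios come out slightly above 1, so a value exactly one standard deviation from the mean is not quite enough to increase it.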
 
#3
Thanks BGM, that is indeed rigorous and to the point. But may I ask how you derived the last result? After completing the square and substituting \( \frac{a}{n} \) for \( \bar{x} \), what I get is:

\( \left( x_{n+1}-\bar{x} \right)^2 > \frac{a^2}{n^2} + \frac{ \left( n+1 \right)b -2a^2 } {n \left( n-1 \right) } \) which is not exactly the same even after adding the fractions.
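
For reference, here is that addition written out as a worked check, with \( a \) and \( b \) as defined in post #2:

\( \frac{a^2}{n^2} + \frac{(n+1)b - 2a^2}{n(n-1)} = \frac{(n-1)a^2 + n(n+1)b - 2na^2}{n^2(n-1)} = \frac{(n+1)(nb - a^2)}{n^2(n-1)} = \frac{n+1}{n}\, s_n^2 \)

If this is right, the condition reads \( \left| x_{n+1} - \bar{x} \right| > s_n \sqrt{\frac{n+1}{n}} \), a band slightly wider than one standard deviation, which does not agree with the radical in post #2.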