# Distribution of Extreme Spread for (n, sigma)

#### dbooksta

##### New Member
Given a Rayleigh process R(s) generating samples $$X_i$$ -- which is equivalent to a bivariate normal process with 0 correlation and both sigmas = s -- what is the distribution of the Extreme Spread of n samples?

Extreme spread $$ES\{X\}_n \equiv \max_{i,j}|X_i - X_j|$$

I'm after $$\mathrm{E}[ES_n(\sigma)]$$ and $$\Pr(ES_n(\sigma) > a)$$. It seems like $$ES_n$$ should have something like a chi(n) distribution, but I am not good enough to derive the exact relationship.

I did find this paper from 1975, which on p. 8 suggests as much, but it focuses on empirical solutions rather than the pure math. In contrast, I want to take the parameters of the distribution of X as given and find a formulaic distribution for ES.
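For concreteness, the quantities I want can at least be estimated by simulation; here is a minimal Monte Carlo sketch (the values of n, sigma, and the threshold are arbitrary examples):

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma, trials = 5, 2.0, 200_000

# Draw n Rayleigh(sigma) radii per trial -- equivalently, the radii of
# bivariate normal(0, sigma^2 I) points.
x = rng.rayleigh(scale=sigma, size=(trials, n))

# Extreme spread of each trial: max_{i,j} |X_i - X_j| = max - min
es = np.ptp(x, axis=1)

print(es.mean())          # Monte Carlo estimate of E[ES_n(sigma)]
print((es > 3.0).mean())  # Monte Carlo estimate of Pr(ES_n(sigma) > 3)
```

This gives numbers to check any closed-form answer against, but of course it is not the formulaic distribution I'm asking for.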

Any guidance appreciated!

#### BGM

##### TS Contributor
Seems like you want to compute the CDF/pdf of the sample range (assuming the underlying observations $$X$$ are continuous random variables): the difference between the sample maximum and the sample minimum, $$X_{(n)} - X_{(1)}$$

http://en.wikipedia.org/wiki/Range_(statistics)

By elementary multinomial-type arguments, the joint pdf of $$(X_{(1)}, X_{(n)})$$ is

$$f_{X_{(1)}, X_{(n)}}(x_1, x_n)$$

$$= \frac {n!} {0!1!(n-2)!1!0!} F_X(x_1)^0f_X(x_1)^1[F_X(x_n) - F_X(x_1)]^{n-2}f_X(x_n)^1[1 - F_X(x_n)]^0$$

$$= n(n-1)f_X(x_1)f_X(x_n)[F_X(x_n) - F_X(x_1)]^{n-2}$$

where $$f_X, F_X$$ are the pdf and CDF of the original unordered random sample.

Using elementary transformation tricks, the pdf of the sample range $$R = X_{(n)} - X_{(1)}$$ is

$$f_R(r) = \int_{-\infty}^{+\infty} n(n-1)f_X(x)f_X(x+r)[F_X(x+r) - F_X(x)]^{n-2}dx, r > 0$$

By integration, the CDF is

$$F_R(r) = \int_{-\infty}^{+\infty} nf_X(x)[F_X(x+r) - F_X(x)]^{n-1}dx, r > 0$$

as listed in the above wiki link.
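The CDF integral is easy to evaluate numerically for any given $$f_X, F_X$$. A sketch for the Rayleigh case, using a plain Riemann sum (the grid resolution and the 12-sigma upper cutoff are arbitrary choices):

```python
import numpy as np

def range_cdf(r, n, sigma):
    """Pr{X_(n) - X_(1) <= r} for n iid Rayleigh(sigma) samples,
    by numerically integrating n * f(x) * [F(x+r) - F(x)]^(n-1)."""
    x = np.linspace(0.0, 12.0 * sigma, 120_001)   # Rayleigh pdf is ~0 past 12*sigma
    dx = x[1] - x[0]
    f = x / sigma**2 * np.exp(-x**2 / (2 * sigma**2))
    F = lambda t: 1.0 - np.exp(-t**2 / (2 * sigma**2))
    return float(np.sum(n * f * (F(x + r) - F(x)) ** (n - 1)) * dx)

print(range_cdf(1.0, 2, 1.0))
```

This is useful as an independent check on any closed-form expression derived below.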

Therefore one just needs to apply the above integral to find the answer. However, for many distributions the result is quite ugly and has no closed-form solution. For your case, let's try to calculate the CDF:

$$n\int_0^{+\infty} \frac {x} {\sigma^2} \exp\left\{-\frac {x^2} {2\sigma^2}\right\} \left[\exp\left\{-\frac {x^2} {2\sigma^2}\right\} - \exp\left\{-\frac {(x+r)^2} {2\sigma^2}\right\}\right]^{n-1} dx$$

$$= \frac {n} {\sigma^2} \int_0^{+\infty} x \exp\left\{-\frac {nx^2} {2\sigma^2}\right\} \left[1 - \exp\left\{-\frac {2rx + r^2} {2\sigma^2}\right\}\right]^{n-1}dx$$

$$= \frac {n} {\sigma^2} \int_0^{+\infty} x \exp\left\{-\frac {nx^2} {2\sigma^2}\right\} \sum_{k=0}^{n-1} \binom {n-1} {k} (-1)^k \exp\left\{-\frac {2krx + kr^2} {2\sigma^2}\right\} dx$$

$$= \frac {n} {\sigma^2} \sum_{k=0}^{n-1} \binom {n-1} {k} (-1)^k \int_0^{+\infty} x \exp\left\{-\frac {nx^2 + 2krx + kr^2} {2\sigma^2}\right\} dx$$

$$= \frac {n} {\sigma^2} \sum_{k=0}^{n-1} \binom {n-1} {k} (-1)^k$$ $$\int_0^{+\infty} x \exp\left\{- \frac {n} {2\sigma^2} \left(x^2 + 2 \frac {kr} {n} x + \frac {k^2r^2} {n^2}\right) + \frac {1} {2\sigma^2} \left(\frac {k^2r^2} {n} - kr^2 \right)\right\} dx$$

$$= \frac {n} {\sigma^2} \sum_{k=0}^{n-1} \binom {n-1} {k} (-1)^k$$ $$\int_0^{+\infty} x \exp\left\{- \frac {n} {2\sigma^2} \left(x^2 + 2 \frac {kr} {n} x + \frac {k^2r^2} {n^2}\right) - \frac {1} {2\sigma^2} \left(kr^2 - \frac {k^2r^2} {n}\right)\right\} dx$$

$$= \frac {n} {\sigma^2} \sum_{k=0}^{n-1} \binom {n-1} {k} (-1)^k \exp\left\{-\frac {kr^2} {2\sigma^2} \left(1 - \frac {k} {n}\right) \right\}$$ $$\int_0^{+\infty} x \exp\left\{- \frac {n} {2\sigma^2} \left(x + \frac {kr} {n}\right)^2 \right\}dx$$

Finally we consider (substituting $$u = x + \frac {kr} {n}$$, so that the lower limit of integration becomes $$\frac {kr} {n}$$):

$$\int_0^{+\infty} \left(x + \frac {kr} {n} \right) \exp\left\{- \frac {n} {2\sigma^2} \left(x + \frac {kr} {n}\right)^2 \right\}dx = \frac {\sigma^2} {n} \exp\left\{-\frac {k^2r^2} {2n\sigma^2}\right\}$$

and

$$\int_0^{+\infty} \exp\left\{- \frac {n} {2\sigma^2} \left(x + \frac {kr} {n}\right)^2 \right\}dx$$

$$= \sqrt{2\pi\frac {\sigma^2} {n}} \int_0^{+\infty} \frac {1} {\sqrt{2\pi\frac {\sigma^2} {n}}} \exp\left\{- \frac {n} {2\sigma^2} \left(x + \frac {kr} {n}\right)^2 \right\}dx$$

$$= \sqrt{2\pi\frac {\sigma^2} {n}} \Pr\left\{\frac {\sigma} {\sqrt{n}} Z - \frac {kr} {n} > 0 \right\}$$

$$= \sqrt{2\pi\frac {\sigma^2} {n}}\left[1 - \Phi\left(\frac {kr} {\sigma\sqrt{n}}\right)\right]$$

where $$Z$$ is a standard normal random variable and $$\Phi$$ is its CDF.

As a result, writing $$x = \left(x + \frac {kr} {n}\right) - \frac {kr} {n}$$, the CDF is

$$F_R(r) = \frac {n} {\sigma^2} \sum_{k=0}^{n-1} \binom {n-1} {k} (-1)^k \exp\left\{-\frac {kr^2} {2\sigma^2} \left(1 - \frac {k} {n}\right) \right\}$$ $$\int_0^{+\infty} \left(x + \frac {kr} {n} - \frac {kr} {n} \right) \exp\left\{- \frac {n} {2\sigma^2} \left(x + \frac {kr} {n}\right)^2 \right\}dx$$

$$= \frac {n} {\sigma^2} \sum_{k=0}^{n-1} \binom {n-1} {k} (-1)^k \exp\left\{-\frac {kr^2} {2\sigma^2} \left(1 - \frac {k} {n}\right) \right\}$$ $$\left\{\frac {\sigma^2} {n} \exp\left\{-\frac {k^2r^2} {2n\sigma^2}\right\} - \frac {kr} {n} \sqrt{2\pi\frac {\sigma^2} {n}}\left[1 - \Phi\left(\frac {kr} {\sigma\sqrt{n}}\right)\right] \right\}$$

Merging the prefactor $$\exp\left\{-\frac {kr^2} {2\sigma^2} \left(1 - \frac {k} {n}\right)\right\}$$ with $$\exp\left\{-\frac {k^2r^2} {2n\sigma^2}\right\}$$ in the first term,

$$= \sum_{k=0}^{n-1} \binom {n-1} {k} (-1)^k \left\{\exp\left\{-\frac {kr^2} {2\sigma^2}\right\} - \frac {kr} {\sigma} \sqrt{\frac {2\pi} {n}} \exp\left\{-\frac {kr^2} {2\sigma^2} \left(1 - \frac {k} {n}\right) \right\}\left[1 - \Phi\left(\frac {kr} {\sigma\sqrt{n}}\right)\right] \right\}$$

But as you see it is quite a mess.
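Despite its appearance, the final sum is trivial to evaluate numerically. A sketch in plain Python (standard library only; $$\Phi$$ computed from `math.erf`):

```python
import math

def es_cdf(r, n, sigma):
    """Closed-form CDF of the extreme spread (sample range) of n iid
    Rayleigh(sigma) samples: a finite binomial sum over k = 0..n-1."""
    Phi = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    t = r / sigma                      # the sum depends only on r/sigma (and n)
    total = 0.0
    for k in range(n):
        # Gaussian-tail piece of the k-th term
        gauss_tail = (k * t * math.sqrt(2.0 * math.pi / n)
                      * math.exp(-0.5 * k * t * t * (1.0 - k / n))
                      * (1.0 - Phi(k * t / math.sqrt(n))))
        total += math.comb(n - 1, k) * (-1) ** k * (math.exp(-0.5 * k * t * t) - gauss_tail)
    return total

print(es_cdf(1.0, 2, 1.0))
```

It agrees with direct numerical integration of the range-CDF integral, which is a good sanity check on the algebra.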

#### dbooksta

##### New Member
Wow. Just curious: are there any continuous distributions that have clean range distributions?

As for this case: Can you see any way to pull sigma outside of the outer sum?

I ask because in practice I will only be concerned with n in the range [2, 100], but with arbitrary sigma. If we can establish a relationship between the CDF (given n) and sigma, then I could compute the values for each n with sigma = 1 just once and then use those reference values to compute probabilities for any other sigma.

Or is it known that there's no better way to get values than to do the full sum you derived?

#### dbooksta

##### New Member
Another way of putting my follow-up question: intuitively we expect some scale invariance of the extreme spread w.r.t. sigma. I.e., holding n constant, if $$\mathrm{E}[ES_n(\sigma)] = X$$ then I would expect something like $$\mathrm{E}[ES_n(a \sigma)] = aX$$ or $$\mathrm{E}[ES_n(a \sigma)] = \sqrt{a}\,X$$

#### BGM

##### TS Contributor
Yes, you have raised a very good point which I missed: $$\sigma$$ is a scale parameter.

We have the following:

If $$U \sim \text{Rayleigh}(1)$$, then $$X = \sigma U \sim \text{Rayleigh}(\sigma)$$

Since the scaling does not affect the order (remember $$\sigma > 0$$), we have

$$X_{(n)} - X_{(1)} = \sigma U_{(n)} - \sigma U_{(1)} = \sigma (U_{(n)} - U_{(1)})$$

i.e. again $$\sigma$$ is the scale parameter of the sample range.

Note that $$U_{(n)} - U_{(1)}$$ is independent of $$\sigma$$;

and we have the relationship

$$E[X_{(n)} - X_{(1)}] = \sigma E[U_{(n)} - U_{(1)}]$$

$$F_{X_{(n)} - X_{(1)}}(r)$$

$$= \Pr\{X_{(n)} - X_{(1)} \leq r\}$$

$$= \Pr\left\{U_{(n)} - U_{(1)} \leq \frac {r} {\sigma} \right\}$$

$$= F_{U_{(n)} - U_{(1)}}\left(\frac {r} {\sigma} \right)$$

and you can see this relationship in the answer of my previous post:

$$\sum_{k=0}^{n-1} \binom {n-1} {k} (-1)^k \left\{\exp\left\{-\frac {k} {2}\left(\frac {r} {\sigma}\right)^2\right\} - k \left( \frac {r} {\sigma} \right)\sqrt{\frac {2\pi} {n}} \exp\left\{-\frac {k} {2} \left(\frac {r} {\sigma}\right)^2 \left(1 - \frac {k} {n}\right) \right\}\left[1 - \Phi\left(\frac {k} {\sqrt{n}} \times \frac {r} {\sigma}\right)\right] \right\}$$

- for fixed $$n$$, it can be written as a function of $$\frac {r} {\sigma}$$ alone
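The scaling argument is easy to verify by simulation; a minimal sketch (the values of n, sigma, and r are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
n, trials, sigma, r = 4, 300_000, 2.5, 3.0

u = rng.rayleigh(scale=1.0, size=(trials, n))   # Rayleigh(1) samples
x = sigma * u                                   # Rayleigh(sigma) via scaling

range_u = np.ptp(u, axis=1)   # U_(n) - U_(1)
range_x = np.ptp(x, axis=1)   # X_(n) - X_(1) = sigma * (U_(n) - U_(1))

# Pr{X_(n) - X_(1) <= r} should equal Pr{U_(n) - U_(1) <= r/sigma}
p_scaled = (range_x <= r).mean()
p_unit = (range_u <= r / sigma).mean()
print(p_scaled, p_unit)
```

Because the scaled ranges are exactly sigma times the unit ranges, the two estimates coincide up to floating-point rounding at the comparison boundary.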

#### dbooksta

##### New Member
Ah ha, I guess this makes sense. So there is scale invariance, but the probability I'm looking for is a function of both n and the spread measured relative to sigma. I.e., a lookup table would have to provide precomputed values over two dimensions: $$(n, \frac{r}{\sigma})$$.

So to summarize
• There's a separate CDF for each n
• For a Rayleigh process the CDF of the extreme spread is the sum you derived:
$$\sum_{k=0}^{n-1} \binom {n-1} {k} (-1)^k \left\{\exp\left\{-\frac {k} {2}\left(\frac {r} {\sigma}\right)^2\right\} - k \left( \frac {r} {\sigma} \right)\sqrt{\frac {2\pi} {n}} \exp\left\{-\frac {k} {2} \left(\frac {r} {\sigma}\right)^2 \left(1 - \frac {k} {n}\right) \right\}\left[1 - \Phi\left(\frac {k} {\sqrt{n}} \times \frac {r} {\sigma}\right)\right] \right\}$$
• When looking for extreme-spread probabilities from a Rayleigh process with n samples, once we compute a particular point on the CDF for that n, the same value holds for any sigma at the same scaled distance $$\frac{r}{\sigma}$$.

Is this correct?
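If that summary holds, the two-dimensional lookup table could be sketched as follows (Monte Carlo is used to fill the table here; the grid, sample counts, and range of n are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)

# Precompute, once, the sigma = 1 extreme-spread CDF on a grid of r/sigma
# values for each n of interest.
ns = range(2, 8)
grid = np.linspace(0.0, 6.0, 61)                  # r/sigma values
table = {}
for n in ns:
    spread = np.ptp(rng.rayleigh(scale=1.0, size=(100_000, n)), axis=1)
    table[n] = np.array([(spread <= g).mean() for g in grid])

def es_cdf_lookup(r, n, sigma):
    """Pr{ES_n <= r} for Rayleigh(sigma), interpolated from the sigma = 1 table."""
    return float(np.interp(r / sigma, grid, table[n]))

print(es_cdf_lookup(3.0, 5, 2.0))
```

By scale invariance, a query for any sigma reduces to a lookup at r/sigma in the precomputed sigma = 1 table.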

#### BGM

##### TS Contributor
I think you got it. Hopefully there are no careless mistakes in the above calculations; I did not check them many times.