Rayleigh estimator and correction factor

#1
We're using the Rayleigh distribution for some real-world scenarios. We often need to estimate its parameter (sigma) from samples R of size N where N is very small.

The estimator we're using for sigma, \(\widehat{\sigma} = \sqrt{\frac{\sum_{i=1}^N r_i^2}{2N}}\), is biased.

Using Monte Carlo analysis we happened to notice that \(c_4^2\) is a good correction factor for N > 10. But for small N it is noticeably off: starting with N = 2, its errors are 5.3%, 2.5%, 1.4%, 0.9%. Am I correct in assuming that the "correct" correction factor would have only rounding error even for small N?

Now, while we can use the Monte-Carlo correction factors in practice, we're curious to see the analytic estimator correction (even if it's as difficult to use as \(c_4\)). I don't think I have the capacity to derive that myself -- I lack experience with formal statistics. Could any statisticians help, or at least weigh in on whether a derivation is likely to be trivial, hard, or practically impossible?
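For readers who want to reproduce the bias numerically, here is a minimal Monte Carlo sketch. It is not the code used for the figures above; the function names, trial count, and seed are illustrative assumptions, and it uses only the Python standard library.

```python
import math
import random


def rayleigh_draw(rng, sigma):
    # Inverse-CDF sampling: R = sigma * sqrt(-2 ln(1 - U)), U uniform on [0, 1).
    return sigma * math.sqrt(-2.0 * math.log(1.0 - rng.random()))


def mc_mean_ratio(n, sigma=1.0, trials=50_000, seed=0):
    """Monte Carlo estimate of E[sigma_hat] / sigma for samples of size n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        sum_sq = sum(rayleigh_draw(rng, sigma) ** 2 for _ in range(n))
        total += math.sqrt(sum_sq / (2 * n))  # the uncorrected estimator
    return total / trials / sigma


for n in (2, 3, 4, 5):
    ratio = mc_mean_ratio(n)
    # 1 / ratio is the empirical correction factor for this n.
    print(f"N={n}  E[sigma_hat]/sigma ~ {ratio:.4f}  correction ~ {1 / ratio:.4f}")
```

A ratio below 1 means the estimator underestimates sigma on average; its reciprocal is the empirical correction factor for that N.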
 

BGM

TS Contributor
#2
One way to do this is to use a Taylor expansion of the square root about the mean of its argument.

Let \( Y = \frac {1} {2n} \sum_{i=1}^n R_i^2 \). We know that \( E[Y] = \sigma^2 \)

Now consider \( f(Y) = \sqrt{Y} \). Then

\( E[\sqrt{Y}] = E\left[\sqrt{\sigma^2} + \frac {1} {2\sqrt{\sigma^2}}(Y - \sigma^2) + \frac {1} {2} \frac {1} {2(-2)(\sigma^2)^{\frac {3} {2}}} (Y - \sigma^2)^2 + \ldots \right] \)

You can truncate the series at some point depending on your need for precision. Then you obtain an approximation for the expectation.
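As a worked example of truncating at second order (a sketch, using the additional fact that each \( R_i^2 \) is exponential with mean \( 2\sigma^2 \) and variance \( 4\sigma^4 \), so that \( \text{Var}(Y) = \frac {\sigma^4} {n} \)): the first-order term has zero expectation because \( E[Y] = \sigma^2 \), which leaves

\( E[\sqrt{Y}] \approx \sigma - \frac {1} {8\sigma^3} \text{Var}(Y) = \sigma - \frac {1} {8\sigma^3} \cdot \frac {\sigma^4} {n} = \sigma \left(1 - \frac {1} {8n}\right) \)

For \( n = 2 \) this gives roughly \( 0.9375\,\sigma \). Higher-order terms refine this further.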
 
#3
I'm confused by what you show for the Taylor expansion: Y is the MLE \(\hat \sigma^2\) of \(\sigma^2\), but the true \(\sigma^2\) is unknown, so how can I use it in the expansion?

Also shouldn't the expansion be a function of n? I do observe that for large n the square root of the estimator becomes increasingly accurate.
 

BGM

TS Contributor
#4
It is because in #1 you are claiming that \( \sqrt{Y} \) is a biased estimator of \( \sigma \) (which is true, by Jensen's inequality).

Therefore you consider some constant \( c(n) \) such that

\( c(n)E[\sqrt{Y}] = \sigma \)

which you called the correction factor.

If this is what you mean, you can calculate \( c(n) = \frac {\sigma} {E[\sqrt{Y}]} \)

and it is not hard to show that \( E[\sqrt{Y}] \) must be expressed in terms of the unknown parameter \( \sigma \); but that \( \sigma \) cancels with the numerator, leaving the constant \( c(n) \).

Most importantly, I was just being lazy in absorbing \( n \) into \( Y \); to be more precise you may write \( Y_n \) to emphasize that it depends on \( n \) as well. For larger \( n \) the approximation is more accurate, since by the Law of Large Numbers \( Y_n \) converges to its true mean \( \sigma^2 \), which makes the truncated expansion more accurate.
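A scaling argument shows why \( \sigma \) has to cancel, without computing the expectation. This is only a sketch, and \( Z_i \) and \( m(n) \) are notation introduced here for illustration. Write \( R_i = \sigma Z_i \) with \( Z_i \sim \text{Rayleigh}(1) \). Then

\( \sqrt{Y_n} = \sqrt{\frac {1} {2n} \sum_{i=1}^n \sigma^2 Z_i^2} = \sigma \sqrt{\frac {1} {2n} \sum_{i=1}^n Z_i^2} \)

so \( E[\sqrt{Y_n}] = \sigma \, m(n) \), where \( m(n) = E\left[\sqrt{\frac {1} {2n} \sum_{i=1}^n Z_i^2}\right] \) depends only on \( n \). Hence \( c(n) = \frac {\sigma} {E[\sqrt{Y_n}]} = \frac {1} {m(n)} \) is a function of \( n \) alone, not of \( \sigma \) or of the observed sample.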
 
#5
Could you spot me on the following? Since the Taylor expansion of square root is slow to converge and unwieldy I thought I'd try something like \(c_4\):

We (per Wikipedia) know that if \(R \sim \text{Rayleigh}(\sigma)\) then \(\sum_{i=1}^N R_i^2 \sim \Gamma(N, 2\sigma^2)\).

Therefore \(Y_N = \frac{\sum^N R_i^2}{2N} \sim \frac{\Gamma(N, 2\sigma^2)}{2N}\)

We also know that if \(X \sim \Gamma(k, \theta)\) then \(\sqrt{X} \sim \ddot \Gamma(2, 2k, \sqrt{\theta})\) where \(\ddot \Gamma\) is the generalized gamma distribution.

Therefore, \(\sqrt{Y_N} \sim \frac{\ddot \Gamma(2, 2N, \sigma \sqrt{2})}{\sqrt{2N}}\)

so \(\hat \sigma = E[\sqrt{Y_N}] = E[\frac{\ddot \Gamma(2, 2N, \sigma \sqrt{2})}{\sqrt{2N}}] = \sqrt{\frac{2}{N}} \frac{\Gamma(\frac{2N + 1}{\sigma \sqrt{2}})}{\Gamma(\sqrt{2}\frac{N}{\sigma})}\).

If this is correct then we're close to something like \(c_4\)! (I think this is cool because of the tight relationship between the Rayleigh and the symmetric bivariate Gaussian distributions.)

Now I don't know what algebra is allowed on those gamma functions and their parameters: I assume there's a trick to get the sigma out so that, as you suggested, it will cancel in the expression of a correction factor. Can you show me or direct me to examples?

Also I'm wondering if I missed something because the divisor in the first term is N instead of (N - 1)?
 
#6
Going back to the Taylor expansion approach from #2: I tried to go ahead with it and am getting confused. It looks like the correction factor is a function of both n and the sample value \(\widehat{Y_n}\)? This alone is noteworthy, and proof of it would be very valuable to me!

I don't follow the Taylor expansion you're using. Can you walk me through that or at least show the formula for the nth term?

I understand your solution as follows: We want to find the analytic correction factor for the distribution parameter of small Rayleigh samples. We know that this will be some function \(c(n, \widehat{Y_n})\) such that \(c(n, \widehat{Y_n}) \sqrt{\widehat{Y_n}} = \sigma\).

We're going to expand \(\sqrt{\widehat{Y_n}}\) about the unknown point \(\sigma\), knowing that \(\sigma\) is positive and therefore the square root is defined.

In order to facilitate extraction of the unknown term we'll look at the reciprocal \(\frac{1}{c(n, \widehat{Y_n})} = \frac{\sqrt{\widehat{Y_n}}}{\sigma}\). Using the partial expansion you show here this will look like \(1 + \frac{\widehat{Y_n} - \sigma^2}{2\sigma^2} + ...\) and right there I'm still stuck with that \(\sigma\) term not cancelling out!

I've tried to be as tedious as possible with the notation to ensure that's not where I'm losing this. Further guidance on this is greatly appreciated!
 
BGM

TS Contributor
#7
Sorry, I was not familiar with the Rayleigh distribution before, so I was unaware of the close relationship between the Rayleigh and Gamma distributions.

As pointed out,

\( Y_N \sim \frac {1} {2N} \text{Gamma}(N, 2\sigma^2)
= \text{Gamma}\left(N, \frac {\sigma^2} {N}\right) \)

With this we do the integration:

\( E[\sqrt{Y_N}]
= \int_0^{+\infty} \sqrt{y} \frac {1} {\Gamma(N)}
\left(\frac {N} {\sigma^2}\right)^N y^{N-1}
\exp\left\{- \frac {N} {\sigma^2} y \right\} dy\)

\( = \frac {\Gamma(N + \frac {1} {2})} {\Gamma(N)}
\left(\frac {\sigma^2} {N} \right)^{\frac {1} {2}} \int_0^{+\infty}
\frac {1} {\Gamma(N + \frac {1} {2})}
\left(\frac {N} {\sigma^2}\right)^{N+\frac {1} {2}} y^{N + \frac {1} {2}-1}
\exp\left\{- \frac {N} {\sigma^2} y \right\} dy\)

\( = \frac {\Gamma(N + \frac {1} {2})} {\Gamma(N)\sqrt{N}} \sigma \)

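Since \( N \) is a positive integer, the Gamma functions reduce to factorials (see http://en.wikipedia.org/wiki/Gamma_function#Properties):

\( \Gamma(N) = (N-1)!, \quad \Gamma\left(N + \frac {1} {2}\right) = \frac {(2N)!\sqrt{\pi}} {4^N N!} \)

so the above expression reduces to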

\( \frac {(2N)!\sqrt{\pi}} {4^N N!(N-1)!\sqrt{N}} \sigma \)

Now you see that \( \sigma \) can be cancelled out. We expect this because \( \sigma \) acts as a scale parameter of the Rayleigh distribution, and correspondingly \( \sigma^2 \) as a scale parameter of the Gamma distribution.

Furthermore by Stirling's Approximation, it is not hard to show that
\( \lim_{N\to+\infty} \frac {(2N)!\sqrt{\pi}} {4^N N!(N-1)!\sqrt{N}} = 1 \)

So the estimator itself is asymptotically unbiased.
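The closed form above is easy to evaluate numerically. The following is a minimal sketch, with function names chosen here for illustration; it computes the ratio \( E[\sqrt{Y_N}]/\sigma \) both through log-gamma and through the factorial form, and takes the reciprocal as the correction factor.

```python
import math


def mean_ratio(n):
    """E[sqrt(Y_n)] / sigma = Gamma(n + 1/2) / (Gamma(n) * sqrt(n))."""
    # Working with lgamma avoids overflow of the Gamma function for large n.
    return math.exp(math.lgamma(n + 0.5) - math.lgamma(n)) / math.sqrt(n)


def mean_ratio_factorial(n):
    """The same ratio via the factorial form; only suitable for small n."""
    num = math.factorial(2 * n) * math.sqrt(math.pi)
    den = 4**n * math.factorial(n) * math.factorial(n - 1) * math.sqrt(n)
    return num / den


def correction_factor(n):
    """c(n) such that c(n) * E[sqrt(Y_n)] = sigma."""
    return 1.0 / mean_ratio(n)


for n in (1, 2, 3, 4, 5, 10, 100):
    print(f"N={n:3d}  ratio={mean_ratio(n):.6f}  c(N)={correction_factor(n):.6f}")
```

The two forms agree for small N, and the printed correction factor approaches 1 as N grows, consistent with the limit above.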
 
#8
@BGM, that looks absolutely brilliant!

I couldn't follow you through the integration, so if you could elaborate on that step I would be grateful.

In any case, after I straightened out my Monte Carlo code this appears to work beautifully for correcting estimates of the Rayleigh parameter based on small samples!
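For anyone wanting to apply the correction to a small sample, a minimal sketch could look like this. The function names and the sample values are made up for illustration, and the correction factor is the reciprocal of the closed-form expectation ratio derived in #7.

```python
import math


def correction_factor(n):
    # c(n) = Gamma(n) * sqrt(n) / Gamma(n + 1/2)
    return math.sqrt(n) * math.exp(math.lgamma(n) - math.lgamma(n + 0.5))


def estimate_sigma(samples, corrected=True):
    """Estimate the Rayleigh parameter from a list of radii."""
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    raw = math.sqrt(sum(r * r for r in samples) / (2 * n))
    return raw * correction_factor(n) if corrected else raw


radii = [3.0, 4.0]  # made-up sample values, N = 2
print(estimate_sigma(radii, corrected=False))  # uncorrected estimate: 2.5
print(estimate_sigma(radii))                   # scaled up by c(2)
```

The correction only removes the bias in expectation; a single small-sample estimate can still be far from the true sigma.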
 
BGM

TS Contributor
#9
This is a standard trick in integration problems related to probability:

1) Identify the form of a (gamma) pdf inside the integrand.

2) Integrating a pdf over its entire support gives 1.
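Applied to the integral in #7, the two steps can be sketched as follows (with \( a \) and \( b \) as generic shape and rate symbols introduced here for illustration). Since a Gamma pdf with shape \( a \) and rate \( b \) integrates to 1,

\( \int_0^{+\infty} y^{a-1} \exp\{-b y\} dy = \frac {\Gamma(a)} {b^a} \)

Multiplying \( \sqrt{y} \) into \( y^{N-1} \) gives \( y^{N + \frac {1} {2} - 1} \), which is the kernel of a Gamma pdf with \( a = N + \frac {1} {2} \) and \( b = \frac {N} {\sigma^2} \). Therefore

\( E[\sqrt{Y_N}] = \frac {1} {\Gamma(N)} \left(\frac {N} {\sigma^2}\right)^N \Gamma\left(N + \frac {1} {2}\right) \left(\frac {\sigma^2} {N}\right)^{N + \frac {1} {2}} = \frac {\Gamma(N + \frac {1} {2})} {\Gamma(N)\sqrt{N}} \sigma \)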