math to compute width of gaussian PDF(x) given sigma & N values of random variable x

#1
Hi Experts,

I'm working in industry and have an application requiring some expert knowledge of statistics/probability. I have a probability density function (PDF) for a Gaussian random variable. I know the standard deviation of the PDF. I also know the total number of experiments conducted, where one experiment is one value of the random variable, x.

For example, the standard deviation in my application is 1 ps RMS (1 ps = 1E-12 seconds). The number of measured values for my random variable is 600E+9 (i.e. 600E+9 individual values of x; I don't have the individual values, but I know 600E+9 of them were measured).

From this information, I need to predict the largest peak-to-peak deviation (i.e. the width of the PDF, from the end of one tail to the end of the other tail) that may be observed, as a function of a given confidence level or confidence interval (I'm not sure what the right terminology is here; I believe this level/interval is needed to define the goal, correct me if not).

Can anyone help me understand the equations involved? I know Gaussian PDF for random variable x is

PDF(x) = (1/(sigma*sqrt(2*pi)))*e^(-x*x/(2*sigma^2))

I'm not sure how to quantify the largest peak-to-peak deviation expected, based on the number of acquired samples N and sigma. Thanks in advance. -Tim
 

Link

Ninja say what!?!
#2
Re: math to compute width of gaussian PDF(x) given sigma & N values of random variabl

I'm not sure that I understand you, so let's see if I get this right.

- You are assuming a normal distribution, of which you know the SD.
- You have 600 x 10^9 observations
- You want to determine the mean.
- You want to determine a confidence interval for this estimate of the mean

Am I right?
 
#3
Re: math to compute width of gaussian PDF(x) given sigma & N values of random variabl

Hi Link,

The mean is known to be zero. The standard deviation is also known, and referred to using the "sigma" variable above. I can provide the confidence level if needed to solve this problem (e.g. we can assume 95%), which I think should be required (correct me if not).

I don't actually have the N=600x10^9 individual observations, but I want to know that if I DO observe N separate measurements of the random variable, x, how far out into the tail of the Gaussian PDF can I observe (e.g. with 95% confidence level)? For example, given N, sigma, CL=95% (mean=0) for normal distribution, I can at most observe Q sigma into the left-tail of the PDF and (since it's symmetric distribution) Q sigma into the right-tail of the PDF, and therefore, I can observe 2*Q peak-peak deviation max -- my question is how to solve for Q? Hope that helps.
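One way to make the question above precise, as a sketch only: assume the N measurements are independent draws from a zero-mean Gaussian, and read the confidence level CL as the probability that all N values fall within ±Q sigma. The symbols Φ (standard normal CDF) and CL are notation introduced here, not taken from the post.

```latex
\begin{align*}
P\bigl(\text{all } |x_i| \le Q\sigma\bigr) &= \bigl(2\Phi(Q) - 1\bigr)^{N} = \mathrm{CL} \\
\Phi(Q) &= \frac{1 + \mathrm{CL}^{1/N}}{2} \\
Q &= \Phi^{-1}\!\left(\frac{1 + \mathrm{CL}^{1/N}}{2}\right)
  \approx \Phi^{-1}\!\left(1 - \frac{-\ln \mathrm{CL}}{2N}\right)
\end{align*}
```

Under that reading, the peak-to-peak width would be 2*Q*sigma. The approximation on the last line uses CL^(1/N) ≈ 1 + ln(CL)/N, which is very accurate when N is large.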
 

Dason

Ambassador to the humans
#4
Re: math to compute width of gaussian PDF(x) given sigma & N values of random variabl

So one way to think of this is you want to find a prediction interval for the observed maximum value for that sample?
 

BGM

TS Contributor
#5
Re: math to compute width of gaussian PDF(x) given sigma & N values of random variabl

Or you want to compute the expected sample range from 600E+9 samples?

(sample maximum - sample minimum)
 
#6
Re: math to compute width of gaussian PDF(x) given sigma & N values of random variabl

Hi Dason,

Sounds right, although honestly I'm not experienced enough to match my application with the precise meaning of "prediction interval". If we conduct N measurements of random variable x having a zero mean, and compute a histogram of the results, I want to compute the width of the histogram. I don't actually have the N measurements (otherwise I'd simply plot the histogram and measure it), but let's say I want to predict the histogram's width if I WERE to conduct N measurements. Is there any math to compute the histogram's width, which I assume requires one to provide given some confidence level or probability?
 
#7
Re: math to compute width of gaussian PDF(x) given sigma & N values of random variabl

Hi BGM, Yes, I want to predict (sample maximum - sample minimum), where sample maximum is the largest POSITIVE value of x measured, and sample minimum is the largest NEGATIVE value of x measured (the mean is zero). I assume a symmetrical distribution, so the absolute values of these two should be about equal (and I'll multiply this value by 2 to get my final peak-to-peak result).
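A minimal sketch of how that peak-to-peak number could be computed, assuming independent zero-mean Gaussian samples and taking "95% confidence" to mean that all N samples stay inside ±Q sigma with probability 0.95. The function name `tail_sigma` is made up for illustration; only the Python standard library is used.

```python
import math
from statistics import NormalDist

def tail_sigma(n_samples: float, confidence: float) -> float:
    """Q such that all n_samples draws lie within +/-Q sigma with
    probability `confidence`.

    Assumes independent, zero-mean Gaussian samples (an assumption of
    this sketch, not a result stated in the thread).
    """
    # Per-sample two-sided exceedance probability: 1 - confidence**(1/n).
    # expm1/log keeps precision when that number is around 1e-13.
    two_sided = -math.expm1(math.log(confidence) / n_samples)
    # Split evenly between both tails and invert the standard normal CDF.
    return -NormalDist().inv_cdf(two_sided / 2.0)

sigma_ps = 1.0   # 1 ps RMS, from the original post
n = 600e9        # number of measurements, from the original post
q = tail_sigma(n, 0.95)
peak_to_peak_ps = 2.0 * q * sigma_ps
print(f"Q = {q:.2f} sigma, peak-to-peak = {peak_to_peak_ps:.2f} ps")
```

For the numbers in this thread that should come out at roughly 7.5 sigma per side, so a peak-to-peak width of roughly 15 ps. Computing 0.95**(1/n) directly and subtracting it from 1 would lose most of its digits to rounding, which is why the sketch goes through `log` and `expm1`.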
 
#9
Re: math to compute width of gaussian PDF(x) given sigma & N values of random variabl

The nature of the random variable is random (e.g. thermal noise).
 
#10
Re: math to compute width of gaussian PDF(x) given sigma & N values of random variabl

"The nature of the random variable is random (e.g. thermal noise)."
I don't know anything about thermal noise, but why does that make it normal?

Wikipedia says:
In probability theory, the normal (or Gaussian) distribution is a continuous probability distribution that is often used as a first approximation to describe real-valued random variables that tend to cluster around a single mean value.

This to me implies that data distributed this way is not random.

There are several tests for normality that you can use to prove it.
 

Dason

Ambassador to the humans
#11
Re: math to compute width of gaussian PDF(x) given sigma & N values of random variabl

"This to me implies that data distributed this way is not random."
What? Are you saying that data that is distributed like a normal distribution isn't random?

"There are several tests for normality that you can use to prove it."

Typically you can't 'prove' normality. You can provide evidence against the idea that the distribution is normal, but there isn't a very good way to provide evidence that the distribution is normal. Given the sample size, even if the data is almost exactly normal, it will fail basically any test of normality unless it is perfectly normally distributed.

By the way - a lot of times 'noise' does tend to follow an approximate normal distribution.
 
#12
Re: math to compute width of gaussian PDF(x) given sigma & N values of random variabl

Even if the individual noise sources are not Gaussian (normally distributed) at all, by the central limit theorem the resulting distribution converges to a Gaussian as the number of independent noise sources increases, and more so as the number of samples increases.
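A small sketch of the central limit theorem effect described above, under made-up assumptions: each noise source is uniform on (-1, 1), and closeness to a Gaussian is judged by the excess kurtosis (0 for a Gaussian, -1.2 for a single uniform). The helper names are hypothetical.

```python
import random

def excess_kurtosis(values):
    """Sample excess kurtosis; 0 for a Gaussian, -1.2 for a uniform."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 * m2) - 3.0

def summed_noise(n_sources, n_samples, rng):
    """Each sample is the sum of n_sources independent uniform(-1, 1) terms."""
    return [sum(rng.uniform(-1.0, 1.0) for _ in range(n_sources))
            for _ in range(n_samples)]

rng = random.Random(2024)  # arbitrary seed, for repeatability
kurt_1 = excess_kurtosis(summed_noise(1, 50000, rng))
kurt_12 = excess_kurtosis(summed_noise(12, 50000, rng))
print(f"1 source: {kurt_1:.3f}, 12 sources: {kurt_12:.3f}")
```

In this toy setup the excess kurtosis moves toward 0 as sources are added. Taking more samples only makes the estimate less noisy; it does not change the shape of the underlying distribution. Convergence is also slowest far out in the tails, which is the region this thread cares about.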
 

BGM

TS Contributor
#13
Re: math to compute width of gaussian PDF(x) given sigma & N values of random variabl

Since you have a very large sample size, you may use some asymptotic theory to get an approximation.

You may want to look up the book on order statistics by H.A. David, and also search for extreme value theory. Sorry, I have very limited knowledge of order statistics, so I cannot help much more.
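As a rough illustration of the asymptotic route mentioned above, here is a sketch using the classical extreme value (Gumbel) approximation for the maximum of N independent standard normal samples. The function name `expected_max_sigma` is hypothetical, and the formula is an asymptotic approximation rather than an exact result.

```python
import math

EULER_GAMMA = 0.5772156649015329

def expected_max_sigma(n: float) -> float:
    """Approximate E[max] of n iid standard normal samples, in units of sigma.

    Uses the classical Gumbel-limit normalising constants; these are
    asymptotic, so treat the result as approximate.
    """
    a = math.sqrt(2.0 * math.log(n))
    # Location constant b_n; the scale of the limiting Gumbel law is 1/a.
    b = a - (math.log(math.log(n)) + math.log(4.0 * math.pi)) / (2.0 * a)
    # Mean of a Gumbel(location=b, scale=1/a) distribution.
    return b + EULER_GAMMA / a

n = 600e9
e_max = expected_max_sigma(n)
expected_range = 2.0 * e_max   # by symmetry, E[min] = -E[max]
print(f"E[max] ~ {e_max:.2f} sigma, E[range] ~ {expected_range:.2f} sigma")
```

This gives the expected (average) range, which is a different question from a 95% bound: the expected range for N = 600E+9 should land near 14 sigma, somewhat below a bound that is only exceeded 5% of the time.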
 
#14
Re: math to compute width of gaussian PDF(x) given sigma & N values of random variabl

Normal distribution implies some kind of weighting around a central point. Can that occur and still be "random"?

My understanding of the CLT is that the underlying process that produces the noise must be stationary.
 

Dason

Ambassador to the humans
#15
Re: math to compute width of gaussian PDF(x) given sigma & N values of random variabl

"Normal distribution implies some kind of weighting around a central point. Can that occur and still be "random"?"
I think you're confused as to what "random" means. You might be thinking that "random" implies some sort of uniform distribution. That is not the case.
 

Dason

Ambassador to the humans
#17
Re: math to compute width of gaussian PDF(x) given sigma & N values of random variabl

Because you don't think that data can be created by a random process and still end up having a weighting around a central point? It's very possible that I'm misunderstanding you but why don't you explain to me why you think that.
 
#18
Re: math to compute width of gaussian PDF(x) given sigma & N values of random variabl

By saying noise is normally distributed you are implying a distribution of some characteristic of the noise.
 
#19
Re: math to compute width of gaussian PDF(x) given sigma & N values of random variabl

I might be missing something here, but it seems a quick and dirty way to find the answer would be to simulate.
 

Dason

Ambassador to the humans
#20
Re: math to compute width of gaussian PDF(x) given sigma & N values of random variabl

"I might be missing something here, but it seems a quick and dirty way to find the answer would be to simulate."
But we can get an analytic solution. Simulation is great but if we can get an analytic solution I would recommend that. Plus with the sample size the OP is dealing with it would take quite a bit of work just to simulate a single run of this data (N = 600000000000).
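For what it's worth, the sample maximum can be simulated without generating all N values, which also gives a cheap check on an analytic answer. The sketch below assumes independent standard normal samples and uses the fact that P(max <= m) = Phi(m)^N, so a single uniform random number is enough per simulated run. The name `draw_max_sigma` and the seed are made up for illustration.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()

def draw_max_sigma(n: float, rng: random.Random) -> float:
    """Draw the maximum of n iid standard normal samples without generating them.

    Uses P(max <= m) = Phi(m)**n, so max = Phi^{-1}(U**(1/n)) for uniform U.
    """
    u = rng.random()
    while u == 0.0:          # avoid log(0); essentially never triggers
        u = rng.random()
    # Upper-tail probability 1 - u**(1/n), computed without cancellation.
    tail = -math.expm1(math.log(u) / n)
    return -STD_NORMAL.inv_cdf(tail)

rng = random.Random(12345)   # arbitrary seed, for repeatability
n = 600e9
maxima = sorted(draw_max_sigma(n, rng) for _ in range(20000))
simulated_q95 = maxima[int(0.95 * len(maxima))]
# Analytic one-sided 95% bound on the maximum, for comparison.
analytic_q95 = -STD_NORMAL.inv_cdf(-math.expm1(math.log(0.95) / n))
print(f"simulated: {simulated_q95:.3f} sigma, analytic: {analytic_q95:.3f} sigma")
```

The two numbers should agree closely, which supports the point that the analytic route is the easier one here. Note that this bound is one-sided (maximum only); the minimum follows by symmetry.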