I’m trying to find the pdf for the peak-to-peak value of n events from a normal distribution. I was hoping to use this to help me compare some actual experiments to assumed models although at this point it has become more of an academic puzzle to me. Not a homework problem, but this seemed like the best forum.
Suppose I have a random variable X that follows a standard normal distribution. Now suppose I take 1e6 independent trials from this distribution, and define a new random variable Y equal to the peak-to-peak value of that population. What is the pdf of Y?
I was thinking of using the binomial distribution. If p is P(X > x) for a single trial, then the probability of exactly one trial out of n yielding X > x is:
n p (1 - p)^(n - 1)
(This would give the peak value. By symmetry, the peak-to-peak value is twice this number.)
For an example population of 1e6, this gives a normal-like (but asymmetrical) distribution centered around 9.5, which seems reasonable since for a single trial, P(X > 4.75) ≈ 1e-6. But I did a simulation with MATLAB, and the empirical results for 10000 trials of Y give a narrower distribution than my computed values, and with a different mean.
I've attached a graph of my empirical (blue) vs. analytical (red) results for the peak-to-peak value when n = 1e6. The vertical scale is obviously the histogram counts for the simulation; the red curve has been scaled arbitrarily so the general shapes of the distributions can be compared.
Where did I go wrong? I’m concerned that the event “exactly one of n trials with X > x” isn’t precisely the same as “finding the peak value”. (For this reason, I also tried computing the probability of “exactly 0 trials with X > x” and then using 1-(this probability). This gave very slightly different results, but not enough to explain the difference shown by the graphic.) Another possibility is that my analytical approach is correct, but that MATLAB’s double precision math begins to fail when a number very close to 1 is raised to the 1e6 power. Or that MATLAB’s randn() method is biased if you push it to this level, making my empirical results misleading.
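To make the two candidate quantities concrete, here is a minimal Python sketch (standard library only, not the MATLAB code used above). It assumes p = P(X > x) for a standard normal X, and the function names are made up for illustration. It evaluates the "exactly one of n exceeds x" binomial probability next to the exact CDF and pdf of the maximum of n independent draws, P(max <= x) = (1 - p)^n. The power is computed as exp(n * log1p(-p)), which is one way to sidestep the worry about raising a number very close to 1 to the 1e6 power.

```python
import math

SQRT2 = math.sqrt(2.0)
SQRT_2PI = math.sqrt(2.0 * math.pi)

def upper_tail(x):
    """p = P(X > x) for a standard normal X, via erfc for accuracy in the far tail."""
    return 0.5 * math.erfc(x / SQRT2)

def prob_exactly_one_above(x, n):
    """Binomial probability that exactly one of n draws exceeds x: n p (1-p)^(n-1)."""
    p = upper_tail(x)
    return n * p * math.exp((n - 1) * math.log1p(-p))

def max_cdf(x, n):
    """P(max of n draws <= x) = (1-p)^n, evaluated as exp(n * log1p(-p))."""
    p = upper_tail(x)
    return math.exp(n * math.log1p(-p))

def max_pdf(x, n):
    """Density of the maximum of n draws: n * phi(x) * (1-p)^(n-1)."""
    p = upper_tail(x)
    phi = math.exp(-0.5 * x * x) / SQRT_2PI
    return n * phi * math.exp((n - 1) * math.log1p(-p))
```

Plotting prob_exactly_one_above(x, 10**6) against max_pdf(x, 10**6) over, say, x from 3 to 7 shows how the two curves differ. Note that the first is a probability as a function of the threshold x rather than a density, and that the range (max minus min) is a different random variable from twice the maximum.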
Thanks for any ideas.
Would you mind defining what you mean by peak-to-peak value
I don't have emotions and sometimes that makes me very sad.
Sorry if my terminology is imprecise or out of step with traditional statistics vocabulary. Would range or span be more meaningful? In the n values of X that compose a single trial for Y, I mean to define peak-to-peak value as the maximum on this set minus the minimum.
There isn't a very nice formula for the distribution of the range. It can be computed, but for the most part only via numerical integration. Here is the Wikipedia article on it: http://en.wikipedia.org/wiki/Range_(...)#Distribution
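As a rough illustration of that numerical integration, here is a minimal Python sketch (standard library only). It uses the standard formula for the density of the range of n i.i.d. draws, f_R(r) = n (n - 1) * integral of phi(x) phi(x + r) [Phi(x + r) - Phi(x)]^(n - 2) dx, with phi and Phi the standard normal pdf and CDF. The function name, the integration limits of ±10 and the step count are assumptions picked for illustration, not tuned values.

```python
import math

SQRT2 = math.sqrt(2.0)
SQRT_2PI = math.sqrt(2.0 * math.pi)

def phi(x):
    """Standard normal pdf."""
    return math.exp(-0.5 * x * x) / SQRT_2PI

def Phi(x):
    """Standard normal CDF, written with erfc to stay accurate in the lower tail."""
    return 0.5 * math.erfc(-x / SQRT2)

def range_pdf(r, n, lo=-10.0, hi=10.0, steps=4000):
    """Density of (max - min) of n standard normal draws at r, by the trapezoid rule.

    The limits lo/hi and the step count are illustrative assumptions.
    """
    if r < 0:
        return 0.0
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h                       # x plays the role of the minimum
        weight = 0.5 if i in (0, steps) else 1.0
        inside = max(0.0, Phi(x + r) - Phi(x))  # P(one draw lands in [x, x + r])
        total += weight * phi(x) * phi(x + r) * inside ** (n - 2)
    return n * (n - 1) * total * h
```

For example, evaluating range_pdf(r, 10**6) on a grid of r values between roughly 8 and 12 gives a curve that can be scaled and overlaid on the simulated histogram. For n = 2 the formula reduces to the density of |X1 - X2|, which is exp(-r^2 / 4) / sqrt(pi) and makes a convenient sanity check.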
mcantre (01-24-2014)
Thanks Dason, this reference is just what I was looking for. It's nice to be set upon the right path (even if the path is short and gets too steep for someone without climbing gear). True to your signature line, I suppose.
At least there's no evidence that MATLAB has failed me.
Best regards...