Hello,
I am sorry for how confusing this will be. A while back I was reading about how to determine the probability of a variable in a population without having precise information about that population. Basically, given a specific number of data points, the probability of a certain data point falling in a certain percentage could be estimated without knowledge of the distribution. Any idea what area or topic I am talking about? Thanks.
Empirical cumulative distribution function? Some non-parametric estimation? Not sure.
You can always estimate the probability that a random variable will fall in an interval by taking the sample mean of the indicator of that interval.
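Say we wanted to estimate P(0 < X < 1) from observations X_1, ..., X_n. We could just use

$$\hat{p}_n = \frac{1}{n} \sum_{i=1}^{n} I(0 < X_i < 1),$$

where $I(\cdot)$ is just the indicator function: it equals 1 when the condition holds and 0 otherwise.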
The strong law of large numbers tells us that, for any interval of interest, this estimate converges to the true probability almost surely as the sample size grows (assuming the observations are i.i.d.).
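To make the indicator-mean idea concrete, here is a minimal sketch in Python. The function name `estimate_interval_probability` and the choice of a standard normal sample are assumptions made purely for illustration; in practice the sample would be your own data, and nothing about its distribution needs to be known.

```python
import random

def estimate_interval_probability(sample, low, high):
    """Sample mean of the indicator I(low < x < high)."""
    return sum(1 for x in sample if low < x < high) / len(sample)

# Illustrative data only: draws from a standard normal (an assumption,
# chosen so the true answer is known and the estimate can be checked).
rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(100_000)]

p_hat = estimate_interval_probability(sample, 0.0, 1.0)
print(p_hat)  # for a standard normal the true P(0 < X < 1) is about 0.3413
```

The estimator itself never uses the fact that the data are normal; that only matters for comparing the result against a known value.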
If we want a stronger result, the Glivenko-Cantelli theorem gives uniform convergence of the empirical CDF to the true CDF over the whole real line, again almost surely.
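As a rough illustration of that uniform convergence, the sketch below computes the largest gap between the empirical CDF and a known CDF for two sample sizes. The helper names (`ecdf`, `normal_cdf`, `sup_distance`) and the standard normal reference distribution are assumptions for the example, not anything from the thread.

```python
import math
import random

def ecdf(sample, t):
    """Empirical CDF: fraction of observations that are <= t."""
    return sum(1 for x in sample if x <= t) / len(sample)

def normal_cdf(t):
    """CDF of the standard normal, used here as the 'true' CDF."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def sup_distance(sample, cdf):
    """Largest gap between the empirical CDF and a continuous cdf.

    The empirical CDF jumps from i/n to (i+1)/n at the (i+1)-th smallest
    observation, so it is enough to check both sides of every jump.
    """
    xs = sorted(sample)
    n = len(xs)
    return max(
        max((i + 1) / n - cdf(x), cdf(x) - i / n)
        for i, x in enumerate(xs)
    )

rng = random.Random(1)
distances = {}
for n in (100, 10_000):
    sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
    distances[n] = sup_distance(sample, normal_cdf)
    print(n, distances[n])
```

With real data the true CDF is unknown, so this check is only possible in a simulation like this one; the point is that the gap tends to shrink as n grows.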
You might also want to consider Chebyshev's inequality applied to a collection of numbers x1, x2, x3, ..., xN, using their empirical distribution, mean (xbar), and variance (v). A statement could then be made like: "The proportion of the numbers x1, x2, ..., xN that lie within k*Sqrt[v] of the mean xbar is at least 1 - 1/k^2, where k >= 1."
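The Chebyshev statement can be checked directly on any list of numbers. The sketch below is only an illustration: the function name `proportion_within` and the exponential test data are assumptions, and v is taken to be the variance of the empirical distribution (dividing by N), for which the bound holds exactly.

```python
import random
import statistics

def proportion_within(data, k):
    """Fraction of values within k standard deviations of the mean."""
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)  # population form: divides by N
    return sum(1 for x in data if abs(x - mean) <= k * sd) / len(data)

# Illustrative, deliberately skewed data (an assumption for the example).
rng = random.Random(2)
data = [rng.expovariate(1.0) for _ in range(1_000)]

for k in (1.5, 2.0, 3.0):
    observed = proportion_within(data, k)
    bound = 1.0 - 1.0 / k**2
    print(k, observed, bound)  # observed should never fall below bound
```

The bound is usually quite loose, which is the price of making no assumption about the shape of the distribution.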