# Thread: How well a few values fit a given distribution?

1. ## How well a few values fit a given distribution?

Dear all,

I have approximated the distributions of two one-dimensional random variables using Gaussian Mixture Models (3 Gaussians). I used the Matlab function gmdistribution.fit with 10000 values. The two resulting distributions are shown in the attached Figure 1 (the first in red, the second in blue).

Now I have a few values (e.g. v=[-1 0 0.5 1 5 6 6.5 7 14], as in Figure 1). This vector produces a very sparse histogram, since there are not many values.

How probable is it that these values were generated by the first distribution? Or by the second distribution? I would like to obtain a probability value in order to assign the set of values to category A (first distribution) or category B (second distribution).

-> Joint distribution (product of probabilities)... but what happens with outliers? Since both fitted distributions are only approximations of the real ones, an outlier might produce zero probability for some value of x (see attached Figure 2). So I am not sure about this approach.

-> Average probability: this is a trivial solution I thought of, and it is probably wrong.

-> Hypothesis tests (chi-squared or Kolmogorov–Smirnov): for the chi-squared test the data must be binned... what is the optimal bin size? In addition, these hypothesis tests produce a p-value, which is not the probability value I am looking for (as far as I understand).
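
To make the first option concrete, here is a minimal Python sketch of the joint-likelihood comparison. It is only an illustration: the mixture parameters `gmm_a` and `gmm_b` are hypothetical placeholders (not the fitted values from Figure 1), the function names are made up for this example, and the final step assumes equal prior probability for categories A and B. Working with log-densities means a far-away value contributes a large negative but finite term instead of an exact zero.

```python
import math

def gmm_logpdf(x, weights, means, sigmas):
    """Log-density of a 1-D Gaussian mixture at x (log-sum-exp for stability)."""
    terms = [
        math.log(w) - math.log(s) - 0.5 * math.log(2.0 * math.pi)
        - 0.5 * ((x - m) / s) ** 2
        for w, m, s in zip(weights, means, sigmas)
    ]
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))

def log_likelihood(values, gmm):
    """Sum of log-densities, i.e. the log of the product of densities."""
    return sum(gmm_logpdf(x, *gmm) for x in values)

def posterior_a(ll_a, ll_b):
    """P(A | values) from two log-likelihoods, ASSUMING equal priors."""
    d = ll_b - ll_a
    if d > 0:
        e = math.exp(-d)  # branch keeps exp() from overflowing
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(d))

# HYPOTHETICAL parameters: (weights, means, standard deviations)
gmm_a = ([0.3, 0.4, 0.3], [0.0, 6.0, 13.0], [1.0, 1.5, 2.0])
gmm_b = ([0.5, 0.3, 0.2], [20.0, 25.0, 30.0], [2.0, 2.0, 3.0])

v = [-1, 0, 0.5, 1, 5, 6, 6.5, 7, 14]

ll_a = log_likelihood(v, gmm_a)
ll_b = log_likelihood(v, gmm_b)
p_a = posterior_a(ll_a, ll_b)

print("log-likelihood under A:", ll_a)
print("log-likelihood under B:", ll_b)
print("P(A | v), equal priors:", p_a)
```

With real data one would plug in the weights, means and standard deviations returned by the fit. Note that the resulting number is a probability only relative to the two candidates; it says nothing about whether either model fits well in absolute terms.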

Best regards,
Emilio.

2. ## Re: How well a few values fit a given distribution?

Any ideas?

My questions are:

How can I measure the goodness of fit between a few samples and a non-Gaussian distribution?

Given two possible distributions, what is the probability that a few samples were generated by each of them?
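
For the goodness-of-fit question, one binning-free possibility is the Kolmogorov–Smirnov statistic, which compares the empirical CDF of the samples with the mixture CDF. The sketch below is a rough illustration only: `gmm_a` and `gmm_b` are hypothetical parameters, not fitted values, and the function names are invented for this example. It computes just the statistic D (the largest gap between the two CDFs), not a p-value, and with only nine values any such measure will be noisy.

```python
import math

def gmm_cdf(x, weights, means, sigmas):
    """CDF of a 1-D Gaussian mixture: weighted sum of normal CDFs."""
    return sum(
        w * 0.5 * (1.0 + math.erf((x - m) / (s * math.sqrt(2.0))))
        for w, m, s in zip(weights, means, sigmas)
    )

def ks_statistic(values, cdf):
    """One-sample Kolmogorov-Smirnov statistic D against a model CDF."""
    xs = sorted(values)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

# HYPOTHETICAL parameters: (weights, means, standard deviations)
gmm_a = ([0.3, 0.4, 0.3], [0.0, 6.0, 13.0], [1.0, 1.5, 2.0])
gmm_b = ([0.5, 0.3, 0.2], [20.0, 25.0, 30.0], [2.0, 2.0, 3.0])

v = [-1, 0, 0.5, 1, 5, 6, 6.5, 7, 14]

d_a = ks_statistic(v, lambda x: gmm_cdf(x, *gmm_a))
d_b = ks_statistic(v, lambda x: gmm_cdf(x, *gmm_b))

print("KS statistic vs A:", d_a)
print("KS statistic vs B:", d_b)
```

A smaller D means the samples follow that model's CDF more closely, so the two values can at least be compared with each other even without turning them into p-values.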

Thanks,
Emilio.
