I have a probability function, and I need its distribution so that I can calculate confidence intervals.
A nominal variable X can take R discrete values: x1, x2, ..., xR.
Consider now two samples i and j, with Ni and Nj observations respectively.
Let Fi(r) be the frequency of the value xr in sample i, that is, the number of observations in sample i with X = xr, so that Fi(1) + ... + Fi(R) = Ni,
and similarly for sample j.
Let L(i,j) be the probability of an observation in sample i having the same X value as an observation in sample j:
L(i,j) = [Fi(1) Fj(1) + ... + Fi(R) Fj(R)] / (Ni Nj)
since
Fi(r) Fj(r) is the number of pairs of observations from the two samples that have the same value xr, and
Ni Nj is the total number of pairs of observations.
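To make the definition concrete, here is a minimal Python sketch that computes L(i,j) by counting matching pairs as described above. The function name `match_probability` and the toy samples are made up for illustration only.

```python
from collections import Counter


def match_probability(sample_i, sample_j):
    """Return L(i,j): the share of cross-sample pairs with the same value.

    Hypothetical helper name; the inputs are plain lists of nominal values.
    """
    if not sample_i or not sample_j:
        raise ValueError("both samples must be non-empty")
    freq_i = Counter(sample_i)  # Fi(r) for each observed value
    freq_j = Counter(sample_j)  # Fj(r) for each observed value
    # Sum of Fi(r) * Fj(r); a Counter returns 0 for values it has not seen.
    matching_pairs = sum(count * freq_j[value] for value, count in freq_i.items())
    # Ni * Nj is the total number of cross-sample pairs.
    return matching_pairs / (len(sample_i) * len(sample_j))


# Toy data: 2*1 pairs match on "a", 1*3 on "b", none on "c" -> 5 of 16 pairs.
sample_i = ["a", "a", "b", "c"]
sample_j = ["a", "b", "b", "b"]
print(match_probability(sample_i, sample_j))  # 0.3125
```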
Which distribution does L(i,j) follow?
If someone can help me find the distribution function, I can use this statistic to test hypotheses like:
"Is the tendency of sample i to have the same value as sample j significantly different from its tendency to have the same value as sample k?"
H0: L(i,j) = L(i,k)
H1: L(i,j) <> L(i,k)
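One possible way to test this without knowing the exact distribution of L(i,j) is a bootstrap: resample each sample with replacement, recompute L(i,j) - L(i,k) each time, and read a percentile interval off the results. The sketch below is only an illustration under that assumption, not an established test for this statistic; the names `match_probability` and `bootstrap_difference_ci`, and the defaults `n_boot`, `alpha` and `seed`, are hypothetical.

```python
import random
from collections import Counter


def match_probability(sample_a, sample_b):
    """L(a,b): share of cross-sample pairs with the same value (hypothetical helper)."""
    freq_a, freq_b = Counter(sample_a), Counter(sample_b)
    pairs = sum(n * freq_b[v] for v, n in freq_a.items())
    return pairs / (len(sample_a) * len(sample_b))


def bootstrap_difference_ci(sample_i, sample_j, sample_k,
                            n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for L(i,j) - L(i,k).

    Assumption: observations within each sample are independent draws,
    so resampling with replacement is a reasonable approximation.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    diffs = []
    for _ in range(n_boot):
        boot_i = rng.choices(sample_i, k=len(sample_i))
        boot_j = rng.choices(sample_j, k=len(sample_j))
        boot_k = rng.choices(sample_k, k=len(sample_k))
        # The same resample of i enters both terms, because L(i,j) and
        # L(i,k) share sample i and are therefore not independent.
        diffs.append(match_probability(boot_i, boot_j)
                     - match_probability(boot_i, boot_k))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Toy data, for illustration only.
sample_i = ["a", "a", "b", "c", "a", "b"]
sample_j = ["a", "a", "a", "b", "b", "c"]
sample_k = ["c", "c", "d", "d", "b", "d"]
print(bootstrap_difference_ci(sample_i, sample_j, sample_k))
```

If the resulting interval excludes 0, that would be evidence against H0 at roughly the chosen alpha level. With small samples the percentile interval can be inaccurate, so treat it as a rough guide.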
Last edited by Goran_L; 08-21-2008 at 07:06 AM.
Thanks! Yes, I am missing a distribution function for X. I am making this up as I go along, so this kind of help is really useful for me.
What I am trying to get to is some kind of measure for whether (and how much) sample i tends to get the same values as sample j.
There is indeed a "reference" distribution, which is Fk(r). But I don't want to test whether Fi(r) follows the same distribution as Fk(r), because for my specific problem whether it does or not is not relevant. What I want to test is whether Fi(r) is significantly more similar to Fj(r) than to Fk(r).
I don't know how to go about testing that.
Would you have any suggestions?