1. Comparing two Poissonians

What is the proper statistical test for the following:

I know A1 and A2 to be Poisson distributed with (unknown) means x1 and x2.
I have a single measurement of each: n1 and n2. I want to test the null hypothesis that x1 = x2.

Example: the number of problems in an appliance over one year follows a Poisson distribution. I have two appliances. One had 10 problems over the year, and the other had none. Can I conclude there is a real difference in the underlying mean numbers of problems per year? What is the p-value?

I am looking for a statistic that would have a lambda-independent distribution, and a value that would be "small" for identical distributions. My first guess was (n1-n2)^2/(n1+n2), which has the right scaling but does not work (besides having a problem at n1=n2=0).

Ideas anyone?

2. You should use the LRT reported by your stats package. Are you using SPSS or SAS? I don't think this LRT is based on the difference, as I recall.

The intuitive explanation is that counts both equal to one do not carry the same evidence for the null as counts both equal to, say, 50, since the variance is a function of lambda.

3. My first thought was also to use the LRT, but there might be a problem with the 0 count. You would have to assume that 0^0=1, and I'm not sure what a statistical software package would do when it tried to evaluate that expression.

If you make the 0^0=1 assumption, I'm getting a p-value of .049.

4. If the count is 0, then there is no variance. Nothing to explain = nothing to test.

There is no test for events that have never been observed.

Sounds like Square has the right formula, care to share?

5. The MLE of u for a count of 0 is straightforward. It's 0.

Let u1 be the mean parameter for the first distribution and u2 be the mean for the second distribution.

Under the null, u1=u2=u.

The MLE (again, straight-forward) is u=5.
So, the numerator of the Likelihood ratio will be [ e^(-5)*5^10/10! ] * [ e^(-5)*5^0/0! ].

Under the alternative hypothesis, the MLE for u1 is 10, and the MLE for u2 is 0.

Now, let's think about what it means to be Poisson with mean parameter 0. Technically, it's undefined, as the mean parameter is defined as positive. However, it seems to me that it would have to take the value 0 with probability one. So that implies that (e^0*0^0/0!)=1. Since e^0=(0!)=1, that would imply that 0^0 has to equal 1 (technically, 0^0 is indeterminate).

So, if we assume that (e^0*0^0/0!)=1, the denominator of the Likelihood ratio would be:

[ e^(-10)*10^10/10! ] * 1

Simplifying the numerator and denominator of the Likelihood ratio, we get (5^10)/(10^10) = 2^(-10).

-2*ln( 2^(-10) ) = 20*ln(2) = 13.8629, which follows Chi-square with one degree of freedom (we estimated one additional parameter under the alternative). This has a p-value of about .0002.
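The computation above can be sketched in plain Python (a sketch; the `pois_pmf` helper and the names are mine, and the lam = 0 branch encodes the 0^0 = 1 convention discussed above):

```python
import math

n1, n2 = 10, 0

def pois_pmf(k, lam):
    # Poisson pmf; lam == 0 uses the convention that all mass sits at k == 0,
    # i.e. 0^0 = 1, as argued above
    if lam == 0:
        return 1.0 if k == 0 else 0.0
    return math.exp(-lam) * lam ** k / math.factorial(k)

# MLE of the common mean under the null: the average of the two counts
u = (n1 + n2) / 2  # 5.0

# Likelihood ratio: null (restricted) likelihood over alternative (unrestricted)
lr = (pois_pmf(n1, u) * pois_pmf(n2, u)) / (pois_pmf(n1, n1) * pois_pmf(n2, n2))

stat = -2 * math.log(lr)                  # 20*ln(2), about 13.8629
p_value = math.erfc(math.sqrt(stat / 2))  # chi-square(1) upper tail, about .0002
```

The `erfc` line is just the chi-square(1) survival function written via the normal tail, since a chi-square(1) variable is a squared standard normal.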

6. Yeah, having only one count of 0 is fine, but

the estimate of u under the null should be 10.

7. Originally Posted by fed1
the estimate of u under the null should be 10.
No, it shouldn't.

L(u) = [ e^(-u)*u^10/10! ] * [ e^(-u)*u^0/0! ]
=e^(-2u)*u^10/10!
ln(L(u))=-2u+10ln(u)-ln(10!)
Take derivative wrt u and set to 0:
0=-2+10/u
u=5

That's the derivation, and it should be intuitive that the estimate of the mean is going to be the mean of the observations: (10+0)/2=5
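As a numerical check of the derivation (a sketch; the grid search is just my way of confirming the closed-form answer, not part of the method):

```python
import math

n1, n2 = 10, 0

def loglik(u):
    # log L(u) = -2u + (n1 + n2)*ln(u) - ln(n1!) - ln(n2!)
    return -2 * u + (n1 + n2) * math.log(u) - math.lgamma(n1 + 1) - math.lgamma(n2 + 1)

# Coarse grid search over candidate common means, 0.01 .. 20.00
grid = [0.01 * i for i in range(1, 2001)]
u_hat = max(grid, key=loglik)  # lands at 5.0, matching (n1 + n2) / 2
```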

8. Oops, I had a brain fart; I was thinking it was the sum of the counts for some reason!

9. Yeah, those brain farts happen. It is essentially the sum of the counts divided by the total time. You can think of it as a count of 10 over 2 years (even though the processes were in parallel over the same year), so the mean per year is obviously 5.

10. ...****, speaking of brain farts, my numerator isn't right. I forgot to multiply by e^(-5)*5^0/0!

Ok, the test statistic should be 13.86, which follows Chi-square df=1 and p-value=.0002

Previous post has been edited to correct the error...

11. Hey guys. I have a question and hope that you'll chime in.

I had the same reasoning as Square's solution when I read the question. However, there's one thing I haven't come to accept: if the two observations do indeed come from different distributions, is it really logical to use one observation per parameter?

Even if we're forced to accept a Poisson distribution, the difference between the log-likelihood for the model and for the null would be so small that we would have to reject the model.

12. Yeah, samples of size 1 might be a problem, because -2*log-likelihood-ratio is only asymptotically chi-square. For small samples it may deviate significantly from the chi-square distribution.
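One way to see how far off it is with a single observation per group is a quick simulation under the null (a sketch; the true common mean lam = 2 and the sampler are my choices, not from the thread, and you can raise lam to see the approximation improve):

```python
import math
import random

def neg2loglr(n1, n2):
    # -2 log likelihood ratio for H0: u1 = u2, one observation per group.
    # The -u terms cancel because the pooled MLE is (n1 + n2) / 2;
    # zero counts use the n*log(n/u) -> 0 convention.
    u = (n1 + n2) / 2
    if u == 0:
        return 0.0
    term = lambda n: n * math.log(n / u) if n > 0 else 0.0
    return 2 * (term(n1) + term(n2))

def rpois(lam):
    # Knuth's multiplicative Poisson sampler; fine for small lam
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p < L:
            return k
        k += 1

random.seed(1)
lam = 2.0      # assumed true common mean (arbitrary choice for the demo)
nsim = 20000
stats = [neg2loglr(rpois(lam), rpois(lam)) for _ in range(nsim)]

# Fraction of null simulations exceeding the chi-square(1) 5% cutoff;
# an exact chi-square(1) reference would put this near 0.05
emp_level = sum(s > 3.841 for s in stats) / nsim
```

For the thread's data, `neg2loglr(10, 0)` reproduces the 20*ln(2) statistic from earlier in the thread.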

13. Great! Thanks for verifying my suspicion! Good thread.

14. Thank you all!

The ratio of likelihoods sounds like a good idea, but it's not chi^2 distributed. Any idea what it *is* distributed like for one measurement from each sample?

Thanks!

15. PS: is it distributed like chi^2 at least when the counts are large (e.g. 500 and 450 instead of 10 and 0, as in my example)?