# comparing two Poissonians

#### ee00

##### New Member
What is the proper statistical test for the following:

I know A1 and A2 to be Poisson distributed with (unknown) means x1 and x2.
I have a single measurement of each: n1 and n2. I want to test the null hypothesis that x1=x2.

Example: the number of problems in an appliance over one year follows a Poisson distribution. I have two appliances. One had 10 problems over the year, and the other had none. Can I conclude there is a real difference in the underlying mean number of problems per year? What is the p-value?

I am looking for a statistic whose distribution does not depend on lambda, and whose value would be "small" for identical distributions. My first guess was (n1-n2)^2/(n1+n2), which has the right scaling but does not work (besides having a problem at n1=n2=0).

Ideas anyone?

#### fed1

##### TS Contributor
You should use the LRT (likelihood ratio test) reported by your stats package. Are you using SPSS or SAS? I don't think this LRT is based on the difference, as I recall.

The intuitive explanation behind this is that having both counts equal to one does not carry the same evidence for the null as having both counts equal to, say, 50, since the variance is a function of lambda.

#### squareandrare

##### New Member
My first thought was also to use the LRT, but there might be a problem with the 0 count. You would have to assume that 0^0=1, and I'm not sure what a statistical software package would do when it tried to evaluate that expression.

If you make the 0^0=1 assumption, I'm getting a p-value of .049.


#### fed1

##### TS Contributor
If the count is 0 then there is no variance. Nothing to explain = nothing to test.

There is no test for events that have never been observed.

Sounds like square has the right formula; care to share?

#### squareandrare

##### New Member
The MLE for u for a count of 0 is straightforward. It's 0.

Let u1 be the mean parameter for the first distribution and u2 be the mean for the second distribution.

Under the null, u1=u2=u.

The MLE (again, straightforward) is u=5.
So, the numerator of the Likelihood ratio will be [ e^(-5)*5^10/10! ] * [ e^(-5)*5^0/0! ].

Under the alternative hypothesis, the MLE for u1 is 10, and the MLE for u2 is 0.

Now, let's think about what it means to be Poisson with mean parameter 0. Technically, it's undefined, as the mean parameter is defined as positive. However, it seems to me that it would have to take the value 0 with probability one. So that implies that (e^0*0^0/0!)=1. Since e^0=(0!)=1, that would imply that 0^0 has to equal 1 (technically, 0^0 is indeterminate).

So, if we assume that (e^0*0^0/0!)=1, the denominator of the Likelihood ratio would be:

e^(-10)*10^10/10! *1

Simplifying the numerator and denominator of the Likelihood ratio, we get (5^10)/(10^10) = 2^(-10).

-2*ln( 2^(-10) ) = 20*ln(2) = 13.8629, which follows Chi-square with one degree of freedom (we estimated one additional parameter under the alternative). This has a p-value of about .0002.
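
The arithmetic above can be checked with a short sketch using only the Python standard library. The function name `poisson_lrt` is made up for illustration, the 0^0=1 convention is assumed (implemented as 0*ln(0)=0), and the p-value relies on the asymptotic chi-square approximation with one degree of freedom.

```python
import math

def poisson_lrt(n1, n2):
    """Return (-2 ln LR, asymptotic chi-square(1) p-value) for two Poisson counts."""
    def x_log_ratio(x, m):
        # x * ln(x / m), taking 0 * ln(0) = 0 (the same as assuming 0^0 = 1)
        return x * math.log(x / m) if x > 0 else 0.0

    pooled = (n1 + n2) / 2  # MLE of the common mean under the null
    if pooled == 0:
        return 0.0, 1.0     # both counts zero: the two models fit identically
    stat = max(2 * (x_log_ratio(n1, pooled) + x_log_ratio(n2, pooled)), 0.0)
    # Chi-square(1) upper tail: P(Z^2 > stat) = erfc(sqrt(stat / 2))
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

stat, p = poisson_lrt(10, 0)
print(f"statistic = {stat:.4f}, p-value = {p:.6f}")
```

For counts of 10 and 0 the statistic is 20*ln(2), matching the hand calculation.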


#### fed1

##### TS Contributor
Yeah, only one count 0 is fine,

the estimate of u under the null should be 10.

#### squareandrare

##### New Member
> the estimate of u under the null should be 10.

No, it shouldn't.

L(u) = [ e^(-u)*u^10/10! ] * [ e^(-u)*u^0/0! ]
=e^(-2u)*u^10/10!
ln(L(u))=-2u+10ln(u)-ln(10!)
Take derivative wrt u and set to 0:
0=-2+10/u
u=5

That's the derivation, and it should be intuitive that the estimate of the mean is going to be the mean of the observations: (10+0)/2=5
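
As a rough numerical cross-check of that derivation, the log-likelihood can be maximised over a grid of candidate means. This is only a sketch: the grid range and spacing are arbitrary choices, and `log_likelihood` is an illustrative name.

```python
import math

def log_likelihood(u, counts=(10, 0)):
    # Sum of Poisson log-pmfs at a common mean u > 0
    return sum(-u + x * math.log(u) - math.lgamma(x + 1) for x in counts)

# Candidate means 0.01, 0.02, ..., 20.00
grid = [k / 100 for k in range(1, 2001)]
u_hat = max(grid, key=log_likelihood)
print(u_hat)
```

The grid maximiser lands on the average of the two counts, in agreement with setting the derivative to zero.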

#### fed1

##### TS Contributor
oops, I experienced brain fart, I was thinking it was the sum of the counts for some reason!

#### squareandrare

##### New Member
Yeah, those brain farts happen. It is essentially the sum of the counts divided by the total time. You can think of it as a count of 10 over 2 years (even though the processes were in parallel over the same year), so the mean per year is obviously 5.

#### squareandrare

##### New Member
...****, speaking of brain farts, my numerator isn't right. I forgot to multiply by e^(-5)*5^0/0!

Ok, the test statistic should be 13.86, which follows Chi-square df=1 and p-value=.0002

Previous post has been edited to correct the error...


##### Ninja say what!?!
Hey guys. I have a question and hope that you'll chime in.

I had the same reasoning as Square's solution when I read the question. However, there's one thing I haven't come to accept. If the two observations do indeed come from different distributions, is it really logical to use one obs per parameter?

Even if we're forced to accept a Poisson distribution, the difference between the log likelihood for the model and the null would be so small that we would have to reject the model.

#### squareandrare

##### New Member
Yeah, samples of size 1 might be a problem because -2*log-likelihood-ratio is asymptotically Chi-square. For small samples, it may deviate significantly from the Chi-square distribution.

##### Ninja say what!?!
Great! Thanks for verifying my suspicion! Good thread.

#### ee00

##### New Member
Thank you all!

The ratio of likelihoods sounds like a good idea, but it's not chi^2 distributed. Any idea what it *is* distributed like for one measurement for each sample?

Thanks!

#### ee00

##### New Member
P.S. Is it distributed like chi^2 at least when the counts are large (e.g. 500 and 450 instead of 10 and 0 as in my example)?

#### BGM

##### TS Contributor
If you do not want to rely on the asymptotic approximation for the LRT, you can carry out the exact LRT as follows:

Let $$X_1 \sim Poisson (\theta_1), X_2 \sim Poisson (\theta_2)$$

$$H_0 : \theta_1 = \theta_2$$ vs $$H_1 : \theta_1 \neq \theta_2$$

$$\sup_{\theta \in \Theta_0} L(x_1, x_2) = e^{-\frac{x_1+x_2} {2}} \frac {(\frac {x_1 + x_2} {2})^{x_1}} {x_1!} e^{-\frac{x_1+x_2} {2}} \frac {(\frac {x_1 + x_2} {2})^{x_2}} {x_2!}$$

$$\sup_{\theta \in \Theta} L(x_1, x_2) = e^{-x_1} \frac {x_1^{x_1}} {x_1!} e^{-x_2} \frac {x_2^{x_2}} {x_2!}$$

$$\Lambda(x_1, x_2) = \frac {\sup_{\theta \in \Theta_0} L(x_1, x_2)} {\sup_{\theta \in \Theta} L(x_1, x_2) } = \frac {(\frac {x_1 + x_2} {2})^{x_1 + x_2}} {x_1^{x_1}x_2^{x_2}}$$

Note $$\Lambda(x_1, x_2) \in (0, 1]$$ (taking $$0^0 = 1$$),
and we reject $$H_0$$ when $$\Lambda(x_1, x_2) \leq c$$.
For the example, $$\Lambda(0, 10) = \frac {1} {2^{10}} = \frac {1} {1024}$$

You can choose the corresponding critical value by setting
$$\Pr\left\{\Lambda(X_1, X_2) \leq c \bigg|H_0\right\} = \alpha$$
where the significance level $$\alpha$$ is prespecified

or you evaluate the P-value
$$= \Pr\left\{\Lambda(X_1, X_2) \leq \frac {1} {1024}\bigg|H_0\right\}$$

But that is hard to evaluate, so taking the asymptotic approximation seems much easier.
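
Part of the difficulty is that this probability depends on the unknown common mean under $$H_0$$. For any one assumed value it can be computed by direct enumeration, as in the sketch below. The names `exact_p_value`, `theta` and `x_max` are illustrative, the sums are truncated at `x_max` (assumed large enough that the remaining tail mass is negligible), and $$0^0 = 1$$ is assumed as above.

```python
import math

def neg2_log_lambda(x1, x2):
    # -2 ln Lambda(x1, x2), taking 0 * ln(0) = 0 (equivalent to 0^0 = 1)
    m = (x1 + x2) / 2

    def term(x):
        return x * math.log(x / m) if x > 0 else 0.0

    return 2 * (term(x1) + term(x2))

def poisson_pmf(x, theta):
    # Poisson probability mass, computed on the log scale for stability
    return math.exp(-theta + x * math.log(theta) - math.lgamma(x + 1))

def exact_p_value(x1_obs, x2_obs, theta, x_max=200):
    """P(Lambda(X1, X2) <= Lambda(x1_obs, x2_obs)) for X1, X2 ~ Poisson(theta)."""
    observed = neg2_log_lambda(x1_obs, x2_obs)
    pmf = [poisson_pmf(x, theta) for x in range(x_max + 1)]
    total = 0.0
    for x1 in range(x_max + 1):
        for x2 in range(x_max + 1):
            # Small Lambda means large -2 ln Lambda; the tolerance keeps ties
            if neg2_log_lambda(x1, x2) >= observed - 1e-9:
                total += pmf[x1] * pmf[x2]
    return total

for theta in (1.0, 5.0, 10.0):
    print(theta, exact_p_value(10, 0, theta))
```

Running it for a few values of `theta` shows how much the answer moves with the assumed common mean, which is why no single exact P-value falls out of this approach without a further choice (for example plugging in the pooled estimate).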

#### fed1

##### TS Contributor
Getting back to the business about the asymptotic distribution of the LRT, I was thinking that the convergence ought to depend on how large lambda is, and not the number of sampling volumes.

Doesn't the Poisson become more "normalish" as lambda gets bigger, or am I mistaken in that regard?

I think increasing lambda and increasing sample size are related, because

X1 + X2 ~ Poisson(2*lambda) is like doubling lambda.

It suggests that convergence of the LRT depends on both the number of samples and lambda?

#### squareandrare

##### New Member
It does depend on lambda. Because of the stationarity of the Poisson process, samples for a year could be considered 12 samples, one from each month. In that case, it seems like it would depend more on the observation count (which of course depends on lambda).

##### Ninja say what!?!
> It does depend on lambda. Because of the stationarity of the Poisson process, samples for a year could be considered 12 samples, one from each month. In that case, it seems like it would depend more on the observation count (which of course depends on lambda).

My understanding is that once the unit of time is set (i.e. months or years), lambda is then independent of the number of observations, correct? Though I can see how the convergence of the LRT depends on both the number of samples and lambda.

#### squareandrare

##### New Member
Lambda is proportional to the time scale. Lambda will be 12x larger using years as opposed to months.

I guess the best way to say it is that the convergence of the LRT depends on the total number of counts observed, whether it be a large count in a single time period or small counts over multiple time periods.
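
One way to see this is to compute, for a given common lambda, the exact probability that the nominal 5% chi-square test rejects when the null is true. The sketch below does that by enumeration for one count per sample. The name `rejection_rate`, the truncation rule for `x_max` and the list of lambda values are all assumptions made for illustration.

```python
import math

CHI2_1_CRIT_05 = 3.841459  # 95th percentile of chi-square with 1 df

def neg2_log_lambda(x1, x2):
    # -2 ln(likelihood ratio), taking 0 * ln(0) = 0 (equivalent to 0^0 = 1)
    m = (x1 + x2) / 2

    def term(x):
        return x * math.log(x / m) if x > 0 else 0.0

    return 2 * (term(x1) + term(x2))

def rejection_rate(lam, crit=CHI2_1_CRIT_05):
    """Exact P(-2 ln LR > crit) when X1, X2 ~ Poisson(lam) independently."""
    # Truncate far out in the upper tail; the mass beyond is assumed negligible
    x_max = int(lam + 12 * math.sqrt(lam) + 30)
    pmf = [math.exp(-lam + x * math.log(lam) - math.lgamma(x + 1))
           for x in range(x_max + 1)]
    return sum(pmf[a] * pmf[b]
               for a in range(x_max + 1)
               for b in range(x_max + 1)
               if neg2_log_lambda(a, b) > crit)

for lam in (0.5, 2, 10, 50, 200):
    print(lam, round(rejection_rate(lam), 4))
```

With small expected counts the rejection rate can sit well away from 5%, and it should settle near 5% as the expected total count grows, which is consistent with the convergence being driven by the total number of counts.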
