# How do I test if my data is Poisson Distributed?

#### DBM

##### New Member
I'm unsure of the correct way to calculate Pearson's chi-squared statistic from my data in order to test whether it is Poisson distributed.
(I'm working in RStudio)
Here is my data:

> X
 70 60 60 57 100 100 75 60 65 70 120 60 65 60 80 90 65 80 75 75 100 65 90 60 150 60 65 75 75 85 50 75 90 100 60 85
 80 65 80 80 100 110 90 80 85 60 65 70 65 80

I first used the table() function to see the frequencies of my data. My plan is to use these frequency values as my observed values.

> Y = table(X)
> Y
X
 50  57  60  65  70  75  80  85  90 100 110 120 150
  1   1   9   8   3   6   7   3   4   5   1   1   1

I saw that I had 13 different values in my data, so I assigned each frequency an indicator r from 0 to 12.
I then calculated the expected values using the dpois() function, with lambda equal to the mean of my observed values (the frequencies in Y).

> N = length(X)
> N
 50
> r = 0:12
> r
 0 1 2 3 4 5 6 7 8 9 10 11 12
> E = dpois(r, lambda = mean(Y))*N
> E
 1.06808696 4.10802676 7.90005147 10.12827112 9.73872223 7.49132479 4.80213128 2.63853367 1.26852580 0.54210504 0.20850194 0.07290278
 0.02336627

Because some of the expected values are quite small, I summed the first two and the last six so that each combined value is close to 5 (as a rule of thumb). Then I combined the observed values in the same way.

> Y = c(sum(Y[1:2]), Y[3:7], sum(Y[8:13]))
> Y
2 9 8 3 6 7 15
> E = c(sum(E[1:2]), E[3:7], sum(E[8:13]))
> E
 5.176114 7.900051 10.128271 9.738722 7.491325 4.802131 4.753936

Using these values I calculated Pearson's Chi-Squared statistic:

> X_squared = sum((Y - E)^2/E)
> X_squared
 30.59809

Lastly, I calculated the p-value. Because I am doing a goodness-of-fit test for the Poisson distribution, I use n-2 degrees of freedom, where n = 7 is the number of categories after grouping.

> 1 - pchisq(X_squared, df = 7-2)
 1.124392e-05
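
For reference, the calculation above can be written out as follows. The symbols are notation introduced here, not from the R session: O_i stands for the grouped observed counts (Y), E_i for the grouped expected counts (E), k for the number of categories after grouping, and p for the number of parameters estimated from the data (here assumed to be 1, for lambda).

```latex
X^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i},
\qquad
\text{p-value} = 1 - F_{\chi^2_{\,k-1-p}}\!\left(X^2\right),
\qquad
k - 1 - p = 7 - 1 - 1 = 5 .
```

This matches the `df = 7-2` used in the pchisq() call.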

Is this the correct way to test for the Poisson distribution?

#### Alex MacMillan

##### New Member
Have you been able to look at the distribution of your data by plotting it?
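
One possible way to do that, as a rough sketch rather than a definitive check: draw a histogram of the raw values and overlay a Poisson pmf. The sketch assumes the raw values in X are themselves the counts and estimates lambda by the sample mean of X; the bin width of 10 and the name `lambda_hat` are arbitrary choices made here.

```r
# Data as posted in the question
X <- c(70, 60, 60, 57, 100, 100, 75, 60, 65, 70, 120, 60, 65, 60, 80, 90, 65,
       80, 75, 75, 100, 65, 90, 60, 150, 60, 65, 75, 75, 85, 50, 75, 90, 100,
       60, 85, 80, 65, 80, 80, 100, 110, 90, 80, 85, 60, 65, 70, 65, 80)

lambda_hat <- mean(X)    # assumption: lambda estimated by the sample mean of X
grid <- min(X):max(X)    # every integer value spanned by the data

# Histogram on the density scale (bin width 10 is an arbitrary choice)
hist(X, breaks = seq(45, 155, by = 10), freq = FALSE, ylim = c(0, 0.05),
     main = "Observed data vs. fitted Poisson", xlab = "X")

# Overlay the Poisson pmf as vertical spikes at each integer
lines(grid, dpois(grid, lambda = lambda_hat), type = "h", col = "red")
```

If the bars and the spikes have clearly different spreads or shapes, that is a visual hint that a Poisson model may not fit well.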

#### hlsmith

##### Less is more. Stay pure. Stay poor.
I agree that looking at it is important. Also, if the mean count is 8 or greater, the data can be modeled with linear regression. I think there may be a Q-Q plot that can be created for the Poisson distribution.
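
A minimal sketch of one way such a Q-Q plot could be built in base R, using qpois() and ppoints() from the stats package. It assumes the raw values in X are the counts and that lambda is estimated by the sample mean of X; `lambda_hat` and `theo` are names made up for this example.

```r
# Data as posted in the question
X <- c(70, 60, 60, 57, 100, 100, 75, 60, 65, 70, 120, 60, 65, 60, 80, 90, 65,
       80, 75, 75, 100, 65, 90, 60, 150, 60, 65, 75, 75, 85, 50, 75, 90, 100,
       60, 85, 80, 65, 80, 80, 100, 110, 90, 80, 85, 60, 65, 70, 65, 80)

N <- length(X)
lambda_hat <- mean(X)    # assumption: lambda estimated by the sample mean of X

# Theoretical Poisson quantiles at N evenly spread probability points
theo <- qpois(ppoints(N), lambda = lambda_hat)

# Sorted sample against theoretical quantiles; points near the
# dashed y = x line would be consistent with the fitted Poisson
plot(theo, sort(X),
     xlab = "Theoretical Poisson quantiles", ylab = "Sample quantiles",
     main = "Poisson Q-Q plot")
abline(0, 1, lty = 2)
```

Systematic departures from the dashed line, especially in the tails, would suggest the data are more or less spread out than the fitted Poisson.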