Poissonness plot

noetsi

Fortran must die
#1
This is essentially the equivalent of a QQ plot for Poisson distribution, which can not use QQ plots since the distribution is of discrete data. Michael Friendly tells you how to use these and gives SAS code on 51 on in his book "Visualizing Categorical Data"

Sample code

Title ' Poissoness Plot: Horskicks';
Data horskicks;
input deaths corpsyrs;
label deaths = 'number of Deaths'
corpsyrs =' Number of Corps years':
datalines;

0 109
1 65
2 22
3 3
4 1
;

%poisplot (data=horskick, count=Deaths, freq=corpsyrs);


Obviously you need to run the macro %poisplot first per the link. Hoaglin created the poissonnees plot. Tukey helped create similar ones for logarithmic, geometric and binomila distributions. That dude got around...
 
Last edited:

Dason

Ambassador to the humans
#3
Is there a question or are you just sharing? You can do this very easily in R as well using the distplot function in the vcd package.

Code:
x <- 0:4
kicks <- c(109, 65, 22, 3, 1)
dat <- rep(x, kicks)

library(vcd)
distplot(dat, "poisson")

# How is this made?

N <- sum(kicks) # total number of observations
y <- log((factorial(x) * kicks)/sum(kicks))

o <- lm(y ~ x) # We get the slope/intercept given in plot
plot(x, y)
abline(o)
 

noetsi

Fortran must die
#5
The input code above inputs a actual poisson distribution - one based on a famous analysis. The macro will generate everything else. Note you will get lots of error messages when you run this, at least in EG, there are issues with the macro and more recent SAS code I believe. But it generates the same results the author finds in his book so the warnings don't seem to matter.

I ran a test to see if this works. First I ran a simulation from a known poisson distribution.

%Let N = 500;
Data Poisson (keep=x);
call streaminit(4321);
lambda = 4;
do i = 1 to &N;
X= rand('Poisson',lambda);
output;
End;
Run;

Then I tested this with the method shown above (test is the data set that has the counts and frequency in it, this is obviously an easier way to do this then input them manually).

title 'Poissoness Plot: Death by Horsekicks';
data horsekick1;
SET test1;
label count = 'Number of Deaths'
freq ='number of corps years' ;


%poisplot (data=horsekick1, count=count, freq=freq);


This showed correctly the distribution was poisson. Note the leveled poisson plot suggested to me, incorrectly, that the distribution was not poisson. I am not sure I know how to run this.

Then I generated a binomial distribution.

%let N =500;
Data binomial (keep=x);
Call streaminit(4321);
p = 1/2;
Do i=1 to &N;
x = RAND('Binomial',p,10);
output;
End;
Run;

I tested this with the code I used above again.

title 'Poissoness Plot: Death by Horsekicks';
data horsekick3;
SET test3;
label count = 'Number of Deaths'
freq ='number of corps years' ;


%poisplot (data=horsekick3, count=count, freq=freq);

The result is a curve around the diagnonal which the author states to be an indication the distribution is not poisson. Which of course is correct.


I have not got to the other plots - this stuff is not easy for me:p Distplot is the macro apparently for these plots similar to what Dason noted in R.

This is useful by the author.

http://www.datavis.ca/courses/grcat/grcfoils1_4.pdf
 
Last edited:

noetsi

Fortran must die
#6
Is there a question or are you just sharing? You can do this very easily in R as well using the distplot function in the vcd package.

Code:
x <- 0:4
kicks <- c(109, 65, 22, 3, 1)
dat <- rep(x, kicks)

library(vcd)
distplot(dat, "poisson")

# How is this made?

N <- sum(kicks) # total number of observations
y <- log((factorial(x) * kicks)/sum(kicks))

o <- lm(y ~ x) # We get the slope/intercept given in plot
plot(x, y)
abline(o)
Just sharing for a change. Does R created leveled Poissoness plots? These are the one element I have trouble with in SAS - although I may be doing them wrong.
 

noetsi

Fortran must die
#7
Whatever you do don't run the macros word gskip etc attached to this macro. It totally messes up poisplot and there appears to be no way to fix this even opening up different EG or SAS and starting from scratch :mad:

Sorry I forgot to post the link to the macro...

http://euclid.psych.yorku.ca/SCS/vcd/poisplot.html

This is the link to the macro distplot used for the other distribution types.

http://www.datavis.ca/books/vcd/distplot.html

"The DISTPLOT macro constructs plots of a discrete distribution designed to diagnose whether the data follows one of the standard distributions: the Poisson, Binomial, Negative Binomial, Geometric, or Log Series, specified by the DIST= parameter. The usual (PLOT=DIST) plot is constructed so that the points lie along a straight line when the data follows that distribution. An influence plot (PLOT=INFL) shows the influence of each observation on the choice of the distribution parameter(s)."
 
Last edited:

hlsmith

Not a robit
#8
Will play around with it next week. Out of the office with a sick kid. Currently watching the first season of Manhattan. Takes two episodes to get into it but now I am on episode 11.
 

noetsi

Fortran must die
#9
Sorry about your child :(

When I run distplot I run into problems.

This is the error message.

WARNING: Apparent invocation of macro WORDS not resolved.
WARNING: Apparent invocation of macro WORDS not resolved.
ERROR: Required operator not found in expression: &nparm = 0
ERROR: The macro DISTPLOT will stop executing.

The warnings may be meaningless, I get the samewhen using the poissonness macro and the results are fine. According to the link by Friendly the only things you have to include are the following:

%distplot(data=qtest6,count=count,freq=freq,dist=binomial);

but that generated the error message above. Unfortunately the way the macro is formatted makes it very hard to read and find the problem {which I might not understand anyhow} :p
 

Dason

Ambassador to the humans
#10
This is the error message.

WARNING: Apparent invocation of macro WORDS not resolved.
WARNING: Apparent invocation of macro WORDS not resolved.
ERROR: Required operator not found in expression: &nparm = 0
ERROR: The macro DISTPLOT will stop executing.

The warnings may be meaningless, I get the samewhen using the poissonness macro and the results are fine.
You get the same errors? Are you sure they're the same for the poisplot macro as is above?
 

noetsi

Fortran must die
#12
Actually it said that it could not find three macros including word. aftr I ran that it still can't find LABEL and GSKIP. I will look for those and run them. Thanks for assistance dason. Friendly does not suggest you need to run those macros in the book or his web site.
 

noetsi

Fortran must die
#13
I ran all three macros word label and gskip. It runs now, but no graph is produced. I get the following error message.

ERROR: No valid observations are found.
24 order=(. to . by 1 )
_
22
200
ERROR 22-322: Syntax error, expecting one of the following: a quoted string, a numeric constant, a datetime constant.

ERROR 200-322: The symbol is not recognized and will be ignored.

WARNING: AXIS statement 1 not found. The default axis description will be used.
WARNING: Graph has missing data for the required role ( y * count ), no graph will be produced.
WARNING: No output was generated due to all missing data.

The problem for me is that the macro is so long figuring out where the error is is very difficult.

Maybe HLSMITH can figure out what is wrong, or dason :p I do think that you have to run all three macros word label and gskip although that does not fully correct the problem. Looking at the data produced it appears that the confidence intervals are not being generated.
 
Last edited:

noetsi

Fortran must die
#15
What I found really anoying about this is that it was working fine. Then I ran the macros noted above to fix something and I could never get the basic macro to work again. Even when I started a brand new session of SAS.
 

noetsi

Fortran must die
#16
hlsmith I have actually run that successfully. But it is a lot of work to get that going, much less than Friendly. And it does not apply to many distributions that friendly addressed.