Uninformative gamma prior for the Poisson distribution


I want to use the Poisson distribution in my research to model some waiting times, and I want to place a gamma prior on its parameter. I want the prior to be as uninformative as possible.

From the literature I've read that a gamma prior cannot really be considered uninformative, but Gamma(0.001, 0.001) is nevertheless used in practice as a prior for the Poisson parameter.

The problem for me is that I don't understand how I could use this Gamma(0.001, 0.001) prior. It is relatively flat for larger values, but it has a sharp peak near zero. If I draw a value from it, I will most probably get something between 0 and 1, and I would have to draw many values to get one greater than 1. This, however, means that I cannot really use it as an uninformative prior for my Poisson, because it is heavily biased toward values between 0 and 1.
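To make that concrete, here is a small sketch (my own illustration, in Python using only the standard library) of what draws from Gamma(0.001, 0.001) actually look like; the exact numbers depend on the seed:

```python
import random

random.seed(42)

# Gamma(0.001, 0.001) in the shape/rate parameterization.
# random.gammavariate takes (shape, scale), and scale = 1/rate = 1000.
draws = [random.gammavariate(0.001, 1000.0) for _ in range(100_000)]

frac_below_one = sum(d < 1.0 for d in draws) / len(draws)
print(f"fraction of draws below 1: {frac_below_one:.3f}")
print(f"sample mean of draws:      {sum(draws) / len(draws):.2f}")
```

Almost all of the mass sits below 1 even though the distribution's mean is shape × scale = 1, which is exactly the near-zero spike described above.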

Can you give me some explanation of how this "uninformative" gamma prior is actually used for drawing the Poisson parameter?
If you don't have a prior, why do a Bayesian analysis?

I'd need to know more about the nature of your research to give you a more definitive answer, but if you have a sample of wait times, you can fit a Poisson model and give a confidence interval on its parameter without ever worrying about an assumed prior.
I'll try to explain my research problem a bit:

I am actually dealing with NLP, and I want to do some text segmentation. The Poisson distribution would be used to model the length of the segments. Since the segments I am trying to learn do not correspond exactly to any explicit structural elements in language (like words, for example), I really don't know the mean segment length in advance.

I use MCMC (Gibbs sampling) to find the segmentation, and I also plan to sample the Poisson parameter in every iteration, taking the resulting posterior as the new prior.
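For what it's worth, thanks to gamma-Poisson conjugacy that per-iteration step can be written in closed form. A hedged sketch of what I mean (the segment lengths and hyperparameters here are made-up placeholders, not from any real run):

```python
import random

def sample_poisson_rate(segment_lengths, alpha0=0.001, beta0=0.001):
    """One Gibbs draw of the Poisson rate given the current segmentation.

    By gamma-Poisson conjugacy the full conditional is
    Gamma(alpha0 + sum(y), beta0 + n) in the shape/rate parameterization.
    """
    shape = alpha0 + sum(segment_lengths)
    rate = beta0 + len(segment_lengths)
    # random.gammavariate takes (shape, scale); scale = 1/rate.
    return random.gammavariate(shape, 1.0 / rate)

random.seed(0)
lengths = [4, 7, 5, 6, 3, 5]          # hypothetical segment lengths
theta = sample_poisson_rate(lengths)  # a draw concentrated near the
print(theta)                          # sample mean of 5.0
```

So within each Gibbs sweep one would re-draw the rate from this conditional given the current segmentation, rather than literally refitting a prior.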

I have some experience with sampling parameters this way, setting the initial prior to be uninformative and updating it according to the data; the results were really good, and it seems a clean way to model such parameters.

The alternative approach I could think of would be to make an arbitrary segmentation first, according to some other model that does not involve segment length, then learn the gamma parameters from that data and use them as a prior for my Poisson.


Ambassador to the humans
If you don't have a prior, why do a Bayesian analysis?
You're not much for the Bayesian approach, I take it? There are quite a few reasons why somebody might opt for a Bayesian approach even without strong (or any) prior beliefs. One reason is that it can be very difficult or impossible to do what you want in a frequentist setting; the OP clearly isn't in this situation. Another is that results are often much nicer to interpret in a Bayesian setting. A last reason is that Bayesians have several formulations of why the Bayesian route is the 'right' way to do things, so you could be doing it for philosophical reasons. I'm not saying you have to be a Bayesian, but there definitely are reasons why somebody would take a Bayesian route even if they have to use some sort of noninformative prior.


Ambassador to the humans
I'll merge this with my previous post later:

OP - If you want to know why many people consider that an uninformative prior, consider what the parameters of the posterior distribution are. If the prior parameters are really small, do they have much impact on the posterior? The gamma-Poisson model is one of the nice ones that we can work out analytically.
Karamnula: While I still don't entirely understand what you are doing, it sounds to me like the prior is an integral part of your algorithm: you can't just choose to run without a prior, as I originally suggested. What I would suggest, then, is that you test your system with many different initial priors. I would hope that it converges to the same result across a wide range of them. If it does, you can stop worrying about the details of the initial prior. If it does not, you have a complex engineering problem on your hands, not really a statistics problem, to determine which prior is best for your application.
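For the Poisson-rate part of the model, at least, that sensitivity check is cheap because the conjugate posterior is available in closed form. A sketch (the segment lengths here are invented for illustration):

```python
import math

def posterior_summary(y, alpha0, beta0):
    # Gamma-Poisson conjugate posterior: Gamma(alpha0 + sum(y), beta0 + n)
    shape = alpha0 + sum(y)
    rate = beta0 + len(y)
    return shape / rate, math.sqrt(shape) / rate  # posterior mean, sd

y = [4, 7, 5, 6, 3, 5]  # hypothetical segment lengths, sample mean 5.0
for a0, b0 in [(0.001, 0.001), (0.1, 0.1), (1.0, 1.0), (10.0, 10.0)]:
    mean, sd = posterior_summary(y, a0, b0)
    print(f"prior Gamma({a0}, {b0}): posterior mean {mean:.3f}, sd {sd:.3f}")
```

Even with only six observations, the small priors give nearly identical posteriors; only a strongly informative prior like Gamma(10, 10) pulls the estimate noticeably away from the sample mean.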


Ambassador to the humans
I guess I can be a little more explicit now. If we assign a gamma prior to our Poisson parameter, we have:

\(\theta \sim \text{Gamma}(\alpha, \beta), \qquad \pi(\theta) = \frac{\beta^\alpha}{\Gamma(\alpha)}\theta^{\alpha -1}e^{-\beta \theta}\)

as our prior. I explicitly stated the form of the density because the gamma can be parameterized in many different ways. If you go through the motions to derive the posterior distribution analytically, we get:

\(\theta \mid y \sim \text{Gamma}\left(\alpha + \sum_{i=1}^n Y_i,\; \beta + n\right)\)

You can see that if we make \(\alpha\) and \(\beta\) really small then they have very little influence on the posterior distribution. Note also that the posterior mean is:

\(E[\theta | y] = \frac{\alpha + \sum_{i=1}^n Y_i}{\beta + n}\)

which approaches the sample mean as we let \(\alpha\) and \(\beta\) go to zero.
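A quick numerical check of that limit (a sketch; the observations here are made up):

```python
def posterior_mean(y, alpha, beta):
    # Posterior mean of the gamma-Poisson model: (alpha + sum(y)) / (beta + n)
    return (alpha + sum(y)) / (beta + len(y))

y = [3, 8, 5, 4, 6]            # hypothetical Poisson observations
sample_mean = sum(y) / len(y)  # 5.2

for eps in [1.0, 0.1, 0.001]:
    print(f"alpha = beta = {eps}: posterior mean {posterior_mean(y, eps, eps):.4f}")
print(f"sample mean: {sample_mean}")
```

As the hyperparameters shrink, the posterior mean converges to the sample mean, which is the sense in which Gamma(0.001, 0.001) is "uninformative" here.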
Hi and thank you all who have tried to help me.

Actually, the conjugacy relationship between the Poisson and the gamma is known to me, and I understand the principle well. The hard part for me was that before I have observed any data, I still need some parameter for the Poisson process to start the simulation, meaning that I have to draw some value from the initial Gamma(a, b).

But I also understand now that maybe I don't have to set the gamma parameters as small as 0.001; it's enough for them to be somewhere around 0.2 and 0.1, and they still won't influence the posterior much.

In that sense I agree that the most reasonable thing to do would be to try different priors and see whether they affect the result.

Anyway, this gave me some ideas, and I think I can move forward with my work now.