Background:
I'm currently trying to learn how to use Bayesian analysis for split testing two sets of binomial data (web analytics, multi-armed bandit approaches, etc.). The Beta distribution kept coming up in this research, so I'm trying to understand and apply it better.
So, the best explanation I found so far was here:
http://stats.stackexchange.com/questions/47771/what-is-intuition-behind-beta-distribution
My question is about how the values of alpha and beta are chosen.
In the example, the approximations were 81 & 221 respectively. So, what is the best method for approximating these values?
I was under the impression that we should stick with integer values for alpha and beta, so it seemed these values were picked as the best integer values to match the prior distribution and give a mean of 0.27.
However, it seems to me that the magnitude of these values will strongly affect how the Bayesian update behaves. That is, new data will have a much smaller effect on the posterior distribution if alpha + beta > 1000 than if alpha + beta < 10.
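To make that concern concrete, here is a minimal sketch of what I mean. The two priors and the observed data are made up for illustration (they are not from the linked answer), and I'm assuming the standard conjugate update, where the posterior is Beta(alpha + successes, beta + failures):

```python
def posterior_mean(alpha, beta, successes, failures):
    """Mean of the Beta posterior after updating a Beta(alpha, beta) prior
    with binomial data (conjugate update)."""
    return (alpha + successes) / (alpha + beta + successes + failures)

# Hypothetical data: 100 trials with 40 successes (observed rate 0.40).
successes, failures = 40, 60

# Two hypothetical priors with the same mean (0.30) but different magnitudes.
weak = posterior_mean(3, 7, successes, failures)        # alpha + beta = 10
strong = posterior_mean(300, 700, successes, failures)  # alpha + beta = 1000

print(f"weak prior   -> posterior mean {weak:.3f}")
print(f"strong prior -> posterior mean {strong:.3f}")
```

With the small prior the posterior mean moves most of the way toward the observed rate, while with the large prior it barely shifts from 0.30, which is the effect I'm worried about.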
I hope I've explained my question well enough. Any help would be greatly appreciated.