# One-Sample Bayesian Proportion Test

#### hlsmith

##### Less is more. Stay pure. Stay poor.
https://www.lexjansen.com/phuse/2011/sp/SP08.pdf

In the above (short) paper, the authors present an approach for a one-sample Bayesian proportion test with uninformed and informed priors. Below I have presented code for the informed setting, which consists of just a few lines of code. However, something about the approach isn't clicking in my head. I think first issue is that in the past I have done most of my Bayesian analyses via MCMC and not much experience with beta distribution.

What is tripping me up in particular is:

Code:
a=&x+&alpha;
b=&n-&x+&beta;
x1=probbeta(&H0,a,b);
where: x = # events
n = # trials
alpha = alpha for prior
beta = beta for prior
Ho = is the value for hypothesis

So to test this, you just need to add #events to alpha for a and number of trials for non-events plus disperse for b? This part just isn't clicking. Any insight would be appreciated. Here is the full code:

Code:
/******************************NOTES**********************************************/

/*Bayesian macro to test two hypotheses with a Beta prior distribution

(Beta (alpha,beta));

Variables and parameters needed

x: number of successes in the sample

n: sample size

Alpha: alpha parameter of the prior beta distribution

Beta: beta parameter of the prior beta distribution

H0: Null hypothesis

H1: Alternative hypothesis

The conjugate distribution is a Beta(x+alpha,n-x+beta)

****************************************************************************/

%MACRO Bayes_test (x=,n=,H0=,H1=,alpha=,beta=);

DATA bayes1;

length test \$255.;

a=&x+&alpha;

b=&n-&x+&beta;

h0="p<="||left(trim(&H0));

h1="p>"||left(trim(&H1));

x=&x;

n=&n;

x1=probbeta(&H0,a,b);

x2=1-probbeta(&H1,a,b);

If x1>x2 then test='H0 is more probable than H1';

else if x1<x2 then test='H1 is more probable than H0';

else if x1=x2 then test='Equally probable hypotheses';

run;

Proc print data=bayes1 noobs l;

var h0 h1 x n test x1 x2;

label h0='H0' h1='H1' x='X' n='N' test='Test' x1='Prob. under H0' x2='Prob. under H1';

title "Bayes test of &x successes in &n samples";

footnote "Prior distribution Beta (&alpha,&beta)";

run;

title;

footnote;

%MEND Bayes_test;

%Bayes_test (x=24,n=40,H0=0.40,H1=0.60,alpha=12,beta=12);

#### Dason

I didn't check completely but the beta distribution is a conjugate prior for a binomial distribution which really reduces the complexity so the posterior distribution is able to be derived analytically (and is another beta). We really only 'need' mcmc when we can't analytically derive the posterior which is pretty much any case when you don't use a conjugate prior or you have a relatively complex model.

#### hlsmith

##### Less is more. Stay pure. Stay poor.
So to update the data with your prior, you can just get away with summing the two? That just seems too simple.

Prior:
alpha (# events)
beta (# on non-events)

Data:
x (# events)
n - x (# non-events)

Posterior:
a = x + alpha;
b = (n - x) + beta;
beta(a, b)

Can I grab the HDI and 95% CrI from beta(a, b) then?

#### Dason

Yes. It is that simple. Conjugate priors are nice.

#### hlsmith

##### Less is more. Stay pure. Stay poor.
Yes, I have heard all of this before, but had not seen it in practice. Thank you for the link!

#### hlsmith

##### Less is more. Stay pure. Stay poor.
Getting ready to leave work and ran out of time for the day. I was trying to plot the prior, data, and posterior in a plot for the above scenario. Could someone let me know what I am missing and how I can do this without cueing the histogram first. Ideally I have three density lines on a plot. I also feel like I am doing something wrong using the rbeta, since there are data to use.

Thanks!

Code:
x=24
n=40
Ho=0.40 #p<=Ho

# Prior: mean 0.50 with 95% data between 0.3 and 0.7
m = 0.50
s = 0.1

alpha = ((m**2)  * ((1 - m) / s**2) - m)
beta  = (m * (1 - m)**2 /s**2) + m - 1

a = x + alpha

b = n - x + beta

x1 = pbeta(Ho,a,b);
x1

#plot
prior = rbeta(40, alpha, beta)
posterior = rbeta(40, a, b)
dataset = rbeta(40, x, n-x)
hist(prior)
lines(density(prior), col='yellow')
lines(density(posterior), col = 'blue')
lines(density(dataset), col='green')
@Dason I am looking at you for help on this simple coding!