Bayesian confusion


I am trying to reconcile what I think are conflicting philosophies in my head about an experiment.

The experiment is a hierarchical model where:

X ~ Bernoulli(p)

p has an exact functional relationship of the form:

p = p(theta, B, f).

B is a parameter that is fixed across all realizations of an ideal experiment but unknown, and we want to infer it using Markov chain Monte Carlo (MCMC) techniques. "f" is a fixed, known number for the entire experiment.

In each realization of an ideal experiment, however, theta is random and distributed with PDF sin(theta)/2.

What is measured in the practical experiment is an *average* over these repetitions: the total number of successes, reported as the fraction of successes over ~10^8 realizations of the experiment.
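To make that "average" concrete: if theta is drawn fresh in each realization, the quantity that matters is E[p] = integral over [0, pi] of p(theta, B, f) * sin(theta)/2 dtheta. Here is a minimal numerical sketch, using a made-up toy form for p(theta, B, f) (the real function is more complicated; any p can be plugged in):

```python
import math

# Hypothetical toy stand-in for p(theta, B, f); the real function is more
# complicated, but any p(theta, B, f) can be substituted here.
def p_of(theta, B, f):
    return 0.5 * (1.0 + B * f * math.cos(theta))  # stays in [0, 1] when |B*f| <= 1

def marginal_p(B, f, n_grid=10_000):
    """E[p] = integral over [0, pi] of p(theta, B, f) * sin(theta)/2 dtheta,
    approximated with a midpoint rule."""
    h = math.pi / n_grid
    return sum(p_of((i + 0.5) * h, B, f) * 0.5 * math.sin((i + 0.5) * h) * h
               for i in range(n_grid))

# For this symmetric toy p the cos(theta) term averages to zero, so E[p] = 0.5.
print(marginal_p(B=0.3, f=0.8))
```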

The problem I am having is that I initially thought I would assume a prior distribution of sin(theta)/2 on theta and then just run MCMC to infer the other parameters. But this is illogical: it amounts to assuming that theta is *fixed* and that our prior knowledge about it follows a particular distribution, which should get much tighter as repetitions of the experiment reveal the "true" value of theta in the posterior distribution.

In fact, though, I want to know the average p*n value, translate that into some sort of Poisson distribution, and then apply MCMC to infer B and marginalize out everything else.
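In other words, the likelihood I have in mind looks roughly like this (a sketch only; p_bar stands for the theta-averaged success probability at a given B, and the numbers are illustrative, not from the real experiment). The Poisson form is the usual small-p_bar approximation to the exact binomial:

```python
import math

def binom_loglik(k, N, p_bar):
    """Exact log-likelihood of k successes in N trials with success prob p_bar."""
    return (math.lgamma(N + 1) - math.lgamma(k + 1) - math.lgamma(N - k + 1)
            + k * math.log(p_bar) + (N - k) * math.log1p(-p_bar))

def poisson_loglik(k, N, p_bar):
    """Poisson approximation with rate lambda = N * p_bar (good when p_bar is small)."""
    lam = N * p_bar
    return k * math.log(lam) - lam - math.lgamma(k + 1)

# Illustrative numbers only: N ~ 10^8 trials, k observed successes.
N, k, p_bar = 10**8, 250, 2.5e-6
print(binom_loglik(k, N, p_bar), poisson_loglik(k, N, p_bar))
```

An MCMC step for B would then evaluate p_bar(B) and add this log-likelihood to the log-prior for B.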

In truth, the only way I can imagine this experiment being set up in a fully Bayesian way is to multiply, for each realization of the experiment, by the likelihood and the prior for theta_i, where i runs from 1 to N = 10^8. Each theta_i has a definite value in the experiment that we just don't know, and I want to assume it is drawn from a distribution with PDF sin(theta)/2. Then perhaps there is some clever way to marginalize over all the theta_i values and arrive at an answer equivalent to just doing some sort of distribution transformation (integration).
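As a sanity check on that marginalization idea, one can draw theta_i directly from sin(theta)/2 by inverse-CDF sampling and average p over the draws; a sketch, again with a hypothetical toy form for p(theta, B, f):

```python
import math, random

random.seed(42)

def p_of(theta, B, f):
    # hypothetical toy stand-in for the real p(theta, B, f)
    return 0.5 * (1.0 + B * f * math.cos(theta))

def sample_theta():
    # PDF sin(theta)/2 on [0, pi] has CDF (1 - cos(theta))/2,
    # so inverse-CDF sampling gives theta = arccos(1 - 2u), u ~ Uniform(0, 1).
    return math.acos(1.0 - 2.0 * random.random())

B, f, n = 0.3, 0.8, 200_000
mc_mean_p = sum(p_of(sample_theta(), B, f) for _ in range(n)) / n
print(mc_mean_p)  # approaches the analytic E[p] = 0.5 for this toy p
```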

Can someone clear up the fog I am in about how to properly think about/treat this situation? Thank you so much.


Ambassador to the humans
What is your final goal? I don't really even know which of the parameters you're interested in and which you aren't. It sounds like you don't actually care about the individual values of the \(\theta_i\). If that's the case you could probably marginalize them out. Then again, I don't know how easy that would be, since it depends on what your p(theta, B, f) looks like.
Hi Dason,

Thanks for the reply! I should have been more clear -- I am interested in determining B, where f is a constant that is known. The thought was to run MCMC to marginalize over any other uncertain variables (even though here we're just assuming we know everything except B) given multiple "average measurements," which should have some likelihood function.

Unfortunately the problem is p(theta, B, f) is a function of the eigenvalues of a 3x3 matrix that are convolved with a normal density.

Is what I am trying to do just a simple transformation of a distribution? I.e. I have X as a sum of Bernoulli variables with different p values for each realization, so given the density of p values, could I just integrate in some fashion to obtain a new distribution?
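For what it's worth, a quick simulation (again with a hypothetical toy p) suggests the answer is yes: since the theta_i are i.i.d., marginalizing each one makes every trial marginally Bernoulli(E[p]), so the total count is exactly Binomial(N, E[p]), matching in both mean and variance:

```python
import math, random

random.seed(7)

def p_of(theta, B, f):
    return 0.5 * (1.0 + B * f * math.cos(theta))  # hypothetical toy p

def sample_theta():
    return math.acos(1.0 - 2.0 * random.random())  # draws from PDF sin(theta)/2

B, f, N, reps = 0.3, 0.8, 1_000, 2_000
p_bar = 0.5  # analytic E[p] for this toy p

# Each count sums N Bernoulli trials, each with its own freshly drawn theta_i.
counts = [sum(random.random() < p_of(sample_theta(), B, f) for _ in range(N))
          for _ in range(reps)]
mean_count = sum(counts) / reps
var_count = sum((c - mean_count) ** 2 for c in counts) / (reps - 1)

# Compare against Binomial(N, p_bar): mean N*p_bar = 500, variance
# N*p_bar*(1-p_bar) = 250.
print(mean_count, var_count)
```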