model comparisons for a binomial distribution

Dear all,

I have a stats question that relates to a bigger piece of research, but I've simplified the bit I'm confused about...

Suppose I run an experiment 10 times (M=10), where in each experiment I toss a coin 20 times (n=20) and record the number of heads (x). From this it is quite easy to estimate the probability of a success: p^hat = sum(x_i/n_i)/M
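As a minimal sketch of that estimate in Python (the head counts in `heads` are made up for illustration, not real data):

```python
# Hypothetical head counts for M = 10 experiments of n = 20 tosses each.
heads = [11, 9, 12, 10, 8, 13, 10, 9, 11, 12]
n = 20
M = len(heads)

# p_hat = sum(x_i / n_i) / M
# With the same n in every experiment this equals sum(x_i) / (M * n).
p_hat = sum(x / n for x in heads) / M
print(p_hat)
```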

Suppose in the same experiment I actually swapped the coin halfway through, and I have the results for each coin's trials separately. Again the best fit is quite easy to find.

What I would like to do is test whether the probabilities for both coins are the same. I would like to have a separate model for each hypothesis (model 1: 20 trials each, model 2: 10 trials for each coin) and compare the hypotheses. But I'm unsure how to do this... Usually I would compare the log-likelihoods, but the models aren't nested so I don't think this is appropriate. (The combination terms in the binomial likelihood differ between the models.) I could assume p1=p2 within the same model, find the log-likelihood and compare it to the log-likelihood with the parameters unconstrained. This is OK for the parameters, but I'm not comparing models, so it's not really what I want.
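For reference, the constrained-versus-unconstrained comparison I mention could be sketched like this in Python. The counts `x1` and `x2` are hypothetical, and the p-value assumes the usual large-sample chi-square approximation with 1 degree of freedom for the likelihood ratio statistic, so treat it as a rough sketch rather than a definitive test:

```python
import math

def binom_loglik(x, n, p):
    """Binomial log-likelihood, including the log of the combination term."""
    return (math.log(math.comb(n, x))
            + x * math.log(p) + (n - x) * math.log(1 - p))

# Hypothetical head counts per experiment, 10 tosses with each coin.
x1 = [6, 5, 7, 4, 6, 5, 6, 7, 5, 6]  # coin 1
x2 = [4, 3, 5, 4, 2, 5, 3, 4, 4, 3]  # coin 2
n1 = n2 = 10

# Maximum likelihood estimates: separate p1, p2 and a single shared p.
p1_hat = sum(x1) / (n1 * len(x1))
p2_hat = sum(x2) / (n2 * len(x2))
p_hat = (sum(x1) + sum(x2)) / (n1 * len(x1) + n2 * len(x2))

# Log-likelihood of the same per-coin data under each parameterisation.
ll_free = (sum(binom_loglik(a, n1, p1_hat) for a in x1)
           + sum(binom_loglik(b, n2, p2_hat) for b in x2))
ll_same = (sum(binom_loglik(a, n1, p_hat) for a in x1)
           + sum(binom_loglik(b, n2, p_hat) for b in x2))

# Likelihood ratio statistic; guard against tiny negative rounding error.
lr_stat = max(2 * (ll_free - ll_same), 0.0)

# Upper tail of a chi-square with 1 degree of freedom: erfc(sqrt(s / 2)).
p_value = math.erfc(math.sqrt(lr_stat / 2))
print(lr_stat, p_value)
```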

I think I might be able to use Bayes Factors, can anyone comment on this and perhaps provide introductory references? Or is there a simpler solution?


I'm confused as to what is giving you trouble. The reduced model (where you assume the probability for each coin is the same) is nested within the full model (where each coin has a possibly different probability) but you make it sound like you don't think that is the case?

Hi Dason,

I don't think the models are nested, but please correct me.

For the reduced model

Pr(x=X) = c(n,x) . p^x . (1-p)^{n-x}

For the full model

Pr(x1=X1,x2=X2) = c(n1,x1) . p1^x1 . (1-p1)^{n1-x1} . c(n2,x2) . p2^x2 . (1-p2)^{n2-x2}

where n=n1+n2, x=x1+x2

Whilst the p's can reduce to the reduced model, I don't think the combinations do,

c(n,x) does not equal c(n1,x1) . c(n2,x2), as there are typically more combinations for the reduced model than the full model.
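
To put concrete numbers on the two combination terms, here is a small Python check. The counts are hypothetical (10 tosses per coin, with 6 and 3 heads), chosen only for illustration:

```python
import math

# Hypothetical counts: 10 tosses per coin, 6 and 3 heads respectively.
n1, x1 = 10, 6
n2, x2 = 10, 3
n, x = n1 + n2, x1 + x2

pooled = math.comb(n, x)                       # c(n, x)
split = math.comb(n1, x1) * math.comb(n2, x2)  # c(n1, x1) . c(n2, x2)
print(pooled, split)
```

Neither term involves the probabilities, only the counts.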