To be more specific, I was looking for a formula for P(D, C) with D = A|B (that is, A conditioned on B).
The idea being that I want to know how much additional information C brings about A when we already know B.
But I figured that if we know B when looking at A, then we know B, period (assuming the notation A|B makes any sense as a random variable).
So what I'm really after is the mutual information of A and C given B, i.e. the conditional mutual information I(A; C | B).
P(D, C) = P(A, C | B)
p(a, c | b) = p(a, c, b) / p(b)
I(A; C | B) = Sum_B p(b) Sum_A Sum_C p(a, c | b) log( p(a, c | b) / (p(a | b) p(c | b)) )
            = Sum_B Sum_A Sum_C p(a, c, b) log( p(a, c, b) p(b) / (p(a, b) p(c, b)) )
(note the outer sum over B has to be weighted by p(b), so the prefactor becomes the joint p(a, c, b), not p(a, c, b) / p(b)).
For which a simple (biased) estimator would be
Iest(A; C | B) = Sum_B Sum_A Sum_C (#(a, c, b) / N) log( #(a, c, b) #b / (#(a, b) #(c, b)) )
where #(x1, ..., xn) is the number of joint occurrences of (x1, ..., xn) and N is the total number of samples (so #(a, c, b) / N is the plug-in estimate of p(a, c, b)).
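As a sanity check, here is a quick Python sketch of that plug-in estimator (the function name cmi_plugin and the (a, b, c)-triple input format are my own choices, not anything standard):

```python
from collections import Counter
from math import log

def cmi_plugin(samples):
    """Plug-in (biased) estimate of I(A; C | B), in nats,
    from a list of (a, b, c) triples."""
    n = len(samples)
    n_abc = Counter(samples)                       # #(a, b, c)
    n_ab = Counter((a, b) for a, b, c in samples)  # #(a, b)
    n_cb = Counter((c, b) for a, b, c in samples)  # #(c, b)
    n_b = Counter(b for a, b, c in samples)        # #b
    total = 0.0
    for (a, b, c), k in n_abc.items():
        # (#(a,c,b) / N) * log( #(a,c,b) * #b / (#(a,b) * #(c,b)) )
        total += (k / n) * log(k * n_b[b] / (n_ab[(a, b)] * n_cb[(c, b)]))
    return total
```

On data where A and C are exactly conditionally independent given B, this returns 0; on data where C is a deterministic copy of A given B, it returns the conditional entropy of A given B, as expected.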
Is this correct?
Thanks,
X.