The probability of an observation sequence, O, given a model, M, and initial state Si is given as:

P(O|M, q1 = Si) = (aii)^(d-1) * (1 - aii)

The first part, (aii)^(d-1), makes perfect sense to me, but for some reason I can't make sense of the second part. Why do you multiply by (1 - aii), which is the probability of leaving the state?
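
To make the expression concrete, here is a minimal sketch that just evaluates it numerically. It assumes d is the number of consecutive time steps spent in Si; the function name `duration_prob` and the value aii = 0.8 are made up for illustration and are not from any source.

```python
def duration_prob(a_ii: float, d: int) -> float:
    """Evaluate (a_ii)^(d-1) * (1 - a_ii) for a duration d >= 1."""
    if d < 1:
        raise ValueError("d must be at least 1")
    return a_ii ** (d - 1) * (1.0 - a_ii)


a_ii = 0.8  # hypothetical self-transition probability, chosen for illustration

# Value of the expression for d = 1..5
probs = [duration_prob(a_ii, d) for d in range(1, 6)]

# Sum over a long range of d; the tail beyond d = 199 is negligible here
total = sum(duration_prob(a_ii, d) for d in range(1, 200))

print(probs)
print(total)
```

With the (1 - aii) factor included, the values over all d add up to (numerically) 1. Without it, (aii)^(d-1) alone would sum to 1 / (1 - aii).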

I'm going to go take a bike ride and hope to lower the probability of me being dense.