Hi y'all,
I have a thought problem that might not have an easy or correct answer. At first glance it looks like a Bayesian problem, and it might be, but because I am using some complicated machine-learning algorithms, and because of the nature of the problem, the waters are muddy for me. I am just hoping some awesome person on here can help point me in the right direction.
Hypothetical Scenario:
You are trying to calculate the chance someone will die in the next year. There are many “ways” someone might die – some more likely than others – but it is only going to happen one “way” in the end.
Let us say we know all the possible ways someone can die, and we have created a model for each way that provides the probability of that event occurring in the next year. Additionally, the variables used to model the probability of any particular “way” occurring are the same variables used to model other “ways.” For instance, a person might die of an aneurysm or a stroke. Having 180/130 blood pressure (not a binary variable) for a prolonged time might be a good predictor of both events occurring. Yet we do not care which will kill the individual; we just want to know if the person will die in the next year.
In this example we can say that the chances of getting an aneurysm or a stroke are correlated, because both events are more likely to occur under certain conditions. In the real world, however, this relationship might be far more complex and might only exist in certain situations. Additionally, there are some completely unrelated “ways” for a person to die that are also modeled, such as a motorcycle accident. These must all be considered together as a whole.
This does not seem to be a simple Bayesian probability problem. At first glance, one might suggest finding the chance that none of the events occur, and then taking the complement of that probability.
Example:
Event 1: p = 0.2
Event 2: p = 0.2
Event 3: p = 0.2
Event 4: p = 0.2
Event 5: p = 0.2
Probability of at least one event occurring: 1 - (0.8*0.8*0.8*0.8*0.8) = 0.67.
This method assumes all events are independent, and it allows more than one of them to happen. However, my events are correlated (in some cases non-linearly) and, in practice, mutually exclusive: only one “way” can ever be the cause. The probability that more than one event occurs cannot just be calculated and removed from the total, because it was never possible to observe more than one event occurring in the first place.
If anyone can shed any light on my problem, I would greatly appreciate it. I have found a few articles that may or may not help me, but I am unsure if I should spend the amount of time necessary to understand them.
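To make the arithmetic concrete, here is a minimal Python sketch of the calculation above, together with the range the answer can fall in when the dependence between events is unknown. The function names are my own (hypothetical), and the bounds are the standard ones for a union of events: the probability of at least one event is never below the largest single probability and never above the sum of the probabilities (capped at 1). This does not solve the correlation problem; it only shows how far the independence answer could be off.

```python
from math import prod


def p_any_independent(probs):
    """P(at least one event), assuming the events are independent."""
    return 1.0 - prod(1.0 - p for p in probs)


def p_any_bounds(probs):
    """Bounds on P(at least one event) that hold under any dependence.

    Lower bound: the largest single probability (events overlap as much
    as possible). Upper bound: the sum of the probabilities, capped at 1
    (events are mutually exclusive, so they never overlap).
    """
    return max(probs), min(1.0, sum(probs))


if __name__ == "__main__":
    probs = [0.2] * 5  # the five example events from this post
    print("independent:", round(p_any_independent(probs), 4))
    low, high = p_any_bounds(probs)
    print("any dependence: between", low, "and", high)
```

If the per-cause models really do describe mutually exclusive outcomes for the same person and year, the upper bound (the plain sum) is the relevant figure, which for the example above is far from the 0.67 that independence gives.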
Correlated Binomial Models and Correlation Structures
http://arxiv.org/pdf/physics/0605189.pdf
Correlated Binomial Distribution
http://www.appliedbusinesseconomics....5Cgvscbd02.pdf
Last edited by SombreNote; 09-08-2014 at 01:21 PM.