Bayesian Approach To Combine Multiple Weighted Inputs

cmarin

New Member
I'm beginning to learn about Bayesian theory but I'm stumped on the ideal approach for combining multiple weighted inputs. Here's an example to make this more concrete. Let's say that I want to determine the probability that John will like a particular cookie. I know that, generally, John likes 10% of the cookies he tries. Jill tries the cookie beforehand and says that there is a 70% chance that John will like it. Jill and John have been to many cookie tasting events together, so I want to weigh her input highly (say 0.8). Jack also tries the cookie and thinks there is a 60% chance John will like it. However, he and John have not been to many cookie tasting events together, so he's not as reliable an input as Jill, and I'd want to weigh his input less, say at 0.2.

Given what I know, how do I calculate the overall probability that John will like that particular cookie using a Bayesian approach? Once I figure out the mechanism I'll eventually want to apply this to a real world problem using either R or Python and there will also be more than two weighted inputs.
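To make the target concrete, here is a sketch of one possible recipe in Python: a logarithmic opinion pool, where each judge shifts the prior's log-odds in proportion to their weight. The assumption that weights act linearly on the log-odds scale is just one choice, not the only way to read "weight 0.8":

```python
import math

def log_odds(p):
    """Convert a probability to log-odds."""
    return math.log(p / (1 - p))

def sigmoid(x):
    """Convert log-odds back to a probability."""
    return 1 / (1 + math.exp(-x))

def log_opinion_pool(prior, judgments):
    """Combine weighted probability judgments with a prior.

    judgments: list of (probability, weight) pairs. Each judgment
    shifts the prior's log-odds by weight * (its own log-odds minus
    the prior's log-odds), so weight 1 means "fully trusted" and
    weight 0 means "ignored".
    """
    combined = log_odds(prior)
    for p, w in judgments:
        combined += w * (log_odds(p) - log_odds(prior))
    return sigmoid(combined)

# Prior: John likes 10% of cookies. Jill says 70% (weight 0.8),
# Jack says 60% (weight 0.2).
p = log_opinion_pool(0.10, [(0.70, 0.8), (0.60, 0.2)])
print(round(p, 3))  # roughly 0.68 with these numbers
```

With zero judgments the pool returns the prior unchanged, which is a useful sanity check on any combination rule you pick.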

staassis

Active Member
This question is not as much about Bayesian statistics as about the joint distribution of the preferences of John, Jill and Jack. You cannot answer this question without postulating correlation between John, Jill and Jack. After you've done that, you simply solve for the conditional distribution of John given Jill and Jack. Statements like "say 0.8" or "say 0.2" are non-scientific.

cmarin

New Member
staassis said:
"This question is not as much about Bayesian statistics as about the joint distribution of the preferences of John, Jill and Jack. You cannot answer this question without postulating correlation between John, Jill and Jack. After you've done that, you simply solve for the conditional distribution of John given Jill and Jack. Statements like 'say 0.8' or 'say 0.2' are non-scientific."
I appreciate you taking the time to give me a response. It seems the example I gave may not have been clear about the need to account for some of the variables (I was trying to simplify it as much as possible). Here's a more realistic example from a business context. Let's say you work in Sales and need to determine the best way to allocate a finite set of resources to chase down the most valuable leads. There is a multitude of services and models you can employ that will score the viability of these leads, though all of them are imperfect (in part because none of them has all of the available information). In some cases you also have additional information about a lead that makes them more attractive (e.g. they did business with your company before, they put out an RFP that aligns with your services, or your CEOs are best buds). What you want to do is blend all of these inputs together to come up with a composite probability score for each prospect that takes all available inputs into account.

To further flesh out this scenario, let's say:
1. There are three candidate prospects: A, B, and C.
2. On average, 10% of the prospects you pursue become leads.
3. On a 0-100 scale, with higher being better, ModelX rates prospect A as a 90, B as 50, and C as 20. In the past we've determined that this model is correct 80% of the time.
4. On a 0-100 scale, with higher being better, ModelY rates prospect A as a 40, B as 60, and C as 20. In the past we've determined that this model is correct 60% of the time.
5. Prospect A has purchased from us in the past, and we know that in those cases the chance of a prospect becoming a lead increases by 50%.
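The five points above could be combined along the same lines as the cookie example, though only under some explicit assumptions: the sketch below treats each model's 0-100 score as a probability (score/100), uses each model's historical accuracy as its pooling weight on the log-odds scale, and reads "increases by 50%" as multiplying the odds by 1.5. All three readings are assumptions, not the only possible ones:

```python
import math

def log_odds(p):
    """Convert a probability to log-odds."""
    return math.log(p / (1 - p))

def sigmoid(x):
    """Convert log-odds back to a probability."""
    return 1 / (1 + math.exp(-x))

PRIOR = 0.10  # base rate: 10% of pursued prospects convert

# (per-prospect probability estimates, pooling weight).
# Scores are mapped to probabilities as score/100, and each model's
# historical accuracy is used directly as its weight -- both are
# modeling assumptions.
models = [
    ({"A": 0.90, "B": 0.50, "C": 0.20}, 0.8),  # ModelX, right 80% of the time
    ({"A": 0.40, "B": 0.60, "C": 0.20}, 0.6),  # ModelY, right 60% of the time
]

# Reading "chance increases by 50%" as an odds multiplier of 1.5
# for prospects with a past purchase.
past_purchase_odds_lift = {"A": 1.5, "B": 1.0, "C": 1.0}

posterior = {}
for prospect in ["A", "B", "C"]:
    z = log_odds(PRIOR)
    for scores, weight in models:
        z += weight * (log_odds(scores[prospect]) - log_odds(PRIOR))
    z += math.log(past_purchase_odds_lift[prospect])
    posterior[prospect] = sigmoid(z)
    print(prospect, round(posterior[prospect], 3))
```

With these assumptions A ranks first, B second, C third; the point of the sketch is the mechanics of blending prior, weighted model opinions, and an extra odds adjustment, not the specific numbers.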

As you can see, this is not really about predicting the similarity between items/individuals, which can easily be done with non-Bayesian collaborative filtering/clustering algorithms. The weighting comes into play in terms of the trust level I should assign to a given input. If a weatherman with a PhD and 30 years of experience tells me there is a 70% chance of rain tomorrow, I should probably weigh that input more highly than if a 10-year-old who rarely ventures outside tells me the same thing. There are plenty of other real-world examples where this sort of scenario plays out; say, a national security context where experienced analyst A says there's an 80% chance of war and neophyte analyst B says it is only 20%.

My question, then, is whether there is an applicable Bayesian approach to this type of decision-making process and, if so, what it might look like.