Is there a way that I can calculate the probability that a data point would fall in the region where two normal distributions with different means and standard deviations overlap?
Sure. The probability is 1. Both normal distributions are defined on the entire real line.
I guess you'll have to explain what you mean a little more clearly. Also, what distribution are you calculating the probability with respect to?
Sorry for not explaining my question better. I have two normal distribution curves, each one with a different mean and standard deviation. I want to limit each curve to 3*sigma on each side so they don't extend to infinity. From here, I want to calculate the probability that a data point falls in the area where the limited curves overlap.
Mean 1: 54.94
Sigma 1: .663
Mean 2: 61.7556
Sigma 2: 8.02087
What distribution do you want to calculate that probability with respect to? You have two probability distributions that you specify and the probability of falling in an interval will be different depending on which distribution you're using to calculate the probability.
I'm not sure I understand your question. If I am trying to find something in the range that the curves overlap, wouldn't you use both distributions?
And how do you suggest we use both distributions? I mean we need both distributions to figure out the interval that they overlap for your specifications but... consider this.
(Fake numbers just to make a point)
Men's heights are distributed Normal with mean 74 and sd 6. The scores in the stats class that I teach on the previous exam were distributed Normal with mean 65 and sd 10.
What's the probability of a number falling in the interval where they overlap? <- This is the question you're asking. And I'm saying ok... I guess we can figure out the interval, but we can only calculate probabilities with respect to some distribution. So do we want to know the probability of a new observation from the men's heights falling in the interval... or do we want to know the probability of a new score on the exam falling in the interval? Two different questions that will give us two different answers. Or do we want to know the probability that a woman's height will fall in that interval... in which case we would calculate a probability with respect to the distribution of women's heights.
The question you're asking is meaningless without specifying which distribution you want to calculate probabilities with respect to. It's easy enough to use the information you gave us to calculate the interval but without a probability distribution we can't assign a probability to that interval.
Thank you for clarifying. I would want to use the distribution with the mean of 61.7556 and SD of 8.02087. Also, how can I calculate the interval where they overlap? Is there an equation for that?
Ha. Yeah, they were not too pleased (but neither was I). I like it more like this though, easier to tell who actually understands the material if you spread the material out. Of course things get curved eventually but they get a good scare before I tell them that.
Well you already told us what you're looking for. Why don't you figure out what the 3*sigma interval looks like for each distribution and then just look where they overlap.
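The suggestion above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`normal_cdf`, `three_sigma_interval`, `overlap`, `prob_in_interval`) are made up for this example, the numbers are the ones posted earlier in the thread, and the probability is taken with respect to whichever distribution is passed in, without renormalizing for the 3*sigma cutoff.

```python
import math

def normal_cdf(x, mean, sd):
    """P(X <= x) for X ~ Normal(mean, sd), computed with the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def three_sigma_interval(mean, sd):
    """Interval mean +/- 3*sd as a (low, high) tuple."""
    return (mean - 3.0 * sd, mean + 3.0 * sd)

def overlap(a, b):
    """Intersection of two (low, high) intervals, or None if they are disjoint."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo < hi else None

def prob_in_interval(interval, mean, sd):
    """Probability that X ~ Normal(mean, sd) lands inside the interval."""
    lo, hi = interval
    return normal_cdf(hi, mean, sd) - normal_cdf(lo, mean, sd)

# Values posted in the thread.
m1, s1 = 54.94, 0.663
m2, s2 = 61.7556, 8.02087

region = overlap(three_sigma_interval(m1, s1), three_sigma_interval(m2, s2))
p_wrt_2 = prob_in_interval(region, m2, s2)  # with respect to distribution 2
p_wrt_1 = prob_in_interval(region, m1, s1)  # with respect to distribution 1
print(region, p_wrt_2, p_wrt_1)
```

With these numbers the narrow curve's 3*sigma interval sits entirely inside the wide one's, so the overlap is just the narrow interval. The probability is about 0.14 under the second distribution and about 0.997 under the first, which is why the choice of distribution has to be stated.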
I think that Dason is setting up a hypothesis test type of scenario whereas boiler is asking a different question entirely. Boiler's question is really a calculus problem. I think that you could solve the problem in the following way:
Suppose v ~ N(m1, s1) and w ~ N(m2, s2). To calculate P(v > w), do the following:
D1 = Fv - Fw (from a to m), where Fv is the CDF of v and Fw is the CDF of w
D2 = Fw - Fv (from m to b)
where a = -inf, m = arg(P(v) = P(w)) (the point where the two densities are equal), and b = inf
P(v > w) = 2 - (D1 + D2)
I don't believe I did anything with hypothesis testing in any of my replies. Their question was how to calculate the area of an overlap, so you could consider it a calculus question, but in reality you're not going to do any integration.
Where are you getting the idea they want the probability that v will be larger than w? If that's what they want then it seems you might have a working approach but your notation is hard to follow and I think there are easier ways of getting that probability.
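One such easier route, if v and w are assumed to be independent (an assumption nobody in the thread actually stated), is to note that v - w is itself normal with mean m1 - m2 and standard deviation sqrt(s1^2 + s2^2), so P(v > w) is just P(v - w > 0). A minimal sketch, with a hypothetical helper name:

```python
import math

def prob_v_greater_than_w(m1, s1, m2, s2):
    """P(v > w) for independent v ~ Normal(m1, s1) and w ~ Normal(m2, s2).

    Independence is an assumption of this sketch; with correlated
    variables the standard deviation of v - w would be different.
    """
    diff_mean = m1 - m2
    diff_sd = math.sqrt(s1 ** 2 + s2 ** 2)
    z = diff_mean / diff_sd
    # P(v - w > 0) equals the standard normal CDF evaluated at z.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Using the means and standard deviations posted in the thread.
print(prob_v_greater_than_w(54.94, 0.663, 61.7556, 8.02087))
```

Note that this answers a different question from the overlap area, which is the distinction being argued about here.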
I wanted to solve this same problem for a work-related project. Overlapping normal distribution curves could be a very powerful tool for data comparison. Much easier to comprehend directly than the student's t test and all the other statistical hyperbole.
I'm surprised that other replies did not seem to understand the request. In statistics everybody tries to apply the 'learned by rote' solutions without looking at what is really being examined.
The area of intersecting normal distribution curves is a straightforward geometry problem. Sample size is not relevant. Start with two means and two standard deviations.
The algebra was kind of messy. The equation for a normal distribution curve is (1/SQRT(2*PI*StDev^2)) * e^(-(x-mean)^2/(2*StDev^2)). Looks even messier in text.
Plug mean1 and StDev1 into that equation and set it equal to the same equation with mean2 and StDev2, then solve for x. After taking ln() of both sides you end up with a quadratic equation in x with a ln() term in the constant. Solve it with the good old quadratic formula and you get the x coordinates where the curves intersect. To find the area, put that x into the error function: 0.5 - 0.5*erf(ABS(x-mean)/(StDev*SQRT(2))). That gives the area under the tail of one curve beyond x. Use the other mean and StDev to get the area in the other direction. There are really two points of intersection, but often the other point is so far out that the area beyond it is negligible. The error function is the same as integrating the curve, but the integral has no straightforward algebraic solution. That's why everybody looks it up in a table or uses a calculator (Excel has the error function). It can be evaluated with a Taylor series, and the result is an infinite series.
If anybody is still interested in this problem, post a reply and I will email the equation in more readable form. I set it up in Excel and use it every day to compare data at work.
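For anyone who prefers code to a spreadsheet, the procedure described above can be sketched in Python roughly as follows. This is not the poster's Excel sheet; the function names are invented for the example, and it handles both intersection points rather than dropping the far one. Taking logs of both densities and collecting terms gives a*x^2 + b*x + c = 0 with the coefficients shown in the code.

```python
import math

def log_pdf(x, mean, sd):
    """Log of the normal density, dropping the shared -0.5*ln(2*pi) term."""
    return -math.log(sd) - (x - mean) ** 2 / (2.0 * sd ** 2)

def normal_cdf(x, mean, sd):
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def intersections(m1, s1, m2, s2):
    """x values where the two normal curves cross (0, 1 or 2 points)."""
    a = 1.0 / (2.0 * s1 ** 2) - 1.0 / (2.0 * s2 ** 2)
    b = m2 / s2 ** 2 - m1 / s1 ** 2
    c = m1 ** 2 / (2.0 * s1 ** 2) - m2 ** 2 / (2.0 * s2 ** 2) + math.log(s1 / s2)
    if abs(a) < 1e-12:           # equal standard deviations: linear equation
        if abs(b) < 1e-12:       # identical curves: no isolated crossing
            return []
        return [-c / b]
    root = math.sqrt(b * b - 4.0 * a * c)
    return sorted([(-b - root) / (2.0 * a), (-b + root) / (2.0 * a)])

def overlap_area(m1, s1, m2, s2):
    """Area under min(pdf1, pdf2), i.e. the region shared by both curves."""
    xs = intersections(m1, s1, m2, s2)
    if not xs:
        return 1.0
    edges = [-math.inf] + xs + [math.inf]
    total = 0.0
    for lo, hi in zip(edges, edges[1:]):
        # Probe one point inside the segment to see which curve is lower.
        if lo == -math.inf:
            probe = hi - 1.0
        elif hi == math.inf:
            probe = lo + 1.0
        else:
            probe = 0.5 * (lo + hi)
        if log_pdf(probe, m1, s1) <= log_pdf(probe, m2, s2):
            mean, sd = m1, s1
        else:
            mean, sd = m2, s2
        total += normal_cdf(hi, mean, sd) - normal_cdf(lo, mean, sd)
    return total

# Values posted earlier in the thread.
print(intersections(54.94, 0.663, 61.7556, 8.02087))
print(overlap_area(54.94, 0.663, 61.7556, 8.02087))
```

Comparing log densities at a probe point avoids having to reason case by case about which curve is on top in each segment. The check for equal standard deviations uses a small tolerance chosen arbitrarily for this sketch.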
Of course some people with only a vague understanding of what they were taught apply things by rote. Please don't assume that everybody here takes that approach.
I guess I'm going to ask you what you think is being examined by calculating this area.
I don't have emotions and sometimes that makes me very sad.
There is a nifty applet at http://dmbru.net/Critical_Thinking/N...ersection.html that shows the problem and solution clearly, but it doesn't give the math behind it. There was an identical question on this forum at http://www.talkstats.com/showthread....-distributions with equally helpful replies. Neither person stated why they wanted the solution.
I've never thought this to be a difficult problem but I typically refuse to give an actual solution because of what I quoted. I just don't see why somebody would want this for a practical application. If somebody can explain that to me then I might help (or come up with a better way to assess what they're interested in).