# Thread: Overlap of Two Normally Distributed Samples

1. ## Overlap of Two Normally Distributed Samples

Hi,

I am trying to determine the percentage of overlap of two normally distributed variables (the darker pink area below). I am using a very straightforward example in which the only difference is the location of the mean; everything else is the same for both variables. See the code below, which is in SAS but should be readable by anyone. The second part is a formula that I found online to determine the overlap. The formula generates a very small value in comparison to the overlap in the histograms, so I am thinking that it is the overlap past the other variable's mean. Is this correct?

If so, how do I get the overall overlap of the two samples? Is it '- mean2 + (3(sigma))'? I saw that you can do some type of integral procedure in R, but I was hoping to run something simple in SAS. I welcome all replies and suggestions.

Thanks!

Code:
```
%LET Mean1   =    200;
%LET Var1    =    15;
%LET N       =    150;
%LET Mean2   =    150;
%LET Var2    =    15;
%LET N       =    150;

/*if both normal distributions have the same sigma*/
data test;
    d = (&mean1 - &mean2) / (sqrt(&var1));
    Overlap = 2 * probnorm(-abs(d)/2);
run;
```
Overlap: 1.0824E-10
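
For reference, the formula in the code is the overlapping coefficient of two normal densities that share a standard deviation $\sigma$. The value above follows from reading 15 as a variance, so that $\sigma = \sqrt{15}$. The last line below is only a what-if: it assumes 15 was instead meant as the standard deviation, which the post does not state.

$$
\begin{aligned}
\mathrm{OVL} &= 2\,\Phi\!\left(-\frac{|\mu_1 - \mu_2|}{2\sigma}\right) \\
\sigma = \sqrt{15} \approx 3.873:\quad \mathrm{OVL} &= 2\,\Phi\!\left(-\frac{50}{2 \cdot 3.873}\right) = 2\,\Phi(-6.455) \approx 1.08 \times 10^{-10} \\
\sigma = 15 \ \text{(assumed)}:\quad \mathrm{OVL} &= 2\,\Phi\!\left(-\frac{50}{2 \cdot 15}\right) = 2\,\Phi(-1.667) \approx 0.096
\end{aligned}
$$

Under the first reading the two means sit about 12.9 standard deviations apart, which is why the result is so small.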

2. ## Re: Overlap of Two Normally Distributed Samples

It seems that if the two distributions are approximately equal and one is just shifted over, then once you determine where the two overlaid curves intersect, you could say: red tail area after that point + blue tail area prior to that point. So the intersection point is the maximum red value and the maximum blue value that they both have in common. Now, if that makes sense, how do I get that point, along with the areas?
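
The tail-area idea can be sketched in a short data step. This is a minimal sketch, not a definitive answer: it assumes both curves are normal with the same standard deviation, in which case the two densities cross at the midpoint of the means. The macro variable `Sd`, the dataset name `tails`, and the reading of 15 as a standard deviation are all assumptions made for illustration.

```sas
%LET Mean1 = 200;
%LET Mean2 = 150;
%LET Sd    = 15;   /* assumption: 15 read as a standard deviation */

data tails;
    /* with equal SDs the two normal densities cross at the midpoint of the means */
    cross      = (&Mean1 + &Mean2) / 2;
    /* area of the higher-mean curve that lies below the crossing point */
    lower_tail = cdf('NORMAL', cross, &Mean1, &Sd);
    /* area of the lower-mean curve that lies above the crossing point */
    upper_tail = 1 - cdf('NORMAL', cross, &Mean2, &Sd);
    Overlap    = lower_tail + upper_tail;
run;

proc print data=tails;
run;
```

With unequal standard deviations the curves can cross at two points, so the midpoint shortcut would no longer apply.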

3. ## Re: Overlap of Two Normally Distributed Samples

They "overlap" on the entire real line. Are you just looking for the area under the min of the two PDFs?
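
If the area under the minimum of the two PDFs is the quantity of interest, it can be approximated numerically in a data step. The following is a rough sketch under stated assumptions: the variable names, the grid step of 0.01, the integration range of 8 standard deviations around each mean, and the reading of 15 as a standard deviation are all illustrative choices, not something taken from the thread.

```sas
data ovl_numeric;
    mean1 = 200; sd1 = 15;   /* assumption: 15 read as a standard deviation */
    mean2 = 150; sd2 = 15;

    /* integrate over a range wide enough to capture essentially all of both curves */
    lo   = min(mean1 - 8*sd1, mean2 - 8*sd2);
    hi   = max(mean1 + 8*sd1, mean2 + 8*sd2);
    step = 0.01;

    /* Riemann sum of the pointwise minimum of the two normal densities */
    Overlap = 0;
    do x = lo to hi by step;
        Overlap = Overlap + min(pdf('NORMAL', x, mean1, sd1),
                                pdf('NORMAL', x, mean2, sd2)) * step;
    end;

    keep Overlap;
run;

proc print data=ovl_numeric;
run;
```

This approach does not require equal standard deviations. When they are equal, the result should land close to the closed-form value `2 * probnorm(-abs(mean1 - mean2) / (2 * sd1))`.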

4. ## Re: Overlap of Two Normally Distributed Samples

I will try to paraphrase: given the moments provided above, what percentage of the first sample would be expected to overlap the second sample? As a generic example, if I had two samples (men's and women's body weights), what percentage of men and women should have the same weights? That is the value for the dark pink area in the graph.

5. ## Re: Overlap of Two Normally Distributed Samples

That doesn't seem to be a well-defined problem. Your first question seemed to be in terms of populations and their PDFs, and this latest one refers to samples.

As I mentioned before, though, it's important to keep in mind that both distributions, if we assume them to be normal, have support on the entire real line, so all values are possible for both distributions.

It sounds like you might be asking about expectations with respect to the min/max for the corresponding distributions though.

7. ## Re: Overlap of Two Normally Distributed Samples

Yes, you are correct on the real line component.

You were also correct that I was not very specific in my description. I was simulating a fictitious scenario and trying to say that the two simulated distributions have the following overlap, so it was sample-specific. Though given some of the other things I am working on, I need to think about what I really want. Thanks.
