Thread: geometric argument for constraints on r_xz given r_xy and r_yz

  1. #1
    Cookie Scientist
    Jake

    Today a labmate asked the following question: if we have three random variables x, y, z, and we know the correlations r_{xy} and r_{yz}, what constraints if any does this place on the correlation r_{xz}?

    At the time I reflexively answered that the remaining correlation must be the product of the two known correlations. Which of course is totally wrong. I think I was getting some mental interference from some of the equations for simple mediation floating around in my head. Anyway, after thinking about it for a while I have come up with a convincing geometric argument for what the constraints actually are. I have also verified that my answer agrees with a more complicated-looking answer to this question that I found elsewhere online.

    Because I ended up spending a lot of time on this and I thought some of you would find the results interesting, I thought I would share my work here. Comments are welcome!

    Okay. So imagine our variables x, y, z as vectors in n-dimensional space (one coordinate per observation, with each variable centered at its mean). The Pearson correlation coefficient between any two of these variables can then be interpreted as the cosine of the angle between the corresponding vectors. This is an interesting and well-known geometric fact about correlation coefficients.
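
    As a quick illustration of the cosine interpretation, here is a minimal sketch in plain Python. The data values and the helper names (center, cosine, pearson) are made up for this example and are not from the original post. It shows that the Pearson correlation matches the cosine of the angle between the mean-centered vectors, and that the raw (uncentered) vectors generally give a different cosine.

```python
import math

def center(v):
    """Subtract the mean from every coordinate."""
    m = sum(v) / len(v)
    return [a - m for a in v]

def cosine(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def pearson(u, v):
    """Pearson correlation from the usual covariance / sd definition."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)

# Hypothetical data, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 6.0]

r = pearson(x, y)
cos_centered = cosine(center(x), center(y))
cos_raw = cosine(x, y)
print(r, cos_centered, cos_raw)
```

    The first two printed values should agree up to rounding, while the third differs because the raw vectors have not been centered.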

    So now imagine that the x and y vectors are fixed (and hence so is their correlation), but that the vector z is free to vary so long as r_{yz} is constant. This constraint on r_{yz} means that the set of possible z vectors will form a sort of "cone" around the y vector, as in the following image:

    Now it is intuitively obvious (I know this is a sneaky phrase, but that's why I call this just an "argument" and not a "proof") that the two possible z vectors that will lead to the minimum/maximum values of r_{xz} are the z vectors that lie on the same plane as the x and y vectors. This leads to the following expression for the minimum/maximum values of r_{xz} given r_{xy} and r_{yz}:

    cos[arccos(r_{xy}) \pm arccos(r_{yz})]
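
    The expression above can be sketched as a small Python function. The function name rxz_bounds_geometric is just a label chosen here, and the inputs are assumed to lie in [-1, 1]. Since both arccos values fall in [0, pi], the "plus" branch always gives the lower bound and the "minus" branch the upper bound.

```python
import math

def rxz_bounds_geometric(r_xy, r_yz):
    """Return (min, max) of r_xz from the coplanar-vectors argument.

    Assumes -1 <= r_xy <= 1 and -1 <= r_yz <= 1.
    """
    angle_xy = math.acos(r_xy)  # angle between x and y, in [0, pi]
    angle_yz = math.acos(r_yz)  # angle between y and z, in [0, pi]
    lower = math.cos(angle_xy + angle_yz)  # z on the far side of y from x
    upper = math.cos(angle_xy - angle_yz)  # z on the same side of y as x
    return lower, upper

# Both known correlations zero: r_xz is unconstrained.
print(rxz_bounds_geometric(0.0, 0.0))
# Two fairly strong positive correlations: r_xz is forced to be non-negative
# (up to floating-point rounding).
print(rxz_bounds_geometric(0.8, 0.6))
```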

    One notable result following from this is that if x is orthogonal to y and y is orthogonal to z (that is, r_{xy} = r_{yz} = 0), then there is no constraint on r_{xz}: it can be anywhere from -1 to +1. But under any other circumstances, fixing r_{xy} and r_{yz} will place some constraint on the range of r_{xz}.

    Okay, now for the verification part, which requires a bit of math.

    So in this stats.stackexchange.com thread (LINK), it is stated that the three correlations must satisfy

    1+2r_{xy}r_{xz}r_{yz}-(r_{xy}^2+r_{xz}^2+r_{yz}^2) \ge 0

    The reasoning here being "because this is the determinant of the correlation matrix and it cannot be negative." Anyway, this can be viewed as a quadratic inequality in r_{xz}, already in standard form:

    (-1)r_{xz}^2 + (2r_{xy}r_{yz})r_{xz} + (1 - r_{xy}^2 - r_{yz}^2) \ge 0

    So if we apply the quadratic formula and simplify the result, we get the following for the minimum/maximum values of r_{xz}:

    r_{xy}r_{yz} \pm \sqrt{(1-r_{xy}^2)(1-r_{yz}^2)}
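
    For anyone who wants the intermediate algebra, here is a sketch of the quadratic-formula step. It takes the coefficients of the quadratic in r_{xz} to be a = -1, b = 2r_{xy}r_{yz}, c = 1 - r_{xy}^2 - r_{yz}^2 (the labels a, b, c are just the usual quadratic-formula names, introduced here).

```latex
\begin{align*}
r_{xz}
  &= \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}
   = \frac{-2r_{xy}r_{yz} \pm \sqrt{4r_{xy}^2 r_{yz}^2 + 4\,(1 - r_{xy}^2 - r_{yz}^2)}}{-2} \\
  &= r_{xy}r_{yz} \mp \sqrt{1 - r_{xy}^2 - r_{yz}^2 + r_{xy}^2 r_{yz}^2} \\
  &= r_{xy}r_{yz} \mp \sqrt{(1 - r_{xy}^2)(1 - r_{yz}^2)}
\end{align*}
```

    Because the leading coefficient is negative, the parabola opens downward, so the inequality holds exactly for r_{xz} between the two roots. The order of the \pm and \mp signs does not matter here, since both roots are taken.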

    Now taking my answer and applying the trig identity cos(a \pm b) = cos(a)cos(b) \mp sin(a)sin(b) we get

    r_{xy}r_{yz} \pm sin(arccos(r_{xy}))sin(arccos(r_{yz}))

    Now applying the identity sin(x) = \sqrt{1 - [cos(x)]^2}, which holds here because arccos returns angles in [0, \pi] where the sine is non-negative, we get

    r_{xy}r_{yz} \pm \sqrt{(1-r_{xy}^2)(1-r_{yz}^2)}

    Which is the answer we got from the stackexchange post. So our simpler, geometrically based answer agrees with the more conventional answer that is harder to understand.
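
    The agreement can also be checked numerically. The following is only a sanity-check sketch, with function names and the grid of test values chosen here for illustration. It compares the two forms of the bounds over a grid of correlations and confirms that the determinant expression is (numerically) zero at each bound.

```python
import math

def bounds_geometric(r_xy, r_yz):
    """Bounds on r_xz from cos[arccos(r_xy) +/- arccos(r_yz)]."""
    a, b = math.acos(r_xy), math.acos(r_yz)
    return math.cos(a + b), math.cos(a - b)

def bounds_algebraic(r_xy, r_yz):
    """Bounds on r_xz from the roots of the determinant quadratic."""
    mid = r_xy * r_yz
    half_width = math.sqrt((1 - r_xy ** 2) * (1 - r_yz ** 2))
    return mid - half_width, mid + half_width

def det3(r_xy, r_xz, r_yz):
    """Determinant of the 3x3 correlation matrix."""
    return 1 + 2 * r_xy * r_xz * r_yz - (r_xy ** 2 + r_xz ** 2 + r_yz ** 2)

grid = [i / 10 for i in range(-9, 10)]  # -0.9, -0.8, ..., 0.9
max_gap = 0.0   # largest disagreement between the two forms
max_det = 0.0   # largest |determinant| at a bound
for r_xy in grid:
    for r_yz in grid:
        g_lo, g_hi = bounds_geometric(r_xy, r_yz)
        a_lo, a_hi = bounds_algebraic(r_xy, r_yz)
        max_gap = max(max_gap, abs(g_lo - a_lo), abs(g_hi - a_hi))
        max_det = max(max_det, abs(det3(r_xy, a_lo, r_yz)),
                      abs(det3(r_xy, a_hi, r_yz)))

print(max_gap, max_det)  # both should be at floating-point noise level
```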
    “In God we trust. All others must bring data.”
    ~W. Edwards Deming

  2. The Following User Says Thank You to Jake For This Useful Post:

    TheEcologist (09-17-2013)

  3. #2
    Devorador de queso
    Dason

    Quote Originally Posted by Jake View Post
    Which is the answer we got from the stackexchange post. So our simpler, geometrically based answer agrees with the more conventional answer that is harder to understand.
    Nice work. I would argue against your last statement that the conventional answer is "harder to understand". I will admit that it's harder to interpret geometrically - but algebraically we know that this constraint (non-negative determinant) must hold. So that argument is really just mindlessly going through the motions starting from that one fact that we know.

    To be honest, that alternative approach is ultimately where I ended up when I was thinking about the problem. My first thought was just to find the values that make the correlation matrix positive semi-definite. After *a little* bit of work we can show that this is equivalent to the determinant criterion in this case. I like your approach more - geometric approaches always seem more creative.
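
    One way to sketch the positive semi-definiteness check in code is via principal minors: a symmetric matrix is positive semi-definite exactly when all of its principal minors are non-negative. For a 3x3 correlation matrix the 1x1 minors are all 1 and the 2x2 minors are 1 - r^2, so once each correlation lies in [-1, 1] the only remaining condition is the determinant. The function name and the tolerance below are assumptions made for this illustration.

```python
def is_valid_correlation_matrix(r_xy, r_xz, r_yz, tol=1e-12):
    """Check positive semi-definiteness of the 3x3 correlation matrix
    [[1, r_xy, r_xz], [r_xy, 1, r_yz], [r_xz, r_yz, 1]]
    using its principal minors (the 1x1 minors are all equal to 1).
    """
    # 2x2 principal minors: non-negative iff each |r| <= 1.
    minors_2x2 = [1 - r_xy ** 2, 1 - r_xz ** 2, 1 - r_yz ** 2]
    # 3x3 principal minor: the determinant of the whole matrix.
    det = 1 + 2 * r_xy * r_xz * r_yz - (r_xy ** 2 + r_xz ** 2 + r_yz ** 2)
    return all(m >= -tol for m in minors_2x2) and det >= -tol

# With r_xy = 0.8 and r_yz = 0.6, r_xz must lie between 0 and 0.96.
print(is_valid_correlation_matrix(0.8, 0.5, 0.6))   # inside the range
print(is_valid_correlation_matrix(0.8, -0.5, 0.6))  # outside the range
```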
    I don't have emotions and sometimes that makes me very sad.

  4. #3
    TS Contributor

    Here we just view the covariance as an inner product, so that the correlation is just the cosine of the included angle, \cos\theta. It is always interesting to keep this geometric interpretation in mind. And actually there are quite a lot of tools from Hilbert space theory that can be used in a probability context.

  5. #4

    Hi,

    I have a somewhat similar and easier question about relationships between correlations that I can't figure out, would any of you be able to put me on the right track?

    I have two variables, x and z, for which Corr(x,z) > 0. I define a new variable y = x - z. I want to know what are the constraints on Corr(x,y) given the above fact about Corr(x,z). Intuitively it seems like it must hold that Corr(x,y) >= Corr(x,z). I can't show this mathematically however.

    Any ideas?

  6. #5
    TS Contributor
    spunky

    Quote Originally Posted by oracle133 View Post
    Hi,

    I have a somewhat similar and easier question about relationships between correlations that I can't figure out, would any of you be able to put me on the right track?

    I have two variables, x and z, for which Corr(x,z) > 0. I define a new variable y = x - z. I want to know what are the constraints on Corr(x,y) given the above fact about Corr(x,z). Intuitively it seems like it must hold that Corr(x,y) >= Corr(x,z). I can't show this mathematically however.

    Any ideas?
    well... i don't think that's necessarily true. if x=z then your first condition of Cor(x,z)>0 holds (because Cor(x,z) = Cor(z,z) = 1) but then y = x-z = 0 is a constant, so Cor(x,y) = Cor(x,0), which is undefined (or 0 if you adopt that convention). either way it can't be >= 1, so the claimed inequality fails.

    i feel like you would need to tell us more about the relationship between x, y and z to come up with suitable boundaries.
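
    to make the dependence on the extra information concrete, here is a small sketch. it uses the standard covariance algebra Cov(x, x-z) = Var(x) - Cov(x,z) and Var(x-z) = Var(x) + Var(z) - 2Cov(x,z). the function name and the example numbers are made up for illustration; nothing here comes from the original question beyond y = x - z.

```python
import math

def corr_x_with_diff(sd_x, sd_z, r_xz):
    """Corr(x, y) for y = x - z, given the two standard deviations
    and Corr(x, z). Raises ValueError when y has zero variance,
    because the correlation is undefined in that case.
    """
    cov_xz = r_xz * sd_x * sd_z
    var_y = sd_x ** 2 + sd_z ** 2 - 2 * cov_xz
    if var_y <= 0:
        raise ValueError("y = x - z is constant; correlation undefined")
    cov_xy = sd_x ** 2 - cov_xz
    return cov_xy / (sd_x * math.sqrt(var_y))

# Equal spreads, strongly correlated: Corr(x, y) falls well below Corr(x, z).
print(corr_x_with_diff(1.0, 1.0, 0.9))
# z more spread out than x: Corr(x, y) even turns negative.
print(corr_x_with_diff(1.0, 2.0, 0.9))
# Weakly correlated: here Corr(x, y) does exceed Corr(x, z).
print(corr_x_with_diff(1.0, 1.0, 0.1))
```

    so the answer depends on the ratio of the standard deviations as well as on Corr(x,z), which is why the sign of Corr(x,z) alone doesn't pin down a bound.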
    for all your psychometric needs! https://psychometroscar.wordpress.com/about/

  7. The Following User Says Thank You to spunky For This Useful Post:

    Jake (05-18-2015)

  8. #6
    Cookie Scientist
    Jake


    I replied to your other thread about this.
