
Thread: Phi and Cramer's V

  1. #1

    Phi and Cramer's V




    There must be something I'm not getting about these measures of association for nominal variables. Please help me understand.

    What I've got so far is that (1) they are based on chi-squared, and the X^2 value is calculated under the expectation that the variables are independent of each other {i.e., not associated}; (2) phi {for 2x2 tables} is the square root of X^2 divided by the total number of observations {Cramer's V, for bigger tables, is slightly more complicated, with an additional division by the lesser of the number of rows or columns, minus one}; and (3) the possible range is 0 to 1. So how can I get phi values > 1?
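For reference, the two definitions described in words above can be written out as below. The symbols N (total number of observations), r (number of rows), and c (number of columns) are labels chosen here for illustration, not notation from the post.

```latex
\phi = \sqrt{\frac{X^2}{N}},
\qquad
V = \sqrt{\frac{X^2}{N \,\bigl(\min(r, c) - 1\bigr)}}
```

For a 2x2 table, min(r, c) - 1 = 1, so V reduces to phi.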

    Say the question is the association between a particular surname and Y-DNA matches. The surname has a frequency within the population of 0.3% {making it one of the more common}, so the random-chance expected frequency would be 0.003. Of 795 Y-DNA matches observed, 147 agree with the surname and 648 disagree. That gives us a table like this:
                Observed   Expected   (O-E)^2/E
    Agree            147          3        6912
    Disagree         648        792          26
    Total            795        795        6938

    Phi = SQRT(X^2/n) = SQRT(6938/795) = 2.95 ???
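The arithmetic above can be checked with a short sketch. The variable names are illustrative, and the expected counts 3 and 792 are taken exactly as given in the table rather than recomputed.

```python
import math

# Observed and expected counts exactly as given in the post's table
observed = [147, 648]
expected = [3, 792]

# Per-category contributions (O - E)^2 / E
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi2 = sum(contributions)
n = sum(observed)

# The quantity the post labels "Phi": sqrt(X^2 / n)
ratio = math.sqrt(chi2 / n)

print("contributions:", [round(c, 2) for c in contributions])
print("X^2:", round(chi2, 2), "n:", n, "sqrt(X^2/n):", round(ratio, 2))
```

This only confirms the arithmetic; it says nothing about whether this X^2 is the right input for phi.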

    Thanks in advance for helping me see where I've gone wrong.
    -rt_/)
    Last edited by rt-sails; 10-27-2011 at 06:10 PM. Reason: get table right

  2. #2
    Super Moderator
    Dragan

    Re: Phi and Cramer's V

    Your fundamental problem is that you are not computing a chi-square test of independence based on a 2x2 contingency table. Rather, you are computing a basic chi-square goodness-of-fit test with 2 categories.

  3. #3

    Re: Phi and Cramer's V

    I would appreciate it if you could explain how a "chi-square test of independence" differs from a "basic chi-square goodness-of-fit test" when the hypothesis is that the variables are independent.

    If the variables are independent, one would expect the name to occur in DNA matches no more frequently than in the population. Part of the problem may be that the expected frequency for the surname variable is so low (<5). But, that's built into the situation; the most common surname in the US (Smith) is held by less than one percent of the population (880:100,000); including variants (Smythe, etc.) adds only a small bit.

  4. #4
    Super Moderator
    Dragan

    Re: Phi and Cramer's V

    Quote Originally Posted by rt-sails View Post
    I would appreciate it if you could explain how a "chi-square test of independence" differs from a "basic chi-square goodness-of-fit test" when the hypothesis is that the variables are independent.
    No, based on your calculations, you are not testing whether 2 variables are independent of each other. Rather, you're testing how well the observed data "fit" the 2 expected frequencies you have provided, which is a different hypothesis.

    I would suggest that you find a textbook and review the differences between these two chi-square statistics.

    That said, the basic difference you need to understand is that a test of independence requires a contingency table, e.g., a 2x2 table with 4 observed counts and 4 expected frequencies. You don't have that.

    You have 2 observations and 2 expected frequencies.
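To make the distinction concrete, here is a minimal sketch of a chi-square test of independence on a 2x2 table, where the 4 expected frequencies come from the row and column totals rather than from an outside population rate. The counts are hypothetical, invented purely for illustration (the thread does not supply a 2x2 table), and the function names are likewise made up here.

```python
import math

def chi_square_independence(table):
    """Return (X^2, N) for an r x c table of counts, with expected
    frequencies computed from the table's own margins."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n  # expected under independence
            chi2 += (obs - exp) ** 2 / exp
    return chi2, n

def cramers_v(table):
    """Cramer's V; equals phi for a 2x2 table."""
    chi2, n = chi_square_independence(table)
    k = min(len(table), len(table[0]))
    return math.sqrt(chi2 / (n * (k - 1)))

# HYPOTHETICAL counts, not data from the thread
example = [[30, 20],
           [10, 40]]
perfect = [[50, 0],
           [0, 50]]  # every row category maps to exactly one column category

print("phi (example):", round(cramers_v(example), 4))
print("phi (perfect association):", round(cramers_v(perfect), 4))
```

With expected frequencies derived from the margins like this, the perfectly associated table gives X^2 equal to N, so phi tops out at 1.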

  5. #5

    Re: Phi and Cramer's V


    I figured it out from an analytic geometry viewpoint:
    With only one degree of freedom, the chi-squared distribution curve is hyperbolic in shape and asymptotic to both the X and Y axes. Therefore, the X^2 value can exceed N. When X^2 > N, X^2/N > 1 and the square root is also > 1.

    (The statistics texts didn't help much, except for the graphics of the distributions.)
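As a supplementary sketch, separate from the poster's geometric argument, the ceiling on X^2 can be worked out algebraically for the goodness-of-fit case. The symbols O_i (observed counts), p_i (hypothesized proportions), and p_min (the smallest of them) are notation assumed here.

```latex
X^2 = \sum_i \frac{(O_i - N p_i)^2}{N p_i}
    = \sum_i \frac{O_i^2}{N p_i} - N
\;\le\; \frac{N^2}{N p_{\min}} - N
    = N \, \frac{1 - p_{\min}}{p_{\min}}
```

The upper limit is reached when every observation falls in the category with the smallest hypothesized proportion, so X^2/N can exceed 1 whenever p_min < 1/2. By contrast, for a test of independence on an r x c table, X^2 is at most N(min(r, c) - 1), which is what keeps phi and Cramer's V between 0 and 1.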
