# Thread: mode of the chi-square distribution

1. ## mode of the chi-square distribution

I don't understand why the mode of the chi-square dist is df-2. With df = 3 (four categories), the mode would be 1, that is, the distribution would have its highest point at 1, indicating that when the null is true, the most probable or most frequent chi-sq value would be 1. Shouldn't it be 0 instead, regardless of number of categories, since, when the null is true, the obtained frequencies would be equal to the expected frequencies? I surely am misunderstanding something somewhere... I would greatly appreciate an explanation.

2. ## Re: mode of the chi-square distribution

Originally Posted by joeygor
I don't understand why the mode of the chi-square dist is df-2. With df = 3 (four categories), the mode would be 1, that is, the distribution would have its highest point at 1, indicating that when the null is true, the most probable or most frequent chi-sq value would be 1. Shouldn't it be 0 instead, regardless of number of categories, since, when the null is true, the obtained frequencies would be equal to the expected frequencies? I surely am misunderstanding something somewhere... I would greatly appreciate an explanation.
What you need to understand is the concept of a sampling distribution that is associated with any particular sample statistic (e.g. chi-square, t, F, Z, etc.)

Basically, sampling distributions are derived assuming the null hypothesis is true. There will also be an expected value (mean) and a standard error associated with the sampling distribution. As such, the sample statistics (under the null) will vary due to random chance alone. They can be small (e.g. chi-square = 0) or large (e.g. chi-square = 18.5, just to throw a number out).

To use your example, the mean of a sampling distribution of a chi-square statistic with df=3 (assuming the null is true) will be 3 (not zero). Note that zero is a lower limit.

3. ## Re: mode of the chi-square distribution

Originally Posted by Dragan
What you need to understand is the concept of a sampling distribution that is associated with any particular sample statistic (e.g. chi-square, t, F, Z, etc.)

Basically, sampling distributions are derived assuming the null hypothesis is true. There will also be an expected value (mean) and a standard error associated with the sampling distribution. As such, the sample statistics (under the null) will vary due to random chance alone. They can be small (e.g. chi-square = 0) or large (e.g. chi-square = 18.5, just to throw a number out).

To use your example, the mean of a sampling distribution of a chi-square statistic with df=3 (assuming the null is true) will be 3 (not zero).
Sampling distributions can be based on all possible outcomes as well as on repeated sampling an infinite number of times, right? Is it not the case that with repeated sampling, when the null is true, the observed frequencies would most often be equal to the expected frequencies, in which case, their differences, and the squares of their differences, and the sum of these (the chi-square) would be equal to zero? Wouldn't this then be the modal probability value? Certainly there would be observed frequencies different from the respective expected frequencies, but they should of course be less frequent/probable than the expected frequencies, right?

Also, the mean and the mode of the chisq are not the same, right? The mean = df, the mode = df-2, and the median is approx df - 0.7. I am interested in the mode, that is, the value with the highest probability (which is the mean, and of course also the median, in a normal dist, but not in the chisq dist). Thanks Dragan, looking forward to your reply. I actually teach basic stats, and this part of the course confounds me.
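A quick way to compare the three summaries mentioned here is a small simulation. The sketch below is not from the thread. It assumes a chi-square value with `df` degrees of freedom can be generated as the sum of `df` squared independent standard normal draws, and the function names `simulate_chi_square` and `histogram_mode` are made up for illustration. The histogram-based mode is only a rough estimate and depends on the bin width.

```python
import random
import statistics


def simulate_chi_square(df, n_draws=50_000, seed=0):
    """Draw n_draws values of z1^2 + ... + z_df^2 with independent standard normal z's."""
    rng = random.Random(seed)
    return [sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df)) for _ in range(n_draws)]


def histogram_mode(values, bin_width=0.25):
    """Rough mode estimate: the midpoint of the fullest histogram bin."""
    counts = {}
    for v in values:
        b = int(v // bin_width)
        counts[b] = counts.get(b, 0) + 1
    fullest = max(counts, key=counts.get)
    return (fullest + 0.5) * bin_width


for df in (3, 5, 10):
    draws = simulate_chi_square(df)
    # Compare with the values quoted above: mean = df, median approx df - 0.7, mode = df - 2.
    print(df, round(statistics.mean(draws), 2),
          round(statistics.median(draws), 2), histogram_mode(draws))
```

With enough draws, the sample mean and median should land near df and df - 0.7, and the fullest bin should sit near df - 2 rather than at 0.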

4. ## Re: mode of the chi-square distribution

Originally Posted by joeygor
Is it not the case that with repeated sampling, when the null is true, the observed frequencies would most often be equal to the expected frequencies, in which case, their differences, and the squares of their differences, and the sum of these (the chi-square) would be equal to zero?
Well, no, in general this is not true, because the sampling distribution of a chi-square distribution becomes more normal-like as the degrees of freedom increase. And, further, the gap between the mode (df - 2) and the mean (df) stays at 2 while the spread grows (standard deviation = sqrt(2 df)), so the mode sits relatively closer to the mean as the number of df increases.

Note that a chi-square value of 0 is THE Lower limit of the sampling distribution.

I would also note that for the specific case of a sampling distribution that is chi-square on 2 df, what you are suggesting would be true: the mode is at 0.

5. ## Re: mode of the chi-square distribution

Originally Posted by Dragan
Well, no, in general this is not true, because the sampling distribution of a chi-square distribution becomes more normal-like as the degrees of freedom increase.
You might be misunderstanding me here. I was referring to repeated sampling as the basis of the null distribution (the sampling distribution when the null is true), as opposed to all possible outcomes (tossing ten coins an infinite number of times versus listing down all possible outcomes of tossing ten coins). I was not referring --- as it seems to me you do in the quotation above --- to increasing the number of categories, which would result in the df increasing (to k-1, where k is the number of categories).

When the number of categories = 3, df = 2, and the mode = df-2 = 0, that is, the most probable chi-square value is zero. I would think this is because the observed freqs are most often equal to the expected freqs. However, and this is my confusion, this, it seems to me, should ALWAYS be the case, regardless of number of categories (and hence df), so that, as I said, it seems to me that the mode should always be zero.

6. ## Re: mode of the chi-square distribution

If you are teaching stats, it's really important to "get" this, so you can pass that understanding on to your students.

What's going on here is basically related to counting. The only way to get X^2 = 0 is to have every contributing z = 0 exactly. For any non-zero X^2, though, there is an infinite family of combinations of z-values that could give it. Even though 0 is the most likely value of any individual z, the counting effect outweighs the individual likelihood effect for three or more terms.

Here is a toy model to illustrate. Let's model a normal distribution by a simple discrete distribution over -1, 0, and 1, where 0 is twice as likely as the "extreme" values -1 or 1. That is, P(+/-1) = 1/4 and P(0) = 1/2. Now let's derive the distribution of the three-term X^2 = z1^2 + z2^2 + z3^2.

The only way to obtain X^2 = 0 is to have z1 = z2 = z3 = 0. The probability of that is (1/2)^3 = 1/8. There are, however, 6 ways to obtain X^2 = 1; any combination of two z's being 0 and one z being +/-1 will do. The probability of any one of those combinations is (1/2)^2 (1/4) = 1/16, but there are 6 of them, so the total probability of X^2 = 1 is 6/16 = 3/8, three times larger than the probability of X^2 = 0.
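The counting in this toy model can be checked by brute force. The following is a minimal sketch, not part of the original post: it enumerates every combination of z-values under the toy probabilities P(+/-1) = 1/4 and P(0) = 1/2 and adds up the probability of each X^2 value exactly. The names `TOY_Z` and `toy_x2_distribution` are invented for illustration.

```python
from fractions import Fraction
from itertools import product

# Toy stand-in for a standard normal: P(-1) = P(+1) = 1/4, P(0) = 1/2.
TOY_Z = {-1: Fraction(1, 4), 0: Fraction(1, 2), 1: Fraction(1, 4)}


def toy_x2_distribution(n_terms):
    """Exact distribution of z1^2 + ... + zn^2 under the toy model, by enumeration."""
    dist = {}
    for zs in product(TOY_Z, repeat=n_terms):
        prob = Fraction(1)
        for z in zs:
            prob *= TOY_Z[z]  # the z's are independent, so probabilities multiply
        x2 = sum(z * z for z in zs)
        dist[x2] = dist.get(x2, Fraction(0)) + prob
    return dict(sorted(dist.items()))


for n_terms in (1, 2, 3, 4):
    print(n_terms, {k: str(v) for k, v in toy_x2_distribution(n_terms).items()})
```

For three terms this reproduces P(X^2 = 0) = 1/8 and P(X^2 = 1) = 3/8. The toy model is coarse, so it should not be expected to reproduce the exact df - 2 rule of the continuous distribution; it only illustrates how the number of combinations can outweigh the likelihood of each individual z being 0.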

7. ## Re: mode of the chi-square distribution

Originally Posted by joeygor
You might be misunderstanding me here. ... I was not referring --- as it seems to me you do in the quotation above --- to increasing the number of categories.....
I don't think I am misunderstanding you. Rather, you're not making yourself clear.

More specifically, you stated in your original post the following:

Shouldn't it be 0 instead, regardless of number of categories,...

8. ## Re: mode of the chi-square distribution

Originally Posted by ichbin
The only way to get X^2 = 0 is to have every contributing z = 0 exactly...Even though 0 is the most likely value of any individual z, the counting effect outweighs the individual likelihood effect..
Thanks ichbin! The above cleared things up for me. No more need to worry about my students.

9. ## Re: mode of the chi-square distribution

Originally Posted by ichbin

Here is a toy model to illustrate. Let's model a normal distribution by a simple discrete distribution over -1, 0, and 1, where 0 is twice as likely as the "extreme" values -1 or 1. That is, P(+/-1) = 1/4 and P(0) = 1/2. Now let's derive the distribution of the three-term X^2 = z1^2 + z2^2 + z3^2.
I don't understand your explanation; it actually confused me more. How does "having something twice as likely as the 'extreme' values" show that the mode is n-2? I don't even see how it is relevant.

10. ## Re: mode of the chi-square distribution

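Note that the chi-square distribution with $n$ degrees of freedom is a special case of the Gamma distribution:

$$\chi^2(n) = \text{Gamma}\left(\text{shape} = \tfrac{n}{2},\ \text{scale} = 2\right)$$

And hence the p.d.f. is

$$f(x) = \frac{1}{2^{n/2}\,\Gamma(n/2)}\, x^{n/2 - 1} e^{-x/2}$$

where $x > 0$.

Note

$$f'(x) = \frac{1}{2^{n/2}\,\Gamma(n/2)}\, x^{n/2 - 2} e^{-x/2} \left(\frac{n}{2} - 1 - \frac{x}{2}\right)$$

Hence, when $n \le 2$, $f'(x) < 0$ for all $x > 0$;
and thus $f$ is strictly decreasing, the mode is at $x = 0$.

When $n > 2$, $f'(x) > 0$ for $x < n - 2$, $f'(x) = 0$ at $x = n - 2$, and $f'(x) < 0$ for $x > n - 2$,

i.e. the mode is at $x = n - 2$.

Combining together, the mode is $\max(n - 2, 0)$.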
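The location of the mode can also be checked numerically. The sketch below is an illustration rather than part of the post: it evaluates the standard chi-square density $f(x) = x^{n/2-1} e^{-x/2} / (2^{n/2}\Gamma(n/2))$ on a fine grid and reports the grid point with the largest density next to $\max(n - 2, 0)$. The helper names `chi2_pdf` and `grid_mode`, the grid limit of 60, and the step of 0.001 are all assumptions chosen for the example.

```python
import math


def chi2_pdf(x, n):
    """Chi-square density with n degrees of freedom at x > 0, computed via logs."""
    log_f = ((n / 2 - 1) * math.log(x) - x / 2
             - (n / 2) * math.log(2) - math.lgamma(n / 2))
    return math.exp(log_f)


def grid_mode(n, upper=60.0, step=0.001):
    """Approximate the mode as the grid point in (0, upper] with the largest density."""
    n_points = int(round(upper / step))
    grid = (i * step for i in range(1, n_points + 1))
    return max(grid, key=lambda x: chi2_pdf(x, n))


for n in (1, 2, 3, 4, 10, 25):
    # Second column: numerical mode on the grid; third column: max(n - 2, 0).
    print(n, round(grid_mode(n), 3), max(n - 2, 0))
```

For $n \le 2$ the density is decreasing, so the grid search simply returns the smallest grid point, which is the numerical counterpart of a mode at 0.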

11. ## Re: mode of the chi-square distribution

Thanks so much, BMG, that's what I'm looking for.
