Chi-square post hoc?

#1
Hello
I have been researching how to interpret the output of a 3 x 2 chi-square test. Although I can confidently determine that my chi-square test is significant, I am not sure what the best approach is for determining which particular cells are contributing to the significance. I've investigated using adjusted residuals and column proportions (z-test). I've also read about another approach whereby you square the residual, then compute an individual p value for each cell. I'm not sure which would be most appropriate, or whether my interpretation of these procedures is correct.

I'll try to use an example to explain my current understanding (or lack of!). I have surveyed over 2000 girls about the feelings they experienced when they went through puberty. The survey included a list of feelings (e.g. happy, sad, relieved, embarrassed, normal etc.) and they could answer yes/no to each one. I am interested in whether the feelings reported (e.g. the % that reported feeling happy) differ by school type (single sex, mixed, single sex with boys at college only). I've provided the output of the chi-square test for the feeling 'happy' by school type:

Felt Happy * School Type Crosstabulation

                            Boys in college only   Mixed     Single sex   Total
Yes     Count               114 a,b                105 a     112 b        331
        Expected Count      114                    85        132          331
        % felt happy        34.4%                  31.7%     33.8%        100.0%
        % in School Type    15.9%                  19.6%     13.5%        15.9%
        % of Total          5.5%                   5.0%      5.4%         15.9%
        Adjusted Residual   .0                     2.7       -2.5

No      Count               602 a,b                430 a     718 b        1750
        Expected Count      602                    449       698          1750
        % felt happy        34.4%                  24.6%     41.0%        100.0%
        % in School Type    84.1%                  80.4%     86.5%        84.1%
        % of Total          28.9%                  20.7%     34.5%        84.1%
        Adjusted Residual   .0                     -2.7      2.5

Total   Count               716                    535       830          2081
        Expected Count      716                    535       830          2081
        % felt happy        34.4%                  25.7%     39.9%        100.0%
        % in School Type    100.0%                 100.0%    100.0%       100.0%
        % of Total          34.4%                  25.7%     39.9%        100.0%


X2 = 9.146, p = 0.010

My interpretation of the above is that there is a significant association between the feeling 'happy' and school type (X2 = 9.146, p < 0.05). Based on the adjusted residuals, it looks like a significantly higher proportion of girls from mixed schools reported being happy (19.6%) and significantly fewer girls from single sex schools reported being happy (13.5%). Would I say that this is significantly more/less than expected by chance? Or does this mean significantly more/less than all the other groups, or than the overall group average of 15.9%? (I think it is 'than expected by chance', yes?)
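For anyone wanting to check the overall test by hand, here is a minimal sketch that recomputes the expected counts and the Pearson chi-square from the observed counts in the crosstab. It uses only the Python standard library; the variable names are illustrative, and the closed-form p value relies on the fact that this table has 2 degrees of freedom.

```python
import math

# Observed counts from the crosstab above.
# Rows: felt happy (yes, no).
# Columns: boys in college only, mixed, single sex.
observed = [[114, 105, 112],
            [602, 430, 718]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence: row total * column total / N.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson chi-square: sum of (O - E)^2 / E over all cells.
chi2 = sum((o - e) ** 2 / e
           for o_row, e_row in zip(observed, expected)
           for o, e in zip(o_row, e_row))
df = (len(row_totals) - 1) * (len(col_totals) - 1)

# For df = 2 the chi-square upper tail has the closed form exp(-x / 2).
# Other df values need a library routine such as scipy.stats.chi2.sf.
p_value = math.exp(-chi2 / 2)

print(f"chi2 = {chi2:.3f}, df = {df}, p = {p_value:.3f}")
```

This should agree with the SPSS figures quoted above (X2 = 9.146, p = 0.010). The expected counts answer the "expected by chance" part of the question: they are what each cell would hold if the proportion feeling happy were the same 15.9% in every school type.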

When looking at the z-test for column proportions, the 'boys in college only' school type had both an 'a' and a 'b' assigned to it, so I assumed that there is no difference between this school type and each of the other school types. However, because the 'mixed' school has an 'a' and the 'single sex' school has a 'b' (i.e. different subscript letters are assigned to them), I assumed that these two do differ from each other. Is this right? Does this mean that there are significantly fewer girls reporting feeling happy in single sex schools compared to mixed schools, but not compared with 'boys in college only' schools? If so, I feel like this interpretation is a little different from the one based on the residuals, i.e. when using the residuals I'm saying it's just significantly less/more than expected by chance, but when using the z-test I'm saying the difference is between the single sex and mixed schools only?
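To make the pairwise reading concrete, here is a rough sketch of two-proportion z-tests between each pair of school types, with a Bonferroni correction for the three comparisons. This is an assumption about what the column proportions output is doing, not a reproduction of the SPSS procedure, and the function name `two_proportion_z` is made up for the example.

```python
import math
from itertools import combinations

# Girls who felt happy, and total girls, per school type (from the crosstab).
happy = {"boys in college only": 114, "mixed": 105, "single sex": 112}
total = {"boys in college only": 716, "mixed": 535, "single sex": 830}


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided p)."""
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (x1 / n1 - x2 / n2) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return z, p


pairs = list(combinations(happy, 2))
results = {}
for a, b in pairs:
    z, p = two_proportion_z(happy[a], total[a], happy[b], total[b])
    # Bonferroni: multiply each p by the number of comparisons, cap at 1.
    p_adj = min(1.0, p * len(pairs))
    results[(a, b)] = (z, p_adj)
    print(f"{a} vs {b}: z = {z:.2f}, Bonferroni p = {p_adj:.4f}")
```

With these counts only the mixed vs single sex comparison should come out below 0.05 after correction, which matches the lettering (a,b / a / b) in the table. So the z-test answers "which school types differ from each other", whereas the residuals answer "which cells depart from what independence would predict". The two can reasonably give differently worded conclusions.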

I hope I have explained things OK and that you understand my queries. I have tried to read around the area and watched various video tutorials on YouTube etc.; however, almost everything I have read or seen uses a 2x2 example and doesn't go any further. If anyone can offer any advice it would be much appreciated.

Thanks in advance.
 
#2
Hi,
Does anyone have any advice to give? I really have tried to find the answer to this problem myself but would really appreciate some guidance.
Thanks in advance.
 

rogojel

TS Contributor
#3
hi,
I did not really understand the structure of your table. The ones I used with Minitab had 3 elements in each cell: the observed number in the cell, the expected number based on the null hypothesis, and the cell's chi-squared value. The p value for the table is calculated by aggregating the chi-squared numbers from the cells, IIRC.
If the overall result is significant, imho you can check the cells that have the largest chi-squared numbers: they are the most probable contributors to the effect.

regards
 
#4
rogojel, thank you for your response. I am grateful that someone took the time to respond.
I have not used Minitab myself (I used SPSS), but my table shows the observed count, the expected count, and a standardised adjusted residual. I haven't been able to obtain the chi-square value for each cell that you refer to; I only have the overall chi-square, which tells me if the test is significant. Do you have any idea how I can view/obtain the chi-square for each individual cell? Thank you in advance.
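In case it helps anyone with the same question, the per-cell chi-square values can be worked out by hand from the observed and expected counts as (observed - expected)^2 / expected. Below is a small sketch in plain Python using the counts from post #1; the variable names are just for illustration, and this is not output from SPSS or Minitab.

```python
# Observed counts from the crosstab in post #1.
schools = ["Boys in college only", "Mixed", "Single sex"]
observed = {"Yes": [114, 105, 112], "No": [602, 430, 718]}

col_totals = [sum(col) for col in zip(*observed.values())]
n = sum(col_totals)

contributions = {}
for feeling, counts in observed.items():
    row_total = sum(counts)
    for school, obs, col_total in zip(schools, counts, col_totals):
        expected = row_total * col_total / n
        # Cell contribution to the Pearson statistic: (O - E)^2 / E.
        contributions[(feeling, school)] = (obs - expected) ** 2 / expected

# List the cells from largest to smallest contribution.
for cell, value in sorted(contributions.items(), key=lambda kv: -kv[1]):
    print(f"{cell[0]:>3} / {cell[1]:<21} {value:6.3f}")

# The contributions add up to the overall chi-square statistic.
total_chi2 = sum(contributions.values())
print(f"sum = {total_chi2:.3f}")
```

The sum should reproduce the overall X2 = 9.146, and the largest contribution should come from the Yes / Mixed cell. Note that these raw contributions do not have a simple reference distribution of their own, which is one reason adjusted residuals are often preferred for cell-by-cell significance.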
 

gianmarco

TS Contributor
#5
You have to focus on the adjusted standardized residuals: they tell you which cells are significantly contributing to the rejection of the null hypothesis of independence between rows and columns.
Values larger than 1.96 in absolute value are significant at the 0.05 level. Their sign (positive or negative) tells you whether a particular cell has an observed value larger or smaller than the expected value.
In summary: the analysis of the adjusted standardized residuals helps you spot which row categories are significantly strongly or weakly "associated" with which column categories.
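As a worked illustration of the rule above, here is a minimal sketch that recomputes the adjusted residuals from the counts in post #1 and turns each into a two-sided p value. The Bonferroni threshold over all six cells is an added assumption (one common, conservative convention), not something SPSS applies to the residuals by default, and the function name `adjusted_residual` is hypothetical.

```python
import math

schools = ["Boys in college only", "Mixed", "Single sex"]
feelings = ["Yes", "No"]
observed = [[114, 105, 112],
            [602, 430, 718]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)


def adjusted_residual(obs, row_total, col_total, n):
    """(O - E) divided by its estimated standard error under independence."""
    expected = row_total * col_total / n
    variance = expected * (1 - row_total / n) * (1 - col_total / n)
    return (obs - expected) / math.sqrt(variance)


alpha = 0.05
n_cells = len(feelings) * len(schools)
bonferroni_alpha = alpha / n_cells  # assumption: correct over all 6 cells

results = {}
for i, feeling in enumerate(feelings):
    for j, school in enumerate(schools):
        z = adjusted_residual(observed[i][j], row_totals[i], col_totals[j], n)
        # Two-sided normal p value. Referring z**2 to a chi-square
        # distribution with 1 degree of freedom gives the same p.
        p = math.erfc(abs(z) / math.sqrt(2))
        results[(feeling, school)] = (z, p)
        print(f"{feeling:>3} / {school:<21} z = {z:5.2f}  p = {p:.4f}  "
              f"|z| > 1.96: {abs(z) > 1.96}  "
              f"Bonferroni: {p < bonferroni_alpha}")
```

The residuals should match the SPSS column (.0, 2.7, -2.5) up to rounding. With these counts the Mixed cells should remain flagged under the stricter Bonferroni threshold while the Single sex cells should not, so the conclusion for single sex schools depends on whether a correction for multiple cells is applied.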