# A basic question about chi-square test

#### rkm567

A study examines how male and female students use different English words. There are two columns of data (male and female students) and 30 rows (each row gives the frequency with which one English word is used by male and by female students). I found an online chi-square calculator that asks for data in 4 blanks. If I want to test whether there is a significant difference in each row, how should I enter the data? Thank you!

#### gianmarco

It would seem that the online calculator can only handle 2x2 tables (2 rows x 2 cols).

#### katxt

As I see it, the correct analysis isn't one 30x2 table with 29 df, but 30 1x2 tables, each with 1 df. These can be combined.
Are there equal numbers of males and females?

#### katxt

Does the table have the number of people using the word, or the total number of times the word is used?

#### rkm567

This for example can handle larger tables:
https://www.icalcu.com/stat/chisqtest.html
Thank you for your reply! But what if I want to see whether each row shows a significant difference, for example whether groups A and B differ significantly in their use of the English word "finally", rather than comparing the two whole sets of data?

#### Karabiner

So, seemingly you want to perform 30 comparisons. That could be done with
one-sample Chi² tests. The idea is this: "finally" was used 38 times. If there was
no association with gender, then one would expect 19 "finally" by males and
19 "finally" by females. The actual distribution (23/18) is compared with the
expected distribution.
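
The idea above can be sketched in a few lines of Python. This is a minimal illustration, not the method of any particular calculator: the function names are made up for this example, and it uses the observed counts 23 and 18 quoted above (they sum to 41, so the expected count per group in this sketch is 20.5). It also assumes an equal split is the right null hypothesis, which only holds if both groups produced comparable amounts of text.

```python
import math

def equal_split_chi2(count_a, count_b):
    """One-sample chi-square statistic for two counts against a 50/50 split."""
    total = count_a + count_b
    expected = total / 2  # same expected count for both groups under the null
    return ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected

def p_value_df1(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom.

    For 1 df the chi-square variable is a squared standard normal,
    so P(X > stat) = erfc(sqrt(stat / 2)).
    """
    return math.erfc(math.sqrt(stat / 2))

stat = equal_split_chi2(23, 18)
p = p_value_df1(stat)
print(f"chi2 = {stat:.4f}, p = {p:.4f}")
```

Repeating this for each of the 30 rows gives 30 separate 1-df tests.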

One remaining problem would be that 30 tests bear 30 risks of a false-positive
result. You should maybe use a more conservative significance level than
the common 5%, e.g. 1% instead.
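
One standard way to pick such a stricter level is the Bonferroni correction, which divides the overall significance level by the number of tests. It is named here as an alternative to the 1% suggestion above, not as what the post prescribes. The sketch below uses hypothetical function names and made-up p-values purely for illustration.

```python
def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    return alpha / n_tests

def significant_after_correction(p_values, alpha=0.05):
    """Flag each p-value as significant or not under the Bonferroni threshold."""
    threshold = bonferroni_alpha(alpha, len(p_values))
    return [p < threshold for p in p_values]

# With 30 word comparisons and an overall 5% level:
print(f"per-test threshold: {bonferroni_alpha(0.05, 30):.5f}")

# Hypothetical p-values for three words, only to show the call:
print(significant_after_correction([0.001, 0.01, 0.04]))
```

With 30 tests the per-test threshold is 0.05 / 30, which is stricter than the 1% level.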

#### rkm567

Here is a reference paper in which the total number of words in NS and NNS is 114022 and 149574 respectively. The "x2" value calculated there is exactly the kind of value I am looking for. I entered the data like this, but my result does not match. Which step should be corrected? Thank you again!

#### katxt

There are things here that don't make sense to me yet.
What are the 14 and 18?
What are Raw and Normal? NN and NNS?

#### rkm567

Sorry for the unclear information.
The author may have made a typo, but NS means the native speaker group, and NNS means the non-native speaker group.
14 is the frequency of "finally" used by NS group, and 18 is the frequency of "finally" used by NNS group.
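
For counts like these, one common way to compare a word's frequency across two corpora of different sizes is a 2x2 table: the word versus all other words, in each group. The sketch below applies that to the figures given in this thread (14 of 114022 words for NS, 18 of 149574 for NNS). Whether the reference paper computed its value this way is an assumption, not something the thread confirms; the function name is hypothetical, and no continuity correction is applied.

```python
import math

def two_by_two_chi2(word_a, total_a, word_b, total_b):
    """Pearson chi-square (1 df, no continuity correction) for a 2x2 table.

    Rows: group A, group B. Columns: the target word, all other words.
    """
    a, b = word_a, total_a - word_a
    c, d = word_b, total_b - word_b
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 df the upper-tail p-value is erfc(sqrt(stat / 2)).
    p = math.erfc(math.sqrt(stat / 2))
    return stat, p

stat, p = two_by_two_chi2(14, 114022, 18, 149574)
print(f"chi2 = {stat:.4f}, p = {p:.4f}")
```

Unlike the equal-split test, this version accounts for the two groups having produced different total numbers of words.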

#### katxt

So, what is the male/female data connection?
You wrote: 'like whether there is a significant difference between A and B groups on the use of English word "finally"'.
And what are A and B?

#### rkm567

So, what is the male/female data connection?

and what are A and B?
The A/B or male/female data is another example, from my own paper. I was going to use it as the example for my question, but then I thought it would be clearer to quote the example from the thesis I had read before.
I want to get the x2 value the way that thesis does, but when I ran the calculation on the thesis's data, I got a different x2 value from the one it reports. So I think the x2 value I calculated for my own paper may be wrong too.

#### katxt

I have tried to interpret things as you have explained. I can't see where the "finally" answers come from (nor the other words I have tried.) Your answer is right enough to show that there must be something else going on that we don't know about. Cheers, kat
