Chi-square Vs. Fisher's exact test (FET)

#1
I'd really like to have a statistician answer this because I cannot find the answer anywhere. Perhaps you could be kind enough to direct me towards a text that would answer this.

Given the general rule of thumb that FET is used when a cell count is less than 5, and that computers have made statistical calculations very easy, why not use FET instead of chi-square for most, if not all, applicable analyses? Essentially, what are the statistical advantages of using chi-square over FET?

Thank you

Is anyone able to shed some light on this for me? :)
 
vinux

Dark Knight
#3
Hi,
Fisher's exact test is appropriate in particular when dealing with small counts. The chi-square test is basically a large-sample approximation to the exact test.
If you run a chi-square test on small counts, the approximation may give erroneous results.

On the other hand, the p-value for FET is difficult to calculate for large counts.
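To make the small-count point concrete, here is a minimal sketch in plain Python (standard library only). The table [[3, 1], [1, 3]] is a made-up example, not data from this thread, and the function names are my own. It computes the two-sided Fisher p-value by summing hypergeometric probabilities, and the uncorrected Pearson chi-square p-value using the 1-degree-of-freedom identity p = erfc(sqrt(stat / 2)).

```python
from math import comb, erfc, sqrt

def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    r1, r2, c1 = a + b, c + d, a + c

    def weight(x):
        # Unnormalised hypergeometric probability of a table whose top-left cell is x.
        return comb(r1, x) * comb(r2, c1 - x)

    w_obs = weight(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    # Sum over all tables with the same margins that are no more likely than the observed one.
    extreme = sum(w for w in map(weight, range(lo, hi + 1)) if w <= w_obs)
    return extreme / comb(r1 + r2, c1)

def pearson_chi2_p(a, b, c, d):
    """Uncorrected Pearson chi-square p-value for a 2x2 table (1 degree of freedom)."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return erfc(sqrt(stat / 2))

# Hypothetical small table: every expected cell count is 2.
table = (3, 1, 1, 3)
print("Fisher exact p :", round(fisher_two_sided(*table), 3))  # 0.486
print("Chi-square p   :", round(pearson_chi2_p(*table), 3))    # 0.157
```

With only eight observations the two p-values differ by a factor of about three, which is the kind of discrepancy the rule of thumb about small cell counts is meant to guard against.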
 
#5
Actually, Fisher gets quite unwieldy pretty fast. On a side note, I read a fascinating paper about six different ways to analyze a simple two-by-two table. I always wish I had bookmarked it. There are a few subtle assumptions involved in these things that are easily overlooked and result in slightly different p-values.
 
#8
Thanks, the websites are very helpful. Again, I suppose chi-square is used because FET is more difficult to calculate, but with statistical programs doing the "number crunching", it appears, at least to me, that FET should be used for most purposes. Is FET too time-consuming even for modern computer processing capabilities?

My point being: why use an approximation (i.e., chi-square) when you can get the exact answer (FET) regardless of sample size?
 

TheEcologist

Global Moderator
#9
Point noted. However, as I understand it, the difference between the chi-square test and FET becomes smaller as the expected cell counts and the table size grow, so for large samples the choice hardly matters.

At what size the two tests become acceptably similar, however, I don't know.
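One way to get a feel for how fast the two tests approach each other is to compute both on a series of growing tables. The sketch below (plain Python, standard library only) uses hypothetical symmetric tables [[a, m - a], [m - a, a]] that I chose so that the Pearson statistic equals exactly 2 at every group size m; none of these numbers come from the thread. Because the chi-square p-value is then constant, any change in the gap is due to the exact test alone.

```python
from math import comb, erfc, isqrt, sqrt

def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    r1, r2, c1 = a + b, c + d, a + c

    def weight(x):
        return comb(r1, x) * comb(r2, c1 - x)

    w_obs = weight(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    extreme = sum(w for w in map(weight, range(lo, hi + 1)) if w <= w_obs)
    return extreme / comb(r1 + r2, c1)

def pearson_p(a, b, c, d):
    """Uncorrected Pearson chi-square p-value for a 2x2 table (1 degree of freedom)."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return erfc(sqrt(stat / 2))

def compare(group_sizes=(4, 16, 64, 256, 1024)):
    """Return (m, Fisher p, chi-square p) for tables built to have a Pearson statistic of 2."""
    rows = []
    for m in group_sizes:          # m must be an even perfect square for this construction
        a = (m + isqrt(m)) // 2    # e.g. m = 16 gives the table [[10, 6], [6, 10]]
        table = (a, m - a, m - a, a)
        rows.append((m, fisher_two_sided(*table), pearson_p(*table)))
    return rows

for m, p_fisher, p_chi2 in compare():
    print(f"m = {m:5d}  Fisher = {p_fisher:.4f}  chi-square = {p_chi2:.4f}  gap = {p_fisher - p_chi2:.4f}")
```

In this particular family of tables the gap shrinks steadily as the counts grow, but it says nothing about where "acceptably similar" begins for other designs; that still depends on the margins and on how much error one is willing to tolerate.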

Edit: There was actually a discussion on the R help list on this (not resolved):

http://tolstoy.newcastle.edu.au/R/help/05/09/11961.html
 
#11
I've got a question related to this thread and I've been looking everywhere on the web but I can't find an answer.

I know that when the conditions hold (n < 20, or one of the cells < 5 [or, a bit less stringently, at most 20% of the cells]), one can stick to Fisher's exact test.

When should I use the exact version of the CHISQ (in SPSS)? Can it be used as a substitute for the FET (and if so, in which cases)? In other words, is it equivalent?

For 2x2 tables, SPSS reports the 1-sided exact test for both the FET and the CHISQ. Does the 1-sided exact CHISQ follow the same principle as the FET, which uses upward and downward diagonals? Besides that, I found that for bigger tables only the 2-sided exact test is available for each. And does the exact test for CHISQ also assume fixed margins (i.e., is it conditional) like the FET?

Further, I am wondering whether my data meet the assumptions of the FET. I did an observational study in which I recorded a driver's looking behaviour (left, right, not at all) when turning (left, right), and also a driver's looking behaviour (left, right, not at all) in combination with a bicyclist coming from (left, right, none).

See the first document linked at the bottom. In the lady example with coffee and milk, a new data collection would have the same marginals, since the lady must again guess 48 of each (milk first or second) and there are 48 cups of each (both are known beforehand). The document also states that if a recount or a new data collection would change the row or column marginals, the assumption is violated. According to the document, one can still do a FET if only one marginal is fixed.

Does a constant row marginal in my study mean that in a new data collection I would again take, for example, 20 left turns and 40 right turns (the row totals from my first data collection, which I know or choose beforehand) and observe looking behaviour (which will probably not be the same)? On the other hand, if I had recorded those drivers on tape and scored them again, it would result in the same 2x2 table with the same column and row margins (this might not be a logical or permissible idea).


This might be an interesting link:
http://www.uvm.edu/~dhowell/StatPages/More_Stuff/Chi-square/Contingency%20Tables.pdf

And this one (it casts doubt on the relevance of marginals; I did not read the article myself):
http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6V0M-45GVWV2-4&_user=10&_rdoc=1&_fmt=&_orig=search&_sort=d&_docanchor=&view=c&_acct=C000050221&_version=1&_urlVersion=0&_userid=10&md5=ef731c40b4e88804bab7ad6e0fe61024

Are there any other alternatives to CHISQ (for my problem, with 2x2 and 2x3 tables)?

Is Yates' correction a good alternative (I've read somewhere that it is very conservative)? I think it's called 'Likelihood Ratio' in SPSS. And maybe Barnard's test? It is only for 2x2 tables and is not in SPSS.
 
#12
Kindly help me with the following.
1. How should I interpret the result of Fisher's exact test? Calculators for this test are freely available on certain websites. Is the result a p-value, or a statistic similar to chi-square? For example, please see this.

http://www.danielsoper.com/statcalc/calc29.aspx

2. Are FET and the hypergeometric distribution one and the same? In Excel 2007, the HYPGEOMDIST function is available for this. How do I enter data into it and interpret the result?
 

vinux

Dark Knight
#13
My answers follow each quoted question.
Kindly help me with the following.
1. How should I interpret the result of Fisher's exact test? Calculators for this test are freely available on certain websites. Is the result a p-value, or a statistic similar to chi-square? For example, please see this.

http://www.danielsoper.com/statcalc/calc29.aspx

The decision is based on the p-value. Here the test statistic is itself a probability.


2. Are FET and the hypergeometric distribution one and the same? In Excel 2007, the HYPGEOMDIST function is available for this. How do I enter data into it and interpret the result?

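In the 2 x 2 FET, the p-value is a sum of hypergeometric probabilities. To calculate this in Excel for the 2 x 2 scenario, create all a, b, c, d combinations such that a+b, c+d, a+c, and b+d are constant (the row and column totals of the observed table).
Then calculate the probability of each table and, for the usual two-sided test, sum the probabilities of the tables that are no more likely than the observed one.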
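The enumeration described above can be sketched in a few lines of Python (standard library only). The function names and the example table [[1, 9], [11, 3]] are my own illustration, not something from the original posts. The two-sided rule used here, summing every table whose probability does not exceed that of the observed table, is one common convention; other software may define the two-sided p-value differently.

```python
from fractions import Fraction
from math import comb

def tables_with_fixed_margins(a, b, c, d):
    """Yield every 2x2 table (a, b, c, d) that has the observed row and column totals."""
    r1, r2, c1 = a + b, c + d, a + c
    for x in range(max(0, c1 - r2), min(r1, c1) + 1):
        yield (x, r1 - x, c1 - x, r2 - (c1 - x))

def table_probability(a, b, c, d):
    """Hypergeometric probability of one table, given its margins (exact fraction)."""
    n = a + b + c + d
    return Fraction(comb(a + b, a) * comb(c + d, c), comb(n, a + c))

def fisher_p_two_sided(a, b, c, d):
    """Sum the probabilities of all tables that are no more likely than the observed one."""
    p_obs = table_probability(a, b, c, d)
    candidates = (table_probability(*t) for t in tables_with_fixed_margins(a, b, c, d))
    return sum(p for p in candidates if p <= p_obs)

observed = (1, 9, 11, 3)  # hypothetical table, rows [1, 9] and [11, 3]
print("tables with these margins:", len(list(tables_with_fixed_margins(*observed))))  # 11
print("two-sided p-value:", float(fisher_p_two_sided(*observed)))
```

Using `Fraction` keeps the comparison `p <= p_obs` exact, which matters because two tables on opposite tails can have identical probabilities and floating-point rounding could otherwise drop one of them.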
 
#16
I am sorry, I don't grasp how to feed the contingency table values into the HYPGEOMDIST function of Excel 2007. I have enclosed the contingency table. Kindly tell me how to enter those values into the Excel function dialogue box.

[HYPGEOMDIST(sample_s, number_sample, population_s, number_population)

Sample_s is the number of successes in the sample. Which value from the table?

Number_sample is the size of the sample. Which value from the table?

Population_s is the number of successes in the population. Which value from the table?

Number_population is the population size. Which value from the table?]
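As a hedged illustration of the argument mapping being asked about: for a 2x2 table with rows [a, b] and [c, d], one consistent choice is sample_s = a, number_sample = a + b (first row total), population_s = a + c (first column total), and number_population = a + b + c + d. The Python function below is my own re-implementation that mirrors the argument order quoted in the post; it is not Excel's code, and the table values are hypothetical because the enclosed table is not reproduced in this thread.

```python
from math import comb

def hypgeomdist(sample_s, number_sample, population_s, number_population):
    """Probability of exactly sample_s successes in a sample of size number_sample,
    drawn without replacement from a population of size number_population that
    contains population_s successes."""
    return (comb(population_s, sample_s)
            * comb(number_population - population_s, number_sample - sample_s)
            / comb(number_population, number_sample))

def point_probability(a, b, c, d):
    """Probability of the 2x2 table [[a, b], [c, d]] given its row and column totals."""
    return hypgeomdist(
        sample_s=a,                        # top-left cell
        number_sample=a + b,               # first row total
        population_s=a + c,                # first column total
        number_population=a + b + c + d,   # grand total
    )

# Hypothetical table with rows [1, 9] and [11, 3].
print(point_probability(1, 9, 11, 3))
```

Note that this is the probability of a single table only. It is one term in the FET sum, so the p-value still requires repeating the calculation for every table with the same margins and adding up the appropriate terms.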
 

Dason

Ambassador to the humans
#19
Not quite helpful
What are you trying to accomplish with this post? You decide to necro a thread whose last post was in 2009 and then tell the OP that their attempt to bump the thread isn't helpful? Was anything you said helpful? If you didn't have other posts that appear to be legit, I would have assumed you were a spam bot.