Hi,
What is the test statistic for the two-sample Kolmogorov-Smirnov test of whether two distributions are the same? I have been using the Handbook of Parametric and Nonparametric Statistical Procedures, but I have also found a conflicting form of the test statistic online:
1) on Wikipedia: http://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test
2) and in this set of lecture notes: http://ocw.mit.edu/courses/mathematics/18-443-statistics-for-applications-fall-2006/lecture-notes/lecture14.pdf
Both give a version of the test statistic that has an extra factor of
sqrt [(n1*n2/(n1+n2))]
so that instead of:
D_n1n2 = sup|F-G|
where F and G are the empirical cumulative distribution functions of two samples of sizes n1 and n2, they use the test statistic:
D_n1n2 = sqrt [(n1*n2/(n1+n2))] * sup|F-G|
This is confusing. Is the difference because the two sources refer to different tables of critical values, where one table has been multiplied by this factor, or is one of them a mistake? I ask especially because I am working with a large sample size: when I look up the large-sample tables, I do see this sqrt factor, but it is inverted.
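To make the two candidates concrete, here is a minimal Python sketch that computes both forms from the same pair of samples. The function name ks_two_sample_statistics and the example data are made up for illustration and do not come from the Handbook or the lecture notes; the only thing assumed is the two formulas quoted above.

```python
import math
from bisect import bisect_right


def ks_two_sample_statistics(x, y):
    """Return (d, scaled_d) for two samples x and y.

    d        = sup|F - G|, with F and G the empirical CDFs
    scaled_d = sqrt(n1*n2/(n1+n2)) * d
    (Hypothetical helper, written only to compare the two forms.)
    """
    xs, ys = sorted(x), sorted(y)
    n1, n2 = len(xs), len(ys)
    # Both empirical CDFs are step functions that only jump at observed
    # values, so the supremum is attained at one of the data points.
    d = max(
        abs(bisect_right(xs, t) / n1 - bisect_right(ys, t) / n2)
        for t in xs + ys
    )
    scaled_d = math.sqrt(n1 * n2 / (n1 + n2)) * d
    return d, scaled_d


# Made-up example data, two samples of size 4 each.
d, scaled_d = ks_two_sample_statistics([1, 2, 3, 4], [3, 4, 5, 6])
print(d, scaled_d)
```

As far as the algebra goes, comparing scaled_d against some threshold c is the same as comparing d against c * sqrt((n1+n2)/(n1*n2)), which might be where an inverted factor in a table comes from, but I would like confirmation that this is what the tables are doing.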
Any suggestions/help would be great!
Thanks!