# Thread: I.I.D assumption of T-test

1. ## I.I.D assumption of T-test

I am writing my second master's thesis now and I came across a situation where the observations are independent but not identically distributed. So I thought I'd test whether identical distribution needs to hold in order to use the usual T-test. It turned out that even if the identically distributed assumption doesn't hold, the T-statistic is still asymptotically standard normal. The variances in this simulation are "well behaved", though, so I guess the result doesn't hold in all cases. In my paper the variances are unequal and "well behaved", so I guess I could have used a T-test, but that is another story.

My question is: is this a well-known result I came up with here? And what are the implications of this, does it "open up any doors" to whole new situations where the T-test can be used for large samples?

See a preliminary appendix in my paper below for simulation set-up.

Appendix B
In this appendix I will give a simulation-based argument that the distribution of the ordinary T-statistic is asymptotically normal despite unequal variances of the observations (x_1,x_2,…,x_n), at least under the conditions used in this simulation. The outline of the simulation is as follows: (1) simulate x_i, i=1,2,…,200, from a normal distribution with mean 0, independently assigning each x_i a random standard deviation with pdf 1 on (0,1), that is, a uniformly distributed random variable; (2) calculate the T-statistic; (3) repeat steps (1) and (2) 100 times and obtain the p-value from the Shapiro-Wilk normality test on the 100 T-statistics; (4) repeat steps (1), (2) and (3) 10000 times; (5) compute the proportion of p-values below α=0.05. If T is asymptotically standard normally distributed then this simulation should give a result between 0.0457 and 0.0543 95 percent of the time. The result of the simulation described was 0.0497, as expected.

B.1 Code in R
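Code:

    # N: T-statistics per Shapiro-Wilk test, n: sample size, M: number of tests
    prog <- function(N, n, M) {
      set.seed(12345)
      Z <- numeric(N); shap <- numeric(M); x <- numeric(n)
      for (j in 1:M) {
        for (i in 1:N) {
          for (k in 1:n) {
            # each observation gets its own Uniform(0,1) standard deviation
            x[k] <- rnorm(1, 0, runif(1))
          }
          Z[i] <- mean(x) / (sd(x) / sqrt(n))  # T-statistic for H0: mean = 0
        }
        shap[j] <- shapiro.test(Z)$p.value     # normality test on N T-statistics
      }
      p <- shap < 0.05
      return(mean(p))                          # share of p-values below 0.05
    }
    prog(100, 200, 10000)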
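
The 0.0457 to 0.0543 band quoted in Appendix B looks like the usual normal-approximation interval for a binomial proportion. A minimal sketch of that arithmetic, assuming the Shapiro-Wilk p-values are uniform on (0, 1) when T is exactly normal (the variable names here are illustrative, not from the original code):

```r
# Normal-approximation band for the share of Shapiro-Wilk p-values below alpha,
# assuming the p-values are uniform on (0, 1) under exact normality of T.
alpha <- 0.05    # nominal level used in the appendix
M     <- 10000   # number of Shapiro-Wilk tests (outer repetitions)

se   <- sqrt(alpha * (1 - alpha) / M)          # binomial standard error of the share
band <- alpha + c(-1, 1) * qnorm(0.975) * se   # two-sided 95% band around alpha
round(band, 4)                                 # 0.0457 0.0543
```

The observed 0.0497 falls inside this band, which is what the appendix means by "as expected".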

2. ## Re: I.I.D assumption of T-test

I am one of the least statistically savvy people here, but I don't think this is breaking news. I have been under the impression that every week someone points out another situation in which the t-test assumptions don't have to hold once sample sizes get large, etc. I saw this one earlier in the week.

Norman G. Likert scales, levels of measurement and the "laws" of statistics. Adv Health Sci Educ Theory Pract. 2010;15:625.

3. ## Re: I.I.D assumption of T-test

While Dason might know, I suspect this is an area where you will have to do a literature review to know for sure. If you are writing a thesis, your committee will probably require you to do this in order to comment on whether this is well known or not.

4. ## Re: I.I.D assumption of T-test

Seems like a simple corollary / special case of the Lyapunov CLT: http://en.wikipedia.org/wiki/Central...m#Lyapunov_CLT

5. ## The Following User Says Thank You to Dason For This Useful Post:

Englund (01-14-2014)

6. ## Re: I.I.D assumption of T-test

Originally Posted by noetsi
While Dason might know, I suspect this is an area where you will have to do a literature review to know for sure. If you are writing a thesis, your committee will probably require you to do this in order to comment on whether this is well known or not.
It is not an important part of my thesis, not at all. That's why I put it in an appendix. So I don't think any prof or opponent will put any emphasis on this subject. I've had a quick glance in Casella and Berger on the matter but unfortunately I haven't found anything :/

7. ## Re: I.I.D assumption of T-test

Note though that what we really get is asymptotic normality. The statistic *probably* isn't exactly T-distributed, but if you take a large enough random sample the T and the standard normal are close enough that you can't really tell.

8. ## Re: I.I.D assumption of T-test

Originally Posted by Dason
Note though that what we really get is asymptotic normality. The statistic *probably* isn't exactly T-distributed, but if you take a large enough random sample the T and the standard normal are close enough that you can't really tell.
Ah, yes of course. Thanks

9. ## Re: I.I.D assumption of T-test

Originally Posted by Englund
I've had a quick glance in Casella and Berger on the matter but unfortunately I haven't found anything :/
You could use the Lindeberg condition to justify the asymptotic normality too and that is in Casella & Berger (although somewhat hidden). If you look at the first part of 5.8 Miscellanea you'll see it's about the Lindeberg condition.

10. ## Re: I.I.D assumption of T-test

In my case Lyapunov's condition holds.
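
A sketch of why the condition can hold with fixed, unequal variances. The notation is assumed here rather than taken from the thesis: independent $x_i \sim N(0, \sigma_i^2)$, with "well behaved" read as $0 < a \le \sigma_i \le b < \infty$ for all $i$, and $s_n^2 = \sum_{i=1}^{n} \sigma_i^2$. Taking $\delta = 2$ in Lyapunov's condition and using $\mathrm{E}\,x_i^4 = 3\sigma_i^4$ for a normal variable:

```latex
\[
\frac{1}{s_n^{4}} \sum_{i=1}^{n} \mathrm{E}\,|x_i|^{4}
  = \frac{3 \sum_{i=1}^{n} \sigma_i^{4}}{\bigl(\sum_{i=1}^{n} \sigma_i^{2}\bigr)^{2}}
  \le \frac{3 n b^{4}}{n^{2} a^{4}}
  = \frac{3 b^{4}}{a^{4}\, n}
  \xrightarrow[n \to \infty]{} 0 .
\]
```

Under these assumptions the Lyapunov CLT gives $\sum_i x_i / s_n \to N(0,1)$ in distribution. The sample variance divided by $s_n^2 / n$ tends to 1 in probability under the same bounds, so by Slutsky's theorem the T-statistic is asymptotically standard normal as well.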

Thanks again guys.

11. ## Re: I.I.D assumption of T-test

Another way to justify this is to recognize that you're really just sampling iid from the mixture distribution. It should be fairly obvious that the variance is finite for this distribution so you can just use the 'typical' CLT then.
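
The mixture argument can be checked numerically. A minimal sketch, assuming the set-up of the first post (standard deviation drawn from Uniform(0, 1), then a normal draw with that standard deviation); the sample size and seed are arbitrary choices for illustration:

```r
# Sketch: drawing sd ~ Uniform(0, 1) and then x ~ N(0, sd^2) is the same as
# drawing x iid from a single scale mixture of normals.
set.seed(1)                        # arbitrary seed, only for reproducibility
n <- 1e5                           # sample size chosen for illustration
s <- runif(n)                      # one random standard deviation per draw
x <- rnorm(n, mean = 0, sd = s)    # rnorm is vectorised over sd

# Marginal variance of the mixture: E[sd^2] = integral of s^2 over (0, 1) = 1/3
c(sample_var = var(x), theoretical = 1 / 3)
```

Because the mixture has finite variance (1/3 under these assumptions), the ordinary iid CLT applies to the draws directly.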

12. ## Re: I.I.D assumption of T-test

Right you are. It is true for the simulation setting described in the first post. But in my case the variances are unequal and fixed, so the variance is not a random variable that can be "written into" the "parent" distribution (dunno the correct terminology, thus my pseudo-correct terminology :P). I can attach a preliminary report in a couple of days so you can read up on the subject. But as I said in a previous post, this problem is not really a problem and thus only treated in an appendix. The thesis is about a completely different subject.
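
For the fixed-variance case, a small simulation along the same lines is easy to set up. This is only a sketch: the grid of standard deviations, the sample sizes and the seed are hypothetical choices, not taken from the thesis.

```r
# Sketch: T-statistic when the standard deviations are fixed and unequal.
# The grid of sds below is a hypothetical choice, not taken from the thesis.
set.seed(1)                               # arbitrary seed
n     <- 200                              # observations per sample
N     <- 5000                             # number of simulated T-statistics
sigma <- seq(0.5, 2, length.out = n)      # fixed, bounded, unequal sds

t_stat <- replicate(N, {
  x <- rnorm(n, mean = 0, sd = sigma)     # independent, not identically distributed
  mean(x) / (sd(x) / sqrt(n))             # usual one-sample T-statistic
})

# Share of |T| above the two-sided 5% normal critical value; should be near 0.05
reject_rate <- mean(abs(t_stat) > qnorm(0.975))
reject_rate
```

If the T-statistic is close to standard normal, `reject_rate` should land near 0.05; a markedly different value would suggest the variances are not "well behaved" enough for the sample size used.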
