
Thread: I.I.D assumption of T-test

  1. #1
    TS Contributor

    Location
    Sweden
    Posts
    524
    Thanks
    44
    Thanked 112 Times in 100 Posts

    I.I.D assumption of T-test




    I am writing my second master's thesis and came across a situation where V[x_i]\neq V[x_j]. So I thought I'd test whether equal variances are really needed in order to use the usual T-test. It turned out that even if the identically distributed assumption doesn't hold, the T-statistic is still asymptotically standard normal. In this simulation the variances are "well behaved" (bounded, drawn from a uniform distribution), so I guess the result doesn't hold in all cases. In my paper the variances are also unequal but "well behaved", so I guess I could have used a T-test there, but that is another story.

    My question is: is the result I arrived at here well known? And what are the implications; does it "open any doors" to new situations where the T-test can be used for large samples?

    See a preliminary appendix in my paper below for simulation set-up.

    Appendix B
    In this appendix I give simulation-based evidence that the ordinary T-statistic is asymptotically normal despite unequal variances of the observations (x_1,x_2,…,x_n), at least under the conditions used in this simulation. The outline of the simulation is as follows: (1) simulate x_i, i=1,2,…,200, from a zero-mean normal distribution, independently assigning each x_i a random standard deviation drawn from the uniform distribution on (0,1), that is, with pdf equal to 1; (2) calculate the T-statistic; (3) repeat steps (1) and (2) 100 times and obtain the p-value of the Shapiro-Wilk normality test on the 100 T-statistics; (4) repeat steps (1), (2) and (3) 10000 times; (5) compute the proportion of p-values below α=0.05. If T is asymptotically standard normal, this proportion should fall between 0.0457 and 0.0543 about 95 percent of the time. The result of the simulation was 0.0497, as expected.
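
    The interval quoted above can be reproduced with a normal approximation to the binomial proportion. A minimal sketch, assuming the 10000 Shapiro-Wilk p-values are independent and each falls below 0.05 with probability exactly 0.05 (the variable names are chosen here for illustration and are not part of the simulation code):

    ```r
    alpha <- 0.05   # nominal level of each Shapiro-Wilk test
    M <- 10000      # number of Shapiro-Wilk tests in the simulation
    se <- sqrt(alpha * (1 - alpha) / M)           # standard error of the rejection proportion
    band <- alpha + c(-1, 1) * qnorm(0.975) * se  # two-sided 95% band around alpha
    round(band, 4)  # 0.0457 0.0543
    ```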

    B.1 Code in R
    Code: 
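    # N: T-statistics per Shapiro-Wilk test, n: sample size, M: number of tests
    prog <- function(N, n, M) {
      set.seed(12345)
      Z <- numeric(N); shap <- numeric(M); x <- numeric(n)
      for (j in 1:M) {
        for (i in 1:N) {
          for (k in 1:n) {
            # third argument of rnorm() is the standard deviation, here U(0,1)
            x[k] <- rnorm(1, 0, runif(1))
          }
          Z[i] <- mean(x) / (sd(x) / sqrt(n))  # T-statistic for H0: mean = 0
        }
        shap[j] <- shapiro.test(Z)$p.value  # normality test on the N T-statistics
      }
      # proportion of Shapiro-Wilk p-values below 0.05
      mean(shap < 0.05)
    }
    prog(100, 200, 10000)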
    Last edited by Englund; 01-14-2014 at 03:50 PM.

  2. #2
    Omega Contributor
    hlsmith's Avatar
    Location
    Not Ames, IA
    Posts
    6,993
    Thanks
    397
    Thanked 1,185 Times in 1,146 Posts

    Re: I.I.D assumption of T-test

    I am one of the least statistically savvy people here, but I don't think this is breaking news. My impression is that every week someone points out another situation where the t-test assumptions don't have to hold once the sample size is large enough. I saw this one earlier in the week.

    Norman G. Likert scales, levels of measurement and the "laws" of statistics. Adv Health Sci Educ Theory Pract. 2010;15:625.
    Stop cowardice, ban guns!

  3. #3
    Fortran must die
    noetsi's Avatar
    Posts
    6,532
    Thanks
    692
    Thanked 915 Times in 874 Posts

    Re: I.I.D assumption of T-test

    Dason might know, but I suspect this is an area where you will have to do a literature review to know for sure. If you are writing a thesis, your committee will probably require one anyway so that you can say whether the result is well known.
    "Very few theories have been abandoned because they were found to be invalid on the basis of empirical evidence...." Spanos, 1995

  4. #4
    Devorador de queso
    Dason's Avatar
    Location
    Tampa, FL
    Posts
    12,931
    Thanks
    307
    Thanked 2,629 Times in 2,245 Posts

    Re: I.I.D assumption of T-test

    Seems like a simple corollary / special case of the Lyapunov CLT: http://en.wikipedia.org/wiki/Central...m#Lyapunov_CLT
    I don't have emotions and sometimes that makes me very sad.

  5. The Following User Says Thank You to Dason For This Useful Post:

    Englund (01-14-2014)

  6. #5
    TS Contributor

    Location
    Sweden
    Posts
    524
    Thanks
    44
    Thanked 112 Times in 100 Posts

    Re: I.I.D assumption of T-test

    Quote Originally Posted by noetsi View Post
    Dason might know, but I suspect this is an area where you will have to do a literature review to know for sure. If you are writing a thesis, your committee will probably require one anyway so that you can say whether the result is well known.
    It is not an important part of my thesis, not at all. That's why I put it in an appendix. So I don't think any professor or opponent will put much emphasis on it. I've had a quick glance in Casella and Berger on the matter but unfortunately I haven't found anything :/

  7. #6
    Devorador de queso
    Dason's Avatar
    Location
    Tampa, FL
    Posts
    12,931
    Thanks
    307
    Thanked 2,629 Times in 2,245 Posts

    Re: I.I.D assumption of T-test

    Note though that what we really get is asymptotic normality. The statistic *probably* isn't T-distributed, but if you take a large enough random sample the T and the standard normal are close enough that you can't really tell.
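
    To put a number on "close enough", here is a small sketch comparing two-sided 5% critical values. The choice of 199 degrees of freedom just mirrors n = 200 from the first post and assumes the usual n - 1 reference distribution; the other values are arbitrary:

    ```r
    df <- c(10, 30, 199, 1000)   # 199 corresponds to n = 200
    t_crit <- qt(0.975, df)      # critical values of the T distribution
    z_crit <- qnorm(0.975)       # standard normal critical value, about 1.96
    data.frame(df = df, t_crit = t_crit, gap = t_crit - z_crit)
    ```

    The gap shrinks as the degrees of freedom grow; at 199 it is roughly 0.01.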
    I don't have emotions and sometimes that makes me very sad.

  8. #7
    TS Contributor

    Location
    Sweden
    Posts
    524
    Thanks
    44
    Thanked 112 Times in 100 Posts

    Re: I.I.D assumption of T-test

    Quote Originally Posted by Dason View Post
    Note though that what we really get is asymptotic normality. The statistic *probably* isn't T-distributed, but if you take a large enough random sample the T and the standard normal are close enough that you can't really tell.
    Ah, yes of course. Thanks

  9. #8
    Devorador de queso
    Dason's Avatar
    Location
    Tampa, FL
    Posts
    12,931
    Thanks
    307
    Thanked 2,629 Times in 2,245 Posts

    Re: I.I.D assumption of T-test

    Quote Originally Posted by Englund View Post
    I've had a quick glance in Casella and Berger on the matter but unfortunately I haven't found anything :/
    You could use the Lindeberg condition to justify the asymptotic normality too and that is in Casella & Berger (although somewhat hidden). If you look at the first part of 5.8 Miscellanea you'll see it's about the Lindeberg condition.
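
    For reference, the Lindeberg condition for independent x_i with means \mu_i, finite variances \sigma_i^2 and s_n^2=\sum_{i=1}^{n}\sigma_i^2 is usually stated as follows (standard textbook form, written in the notation of this thread rather than copied from Casella & Berger): for every \varepsilon>0,

    \lim_{n\rightarrow\infty}\frac{1}{s_n^{2}}\sum_{i=1}^{n}E\left[(x_i-\mu_i)^{2}\,\mathbf{1}\{|x_i-\mu_i|>\varepsilon s_n\}\right]=0.

    When it holds, \sum_{i=1}^{n}(x_i-\mu_i)/s_n converges in distribution to N(0,1). Lyapunov's condition implies Lindeberg's, so either one can be used for the justification.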
    I don't have emotions and sometimes that makes me very sad.

  10. #9
    TS Contributor

    Location
    Sweden
    Posts
    524
    Thanks
    44
    Thanked 112 Times in 100 Posts

    Re: I.I.D assumption of T-test

    In my case Lyapunov's condition holds.

    \lim_{n\rightarrow\infty}\frac{1}{s_n^{2+\delta}}\sum_{i=1}^{n}E[|x_i-\mu_i|^{2+\delta}]=0, \quad s_n^2=\sum_{i=1}^{n}\sigma_i^2.
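
    As a worked check, under assumptions made only for illustration: suppose the x_i are independent normal with standard deviations bounded as 0<a\leq\sigma_i\leq b<\infty, write s_n^2=\sum_{i=1}^{n}\sigma_i^2, and take \delta=2. Then E[|x_i-\mu_i|^{4}]=3\sigma_i^{4}\leq 3b^{4} and s_n^{4}\geq n^{2}a^{4}, so

    \frac{1}{s_n^{4}}\sum_{i=1}^{n}E[|x_i-\mu_i|^{4}]\leq\frac{3nb^{4}}{n^{2}a^{4}}=\frac{3b^{4}}{a^{4}n}\rightarrow 0 \quad (n\rightarrow\infty).

    So in the normal case, variances that are bounded and bounded away from zero are enough for the condition.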

    Thanks again guys.

  11. #10
    Devorador de queso
    Dason's Avatar
    Location
    Tampa, FL
    Posts
    12,931
    Thanks
    307
    Thanked 2,629 Times in 2,245 Posts

    Re: I.I.D assumption of T-test

    Another way to justify this is to recognize that you're really just sampling iid from the mixture distribution. It should be fairly obvious that the variance is finite for this distribution so you can just use the 'typical' CLT then.
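
    For the set-up in the first post this is easy to check. Assuming the standard deviation is drawn as \sigma ~ U(0,1) and x given \sigma is N(0,\sigma^2), the marginal variance is E[\sigma^2]=1/3, which is finite. A quick sketch (the number of draws and the seed are arbitrary choices):

    ```r
    set.seed(1)                          # arbitrary seed, for reproducibility only
    m <- 1e5                             # number of draws from the mixture
    sigma <- runif(m)                    # random standard deviations, U(0,1)
    x <- rnorm(m, mean = 0, sd = sigma)  # one draw per sigma: iid from the mixture
    c(sample_var = var(x), theory = 1/3)
    ```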
    I don't have emotions and sometimes that makes me very sad.

  12. #11
    TS Contributor

    Location
    Sweden
    Posts
    524
    Thanks
    44
    Thanked 112 Times in 100 Posts

    Re: I.I.D assumption of T-test


    Right you are. It is true for the simulation setting described in the first post. But in my case the variances are unequal and fixed, so the variance is not a random variable that can be "written into" the "parent" distribution (dunno the correct terminology, thus my pseudo-correct terminology :P). I can attach a preliminary report in a couple of days so you can read up on the subject. But as I said in a previous post, this problem is not really a problem and thus only treated in an appendix. The thesis is about a completely different subject.
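
    The fixed-variance case can be simulated along the same lines. A rough sketch, where the standard deviations (an evenly spaced grid from 0.5 to 2), the sample size and the number of replications are made-up values for illustration and not the ones from the thesis:

    ```r
    set.seed(2014)                          # arbitrary seed
    n <- 200                                # observations per sample
    reps <- 4000                            # number of simulated T-statistics
    sigma <- seq(0.5, 2, length.out = n)    # fixed, unequal standard deviations (hypothetical)

    t_stat <- replicate(reps, {
      x <- rnorm(n, mean = 0, sd = sigma)   # independent but not identically distributed
      mean(x) / (sd(x) / sqrt(n))           # ordinary one-sample T-statistic
    })

    # share of |T| beyond the two-sided 5% normal critical value
    reject_rate <- mean(abs(t_stat) > qnorm(0.975))
    reject_rate
    ```

    If the normal approximation is adequate for these variances, reject_rate should land near 0.05.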
    Last edited by Englund; 01-16-2014 at 02:39 PM.
