
Thread: paired samples-equality of variance and 95% CI around difference

  1. #1
    Points: 5,209, Level: 46
    Level completed: 30%, Points required for next Level: 141

    Posts
    16
    Thanks
    0
    Thanked 1 Time in 1 Post

    paired samples-equality of variance and 95% CI around difference




    Hi,

    Can a few of you please review the approach I plan to take for obvious errors?

    I have 50 subjects, each with a measure taken on the same variable before and after treatment. So, this is standard paired t-test time, but what I am actually interested in is the variance of the treatment versus the control. I would like to test the equality of variance for these two groups of values (treatment and control) and also place a 95% confidence interval around the difference of these two variances. I would prefer randomization/resampling methods for both, as normality assumptions do not hold and I would like a robust result. I have not found any routines that do specifically what I want, so I think I may have to do the following in R. Any advice on an easier or better approach is welcome.

    I know that equality of variance for paired data can be tested using the Pitman-Morgan statistic. I was planning on calculating this for the original data and then randomly swapping the pre-treatment and post-treatment values within pairs, so that the randomization respects the paired nature of the data. I could then extract a p-value as the proportion of randomizations with a Pitman-Morgan statistic at least as extreme as the observed one.
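
    A minimal sketch of that randomization, under some assumptions: the Pitman-Morgan statistic is taken as the correlation between pair sums and pair differences, swapping a pair is implemented as flipping the sign of its difference, and each column is centered at its own mean first so that a shift in means does not enter the test (which makes the test approximate rather than exact). The function name pm_randomization_test and the simulated data are hypothetical, for illustration only.

```r
# Within-pair randomization test built on the Pitman-Morgan idea:
# cov(post + pre, post - pre) = var(post) - var(pre), so the correlation
# between pair sums and pair differences is zero when the variances are equal.
# pm_randomization_test is a hypothetical helper, not a package function.
pm_randomization_test <- function(pre, post, n_perm = 2000) {
  # Assumption: center each column so a difference in means is not tested
  pre  <- pre - mean(pre)
  post <- post - mean(post)
  s <- post + pre
  d <- post - pre
  observed <- cor(s, d)
  perm_stats <- replicate(n_perm, {
    # Swapping a pair leaves its sum unchanged and flips the sign of its difference
    flip <- sample(c(-1, 1), length(d), replace = TRUE)
    cor(s, d * flip)
  })
  # Two-sided p-value, counting the observed statistic as one of the draws
  p_value <- (sum(abs(perm_stats) >= abs(observed)) + 1) / (n_perm + 1)
  list(statistic = observed, p_value = p_value)
}

# Simulated example data (made up): post is noisier than pre
set.seed(1)
pre  <- rnorm(50, mean = 50, sd = 5)
post <- pre + rnorm(50, mean = 0, sd = 10)
res <- pm_randomization_test(pre, post)
print(res)
```

    With real data, pre and post would be the two measurement columns, in the same subject order.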

    For the 95% CI around the difference, I thought I would resample pairs of values with replacement. So, I would select among the 50 subjects 50 times with replacement. I would then calculate the variance for the pre-treatment measures and for the post-treatment measures, take the difference, and store this value. I would do this many times and then determine the 95% confidence interval by ordering my resamples and simply taking the 2.5% and 97.5% percentiles.
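
    The resampling just described could be sketched roughly as follows. This is only an illustration: boot_var_diff_ci is a hypothetical helper name, the data are simulated, and the plain percentile method is used as stated above.

```r
# Percentile bootstrap CI for var(post) - var(pre), resampling whole pairs.
# boot_var_diff_ci is a hypothetical helper name, not a package function.
boot_var_diff_ci <- function(pre, post, n_boot = 2000, level = 0.95) {
  n <- length(pre)
  diffs <- replicate(n_boot, {
    # Draw subjects with replacement; the same index is used for both
    # columns so each subject's pre and post values stay together
    idx <- sample.int(n, n, replace = TRUE)
    var(post[idx]) - var(pre[idx])
  })
  alpha <- 1 - level
  quantile(diffs, probs = c(alpha / 2, 1 - alpha / 2))
}

# Simulated example data (made up for illustration)
set.seed(42)
pre  <- rnorm(50, mean = 50, sd = 7)
post <- pre + rnorm(50, mean = 5, sd = 8)
ci <- boot_var_diff_ci(pre, post)
print(ci)
```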

    Does this make sense at all?

    Thanks,
    Seth

  2. #2
    TS Contributor
    Points: 8,362, Level: 61
    Level completed: 71%, Points required for next Level: 88

    Location
    Crete, Greece
    Posts
    717
    Thanks
    0
    Thanked 35 Times in 34 Posts

    Re: paired samples-equality of variance and 95% CI around difference

    The third paragraph looks fine to me. But bear in mind that since these are paired data, you should calculate the differences and construct a 95% CI for the mean difference.
    Now, if you want to see whether the two groups are different, I suggest something like a repeated measures ANOVA. It is, let's say, a paired t-test that will also show whether the two groups are significantly different or not. There is definitely a routine in R for this, but I haven't used it. I usually use SPSS for this.
    As for what you said in the second paragraph: why do you need a Pitman-Morgan test for two dependent samples?

  3. #3

    Re: paired samples-equality of variance and 95% CI around difference

    Thanks for the reply. I was wanting a confidence interval around the differences in variances between control/treatment. That's why I mentioned resampling and taking the variance of control and treatment from each resample and then taking the difference in these two variances for each iteration. What you wrote was encouraging but left me a bit confused, so maybe I didn't express myself well. It would be like this for one resample (using 3 subjects for ease of illustration).

    subject control treatment
    1 10 15
    2 15 20
    3 7 9

    A sample with replacement might be subject 1, subject 2, and subject 2 (again).

    The variance for the control would be the variance of (10, 15, 15) = X and that of the treatment would be the variance of (15, 20, 20) = Y. I would then subtract X from Y (Y-X) and store this value. If I did this, say, 1,000 times, I would have a mean difference in variances as well as a 95% CI using the percentile method or another. Make sense?
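
    Worked through in R for this one resample (the object names are just for illustration):

```r
# The three-subject toy data from the table above
control   <- c(10, 15, 7)
treatment <- c(15, 20, 9)

# One resample with replacement: subject 1, subject 2, subject 2
idx <- c(1, 2, 2)

x <- var(control[idx])    # variance of (10, 15, 15), i.e. 25/3
y <- var(treatment[idx])  # variance of (15, 20, 20), i.e. 25/3
stored <- y - x           # the value kept for this resample
print(c(X = x, Y = y, diff = stored))
```

    For this particular draw Y-X comes out as 0 (up to rounding), because subjects 1 and 2 both shift by exactly 5 from control to treatment. Other draws give other values.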

    As for the Pitman-Morgan statistic, it is a test of equality of variance for paired data. Others, like Levene's and Bartlett's tests, are tests of equality of variance for independent data (no pairs). There is a paper with a permutation test using the Pitman-Morgan statistic, but it tests the joint null hypothesis that there is no difference in mean or variance between the paired samples, so I would want to eliminate the difference-in-mean part, as that is not of interest in my case.

    Thanks again,

    Seth

  4. #4
    TS Contributor

    Re: paired samples-equality of variance and 95% CI around difference

    Ok, I see; yes, it seems OK for the bootstrap. But I strongly suggest using not the difference in the variances, but the ratio.

  5. #5
    Devorador de queso
    Points: 95,540, Level: 100
    Level completed: 0%, Points required for next Level: 0
    Awards:
    Posting Award, Community Award, Discussion Ender, Frequent Poster
    Dason's Avatar
    Location
    Tampa, FL
    Posts
    12,930
    Thanks
    307
    Thanked 2,629 Times in 2,245 Posts

    Re: paired samples-equality of variance and 95% CI around difference

    Quote Originally Posted by Masteras View Post
    Ok, I see; yes, it seems OK for the bootstrap. But I strongly suggest using not the difference in the variances, but the ratio.
    I second this suggestion.

  6. #6

    Re: paired samples-equality of variance and 95% CI around difference

    Thanks, as luck would have it I was just reading about this. Yes, it seems more straightforward. Is this suggested because it is similar to an F-test? Because it holds more information?

  7. #7
    TS Contributor

    Re: paired samples-equality of variance and 95% CI around difference

    Because it holds more information. The F-test uses the same test statistic, but that is not the reason; saying it is suggested because it is similar to an F-test is not quite correct.

  8. #8
    Cookie Scientist
    Points: 13,431, Level: 75
    Level completed: 46%, Points required for next Level: 219
    Jake's Avatar
    Location
    Austin, TX
    Posts
    1,293
    Thanks
    66
    Thanked 584 Times in 438 Posts

    Re: paired samples-equality of variance and 95% CI around difference

    I have to confess that I can't really see how the ratio of variances is in general a better choice than the difference of variances for this application. I was curious exactly how one would implement this type of bootstrapping in R anyway, so I worked it out on some simulated data and compared the results of var(A)-var(B) to var(A)/var(B). Code and results are below.

    I'm going to cross-post a version of this code in the code sticky from the R/Splus forum shortly, along with some geeky details about the implementation and a discussion of how to get it to play even nicer with the boot() function, if anyone is interested in that. But you should be able to adapt this code with minimal modification to work on your data as is, seth.

    Code: 
    > library(data.table)
    > library(boot)
    > set.seed(12345) # I've got the same combination on my luggage!
    > 
    > ### make some paired data with unequal variances
    > dat <- data.table(subject=rep(1:50,2),
    +                   prepost=rep(c(-1,1),each=50),
    +                   subint=rep(rnorm(50,mean=0,sd=5),2),
    +                   subslope=rep(rnorm(50,mean=5,sd=3),2),
    +                   error=c(rnorm(50,mean=0,sd=5),rnorm(50,mean=0,sd=10)),
    +                   key="subject,prepost")
    > dat$dv <- round(55 + dat$subint + dat$subslope*dat$prepost + dat$error,2)
    > dat <- data.table(subject=1:50,
    +                   pre=dat[prepost==-1]$dv,
    +                   post=dat[prepost==1]$dv,
    +                   key="subject")
    > 
    > ### examine
    > head(dat)
         subject   pre  post
    [1,]       1 55.67 45.11
    [2,]       2 41.92 74.87
    [3,]       3 51.40 61.57
    [4,]       4 40.05 50.72
    [5,]       5 55.75 59.93
    [6,]       6 37.40 49.23
    > nrow(dat)
    [1] 50
    > dat[,list(mean_pre=mean(pre),mean_post=mean(post))]
         mean_pre mean_post
    [1,]  49.8998   62.8648
    > dat[,list(var_pre=var(pre),var_post=var(post))]
          var_pre var_post
    [1,] 79.73543 142.5389
    > cor(dat$pre,dat$post)
    [1] 0.2546421
    >                              
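    > ### bootstrap!
    > # boot() passes a vector of resampled row indices as the second argument.
    > # Here those indices serve only as random "seeds" that decide, pair by
    > # pair, whether the pre and post values are swapped.
    > getvarstats <- function(data, seeds) {
    +   # flag is 1 where seeds[i] exceeds its right-hand neighbour, else 0
    +   # (max.col breaks ties at random by default)
    +   index <- max.col(matrix(c(c(seeds[2:length(seeds)],seeds[1]),seeds),ncol=2))-1
    +   # invert the flag for the last (wrap-around) comparison
    +   index[length(seeds)] <- !index[length(seeds)]
    +   # positions into `values`: first half builds the new pre column,
    +   # second half the new post column (swapped where the flag is 1)
    +   index <- c(1:length(seeds)+index*length(seeds),
    +              1:length(seeds)+index*length(seeds)+length(seeds))
    +   values <- c(data$pre,data$post,data$pre)
    +   d <- data.table(pre=values[index[1:length(seeds)]],
    +                   post=values[index[seq(length(seeds)+1,2*length(seeds))]])
    +   return(c(vardiff = var(d$post) - var(d$pre),
    +            varratio = var(d$post)/var(d$pre),
    +            postvar = var(d$post),
    +            prevar = var(d$pre)))
    + }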
    > resamples <- 1000000
    > system.time({results <- boot(data=dat, statistic=getvarstats, R=resamples)})
        user   system  elapsed 
     983.075    8.630 1048.131 
    > hist(results$t[,1],breaks=100)
    > p_diff <- mean(results$t[,1] > var(dat$post)-var(dat$pre) 
    +                | results$t[,1] < var(dat$pre)-var(dat$post))
    > p_diff
    [1] 0.085984
    > hist(results$t[,2],breaks=100)
    > p_ratio <- mean(results$t[,2] > var(dat$post)/var(dat$pre) 
    +                 | results$t[,2] < var(dat$pre)/var(dat$post))
    > p_ratio
    [1] 0.015534
    The bootstrapped p-value for the variance difference is .086, while the bootstrapped p-value for the variance ratio is .016, even though they used the exact same resamples. Since I simulated the data such that the variance for post "truly is" greater than the variance for pre, we might say that if alpha=.05 then the result from the variance difference is a type II error while the result from the variance ratio is not. Testing relative power and type I error rates for these using many simulated data sets would be interesting and wouldn't involve much more work at all if anyone is interested.

    But again, I confess that I don't have an intuitive grasp on why these two statistics differ practically in terms of the results they get. My prediction was that the two results from above would be identical. I would appreciate some insight on this. Even the apparently obvious fact that the variance ratio "holds more information" than the variance difference is not at all obvious to me...
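
    One partial observation, offered as a sketch rather than a full answer: a resampling p-value counts how many resamples are "more extreme" than the observed data, and the difference and the ratio can rank the very same resamples differently. The variance pairs below are hypothetical numbers, made up purely to illustrate this.

```r
# Hypothetical (var_pre, var_post) pairs from three imaginary resamples;
# the numbers are made up for illustration only
var_pre  <- c(50, 5, 100)
var_post <- c(100, 20, 130)

vardiff  <- var_post - var_pre   # 50, 15, 30
varratio <- var_post / var_pre   # 2, 4, 1.3

# Rank the resamples from most to least extreme under each statistic
rank_by_diff  <- order(vardiff, decreasing = TRUE)   # 1 3 2
rank_by_ratio <- order(varratio, decreasing = TRUE)  # 2 1 3
print(rbind(rank_by_diff, rank_by_ratio))
```

    Because the orderings differ, the share of resamples beyond the observed value need not be the same for the two statistics, even with identical resamples.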

  9. #9
    TS Contributor

    Re: paired samples-equality of variance and 95% CI around difference

    Opa, wait: you said you wanted to calculate confidence intervals, not perform hypothesis testing. Furthermore, the absolute difference of the variances could be something to look at, not just the difference. Is the difference a pivotal statistic? No, the ratio is. But anyway, did you do the resampling under the null hypothesis? I think not.
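
    One concrete aspect of this point, sketched on simulated data (the data and the unit change k are assumptions for illustration): the variance ratio does not depend on the measurement units, while the variance difference does.

```r
# Simulated data, made up for illustration
set.seed(7)
pre  <- rnorm(50, sd = 5)
post <- rnorm(50, sd = 10)

k <- 10  # a change of units, e.g. centimetres to millimetres

diff_orig    <- var(post) - var(pre)
diff_scaled  <- var(k * post) - var(k * pre)   # grows by a factor of k^2
ratio_orig   <- var(post) / var(pre)
ratio_scaled <- var(k * post) / var(k * pre)   # unchanged

print(c(diff_orig = diff_orig, diff_scaled = diff_scaled,
        ratio_orig = ratio_orig, ratio_scaled = ratio_scaled))
```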

  10. #10
    Cookie Scientist
    Jake's Avatar

    Re: paired samples-equality of variance and 95% CI around difference

    Yes, it resamples the variance difference and the variance ratio under the null hypothesis that they are 0 and 1, respectively.
    Code: 
    > quantile(results$t[,1],probs=c(.025,.5,.975))
            2.5%          50%        97.5% 
    -71.14886036  -0.00124751  71.13353136 
    > quantile(results$t[,2],probs=c(.025,.5,.975))
         2.5%       50%     97.5% 
    0.6216802 0.9999918 1.6083679
    [Figure: histogram of variance difference resamples]
    [Figure: histogram of variance ratio resamples]

  11. #11
    TS Contributor

    Re: paired samples-equality of variance and 95% CI around difference

    Ok, let's say that you have a point. You said before

    "The bootstrapped p-value for the variance difference is .086, while the bootstrapped p-value for the variance ratio is .016, despite that they used the exact same resamples." The results agree: there is no difference in the variances based on the confidence intervals either way.

  12. #12
    Cookie Scientist
    Jake's Avatar

    Re: paired samples-equality of variance and 95% CI around difference

    How do you figure that they agree? The results based on the variance difference would lead us to believe that our observed results are over 5x more likely under the null hypothesis than the results based on the variance ratio.

  13. #13
    TS Contributor

    Re: paired samples-equality of variance and 95% CI around difference

    From the confidence intervals you gave us, and the plot of the bootstrapped data also.

  14. #14
    Cookie Scientist
    Jake's Avatar

    Re: paired samples-equality of variance and 95% CI around difference

    ...also what?

  15. #15
    TS Contributor

    Re: paired samples-equality of variance and 95% CI around difference


    also the plots of the bootstrapped data.
