I realize the term represents the interaction of the treatment with subjects, i.e., the treatment affects each participant differently; I am just trying to see its calculation more directly.

- Thread starter: DrJBN
- Start date:
- Tags: anova, error terms, mixed


Code:

```
> # set random number seed
> set.seed(12345)
>
> # create and examine wide-format dataset
> wide <- data.frame(pre=rnorm(10))
> wide$post <- wide$pre + 1 + rnorm(10)
> wide
pre post
1 0.5855288 1.469281011
2 0.7094660 3.526778061
3 -0.1093033 1.261324550
4 -0.4534972 1.066719284
5 0.6058875 0.855355461
6 -1.8179560 -0.001056128
7 0.6300986 0.743741030
8 -0.2761841 0.392238305
9 -0.2841597 1.836552908
10 -0.9193220 0.379401697
>
> # create and examine long-format dataset
> long <- data.frame(stack(wide), subject=factor(rep(1:10, 2)))
> names(long)[1:2] <- c("score", "time")
> long
score time subject
1 0.585528818 pre 1
2 0.709466018 pre 2
3 -0.109303315 pre 3
4 -0.453497173 pre 4
5 0.605887456 pre 5
6 -1.817955968 pre 6
7 0.630098551 pre 7
8 -0.276184105 pre 8
9 -0.284159744 pre 9
10 -0.919322002 pre 10
11 1.469281011 post 1
12 3.526778061 post 2
13 1.261324550 post 3
14 1.066719284 post 4
15 0.855355461 post 5
16 -0.001056128 post 6
17 0.743741030 post 7
18 0.392238305 post 8
19 1.836552908 post 9
20 0.379401697 post 10
>
> # do the ANOVA the traditional way
> aovmod <- aov(score ~ subject + time + subject:time, data=long)
> summary(aovmod)
Df Sum Sq Mean Sq
subject 9 11.756 1.306
time 1 8.269 8.269
subject:time 9 3.189 0.354
> # like you said, SS_subject*time = SS_total - SS_subject - SS_time
>
> # and the F for the time effect is MS_time / MS_subject*time
> c(F_time = 8.269 / 0.354)
F_time
23.35876
>
> # an equivalent way to do this ANOVA is as a one-sample t-test on the
> # difference scores between "post" and "pre" scores for each subject
> diffs <- wide$post - wide$pre
> t.test(diffs)
One Sample t-test
data: diffs
t = 4.8308, df = 9, p-value = 0.0009328
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
0.6837866 1.8881689
sample estimates:
mean of x
1.285978
> c(F_time = 4.8308 ^ 2)
F_time
23.33663
> # the same F except for a little rounding error
>
> # the subject*time MS is literally the variance of these differences
> # except that, for uninteresting mathematical reasons, we have to
> # divide them by square root of 2 first to get the exact same MS value
> var(diffs/sqrt(2))
[1] 0.354318
> # compare to ANOVA table above
>
> # in other words, these 10 differences ARE our 10 subject*time
> # interaction effects (multiplied by a constant)
> diffs
[1] 0.8837522 2.8173120 1.3706279 1.5202165 0.2494680 1.8168998 0.1136425
[8] 0.6684224 2.1207127 1.2987237
> # so their variance, i.e. mean square, represents MS_subject*time
>
> # another interesting thing is that the subject mean square
> # represents the variance of the 10 subject "main effects"
> # in other words, the variance of these subject means
> (sub_effects <- c(subject = rowMeans(wide)))
subject1 subject2 subject3 subject4 subject5 subject6
1.0274049 2.1181220 0.5760106 0.3066111 0.7306215 -0.9095060
subject7 subject8 subject9 subject10
0.6869198 0.0580271 0.7761966 -0.2699602
> # except, again, for uninteresting mathematical reasons, we have to
> # first multiply them by a constant (sqrt 2) to get the exact same MS
> var(sub_effects * sqrt(2))
[1] 1.306209
> # compare to ANOVA table above
```

I was aware of the relationship between the paired-samples t-test and the repeated measures ANOVA, and it was that relationship that was giving me problems.

As you demonstrate, you can get the error for the ANOVA by looking at the differences between the two samples. That the error represents the interaction of treatment with subjects becomes clear there. If the treatment had a constant effect, independent of the subject, then the differences would all be the same and the error zero.
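That last point is easy to check directly in R (a toy example of my own, not data from the thread): if the treatment adds exactly the same amount for every subject, the difference scores have zero variance, so the subject-by-time error term is zero.

```r
# hypothetical data: a treatment that adds exactly 2 for every subject
pre   <- c(8, 2, 7, 5)
post  <- pre + 2        # constant treatment effect
diffs <- post - pre     # all differences equal 2
var(diffs)              # 0: no subject-by-time interaction, so zero error
```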

What I could not do was "see" the literal extension to more than two levels (e.g., A, B, C). Where could I "see" the treatment by subjects interaction among the three variables?
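One way to "see" it with three levels is this sketch of my own (simulated data, hypothetical variable names): the subject-by-treatment interaction effects are the residuals left after removing each subject's mean and each treatment level's mean from every score, and their sum of squares is exactly the residual SS that `aov()` reports.

```r
# three treatment levels A, B, C for 10 subjects (simulated)
set.seed(1)
n <- 10
scores <- data.frame(A = rnorm(n), B = rnorm(n) + 1, C = rnorm(n) + 2)

# interaction residual for each cell:
# score - subject mean - treatment mean + grand mean
m   <- as.matrix(scores)
res <- sweep(sweep(m, 1, rowMeans(m)), 2, colMeans(m)) + mean(m)

# the error (subject x treatment) sum of squares is the SS of these residuals
SS_error <- sum(res^2)

# compare with the residual SS from aov() on the long-format data
long <- data.frame(score   = c(m),
                   time    = factor(rep(colnames(m), each = n)),
                   subject = factor(rep(1:n, 3)))
aovmod <- aov(score ~ subject + time, data = long)
all.equal(SS_error, deviance(aovmod))   # TRUE
```

With two levels the residuals in each column are just the centered difference scores (up to sign and a constant), which is why the two-level case collapses to the paired t-test.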

I did finally make some sense of it, at least for myself (I think), by working with the 2-level case again, starting from these scores for four subjects at levels A and B:

```
A  B
8  5
2  4
7  2
5  2
```

Doing the calcs for a simple repeated measures ANOVA gave me

```
A  B  Subj. Mean
8  5  6.5
2  4  3
7  2  4.5
5  2  3.5
```

```
Source           Sum of Squares
Total            37.88
Effect           10.12
Between Subjs.   14.38
Error (S x T)    13.38
```
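These hand calculations can be verified with `aov()` (a check I added, not part of the original post); the residual line pins down the error term as 13.375, i.e. 13.38 to two decimals:

```r
# verify the hand-computed repeated-measures ANOVA with aov()
A <- c(8, 2, 7, 5)
B <- c(5, 4, 2, 2)
long <- data.frame(score   = c(A, B),
                   time    = factor(rep(c("A", "B"), each = 4)),
                   subject = factor(rep(1:4, 2)))
aovmod <- aov(score ~ subject + time, data = long)
summary(aovmod)
# Sum Sq column: subject = 14.375, time = 10.125,
# Residuals (the S x T error) = 13.375
```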

Then I approached it differently by subtracting each subject's mean from its A and B scores:

```
A-Mean  B-Mean
 1.5    -1.5
-1       1
 2.5    -2.5
 1.5    -1.5
```

The individual differences have been removed completely and the "error" must be contained within each variable rather than in the differences. That is, the SS of each variable represents how the treatment affected the individuals differently at that level of the variable, rather than how the difference in the variables varied across subjects.

Am I thinking about this correctly?

The sum of squares within each variable is now 6.688; as in a typical between-subjects ANOVA, where the error is contained within each group, these sum to 13.38, effectively reproducing the error term from the repeated-measures ANOVA.
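Here is that check in R (my addition): after subtracting each subject's mean, the within-variable sums of squares add up to the repeated-measures error SS.

```r
# subject-centered scores for the four subjects above
A  <- c(8, 2, 7, 5)
B  <- c(5, 4, 2, 2)
cA <- A - (A + B) / 2    # A minus each subject's mean:  1.5 -1  2.5  1.5
cB <- B - (A + B) / 2    # B minus each subject's mean: -1.5  1 -2.5 -1.5
SS_within <- sum((cA - mean(cA))^2) + sum((cB - mean(cB))^2)
SS_within                # 13.375, the Error (S x T) sum of squares
```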

p.s., how in the world do you get tabs/spaces into these posts so as to format numbers into rows & groups?


I will respond to the substantive portion of your post as soon as I find some time today.