# Mixed ANOVA in SPSS - Output interpretation

#### twentyseven

##### New Member
Hello everyone

A knee prosthesis is well positioned when the line from the center of the hip to the center of the ankle passes through the middle of the knee (more or less). We have three methods of knee alignment (conventional, navigation, or robotic surgery), and we measured the alignment before and after surgery on X-rays.
So we have the following variables:
• error_0: Difference (in absolute value) between the optimum (180 degrees) and the preoperative alignment. [error_0 = abs(180 - axis_0)]
• error_1: Difference (in absolute value) between the optimum (180 degrees) and the postoperative alignment. [error_1 = abs(180 - axis_1)]
• group: Type of treatment (conventional, navigation or robotic surgery)
We would like to see which method achieves better alignment. The trick here is that the postoperative alignment is the main indicator of a good result, but obviously starting from 179 degrees is not the same as starting from 160.

So I used a repeated-measures general linear model with error_0 and error_1 as the repeated measure and group as the between-subjects (grouping) factor.

Looking at the graph, it looks as though there might be some interaction:

And, if I am not mistaken, looking at the following tables it seems that:
• There is an overall change in the error from preop to postop when we ignore the type of treatment (p < 0.001)
• There are no differences in the mean error [(preop + postop)/2] among the three types of treatment (p = 0.174)
• But there is an interaction effect between time and type of treatment, meaning that one or more treatments improve more from preop to postop (p = 0.020). And, looking at the graph, robotic surgery is our best candidate (as expected).
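(For a design with only two time points like this one, these three tests have simple equivalents that make the interpretation concrete: the group main effect is a one-way ANOVA on each patient's mean error, and the time × group interaction is a one-way ANOVA on the preop-to-postop change. A minimal sketch in Python with made-up numbers, assuming scipy is available — group sizes and values are hypothetical, not the real data:)

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# Hypothetical data: 20 patients per group, alignment errors in degrees.
# "robotic" gets a worse baseline, mimicking the pattern in the thread.
n = 20
preop = {
    "conventional": rng.uniform(2, 10, n),
    "navigation":   rng.uniform(2, 10, n),
    "robotic":      rng.uniform(8, 16, n),
}
postop = {g: rng.uniform(0, 3, n) for g in preop}  # similar outcomes

# Group main effect: one-way ANOVA on each patient's mean of the two errors
means = [(preop[g] + postop[g]) / 2 for g in preop]
F_group, p_group = stats.f_oneway(*means)

# Time x group interaction: one-way ANOVA on the preop-to-postop change
changes = [preop[g] - postop[g] for g in preop]
F_int, p_int = stats.f_oneway(*changes)

print(f"group main effect: F={F_group:.2f}, p={p_group:.4f}")
print(f"time x group interaction: F={F_int:.2f}, p={p_int:.4f}")
```

With this synthetic data, the interaction comes out significant because the robotic group changes more, purely because its baseline was worse.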

But, when we run the post hoc tests, we do not get significant differences. Why is that? What do those post hoc tests refer to?

Any help would be greatly appreciated.
Thank you.

#### Karabiner

##### TS Contributor
> Looking at the graph, it looks as though there might be some interaction:

All three groups achieve about the same post-operative outcome. The (seeming) interaction comes from the fact that "robotic" had a worse baseline. Why is this? Weren't patients randomly allocated to groups?
> And, looking at the graph, robotic surgery is our best candidate (as expected).

As mentioned before, this is based on worse pre-operative errors (on average). Wouldn't it be plausible to assume that the other methods would have achieved the same post-op error, even if they had the same baseline as the robotic method?
> But, when we run the post hoc tests, we do not get significant differences. Why is that? What do those post hoc tests refer to?

This is the averaged main effect; it does not take the interaction effect into account.
https://www.ibm.com/support/pages/repeated-measures-post-hoc-tests

With kind regards

Karabiner

#### twentyseven

##### New Member
Thanks for your kind explanation, Karabiner

> Why is this? Weren't patients randomly allocated to groups?

The study is retrospective (not randomized), and I guess they used the robotic system on the patients who could benefit the most from it (i.e., those with the worst alignment).

> Wouldn't it be plausible to assume that the other methods would have achieved the same post-op error, even if they had the same baseline as the robotic method?

It would. That was the purpose of the analysis.

> This is the averaged main effect; it does not take the interaction effect into account.

I thought that might be the reason. So I guess this post hoc contrast relates to the main effect of the group factor (p = 0.174).

So I guess the best thing here is to forget about the repeated measures and just perform a regular ANOVA for the post hoc comparisons (noting that there are preoperative differences between the groups).
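(One way to do that regular ANOVA without throwing away the preoperative information would be an ANCOVA: test the group effect on error_1 after adjusting for error_0. A minimal least-squares sketch with synthetic data — the sample sizes and values are all made up, assuming numpy and scipy:)

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical data: 30 patients per group; robotic starts with worse alignment
n = 30
group = np.repeat([0, 1, 2], n)  # 0=conventional, 1=navigation, 2=robotic
error_0 = rng.uniform(1, 15, size=3 * n) + np.where(group == 2, 5.0, 0.0)
error_1 = 2.0 + 0.2 * error_0 + rng.normal(0, 1, size=3 * n)

def ancova_group_test(y, covariate, group):
    """F-test for a group effect on y after adjusting for a baseline
    covariate (classic ANCOVA, fit by ordinary least squares)."""
    n_obs = len(y)
    levels = np.unique(group)
    # Reduced model: intercept + covariate only
    X_reduced = np.column_stack([np.ones(n_obs), covariate])
    # Full model: add dummy codes for all groups but the first
    dummies = np.column_stack([(group == g).astype(float) for g in levels[1:]])
    X_full = np.column_stack([X_reduced, dummies])

    def rss(X):
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        return resid @ resid

    df1 = X_full.shape[1] - X_reduced.shape[1]
    df2 = n_obs - X_full.shape[1]
    F = ((rss(X_reduced) - rss(X_full)) / df1) / (rss(X_full) / df2)
    return F, stats.f.sf(F, df1, df2)

F, p = ancova_group_test(error_1, error_0, group)
print(f"adjusted group effect: F={F:.2f}, p={p:.4f}")
```

In SPSS this corresponds to a univariate GLM with error_1 as the dependent variable, group as a fixed factor, and error_0 as a covariate.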

Just for general knowledge... is there a way of getting post hoc pairwise comparisons for the interaction?

#### Karabiner

##### TS Contributor
> So I guess the best thing here is to forget about the repeated measures and just perform a regular ANOVA for the post hoc comparisons (noting that there are preoperative differences between the groups)
The message here seems to be "that method achieved results comparable to those from the other methods,
although the starting conditions were worse". Maybe there is the possibility of matching cases by pre-op
error, if baseline error values were overlapping, and maybe in addition a few prognostic factors could be
included.
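For instance, a simple greedy nearest-neighbour match on the pre-op error alone could look like this (a sketch with made-up data; the group sizes, values, and the 1-degree caliper are all hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical baseline errors in degrees
robotic_err0 = rng.uniform(5, 20, size=15)
conventional_err0 = rng.uniform(1, 15, size=40)

def match_nearest(treated, pool, caliper=1.0):
    """Greedy 1:1 matching: each treated case gets the unused control
    whose baseline error is closest, provided it lies within `caliper`
    degrees; treated cases with no close-enough control stay unmatched."""
    available = list(range(len(pool)))
    pairs = []
    for i, t in enumerate(treated):
        if not available:
            break
        j = min(available, key=lambda k: abs(pool[k] - t))
        if abs(pool[j] - t) <= caliper:
            pairs.append((i, j))
            available.remove(j)
    return pairs

pairs = match_nearest(robotic_err0, conventional_err0)
print(f"matched {len(pairs)} of {len(robotic_err0)} robotic cases")
```

Treated cases whose baseline lies outside the range of the control pool simply drop out, which is exactly the "overlapping baseline" caveat above.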
> Just for general knowledge... is there a way of getting post hoc pairwise comparisons for the interaction?
Repeating the analysis with k = 2 groups, one pair of groups at a time.

With kind regards

Karabiner

#### twentyseven

##### New Member
> Maybe there is the possibility of matching cases by pre-op error, if baseline error values were overlapping, and maybe in addition a few prognostic factors could be included.
I'll try that... although my only experience with propensity score matching was using the diff command in Stata. Do you know where I can find how to do matching in a general linear model with repeated measurements?

> Repeating the analysis with k = 2 groups.
Thank you! As expected, I found differences (p = 0.010), but, as you stated, they are probably due to the different baseline conditions.

Thanks a lot for your kind help.