Mixed models in SPSS -- swapping predictor and predicted?

#1
Hi,

I'm working on a project where I have repeated measures of a (continuous) IV and DV. At each measurement "time" I would like to predict the DV from the IV. I also have a few categorical IVs that do not vary across measurements. My goal is to see if the strength of the relationship between the continuous IV and DV depends on either of the categorical IVs.

I've been trying to use SPSS's MIXED procedure to accomplish this (more specifically, using /REPEATED with COVTYPE(AR1)). However, I've run into several problems I was hoping someone here could help me with:

1: MIXED only gives unstandardized coefficients, and my tests of the interactions between the continuous IV and the categorical IVs seem to be based on these unstandardized coefficients. Therefore, these tests are not telling me anything about changes in the strength of relationships, correct? My reasoning is: the unstandardized coefficient may very well be bigger in Condition A than in Condition B; however, if I swap the IV and the DV, the effect could be the opposite, i.e., a bigger coefficient in Condition B than in Condition A.

This leads me to my first general question: can you switch the predictor and predicted variables in a mixed model and expect the same results?

2: in an attempt to get around this and make sure the tests are based on standardized values, I converted my IV and DV values to Z scores and reran the analyses. Still, I get totally different results depending on which variable I use as the predictor and which I use as the predicted.

How is this possible? I'm thinking that it may have to do with the fact that the size of the continuous IV values is strongly related to the Conditions they are in. Could such a relationship cause this type of result?

Am I totally misunderstanding the MIXED procedure?

I would really appreciate any help I can get here. I'm becoming more convinced of my own stupidity with each passing hour spent going over these SPSS manual chapters. Thank you!
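PS: in case it helps to see the design concretely, here is a rough sketch of the kind of model I mean, written in Python/statsmodels rather than SPSS, with made-up variable names and random intercepts only (it does not reproduce the AR(1) residual structure I'm requesting from MIXED):

# Rough sketch of the design described above -- not my actual data or SPSS syntax.
# Repeated measures of a continuous IV and DV per subject, one between-subject
# categorical IV ("cond"), and the iv:C(cond) interaction as the test of whether
# the IV-DV relationship differs by condition.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n_subj, n_time = 40, 5

subj = np.repeat(np.arange(n_subj), n_time)
cond = np.repeat(rng.integers(0, 2, n_subj), n_time)      # between-subject factor
iv = rng.normal(size=n_subj * n_time)                      # time-varying continuous IV
subj_eff = np.repeat(rng.normal(0, 0.5, n_subj), n_time)   # subject-level intercepts
slope = np.where(cond == 1, 0.8, 0.3)                      # IV slope differs by condition
dv = subj_eff + slope * iv + rng.normal(0, 1, n_subj * n_time)

data = pd.DataFrame({"subj": subj, "cond": cond, "iv": iv, "dv": dv})

# the coefficient on iv:C(cond)[T.1] is the "does the strength differ?" test
model = smf.mixedlm("dv ~ iv * C(cond)", data, groups=data["subj"])
print(model.fit().summary())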
 

spunky

Can't make spagetti
#2
can you switch the predictor and predicted variables in a mixed model and expect the same results?
nope, you most certainly won't get the same results. but that is true even of regular OLS regression, so no surprises there, i think...
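here's a quick toy sketch (numpy, made-up numbers) of why that happens even in the simplest one-predictor case: the slope of y on x is r*sd(y)/sd(x), while the slope of x on y is r*sd(x)/sd(y), so unless the two standard deviations happen to match you get different numbers depending on which variable you call the DV.

import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(0, 1, 500)
y = 2.0 * x + rng.normal(0, 3, 500)    # y ends up with a much larger SD than x

b_yx = np.polyfit(x, y, 1)[0]          # slope from regressing y on x
b_xy = np.polyfit(y, x, 1)[0]          # slope from regressing x on y
r = np.corrcoef(x, y)[0, 1]

print(b_yx, b_xy)                      # two very different unstandardized slopes
print(b_yx * b_xy, r ** 2)             # but their product is just r squared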

2: in an attempt to get around this and make sure the tests are based on standardized values, I converted my IV and DV values to Z scores and reran the analyses. Still, I get totally different results depending on which variable I use as the predictor and which I use as the predicted.

How is this possible? I'm thinking that it may have to do with the fact that the size of the continuous IV values is strongly related to the Conditions they are in. Could such a relationship cause this type of result?
not necessarily. the fixed part of a mixed model is invariant under linear transformations, but the random part is not. Z scores are a linear transformation, so you can certainly expect a lot of differences between transformed and untransformed variables in mixed-effects models. although the thing you mention about the size of the IV values being related to the conditions they're in intrigues me, because it may or may not pose a problem.

Am I totally misunderstanding the mixed procedure?
Sorta... but these things are tricky, so i wouldn't worry too much about it ;)
 
#3
thanks for the reply, spunky!

what you said about OLS is right and made me realize that the issue i'm having is not just with "mixed."

i need to understand why the results change when the predictor and predicted variables are switched in a case like mine. if you or someone else could explain it here, it would be much appreciated.

also, since the results differ depending on which variable is the IV and which the DV, how does one decide which way to conduct/report the regression? whichever one has a higher R squared value? (side-note: wouldn't this be "cherry-picking" the analysis that works best?)

thanks again for your insights.

ps. after posting last time i realized that what i said about the size of one of the continuous variables being related to my categorical IV cannot be an issue in my case, because i calculated the Z scores within each condition -- therefore the mean level of both continuous variables is 0 for each level of the categorical IV.
 
#4
i need to understand why the results change when the predictor and predicted variables are switched in a case like mine. if you or someone else could explain it here, it would be much appreciated.
One aspect of my data is that one of the two continuous variables is positively skewed while the other is more normally distributed. Does the positive skew in a variable have more of an impact on regression if it is used as the outcome variable?

I'm not good at thinking abstractly about statistics, so I simulated data like mine and varied the positive skew of one of the continuous variables. When I use the skewed variable as the predictor, the amount of skew doesn't affect R squared and parameter estimates much. However, when I use it as the outcome variable, the R squared goes down more with greater skew and the parameter estimates get more messed up (i.e., more "wrong" based on how I actually generated the data).

Does this pattern of results make sense to anyone? I could really use some input. I don't know if what I'm saying here is obviously true, an interesting possibility, or obviously nonsense...
 

spunky

Can't make spagetti
#5
i need to understand why the results change when the predictor and predicted variables are switched in a case like mine. if you or someone else could explain it here, it would be much appreciated.
because depending on what you have as your dependent and independent variables, the variance gets partitioned in different ways. not all predictors share the same proportion of variance among themselves and with the dependent variable. the only case in which the two directions end up giving you the same number is when you have just one dependent variable and one independent variable and you look at the standardized regression coefficient (or, if both variables have the exact same standard deviation, it should also work for the unstandardised regression coefficient, but i'll have to think about that one more). i'm talking here, of course, about regular, easy-cheesy OLS regression. you can make a similar case for mixed-effects regression, but the partitioning of the variance is significantly more complicated given the inclusion of random effects.
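a little OLS sketch of that point (toy data, statsmodels assumed): with a single standardized predictor the slope is just r, so it comes out the same whichever way you run the regression; add a second predictor and the two directions partition the variance differently, so the symmetry is gone.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 1000
x = rng.normal(size=n)
z = 0.6 * x + 0.8 * rng.normal(size=n)     # a second predictor, correlated with x
y = 0.5 * x + 0.4 * z + rng.normal(size=n)

def zscore(a):
    return (a - a.mean()) / a.std()

xs, ys, zs = zscore(x), zscore(y), zscore(z)

# one standardized IV, one standardized DV: same coefficient in both directions (it's r)
print(sm.OLS(ys, sm.add_constant(xs)).fit().params[1])
print(sm.OLS(xs, sm.add_constant(ys)).fit().params[1])

# add the second predictor and swap the roles of x and y: the coefficients now differ
print(sm.OLS(ys, sm.add_constant(np.column_stack([xs, zs]))).fit().params[1])
print(sm.OLS(xs, sm.add_constant(np.column_stack([ys, zs]))).fit().params[1])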

also, since the results differ depending on which variable is the IV and which the DV, how does one decide which way to conduct/report the regression? whichever one has a higher R squared value? (side-note: wouldn't this be "cherry-picking" the analysis that works best?)
ehem... no. that is a question that gets answered by whatever substantive theory, research hypothesis, design, etc. you're basing your analysis on. you can regress anything on anything; the mathematics will always work out. the problem is deciding what you should regress on what, and how to interpret it. but that's an issue of research design and not of statistics, so i can't really help you out much there unless i know exactly what you're looking for. if you're just fishing for a regression model by switching variables around, that's sorta bad practice, lol. you need a hypothesis to drive your model-building.

One aspect of my data is that one of the two continuous variables is positively skewed while the other is more normally distributed. Does the positive skew in a variable have more of an impact on regression if it is used as the outcome variable?

I'm not good at thinking abstractly about statistics, so I simulated data like mine and varied the positive skew of one of the continuous variables. When I use the skewed variable as the predictor, the amount of skew doesn't affect R squared and parameter estimates much. However, when I use it as the outcome variable, the R squared goes down more with greater skew and the parameter estimates get more messed up (i.e., more "wrong" based on how I actually generated the data).

Does this pattern of results make sense to anyone? I could really use some input. I don't know if what I'm saying here is obviously true, an interesting possibility, or obviously nonsense
that is sort of expected, in a way. the marginal distribution of your variables is somewhat beside the point; it's the conditional distribution of the dependent variable given your independent variables that can screw you up. the distribution of the independent variables is mostly irrelevant for the analysis, so you can leave those untouched. the distribution of your dependent variable can indeed affect your estimates, so, when in doubt, transform the heck out of it (preferably with a log transformation, since it has a nice interpretation of change in terms of percentages; other transformations, although useful, can become tricky to interpret).
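a tiny example of the log-transform point (toy data, and it assumes the DV is strictly positive): fit the model on log(dv) and the slope reads as an approximate percentage change in the DV per one-unit change in the predictor.

import numpy as np

rng = np.random.default_rng(11)
x = rng.normal(size=800)
dv = np.exp(0.2 * x + rng.normal(0, 0.5, 800))   # positively skewed DV

b = np.polyfit(x, np.log(dv), 1)[0]
print(b)                               # close to 0.2 on the log scale
print((np.exp(b) - 1) * 100)           # roughly a 22% change in dv per unit of x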

you're talking about R-sq'd now. did you switch from a mixed-model approach to regular OLS regression?
 
#6
Thanks again for your replies, spunky! Very useful.

you're talking about R-sq'd now. did you switch from a mixed-model approach to regular OLS regression?
I picked one of the levels of the repeated measure and ran OLS regressions on it, just to help me understand why the results differ so much when the continuous IV and DV are switched.