Linear Mixed Models in SPSS for repeated measures

#1
I'm having trouble formulating a model with Linear Mixed Models in SPSS.

I'm trying to overcome the problem of correlated errors due to repeated measurements by using LMM instead of linear regression. However, SPSS MIXED allows one to specify /RANDOM factors and/or /REPEATED factors, and I don't know which to use (or whether to use both).

In my situation, if I use both, the model does not converge. Using /REPEATED only yields the best model (in terms of fit), but I'm not sure whether specifying only the repeated factor is sufficient. I read elsewhere that in that case it's actually a 'marginal model' (aka population-averaged model), which can be used if some requirements are met:

http://www.analysisfactor.com/statchat/repeated-measures-approaches/

However, I'm not sure if it's applicable to my data. I have four measurements on each subject, so I'm trying to model the lack of independence between observations. Can someone help me out? I have some books and articles on the use of this function in SPSS but they seem to contradict each other on this matter.
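
To make the /RANDOM versus /REPEATED distinction concrete, here is a hedged sketch of the within-subject covariance each choice implies for four measurements per subject. The notation (u_i, e_ij, Sigma) is assumed for illustration and does not come from the thread or from SPSS output.

```latex
% Assumed notation: subject i, occasions j = 1..4 (symbols are illustrative).
\begin{align}
y_{ij} &= x_{ij}^{\top}\beta + u_i + e_{ij}, \qquad
  u_i \sim N(0,\sigma_u^2),\quad e_{ij} \sim N(0,\sigma_e^2) \\
\operatorname{Var}(y_i) &= \sigma_u^2\,\mathbf{1}\mathbf{1}^{\top} + \sigma_e^2 I_4
  \quad \text{(random intercept only: compound symmetry, 2 parameters)} \\
\operatorname{Var}(y_i) &= \Sigma
  \quad \text{(repeated only, unstructured: } 4 \cdot 5 / 2 = 10 \text{ parameters)} \\
\operatorname{Var}(y_i) &= \sigma_u^2\,\mathbf{1}\mathbf{1}^{\top} + \Sigma
  \quad \text{(both: } \sigma_u^2 \text{ can be absorbed into } \Sigma\text{)}
\end{align}
```

In the last line any value of the intercept variance can be offset by changing the unstructured matrix, so the two parts are not separately identified. That is one plausible reason, though not necessarily the only one, why a model with both a random intercept and an unstructured repeated covariance fails to converge.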
 

Masteras

TS Contributor
#2
Random factors and repeated factors? What do you mean? You have 4 measurements for some observations. How many? Do you also have other independent variables? Sex, for instance?
 
#3
I'll try to explain a bit more by showing you the design.

Group  Cat1  Cat2  Cat3  Cat4
1      AA1   AB2   BA3   BB4
2      AB3   BA4   BB1   AA2
...
16     BB2   AA3   AB4   BB1

So, each respondent evaluates 4 stimuli in four different categories; the evaluation is the (ordinal) DV. The stimuli are constructed based on three factors: factor 1 (A or B), factor 2 (A or B) and four replications within the combination of factors (1, 2, 3 or 4). I'm interested in the effect of factors 1 and 2, controlling for the different replications and categories.

Consequently, the data looks like this:

Subj  F1  F2  Rep  Cat  Evaluation
1     A   A   1    1    3.1
1     A   B   2    2    2.4
1     B   A   3    3    3.0
1     B   B   4    4    1.0
2 .......

A regular linear regression won't work, since there's autocorrelation due to the non-independence of the observations (4 observations for each cluster/subject). So, using the literature on MLM, I'm trying to formulate a model, for example with the following syntax:

* Linear mixed model: fixed effects for both factors, replication and category.
MIXED evaluation BY F1 F2 Rep Cat
  /FIXED=F1 F2 Rep Cat | SSTYPE(3)
  /METHOD=ML
  /PRINT=G SOLUTION
  /REPEATED=Cat | SUBJECT(subj) COVTYPE(UN).

However, I cannot figure out whether I should add a /RANDOM statement defining random intercepts (in combination with the /REPEATED statement) or whether this is in fact already handled by the /REPEATED statement itself.

Does this clarify the question?
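
The two specifications being weighed above can be sketched side by side in MIXED syntax. This is only an illustration under assumptions: the variable names are taken from the post, the fixed effects are listed without a leading BY, and the covariance types VC and CS are choices made here for the example, not something the thread settles on.

```
* Option A, assumed for illustration: random intercept per subject.
MIXED evaluation BY F1 F2 Rep Cat
  /FIXED=F1 F2 Rep Cat | SSTYPE(3)
  /METHOD=ML
  /PRINT=G SOLUTION
  /RANDOM=INTERCEPT | SUBJECT(subj) COVTYPE(VC).

* Option B, assumed for illustration: marginal model with compound symmetry.
MIXED evaluation BY F1 F2 Rep Cat
  /FIXED=F1 F2 Rep Cat | SSTYPE(3)
  /METHOD=ML
  /PRINT=R SOLUTION
  /REPEATED=Cat | SUBJECT(subj) COVTYPE(CS).
```

Option A and Option B describe the same kind of within-subject dependence (equal variances, one common covariance) whenever that covariance is not negative, so comparing their fit against the COVTYPE(UN) model is one way to judge whether the extra parameters of the unstructured matrix are needed.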
 

Masteras

TS Contributor
#4
Ok, better now. The different replications and categories you want to control for, what are they? Isn't it true that, since you have 2 factors with 2 levels each, each respondent will give 4 answers, thus 4 repetitions? And the categories?
 
#5
Well not quite.. The factors are parity (even/uneven) and magnitude (high/low). The replications consist of different digits. The replications are merely to control for the effect of the digits, I'm only interested in the first two factors. Schematically it looks like this:

        Even                    Uneven
Low     2, 4, 6 or 8            1, 3, 7 or 9
High    200, 400, 600 or 800    100, 300, 700 or 900

So, every participant gets each of the four magnitude*parity combinations, one in each of the four product categories. The replications and category-stimulus combinations are balanced across groups, as shown below in a more detailed overview.

           C1    C2    C3    C4
Group 1    1     200   6     700
Group 2    3     400   8     900
Group 3    7     600   2     100
Group 4    9     800   4     300
Group 5    2     100   7     600
Group 6    4     300   9     800
Group 7    6     700   1     200
Group 8    8     900   3     400
Group 9    100   2     600   7
Group 10   300   4     800   9
Group 11   700   6     200   1
Group 12   900   8     400   3
Group 13   200   1     700   6
Group 14   400   3     900   8
Group 15   600   7     100   2
Group 16   800   9     300   4

Does that make it even clearer? :D
 

Masteras

TS Contributor
#6
It is still a messed-up situation for me. Two factors, low/high and even/uneven, with their two corresponding levels. And from this table each participant chooses four rows to do, is this right? I think I will not be able to help you any more. Sorry for that; try to look at posts on random effects and find somebody who might know more about these strange patterns. I cannot go any further, sorry.
 

Masteras

TS Contributor
#9
I suggest you ask an expert, or look through similar posts to find who answered these types of questions. I do not think anyone is going to answer you now.
 
#10
Isn't this the place where the experts are? :p Of course, I looked around on the forum for similar posts. In fact, I've been doing that for days now, and not just on this forum. I think I just need some advice on the specific design, though.. But thanks for the tip again.
 

Masteras

TS Contributor
#11
This is a place where people know some things more in statistics. It does not mean we have to know everything.

Maybe your problem is too difficult to be solved with standard packages, and maybe you either need to go to a more specialized package or there is no function for this specific problem. Maybe you did something wrong from the beginning and got yourself into this trouble.
 

spunky

Smelly poop man with doo doo pants.
#12
ok... so here are my two cents. from what it looks like to me, the most appropriate way of doing it would be to treat this as repeated measures and include random intercepts as well. i guess it sort of makes sense that SPSS does not converge. if you're using the regular version (and not their extended and hence pricier version, which i think they call "advanced statistics"), they use a pretty clunky version of full information maximum likelihood that has its own set of problems for this kind of model (cf. Savalei & Bentler, 2009 for a good list of references, although the article doesn't talk about this issue in particular). as for why repeated gives you the best model fit, well... it has fewer things to estimate and keeps a lot of the variance constant, so i guess it makes sense that it would do so, although fit in this case could well be a statistical artifact. have you considered using a generalized estimating equations approach? they can handle a lot of MLM kind of data and are faster in their estimates..
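
For reference, a generalized estimating equations fit of the kind suggested above might look roughly like this in SPSS GENLIN. It is a sketch under assumptions: the variable names come from the earlier MIXED syntax, and the normal distribution, identity link and exchangeable working correlation are choices made here for illustration (the DV was described as ordinal, so treating it as continuous is itself an assumption).

```
* GEE sketch, assumed settings: linear model, exchangeable working correlation.
GENLIN evaluation BY F1 F2 Rep Cat
  /MODEL F1 F2 Rep Cat INTERCEPT=YES DISTRIBUTION=NORMAL LINK=IDENTITY
  /REPEATED SUBJECT=subj WITHINSUBJECT=Cat SORT=YES CORRTYPE=EXCHANGEABLE COVB=ROBUST
  /PRINT MODELINFO FIT SUMMARY SOLUTION WORKINGCORR.
```

COVB=ROBUST requests robust (sandwich) standard errors, which is why the working correlation only needs to be a reasonable approximation rather than exactly right.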

in general, however, Masteras is right. anyone with even a basic understanding of research design can realise very quickly how messy your data is and, to be honest, it's this kind of data that gets your name in a publication as the methodology specialist... hence the lack of advice :)
 
#13
Thanks for your reply. I didn't try GEE yet, but doesn't your final comment apply to that even more? :p I mean, MLM is discussed in some regular textbooks but I haven't come across any chapters on GEE. Or am I wrong?

About Advanced Statistics.. Isn't that an add-on for the 'Base' functionality of SPSS, and isn't LMM only available through the Adv. Stat. add-on (e.g. http://www.hks.harvard.edu/fs/pnorris/Classes/A SPSS Manuals/SPSS Advanced Statistics 17.0.pdf)? If true, since I'm using LMM, I have Adv. Stat.

I agree the data is somewhat messy, although I don't know what you exactly mean by that. I did come up with the design with my thesis coach, who is a distinguished professor himself so I think he knows what he is talking about (although I don't think he's an expert in the field of MLM etc.).
 

spunky

Smelly poop man with doo doo pants.
#14
thing is i dont really use SPSS a lot (R is the default program for almost everyone who does research in theoretical statistics, i'm kind of self-teaching myself SPSS as i tutor my students) so the version i have is pretty old and has "advanced statistics" as an optional package. like AMOS for structural equation modeling which can be purchased extra. good to know it comes included now in more recent versions.

for reasons beyond my understanding, GEEs are not as popular as multilevel models although they can handle a lot of multilevel data without the extra work in iterations (or invoking some crazy likelihood functions which are a !@$#@ to optimize). GEEs are another set of techniques which pre-date multilevel modeling and were developed completely independently from it, although they attempted to address a very similar question. if you were to ask me, i'd say that it's because Liang & Zeger (the guys who came up with GEEs) are mathematical statisticians whereas Raudenbush (one of the founding fathers of multilevel/hierarchical linear modeling) is a sociologist, so almost everyone who works in social/behavioural/health/bio sciences jumped on that bandwagon and left Liang & Zeger kind of out there... ok, but that's just me.

anyways, by "messy" i mean you just have some weird nestedness+repeated-measures situation going on there that's gonna be a bi@tch to analyse properly... and hence, my point. i am not questioning the ability of your prof to come up with good designs that would help to strengthen your conclusions... thing is, in my years since i made the switch from more theoretical/mathematical statistics to applied quantitative analysis i've come to realise that people sometimes come up with these monsters of designs without realising that, as this thing gets crazier, the stats you'll need to handle them will get even crazier...