SEM - Comparing Models with Different Variables

#1
I am doing some structural equation modeling. I have made one model with 3 exogenous variables predicting 5 endogenous variables and would like to compare that to a second model that is identical, except it contains one more exogenous variable (a fourth). Basically, I want to see whether adding this extra predictor allows us to explain more variance in the endogenous variables.

My question is, how do I compare these models? I have heard that AIC and BIC are not appropriate for models using different variables, especially ones that have a different number of variables. Is ECVI appropriate for this?

Given my main question about whether introducing this fourth predictor helps us explain more variance in the endogenous variables, I will pay special attention to the residual variances of my outcome variables (whether they get smaller for the second model), but it still seems like there should be some index of fit I should use to compare these models...

Thanks
 

noetsi

Fortran must die
#2
Where did you hear that you cannot use AIC or BIC when there are different variables (that is, when the models are non-nested)? I quote from Byrne's "Structural Equation Modeling with Mplus" (71-72): "...the AIC and BIC are used in comparison of two or more nonnested models....[of AIC usage] This is particularly so with respect to the situation where a researcher proposes a series of plausible models and wishes to determine which in the series yields the best fit..."

I am not an expert in SEM, however, so I may well have missed something in this regard.
 
#3
Well, I stumbled upon this in someone's blog: "The problem is that the point estimates of AIC-like criteria are partially dependent upon the number of manifest variables. If different models have different numbers of manifest variables, then it would be difficult to meaningfully compare values across models." (http://zencaroline.blogspot.com/2007/08/non-nested-sem-model.html), but she doesn't have any citations for this particular point, so...I don't know what to make of it.

I also came across this in a document posted online, which appears to be an information sheet used in an SEM class: "Models with more variables tend to have larger chi-squares...Akaike’s Information Criterion (AIC), the Bayesian Information Criterion (BIC), the Expected Cross-validation Index (ECVI), the root mean square residual (RMR), and the standardized root mean square residual (SRMR)...have similar problems to those of the chi-square, because they are based on simple variations on chi-square"...again, not sure what to make of that.

When I went to my textbook (Kline, 2011), he starts off the section on comparing nonhierarchical models (I'm assuming nonhierarchical is the same as non-nested?) by saying "Sometimes researchers compare alternative models based on the same variables measured in the same sample that are not hierarchically related." The fact that he specified that they contain the "same variables" made me think there was a distinction between models that are non-nested and those that contain different variables (or an additional variable), but I'm not sure...

Thank you for your response. It is possible that AIC and BIC are not actually inappropriate for my purposes (I'm not just going to take a random blog's word for it, at least)...I've been having a hard time finding journal articles on this, so I think I am going to find any and all textbooks on SEM at our library and see if that helps clear up my question.
 
#4
I know this is an old post...but did you ever figure this out or decide what to do with your models? I'm having exactly the same issue with my own analysis. I have two models and I want to see if adding one IV/exogenous variable to the second model helps to explain more variance. I had wanted to use the AIC, BIC, and ECVI, and no one at my proposal meeting mentioned that it might be a problem (this is my dissertation), but I too have read a couple of forum and blog posts saying that you shouldn't use those indices to compare non-nested models with different sets of variables. If it matters at all, my dependent variable is the same for both models.
 
#5
Anyone have any ideas on this? Is it acceptable to use AIC, BIC, ECVI, to compare models if one model has an additional predictor (exogenous variable) that the other model (otherwise identical) doesn't have?
 

spunky

Doesn't actually exist
#6
Is it acceptable to use AIC, BIC, ECVI, to compare models if one model has an additional predictor (exogenous variable) that the other model (otherwise identical) doesn't have?
Nope, it is not. The only way I've seen this work is through Vuong's (1989) likelihood ratio tests within finite mixtures, but it is not routinely implemented in any SEM package and it's kind of math-heavy, so I'm not sure if you want to go there.

What's wrong with just leaving the exogenous predictor there in the model and just not model it, letting it freely covary with everything it wants to covary?
 
#7
Yeah, I have no idea how to do the Vuong thing. I did read about it, but it made no sense to me and it made my brain hurt and I couldn't even figure out if it would be useful. So basically I would leave the extra predictor there and not have any paths from it to the other variables (including the DV)? And then I could use AIC, BCC, ECVI? Wouldn't it kind of be a nested model then? Or am I just confusing myself...
 

spunky

Doesn't actually exist
#8
Yeah, I have no idea how to do the Vuong thing. So basically I would leave the extra predictor there and not have any paths from it to the other variables (including the DV)? And then I could use AIC, BCC, ECVI? Wouldn't it kind of be a nested model then? Or am I just confusing myself...
yup. leave it there and don't add any direct paths to/from it (BUT DO NOT SET ANY COVARIANCES BETWEEN IT AND OTHER VARIABLES TO 0). just let it free to covary with whatever it wants... which guarantees both models have the same dimensions in their covariance matrices and makes comparisons between them meaningful.
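
To make the "same dimensions" point concrete, here is a minimal Python sketch. The function name is made up for illustration, and the variable counts come from the first post (3 or 4 exogenous plus 5 endogenous variables). It counts the unique variances and covariances each model is fitted to. When two models are fitted to different sets of moments, their likelihoods are not on the same scale, which is the usual reason given for not comparing their information criteria directly.

```python
def unique_moments(n_observed: int) -> int:
    """Number of unique variances and covariances among n observed variables."""
    return n_observed * (n_observed + 1) // 2

# Variable counts taken from the original question (illustrative only).
model_a = unique_moments(3 + 5)  # three exogenous, five endogenous
model_b = unique_moments(4 + 5)  # same model plus the fourth predictor

print(model_a, model_b)  # 36 45
```

Keeping the extra variable in both models, free to covary, means both are fitted to the same 45 moments.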

it may be nested, or it may not be. being able to decide whether two SEM models are nested or not just from the naked eye is not straightforward unless you're dealing with very simple models. each SEM model implies a potentially infinite set of equivalent models and you need to make sure none of the elements in said sets overlap for them not to be nested.

thankfully, there is a straightforward procedure (the NET procedure) developed by Bentler and Satorra (2010) to help you decide whether models are nested, not nested or equivalent. details in:

Bentler, P. M., & Satorra, A. (2010). Testing model nesting and equivalence. Psychological Methods, 15(2), 111.

it has only been implemented in R (in the semTools package, via its net() function). if you're using any other software you'll have to work it out by hand, but it's not super complicated.
 
#9
Ooh, thanks for the reference.

Okay, I am still a bit stuck. Let me include a picture of my model for reference. Masculinity is the "extra" variable. In the model with the extra exogenous variable, that variable is a predictor of two other predictor variables (attitudes and norms), so I had to include extra residuals for those two variables. Since all of those variables are correlated theoretically, I had to covary the residuals instead of drawing covariances between the endogenous variables themselves. When I am creating the new model with no paths to masculinity, though, those two variables are no longer endogenous since nothing is predicting them. If I remove the residuals and just add covariances between them, will that render the models incomparable? Or would that still be okay?

 

spunky

Doesn't actually exist
#10
If I remove the residuals and just add covariances between them, will that render the models incomparable? Or would that still be okay?
i'm not exactly sure what you mean by "incomparable". do you mean, would the models be nested? non-nested? equivalent? non-equivalent? "incomparable" is not really something i can work with.
 
#11
I mean, could I still use the AIC, BCC, and ECVI? They would both have the masculinity variable but the simpler model would lack the two residual terms.
 

spunky

Doesn't actually exist
#12
yes. but i think it would still be of relevance to have the models go through the NET procedure because it seems like you're removing paths from masculinity, but you're also adding residual covariances. the models could potentially be nested, in which case it is more appropriate to do a chi-square difference test.

so yeah.... my advice would be to go over the NET procedure. if nested, proceed with chi-square difference test. if not-nested, proceed with information criteria comparisons
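
For the nested branch of that advice, a chi-square difference test can be sketched in plain Python as below. This is a minimal sketch, not what any SEM package does internally: the helper names and the fit statistics are hypothetical, and it assumes ordinary maximum-likelihood chi-squares (scaled or robust statistics need a corrected difference test instead).

```python
import math

def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with integer df."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2-1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: start from the df = 1 tail, then add the remaining terms.
    tail = math.erfc(math.sqrt(half))
    term = math.sqrt(half) / math.gamma(1.5)  # (x/2)^(1/2) / Gamma(3/2)
    total = 0.0
    for k in range(1, (df - 1) // 2 + 1):
        total += term
        term *= half / (k + 0.5)
    return tail + math.exp(-half) * total

def chi2_difference_test(chi2_restricted, df_restricted, chi2_full, df_full):
    """Return (delta chi-square, delta df, p-value) for two nested models."""
    delta_chi2 = chi2_restricted - chi2_full
    delta_df = df_restricted - df_full
    if delta_df <= 0 or delta_chi2 < 0:
        raise ValueError("the restricted model must have more df and a larger chi-square")
    return delta_chi2, delta_df, chi2_sf(delta_chi2, delta_df)

# Hypothetical fit statistics, for illustration only.
delta_chi2, delta_df, p_value = chi2_difference_test(58.3, 24, 52.1, 22)
print(f"delta chi2 = {delta_chi2:.2f}, delta df = {delta_df}, p = {p_value:.4f}")
```

A small p-value suggests the extra paths in the fuller model improve fit. If the NET check says the models are not nested, this test does not apply and the information criteria are the fallback.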

and if your models fail the chi-square test of fit you could be statistically fancy and compare them using RMSEA-difference tests as described in MacCallum et al. (2006) or just ignore it and proceed with chi-square difference tests (<--- which is what 99.99999999999999999999999999999999999999% of people do so you would be wrong, but you would be wrong alongside with everyone else which somehow makes you... you know, not-so-wrong).

i'm almost sure that a model of that size has failed the chi-square test of fit anyway so maybe you're ok just by ignoring the whole nested VS non-nested issue and concentrate on information criteria?
 
#13
Thank you so much! Now I won't look like a complete idiot at my defense meeting (I hope). I will definitely look at the NET procedure and see if I can determine whether the models are nested. Although it seems like just using the AIC and BCC would indeed be a lot easier!
 

spunky

Doesn't actually exist
#14
Although it seems like just using the AIC and BCC would indeed be a lot easier!
true. and probably more statistically correct. but now imagine how you're gonna bedazzle your committee once you start talking about how you investigated nested VS non-nested models and considered state-of-the-art statistical procedures like RMSEA-difference tests (which not many people know are out there) and the NET procedure, etc. they're probably gonna think "oh wow, this candidate knows his/her ****!"
 
#15
I know this post is a little old, but I have a similar question that maybe spunky could help with?

I am estimating a latent growth curve model, and I added time-invariant predictors predicting latent intercept and slope. I have good model fit, but want to know if I have a better model when certain predictors are only predicting slope and not intercept. Is this an appropriate time to use BIC/AIC to compare models?

Thanks for any help you can provide!
 

noetsi

Fortran must die
#16
(<--- which is what 99.99999999999999999999999999999999999999% of people do so you would be wrong, but you would be wrong alongside with everyone else which somehow makes you... you know, not-so-wrong).
It is encouraging to know that if you do something that is invalid or a violation of the assumptions, it is ok as long as you can cite enough people who also do it wrong. Simplifies stats a lot for me. Thanks spunky :p
 

spunky

Doesn't actually exist
#17
I know this post is a little old, but I have a similar question that maybe spunky could help with?

I am estimating a latent growth curve model, and I added time-invariant predictors predicting latent intercept and slope. I have good model fit, but want to know if I have a better model when certain predictors are only predicting slope and not intercept. Is this an appropriate time to use BIC/AIC to compare models?

Thanks for any help you can provide!
maybe... maybe not. are the model-implied covariance matrices still of the same dimensions? maybe posting the path diagram would help.

or you could try and use the NET procedure i suggested above first to see if you could conduct a likelihood-ratio test.
 
#18
Thanks for your reply - the models are nested. I have a full model with all paths from predictors to slope and intercept estimated, then a nested model with a few of the paths from predictors to intercept constrained to 0. The chi-square difference test shows no difference between the models, but the BIC of the nested model is lower (13559 vs. 13582), and the overall model fit of the nested model is better (RMSEA, CFI, etc.).

Basically, I'm trying to figure out if, despite a non-significant chi-square difference test, there is evidence for including certain variables as predictors of slope only.
 

spunky

Doesn't actually exist
#19
Basically, I'm trying to figure out if, despite a non-significant chi-square difference test, there is evidence for including certain variables as predictors of slope only.
well, if you have already ascertained that the models are indeed nested (which i would still want to see evidence for but that's ok), conducted a chi-square difference test and found that it is non-significant (i.e. the model with fewer parameters fits the data just as well as the one with more parameters) then you have evidence to say that the model with fewer parameters should be used on grounds of parsimony. this fact is also reflected by the BIC.

still begs the question, though.... do both the original model and the nested version have a non-significant chi-square test? (i.e. does the model fit the data by the chi-square test?)
 
#20
Yes, both the full model and nested model have non-significant chi-square tests. Just barely, but I gather that this is normal given I'm working with a large sample (n=365).

I'm hoping to show that a certain predictor should be considered as a predictor of the slope only. I have the full model, a model with the predictor to slope path constrained to 0, and a model with the predictor to intercept path constrained to 0. As I said, there are no significant chi-square difference tests, but the BIC is lowest for the model in which the predictor to intercept path is constrained to 0, which is good for my results, but I'm just not sure if people will buy the argument with just the BIC and no significant chi-square difference test.
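
One way to see how the BIC and the chi-square difference test relate for nested models is the rough Python sketch below. It assumes the common definition BIC = -2 log L + k ln(N), with k free parameters and N the sample size (check how your software defines it), and the function name is hypothetical. Under that definition, BIC(restricted) minus BIC(full) equals the chi-square difference minus ln(N) for each constrained path.

```python
import math

def bic_difference(delta_chi2: float, delta_df: int, n: int) -> float:
    """BIC(restricted) - BIC(full) for nested models fitted to the same data.

    Assumes BIC = -2 log L + k * ln(n); negative values favor the restricted model.
    """
    return delta_chi2 - delta_df * math.log(n)

n = 365                          # sample size reported in the thread
penalty_per_path = math.log(n)   # BIC points saved per constrained path
critical_1df = 3.841             # approximate 0.05 critical value, chi-square with 1 df

# A non-significant 1-df difference test means delta_chi2 is below 3.841,
# so even at that boundary the restricted model has the lower BIC.
worst_case = bic_difference(critical_1df, 1, n)
print(round(penalty_per_path, 2), worst_case < 0)  # 5.9 True
```

Under these assumptions, with n = 365 a non-significant single-path difference test and a lower BIC for the constrained model are two views of the same result rather than conflicting evidence.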