Hello all,
I am interested in the relationship between three variables (A, B, and C). I constructed a path model A -> B -> C. My only interest is the size of the estimates from A to B and from B to C; I am not interested in alternative models such as A -> C.
I used lavaan in R to analyse my model. The thing is, I got the estimates I wanted, but the goodness-of-fit indices do not look great. For instance, RMSEA is well over 0.3. I am wondering whether I can ignore these indices.
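For reference, the model was specified along these lines (a minimal sketch; the data frame name `dat` and the column names are placeholders):

```r
library(lavaan)

# A -> B -> C chain; only the two direct paths are estimated
model <- '
  B ~ A
  C ~ B
'
fit <- sem(model, data = dat)  # dat holds columns A, B, C
summary(fit, fit.measures = TRUE, standardized = TRUE)
fitMeasures(fit, c("chisq", "df", "rmsea", "srmr"))
```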
Thank you for your help.
MartenH
If the model is not specified properly, then the estimates are biased. That is the short answer. The long answer: we need more information in order to help you.
Thank you, kiton, for your reply. I appreciate your help.
Our main interest is to investigate relationships between 3 variables (A, B, and C).
More specifically:
A: Structural brain metric
B: Functional brain metric
C: Cognitive test value
As far as I understand, it is not acceptable to run two separate linear regressions, i.e., A -> B and B -> C (please correct me if I am wrong). Therefore, we thought we needed to use SEM. The problem is that we got nice estimates for each path, but the goodness-of-fit indices do not look good. I am somewhat confused by papers that compare models using these indices. We are not interested in comparing models or in model selection; we are interested in the effects of A on B and of B on C, based on an a priori hypothesis. Since we assume that brain structure determines brain function, and that brain function in turn influences cognitive task performance, we constructed the model A -> B -> C. Given the limited number of variables, and for logical reasons, it is unrealistic to come up with alternative models such as B -> A -> C and compare them with A -> B -> C.
Some people say (e.g., http://www.psych-it.com.au/Psychlope...cle.asp?id=277) that the overall model needs to be accepted (judged by reasonable goodness-of-fit indices) before one argues that specific pathways are significant. If, for instance, we observe an RMSEA of 0.3 or larger, does this mean that the a priori hypothesized model does not really reflect our data set, even if the estimates are something like 0.56 (p < 0.001)?
Furthermore, we ran path analyses under different conditions (e.g., the cognitive task at Time 1 and Time 2) based on the same model. To make things more complicated, some conditions returned an RMSEA of less than 0.1, but others showed a larger RMSEA. How should we interpret this? It seems strange to conclude that brain structure and function predicted cognitive function reasonably well at Time 1, but not at Time 2.
Am I overlooking something crucial? Or should I be using different statistical methods?
Thank you again for your help.
MartenH
Sorry for posting again.
I added a path A -> C in addition to A -> B and B -> C. This addition changed the indices dramatically. However, the indices are now identical across conditions (e.g., Time 1 and Time 2), spitting out chi-sq = 0, RMSEA = 0 (CI: 0-0), NNFI = 1, SRMR = 0. Did I do something wrong?
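A quick check of the degrees of freedom suggests the model simply became saturated rather than anything going wrong (the helper function here is mine, just a sketch of the counting):

```r
# df for an observed-variable path model:
# unique sample moments minus free parameters
path_df <- function(n_vars, n_paths, n_variances) {
  n_vars * (n_vars + 1) / 2 - (n_paths + n_variances)
}

# A -> B -> C: 2 paths, plus var(A) and residual variances of B and C
path_df(3, 2, 3)  # df = 1

# adding A -> C: 3 paths, same variances -> df = 0,
# a saturated model that reproduces the data exactly,
# hence chi-sq = 0, RMSEA = 0, SRMR = 0 in every condition
path_df(3, 3, 3)
```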
MartenH
I apologize that it took me a few days to see this post. My comments are between the lines below:
It depends on your research design and theory. I can think of a 2SLS model, which allows simultaneous estimation of A -> B and B -> C. Yet a path model is also a plausible option.
Indeed, a path model could be applicable. Yet keep in mind that the relationships should be based on theory, not anecdotal evidence, even if they make sense.
Well, whether we like it or not, it is the specification tests that allow us to conclude whether we can make valid inferences from the estimates. If model specification fails, then the estimates are biased, no matter what.
Could you please provide descriptive statistics (i.e., number of observations, mean, SD, skewness and kurtosis, min and max values) for the variables you are using in the model?
Thanks again for your help, kiton.
Here are the descriptive statistics.
M_fun (n = 15, M = 1.45, SD = 0.94, skew = 0.81, kurtosis = -0.91, min = 0.45, max = 3.19)
Y_cog1 (n = 15, M = 0.30, SD = 0.28, skew = 0.81, kurtosis = -0.78, min = 0.03, max = 0.90)
Y_cog2 (n = 15, M = 0.48, SD = 0.33, skew = 0.80, kurtosis = -0.62, min = 0.08, max = 1.19)
X_str (n = 15, M = 0.34, SD = 0.02, skew = 0.56, kurtosis = -0.28, min = 0.31, max = 0.39)
The first model (X_str --> M_fun --> y_cog1) returned the following indices: chi-squared = 2.25, RMSEA = 0.29, with standardised estimates X_str --> M_fun = -0.01 (SE = 0.26) and M_fun --> y_cog1 = -0.55 (SE = 0.18, p < 0.05). The second model (X_str --> M_fun --> y_cog2) returned chi-squared = 0.42, RMSEA = 0.0001, with standardised estimates X_str --> M_fun = -0.01 (SE = 0.26) and M_fun --> y_cog2 = -0.58 (SE = 0.17, p < 0.05).
Could you also share your view on saturated models? Could I ignore the fit indices by including a direct path between X_str and y_cog and then simply interpret the estimates?
MartenH
Hello Marten,
Firstly, please note that the sample size is very small (in a path model you need at least 10 observations per relationship, and at least a hundred observations is desirable) -- even though you are able to estimate the coefficients, they may be biased because of the small sample. The fit statistics may be inadequate for the same reason, since RMSEA, for instance, depends on the sample size.
Secondly, note that the standard errors are rather large in comparison to the estimated coefficients -- this also suggests that the model is misspecified.
Thirdly, of the RMSEA values you mentioned, only the second one falls within an acceptable range. Please refer to this presentation for additional information -- http://www.psych.umass.edu/uploads/p...it_Indices.pdf
Key question -- is there a way to increase sample size?
Additionally, please consider a least squares estimator, and two-stage models in particular -- this would allow you to model your relationships and check the robustness of the SEM estimates (even with a small sample).
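To illustrate the two-stage idea, here is a minimal sketch done by hand (variable names are placeholders; `dat` is your data frame):

```r
# Two-stage least squares by hand, using A as the instrument:
# stage 1 regresses B on A, stage 2 regresses C on the fitted B
stage1 <- lm(B ~ A, data = dat)
dat$B_hat <- fitted(stage1)
stage2 <- lm(C ~ B_hat, data = dat)
summary(stage2)
```

Note that the second-stage standard errors from `lm()` are not the correct 2SLS standard errors; a package such as AER (`ivreg(C ~ B | A, data = dat)`) computes them properly.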
As for saturated models, honestly, I have not looked into those for a long time and cannot say anything about them.
Hi kiton, thanks again for your help. Increasing the sample size to a satisfactory level is not really an option, as our experiment is very time-consuming and expensive. Thank you for your suggestion regarding least squares estimators and two-stage models; I will look into that. Just one last question: could we use a mediation model? Or is our sample too small for that as well? Best regards, MartenH
Not at all, Marten. Pleasure to be helpful.
You could surely use a mediation model, yet keep in mind that there must be a strong theory behind it (you cannot simply assume mediation, even if it makes sense). Sample size requirements depend mostly on the number of relationships proposed. What concerns me more is that you have a time-consuming and expensive experiment that does not really provide you with enough data; your inferences might be very questionable because of that. What level of study are you doing (e.g., MS, PhD)?
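If you do go that route, a mediation model in lavaan could look roughly like this, using the variable names from your descriptives (the path labels and the defined indirect effect are illustrative, not something you must name this way):

```r
library(lavaan)

model <- '
  M_fun  ~ a * X_str   # path a: structure -> function
  Y_cog1 ~ b * M_fun   # path b: function -> cognition
  ab := a * b          # indirect (mediated) effect
'
fit <- sem(model, data = dat)
summary(fit, standardized = TRUE, ci = TRUE)
```

With n = 15, though, any confidence interval for the indirect effect will be very wide, so the same small-sample caveats apply.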
Last edited by kiton; 06-09-2015 at 11:44 AM.