Mixed Modeling for Longitudinal Data

I’m doing my thesis analyses with a longitudinal data set in which the time between visit dates varies by individual and not everyone has the same number of visit dates. I am interested in assessing whether metabolic syndrome (binary variable) impacts mental health-related quality of life (continuous) and severity of depressive symptoms (binary/continuous). I know using a mixed model is the best option for both of these aims, but how do I determine which is the best model to use? There are also missing values for each of the variables of interest. I would appreciate any feedback.
To identify the "optimal model" within a specific class of models, you can use one of the standard model selection criteria. For example, you can

1] check that all the parameters in the model are statistically significant,
2] choose as the "optimal model" the one with the lowest Akaike Information Criterion (AIC) or Bayesian Information Criterion (BIC) score.
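
As a rough sketch of criterion 2], the snippet below computes AIC and BIC from a model's maximized log-likelihood using the standard formulas AIC = 2k - 2 log L and BIC = k ln(n) - 2 log L. The two candidate models, their log-likelihoods, parameter counts and the sample size are hypothetical numbers chosen for illustration, not results from any real fit. It also assumes n is the total number of observations; some software uses the number of subjects instead for mixed models. When the candidates differ in their fixed effects, the log-likelihoods should come from maximum likelihood fits rather than REML fits.

```python
import math

def aic(log_lik, n_params):
    # Akaike Information Criterion: 2k - 2 log L
    return 2 * n_params - 2 * log_lik

def bic(log_lik, n_params, n_obs):
    # Bayesian Information Criterion: k ln(n) - 2 log L
    return n_params * math.log(n_obs) - 2 * log_lik

# Hypothetical candidates: (maximized log-likelihood, number of parameters)
candidates = {
    "random intercept": (-1250.0, 4),
    "random intercept + slope": (-1246.0, 6),
}
n_obs = 600  # hypothetical total number of observations

scores = {
    name: (aic(ll, k), bic(ll, k, n_obs))
    for name, (ll, k) in candidates.items()
}

# Lower is better for both criteria
best_by_aic = min(scores, key=lambda name: scores[name][0])
best_by_bic = min(scores, key=lambda name: scores[name][1])

for name, (a, b) in scores.items():
    print(f"{name}: AIC={a:.1f}, BIC={b:.1f}")
print("Lowest AIC:", best_by_aic)
print("Lowest BIC:", best_by_bic)
```

With these made-up numbers the two criteria disagree: AIC prefers the larger model, while BIC, which penalizes extra parameters more heavily at this sample size, prefers the smaller one.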

Typically, an exhaustive search over all the candidate models is infeasible, so a researcher chooses an informed path through the models. Two popular procedures are forward stepwise selection and backward stepwise selection. Forward stepwise selection is more time consuming but works better on data sets with a low observation-to-parameter ratio. Backward stepwise selection is fine when working with relatively large data sets, where the two procedures have been shown to perform comparably.
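
A minimal sketch of forward stepwise selection is given below. It starts from the empty model and, at each step, adds the single term that lowers the criterion the most, stopping when no addition helps. The function `toy_aic`, the term names and the `gains` values are all hypothetical stand-ins: in a real analysis the scoring function would fit the mixed model with the given fixed effects and return its AIC or BIC.

```python
def forward_stepwise(candidates, score):
    """Greedy forward selection: keep adding the term that lowers score most."""
    selected = []
    best = score(selected)
    remaining = list(candidates)
    while remaining:
        # Score every model obtained by adding one more term
        trials = [(score(selected + [term]), term) for term in remaining]
        trial_score, term = min(trials)
        if trial_score >= best:
            break  # no single addition improves the criterion
        selected.append(term)
        remaining.remove(term)
        best = trial_score
    return selected, best

# Hypothetical improvement in -2 log L contributed by each term
gains = {"metsyn": 12.0, "time": 8.0, "age": 5.0, "sex": 0.5}

def toy_aic(terms):
    # Stand-in for "fit the model and return its AIC":
    # baseline of 100, minus each term's gain, plus a penalty of 2 per term
    return 100.0 - sum(gains[t] for t in terms) + 2.0 * len(terms)

selected, best = forward_stepwise(gains.keys(), toy_aic)
print(selected, best)
```

In this toy setup the term "sex" is left out because its gain is smaller than the penalty of 2 that AIC charges for one extra parameter. Backward stepwise selection follows the same pattern in reverse, starting from the full model and dropping one term at a time.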

The missing data are not an issue if they are missing at random. If they are not, we get into a whole other level of complexity. One of the remedies is imputation of the missing data, but it has to be done cautiously and only in selected situations.