missing data in multiple regression


so... here's the situation. i'm slowly becoming more and more interested in the problem of missing data because of how pervasive it is. there are usually two ways to get around it: Full Information Maximum Likelihood (FIML) and Multiple Imputation (MI). routines for both methods have been automated in Structural Equation Modelling software, but i was hoping to use them for simpler analyses (ANOVA, regression... the tea-test :p)

anyway, so i'm trying to help someone who has missing data and wishes to perform a straightforward multiple regression analysis. i thought to myself: "no problem. with lavaan/R i can get the EM (expectation-maximization) covariance matrix, operate on it and obtain what i want." there is a problem, though, with the standard errors.

the formula i have for the test statistic is [MATH]\frac{\beta_{j}}{\sqrt{\sigma^{2}C_{jj}}}[/MATH], where the denominator [MATH]\sqrt{\sigma^{2}C_{jj}}[/MATH] is the standard error of [MATH]\beta_{j}[/MATH], [MATH]\sigma^{2}[/MATH] is the variance of the residuals and [MATH]C_{jj}[/MATH] is the diagonal element of [MATH](X'X)^{-1}[/MATH] associated with that particular variable.
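to make the covariance-matrix route concrete, here is a minimal sketch in Python/NumPy (not lavaan; the function name `regression_from_cov` and the simulated data are made up for illustration). it assumes the covariance matrix uses the n - 1 denominator and has the outcome in the last row/column. the slopes come from solving S_xx b = S_xy, and the naive SE of each slope is sqrt(sigma^2 * C_jj), with the t-statistic being the slope divided by that SE. note that the sample size n has to be supplied by hand, and that is exactly where the naivety sits: with an EM covariance matrix, plugging in the full n treats every case as completely observed.

```python
import numpy as np

def regression_from_cov(S, n):
    """Slopes, naive SEs and t-statistics from a covariance matrix.

    S : (p+1, p+1) covariance matrix, predictors first, outcome last
        (assumed to use the n - 1 denominator).
    n : sample size to assume (hypothetical input; with missing data
        it is not obvious what this should be).
    """
    S = np.asarray(S, dtype=float)
    p = S.shape[0] - 1
    S_xx = S[:p, :p]          # predictor covariances
    S_xy = S[:p, p]           # predictor-outcome covariances
    S_yy = S[p, p]            # outcome variance

    beta = np.linalg.solve(S_xx, S_xy)

    # residual variance with n - p - 1 degrees of freedom
    sigma2 = (n - 1) * (S_yy - S_xy @ beta) / (n - p - 1)

    # (X'X)^{-1} for mean-centred predictors; its diagonal gives C_jj for the slopes
    C = np.linalg.inv((n - 1) * S_xx)
    se = np.sqrt(sigma2 * np.diag(C))
    return beta, se, beta / se

# simulated complete data, only to show the function in use
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 2))
y = 1.0 + X @ np.array([0.5, -0.3]) + rng.normal(size=n)

S = np.cov(np.column_stack([X, y]), rowvar=False)
beta, se, t = regression_from_cov(S, n)
```

on complete data this reproduces the usual OLS slopes and SEs, which is a handy sanity check before feeding it an EM covariance matrix.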

i believe i have heard before that i cannot "naively" estimate the SEs of the regression coefficients because that underestimates the true variability due to missing data. does anyone know how to correct for it?
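for comparison, on the MI side the extra variability is usually handled by pooling with Rubin's rules: the total variance of a coefficient is the average within-imputation variance plus (1 + 1/m) times the between-imputation variance. a minimal sketch, assuming m imputed datasets have each been analysed already (the function name `pool_rubin` and the numbers are purely illustrative, not real results):

```python
import numpy as np

def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed datasets using Rubin's rules.

    estimates : the m point estimates of the coefficient
    variances : the m squared standard errors of that coefficient
    """
    q = np.asarray(estimates, dtype=float)
    u = np.asarray(variances, dtype=float)
    m = len(q)

    q_bar = q.mean()                  # pooled point estimate
    within = u.mean()                 # average within-imputation variance
    between = q.var(ddof=1)           # between-imputation variance
    total = within + (1 + 1 / m) * between
    return q_bar, np.sqrt(total)

# illustrative (made-up) estimates and SEs from m = 5 imputations
est = [0.50, 0.55, 0.45, 0.52, 0.48]
se = [0.10, 0.11, 0.10, 0.09, 0.10]
pooled, pooled_se = pool_rubin(est, np.square(se))
```

the pooled SE is larger than the average within-imputation SE whenever the estimates differ across imputations, which is the part a naive SE leaves out.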