Missing data: Use of the indicator method

Hello everyone,

My question pertains to the treatment of missing data and the use of the indicator method in regression analyses (i.e. either creating a new missingness indicator for a variable and entering it into the regression, or recoding the variable so that missing data get their own value, e.g. 0=missing, 1=yes, 2=no).
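To make the two codings concrete, here is a minimal sketch in plain Python. The function names, the toy `smoker` and `age` values, and the fill value of 0 are all hypothetical choices of mine for illustration; they are not taken from any particular paper or package.

```python
from typing import Dict, List, Optional, Tuple


def missing_indicator(
    values: List[Optional[float]], fill: float = 0.0
) -> Tuple[List[float], List[int]]:
    """Variant 1: fill missing values with a constant and add a 0/1 flag.

    Both returned columns would be entered into the regression.
    """
    filled = [fill if v is None else v for v in values]
    flag = [1 if v is None else 0 for v in values]
    return filled, flag


def recode_with_missing_category(
    values: List[Optional[str]], codes: Dict[str, int], missing_code: int = 0
) -> List[int]:
    """Variant 2: recode a categorical variable so 'missing' is its own level."""
    return [missing_code if v is None else codes[v] for v in values]


# Hypothetical toy data; None marks a missing observation.
age = [34.0, None, 51.0, None]
smoker = ["yes", "no", None, "yes"]

age_filled, age_missing = missing_indicator(age)
smoker_recoded = recode_with_missing_category(smoker, {"yes": 1, "no": 2})
```

In variant 2 the recoded variable would be entered as a categorical (dummy-coded) covariate, not as a numeric score.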

I am under the impression that this type of approach is antiquated and should not be used, as it has been shown to lead to biased estimates (Jones 1996 JASA, among others). I would therefore assume that the remaining options are a complete case analysis (unbiased if data MCAR or MAR) or multiple imputation.

Interestingly, I see the indicator method being used in premier journal articles in my field (American Journal of Epidemiology) for large cohort studies. My question is then: are there situations in which this may be a valid approach (perhaps when the bias from the indicator method is less than that from a complete case analysis, though I cannot see how this would be known), or is this simply improper modeling in the hope of conserving cases? I'm confused because on the one hand I've seen the least squares derivations, and on the other I see respected researchers using this approach.

Thanks in advance for your help.