Dear all:

I am working with data from a large cohort study in which participants were asked, retrospectively, about their work history. Data are structured such that there is one observation (row) per study participant.

I am using mi impute in Stata (version 12) to multiply impute variables by chained equations (MICE). I am having trouble imputing the variables that represent the start and end years for conducting a task (I am using intreg). Some participants are missing the start year, some are missing the end year, and some are missing both. I would like to impute both variables, but I need to ensure that the start year is less than or equal to the end year. In addition, the start year must be greater than or equal to the year of birth + 10 years, and both the start and end years must be less than or equal to the year of enrollment. The timeline for a person would appear as follows:

YEAR OF BIRTH+10 ---- START YEAR ---- END YEAR ---- STUDY ENROLLMENT YEAR

Here is a simplified version of the code as an example:


* Create a variable for the birth year plus ten years, since this is the lower
* limit for the start year.

gen YEAR_TEN = BIRTH_YEAR + 10

* Generate the upper limit for START_YEAR. It is the enrollment year if the
* participant conducted the task but is missing END_YEAR, and END_YEAR if the
* participant conducted the task and END_YEAR is observed.

generate UPPER_LIMIT_START=ENROLLMENT_YEAR
replace UPPER_LIMIT_START=ENROLLMENT_YEAR if TASK==1 & END_YEAR==.
replace UPPER_LIMIT_START=END_YEAR if TASK==1 & END_YEAR !=.

* Generate the lower limit for END_YEAR. It is the birth year plus ten (YEAR_TEN)
* if the participant conducted the task (TASK==1) but is missing START_YEAR
* (START_YEAR==.), and START_YEAR if the participant conducted the task
* (TASK==1) and START_YEAR is observed.

generate LOWER_LIMIT_END=YEAR_TEN
replace LOWER_LIMIT_END=YEAR_TEN if TASK==1 & START_YEAR==.
replace LOWER_LIMIT_END=START_YEAR if TASK==1 & START_YEAR!=.
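
Before imputing, it may be worth confirming that the observed values already respect the required ordering, since inconsistent observed years would make the limits themselves inconsistent. The following is only a sketch using the variable names above; it counts violations and does not modify the data:

```stata
* Sketch: count observed values that violate the required ordering.
* Each count should be 0 if the observed data are consistent.
count if TASK==1 & !missing(START_YEAR, END_YEAR) & START_YEAR > END_YEAR
count if TASK==1 & !missing(START_YEAR, BIRTH_YEAR) & START_YEAR < BIRTH_YEAR + 10
count if TASK==1 & !missing(START_YEAR, ENROLLMENT_YEAR) & START_YEAR > ENROLLMENT_YEAR
count if TASK==1 & !missing(END_YEAR, ENROLLMENT_YEAR) & END_YEAR > ENROLLMENT_YEAR
```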

set more off
mi set mlong
mi register imputed START_YEAR END_YEAR
mi register regular CANCER_EVENT FOLLOW_UP_YEARS birth_place
mi impute chained ///
    (truncreg, cond(if TASK==1) ll(YEAR_TEN) ul(UPPER_LIMIT_START)) START_YEAR ///
    (truncreg, cond(if TASK==1) ll(LOWER_LIMIT_END) ul(ENROLLMENT_YEAR)) END_YEAR ///
    = CANCER_EVENT FOLLOW_UP_YEARS i.birth_place, add(1) rseed(2332) force noisily

We receive the following error message:

st_store(): 3203 colvector required
_Imp_intreg::fillmis(): - function returned error
<istmt>: - function returned error
error occurred during imputation of START_YEAR END_YEAR on m = 1

I would appreciate any ideas for solving this problem.


Thank you in advance!