In a cohort study of HIV-infected individuals initiating antiretroviral therapy, the
following variables are measured at baseline:

• X1 = plasma viral load (continuous variable)
• X2 = history of injection drug use (frequent/occasional/never)
• X3 = number of years of formal education (starting from Grade 1).
The outcome variable is:
• Y = plasma viral load after 12 months of therapy (continuous variable).

For parts (a) through (e), write down the conditional mean equation (e.g., µY|X1,X2 =
b0 + b1X1 + b2X2) that corresponds to the given problem. For history of injection drug
use, identify the dummy coding you are using for each category.
(a) A model for predicting Y using only the baseline plasma viral load.
(b) A model for predicting Y using only the history of injection drug use.
(c) A model for predicting Y using history of injection drug use and years of formal
education, assuming no effect modification.
(d) A model for predicting Y using the history of injection drug use and years of
formal education and allowing for the relationship between Y and years of formal
education to be different for each category of drug use.
(e) A model for predicting Y using all predictor variables by themselves and allowing
for effect modification between baseline plasma viral load and years of formal
education.
(f) Describe the procedure that you would use to conduct a test of the hypothesis
that the slope of the relationship between Y and years of formal education is the
same across categories of drug use. That is, identify which models you would fit,
the quantities you need from fitting these models, the calculation used to obtain
the test statistic for testing this hypothesis, and the critical value to which the
test statistic would be compared.
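
As an illustration of what "identify the dummy coding" means for history of injection drug use, the Python sketch below shows one possible reference-cell coding of X2 into two indicator variables. The choice of "never" as the reference category and the names LEVELS, REFERENCE, and dummy_code are assumptions made for this example only; they are not part of the problem, and any other reference category would be equally valid.

```python
# One possible reference-cell (dummy) coding for X2.
# Assumption: "never" is taken as the reference category.
LEVELS = ("frequent", "occasional", "never")
REFERENCE = "never"


def dummy_code(level):
    """Return the indicators (D_frequent, D_occasional) for one value of X2."""
    if level not in LEVELS:
        raise ValueError(f"unknown level: {level!r}")
    # One indicator per non-reference level; the reference level maps to all zeros.
    return tuple(int(level == lv) for lv in LEVELS if lv != REFERENCE)


for lv in LEVELS:
    print(lv, dummy_code(lv))
```

With this coding, a three-category predictor enters a regression model through two indicator variables, and each coefficient on an indicator is interpreted relative to the reference category.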

