Hello dear forum members!
Currently, I am working on a project that aims to predict a certain cancer-related outcome (y) using a number of control (c) and predictor (X) variables:
y(i) = a + c(it) + X(it) + u (1)
In Equation (1), y(i) is continuous but is available only as a mean aggregated over 2009 to 2013; c(it) is a vector of several longitudinal (yearly) control variables available from 2009 through 2013; and X(it) is a vector of several longitudinal (yearly) predictor variables available from 2010 through 2013.
As you can see, the outcome does not vary over time because it is available only as an aggregated mean, whereas the controls and predictors are in panel form. Given this limitation, panel models do not seem applicable. My approach is therefore to first estimate:
y(i) = a + c(i) + X(i) + u (2), where c(i) and X(i) are aggregated as means
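To make Equation (2) concrete, here is a minimal sketch of collapsing the yearly controls and predictors to unit-level means and then fitting a cross-sectional least-squares model. Everything in it is hypothetical: the data are simulated, and the names `collapse_to_means`, `fit_ols`, `c_panel` and `x_panel` are made up for illustration. It assumes one control and one predictor, with the control observed for five years and the predictor for four, so each mean is taken over whichever years are available.

```python
import numpy as np

def collapse_to_means(unit_ids, values):
    """Average yearly rows within each unit; returns (sorted unit ids, means)."""
    unit_ids = np.asarray(unit_ids)
    values = np.asarray(values, dtype=float)
    units = np.unique(unit_ids)
    means = np.array([values[unit_ids == u].mean(axis=0) for u in units])
    return units, means

def fit_ols(y, regressors):
    """Least-squares fit with an intercept; returns [a, slope_1, slope_2, ...]."""
    design = np.column_stack([np.ones(len(y)), regressors])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

# Hypothetical data: 50 units, one control (2009-2013), one predictor (2010-2013).
rng = np.random.default_rng(0)
n_units = 50
c_ids = np.repeat(np.arange(n_units), 5)   # five yearly rows per unit
x_ids = np.repeat(np.arange(n_units), 4)   # four yearly rows per unit
c_panel = rng.normal(size=(n_units * 5, 1))
x_panel = rng.normal(size=(n_units * 4, 1))

# Collapse each panel variable to one mean per unit.
_, c_mean = collapse_to_means(c_ids, c_panel)
_, x_mean = collapse_to_means(x_ids, x_panel)
regressors = np.column_stack([c_mean, x_mean])

# Simulated outcome, for illustration only.
y = 1.0 + 0.5 * c_mean[:, 0] - 2.0 * x_mean[:, 0] + rng.normal(scale=0.1, size=n_units)

coef = fit_ols(y, regressors)
print(coef)  # intercept, control slope, predictor slope
```

This returns point estimates only; standard errors and diagnostics would need a regression package.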
Second, to (a) check whether the coefficients are consistent across specifications and (b) test for lagged effects, I would estimate:
y(i) = a + c(it-1) + X(it-1) + u (3), where c(it-1) and X(it-1) are from 2012 only
y(i) = a + c(it-2) + X(it-2) + u (4), where c(it-2) and X(it-2) are from 2011 only
y(i) = a + c(it-3) + X(it-3) + u (5), where c(it-3) and X(it-3) are from 2010 only
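Equations (3) to (5) could be run as one loop over the years, as in the rough sketch below. It is hypothetical throughout: the data are simulated and `fit_by_year` is a made-up helper. It assumes a balanced panel (exactly one row per unit per year) and an outcome vector ordered by unit id.

```python
import numpy as np

def fit_ols(y, regressors):
    """Least-squares fit with an intercept; returns [a, slope_1, slope_2, ...]."""
    design = np.column_stack([np.ones(len(y)), regressors])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def fit_by_year(y, unit_ids, years, panel, target_years):
    """Regress the time-invariant outcome on single-year regressors, one year at a time."""
    results = {}
    for year in target_years:
        mask = years == year
        order = np.argsort(unit_ids[mask])  # align rows with y (ordered by unit id)
        results[year] = fit_ols(y, panel[mask][order])
    return results

# Hypothetical balanced panel: 50 units observed in 2010-2013,
# with two columns (one control, one predictor).
rng = np.random.default_rng(1)
n_units = 50
unit_ids = np.repeat(np.arange(n_units), 4)
years = np.tile(np.arange(2010, 2014), n_units)
panel = rng.normal(size=(n_units * 4, 2))

# Simulated outcome built from the across-year means, for illustration only.
unit_means = panel.reshape(n_units, 4, 2).mean(axis=1)
y = 1.0 + 0.5 * unit_means[:, 0] - 2.0 * unit_means[:, 1] + rng.normal(scale=0.1, size=n_units)

results = fit_by_year(y, unit_ids, years, panel, [2012, 2011, 2010])
for year, coef in results.items():
    print(year, coef)  # intercept, control slope, predictor slope
```

Printing the three coefficient vectors side by side gives a descriptive check of how stable the estimates are across years.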
Please advise whether my modeling approach seems plausible, given the limitation of the DV data.