Regression with 1 Y observation and 5 X observations per subject.

Is there an optimal way to analyse such data? In classical regression analysis you have one observation per subject for both Y and X. Here Y is observed only once per subject, at time point 5, while X is observed at all five time points.

All time points are important, so solutions such as keeping the average of the five X observations, or any other single value, seem very suboptimal. Another solution would be a last-observation-carried-backward approach, which again seems suboptimal. Could some combination of mixed models and imputation work?

Any ideas?

Thank you
Hello, kpateras. More information about the nature of the variables is needed. I'd fit a regression for X with relative time as the predictor within each subject, then look at the relation between (b0, b1) and Y, maybe by calculating a correlation or fitting another regression, but I'm not sure that is the correct approach for your data.
Thank you for the reply, ask.biostat.

To be more explicit about my data, they look like this.

Time ID Y X
1 1 NA 2.3
2 1 NA 3.2
3 1 NA 1.5
4 1 NA 4.3
5 1 90 4.6
1 2 NA 2.1
2 2 NA 2.2
3 2 NA 2.1
4 2 NA 2.5
5 2 85 3.4

I am measuring Y (the outcome) once and X (the predictor) five times. I am interested in all values of X because they are known to predict Y5. I was wondering if there is a multivariate way to model the X values and then project them onto Y.
kpateras, do you have the times of the measurements? I was asking about the meaning of the variables. I think that if Y and X were measured at the same times, a mixed model is not an appropriate method because of the many missing values. At least, I don't have any other ideas.
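For any multivariate treatment of X, a common first step is to reshape the long table into one row per subject. The following is only a sketch in plain Python using the two example subjects listed above; the names `rows` and `wide` are made up for illustration.

```python
# Long-format rows copied from the example above: (time, subject id, Y, X).
# None stands in for NA.
rows = [
    (1, 1, None, 2.3), (2, 1, None, 3.2), (3, 1, None, 1.5),
    (4, 1, None, 4.3), (5, 1, 90, 4.6),
    (1, 2, None, 2.1), (2, 2, None, 2.2), (3, 2, None, 2.1),
    (4, 2, None, 2.5), (5, 2, 85, 3.4),
]

# Wide format: subject id -> {"y": single outcome, "x": [x(t1), ..., x(t5)]}.
wide = {}
for time, subject, y, x in sorted(rows, key=lambda r: (r[1], r[0])):
    record = wide.setdefault(subject, {"y": None, "x": []})
    record["x"].append(x)      # X values are appended in time order
    if y is not None:          # Y is only observed at time point 5
        record["y"] = y

for subject, record in wide.items():
    print(subject, record["y"], record["x"])
```

In this layout the NA values disappear: each subject contributes one Y and a vector of five X values.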


Not a robit
Nothing jumps out at me. Yeah, multilevel models for repeated measures have multiple values for the outcome. Multiple imputation, which pools across the imputed sets, also has multiple values for the outcome.

Bigger question: if you don't have a baseline Y value or a Y measurement per time point, how can you tell the effect of the Xs on Y? Unless you are looking at trends across time in the Xs, or at dose response. In that case you could regress Y on the dose responses or trends (slopes). So you would either use a cumulative value (which you said you can't) or fit two models: one with the Xs as the dependent variable, whose fitted values you then insert into the model for Y.
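The first of the two models suggested above (summarise each subject's X trajectory by an intercept and a slope) might be sketched as follows. This assumes NumPy is available and uses only the two example subjects from earlier in the thread; the variable names are illustrative.

```python
import numpy as np

times = np.arange(1, 6)  # time points 1..5

# X trajectories and single Y value per subject, from the example data.
x_by_subject = {
    1: [2.3, 3.2, 1.5, 4.3, 4.6],
    2: [2.1, 2.2, 2.1, 2.5, 3.4],
}
y_by_subject = {1: 90, 2: 85}

# Stage 1: per-subject least-squares line X = b0 + b1 * time.
summaries = {}
for subject, x in x_by_subject.items():
    b1, b0 = np.polyfit(times, x, deg=1)  # polyfit returns the slope first
    summaries[subject] = (b0, b1)

for subject, (b0, b1) in summaries.items():
    print(subject, round(b0, 2), round(b1, 2), y_by_subject[subject])
```

The second stage would regress Y on (b0, b1) with one row per subject. Two subjects are far too few for that, so it is not attempted here. Note also that treating the estimated slopes as known ignores their estimation error.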

Not sure, as said - nothing apparent is jumping out. Do you have to control for other variables as well?


TS Contributor
So for each subject you have measured y and x(t1) to x(t5).
What is the reason that you do not perform multiple regression with the 5 predictors x(t1) to x(t5) and dependent variable y?
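As a rough illustration of that multiple regression, here is a sketch using NumPy least squares. The data are simulated purely for illustration (the thread only shows two subjects), and the sample size, coefficients, and noise level are all made-up assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# SIMULATED data, not from the study: 200 subjects, five X measurements each.
n_subjects = 200
X = rng.normal(loc=3.0, scale=1.0, size=(n_subjects, 5))  # x(t1)..x(t5)
true_beta = np.array([0.5, 0.0, 1.0, 0.0, 2.0])           # made-up effects
y = 80.0 + X @ true_beta + rng.normal(scale=1.0, size=n_subjects)

# Design matrix: intercept column plus the five time-specific predictors.
design = np.column_stack([np.ones(n_subjects), X])

# Ordinary least squares fit of y on x(t1)..x(t5).
coef, residuals, rank, singular_values = np.linalg.lstsq(design, y, rcond=None)

print("intercept:", coef[0])
print("slopes for x(t1)..x(t5):", coef[1:])
```

In real repeated-measures data the five X values are usually correlated within a subject, which inflates the standard errors of the individual coefficients; the simulation above draws them independently and so does not show that problem.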

What is this study all about, what are these x and y variables?

With kind regards
