This is a re-post of something I put in the 'statistics' forum a while ago. There were 75 views but no responses, so I thought I'd try again in the 'regression' forum. If anyone has any advice on this, I'd greatly appreciate it. Here's my previous post:

....................

I'm putting together a model whose purpose is to predict an outcome from a given set of predictor variables; i.e. the main purpose of the model is NOT to assess the individual contributions of each predictor.

It's a longitudinal multilevel model, but I think the essence of my problem is about the complicated relations between the outcome (y) and two predictors (x1, x2).

x1 is well known to affect y via various pathways. One of these paths runs through x2; that is, x2 is on ONE of the causal paths between x1 and y. Additionally, x2 may also affect y independently of x1. I'm interested in estimating y given both x1 and x2.

My question is:

If I regress y on x1 and x2, can I then use that model to predict y given x1 and x2, or will the model parameters produce biased estimates of y?

(I realise you can use path analysis to look at the relations, but the longitudinal multilevel nature of the data makes this difficult. Also, as above, the main purpose of the model is to make 'correct' predictions of y).
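
To make the setup concrete, here is a minimal simulated sketch of the structure I have in mind. Everything in it is an assumption made for illustration: the coefficients (0.6, 0.5, 0.8), the normal errors, and the function names are all made up, and it is a plain single-level linear model that ignores the longitudinal multilevel aspect of my real data. It fits y on x1 and x2 by ordinary least squares and then checks the prediction error on a fresh sample drawn from the same process.

```python
import numpy as np

def simulate(n, rng):
    """Draw data where x2 sits on one causal path from x1 to y (hypothetical coefficients)."""
    x1 = rng.normal(size=n)
    x2 = 0.6 * x1 + rng.normal(size=n)                   # x1 -> x2, plus x2's own variation
    y = 1.0 + 0.5 * x1 + 0.8 * x2 + rng.normal(size=n)   # direct x1 path plus the x2 path
    return x1, x2, y

def fit_ols(x1, x2, y):
    """Least-squares fit of y on an intercept, x1 and x2."""
    X = np.column_stack([np.ones_like(x1), x1, x2])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def predict(beta, x1, x2):
    """Predicted y from the fitted coefficients."""
    return beta[0] + beta[1] * x1 + beta[2] * x2

rng = np.random.default_rng(0)

# Fit on one sample, then evaluate on a new sample from the same process.
x1, x2, y = simulate(5000, rng)
beta = fit_ols(x1, x2, y)

x1_new, x2_new, y_new = simulate(5000, rng)
resid = y_new - predict(beta, x1_new, x2_new)
mean_error = resid.mean()
rmse = np.sqrt((resid ** 2).mean())

print("coefficients (intercept, x1, x2):", beta)
print("mean prediction error:", mean_error)
print("RMSE:", rmse)
```

In a toy case like this I would expect the mean prediction error to be close to zero, with the x1 coefficient reflecting only the part of x1's effect that does not go through x2. What I'm unsure about is whether the same reasoning carries over to my actual longitudinal multilevel model.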

Any advice would be greatly appreciated.

Cheers

Simon