For a couple of weeks now I have been trying to find a suitable solution online, in books, research papers, etc., but so far I have not found any satisfactory answers. I hope someone here can give me a hint, or a reality check if what I want to do is just absurd.

Assume I have a statistical model M1. I know the parameters X1, ..., Xn used in that model, but I do NOT know how the model works mathematically; I only get the output data Y1 for a specific input dataset. So I have a kind of black-box model implemented in software (of course I do not have the source code or access to the program).

Now, I think I know how to make the model predictions better by adding another parameter Xm into the mix. Obviously I cannot feed that into the model since the implementation is fixed and I don't even know how the parameter would need to be integrated into the model.

My naive assumption is now... what if:

1) I run a couple of studies, collecting test data P1, ..., Pn, and get the corresponding predictions Y1 from model M1.

2) At the same time I get the actual measured values Ym from my test subjects.

3) I then use Y1 and Pm (the test data for Xm) to fit a regression of the form Ym = Y1 + b1 Xm + e, which might, for example, end up as Ym = Y1 + 0.75 Xm; that could then be my model M2. NOTE: it doesn't necessarily need to be a regression!

4) Then I run another study to compare whether M1 or M2 is the better model.

Well, I know how to do 1, 2 and 4, but I'm not sure whether 3 is something that can be done or should not be done and, if it can be done, what the right way to do it would be.
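To make steps 3 and 4 concrete, here is a minimal sketch of what I have in mind. Everything in it is an assumption for illustration only: the data is synthetic, the helper names fit_offset_model and rmse are my own, the 0.75 is just the example value from step 3, and comparing by RMSE on a fresh dataset is only one possible way to do step 4. The coefficient of Y1 is held fixed at 1, so the fit reduces to a least-squares regression of (Ym - Y1) on Xm without an intercept.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible


def fit_offset_model(y1, xm, ym):
    """Least-squares estimate of b1 in Ym = Y1 + b1 * Xm + e.

    The coefficient of Y1 is fixed at 1 and there is no intercept,
    so this is a regression of the residual (Ym - Y1) on Xm.
    """
    resid = ym - y1
    return float(np.dot(xm, resid) / np.dot(xm, xm))


def rmse(pred, actual):
    """Root mean squared error between predictions and measurements."""
    return float(np.sqrt(np.mean((pred - actual) ** 2)))


def simulate(n):
    """Synthetic stand-in for one study (purely hypothetical data)."""
    y1 = rng.normal(10.0, 2.0, n)                   # black-box output of M1
    xm = rng.normal(0.0, 1.0, n)                    # the extra parameter Xm
    ym = y1 + 0.75 * xm + rng.normal(0.0, 0.5, n)   # "real" measured values
    return y1, xm, ym


# Steps 1-3: first study, used to estimate b1 and thereby define M2.
y1_fit, xm_fit, ym_fit = simulate(200)
b1 = fit_offset_model(y1_fit, xm_fit, ym_fit)

# Step 4: second, independent study, used only to compare M1 and M2.
y1_new, xm_new, ym_new = simulate(200)
rmse_m1 = rmse(y1_new, ym_new)
rmse_m2 = rmse(y1_new + b1 * xm_new, ym_new)

print(f"estimated b1 = {b1:.3f}")
print(f"RMSE of M1 on new data = {rmse_m1:.3f}")
print(f"RMSE of M2 on new data = {rmse_m2:.3f}")
```

The point of the second simulate call is that b1 is estimated on one dataset and the comparison is made on a separate one, so M2 does not get credit merely for having been fitted to the same data it is judged on.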

Any hints or tips would be highly appreciated!

Thanks,

fassy