Analysis of model outputs

Hey there,

I'm looking for some help/advice about a problem that I would like to tackle.

I have a set of input-output (X-y) data. The aim of my work is to model the output y given the variables in X (one output variable, several inputs).

I have constructed a number of different models, using different modelling techniques, and I would like to compare the results. Each modelling technique yields a different prediction for y, call them yhat1, yhat2, ... (a collection of linear and non-linear modelling techniques are used).

Rather than just comparing the mean squared error / absolute error / mean error etc. to decide which model is "best", I want to show that one model's performance is statistically significantly better than another's, based on the error signals (error = y - yhat).

Can anyone suggest what hypothesis test I should be using for this? Can I use a t-test (or similar) to compare the means of the error signals, thereby taking the error variance into account?
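One detail worth noting for a test like this: since every model is evaluated on the same samples, the per-sample errors are paired, so a paired test on the loss differences is more appropriate than a two-sample test. Here's a minimal sketch of that idea using a paired t-test on per-sample squared errors, with entirely made-up data standing in for y, yhat1, and yhat2 (this is just an illustration of the mechanics, not a recommendation that the paired t-test is the right choice for your data):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical stand-in data: true outputs y and two models' predictions.
y = rng.normal(size=100)
yhat1 = y + rng.normal(scale=0.5, size=100)  # model 1: smaller noise
yhat2 = y + rng.normal(scale=1.0, size=100)  # model 2: larger noise

# Per-sample loss for each model (squared error here; absolute error works too).
loss1 = (y - yhat1) ** 2
loss2 = (y - yhat2) ** 2

# The losses are paired (same samples), so test the mean of the
# per-sample loss differences rather than running a two-sample test.
t_stat, p_value = stats.ttest_rel(loss1, loss2)
print(t_stat, p_value)
```

The paired t-test assumes the loss differences are roughly normal and independent across samples; if that's doubtful (squared errors are typically skewed), a Wilcoxon signed-rank test on the same paired differences is a common nonparametric alternative.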

Or if I'm posting this question and it makes you think "this guy's understanding is way off", please let me know!!




Ambassador to the humans
What types of models are you going to compare? If you have one model that has explanatory variables X1, X2, X3, X4 and you want to compare it to another model with variables X1, X2 then it's a really easy thing to do. Or are you just making comparisons wherever you feel like it?

No, the models do not necessarily contain the same inputs. Model 1 may use input vectors X1 and X2, and Model 2 may use X3 and X4.

Both models will then produce a predicted output yhat, and I need to quantify whether the yhat from Model 1 or the yhat from Model 2 is a better approximation of the real y.