I'm having trouble deciding which type of statistical analysis to use to compare my modeled data with my measured data, and I was wondering if I could get some suggestions.

I am comparing the conversion of a reactant to product along the length of the reactor. I have measured data directly from the reactor, and I made a model in Aspen Plus to simulate the process. I would like to compare the measured data with the modeled data to decide quantitatively whether the model is working properly.

Each individual run has four points (and a 0,0 point). I've used a parity plot to compare the results, but I don't know how to say quantitatively whether or not my model is working well. An example of my data might be:

Case 1:

Modeled   Measured
0         0
0.2       0.16
0.5       0.57
0.8       0.92
0.97      1.00

Case 2:

Modeled   Measured
0         0
0.15      0.23
0.45      0.65
0.9       0.97
0.99      1.00

I have around 8 cases that I would like to compare together to see if the model works well over several runs.
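To make the question concrete, here is a minimal sketch of the kind of per-case summary I have in mind: the root-mean-square error and the mean bias of measured minus modeled conversion, computed for the two example cases above. It is written in Python purely for illustration (I would do the equivalent in Minitab). The function names are made up, and leaving out the (0, 0) point is my own assumption, since both curves pass through it by construction.

```python
import math

# Example data from the two cases above; the (0, 0) point is left out
# (assumption: it matches by construction and carries no information).
cases = {
    "Case 1": {"modeled": [0.2, 0.5, 0.8, 0.97],
               "measured": [0.16, 0.57, 0.92, 1.00]},
    "Case 2": {"modeled": [0.15, 0.45, 0.9, 0.99],
               "measured": [0.23, 0.65, 0.97, 1.00]},
}

def residuals(modeled, measured):
    """Measured minus modeled conversion at each sampling point."""
    return [y - m for m, y in zip(modeled, measured)]

def rmse(modeled, measured):
    """Root-mean-square error: typical size of the model-data mismatch."""
    r = residuals(modeled, measured)
    return math.sqrt(sum(e * e for e in r) / len(r))

def mean_bias(modeled, measured):
    """Average residual: positive means the model under-predicts."""
    r = residuals(modeled, measured)
    return sum(r) / len(r)

# One (RMSE, mean bias) pair per case, so several runs can be compared.
summary = {name: (rmse(**d), mean_bias(**d)) for name, d in cases.items()}
for name, (err, bias) in summary.items():
    print(f"{name}: RMSE = {err:.3f}, mean bias = {bias:+.3f}")
```

With only four points per run these numbers are descriptive rather than a formal test, which is part of why I am unsure how to proceed.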

I've never analyzed statistics on experiments like this. I have Minitab installed and can use that for analysis. Does anyone know the way this data should be evaluated?

Thank you!

In a non-linear time series regression y_t = g(x_t) + e_t, if x_t is non-stationary (i.e. I(1)) and y_t is stationary (i.e. I(0)), how can I test that this relationship is cointegrated and not spurious?

Would it be enough to use the KPSS test for a unit root on the residuals e_t (this seems to be the only test applicable in the non-linear case)?

Note: both y_t and x_t are stationary in the long term, but the regression is fit only on the data available in the short term, where x_t becomes non-stationary.
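For reference, this is a minimal sketch of the statistic I mean: the level-stationarity KPSS statistic, i.e. the sum of squared partial sums of the demeaned series divided by T^2 times a Bartlett-weighted long-run variance estimate. The function name and the two toy input series are made up for illustration; in my case the input would be the residuals from the fitted g. As far as I understand, the usual tabulated critical values are derived for a directly observed series, so they may not carry over to residuals from an estimated regression.

```python
def kpss_level_stat(series, lags=0):
    """KPSS statistic for the null of level stationarity (sketch)."""
    n = len(series)
    mean = sum(series) / n
    e = [v - mean for v in series]  # deviations from the sample mean

    # Partial sums S_t of the demeaned series.
    partial, running = [], 0.0
    for v in e:
        running += v
        partial.append(running)

    # Long-run variance estimate with Bartlett weights 1 - s / (lags + 1).
    lrv = sum(v * v for v in e) / n
    for s in range(1, lags + 1):
        weight = 1.0 - s / (lags + 1.0)
        gamma = sum(e[t] * e[t - s] for t in range(s, n)) / n
        lrv += 2.0 * weight * gamma

    return sum(p * p for p in partial) / (n * n * lrv)

# Toy inputs (not my data): a trending series and one that oscillates
# around a fixed level.
trending = [float(t) for t in range(100)]
alternating = [1.0 if t % 2 == 0 else -1.0 for t in range(100)]

trend_stat = kpss_level_stat(trending)
alt_stat = kpss_level_stat(alternating)
print(trend_stat, alt_stat)
```

A large value speaks against stationarity and a small one is consistent with it, so the trending series should score far higher than the oscillating one.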

Any advice (possibly with references) can help.

The technical data contains information such as weight, dimensions, and material. Each element (identified by a unique code) has distinct technical attributes.

The non-technical data contains information such as supplier name, contract type, INCOTERMS, first day of validity of the contract, last day of validity, minimum order quantity, etc. In some cases the prices depend on the order quantity.

One of the problems I am facing is that often each element has a different price depending on the financial data, i.e. element X will cost 100 dollars if Incoterms is A, 200 dollars if Incoterms is B, etc. In other words, there are rows that contain price information on the same element, but one of the columns has a different value and so the price is different.

In other cases the price is 100 if 50-100 elements are ordered, 80 if 100-200 elements are ordered, 60 if 200-500, etc.

I am planning to do some correlations as well as regression. I will probably also try data mining (using Rattle and R).

I need advice on how to treat the (rather) similar observations that each element has when doing correlation, regression, and data mining in general. Should I try to select a single observation per element, or do the analysis with all rows included ("all in")? I guess the last option won't work for correlation at all.
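To illustrate the "single observation per element" option, here is a minimal sketch that collapses the repeated price rows to one value per element code. It is in Python only for illustration (I plan to work in R). The rows are hypothetical, loosely following the price examples above, and taking the median is just one arbitrary way to pick a representative price.

```python
from collections import defaultdict
from statistics import median

# Hypothetical long-format rows: (element code, pricing condition, price).
rows = [
    ("X", "Incoterms A", 100.0),
    ("X", "Incoterms B", 200.0),
    ("Y", "qty 50-100", 100.0),
    ("Y", "qty 100-200", 80.0),
    ("Y", "qty 200-500", 60.0),
]

# Collect every price quoted for each element.
prices_by_element = defaultdict(list)
for code, _condition, price in rows:
    prices_by_element[code].append(price)

# Collapse to a single representative price per element (median: assumption).
one_per_element = {code: median(p) for code, p in prices_by_element.items()}
print(one_per_element)
```

The "all in" alternative would keep every row and treat the pricing condition as an extra explanatory variable instead of discarding it.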

I have tried to explain my problem as well as I can. If it is not clear, I will try to provide further explanation.

Thank you in advance.