Combining error terms for two dissimilar regressions

George Kraemer

New Member
I have a linear regression of Y on X and a power regression of Z on Y, each with its own associated error. (The Y values come from different, independent data sets in each case.) I have combined the equations to model Z as a function of X.

What I want is a confidence envelope around the model estimates of Z.

Thanks,

hlsmith

Omega Contributor
Can you write out your model equations so that we can better understand what you are doing?

Thanks.

George Kraemer

New Member
N = aM + b, where N = number of eggs and M = mass of a small quantity of eggs
T = cS^f, where T = total mass of eggs and S = size of crab

Substituting the total mass T for M gives the model that estimates the number of eggs N from crab size S: N = a(cS^f) + b

What I'd like to be able to show is a confidence envelope around the model predictions.

Thanks.

George Kraemer

New Member
Yes, so the estimates have associated error terms.

katxt

Member
You have problems because the model isn't linear and can't be made so. Things are also complicated because the estimates of a and b (and of c and f) are correlated.
This should (possibly could?) work:
1. Re-sample the first set of data, do the regression and get an a, b pair.
2. Re-sample the second set of data (log-logged), and do the regression to get a c, f pair.
3. For a particular size S, use a, b, c and f to calculate N.
Repeat the three steps a few thousand times to get a distribution for N. Find the 2.5th and 97.5th percentiles.

Dason

Are you looking for a confidence interval or are you actually interested in a prediction interval?

George Kraemer

New Member
One more question: what's the rule, assuming there is one, on the resample n (i.e., how many observations from the full data set are used to estimate the constants each time)? The power curve data set has 258 observations. The linear data set has 102 observations.

George Kraemer

New Member
Not sure; I want a measure of confidence around the predictions from the model.

katxt

Member
Dason said: "Are you looking for a confidence interval or are you actually interested in a prediction interval?"
A good point, Dason. This will give a confidence interval for the mean value of N for some given value of S. For a prediction interval you would also need to find the spread around the mean. But how to do that?

katxt

Member
George Kraemer said: "One more question: what's the rule, assuming there is one, on the resample n (i.e., how many observations from the full data set are used to estimate the constants each time)? The power curve data set has 258 observations. The linear data set has 102 observations."
Use the full data set for each, and re-sample with replacement. The accuracy of the constants depends on the sample size, so use all you have.
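
To illustrate what "use the full data set and re-sample with replacement" means in practice, here is a minimal sketch in Python with NumPy. The variable names are only for illustration, and the seed is arbitrary; 102 is the size of the mass-count data set mentioned in the question.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility

n = 102  # number of observations in the original (linear) data set

# Draw n row indices from 0..n-1 WITH replacement: the resample has the
# same size as the original, but some rows repeat and some are left out.
idx = rng.integers(0, n, size=n)

# Number of distinct original rows that made it into this resample.
n_unique = np.unique(idx).size
```

Each bootstrap replicate would then be fitted on the rows picked out by `idx`, for example `M[idx]` and `N[idx]` if the data are held in arrays.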

George Kraemer

New Member
Sorry, but I'm still unsure. For example, for the 102 mass-count observations, how many should be resampled each time to estimate the constants?

katxt

Member
Make a new list of 102 pairs, choosing each pair at random from your original 102 pairs (with replacement, so some pairs will appear more than once and others not at all). Get a and b for that re-sampled list. This will give you a plausible a and b, correlated appropriately.
Do the same with the 258 pairs for the other set (after log-logging) to get a plausible c, f pair.
Find N for the size S you are interested in.
Repeat a few thousand times.
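
The procedure above could be sketched roughly as follows in Python with NumPy. This is only an illustration under some assumptions, not a definitive implementation: the function name `bootstrap_ci` and the array names `M`, `N`, `S`, `T` are made up here, both regressions are fitted by ordinary least squares with `np.polyfit`, and the power curve is fitted on the log-log scale. As noted earlier in the thread, the result is a percentile interval for the mean N at a given S, not a prediction interval.

```python
import numpy as np


def bootstrap_ci(M, N, S, T, s_new, n_boot=2000, seed=0):
    """Percentile bootstrap interval for N = a*(c*S^f) + b at sizes s_new.

    M, N : paired mass/count observations (first data set)
    S, T : paired size/total-mass observations (second data set)
    All names here are hypothetical; adapt them to your own data.
    """
    rng = np.random.default_rng(seed)
    M, N, S, T = (np.asarray(v, dtype=float) for v in (M, N, S, T))
    s_new = np.atleast_1d(np.asarray(s_new, dtype=float))
    log_S, log_T = np.log(S), np.log(T)

    preds = np.empty((n_boot, s_new.size))
    for i in range(n_boot):
        # Resample whole pairs with replacement, keeping each data set
        # at its original size (e.g. 102 and 258 in this thread).
        i1 = rng.integers(0, M.size, size=M.size)
        i2 = rng.integers(0, S.size, size=S.size)

        # Linear fit N = a*M + b; polyfit returns (slope, intercept).
        a, b = np.polyfit(M[i1], N[i1], 1)

        # Log-log fit log T = log c + f*log S, so slope = f, intercept = log c.
        f, log_c = np.polyfit(log_S[i2], log_T[i2], 1)

        # Combined model evaluated at the requested sizes.
        preds[i] = a * np.exp(log_c) * s_new**f + b

    # 2.5th and 97.5th percentiles across the bootstrap replicates.
    lower, upper = np.percentile(preds, [2.5, 97.5], axis=0)
    return lower, upper
```

Calling something like `bootstrap_ci(M, N, S, T, np.linspace(S.min(), S.max(), 50))` with `S` as a NumPy array would trace the envelope across the observed size range.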