Combining error terms for two dissimilar regressions

#1
I have a linear regression of Y on X and a power regression of Z on Y, each with its own associated error. (The Y values in the two regressions are different and independent.) I have combined the equations to model Z as a function of X.

What I want is a confidence envelope around the model estimates of Z.

Thanks,
 
#3
N = aM + b, where N = number of eggs and M = mass of a small quantity of eggs
T = cS^f, where T = total mass of eggs and S = size of crab

So substituting T for M produces the model estimating the number of eggs N from size S: N = a(cS^f) + b
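
For reference, the combined model can be written as a small function. This is only a sketch: the name `eggs_from_size` and the coefficient values are made up for illustration, and it assumes, as the substitution itself does, that the linear relation fitted on small egg samples carries over to the total egg mass T.

```python
def eggs_from_size(S, a, b, c, f):
    """Estimated egg count N for crab size S: N = a * (c * S**f) + b."""
    total_mass = c * S ** f      # power regression: T = c * S^f
    return a * total_mass + b    # linear regression applied to T in place of M


# Hypothetical coefficients, for illustration only
print(eggs_from_size(100.0, a=2.0, b=5.0, c=0.5, f=2.0))
```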

What I'd like to be able to show is a confidence envelope around the model predictions.

Thanks
 
#6
You have problems because the combined model isn't linear and can't be made so. Things are also complicated because the estimates of a and b (and of c and f) are correlated.
This should (or at least could) work:
Re-sample the first set of data, do the regression and get an a, b pair.
Re-sample the second set of data (after taking logs of both variables) and do the regression to get a c, f pair. In the log-log fit the slope is f and the intercept is log c.
For a particular size S, use a, b, c and f to calculate N.
Repeat the three steps a few thousand times to get a distribution for N. Take the 2.5th and 97.5th percentiles as the limits of a 95% interval.
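
The steps above might be sketched in plain Python roughly as follows. The helper names (`ols`, `resample_pairs`, `bootstrap_envelope`), the synthetic data and every numeric value are assumptions for illustration, not the poster's data. The percentiles are read off the sorted estimates with simple order statistics, which is only one of several ways to do it.

```python
import math
import random


def ols(x, y):
    """Least-squares (slope, intercept) for y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def resample_pairs(x, y, rng):
    """Draw len(x) (x, y) pairs with replacement, keeping each pair intact."""
    idx = [rng.randrange(len(x)) for _ in x]
    return [x[i] for i in idx], [y[i] for i in idx]


def bootstrap_envelope(M, N, S, T, s_new, n_boot=2000, seed=0):
    """Approximate 95% percentile interval for N at crab size s_new."""
    rng = random.Random(seed)
    log_S = [math.log(v) for v in S]
    log_T = [math.log(v) for v in T]
    estimates = []
    for _ in range(n_boot):
        a, b = ols(*resample_pairs(M, N, rng))               # linear fit
        f, log_c = ols(*resample_pairs(log_S, log_T, rng))   # log-log fit
        estimates.append(a * math.exp(log_c) * s_new ** f + b)
    estimates.sort()
    lo = estimates[int(0.025 * n_boot)]
    hi = estimates[int(0.975 * n_boot) - 1]
    return lo, hi


# Synthetic stand-in data (102 and 258 pairs), for illustration only
data_rng = random.Random(1)
M = [data_rng.uniform(0.1, 1.0) for _ in range(102)]
N = [5000 * m + 20 + data_rng.gauss(0, 100) for m in M]
S = [data_rng.uniform(50, 150) for _ in range(258)]
T = [0.001 * s ** 2.5 * math.exp(data_rng.gauss(0, 0.1)) for s in S]

lo, hi = bootstrap_envelope(M, N, S, T, s_new=100.0)
print(f"approximate 95% interval for N at S = 100: {lo:.0f} to {hi:.0f}")
```

Calling `bootstrap_envelope` over a grid of sizes and joining the lower and upper limits would trace out an envelope along the curve.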
 
#9
One more question: what's the rule, assuming there is one, on the resample n (i.e., how many observations from the full data set are used to estimate the constants each time)? The power curve data set has 258 observations. The linear data set has 102 observations.
 
#11
Are you looking for a confidence interval or are you actually interested in a prediction interval?
A good point, Dason. The re-sampling procedure will give a confidence interval for the mean value of N at some given value of S. For a prediction interval you would also need to find the spread around the mean. But how to do that?
 
#12
One more question: what's the rule, assuming there is one, on the resample n (i.e., how many observations from the full data set are used to estimate the constants each time)? The power curve data set has 258 observations. The linear data set has 102 observations.
Use the full data set for each, and re-sample with replacement, so that each re-sample has as many observations as the original (258 and 102). The accuracy of the constants depends on the sample size, so use all you have.
 
#14
Make a new list of 102 pairs, choosing each pair at random, with replacement, from your original 102 pairs. Get a and b for that re-sampled list. This will give you a plausible a and b, correlated appropriately.
Do the same with the 258 pairs for the other set (after taking logs of both variables) to get a plausible c, f pair.
Find N.
Repeat.
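
To see what "correlated appropriately" means in practice, here is a small sketch that re-samples whole pairs and checks how the fitted slope and intercept move together across re-samples. The data are synthetic and the names `fit_line` and `pearson` are hypothetical helpers, not anything from the thread.

```python
import random


def fit_line(pairs):
    """Least-squares (slope, intercept) for a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    slope = sxy / sxx
    return slope, my - slope * mx


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v))
    su = sum((ui - mu) ** 2 for ui in u) ** 0.5
    sv = sum((vi - mv) ** 2 for vi in v) ** 0.5
    return cov / (su * sv)


rng = random.Random(42)

# Synthetic stand-in for the 102 (M, N) pairs; not the poster's data
pairs = []
for _ in range(102):
    m = rng.uniform(0.1, 1.0)
    pairs.append((m, 5000 * m + 20 + rng.gauss(0, 100)))

slopes, intercepts = [], []
for _ in range(1000):
    # Same size as the original list, drawn with replacement, pairs kept intact
    resampled = rng.choices(pairs, k=len(pairs))
    a, b = fit_line(resampled)
    slopes.append(a)
    intercepts.append(b)

r = pearson(slopes, intercepts)
print(f"correlation between a and b across re-samples: {r:.2f}")
```

With all x values positive, a steeper slope tends to come with a lower intercept, so the correlation should come out negative. Drawing a and b independently from their separate distributions would lose that link, which is why each pair of constants is taken from the same re-sample.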