I'm supporting a client who is attempting to fit a line to a set of 5 (X,Y) data points. Call X years and Y people. I understand the statistical uselessness of only 5 data points, particularly when projecting out 20 years, which is what they intend to do. Nevertheless...

The problem comes with two distinct equations that they found to fit the data, of the forms

- (1) y = a*e^(bx)
- (2) y = a*ln(x) + b

They would like me to find prediction intervals for the out years at the 90% confidence level.

For (1) I applied the logarithm to achieve:

ln(y) = ln(a) + bx

Since this is linear in X, I performed simple linear regression and found values for a and b.
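
To make that step concrete, here is a minimal sketch of the fit in plain Python. The data are hypothetical (generated exactly from a = 100 and b = 0.05, not the client's numbers), and the helper name `fit_log_linear` is my own.

```python
import math

def fit_log_linear(xs, ys):
    """Least-squares fit of ln(y) = ln(a) + b*x; returns (a, b)."""
    n = len(xs)
    zs = [math.log(y) for y in ys]  # work on the log scale
    x_bar = sum(xs) / n
    z_bar = sum(zs) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxz = sum((x - x_bar) * (z - z_bar) for x, z in zip(xs, zs))
    b = sxz / sxx                    # slope on the log scale
    a = math.exp(z_bar - b * x_bar)  # intercept is ln(a), so exponentiate
    return a, b

# Hypothetical, noise-free data from y = 100 * e^(0.05 x)
xs = [1, 2, 3, 4, 5]
ys = [100 * math.exp(0.05 * x) for x in xs]
a, b = fit_log_linear(xs, ys)
print(a, b)
```

Because the made-up data contain no noise, the fit should simply return the a and b used to generate them.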

Here's PART I of where I need help: What is the proper way of finding the prediction interval in the original formula?

[1] What assumptions would be correct for the error term? Should it be iid N(0, sigma^2), or should I use a Student's t-distribution?

[2] Using either, would I find the prediction interval from the linear regression line I fitted on the log scale and then apply the exponential to its endpoints to transform the interval back into one that is applicable to the original form (i.e. y = a*e^(bx))?

[3] If the errors are normal, could I instead transform the equation back first along with the error term (y = a*e^(bx)*e^(error term)) and then recognize that e^(error term) is log-normal? Then I could find the 5th and 95th percentiles and build the prediction interval that way... Or is that incorrect?
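
Here is a rough sketch of the two procedures I have in mind for [2] and [3], so it is clear what I am comparing. Assumptions on my part: the data are made up, the function name is hypothetical, the t critical value 2.353363 is the 95th percentile of Student's t with n - 2 = 3 degrees of freedom (so it is only valid for 5 points), and the log-normal version plugs in the estimated sigma as if it were known, ignoring the uncertainty in a and b.

```python
import math
from statistics import NormalDist

T_CRIT_DF3 = 2.353363  # 95th percentile of Student's t, df = 3 (n = 5 points only)

def intervals_for_exponential_model(xs, ys, x_new, t_crit=T_CRIT_DF3):
    """Return (point, t_interval, plug_in_interval) for y = a*e^(bx) at x_new."""
    n = len(xs)
    zs = [math.log(y) for y in ys]
    x_bar, z_bar = sum(xs) / n, sum(zs) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sum((x - x_bar) * (z - z_bar) for x, z in zip(xs, zs)) / sxx
    ln_a = z_bar - b * x_bar
    # Residual standard error on the log scale
    sse = sum((z - (ln_a + b * x)) ** 2 for x, z in zip(xs, zs))
    s = math.sqrt(sse / (n - 2))
    z_hat = ln_a + b * x_new
    # [2]: standard prediction interval for ln(y), then exponentiate the endpoints
    se_pred = s * math.sqrt(1 + 1 / n + (x_new - x_bar) ** 2 / sxx)
    t_interval = (math.exp(z_hat - t_crit * se_pred),
                  math.exp(z_hat + t_crit * se_pred))
    # [3]: 5th and 95th percentiles of the log-normal factor, sigma treated as known
    z95 = NormalDist().inv_cdf(0.95)
    plug_in = (math.exp(z_hat - z95 * s), math.exp(z_hat + z95 * s))
    return math.exp(z_hat), t_interval, plug_in

# Hypothetical data: 5 years observed, projecting to year 25
xs = [1, 2, 3, 4, 5]
ys = [105.0, 109.0, 118.0, 121.0, 129.0]
point, t_int, plug_int = intervals_for_exponential_model(xs, ys, 25)
print(point, t_int, plug_int)
```

In this sketch the interval from [2] always comes out wider than the one from [3], since it uses the larger t critical value and adds the extrapolation term for x_new far from the observed years.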

PART II of where I need help: is there a way to find a prediction interval for (2)?
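
My tentative thought is that (2) is already linear in ln(x), so the usual simple-regression prediction interval might apply with u = ln(x) as the predictor and no back-transformation of y. Below is a sketch under that assumption, with additive iid normal errors, made-up data, a hypothetical function name, and the t critical value for n - 2 = 3 degrees of freedom hard-coded.

```python
import math

T_CRIT_DF3 = 2.353363  # 95th percentile of Student's t, df = 3 (n = 5 points only)

def log_x_prediction_interval(xs, ys, x_new, t_crit=T_CRIT_DF3):
    """90% prediction interval for y = a*ln(x) + b at x_new; returns (y_hat, lo, hi)."""
    n = len(xs)
    us = [math.log(x) for x in xs]  # regress y on u = ln(x)
    u_bar, y_bar = sum(us) / n, sum(ys) / n
    suu = sum((u - u_bar) ** 2 for u in us)
    a = sum((u - u_bar) * (y - y_bar) for u, y in zip(us, ys)) / suu  # slope
    b = y_bar - a * u_bar                                             # intercept
    sse = sum((y - (a * u + b)) ** 2 for u, y in zip(us, ys))
    s = math.sqrt(sse / (n - 2))
    u_new = math.log(x_new)
    y_hat = a * u_new + b
    se_pred = s * math.sqrt(1 + 1 / n + (u_new - u_bar) ** 2 / suu)
    return y_hat, y_hat - t_crit * se_pred, y_hat + t_crit * se_pred

# Hypothetical data: 5 years observed, projecting to year 25
xs = [1, 2, 3, 4, 5]
ys = [100.0, 108.0, 111.0, 115.0, 116.0]
y_hat, lo, hi = log_x_prediction_interval(xs, ys, 25)
print(y_hat, lo, hi)
```

Unlike the exponential model, this interval is symmetric about the point prediction, because nothing is transformed back.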

Any thoughts and suggestions are welcome, and I appreciate your time!