# Error in parabolic regression of data points

#### strangersface

##### New Member
I have some data points, each with a specific (but different) error (standard deviation). I fit a parabolic curve to the data points using least squares. I calculated the minimum value of the parabola based on its equation. How do I calculate the error in the parabola minimum value?
If I could calculate the "error" in each term of the parabolic fit equation, I could then easily determine the error in the minimum calculation.
So does anyone know how to calculate the error in the terms of a parabolic fit equation based on the errors in the data points themselves?
(a +/- da)x^2 + (b +/- db)x + (c+/- dc)

#### Dason

Are you talking about the value the response variable takes at the lowest point of the curve, or are you talking about estimating the X value where the curve is at its lowest?

#### strangersface

##### New Member
The x value where the parabola is at its lowest. (min = -b/2a)
I'd like to calculate the error in a and b based on the errors in the original values used for the parabolic fit.

#### Dason

Well, I don't understand what you mean by calculating the error in a and b. You're asking about a single quantity, and we can talk about the error associated with that quantity of interest. I'm guessing you're working with the following model:

$$Y_i = \beta_0 + \beta_1X_i + \beta_2X_i^2 + \epsilon_i$$

where $$\epsilon_i \sim N(0, \sigma^2)$$.

To estimate the minimum, as you already pointed out, we can get a point estimate fairly easily by using $$\widehat{X_{min}} = \frac{-\hat{\beta}_1}{2\hat{\beta}_2}$$. Now you're interested in the variance of this quantity, which we can approximate using the delta method. Research that a little bit if you're interested; otherwise you'll just have to believe me. Either way, you'll need the covariance matrix of the estimated parameters, because

$$Var(\widehat{X_{min}}) \approx \left(\frac{1}{2\hat{\beta}_2}\right)^2 Var(\hat{\beta}_1) + \left(\frac{\hat{\beta}_1}{2\hat{\beta}_2^2}\right)^2 Var(\hat{\beta}_2) - \left(\frac{\hat{\beta}_1}{2\hat{\beta}_2^3}\right) Cov(\hat{\beta}_1,\hat{\beta}_2)$$
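
For anyone who wants to check that approximation, here is a sketch of where it comes from. It is a first-order Taylor expansion of the function $$g$$ below around the estimates (the name $$g$$ is introduced here only for illustration), with the partial derivatives evaluated at $$\hat{\beta}_1, \hat{\beta}_2$$:

$$g(\beta_1,\beta_2) = -\frac{\beta_1}{2\beta_2}, \qquad \frac{\partial g}{\partial \beta_1} = -\frac{1}{2\beta_2}, \qquad \frac{\partial g}{\partial \beta_2} = \frac{\beta_1}{2\beta_2^2}$$

$$Var\left(g(\hat{\beta}_1,\hat{\beta}_2)\right) \approx \left(\frac{\partial g}{\partial \beta_1}\right)^2 Var(\hat{\beta}_1) + \left(\frac{\partial g}{\partial \beta_2}\right)^2 Var(\hat{\beta}_2) + 2\,\frac{\partial g}{\partial \beta_1}\frac{\partial g}{\partial \beta_2}\, Cov(\hat{\beta}_1,\hat{\beta}_2)$$

The cross term works out to $$2 \cdot \left(-\frac{1}{2\hat{\beta}_2}\right) \cdot \frac{\hat{\beta}_1}{2\hat{\beta}_2^2} = -\frac{\hat{\beta}_1}{2\hat{\beta}_2^3}$$, which is the coefficient on the covariance above.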

If you're familiar with testing linear contrasts, another way to get a confidence interval would be by inverting a hypothesis test. The test of interest would be
$$H_0: X_{min} = c$$ which is the same as
$$H_0: \beta_1 + 2c\beta_2 = 0$$. So if you know anything about linear contrasts and inverting hypothesis tests, that would be another route you could take.
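
A rough sketch of what inverting that test looks like, assuming the usual t-test for a linear contrast. The symbols $$v_{11}, v_{22}, v_{12}$$ (the estimated variances of $$\hat{\beta}_1$$ and $$\hat{\beta}_2$$ and their covariance) and $$t^*$$ (the two-sided critical value with $$n - 3$$ degrees of freedom) are notation introduced here for illustration. For a fixed $$c$$, the contrast estimate and its estimated variance are

$$\hat{\beta}_1 + 2c\hat{\beta}_2, \qquad v_{11} + 4c\,v_{12} + 4c^2 v_{22}$$

and the confidence set collects every $$c$$ that the test does not reject:

$$\left(\hat{\beta}_1 + 2c\hat{\beta}_2\right)^2 \le (t^*)^2 \left(v_{11} + 4c\,v_{12} + 4c^2 v_{22}\right)$$

This is a quadratic inequality in $$c$$. When $$\hat{\beta}_2$$ is clearly different from zero the solution is an ordinary interval; otherwise the set can be unbounded, which is a sign that the data do not pin down the location of the minimum.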

#### strangersface

##### New Member
I really appreciate your response, thanks.
I don't think that's what I'm after though.
What I have is something like in this plot (this is just a plot I found online):

I don't know Var(beta1) or Var(beta2)...

I have a discrete data set {x1,x2,x3,x4,x5} which are each N(xn,sn) where n is 1,2,3,4,5.
I have then used least squares to fit a parabola to it and use -b/2a to calculate the minimum and as you said, I want to know the variance of this value.

#### Dason

So you have five observations, and you somehow know the variability in the response at each one? Is that what you were saying? If that's the case, you could use an Aitken model (generalized least squares with a known error covariance structure) and the same results would still apply. If you're using least squares, you can get a variance/covariance matrix for your parameters.
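
A minimal sketch in R of that kind of weighted fit on summary data. It assumes the five means are independent and that their standard errors are known exactly; the numbers and variable names are made up for illustration. Under those assumptions the covariance matrix of the coefficients is $$(X^T W X)^{-1}$$ with weights equal to the inverse variances, and the delta method applies as before.

```r
# Hypothetical summary data: five settings, mean response, and standard error
x    <- c(1, 2, 3, 4, 5)
ybar <- c(5.7, 4.1, 8.3, 18.2, 33.9)      # made-up means
se   <- c(0.25, 0.20, 0.22, 0.30, 0.28)   # made-up standard errors of the means

# Design matrix for y = b0 + b1*x + b2*x^2
X <- cbind(b0 = 1, b1 = x, b2 = x^2)
W <- diag(1 / se^2)                        # weights = inverse of the known variances

# Weighted least squares, treating the standard errors as known
V <- solve(t(X) %*% W %*% X)               # covariance matrix of the coefficient estimates
b <- drop(V %*% t(X) %*% W %*% ybar)       # coefficient estimates

# Location of the minimum
xmin <- -b[["b1"]] / (2 * b[["b2"]])

# Delta method: gradient of -b1/(2*b2) with respect to (b0, b1, b2)
g <- c(0, -1 / (2 * b[["b2"]]), b[["b1"]] / (2 * b[["b2"]]^2))
xmin_se <- sqrt(drop(t(g) %*% V %*% g))

c(estimate = xmin, std_error = xmin_se)
```

`lm(ybar ~ x + I(x^2), weights = 1/se^2)` gives the same coefficients, but its `vcov()` is scaled by a residual variance estimated from only two degrees of freedom here, which is why the sketch builds the covariance matrix directly from the known standard errors.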

#### squareandrare

##### New Member
When you talk about the "error" of your data points, are you talking about measurement error? For example, the measured value is 5.5, but it could be as low as 4.5 or as high as 6.5? And you want to calculate how much these errors affect the estimated minimum of the parabolic fit?

#### strangersface

##### New Member
Yes, exactly. Well the data points are based on a number of samples. So each data point is really an average +/- standard deviation, which is where the error comes in. The distribution is normal.

#### Dason

Do you have the original data? It would be much better to just use the actual data...

#### strangersface

##### New Member
This is the original data...

The situation is analogous to this:
There's an independent variable (laser wavelength in this case) that I have control over and I am measuring the dependent variable.
I set it to wavelength 1, and take a measurement. I repeat this measurement, say, 20 times.
Then I set it to wavelength 2, and take another set of 20 measurements.
Repeat for wavelengths 3,4,5.
Now I have a data set, 20 values for each of the 5 settings.
I calculate the average and standard error of the mean.
I fit a parabola to the 5 data points (average at each setting).
I find the minimum of the parabola.
I want to know the error in the minimum based on the standard error of the mean that I have for each data point.

One method (which is probably what you're referring to) would be to do the fit and calculate the minimum 20 times, using the individual measurements and then calculate the error in the minimums directly. I know this is an option, but I was wondering if there was a different way to do it. This would be useful if instead of 20 measurements for each setting you have 5000 for example.

I hope this clears up my problem..
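
One simulation-based alternative that works from the means and standard errors alone can be sketched in R as follows. This is only an illustration under stated assumptions: the means are independent and normally distributed with the given standard errors, the parabola is the right model, and the numbers are made up. Each simulated data set perturbs every mean by its own standard error, refits the parabola, and records the minimum.

```r
set.seed(1)  # make the sketch reproducible

# Hypothetical summary data: five settings, mean response, and standard error
x    <- c(1, 2, 3, 4, 5)
ybar <- c(5.7, 4.1, 8.3, 18.2, 33.9)      # made-up means
se   <- c(0.25, 0.20, 0.22, 0.30, 0.28)   # made-up standard errors of the means

n_sim <- 5000
xmin_sim <- replicate(n_sim, {
  # Draw one plausible set of means, each perturbed by its own standard error
  y_sim <- rnorm(length(x), mean = ybar, sd = se)
  # Refit the parabola with inverse-variance weights
  b <- coef(lm(y_sim ~ x + I(x^2), weights = 1 / se^2))
  # Location of the minimum for this simulated data set
  unname(-b[2] / (2 * b[3]))
})

sd(xmin_sim)                          # simulated standard error of the minimum
quantile(xmin_sim, c(0.025, 0.975))   # simulated 95% interval
```

The cost depends on the number of simulations rather than on how many raw measurements sit behind each mean, so it does not grow if there are 5000 measurements per setting instead of 20.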

#### Dason

Well what I was suggesting was to treat all of your measurements as a single dataset. Use that dataset to fit a quadratic regression. In doing this you'll be able to get a covariance matrix and go through the process I described before. I'll attach some code to illustrate in a moment.

Edit: As promised... The code to do what I think you want done in R.

Code:
```r
# Generate fake data because I don't have yours
set.seed(1)  # added so the example is reproducible
x <- rep(1:5, each = 20)
betas <- c(3^2 + 4, -2*3*1.7, 1.7^2)  # coefficients of (1.7x - 3)^2 + 4
xmat <- cbind(1, x, x^2)
ys <- drop(xmat %*% betas) + rnorm(100)
plot(x, ys)

# Fit the quadratic regression to all 100 observations
out <- lm(ys ~ x + I(x^2))
out

# Store the coefficients
b <- unname(coef(out))
# Point estimate of the x value at the minimum
est <- -b[2] / (2 * b[3])
# Covariance matrix of the coefficient estimates
v <- vcov(out)

# Use the delta method to approximate the variance of the estimate
xminvar <- (1/(2*b[3]))^2 * v[2,2] + (b[2]/(2*b[3]^2))^2 * v[3,3] - (b[2]/(2*b[3]^3)) * v[2,3]

# Approximate 95% confidence interval
est - 1.96 * sqrt(xminvar)
est + 1.96 * sqrt(xminvar)

# True value of xmin is 3/1.7 = 1.765
```


#### strangersface

##### New Member
Oh so you're using all the points and then using the error in the regression calculation (via covariance) to get at the error in the min. OK, yeah that should work. It'll be a bit more complicated in my case because a bunch of calculations are involved before getting at the data that is fitted (which before I was only performing on the average not the individual values), but this sounds like a good way to go.

Thanks!