How to average multiple regression curves into one?

#1
Hello everyone, I have what might be a simple question.

I have hundreds of x y plots, where the x and y values are all different for every plot.

Every x y plot fits to a quadratic curve. So I end up with hundreds of quadratic curves.

I want to average all of these curves into one, how do I do it?

Thanks!
 
#4
Thank you for the replies.

Independent variable is N fertilisation rate, dependent variable is crop yield.

Every dataset is a field: for every field there are many (N rate, yield) points. The quadratic regression of these data points is the so-called "N response curve", i.e. how a crop responds to increasing N rates.

Every dataset has a different quadratic regression curve, and the curves can vary wildly from one another. If I pool all the data from all the fields together, I get a huge scatter of points that likely won't give a meaningful R2.

I assume that the regression curve is correct for every field, but I want to merge them together, by crop, by soil type, by location, etc. Selecting fields with a specific crop might provide datasets more similar to each other, but the curves might still be shifted up or down the graph because of different intercepts, even if the slopes are similar.

So basically I'm trying to find a way to average several regression curves that come from different datasets. I hope it makes some sense!
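
For reference, the most literal reading of "averaging the curves" can be sketched as below. Because a quadratic is linear in its coefficients, the unweighted mean of the fitted coefficients gives the same curve as the pointwise mean of the fitted curves. The data here are simulated and every number is made up for illustration; this is only a naive baseline that ignores how many points each field has and how well each curve fits.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 5 fields, each with its own N rates and a noisy quadratic response.
field_coefs = []
for _ in range(5):
    n_rate = np.sort(rng.uniform(0, 250, size=8))  # kg N/ha, different for every field
    a, b, c = rng.normal([4.0, 0.03, -0.00007], [0.8, 0.005, 0.00001])
    crop_yield = a + b * n_rate + c * n_rate**2 + rng.normal(0, 0.2, size=8)
    # np.polyfit returns coefficients from the highest power down: [c, b, a]
    field_coefs.append(np.polyfit(n_rate, crop_yield, deg=2))

field_coefs = np.array(field_coefs)
mean_coefs = field_coefs.mean(axis=0)  # unweighted average of the coefficients

# Evaluate on a common grid: both routes give the same averaged curve.
grid = np.linspace(0, 250, 6)
curve_from_mean_coefs = np.polyval(mean_coefs, grid)
mean_of_curves = np.mean([np.polyval(fc, grid) for fc in field_coefs], axis=0)
print(np.allclose(curve_from_mean_coefs, mean_of_curves))
```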
 
#8
No. What they are talking about also get referred to as random effects models.
Yes, I know, but it will be a multiple regression because there will be two effects, nitrogen and field, of which field is random.

In Python this should be MixedLM from statsmodels. I do need to make the polynomial curve linear in its parameters by creating a new variable, nitrogen2 = nitrogen * nitrogen.

So the final model should be:

yield = nitrogen + nitrogen2 + field

Does it make sense?
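
A minimal sketch of this model with the statsmodels formula interface, on simulated data. The column names, the number of fields, and all coefficient values are assumptions for illustration. The response column is called crop_yield because yield is a reserved word in Python, and nitrogen is expressed in units of 100 kg/ha here only to keep the two predictors on a similar scale.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)

# Hypothetical data: 30 fields sharing one curve shape, each shifted up or down.
rows = []
for field in range(30):
    shift = rng.normal(0, 1.0)              # field-specific intercept shift
    for n in np.linspace(0, 2.5, 8):        # N rate in units of 100 kg/ha (assumed)
        y = 4.0 + shift + 3.0 * n - 0.8 * n**2 + rng.normal(0, 0.3)
        rows.append({"field": field, "nitrogen": n, "crop_yield": y})
data = pd.DataFrame(rows)
data["nitrogen2"] = data["nitrogen"] ** 2   # squared term as its own column

# Fixed effects for nitrogen and nitrogen2; "field" enters as a random intercept.
model = smf.mixedlm("crop_yield ~ nitrogen + nitrogen2", data, groups=data["field"])
result = model.fit()

print(result.params)   # fixed effects plus the scaled random-intercept variance
print(result.cov_re)   # estimated variance of the field intercepts
```

With no re_formula argument, MixedLM fits a random intercept per group only.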
 

Dason

Ambassador to the humans
#9
That would only give you random intercepts. So if the curves look exactly the same except some are shifted up or down then your model would be fine but I'm guessing that isn't appropriate here.
 
#10
So what other option would I have? I'm thinking that if the slopes are not similar, then there is really no point in averaging them, because the data behave differently. The only way forward would be a statistical test that compares two regression equations to see whether they differ.
 
#11
Likelihood ratio tests, but you are likely fine examining AICc or another measure of fit.

Also, examination of residuals is always important.
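
A rough sketch of such a comparison, assuming the question is whether the nitrogen slope differs between fields. It fits a random-intercept model and a model that also has a random nitrogen slope, both by maximum likelihood (reml=False) so the log-likelihoods are comparable, then forms a likelihood ratio statistic and plain AIC. The data are simulated and all values are made up. The chi-square p-value with 2 degrees of freedom is only approximate, since a variance is being tested at the boundary of its parameter space.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

rng = np.random.default_rng(2)

# Hypothetical data: fields differ in both intercept and nitrogen slope.
rows = []
for field in range(30):
    u0 = rng.normal(0, 1.0)                 # intercept deviation for this field
    u1 = rng.normal(0, 0.8)                 # slope deviation for this field
    for n in np.linspace(0, 2.5, 8):        # N rate in units of 100 kg/ha (assumed)
        y = (4.0 + u0) + (3.0 + u1) * n - 0.8 * n**2 + rng.normal(0, 0.3)
        rows.append({"field": field, "nitrogen": n, "crop_yield": y})
data = pd.DataFrame(rows)
data["nitrogen2"] = data["nitrogen"] ** 2

formula = "crop_yield ~ nitrogen + nitrogen2"
m0 = smf.mixedlm(formula, data, groups=data["field"]).fit(reml=False)
m1 = smf.mixedlm(formula, data, groups=data["field"],
                 re_formula="~nitrogen").fit(reml=False)

# Likelihood ratio test: m1 adds a slope variance and an intercept-slope covariance.
lr_stat = 2 * (m1.llf - m0.llf)
p_value = stats.chi2.sf(lr_stat, df=2)

# AIC by hand; k counts the entries of params plus the residual variance (assumption).
aic0 = 2 * (len(m0.params) + 1) - 2 * m0.llf
aic1 = 2 * (len(m1.params) + 1) - 2 * m1.llf
print(lr_stat, p_value, aic0, aic1)

# Residuals for a quick check, e.g. plotted against fitted values.
residuals = m1.resid
```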
 
#13
In your model you have several options. One option is to make only the intercepts random across fields (i.e., allow the intercepts to vary across fields), and assume that the regression coefficients for nitrogen and nitrogen2 are the same across fields. This is the random intercept model.

Alternatively, you can make both the intercepts and the regression coefficients for nitrogen and nitrogen2 random. In this way, you will get an average intercept, an average regression coefficient for nitrogen, and an average coefficient for nitrogen2. In addition, you will get an estimate of the variation of the regression coefficients around their means across the different fields. If the variation of a regression coefficient around its mean is small, the coefficient is similar across fields. If the variation is large, the coefficient varies a lot across fields. There are also tests for whether the variance of a regression coefficient is significantly different from 0, such as deviance tests.

If you choose to perform a multilevel analysis, it would be wise to read some literature about it before running the analysis. It is important to know what you are doing.
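
The second option could look roughly like this in statsmodels, again on simulated data with made-up values and hypothetical column names. Passing re_formula="~nitrogen + nitrogen2" asks MixedLM for a random intercept plus random coefficients for both terms. The fixed effects are then the average curve, and the diagonal of cov_re estimates how much each coefficient varies across fields. Models with this many random effects can be slow to converge or emit warnings on real data, so treat this as a sketch rather than a recipe.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)

# Hypothetical data: every field has its own intercept, slope and curvature.
rows = []
for field in range(40):
    u0 = rng.normal(0, 1.0)                 # intercept deviation
    u1 = rng.normal(0, 0.6)                 # nitrogen coefficient deviation
    u2 = rng.normal(0, 0.15)                # nitrogen2 coefficient deviation
    for n in np.linspace(0, 2.5, 10):       # N rate in units of 100 kg/ha (assumed)
        y = (4.0 + u0) + (3.0 + u1) * n + (-0.8 + u2) * n**2 + rng.normal(0, 0.3)
        rows.append({"field": field, "nitrogen": n, "crop_yield": y})
data = pd.DataFrame(rows)
data["nitrogen2"] = data["nitrogen"] ** 2

model = smf.mixedlm(
    "crop_yield ~ nitrogen + nitrogen2",
    data,
    groups=data["field"],
    re_formula="~nitrogen + nitrogen2",     # random intercept and both random slopes
)
result = model.fit()

average_curve = result.fe_params            # average intercept and coefficients
between_field_var = np.diag(result.cov_re)  # variance of each coefficient across fields
print(average_curve)
print(between_field_var)
```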
 