# comparing linear models to test for statistical similarities

#### catbelize

##### New Member
I am trying to describe plant growth over time using accumulated temperature (DD) as my independent variable and the plant's 'growth stage' (GS) as my dependent variable.
The shape of the data is sigmoid and I have three different sets of data for different geographic areas.
I decided to use a third order polynomial to describe the data (which it does well in all three cases). Examples below:

`GS  = -0.969 + 0.02909 * DD + 0.000062 * DD**2 - 0.0000001 * DD**3`
`GS2 =  0.847 + 0.03232 * DD + 0.000060 * DD**2 - 0.0000002 * DD**3`
`GS3 =  0.479 + 0.0501 * DD + 0.000032 * DD**2 - 0.000003 * DD**3`

The above polynomials are from three different locations, and describe plant growth stage (GS) in relation to accumulated temperature (DD). How can I find out if they are statistically different from one another, i.e. is their output very different when supplied with the same DD? If they are not different, then I could describe plant growth in any of the three geographic areas using a polynomial model that was derived in just one of the locations.
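As a first informal look (not a statistical test), the three posted polynomials can be evaluated at the same DD values and their predictions compared. This is only a sketch: the coefficients are copied from the post as rounded there, and the DD grid is a made-up assumption because the real DD range of the data is not given.

```python
# Coefficients (intercept, DD, DD**2, DD**3) copied from the post, as rounded there.
MODELS = {
    "GS": (-0.969, 0.02909, 0.000062, -0.0000001),
    "GS2": (0.847, 0.03232, 0.000060, -0.0000002),
    "GS3": (0.479, 0.0501, 0.000032, -0.000003),
}

# Hypothetical DD grid; the real range of the data is not stated in the post.
DD_GRID = [0, 50, 100, 150, 200]


def predict(coefs, dd):
    """Evaluate b0 + b1*dd + b2*dd**2 + b3*dd**3."""
    b0, b1, b2, b3 = coefs
    return b0 + b1 * dd + b2 * dd**2 + b3 * dd**3


def compare(models=MODELS, grid=DD_GRID):
    """Return (dd, predictions, max - min of the predictions) for each dd."""
    rows = []
    for dd in grid:
        preds = {name: predict(c, dd) for name, c in models.items()}
        spread = max(preds.values()) - min(preds.values())
        rows.append((dd, preds, spread))
    return rows


for dd, preds, spread in compare():
    line = "  ".join(f"{name}={value:8.3f}" for name, value in preds.items())
    print(f"DD={dd:4d}  {line}  spread={spread:.3f}")
```

A large spread says the fitted curves disagree, but it says nothing about whether the disagreement exceeds what the scatter in the data would produce by chance. That needs the raw observations, not just the fitted coefficients.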

Many thanks

#### lex logan

##### New Member
There are formal tests of whether things are different, but when Excel, for example, does regressions, it provides an upper and lower estimate for each coefficient. Take a look at those confidence intervals -- do they overlap a lot or are they distinctly different for the three locations? If they mostly overlap, I'd try an overall regression with indicator (dummy) variables. This model assumes that there is a fixed difference, on average, from one location to another. If instead the coefficients seem quite distinct for the different locations, you might want a regression with interactions between the dummy variables and one or more of the polynomial terms. Look up "dummy variables" and "model building" in a stats textbook.
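A minimal sketch of the pooled regression with indicator variables, using plain NumPy least squares. Everything about the data here is an assumption for illustration: the observations are simulated from one shared cubic plus a made-up shift per location, DD is rescaled to x = DD / 1000 to keep the cubic term well conditioned, and the plus-or-minus 2 standard error range is only an approximate 95% interval.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: a shared cubic in x = DD / 1000 plus a made-up shift per location.
SHIFTS = {"A": 0.0, "B": 0.5, "C": -0.3}
N_PER_LOC = 40

x_parts, y_parts, loc_parts = [], [], []
for name, shift in SHIFTS.items():
    xs = rng.uniform(0.0, 1.5, N_PER_LOC)
    ys = 1.0 + 4.0 * xs + 6.0 * xs**2 - 3.0 * xs**3 + shift
    ys = ys + rng.normal(0.0, 0.2, N_PER_LOC)  # simulated measurement noise
    x_parts.append(xs)
    y_parts.append(ys)
    loc_parts.append(np.full(N_PER_LOC, name))

x = np.concatenate(x_parts)
y = np.concatenate(y_parts)
loc = np.concatenate(loc_parts)

# Design matrix: intercept, x, x^2, x^3, then 0/1 indicators for B and C.
# Location A is the reference level, so it gets no indicator column.
X = np.column_stack([
    np.ones_like(x), x, x**2, x**3,
    (loc == "B").astype(float),
    (loc == "C").astype(float),
])
NAMES = ["const", "x", "x^2", "x^3", "loc_B", "loc_C"]

coef, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ coef
dof = len(y) - X.shape[1]
sigma2 = resid @ resid / dof  # residual variance estimate
se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))

for name, b, s in zip(NAMES, coef, se):
    print(f"{name:>6}: {b:8.3f}  (approx. 95% interval {b - 2 * s:7.3f} to {b + 2 * s:7.3f})")
```

The `loc_B` and `loc_C` coefficients estimate the fixed vertical difference from location A. If their intervals comfortably include zero, a single curve may be adequate; this model cannot detect differences in curve shape.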

#### catbelize

##### New Member
Thanks for that. I'll have a look at what you suggest.

#### Dason

##### Ambassador to the humans
Actually, it would probably be best to fit a model that includes interaction terms between the location identifier and *all* of the polynomial terms. This essentially allows you to fit a different third order polynomial for each location. Then you could do a formal test (for example, an F-test comparing the models with and without those terms) of whether any of the interaction terms are significantly different from 0, which would tell you if you have evidence that you need a different polynomial for any of the other locations.
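A sketch of that comparison as a nested-model F-test, again with NumPy only. The data are simulated and every number in `simulate` is an assumption. Note one choice made here: the reduced model is a single common cubic, so the test covers the location intercepts as well as the interactions. To test the interactions alone, keep the location dummies in the reduced model.

```python
import numpy as np


def cubic_terms(x):
    """Columns 1, x, x^2, x^3 for a cubic polynomial."""
    return np.column_stack([np.ones_like(x), x, x**2, x**3])


def rss(X, y):
    """Residual sum of squares of the least-squares fit of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)


def f_test_common_vs_separate(x, y, loc):
    """F statistic for one common cubic versus a separate cubic per location."""
    base = cubic_terms(x)
    blocks = [base]
    for level in np.unique(loc)[1:]:  # first level is the reference
        dummy = (loc == level).astype(float)[:, None]
        blocks.append(base * dummy)   # dummy times every cubic term
    full = np.hstack(blocks)
    rss_reduced, rss_full = rss(base, y), rss(full, y)
    df_num = full.shape[1] - base.shape[1]
    df_den = len(y) - full.shape[1]
    F = ((rss_reduced - rss_full) / df_num) / (rss_full / df_den)
    return F, df_num, df_den


def simulate(rng, slopes, n_per_loc=40, noise=0.2):
    """Hypothetical data: cubic in x = DD / 1000 with a location-specific slope."""
    xs, ys, locs = [], [], []
    for name, slope in slopes.items():
        x = rng.uniform(0.0, 1.5, n_per_loc)
        y = 1.0 + slope * x + 6.0 * x**2 - 3.0 * x**3
        xs.append(x)
        ys.append(y + rng.normal(0.0, noise, n_per_loc))
        locs.append(np.full(n_per_loc, name))
    return np.concatenate(xs), np.concatenate(ys), np.concatenate(locs)


rng = np.random.default_rng(1)
same = simulate(rng, {"A": 4.0, "B": 4.0, "C": 4.0})  # identical curves
diff = simulate(rng, {"A": 4.0, "B": 6.0, "C": 4.0})  # location B differs
F_same, df_num, df_den = f_test_common_vs_separate(*same)
F_diff, _, _ = f_test_common_vs_separate(*diff)
print(f"same curves:      F({df_num}, {df_den}) = {F_same:.2f}")
print(f"different curves: F({df_num}, {df_den}) = {F_diff:.2f}")
```

A large F relative to the F distribution with `df_num` and `df_den` degrees of freedom is evidence that one common polynomial is not enough. If SciPy is available, `scipy.stats.f.sf(F, df_num, df_den)` gives the p-value.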

Alternatively, you could fit a random coefficients model. Most likely the best fits will be slightly different at each location, but we could envision those polynomials as coming from a common distribution, and this type of model allows for that.

#### catbelize

##### New Member
thanks for that...think I'm going to have to look into this to figure out how to do it.