I performed a linear regression with a first order and second order polynomial on the same dataset. I now want to determine if the more complex (second order polynomial) is significantly better than the first one. I'm doing this within the scope of a design standard for mechanical tests. An example is worked out in the standard which I tried to follow, but I assume I'm doing something wrong (or probably less likely) there is a mistake in the example. The dataset of the example is the following:

I'm performing my regression analyses in Python. For the 1st order linear regression I have:

Code:

```
import numpy as np
import pandas as pd
import statsmodels.api as sm
from sklearn.preprocessing import PolynomialFeatures
df_sn = df.copy()
# add a log_strain and log_life column
df_sn["log_strain"] = np.log10(df_sn["strain_range"])
df_sn["log_life"] = np.log10(df_sn["fatigue_life"])
# y: The variable which is dependent on x.
# --> log_life is the dependent variable
y = df_sn['log_life']
# X: The independent variable.
# --> log_stress is the independent variable
X = sm.add_constant(df_sn['log_strain'].to_numpy())
lin_reg = sm.OLS(y, X).fit()
print(lin_reg.params)
lin_reg.summary()
```

Code:

```
# Create a 2nd order polynomial
poly = PolynomialFeatures(degree=2)
# Fit & transform to standardized range
poly_feats = poly.fit_transform(np.array(df_sn['log_strain']).reshape(-1, 1))
X = poly_feats
y = df_sn.log_life
sm.add_constant(X)
# Ordinary least squares regression with 2nd order polynomial
quad_reg = sm.OLS(y, X).fit()
quad_reg.summary()
```

Now for the dataset in the first image, they report these results:

I get the same results for the R^2 and Std. Dev. so I'm pretty sure my models have been fitted correctly and in the same way. But when I try to determine the F calculated I'm unable to figure out what I'm doing wrong.

When I try to calculate it according to the formula of the standard as such:

Code:

```
>>> rsse1, rsse2 = lin_reg.ssr, quad_reg.ssr
>>> v1, v2 = lin_reg.df_resid, quad_reg.df_resid
>>> print(v1, v2)
17.0 16.0
>>> Fcalc = ((rsse1 - rsse2)/rsse1)/(v2/v1-v2)
>>> print(Fcalc)
-0.03330950467712325
```

Code:

```
>>>from statsmodels.stats.anova import anova_lm
>>>anovaResults = anova_lm(lin_reg, quad_reg)
>>>print(anovaResults)
df_resid ssr df_diff ss_diff F Pr(>F)
0 17.0 1.484616 0.0 NaN NaN NaN
1 16.0 0.739930 1.0 0.744686 16.102855 0.001005
```