# data fitting

#### menahg

##### New Member
Hi,
I have data consisting of item indices (x axis) and the mean rating of each item (y axis). I want to draw a trendline to understand the tendency of the data. Please look at the following figure. I fitted a line with NumPy's polyfit, and the R-squared is 0.05.
How can I interpret this? What else would you suggest to understand the data better?
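
For reference, a minimal sketch of how a straight-line fit and its R-squared can be computed with `np.polyfit`. The data here are synthetic stand-ins (the real ratings are not in the post), and the noise level and coefficients are assumptions chosen only to give a weak downward trend.

```python
import numpy as np

# Synthetic stand-in data: NOT the poster's ratings, just an assumed weak trend plus noise.
rng = np.random.default_rng(0)
x = np.arange(1000, dtype=float)                       # item indices
y = 3.67 - 0.0002 * x + rng.normal(0.0, 0.5, x.size)   # noisy "mean ratings"

# Degree-1 polynomial fit; polyfit returns the highest-order coefficient first.
slope, intercept = np.polyfit(x, y, deg=1)
y_hat = slope * x + intercept

# R-squared = 1 - (residual sum of squares) / (total sum of squares).
ss_res = np.sum((y - y_hat) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

print(f"slope={slope:.6f}, intercept={intercept:.4f}, R^2={r_squared:.4f}")
```

A small R-squared of this kind says the line explains little of the point-to-point variation in y; it does not by itself say the slope is zero.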

The 95% confidence intervals for the model coefficients, from R's `confint(model, level = 0.95)`:

| | 2.5 % | 97.5 % |
|---|---|---|
| (Intercept) | 3.6288915640 | 3.7105907379 |
| x | -0.0002462929 | -0.0001795994 |

Thanks

#### hlsmith

##### Less is more. Stay pure. Stay poor.
Well, you should report the slope of the line as the expected change in y for a given increase in the x-variable. It seems like you have some heterogeneity in the variance of the data (heteroscedasticity), which may warrant a sandwich (robust) SE estimator. Placing confidence intervals on the R-sq and slope estimates may also help with interpretation.
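
A rough sketch of what a sandwich estimate looks like, written in plain NumPy so the pieces are visible. It uses the HC1 variant, which is one common choice and an assumption here, and the data are synthetic with noise that grows with x. The helper name `ols_with_robust_se` is made up for illustration.

```python
import numpy as np

def ols_with_robust_se(x, y):
    """OLS of y on x (with intercept), returning classical and HC1 sandwich SEs."""
    X = np.column_stack([np.ones_like(x), x])   # design matrix: [1, x]
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)            # the "bread"
    beta = XtX_inv @ X.T @ y                    # [intercept, slope]
    resid = y - X @ beta

    # Classical SEs assume one common error variance for all points.
    sigma2 = resid @ resid / (n - k)
    se_classical = np.sqrt(np.diag(sigma2 * XtX_inv))

    # Sandwich: bread @ meat @ bread, where the meat weights each row by its
    # own squared residual. The n / (n - k) factor is the HC1 correction.
    meat = X.T @ (X * resid[:, None] ** 2)
    cov_hc1 = n / (n - k) * XtX_inv @ meat @ XtX_inv
    se_robust = np.sqrt(np.diag(cov_hc1))
    return beta, se_classical, se_robust

# Synthetic data (assumed, not the poster's): noise SD increases with x.
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1000.0, size=2000)
y = 3.67 - 0.0002 * x + rng.normal(0.0, 0.2 + 0.0006 * x)

beta, se_classical, se_robust = ols_with_robust_se(x, y)

# Report the slope per 100-item increase with an approximate 95% interval.
step = 100.0
lo = (beta[1] - 1.96 * se_robust[1]) * step
hi = (beta[1] + 1.96 * se_robust[1]) * step
print(f"change in y per {step:.0f} items: {beta[1] * step:.4f} ({lo:.4f}, {hi:.4f})")
print("classical SE (slope):", se_classical[1], " robust SE (slope):", se_robust[1])
```

Scaling the slope to a meaningful step (here 100 items, an arbitrary choice) tends to be easier to read than a per-item change with many leading zeros.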

Of note, making the plotted points partially transparent would help with visualizing the density of values in certain regions.
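
One way to do this, assuming the plot is drawn with matplotlib (the post does not say which plotting library was used), is the `alpha` argument of `scatter`. The data and the output filename below are placeholders.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Placeholder data standing in for the item indices and mean ratings.
rng = np.random.default_rng(0)
x = rng.integers(0, 1000, size=5000)
y = 3.67 - 0.0002 * x + rng.normal(0.0, 0.6, size=x.size)

fig, ax = plt.subplots()
# alpha=0.1 makes each point 10% opaque, so overlapping points build up
# into darker regions and sparse regions stay faint.
ax.scatter(x, y, s=8, alpha=0.1, edgecolors="none")
ax.set_xlabel("item index")
ax.set_ylabel("mean rating")
fig.savefig("scatter_alpha.png", dpi=150)
```

The right `alpha` depends on how many points overlap; values between roughly 0.05 and 0.3 are a reasonable range to try.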

#### Dason

*might* have heterogeneity. It's hard to tell when all the plotted points are completely opaque.

#### menahg

##### New Member
Thanks. I updated the plot, and the data seem quite scattered. Are there any statistical methods for understanding this type of data?

#### hlsmith

##### Less is more. Stay pure. Stay poor.
It is not tremendous, but you have funnel-shaped residuals, so as the item index increases, the variability may increase. Thus, if the standard errors are based on all of the data, they may be a little off when making estimates or predictions at the tails of your data.
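
A quick, informal way to check for a funnel is to compare the residual spread across slices of x. The sketch below does that on synthetic data whose noise is assumed to grow with x; the five equal-count bins are an arbitrary choice, and this is a descriptive check rather than a formal test.

```python
import numpy as np

# Synthetic data (assumed): residual SD grows linearly with x.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1000.0, size=3000)
y = 3.67 - 0.0002 * x + rng.normal(0.0, 0.2 + 0.0006 * x)

# Fit the line and compute residuals.
slope, intercept = np.polyfit(x, y, 1)
resid = y - (slope * x + intercept)

# Split x into 5 equal-count bins using the interior quantiles as cut points.
n_bins = 5
edges = np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1))
bin_idx = np.digitize(x, edges[1:-1])          # values 0 .. n_bins - 1

# Residual standard deviation within each bin.
spread = np.array([resid[bin_idx == b].std(ddof=1) for b in range(n_bins)])

for b in range(n_bins):
    print(f"x in [{edges[b]:7.1f}, {edges[b + 1]:7.1f}]: residual SD = {spread[b]:.3f}")
```

If the residual SD climbs steadily from the first bin to the last, that matches the funnel pattern, and a single pooled standard error will understate uncertainty at the high-variance end.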