data fitting

#1
Hi,
I have data consisting of item indices (x axis) and each item's mean rating (y axis). I want to draw a trendline to understand the tendency of the data. Please look at the following figure. I used the polyfit method in numpy; the R-squared is 0.05.
How should I interpret this? What else would you suggest to understand the data better?
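
For readers following along, here is a minimal sketch of the fit described in this post: a degree-1 np.polyfit plus R-squared computed from the residuals. The data are synthetic stand-ins (a weak downward trend buried in noise), not the poster's data, and the helper name linear_fit_r2 is made up for illustration.

```python
import numpy as np

def linear_fit_r2(x, y):
    """Fit y = slope * x + intercept with np.polyfit; return (slope, intercept, r2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    ss_res = np.sum(residuals ** 2)        # variation left unexplained by the line
    ss_tot = np.sum((y - y.mean()) ** 2)   # total variation around the mean
    return slope, intercept, 1.0 - ss_res / ss_tot

# Synthetic stand-in data (an assumption, NOT the data from the thread):
# a small negative slope plus noise that is large relative to the trend.
rng = np.random.default_rng(0)
x = np.arange(2000)
y = 3.67 - 0.0002 * x + rng.normal(0.0, 0.5, size=x.size)

slope, intercept, r2 = linear_fit_r2(x, y)
print(f"slope={slope:.6f} intercept={intercept:.3f} r2={r2:.3f}")
```

A low R-squared like this means the line explains only a small share of the spread in the ratings. It does not by itself say the trend is absent, only that individual items vary a lot around it.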

The confidence interval of the model, computed in R with confint(model, level=0.95):
                     2.5 %         97.5 %
(Intercept)   3.6288915640   3.7105907379
x            -0.0002462929  -0.0001795994
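
One way to read this interval is to rescale the slope limits to a larger step in x. The sketch below copies the two slope limits from the output above. The step of 1000 items is an arbitrary illustrative choice, and change_over is a hypothetical helper name.

```python
# 95% confidence limits for the slope, copied from the confint() output above.
slope_low, slope_high = -0.0002462929, -0.0001795994

def change_over(step, low, high):
    """Scale a slope interval to the implied change in y over `step` units of x."""
    return step * low, step * high

# step=1000 is an illustrative assumption, not a value from the thread.
lo, hi = change_over(1000, slope_low, slope_high)
print(f"Change in mean rating per 1000 items: {lo:.3f} to {hi:.3f}")  # -0.246 to -0.180

# Both limits share a sign, so the interval does not contain zero.
excludes_zero = (slope_low < 0 and slope_high < 0) or (slope_low > 0 and slope_high > 0)
print("Interval excludes zero:", excludes_zero)
```

Read this way, the fitted trend is small per item but consistently negative, which is compatible with a low R-squared.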


[Figure: Rplot.png]
Thanks

hlsmith

Less is more. Stay pure. Stay poor.
#2
Well, you should report the slope of the line as the change in mean rating for a given increase of X units in the x-variable. It seems like you have some heterogeneity in the variance of the data, which may warrant a sandwich (robust) SE estimator. Placing confidence intervals on the R-sq and slope estimates may also help with interpretation.
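
As a rough illustration of the sandwich idea, the sketch below fits ordinary least squares with numpy and compares the classical standard errors with an HC1-type sandwich estimate. The choice of the HC1 variant, the function name ols_with_robust_se, and the synthetic data (noise that grows with x) are all assumptions made for the example, not details from the thread.

```python
import numpy as np

def ols_with_robust_se(x, y):
    """OLS of y on [1, x]; return (coefficients, classical SEs, HC1 sandwich SEs)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones_like(x), x])   # design matrix: intercept, slope
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta

    # Classical SEs assume one common error variance for every observation.
    sigma2 = resid @ resid / (n - k)
    se_classical = np.sqrt(np.diag(sigma2 * XtX_inv))

    # Sandwich: bread = (X'X)^-1, meat = X' diag(resid^2) X, with the HC1 n/(n-k) factor.
    meat = X.T @ (X * resid[:, None] ** 2)
    cov_hc1 = (n / (n - k)) * XtX_inv @ meat @ XtX_inv
    se_robust = np.sqrt(np.diag(cov_hc1))
    return beta, se_classical, se_robust

# Synthetic stand-in data (an assumption): noise standard deviation grows with x.
rng = np.random.default_rng(1)
x = np.arange(2000)
y = 3.67 - 0.0002 * x + rng.normal(0.0, 0.2 + 0.0004 * x)

beta, se_classical, se_robust = ols_with_robust_se(x, y)
print("slope:", beta[1])
print("classical SE:", se_classical[1], "sandwich SE:", se_robust[1])
```

If the two standard errors for the slope differ noticeably, that is a hint that the constant-variance assumption behind the classical interval is doubtful.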

Of note, making the plotted points partially transparent would help with visualizing the density of values in certain regions.
 
#4
Thanks. I updated the plot; the data seem widely scattered. Are there any statistical methods for understanding this type of data?
 

hlsmith

#5
It is not tremendous, but you have funnel-shaped residuals, so as the item index increases the variability may increase. Thus, if the standard errors are based on all of the data, they may be a little off when making estimates or predictions at the tails of your data.
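
A simple way to check for a funnel shape numerically is to fit the line, sort the residuals by x, and compare their spread across a few equal-sized groups. This is only a sketch: the data are synthetic (noise widening with x), and residual_sd_by_bin and the choice of four groups are assumptions made for illustration.

```python
import numpy as np

def residual_sd_by_bin(x, y, n_bins=4):
    """Fit a line, then return the residual standard deviation in n_bins groups ordered by x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (slope * x + intercept)
    order = np.argsort(x)                          # group residuals from low x to high x
    groups = np.array_split(resid[order], n_bins)
    return [float(g.std(ddof=1)) for g in groups]

# Synthetic stand-in data (an assumption): the noise widens as x increases.
rng = np.random.default_rng(2)
x = np.arange(2000)
y = 3.67 - 0.0002 * x + rng.normal(0.0, 0.2 + 0.0004 * x)

sds = residual_sd_by_bin(x, y)
print("Residual SD from lowest-x group to highest-x group:", sds)
```

If the spread rises steadily from the first group to the last, the residuals fan out with x, and standard errors that assume a single common variance deserve some caution near the ends of the range.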