(crossposting from here)
I am trying to do many linear regressions on series of data points where each data point is associated with a standard deviation.
For simplicity's sake, let's say that each series has 3 points (mean measurements), each of which has its own standard deviation.
I am trying to find a way to do a linear fit to the data points that takes their standard deviations into account as a measure of the confidence I have in each point (a larger SD meaning less confidence in that mean value).
Series 1, points A, B and C, given as (X-coordinate, Y-coordinate, SD):
A = (1, 2, 0.5)
B = (2, 3, 0.75)
C = (3, 2, 0.4)
I have thought of three possible ways to do this, but I am also looking for other approaches:
- Randomize the data. For each series, I would randomly draw values that vary around each mean within its SD, generating, say, 1000 datasets per series. I would then fit each dataset of each series.
- Assign each point a weight inversely proportional to its SD, so that points with a higher SD get a smaller weight. However, in some series all 3 points have very large SDs, meaning that I would essentially be giving each point the same weight anyway.
- For each data point, add mean+SD and mean-SD as 2 extra points. For instance, for A = (1,2,0.5) I would have the points (1,2), (1,1.5), (1,2.5). Then do the regression on the 9 points instead of the 3 original values.
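To make the three options concrete, here is a minimal sketch in plain Python applied to Series 1. Several things here are assumptions and not part of the question: the helper name `weighted_linear_fit` is hypothetical, the weights are taken as 1/SD² (the usual inverse-variance choice, whereas the text above says inversely proportional to the SD), and the randomized datasets are drawn from a normal distribution centred on each mean.

```python
import random
import statistics

def weighted_linear_fit(xs, ys, weights=None):
    """Fit y = intercept + slope * x by (weighted) least squares.

    Hypothetical helper for illustration; returns (slope, intercept).
    """
    if weights is None:
        weights = [1.0] * len(xs)  # unweighted (ordinary) least squares
    sw = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, xs)) / sw
    y_bar = sum(w * y for w, y in zip(weights, ys)) / sw
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - x_bar) * (y - y_bar)
              for w, x, y in zip(weights, xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Series 1 from the post, as (x, y, SD)
series = [(1, 2, 0.5), (2, 3, 0.75), (3, 2, 0.4)]
xs = [p[0] for p in series]
ys = [p[1] for p in series]
sds = [p[2] for p in series]

# Reference: plain unweighted fit to the 3 means
ols_slope, ols_intercept = weighted_linear_fit(xs, ys)

# Option 1: Monte Carlo. Draw 1000 datasets, fit each one, and
# summarise the spread of the fitted slopes.
# Assumption: normally distributed noise with the given SDs.
rng = random.Random(0)  # fixed seed so the run is repeatable
mc_slopes = []
for _ in range(1000):
    noisy_ys = [rng.gauss(y, s) for y, s in zip(ys, sds)]
    slope, intercept = weighted_linear_fit(xs, noisy_ys)
    mc_slopes.append(slope)
mc_mean = statistics.mean(mc_slopes)
mc_sd = statistics.stdev(mc_slopes)

# Option 2: weighted least squares.
# Assumption: inverse-variance weights, w = 1 / SD**2.
weights = [1.0 / s ** 2 for s in sds]
wls_slope, wls_intercept = weighted_linear_fit(xs, ys, weights)

# Option 3: fit the 9 points (mean - SD, mean, mean + SD at each x).
xs9 = [x for x in xs for _ in range(3)]
ys9 = [v for y, s in zip(ys, sds) for v in (y - s, y, y + s)]
aug_slope, aug_intercept = weighted_linear_fit(xs9, ys9)

print("unweighted:", ols_slope, ols_intercept)
print("Monte Carlo slope mean and SD:", mc_mean, mc_sd)
print("weighted:", wls_slope, wls_intercept)
print("9 points:", aug_slope, aug_intercept)
```

Two things this sketch suggests, under the assumptions above. In option 3 the extra points sit symmetrically around each mean, so the fitted line is the same as the unweighted fit to the 3 means; the SDs change the scatter around the line but not the line itself. In option 2 only the relative sizes of the weights matter, so a series where all SDs are large but similar gives nearly the same line as an unweighted fit; the overall size of the SDs shows up in how uncertain the slope and intercept are, which is what the spread of slopes from option 1 estimates.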
How would you approach this issue?
Thanks