Hi all,
I've built a model for total sales forecasting based on cumulative sales and using least squares regression gradient [yintercept set to zero i.e. f(0)=0] to get the monthly 'run rate'. This works well for large, mostly consistent numbers (~300 per month).
e.g.
My run rate here would be [Sum(Month * Cml Sales) / Sum(Month * Month)] = 300.8 to 1dp (this is the [Sxy/Sxx] gradient of LSR)
In comparison the arithmetic mean would be 1250 / 4 = 312.5 to 1dp
Therefore I can forecast to month 6 as:
LSR gradient: 300.8 * 6 = 1805
Mean: 312.5 * 6 = 1875
Variance between the two: 70 (3.7% of the mean)
We've found the forecast derived from the LSR gradient to be more accurate than the one from the mean.
However, now we are forecasting on a sitebysite basis, where the numbers are smaller and less consistent.
e.g.
LSR: 10.9 to 1dp
Mean: 13.5 to 1dp
Forecast to month 6:
LSR: 65.6
Mean: 81
Variance between the two: 15.4 (19% of the mean)
I'm worried that using this model will not work. Since I am using cumulative sales, there is bias towards the earlier months in the LSR model (the first 10 sales appear in months 1,2,3,4, yet the last 30 sales only appear in month 4). This can be seen when reversing the order of the sales:
LSR: 15.2 to 1dp
Mean: 13.5 to 1dp
Forecast to month 6:
LSR: 91.2
Mean: 81
Variance between the two: 10.2 (13% of the mean)
The total sales are the same, so the mean in the same, but the LSR gradient has completely changed because of this bias. My LSR forecast has changed by 25 units even though the total sales have remained constant.
Am I barking up the wrong tree using the LSR gradient as a forecasting tool? Should I just stick to the mean, or is there some better way of forecasting sales taking every month into account? I want to use the same model for all situations.
I've attached a .xlsx Excel spreadsheet with these examples on (just change the extension from .zip to .xlsx)
Cheers for any and all advice
I've built a model for total sales forecasting based on cumulative sales and using least squares regression gradient [yintercept set to zero i.e. f(0)=0] to get the monthly 'run rate'. This works well for large, mostly consistent numbers (~300 per month).
e.g.
Code:
Month Sales Cml Sales
1 300 300
2 250 550
3 325 875
4 375 1250
In comparison the arithmetic mean would be 1250 / 4 = 312.5 to 1dp
Therefore I can forecast to month 6 as:
LSR gradient: 300.8 * 6 = 1805
Mean: 312.5 * 6 = 1875
Variance between the two: 70 (3.7% of the mean)
We've found the forecast derived from the LSR gradient to be more accurate than the one from the mean.
However, now we are forecasting on a sitebysite basis, where the numbers are smaller and less consistent.
e.g.
Code:
Month Sales Cml Sales
1 10 10
2 5 23
3 9 32
4 30 54
Mean: 13.5 to 1dp
Forecast to month 6:
LSR: 65.6
Mean: 81
Variance between the two: 15.4 (19% of the mean)
I'm worried that using this model will not work. Since I am using cumulative sales, there is bias towards the earlier months in the LSR model (the first 10 sales appear in months 1,2,3,4, yet the last 30 sales only appear in month 4). This can be seen when reversing the order of the sales:
Code:
Month Sales Cml Sales
1 30 10
2 9 23
3 5 32
4 10 54
Mean: 13.5 to 1dp
Forecast to month 6:
LSR: 91.2
Mean: 81
Variance between the two: 10.2 (13% of the mean)
The total sales are the same, so the mean in the same, but the LSR gradient has completely changed because of this bias. My LSR forecast has changed by 25 units even though the total sales have remained constant.
Am I barking up the wrong tree using the LSR gradient as a forecasting tool? Should I just stick to the mean, or is there some better way of forecasting sales taking every month into account? I want to use the same model for all situations.
I've attached a .xlsx Excel spreadsheet with these examples on (just change the extension from .zip to .xlsx)
Cheers for any and all advice
Attachments

23.8 KB Views: 0