Forecasting with fewer data points


Can someone provide more insight into how we can forecast with fewer data points and still come up with accurate models?

I have tried regression; however, the MAD is at 18%, which is too high for my model.



No cake for spunky
Having more data is useful, but forecasting error can come from many sources: structural breaks in your data, violations of assumptions, the wrong predictors, outliers, etc. Depending on the type of data you have (cross-sectional versus time series), it could also come from using the wrong type of regression, that is, one that does not deal with autocorrelation, which linear regression usually does not address.

What are you trying to predict and what type of data do you have?
@noetsi thanks for your insights.

I am basically tracking the number of errors/issues logged and keeping track of whether it's normal or not (a time series data set). I am using a linear regression model to predict the errors for the next day. The model is trained on a daily basis.

Do you recommend that I filter out out-liners? Looking forward to your recommendation.


No cake for spunky
Experts recommend that you not use linear regression with time series data because of autocorrelation. That is bad enough; if your DV and IV have trends in them and are not cointegrated, you will have bias in your model, which is worse.
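
For anyone who wants to check whether autocorrelation is a problem in their own regression, here is a minimal sketch in plain Python. It fits a straight-line trend to a series and computes the Durbin-Watson statistic on the residuals (values near 2 suggest little lag-1 autocorrelation, values well below 2 suggest positive autocorrelation). The function names and the `daily_errors` counts are made up for illustration; this is not the original poster's model.

```python
def linear_trend_residuals(y):
    """Fit y = a + b*t by ordinary least squares (t = 0, 1, 2, ...) and return residuals."""
    n = len(y)
    if n < 2:
        raise ValueError("need at least two observations")
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (yt - y_mean) for t, yt in enumerate(y))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return [yt - (intercept + slope * t) for t, yt in enumerate(y)]


def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences / sum of squares."""
    denom = sum(e * e for e in residuals)
    if denom == 0:
        raise ValueError("residuals are all zero; statistic is undefined")
    num = sum((residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals)))
    return num / denom


# Hypothetical daily error counts, for illustration only.
daily_errors = [12, 15, 11, 14, 13, 40, 12, 16]
dw = durbin_watson(linear_trend_residuals(daily_errors))
print(f"Durbin-Watson statistic: {dw:.2f}")
```

With only a handful of points the statistic is not very informative; it is meant to be run on the full daily history.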

If you are predicting errors for the next day, you likely have hundreds of data points (because even a year would be 365 data points). Why not try exponential smoothing for prediction? This is a simple time series method that has historically been shown to be accurate and robust to violations of assumptions. Once you learn it, it is easy to do. Look for Holt-Winters on the internet.
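
To make the idea concrete, here is a minimal sketch of simple exponential smoothing, the most basic member of the family (Holt-Winters adds trend and seasonal terms on top of it). It is plain Python with no libraries; the function name `ses_forecast`, the smoothing weight `alpha=0.3`, and the sample counts are assumptions chosen for illustration, not tuned values.

```python
def ses_forecast(series, alpha=0.3):
    """Simple exponential smoothing.

    Returns (fitted, next_forecast), where fitted[t] is the forecast that
    was made for series[t] before observing it, and next_forecast is the
    forecast for the period after the last observation.
    """
    if not series:
        raise ValueError("series must not be empty")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]  # initialise the level with the first observation
    fitted = []
    for y in series:
        fitted.append(level)                     # forecast made before seeing y
        level = alpha * y + (1 - alpha) * level  # update the level with y
    return fitted, level


# Hypothetical daily error counts, for illustration only.
daily_errors = [12, 15, 11, 14, 13, 40, 12, 16]
fitted, next_day = ses_forecast(daily_errors, alpha=0.3)

# Mean absolute error of the one-step-ahead forecasts, skipping the first
# point because its "forecast" is just the initial value.
errors = [abs(y - f) for y, f in zip(daily_errors[1:], fitted[1:])]
mean_abs_error = sum(errors) / len(errors)
print(f"forecast for next day: {next_day:.2f}, mean absolute error: {mean_abs_error:.2f}")
```

A larger `alpha` reacts faster to recent days; a smaller one smooths more. Trying a few values and comparing the mean absolute error on held-out days is a reasonable way to pick one.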

If you meant outliers, I am conflicted on that point. Many statisticians, which I am not, recommend against removing them. But according to one source I read, leaving them in will distort your results, which makes sense to me. So I do remove them, but I say this knowing many strongly disagree.
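
If you want to see how much the outliers matter before deciding, one option is to flag them rather than delete them and compare the model with and without those days. The sketch below uses the common 1.5 x IQR rule (Tukey's fences) from the Python standard library; the rule, the function name `flag_outliers`, and the sample counts are my assumptions for illustration, not a method anyone in this thread specified.

```python
import statistics


def flag_outliers(values, k=1.5):
    """Return the indices of values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < lower or v > upper]


# Hypothetical daily error counts, for illustration only.
daily_errors = [12, 15, 11, 14, 13, 40, 12, 16]
outlier_days = flag_outliers(daily_errors)
cleaned = [v for i, v in enumerate(daily_errors) if i not in outlier_days]
print("flagged day indices:", outlier_days)
print("series without flagged days:", cleaned)
```

Note that in a time series, dropping a day leaves a gap, so in practice people often replace a flagged value (for example with a smoothed estimate) instead of removing the row.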
@noetsi Thanks for the excellent recommendation. My initial assessment is giving me better results; I will post my findings after the implementation.