Break_down_percentage = 7%, 8%, 10%, 6%, 12%, etc.

There are 315 data points available for testing the different models. I used ets(), arima(), and nnetar() in a rolling-window prediction to identify the best model. I also wanted to determine a good rolling-window size for predicting the next day's value, so I ran all the models over a range of window sizes to estimate both a good model and how much past data to use for predicting the future value.
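For concreteness, the rolling-window procedure above can be sketched as follows. This is a Python illustration, not the author's R code: the `ar1_forecast` function is a toy stand-in for ets()/arima()/nnetar(), and the simulated series is hypothetical.

```python
import numpy as np

def mape(actual, pred):
    """Mean absolute percentage error, in percent."""
    actual, pred = np.asarray(actual, float), np.asarray(pred, float)
    return 100.0 * np.mean(np.abs((actual - pred) / actual))

def rolling_one_step(series, window, fit_predict):
    """Slide a fixed-size window over the series; at each step, fit on the
    window and predict the next value. Returns (actuals, predictions)."""
    actuals, preds = [], []
    for t in range(window, len(series)):
        train = series[t - window:t]
        preds.append(fit_predict(train))
        actuals.append(series[t])
    return np.array(actuals), np.array(preds)

def ar1_forecast(train):
    """Toy stand-in model: AR(1) fitted by least squares. In the actual
    setup this role is played by ets(), arima(), or nnetar() in R."""
    x, y = train[:-1], train[1:]
    denom = np.dot(x, x)
    phi = np.dot(x, y) / denom if denom > 0 else 0.0
    return phi * train[-1]

# Hypothetical breakdown-percentage series of length 315 (clipped away from
# zero so that MAPE is well defined).
rng = np.random.default_rng(0)
series = np.clip(8 + rng.normal(0, 2, 315), 1.0, None)

for window in (30, 60, 90):
    a, p = rolling_one_step(series, window, ar1_forecast)
    print(window, round(mape(a, p), 2))
```

Each candidate model would be plugged in as `fit_predict`, and the window size giving the lowest out-of-sample MAPE compared across models.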

I plotted the MAPE values against the rolling-window size, as shown in the graph. From the graph it appears that the ideal window uses all of the data points, but that is not practical given the system dynamics: very old data may not be useful for predicting the future value, since a lot may have changed in the machine since then. The expert view in manufacturing is that the last two to three months of values (~60 to 90 data points) will be useful. Also, a window of n-1 data points leaves only 1 data point to test the model, and that could also explain the low MAPE at the largest window size.
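The last point can be made concrete with a small simulation: with a window of size w, only n - w one-step forecasts remain for testing, and a MAPE averaged over very few points is a noisy estimate. The per-step error distribution below is purely hypothetical (its true mean MAPE is 10%).

```python
import numpy as np

n = 315
rng = np.random.default_rng(1)
# 1000 simulated runs of hypothetical per-step absolute percentage
# errors, each drawn with a true mean of 10%.
errors = rng.exponential(10.0, size=(1000, n))

for w in (60, 90, 314):           # window sizes; w = 314 leaves 1 test point
    m = n - w                     # number of one-step test forecasts
    mapes = errors[:, :m].mean(axis=1)
    print(f"window={w:3d}  test points={m:3d}  "
          f"MAPE estimate spread (std)={mapes.std():.2f}")
```

The spread of the MAPE estimate grows roughly as 1/sqrt(n - w), so the single-test-point MAPE at the largest window says little about true performance.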

So my question is,

The ARIMA model is found to be better than ETS for windows between 60 and 90 data points. The lowest MAPE in that range is 8.8%, at a window size of 74. Is this too high a value for a rolling-window prediction, or would it be advisable to use more data (a larger window) for future predictions?

Are there any other models that would be interesting to try out in addition to the above models?

Note: I am a self-learner on this topic and haven't attended any formal classes on data science or forecasting, so apologies if this is obvious.