Arima dummy regression arimax

#1
Dear All,

Below is t-shirt sales data for 2011 and 2012, weeks 14 to 40. DUMMYDAD marks Father's Day week, and DUMMYCELEB marks the celebration week, which moves back 1 week every year (these weeks are outliers compared with the other weeks). The frequency is 27 weeks per year, and 2012 is not complete yet. Below is all my code, with my questions in the comments.
Thank you for any input.
Code:
ff=read.table(header=TRUE,sep = "," ,text="
STORE, MERCHYEAR, MERCHMONTH, MERCHWEEK, TYSALESUNIT, TYAVGSTOCKUNIT, TYAVGSALESUNITPRICE, DUMMYDAD, DUMMYCELEB, DUMMYYEAR
3153,2011,4,14,105,425.14,49.95,0,0,0
3153,2011,4,15,116,876.57,48.72,0,0,0
3153,2011,4,16,215,1249.14,47.24,0,0,0
3153,2011,4,17,155,1160,49.45,0,0,0
3153,2011,4,18,175,1285.29,47.21,0,0,0
3153,2011,5,19,176,1554.14,49.35,0,0,0
3153,2011,5,20,221,1620.29,47.08,0,0,0
3153,2011,5,21,313,1565.43,49.78,0,0,0
3153,2011,5,22,255,1764.57,49.67,0,0,0
3153,2011,6,23,316,2068.29,51.39,0,0,0
3153,2011,6,24,415,2348.29,48.28,0,0,0
3153,2011,6,25,612,2593.29,47.02,1,0,0
3153,2011,6,26,385,2927.57,49.56,0,0,0
3153,2011,7,27,523,2917,42.87,0,0,0
3153,2011,7,28,526,2387.71,40.27,0,0,0
3153,2011,7,29,496,2350,42.04,0,0,0
3153,2011,7,30,464,2427.71,40.53,0,0,0
3153,2011,7,31,471,2353.14,41.78,0,0,0
3153,2011,8,32,354,2086,41.23,0,0,0
3153,2011,8,33,343,1752.86,41.63,0,0,0
3153,2011,8,34,417,1413,36.58,0,0,0
3153,2011,8,35,452,1172,36.45,0,1,0
3153,2011,9,36,305,830.86,37,0,1,0
3153,2011,9,37,227,734.57,38.66,0,0,0
3153,2011,9,38,197,560.43,37.39,0,0,0
3153,2011,9,39,147,478.71,37.13,0,0,0
3153,2011,10,40,94,463.14,37.19,0,0,0
3153,2012,4,14,58,621,55,0,0,1
3153,2012,4,15,133,735.29,52.67,0,0,1
3153,2012,4,16,160,990.57,51.72,0,0,1
3153,2012,4,17,210,1379,50.79,0,0,1
3153,2012,4,18,268,1590.57,50.35,0,0,1
3153,2012,5,19,322,1819.43,49.84,0,0,1
3153,2012,5,20,300,1874.86,50.46,0,0,1
3153,2012,5,21,301,1888.57,50.65,0,0,1
3153,2012,5,22,278,2129.14,54.01,0,0,1
3153,2012,6,23,307,2358.29,51.52,0,0,1
3153,2012,6,24,405,2400.29,51.99,0,0,1
3153,2012,6,25,734,2443.29,49.27,1,0,1
3153,2012,6,26,389,2661.86,50.08,0,0,1
3153,2012,7,27,483,2767.71,48.42,0,0,1
3153,2012,7,28,457,2610.71,47.14,0,0,1
3153,2012,7,29,573,2414.71,43.89,0,0,1
3153,2012,7,30,544,2304.14,43.16,0,0,1
3153,2012,7,31,511,2191,42.56,0,0,1
3153,2012,8,32,574,2012.86,41.08,0,0,1
3153,2012,8,33,598,1833,42.31,0,1,1
3153,2012,8,34,716,1701.43,40.98,0,1,1
3153,2012,8,35,397,1606.86,40.05,0,0,1
3153,2012,9,36,263,1593.57,41.07,0,0,1")

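library(forecast) # provides Arima(), forecast() and accuracy()
dfmagts=ts(ff[,5],freq=27,start=2011)
# Hold out the last 4 weeks for validating the model. With 27 periods per year,
# observation 46 is c(2012,19) and observation 47 is c(2012,20).
dftrain=window(dfmagts,end=c(2012,19))
dftest=window(dfmagts,start=c(2012,20))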

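xreg1=data.frame(ff,model.matrix(~as.factor(ff$MERCHMONTH))[,2:7]) # I also added dummies for month
# Recent versions of forecast expect xreg to be a numeric matrix, not a data frame
xregmodel=as.matrix(xreg1[-(47:50),6:16])
xregforecast=as.matrix(xreg1[47:50,6:16])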
arimas=Arima(dftrain,order=c(1,0,0),seasonal=list(order=c(1,0,0)),xreg=xregmodel,lambda=0)
#Question 1: Is it pointless to include so many dummies (month, week, year)? What does the ARIMA part capture here?
summary(arimas)
#Question 2: How can I tell which variables are significant? I can't get any p-values or R-squared.
#Below are my forecast values
auto=forecast(arimas,h=4,xreg=xregforecast)
accuracy(auto,dftest)
#         Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
#2012.704       549.4364 457.6020 659.7007 415.3752 726.7656
#2012.741       569.9222 474.4117 684.6611 430.5126 754.4756
#2012.778       447.6373 372.6188 537.7590 338.1384 592.5949
#2012.815       333.1392 277.3092 400.2093 251.6483 441.0191
#Question 3: As you can see, the time labels in the first column (2012.704, ...) do not make sense as weeks. How can I display them like this instead:
#Point
#2012.20
#2012.21
#2012.22
#2012.23
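For Question 3, one possible approach is to build the labels yourself from the time index of the forecast. A minimal sketch, assuming 27 periods per year with period 1 corresponding to merch week 14 (as in the data above); `label_weeks` is a hypothetical helper name, not a function from the forecast package:

```r
# Hypothetical helper: convert ts time stamps (year + (period - 1)/freq)
# into "year.week" labels. Assumes period 1 of each year is `first_week`.
label_weeks <- function(tt, freq = 27, first_week = 14) {
  yr     <- floor(tt + 1e-8)               # small guard against floating-point error
  period <- round((tt - yr) * freq) + 1    # 1-based position within the year
  sprintf("%d.%02d", yr, period + first_week - 1)
}

# Time stamps of the four held-out weeks (periods 20 to 23 of 2012)
tt <- 2012 + (19:22) / 27
label_weeks(tt)                   # labels by merch week
label_weeks(tt, first_week = 1)   # labels by period number within the year
```

With the forecast object above, something like `data.frame(week = label_weeks(as.numeric(time(auto$mean))), forecast = as.numeric(auto$mean))` should give a table keyed by week rather than by fractional year.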
#Are there any suggestions to improve this model? The second forecast (569.9222) misses the actual value of 716 by about 146 units.
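On Question 2, the summary of an ARIMA fit reports coefficients and standard errors but no p-values. One common approximation is a Wald z-test: divide each coefficient by its standard error and compare the ratio against a standard normal distribution. The sketch below uses base R's `arima()` on simulated data (the series `y` and the regressor `x` are made up purely for illustration); the same `coef()` and `vcov()` calls are expected to work on the `arimas` object, but that is an assumption worth checking on your own setup:

```r
set.seed(1)
n <- 100
x <- rnorm(n)                                   # made-up regressor
y <- 2 * x + arima.sim(list(ar = 0.5), n = n)   # regression with AR(1) errors

fit <- arima(y, order = c(1, 0, 0), xreg = cbind(x = x))

est <- coef(fit)                 # ar1, intercept and regression coefficient
se  <- sqrt(diag(vcov(fit)))     # standard errors from the estimated covariance
z   <- est / se                  # Wald z-statistics
p   <- 2 * pnorm(-abs(z))        # two-sided p-values under a normal approximation

round(cbind(estimate = est, std.error = se, z = z, p.value = p), 4)
```

These p-values are asymptotic approximations, so with only 46 training observations and 11 regressors they are best read as a rough guide rather than an exact test.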
 