
Thread: time series - ARIMA + ARIMAX + R

  1. #1
    Hello,
    I am not a statistician; I come from machine learning and computer science. Recently I have read quite a bit about ARIMA and ARIMAX and how to use them in R. I would be grateful if you could check my workflow and point out any mistakes. Thanks in advance!
    1. Pre-processing
    The two time series (A, B) were not stationary => applied log (to A only) and then diff twice to make them stationary
    MydataLog <- data.frame(A = log(Mydata$A), B = Mydata$B)  # log applied to A only, not to B
    MydataDiff1 <- data.frame(A = diff(MydataLog$A), B = diff(MydataLog$B))
    MydataDiff2 <- data.frame(A = diff(MydataDiff1$A), B = diff(MydataDiff1$B))
    - Test for stationarity (adf.test() is from the tseries package)
    adf.test(MydataDiff2$A, alternative="stationary")
    adf.test(MydataDiff2$B, alternative="stationary")
    (p-value = 0.01, smaller than 0.05 => stationary after log + diff + diff)
    data: MydataDiff2$A
    Dickey-Fuller = -6.7353, Lag order = 3, p-value = 0.01
    alternative hypothesis: stationary

    2. Cross-correlation function
    # fit some ARIMA model to B that leaves white-noise residuals, e.g. ARIMA(1,0,2)
    model1 <- Arima(MydataDiff2$B, order=c(1, 0, 2))
    residuals1 <- residuals(model1)
    Box.test(residuals1, type='Ljung', lag=log(length(residuals1)))
    # apply the same fitted model to A and keep its residuals
    yfiltered <- residuals(Arima(MydataDiff2$A, model=model1))
    Box.test(yfiltered, type='Ljung', lag=log(length(yfiltered)))
    cc <- ccf(residuals1, yfiltered)
    Is this the correct way to perform prewhitening and cross-correlation between A and B?
    My ccf shows a significant correlation only at lag 0 => Y(t) depends only on X(t) and not on X(t-1), etc.
    What is the 5% significance bound? Some say a correlation is significant if it is greater than 2/sqrt(N), others say greater than 2/sqrt(N)sqrt(N-k)
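For comparison, here is a minimal sketch of the textbook prewhitening recipe in base R: fit a model to the input x, apply that same filter to both x and y, then look at the CCF of the two filtered series. The data are simulated (not the series from this thread), and the AR(1) model and all variable names are assumptions made for illustration. The sketch also computes the two 5% bounds usually quoted: 1.96/sqrt(N), and a lag-adjusted variant 1.96/sqrt(N - |k|), which may or may not be the second formula meant above.

```r
set.seed(1)
n <- 300
x <- as.numeric(arima.sim(list(ar = 0.6), n = n))  # autocorrelated input (simulated)
y <- 2 * x + rnorm(n)                              # y depends on x at lag 0 only

# Fit a model to the input; AR(1) is an assumption for this illustration
fit_x <- arima(x, order = c(1, 0, 0), include.mean = FALSE)
phi <- unname(coef(fit_x)["ar1"])

# Apply the same filter (1 - phi*B) to both series; drop the leading NA
x_pw <- stats::filter(x, c(1, -phi), sides = 1)[-1]
y_pw <- stats::filter(y, c(1, -phi), sides = 1)[-1]

cc <- ccf(x_pw, y_pw, lag.max = 10, plot = FALSE)
lags <- as.vector(cc$lag)
r <- as.vector(cc$acf)

N <- length(x_pw)
bound_simple <- 1.96 / sqrt(N)            # approximate 5% bound, constant in the lag
bound_lag <- 1.96 / sqrt(N - abs(lags))   # lag-adjusted variant, slightly wider

significant <- lags[abs(r) > bound_simple]
print(significant)
```

With about 20 lags checked at the 5% level, an occasional spurious exceedance is expected by chance, so a single isolated spike away from lag 0 is not strong evidence of a lagged effect. For moderate N and small lags the two bounds are nearly identical.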

    3. ARIMAX: because I saw only one significant correlation, at lag 0, am I applying dynamic regression of the following type?
    Y(t) = b0 + b*X(t) + n(t)
    n(t) = a1*n(t-1) + … + ap*n(t-p) + e(t) - Q1*e(t-1) - … - Qq*e(t-q)
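To make the model concrete, here is a small sketch that simulates a regression with ARMA(1,1) errors and recovers the coefficients with base R's arima(). The true values (b0 = 1, b = 2, AR 0.5, MA 0.3), the sample size, and the variable names are all assumptions chosen for illustration; nothing here uses the data from this thread.

```r
set.seed(42)
n <- 500
x <- rnorm(n)                                                    # regressor X(t)
noise <- as.numeric(arima.sim(list(ar = 0.5, ma = 0.3), n = n))  # n(t), ARMA(1,1)
y <- 1 + 2 * x + noise                                           # b0 = 1, b = 2

# Regression on x with ARMA(1,1) errors; the intercept plays the role of b0
fit <- arima(y, order = c(1, 0, 1), xreg = cbind(x = x))
print(coef(fit))  # ar1, ma1, intercept, x

# Note: R's arima() writes the MA part with plus signs, so its "ma" coefficients
# have the opposite sign to Q1..Qq in the equation above.
```

The coefficient labelled x estimates b, and the ar/ma coefficients describe n(t) rather than Y(t) itself.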

    Start with some model Arima(1, 0, 2), X = A, Y = B
    model<-Arima(MydataDiff2$B, xreg = MydataDiff2$A, order=c(1, 0, 2))
    res2<-residuals(model)
    Box.test(res2, type='Ljung',lag=log(length(res2)))

    #result was white noise

    #correct the model:
    ny<-arima.errors(model)
    tsdisplay(ny, main = "ARIMA errors")

    # candidate model for ny, for example AR(3) with MA(2)
    model1<-Arima(ny, xreg = MydataDiff2$A, order=c(3, 0, 2))
    ny1<-arima.errors(model1)

    # test e(t) for white noise
    Box.test(residuals(model1), lag=10, type = "Ljung")
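One detail on the Ljung-Box checks: when the test is applied to residuals of a fitted ARMA(p, q) model, Box.test() has a fitdf argument that subtracts the number of fitted coefficients from the degrees of freedom, so the lag has to be larger than p + q. A minimal sketch on simulated data (the ARMA(1,1) model and the lag of 10 are assumptions for illustration):

```r
set.seed(7)
y <- arima.sim(list(ar = 0.5, ma = 0.4), n = 400)
fit <- arima(y, order = c(1, 0, 1))

p_plus_q <- 2  # one AR and one MA coefficient were fitted
lb <- Box.test(residuals(fit), lag = 10, type = "Ljung-Box", fitdf = p_plus_q)
print(lb)      # degrees of freedom are lag - fitdf = 8

# log(n) is not an integer and grows slowly with the series length
log_n <- log(length(y))
print(log_n)
```

With lag = log(n) and a short series, the lag can end up close to or below p + q, in which case the adjusted test has no usable degrees of freedom; a fixed lag such as 10 avoids that.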

  2. #2
    noetsi

    Re: time series - ARIMA + ARIMAX + R


    I don't know R, but I will make some comments. If you are using ARIMAX, then you pre-whiten both series first (create an ARIMA(p, d, q) model), which is what I assume you did. ADF has serious power issues, so you should test for stationarity not just with it but also with one of the tests that has the opposite null (stationarity as the null hypothesis). If both tests point to a trend (non-stationarity), then you can be more confident in your results. Logging does not deal with non-stationarity as far as I know; it deals with variance issues.
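For readers who do use R, a rough sketch of running two tests with opposite nulls side by side. It assumes the tseries package is installed (adf.test() and kpss.test() both come from it) and uses simulated series, not the data from this thread. As far as I know, tseries reports p-values only within the range 0.01 to 0.1, so a printed 0.01 means "0.01 or smaller".

```r
set.seed(3)
rw <- cumsum(rnorm(300))  # random walk: non-stationary by construction
wn <- rnorm(300)          # white noise: stationary by construction

if (requireNamespace("tseries", quietly = TRUE)) {
  # ADF: the null hypothesis is a unit root (non-stationarity)
  adf_rw <- suppressWarnings(tseries::adf.test(rw))
  adf_wn <- suppressWarnings(tseries::adf.test(wn))
  # KPSS: the null hypothesis is (level) stationarity
  kpss_rw <- suppressWarnings(tseries::kpss.test(rw, null = "Level"))
  kpss_wn <- suppressWarnings(tseries::kpss.test(wn, null = "Level"))

  print(c(adf_rw = adf_rw$p.value, kpss_rw = kpss_rw$p.value))
  print(c(adf_wn = adf_wn$p.value, kpss_wn = kpss_wn$p.value))
}

# A log only rescales the series; the trend is removed by differencing
trend <- exp(0.02 * (1:300))
d <- diff(log(trend))
print(range(d))  # constant slope of the logged series
```

A series looks safely stationary when ADF rejects its null and KPSS does not; when the two disagree, more differencing or a closer look at the plot is usually warranted.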

    I am not sure what you mean by dynamic regression. Different authors use that for different things.
    "Very few theories have been abandoned because they were found to be invalid on the basis of empirical evidence...." Spanos, 1995
