Multiple Time Series Regression? Time Series with multiple independent variables

#1
I've asked a lot about time series in general but perhaps I should be focusing my questions about this:

is it possible to do a time series analysis with more than 1 explanatory variable?

I know that, in the simplest terms (using the word simple very loosely), time series analysis involves modeling Y = time, where time is essentially the only information used to try to predict Y (based on trends, seasonality, and so on).

However, what if I am interested in using time AND population to predict cost?

If I were trying to do some type of normal regression I would have:

model cost = time population


Is there some way to do something like this in time series? I am ultimately curious about forecasting costs based not only on future months but also on future population projections. I have not come across anything in SAS or JMP that seems to account for variables other than time.
 

noetsi

#2
It is possible. There are several methods to do this. They differ in the assumptions they make, and they are complex even compared to regular time series. I am just starting to work on this and don't know the names of specific methods to use (I have seen them, but that is all).
 
#3
Vector autoregression (VAR) is one method for multivariate time series analysis. There are multiple packages in R devoted to this, and there is PROC VARMAX in SAS. As noetsi said, it is much more complex than univariate time series. You won't be able to jump in without reading up on the assumptions and on the tests you need to perform before and after estimation. In many cases, neither variable will be clearly exogenous. Granger causality tests can give you a peek into causality, but generally there is a lot of give and take between the variables. This means that it may be best, in your example, to model both variables as a function of their own past values as well as the past and current values of the other variable.
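
As a rough sketch of what that looks like for the cost/population example, a VAR with a single lag could be written as below. The symbols (C for cost, P for population, the a and b coefficients, the error terms u and v) and the choice of one lag are assumptions for illustration only. This is the reduced form, which uses past values only; letting current values of the other variable enter as well leads to a structural VAR.

[math]C_t = a_0 + a_1 C_{t-1} + a_2 P_{t-1} + u_t[/math]

[math]P_t = b_0 + b_1 C_{t-1} + b_2 P_{t-1} + v_t[/math]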
 
#4
You can do this with Granger causality analysis / VARs. If you only suspect one-directional causality, you can estimate:

[math]Y_t = constant + \sum_{i=1}^{L} \alpha_i Y_{t-i} + \sum_{i=1}^{L} \beta_i X_{t-i} + \epsilon_{t}[/math].

Here Y_{t-i} and X_{t-i} are the i-th lags of Y_t and X_t (the lag operator applied i times), and L is the lag length. You select L by minimizing some information criterion, e.g. AIC or BIC. BIC penalizes additional lags more heavily, so it tends to pick a shorter lag length.
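
For reference, one common least-squares form of the two criteria is sketched below. The symbols n (number of usable observations), k (number of estimated coefficients, here 2L + 1) and RSS_L (residual sum of squares of the regression with L lags) are introduced here for illustration. Software packages may differ by additive constants, which does not change the ranking as long as every candidate lag length is fitted on the same sample.

[math]AIC(L) = n \ln(RSS_L / n) + 2k[/math]

[math]BIC(L) = n \ln(RSS_L / n) + k \ln(n)[/math]

Since \ln(n) > 2 once n \geq 8, BIC charges more than AIC for each extra coefficient.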

To test whether process X_t "Granger causes" Y_t, do a joint F-test of the null hypothesis [math]\beta_i = 0 \forall i \in \{1,...,L\}[/math]. If you reject the null, then X_t is useful for forecasting Y_t (X_t "Granger causes" Y_t).
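
The regression and the joint F-test can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the data are simulated (the coefficients 0.5 and 0.8 are made up), the lag length is fixed at 2 rather than chosen by AIC/BIC, and the helper names (`solve_linear_system`, `ols`, `rss`, `granger_f`) are hypothetical, not from any package.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def rss(rows, y):
    """Residual sum of squares of the least-squares fit."""
    beta = ols(rows, y)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def granger_f(target, other, lags):
    """F statistic for H0: all coefficients on the lags of `other` are zero."""
    restricted, unrestricted, response = [], [], []
    for t in range(lags, len(target)):
        own = [target[t - i] for i in range(1, lags + 1)]
        cross = [other[t - i] for i in range(1, lags + 1)]
        restricted.append([1.0] + own)            # lags of Y only
        unrestricted.append([1.0] + own + cross)  # lags of Y and X
        response.append(target[t])
    rss_r = rss(restricted, response)
    rss_u = rss(unrestricted, response)
    dof = len(response) - (1 + 2 * lags)  # observations minus coefficients
    f_stat = ((rss_r - rss_u) / lags) / (rss_u / dof)
    return f_stat, rss_r, rss_u


# Simulated data (an assumption for illustration): X drives Y with one lag.
random.seed(0)
n = 300
x = [random.gauss(0, 1) for _ in range(n)]
y = [0.0]
for t in range(1, n):
    y.append(0.5 * y[t - 1] + 0.8 * x[t - 1] + random.gauss(0, 1))

f_xy, rss_restricted, rss_unrestricted = granger_f(y, x, lags=2)
print("F statistic for 'X Granger causes Y':", round(f_xy, 2))
```

You would compare the statistic with an F critical value with (L, n - 2L - 1) degrees of freedom; for 2 and roughly 290 degrees of freedom the 5% critical value is about 3.0.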

The next step would be to implement Granger causality over quantiles. However, this will depend on the nature of Y_t and on whether you believe that the conditional quantiles of Y_t are relevant to the causal relationship between the two processes (although I argue below that you do not even need to suspect that this is the case).

Here is what Granger causality is actually doing. It is a pseudo-test of [math]F(Y_t|Y_t(lags),X_t(lags)) = F(Y_t|Y_t(lags))[/math], where F is the conditional cumulative distribution function. It only checks whether the first moment (the conditional mean) is the same when you condition on both Y_t(lags) and X_t(lags) as when you condition on Y_t(lags) alone, because the OLS-based Granger causality specification is just the conditional mean of Y_t given the lags. This is a crude test. It is better to test whether the conditional quantile function [math]Q_{\tau}(Y_t|Y_t(lags), X_t(lags))[/math] equals the same function without the lags of X_t, which gives you a more complete picture of whether [math]F(Y_t|Y_t(lags),X_t(lags)) = F(Y_t|Y_t(lags))[/math].
You can test this with quantile-based Granger causality, which uses a sup-Wald test over the quantiles in [0,1] (an actual interval of the reals). I am still trying to figure out how to implement this in R. Can anyone lend a hand?
 
#5
is it possible to do a time series analysis with more than 1 explanatory variable?

I think the answer is quite a bit easier than jumping into VAR modelling. I assume you already have future projections of population, right? If not, we'll come to that at the end of this post.

You can add extra explanatory variables to a (s)ARIMA model, so that it becomes a (s)ARIMAX model.

This is called a transfer function model, and it is easy to fit in SPSS: simply place your extra explanatory variables in the 'covariates' section.
Now, if you don't know the future population, just fit a univariate model to population (using only time), and use its predictions as the values in your covariates section.
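
To make the covariate idea concrete, here is a minimal sketch in plain Python. It is deliberately simplified: it fits cost on time and population by ordinary least squares and leaves out the ARIMA error structure that a real (s)ARIMAX model adds, so treat it as an illustration of the workflow (fit on history, forecast with projected population), not as an ARIMAX implementation. All numbers are made up and noise-free, and the helper names are hypothetical.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical history: 36 months of population (in thousands) and cost.
months = list(range(1, 37))
population = [50 + 0.3 * t + 5 * math.sin(t / 3) for t in months]
# Cost is built without noise so the fitted coefficients are easy to check.
cost = [100 + 2.0 * t + 0.5 * p for t, p in zip(months, population)]

# Fit cost = intercept + b_time * month + b_pop * population.
design = [[1.0, t, p] for t, p in zip(months, population)]
intercept, b_time, b_pop = ols(design, cost)

# Forecast the next three months using (made-up) population projections.
future_months = [37, 38, 39]
projected_population = [62.0, 62.5, 63.0]
forecast = [
    intercept + b_time * t + b_pop * p
    for t, p in zip(future_months, projected_population)
]
print("coefficients:", round(intercept, 3), round(b_time, 3), round(b_pop, 3))
print("forecast:", [round(f, 2) for f in forecast])
```

If you only have a univariate forecast of population, the same code applies with those predictions in place of `projected_population`.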

If you have more questions, I'd be happy to help.


Regarding VARs: I have never fitted one, but you use VAR modelling when your dependent variable is a function of your independent variable and vice versa (that is, the values of your independent variable also depend on the values of your dependent variable). I don't think that's the situation here.
 
#6
Yes, I have future population projections. So I want to use time and future population projections (2 variables) to estimate cost.

Is there any type of R function which will allow me to do a time series regression with more than 1 independent variable?