Hi,
I know I can generate two correlated random variables x1 an x2 using
x1= z1
x2= c* z1 + sqrt(1- c^2)* z2
Where
z1: standardnormal random variable
z2: standardnormal random variable
c= Correlation(x1, x2)
But how can I generate an autocorrelated variable x, autocorrelated on x(t-1) ?
Prediction is very difficult, especially about the future. (Niels Bohr)
bryangoodrich (03-10-2015), kiton (03-05-2015)
BGM, did you recognize, that my example above delivers a defined correlation c, where c= Correlation(x1, x2) ?
So, the process to build an autocorrelated random variable with a defined autocorrelation should include an autocorrelation coefficient, right?
Prediction is very difficult, especially about the future. (Niels Bohr)
consuli, did you recognize, that in your original post you don't say that the defined correlation needs to be an explicit parameter. In this case it turns out that the autocorrelation for lag 1 (you never specified if you only want to control the lag 1 autocorrelation or more/less than that) is a parameter in the problem but it was just given a different symbol. Do some reading on simple AR(1) or AR(p) processes. If you want a different type of process that can generate autocorrelation you could look at MA(q) processes or the combination of the two ARMA(p,q). You can get much more complex than all of this. But for your simple request BGMs proposal is probably the easiest way to go about it. It wouldn't be a bad exercise for you to take a little bit of time and derive the autocorrelation for the AR(1) process (it's a pretty easy task).
I don't have emotions and sometimes that makes me very sad.
Yes, you are absolutely right. I could have formulated this more precise.
For the moment I am satisfied with controlling the lag 1 autocorrelation. But I guess, I will ask for controlling lag t correlation in future, where t € {1, 2, 3, ..., 9, 10}, when I am proceeding with my simulation of value at risk. So if you have a solution for controlling autocorrelation of a standard normal variable from lag 1 up to lag 10, this would be very welcome.
Prediction is very difficult, especially about the future. (Niels Bohr)
I did mention AR(p), MA(q), and ARMA(p,q) processes. These are the "classical" ways to model time series but also give a clean data generation process.
You could also just use the multivariate normal distribution and give it the covariance matrix you want. Note that you do need to keep that matrix positive definite. It could be tricky to guarantee that criteria if you get too crazy with trying to specify weird lag-1, lag-2, ..., lag-10 correlations.
I don't have emotions and sometimes that makes me very sad.
so i tried to take a stab at it and i think (hope?) something like this would work. now, just please keep something in mind: i am not an expert in time-series, auto-correlation anything. i'm just kind of working at this from my very basic understanding of how the durbin-watson test for auto-correlated residuals work in regression so that's the framework i'm using to generate data (i.e. auto-correlated residuals).
so the first step is to generate the residulas here. for this i will sample from a bivariate normal distribution and i'm guessing it would be a auto-correlated process with a lag of 1 and the auto-correlation would be 0.5. the key is basically to intercalate the two columns of correlated random vectors so you end up with only 1 vector with double the length you initially specified.
so something like this:
so those are my auto-correlated 'e'. now i only have to build a simple linear regression model:Code:library(MASS) mu <- c(0,0) S <- matrix(c(1,.5,.5,1),2,2) N <- 500 datum <- mvrnorm(N,mu,S) e <- c(rbind(datum[,1], datum[,2]))
and if i do a durbin-watson test using the 'car' packageCode:x <- 1*rnorm(1000) y <- 1 + x +e mod1 <- lm(y~x)
i get a significant p-value which i *think* implies you have autocorrelated residuals... right? am i on the right track here?Code:mod1 <- lm(y~x) durbinWatsonTest(mod1) > durbinWatsonTest(mod1) lag Autocorrelation D-W Statistic p-value 1 0.2655003 1.467445 0 Alternative hypothesis: rho != 0
i think the issue is just to make sure you can change between-column dependencies to become between-row dependencies...right? wrong? maybe?
for all your psychometric needs! https://psychometroscar.wordpress.com/about/
Some of the elements in your vector are correlated but not all. Plus you aren't specifying the autocorrelation if you do that.
for me gaveCode:j <- acf(e) j
So you got a lag-1 autocorrelation of about .198. That's not what you wanted I'm guessing. You did a weird thing by generating independent sets of dependent variables and then just combining them. You essentially make it so that if you go through the vector you alternate between correlated and independent observations.Code:0 1 2 3 4 5 6 7 8 9 10 11 1.000 0.198 -0.014 0.014 -0.031 -0.005 -0.046 -0.055 -0.015 0.030 -0.005 -0.049 12 13 14 15 16 17 18 19 20 21 22 23 -0.029 -0.020 -0.067 0.000 0.014 -0.029 0.013 -0.013 0.014 0.024 -0.047 0.007 24 25 26 27 28 29 30 -0.006 0.042 0.031 0.008 0.012 0.032 0.066
I don't have emotions and sometimes that makes me very sad.
spunky (03-05-2015)
thank you! i guess my limited understanding of how auto-correlation worked was that there would be some sort of row-to-row of data correlation. now i see that this is more complicated than just generating data and intercalating it.... although i guess even that procedure does give me some sort of auto-correlation because you did get that 0.198 number?
weird... :/
for all your psychometric needs! https://psychometroscar.wordpress.com/about/
You could also just use the multivariate normal distribution and give it the covariance matrix you want.
Dason, did that mean the following?
Code:n=10000 x= matrix(ncol= 5, nrow= n) x1= rnorm(n) x[ , 1]= x1 x[ , 2]= c(0, x1[ 1:(n-1)]) x[ , 3]= c(rep(0, 2), x1[1:(n-2)]) x[ , 4]= c(rep(0, 3), x1[1:(n-3)]) x[ , 5]= c(rep(0, 4), x1[1:(n-4)]) head(x) covm= as.matrix(read.table( text= " 1.00 0.50 0.35 0.15 0.10 0.50 1.00 0.25 0.12 0.05 0.35 0.25 1.00 0.08 0.03 0.15 0.12 0.08 1.00 0.01 0.10 0.05 0.03 0.01 1.00 ", stringsAsFactors= F )) x= x %*% chol(covm) cor(x) acf(x[ , 1], plot= F)
Does not work!
> acf(x[ , 1], plot= F)
Autocorrelations of series ‘x[, 1]’, by lag
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
1.000 0.000 0.003 0.003 -0.024 0.008 0.005 -0.013 -0.005 -0.017 0.004 -0.006 -0.017 0.009 -0.006 0.002
16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
0.010 0.013 -0.015 0.015 0.005 0.001 -0.013 0.001 0.004 -0.008 -0.004 0.021 -0.024 -0.001 -0.005 -0.012
32 33 34 35 36 37 38 39 40
-0.003 -0.003 -0.010 0.008 0.011 0.021 0.008 0.006 0.010
>
Prediction is very difficult, especially about the future. (Niels Bohr)
However, there suprisingly appears some autocorrelation for column 2, 3, 4, 5 !
So, maybe it will work this way, when shifting the columns 2, 3, 4, 5 in the other direction timely forward and specifying the natural correlations between columns 2, 3, 4, 5?Code:n=10000 x= matrix(ncol= 5, nrow= n) x1= rnorm(n) x[ , 1]= x1 x[ , 2]= c(0, x1[ 1:(n-1)]) x[ , 3]= c(rep(0, 2), x1[1:(n-2)]) x[ , 4]= c(rep(0, 3), x1[1:(n-3)]) x[ , 5]= c(rep(0, 4), x1[1:(n-4)]) colnames(x)= c("x1", "x2", "x3", "x4", "x5") head(x) covm= as.matrix(read.table( text= " 1.00 0.50 0.35 0.15 0.00 0.50 1.00 0.00 0.00 0.00 0.35 0.00 1.00 0.00 0.00 0.15 0.00 0.00 1.00 0.00 0.00 0.00 0.00 0.00 1.00 ", stringsAsFactors= F )) x= x %*% chol(covm) colnames(x)= c("x1", "x2", "x3", "x4", "x5") head(x) cor(x) acf(x[ , 1], lag.max= 5, plot= F) acf(x[ , 2], lag.max= 5, plot= F) acf(x[ , 3], lag.max= 5, plot= F) acf(x[ , 4], lag.max= 5, plot= F) acf(x[ , 5], lag.max= 5, plot= F)
Prediction is very difficult, especially about the future. (Niels Bohr)
I have researched the internet, how to create a random series with a defined autocorrelation. Found the following
http://www.mapleprimes.com/questions...Autocorrelated
The maple example translated to R generates a defined autocorrelation.
However it is pretty nasty, to simply repeat x(t-1) (which is called x_tm1 in the code).Code:n= 1000 p= 0.5 x_tm0= rnorm(n) x_tm1= c(0, x_tm0[1:(n-1)]) # x_tm1 b= rbinom(n, 1, p) # b_tm1 x1= b* x_tm1+ (1-b)* x_tm0 acf(x1, lag.max= 5, plot= F)
So I thought about mixing x and x(t-1) together.
For somehow reasons, this works only for a weight p of 0.5.Code:x2= p* x_tm1+ (1-p)* x_tm0 acf(x1, lag.max= 5, plot= F)
Prediction is very difficult, especially about the future. (Niels Bohr)
Why are you doing all of this? Name one problem with what BGM posted in post 2: http://www.talkstats.com/showthread....l=1#post171517
It does exactly what you asked for and is probably the easiest to interpret what is going on and easy to code.
I don't have emotions and sometimes that makes me very sad.
I asked my question for the purpose of simulation. The first step in a simulation is to replicate the empirical data. So when the empirical data has an autocorrelation vector of [1, 0.5, 0.3, 0.2, 0.15, 0.10] (lag 0 to lag 5), I have to replicate exactly this autocorrelation vector. As far the AR process does not offer a map from the AR parameters to autocorrelation parameters and vice versa, an AR process does not help me for my problem.
Prediction is very difficult, especially about the future. (Niels Bohr)
An AR(1) process offers a direct map to the lag-1 correlation as I've said many times before. And I'm not sure what you mean by "the first step in a simulation is to replicate the empirical data". The first step in a simulation is to simulate the process. Are you just saying that your empirical autocorrelation might not be exactly equal to the autocorrelation that you specify? Of course not - you wouldn't actually be simulating the process if you didn't get variation...
I don't have emotions and sometimes that makes me very sad.
Tweet |