Re: How to generate multiple correlated variables?

well, the general way is to take the correlation (or covariance) matrix that you'd like to be the population parameter, do some type of decomposition on it (eigen/spectral decomposition, cholesky, etc.) and multiply the resulting factor by the uncorrelated vectors you'd like to have correlated.
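just to show the idea isn't tied to one particular decomposition, here's a rough sketch of the eigen/spectral route. the 4x4 matrix with 0.5 off the diagonal, the seed, and the names A and data_eig are all my own choices for illustration, nothing special about them:

```r
# target correlation matrix (illustrative choice: 1 on the diagonal, 0.5 elsewhere)
R <- matrix(0.5, 4, 4)
diag(R) <- 1

# spectral decomposition R = V D V'; build a factor A = V sqrt(D) so that A A' = R
e <- eigen(R, symmetric = TRUE)
A <- e$vectors %*% diag(sqrt(e$values))

# 1000 rows of 4 independent standard normal variables
set.seed(1)  # arbitrary seed, only for reproducibility
X <- matrix(rnorm(4000), nrow = 1000, ncol = 4)

# each row x becomes x A', whose covariance is A A' = R
data_eig <- X %*% t(A)
round(cor(data_eig), 2)
```

any factor A with A %*% t(A) equal to R does the job; the cholesky factor is just one convenient choice.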

lemme give you an example using the cholesky decomposition. say the correlation matrix R i want in the population is for 4 variables and looks like:

      X1   X2   X3   X4
X1   1.0  0.5  0.5  0.5
X2   0.5  1.0  0.5  0.5
X3   0.5  0.5  1.0  0.5
X4   0.5  0.5  0.5  1.0

so a 4x4 correlation matrix with everything correlated at 0.5 (in the population).

first you generate 4 independent normal variables (standardized, just to make things simpler) and assemble them in a matrix. i used a sample size of N = 1000 for no particular reason.

Code:

X <- cbind(rnorm(1000),rnorm(1000),rnorm(1000),rnorm(1000))
colnames(X) <- c("X1", "X2", "X3", "X4")

then the only thing you have to do is take the cholesky decomposition of the correlation matrix R and post-multiply the data matrix X by the resulting factor:

Code:

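# R: the 4x4 target correlation matrix (1 on the diagonal, 0.5 elsewhere)
R <- matrix(0.5, 4, 4); diag(R) <- 1
cholR <- chol(R)       # upper-triangular factor U with t(U) %*% U equal to R
data1 <- X %*% cholR   # columns of data1 now have population correlation R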

if you calculate the correlation matrix on the new dataset "data1" (e.g. with cor(data1)) you will see that it is close to the population correlation R, within sampling variability.
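putting the steps above together, something like the following should run as-is. the seed and the max_diff name are my own additions for the check, so treat it as a sketch rather than the one right way to do it. the reason it works: chol() in R returns the upper-triangular factor U with t(U) %*% U = R, and since the columns of X are independent with unit variance, the covariance of X %*% U is t(U) %*% U, i.e. R.

```r
set.seed(123)  # arbitrary seed, only so the run is reproducible

# target population correlation matrix: 1 on the diagonal, 0.5 elsewhere
R <- matrix(0.5, 4, 4)
diag(R) <- 1

# 4 independent standard normal variables, N = 1000
X <- cbind(rnorm(1000), rnorm(1000), rnorm(1000), rnorm(1000))
colnames(X) <- c("X1", "X2", "X3", "X4")

# upper-triangular cholesky factor, then post-multiply the data by it
cholR <- chol(R)
data1 <- X %*% cholR
colnames(data1) <- colnames(X)

# sample correlations should sit near 0.5, up to sampling error
round(cor(data1), 2)

# largest absolute gap between sample and population correlations
max_diff <- max(abs(cor(data1) - R))
max_diff
```

with a bigger N the gap shrinks, since the sample correlations converge to R.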

now, if you want everything in just 1 step, just use the 'mvrnorm' function in the MASS package, which generates multivariate normal random variates with a pre-specified population vector of means and a variance-covariance matrix of your choice.
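a minimal sketch of that call, assuming the same 4x4 matrix R as the target and zero means (the seed and the data2 name are just my picks):

```r
library(MASS)  # provides mvrnorm()

# same target as before: 1 on the diagonal, 0.5 elsewhere
R <- matrix(0.5, 4, 4)
diag(R) <- 1

set.seed(123)  # arbitrary seed, only for reproducibility

# n draws from a multivariate normal with mean vector mu and covariance Sigma
data2 <- mvrnorm(n = 1000, mu = rep(0, 4), Sigma = R)
colnames(data2) <- c("X1", "X2", "X3", "X4")

# again close to R, up to sampling error
round(cor(data2), 2)
```

if you need the sample correlations to match R exactly instead of just in the population, mvrnorm also takes empirical = TRUE.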