# Thread: Generate Random Correlated Variable

1. ## Re: Generate Random Correlated Variable

Originally Posted by Dragan
See below.

Yes, b1 is the slope coefficient associated with the regression model.

"Sigma" is the standard deviation of the error term (E) in the regression model. You can set it to any positive value you like. Sigma affects the correlation (r) through the denominator of r = b1 / Sqrt[Sigma^2 + b1^2] (with X standardized): the larger Sigma is, the closer r gets to zero.
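To make the role of Sigma concrete, here is a minimal sketch of that relationship. It assumes X has been standardized to unit variance and that E is independent of X; the function name `implied_correlation` is made up for illustration and is not from any library.

```python
import math

def implied_correlation(b1: float, sigma: float) -> float:
    """Correlation between Y = b0 + b1*ZX + E and ZX, where SD(E) = sigma.

    Assumes ZX has unit variance and E is independent of ZX.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return b1 / math.sqrt(sigma**2 + b1**2)

# Holding b1 fixed, a larger sigma pulls the correlation toward zero.
for sigma in (0.5, 1.0, 2.0, 5.0):
    print(sigma, round(implied_correlation(1.0, sigma), 3))
```

So if you have a target correlation in mind, you can pick b1 and then solve for the Sigma that gives it.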

In terms of your first question, I think it would be best if you provided a short, concise example of what it is you're trying to accomplish. That is, some values of Y and X so contributors can get a better "handle" on your problem.

I have 76 observations of X and associated Y. After running an Excel regression, I discover the following:

Correlation coefficient: -0.797
R^2: 0.630
Standard error: 0.9594
Significance F and p-values: 0.0000 (rounded)
Intercept: 5.4013
X coefficient: -0.2205

I want to generate a random Y from a known X, i.e., if X is 5, I want a random Y which is influenced by the regression statistics. For example, since X and Y are negatively correlated, I want a higher probability of a larger random Y as X decreases. Logically, I would expect any formula to include the intercept, X coefficient, actual test X, and possibly the correlation coefficient and standard error.

Using two random variables or using a standardized X seems counter-productive because I want the random Y tied to a particular value of X.

Hope this better explains my question.

2. ## Re: Generate Random Correlated Variable

Originally Posted by cisaak
I have 76 observations of X and associated Y. After running an Excel regression, I discover the following:

Correlation coefficient: -0.797
R^2: 0.630
Standard error: 0.9594
Significance F and p-values: 0.0000 (rounded)
Intercept: 5.4013
X coefficient: -0.2205

I want to generate a random Y from a known X, i.e., if X is 5, I want a random Y which is influenced by the regression statistics. For example, since X and Y are negatively correlated, I want a higher probability of a larger random Y as X decreases. Logically, I would expect any formula to include the intercept, X coefficient, actual test X, and possibly the correlation coefficient and standard error.

Using two random variables or using a standardized X seems counter-productive because I want the random Y tied to a particular value of X.

Hope this better explains my question.

Well, I think you could approach it like this (note that this is a very simplified example):

Let X be 3 fixed levels of income in thousands of dollars: 5, 5, 5, 10, 10, 10, 15, 15, 15

Standardize X (ZX): -1.15, -1.15, -1.15, 0, 0, 0, 1.15, 1.15, 1.15
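For anyone who wants to reproduce the ZX values, here is a small sketch of the standardization step. It assumes the sample standard deviation (n - 1 in the denominator), which is what gives the ±1.15 figures above; the variable names are only illustrative.

```python
import statistics

x = [5, 5, 5, 10, 10, 10, 15, 15, 15]

mean_x = statistics.mean(x)
sd_x = statistics.stdev(x)  # sample SD, i.e. n - 1 in the denominator

# Standardize: subtract the mean, divide by the standard deviation.
zx = [(xi - mean_x) / sd_x for xi in x]
print([round(z, 2) for z in zx])
```

Using the population SD instead (`statistics.pstdev`) would give slightly larger values, so it matters which one you pick.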

Create Y: Y = b0 + b1*ZX + E where E is standard normal (mean of zero and standard deviation of 1)

For example:

Y = 5 + (10)*ZX + E

And so, the correlation between Y and your fixed levels of X will be Corr[Y, X] = 10 / Sqrt[1 + 10^2] = 0.995.
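If you want to check that figure by simulation, something like the following sketch should do. It repeats the three income levels many times so the sample correlation settles near the theoretical value; the seed, the number of repeats, and the helper `pearson` are arbitrary choices for illustration.

```python
import math
import random
import statistics

def pearson(a, b):
    """Plain Pearson correlation between two equal-length sequences."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    sab = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    saa = sum((ai - ma) ** 2 for ai in a)
    sbb = sum((bi - mb) ** 2 for bi in b)
    return sab / math.sqrt(saa * sbb)

rng = random.Random(12345)        # arbitrary seed for repeatability
x = [5.0, 10.0, 15.0] * 10000     # the three fixed levels, repeated

# Standardize X, then build Y = b0 + b1*ZX + E with E ~ N(0, 1).
mx, sx = statistics.fmean(x), statistics.stdev(x)
zx = [(xi - mx) / sx for xi in x]
b0, b1 = 5.0, 10.0
y = [b0 + b1 * z + rng.gauss(0.0, 1.0) for z in zx]

r_sim = pearson(y, x)
r_theory = b1 / math.sqrt(1 + b1**2)
print(round(r_sim, 3), round(r_theory, 3))
```

With only 9 observations, as in the toy example, the sample correlation will bounce around more from run to run.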

Note that Corr[Y, ZX] will also be 0.995, as the Pearson correlation is invariant to linear transformations with a positive slope (standardizing X is one of these).
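Coming back to the regression output quoted at the top of this post, the same idea can be sketched without standardizing X at all. This is only one possible reading, and it rests on a few assumptions: that the values shown in parentheses are negative, that Excel's "Standard error" is the residual standard deviation of the regression, that the errors are roughly normal, and that uncertainty in the estimated intercept and slope is ignored. The function `draw_y` is a made-up name.

```python
import random

INTERCEPT = 5.4013
SLOPE = -0.2205     # assumption: "(0.2205)" in the output means -0.2205
RESID_SD = 0.9594   # assumption: "Standard error" is the residual SD

def draw_y(x, rng):
    """One random Y for a given X: fitted value plus a normal error."""
    return INTERCEPT + SLOPE * x + RESID_SD * rng.gauss(0.0, 1.0)

rng = random.Random(2024)  # arbitrary seed for repeatability
draws = [draw_y(5.0, rng) for _ in range(5)]
print([round(d, 2) for d in draws])
```

Because the slope is negative, draws at a smaller X are centered on a larger Y, which is the behavior asked for. The spread around that center is the same at every X.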

3. ## Re: Generate Random Correlated Variable

Thank you very very much. This line ( Y = r*X + Sqrt[1 - r^2]*E) helped me a ton!