1. ## Need some quick advice on correlation

I have created a sample data set, and I would like to find the right approach to get the kind of result I am looking for. Let me explain:
In column A, half of the entries are 1 (patient took a pill) and half are 0 (patient did not take a pill). For each entry in column B next to a 1 (took a pill), I generated a random number between 90 and 100. For each entry next to a 0 (did not take a pill), I generated a random number between 100 and 110.
This should indicate an extremely high and consistent chance that taking the pill reduces the disease value in column B by about 10. Yet the CORREL function returns .10, which is low.
I would like to know which formula would tell me that there is a close to 100% chance that taking the pill reduces the column B value by about 10. Thanks!

2. Let X and Y be the column B values for patients who did / did not take the pill, respectively. Then X ~ Uniform(90, 100) and Y ~ Uniform(100, 110).
The problem is that if you just compute the probability directly, you get Pr{Y - X > 10} = 1/2, which is surely not what you want.
Computing the correlation of Xi and Yi (the two sequences of i.i.d. samples you generated) will not help either: because the two sequences are generated independently, it should come out close to 0 if your random number generator is good enough.
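
To make the two points above concrete, here is a rough simulation sketch in Python (not Excel). The sample size, the seed and the `pearson` helper are my own choices for illustration, not part of the original data set.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length lists (illustrative helper)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

rng = random.Random(0)  # arbitrary fixed seed, only for reproducibility
n = 100_000             # assumed sample size

x = [rng.uniform(90, 100) for _ in range(n)]   # took the pill
y = [rng.uniform(100, 110) for _ in range(n)]  # did not take the pill

# Fraction of independently drawn pairs whose gap exceeds 10.
prob = sum(yi - xi > 10 for xi, yi in zip(x, y)) / n
# Correlation between the two independently generated sequences.
corr = pearson(x, y)

print(f"Pr(Y - X > 10) is roughly {prob:.3f}")
print(f"corr(X, Y) is roughly {corr:.3f}")
```

With independent draws like these, the first number should land near 1/2 and the second near 0, which is why neither answers the original question.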

I guess you want to compute something like:
1) By the law of large numbers, as n grows,
(Y1 + Y2 + ... + Yn)/n - (X1 + X2 + ... + Xn)/n → 10

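2) You may want to pair the samples, for example by setting Yi|Xi ~ Uniform(Xi+5, Xi+15).
Then Yi and Xi are related by Yi = Xi + 10 + Ui,
where Ui ~ Uniform(-5, 5) can be regarded as uniform random noise. The correlation is then clearly positive (about 0.71, since the noise has the same variance as Xi) and grows toward 1 as the noise range shrinks.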
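
Both suggestions can be checked with a small simulation. This is only a sketch in Python under my own assumptions (sample size, seed, and the `pearson` helper are illustrative); the paired values are drawn directly from Uniform(Xi+5, Xi+15) as in 2).

```python
import random
from math import sqrt
from statistics import mean

def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length lists (illustrative helper)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

rng = random.Random(1)  # arbitrary fixed seed, only for reproducibility
n = 100_000             # assumed sample size

# 1) Independent groups: compare the two sample means.
x = [rng.uniform(90, 100) for _ in range(n)]   # took the pill
y = [rng.uniform(100, 110) for _ in range(n)]  # did not take the pill
mean_diff = mean(y) - mean(x)

# 2) Paired samples: each Yi is drawn around its own Xi.
y_paired = [rng.uniform(xi + 5, xi + 15) for xi in x]
paired_corr = pearson(x, y_paired)
paired_mean_diff = mean(y_paired) - mean(x)

print(f"difference of means (independent): {mean_diff:.3f}")
print(f"difference of means (paired):      {paired_mean_diff:.3f}")
print(f"correlation (paired):              {paired_corr:.3f}")
```

The difference of means should settle close to 10 in both cases. For the paired version I would expect a correlation of roughly 1/sqrt(2), about 0.71, because the spread added around each Xi has the same variance as Xi itself; a narrower range around Xi+10 would push it closer to 1.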
