# Are These Correlation Coefficients Accurate?

#### tfj

##### New Member
Greetings. I am stumbling around in EXCEL trying to figure out whether or not my correlation coefficients are accurate. I have a gut feeling that I'm not doing something right. I have virtually no statistical "common sense," so if you can help me, please explain everything in very easy English.

First of all, I have to explain my "experiment" in some detail, so please bear with me if this explanation gets a little tedious (to save time, please see the attached file). I gave 40 students three tests, the results of which have been recorded in three columns (the data is also listed below). The purpose of these three tests was to determine if students with a particular cognitive style (field dependence or field independence) score better on a certain Cloze Test (a test where every seventh word is deleted). In the first column are numbers which equal the amount of mistakes in millimeters each student made on Test 1 (C. One) (it's a test that is supposed to measure cognitive style). In C. Two are numbers which equal the amount of mistakes in millimeters each student made on Test 2 (it's the same test that measures cognitive style). And in C. Three are the numbers of mistakes each student made on the Cloze Test (a test with every seventh word deleted from a paragraph of text).

The working hypothesis is that students with do better on Test 2 (C. Two), that is, field independent students, should do better on the Cloze Test (C. Three), and field dependent students, who should do better on Test 1 (C. One) should do better on the Cloze Test (C. Three). (NOTE: the highter the numbers in each column, the more the mistakes.)

I sorted the first column and ranked each student based on his or her score, and then I did the same for the second column. Finally, I did correlations for C. One and C. Three (0.130956848), and then C. Two and C. Three (0.005440901). I have a gut feeling that I'm not doing something right. Could you please give me some advice. The data is located below (I hope it does get mangled). I'll attach an image file, too.

C. One Rank C. Two Rank C.Three Rank Student
24 4 21 15 50 37 1
32 11 17 6 25 12 2
33 12 41 30 13 3 3
50 19 48 34 39 30 4
109 35 31 23 35 26 5
51 20 18 9 44 34 6
28 6 51 36 31 22 7
53 23 48 35 56 40 8
38 16 24 17 26 17 9
98 30 24 18 56 39 10
11 1 20 12 19 8 11
35 14 17 7 48 36 12
155 40 34 26 42 31 13
138 39 31 24 12 2 14
44 17 31 22 25 14 15
20 2 46 32 14 4 16
37 15 28 21 15 5 17
29 8 9 1 10 1 18
30 9 34 25 28 18 19
66 29 36 28 42 32 20
65 28 20 14 26 16 21
28 7 10 2 32 24 22
107 32 17 8 47 35 23
51 21 134 40 25 15 24
107 33 16 5 22 10 25
129 38 36 29 29 21 26
48 18 48 33 22 11 27
100 31 23 16 36 27 28
27 5 27 20 43 33 29
107 34 12 4 38 29 30
54 25 52 38 31 23 31
117 37 19 11 18 6 32
34 13 26 19 35 25 33
52 22 41 31 18 7 34
115 36 51 37 28 19 35
31 10 12 3 20 9 36
21 3 75 39 37 28 37
58 26 20 13 25 13 38
61 27 35 27 29 20 39
53 24 19 10 52 38 40

#### JohnM

##### TS Contributor
It may very well be that there is virtually no relationship between the variables that you are studying.....

To compute the correlation coefficient (r) in Excel, just do =CORREL(column1,column2)

Just a couple of questions:

Did you do correlations between raw scores or the ranks, or both?

I wouldn't do =CORREL on the ranks - I would do Spearman's rho.

Why did you compute ranks?

#### tfj

##### New Member
Ranks

Thanks a lot, JohnM. I ranked the students because, and I said at length in my first post, I really know very little about statistics. I did a CORREL function in EXCEL, and came up with correlation coefficients that show no statistical significance, which means, I hope, that the Cloze Test does NOT discriminate against people with a particular cognitive style (some scholars have argued that field dependent people -- people who do poorly in Col. Two -- should also do more poorly on a Cloze Test (Col. Three), and therefore the test cannot be used to test global English ability).

Anyway, with the data I have (as seen in the attached jpeg), do I understand the procedure correctly? That is, should I first do a CORREL in EXCEL on Col. One and Col. Three (i.e., the Cloze test) to get a correlation coefficient, and then another CORREL in EXCEL on Col. Two and Col. Three, and NOT use the ranking data (1-40) the three columns where I ranked the students according to their scores on the three tests?

Is there a better way to determine that the Cloze test doesn't discriminate against students according to cognitive styles (scores revealed in Col. One and Col. Two). You mentioned the Spearman test? Should I do this test instead, and if so, could you please tell me how to do it? Please explain in very simple English. I deeply appreciate your help in this matter, and I have resolved to take a basic statistics course during this winter vacation.

#### JohnM

##### TS Contributor
Whether or not to analyze ranks or the raw scores is really a matter of the underlying theory, previous research, and the nature of the tests.

Yes, a lack of a statistically significant (i.e., low) correlation does imply that one test score isn't related to another test score.

You could do both methods (=CORREL in Excel is Pearson's Product Moment Correlation (r), which is the strength of the linear relationship between two variables; Spearman's rho is the "nonparametric" counterpart to Pearson's r - it's a correlation coefficient between two sets of ranks, and can be interpreted the same way as Pearson's r).

I would follow what was done in previous research in order to determine whether Spearman or Pearson should be used - but you can definitely do both - it would be interesting to see if one yields stronger correlations than the other - for example, if Spearman's rho yields stronger correlations, then that means the participant's raw scores aren't as strongly related as their relative positions (ranks) in the group.

If you look at the Examples section on this site, you should see a post for Online Statistics Resources - the following two are highly recommended as resources that discuss a lot about correlation, including rank correlation:

"Concepts and Applications..."

"HyperStat Online"

#### tfj

##### New Member
Thanks

Thanks JohnM for you kind and useful information.

#### tfj

##### New Member
One More Thing

I did as you suggestied, and calculated the Pearson Product Moment Correlation Coefficient with EXCEL, and came up with the following figures:

The correlation for Test One and the Cloze Test was: 0.109313302
The correlation for Test Two and the Cloze Test was:-0.060664335

I hope I don't sound too much like a complete dummy, but what is the significance of the minus sign in the second figure? And both figures mean there is no correlation whatsoever between the two tests and the Cloze Test, right?

Also, I couldn't figure out how to do a Spearman r in EXCEL because it doesn't seem to have this feature. Anyway, thanks for all your kind help.

#### JohnM

##### TS Contributor
I did as you suggestied, and calculated the Pearson Product Moment Correlation Coefficient with EXCEL, and came up with the following figures:

The correlation for Test One and the Cloze Test was: 0.109313302
The correlation for Test Two and the Cloze Test was:-0.060664335

I hope I don't sound too much like a complete dummy, but what is the significance of the minus sign in the second figure? And both figures mean there is no correlation whatsoever between the two tests and the Cloze Test, right?
The correlation coefficient can range between -1.0 and +1.0, with 0 or close to 0 meaning no correlation. A negative correlation indicates that as one variable increases, the other decreases. A positive correlation indicates that as one variable increases, the other increases.

Also, I couldn't figure out how to do a Spearman r in EXCEL because it doesn't seem to have this feature. Anyway, thanks for all your kind help.
There isn't a Spearman rho function in Excel, but it certainly can be done. There are plenty of web-based resources that will tell you the complete formula.