I've spent hours trying to learn how to do this seemingly simple thing, but I just can't find any solid answers.
Here's the situation: I parsed hundreds of computer programs and sorted each line into a category. I want to find out whether there are any correlations between the categories. The data is discrete, so I am inclined to use Pearson's correlation. However, the data also fails a normality test, so I am also inclined to use Spearman's rank correlation.
The second problem I have is that my sample size is in the hundreds. According to what I read, this means the threshold for statistical significance is rather low, so even weak correlations come out as significant. Almost ALL the categories are correlated with a p-value well under 0.001.
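To make the choice concrete, here is a minimal sketch of the two coefficients I'm comparing, written with the Python standard library only. The per-program counts are made up for illustration (they are not my real data), and the function names are my own. Spearman is computed here as Pearson applied to the ranks, with ties given their average rank.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson's r: covariance divided by the product of the spreads."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Rank the values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical counts of two line categories in six programs.
category_a = [1, 2, 3, 4, 5, 100]
category_b = [2, 3, 5, 8, 13, 21]

print("Pearson :", pearson(category_a, category_b))
print("Spearman:", spearman(category_a, category_b))
```

In this made-up example the relationship is monotonic but not linear, so Spearman comes out higher than Pearson, which is the kind of difference I'm unsure how to interpret.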
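Here is a rough sketch of what I mean about sample size. It uses the Fisher z-transformation as a normal approximation to the two-sided p-value of a correlation, which is not the exact t-based test that statistics packages usually report, and the sample sizes in the loop are hypothetical values chosen for illustration.

```python
from math import atanh, sqrt
from statistics import NormalDist


def approx_p_value(r, n):
    """Approximate two-sided p-value for correlation r with n samples.

    Uses the Fisher z-transformation: atanh(r) * sqrt(n - 3) is treated
    as approximately standard normal when the true correlation is zero.
    """
    z = atanh(r) * sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Same weak correlation, different (hypothetical) sample sizes.
for n in (30, 100, 300):
    print(n, approx_p_value(0.199, n))
```

The same r = .199 is nowhere near significant with 30 samples but falls below 0.001 with 300, so the p-value seems to say more about my sample size than about how strong the relationship is.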
In summary, here are my 2 questions:
1) If my data is discrete but not normally distributed, should I use Pearson or Spearman?
2) If my sample size is in the hundreds, should I say that r = .199 is significant, or can I just talk about stuff greater than .6 for example? Should I just talk about the different levels of significance rather than whether or not there is a correlation?
Thanks ahead of time for any help.