Correlation analysis: Spearman or Pearson?


We are conducting research on the relationship between self-esteem and aggression. We have aggression scores and self-esteem scores, and we want to find out whether the relationship differs significantly by gender.

We have run a Pearson's correlation for the entire sample (regardless of gender), but we are unsure whether Spearman's or Pearson's is more suitable for the analysis by gender. Aggression and self-esteem scores are both normally distributed in the entire sample. When analysed by gender, however, the female aggression scores are not normally distributed, while female self-esteem and both male aggression and male self-esteem are normal. Should Spearman's be used for the correlation between female self-esteem and aggression? As I understand it, Spearman's should be used when the assumptions of Pearson's are violated. Moreover, Pearson's assumes linearity, and although there is a very weak linear relationship between the variables, we are unsure whether the linearity assumption is met. The sample sizes are also rather small (female = 38, male = 36), which may affect the use of Spearman's.

We also wish to perform Fisher's Z test for a significant difference between the male and female correlations. Since the distribution of the male scores was normal, we believe Pearson's would be acceptable for males, whilst Spearman's might be more acceptable for the female correlation. We are not sure whether it would be acceptable to compare these correlations, since they are derived from different methods.
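For reference, the comparison we have in mind is the usual Fisher z test for two independent correlations. Below is a minimal sketch, assuming both coefficients are Pearson correlations from independent groups. The function name `fisher_z_compare` and the two r values are placeholders for illustration, not results from our data; only the group sizes (38 and 36) are ours.

```python
import math

def fisher_z_compare(r1, n1, r2, n2):
    """Two-sided Fisher z test for the difference between two
    independent correlation coefficients (hypothetical helper)."""
    # Fisher's r-to-z transformation: z = atanh(r)
    z1 = math.atanh(r1)
    z2 = math.atanh(r2)
    # Standard error of the difference between the transformed values
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|))
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Placeholder correlations (NOT our actual results); n values are our group sizes
z, p = fisher_z_compare(r1=-0.30, n1=38, r2=-0.10, n2=36)
print(f"z = {z:.3f}, two-sided p = {p:.3f}")
```

With groups this small the standard error is fairly large, so only a sizeable gap between the two coefficients would reach significance.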

I have attached visualisations of the relationships to help illustrate what I am trying to say; I hope I have explained it well enough.

Any help will be greatly appreciated.


[Attachments: Correlation by Gender.jpeg, Total sample Correlation.jpeg]


TS Contributor
Pearson's correlation does assume linearity. Spearman's assumes only a monotonic relationship (consistently increasing or decreasing), which may or may not be linear. Spearman's is less sensitive to outliers in the tail areas and can handle ordinal data. However, since it is based on ranks, it is less powerful than Pearson's for linear, continuous data.
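To make the "based on ranks" point concrete, here is a rough sketch in plain Python: Spearman's rho is just Pearson's r computed on the ranks, with tied values sharing their average rank. The helper names and the small data set are made up for illustration (the last y value is a deliberate outlier), so treat this as a demonstration rather than anything tied to your data.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Ranks starting at 1; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))

# Synthetic data: a clean increasing trend with one extreme value at the end
x_out = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y_out = [1, 2, 3, 4, 5, 6, 7, 8, 9, -20]

print("Pearson r   :", round(pearson_r(x_out, y_out), 3))
print("Spearman rho:", round(spearman_rho(x_out, y_out), 3))
```

In this toy case the single extreme point is enough to flip the sign of Pearson's r, while Spearman's rho stays positive because the outlier can only move one rank position.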


Active Member
The displayed relationships are fairly linear. In fact, any non-linear terms added to the model may well not come out as statistically significant. If you try both Pearson and Spearman, they are likely to deliver the same significance status; in a situation like yours they usually agree. This plot visualizes a situation where Pearson and Spearman go separate ways. It is a somewhat extreme case but, hopefully, you will get the idea.