Hi there, thanks in advance for any help you can give.
I need to perform a correlation test between a parametric data set and a non-parametric data set: should I use a parametric test (Pearson) or a non-parametric test (Spearman's rho)?
I'll be performing the analysis in SPSS.
Pearson's will work. I attached a marginal plot of two variables. The x variable is from a uniform distribution. The y variable was created by adding a uniformly distributed error term to the x variable. Both the x and y terms are clearly non-normal. There is also a strong linear relationship and a strong Pearson's correlation.
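For anyone who cannot see the plot, the setup described above can be reproduced with a short simulation. This is only a sketch in Python rather than SPSS, and the sample size, the seed and the error width (plus or minus 0.2) are my assumptions, not the values behind the attached plot. It computes both Pearson's r and Spearman's rho from their definitions so the two can be compared.

```python
import random
from statistics import fmean


def pearson_r(xs, ys):
    """Pearson product-moment correlation, computed from its definition."""
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def ranks(values):
    """Ranks 1..n. Ties are not averaged, which is acceptable for continuous data."""
    order = sorted(range(len(values)), key=values.__getitem__)
    out = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        out[idx] = rank
    return out


def spearman_rho(xs, ys):
    """Spearman's rho is Pearson's r applied to the ranks."""
    return pearson_r(ranks(xs), ranks(ys))


rng = random.Random(0)  # fixed seed (assumed) so the run is repeatable
x = [rng.uniform(0, 1) for _ in range(500)]       # uniform, so clearly non-normal
y = [xi + rng.uniform(-0.2, 0.2) for xi in x]     # x plus a uniform error term

r = pearson_r(x, y)
rho = spearman_rho(x, y)
print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")
```

Neither variable is normal, yet both coefficients should come out high because the relationship is linear and the error is small relative to the spread of x.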
Dear miner, thanks very much for all your help. I have another question, and any help would be greatly appreciated:
I'm going to try to model the relationship between the non-parametric data sets. I was going to use a generalised linear mixed model, but that's for normal data, isn't it? What would be the best model to use for non-normal data?
Regression analysis does not require normality of the variables themselves, only of the residuals. Run your analysis, then test the normality of the residuals and check the remaining residual diagnostics (for example, no unusual patterns against fitted values or time).
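To make the distinction concrete, here is a rough sketch in Python (not SPSS) with made-up data: the predictor is strongly skewed, but the errors are normal. The true intercept, slope, seed and sample size are all assumptions for illustration. As a simple stand-in for a formal normality test such as Shapiro-Wilk, it uses the correlation between the sorted values and normal quantiles (the correlation behind a normal Q-Q plot), where values close to 1 indicate normality.

```python
import random
from statistics import NormalDist, fmean


def pearson_r(xs, ys):
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns (intercept, slope)."""
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return my - slope * mx, slope


def qq_correlation(values):
    """Correlation between sorted values and standard normal quantiles."""
    n = len(values)
    normal = NormalDist()
    theoretical = [normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return pearson_r(sorted(values), theoretical)


rng = random.Random(1)  # fixed seed (assumed)
x = [rng.expovariate(1.0) for _ in range(500)]           # skewed predictor
y = [2.0 + 3.0 * xi + rng.gauss(0.0, 1.0) for xi in x]   # normal errors

intercept, slope = fit_line(x, y)
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

qq_x = qq_correlation(x)
qq_resid = qq_correlation(residuals)
print(f"Q-Q correlation of x:         {qq_x:.3f}")
print(f"Q-Q correlation of residuals: {qq_resid:.3f}")
```

The predictor fails the normality check while the residuals pass it, which is the situation where ordinary regression is still appropriate.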
Hmm, there are no "non-parametric data". But there are some non-parametric methods.
There are many parametric methods for skewed, non-normal data. Examples are models based on the binomial, Poisson and exponential distributions.
The above-mentioned distributions can be fitted with "generalized linear models" (which also include the normal distribution as a special case). It is not clear at the moment whether you need the "mixed" part of the model.
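As an illustration of what a generalized linear model does with non-normal data, below is a minimal sketch of a Poisson regression with a log link, fitted by Newton-Raphson on simulated counts. It is written in plain Python only to show the mechanics; the function names, the true coefficients (0.5 and 0.8), the seed and the sample size are all assumptions of mine, and in practice you would use the GLM routine of your statistics package rather than hand-written code.

```python
import math
import random


def poisson_draw(rng, lam):
    """One Poisson(lam) draw by Knuth's multiplication method (fine for small lam)."""
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def fit_poisson_glm(xs, ys, iterations=25):
    """Fit log(mu) = b0 + b1 * x by Newton-Raphson on the Poisson log-likelihood."""
    b0, b1 = math.log(sum(ys) / len(ys)), 0.0   # start from the intercept-only fit
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * x) for x in xs]
        # Gradient of the log-likelihood
        g0 = sum(y - m for y, m in zip(ys, mu))
        g1 = sum((y - m) * x for x, y, m in zip(xs, ys, mu))
        # Information matrix (negative Hessian), a symmetric 2x2
        h00 = sum(mu)
        h01 = sum(m * x for x, m in zip(xs, mu))
        h11 = sum(m * x * x for x, m in zip(xs, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system by the explicit inverse
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


rng = random.Random(2)  # fixed seed (assumed)
x = [rng.uniform(0, 2) for _ in range(400)]
y = [poisson_draw(rng, math.exp(0.5 + 0.8 * xi)) for xi in x]  # skewed count data

b0, b1 = fit_poisson_glm(x, y)
print(f"estimated intercept = {b0:.3f}, estimated slope = {b1:.3f}")
```

The response here is a skewed count variable, yet the model is fully parametric; it simply assumes a Poisson rather than a normal distribution for the response.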
Normality of the residuals is actually not required for the point estimates in regression. But it is necessary (strictly so in small samples) for the confidence intervals and the p-values used to assess the model, and most people won't be interested in running a regression if they cannot test the null hypothesis.
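The first half of that statement can be checked with a small simulation. This is a sketch with assumed values (true slope 2.0, 50 observations per sample, 1000 repeated samples, fixed seed): the errors are strongly skewed, yet the least-squares slope still averages out to the true value.

```python
import random
from statistics import fmean


def ols_slope(xs, ys):
    """Least-squares slope for a single predictor."""
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


rng = random.Random(3)          # fixed seed (assumed)
x = [i / 10 for i in range(50)]  # fixed design points 0.0, 0.1, ..., 4.9
true_slope = 2.0

slopes = []
for _ in range(1000):
    # expovariate(1.0) has mean 1, so subtracting 1 gives skewed errors with mean 0
    y = [1.0 + true_slope * xi + (rng.expovariate(1.0) - 1.0) for xi in x]
    slopes.append(ols_slope(x, y))

mean_slope = fmean(slopes)
print(f"average estimated slope over 1000 samples = {mean_slope:.3f}")
```

This only speaks to the point estimate; it says nothing about whether the usual confidence intervals and p-values have their nominal coverage with errors like these.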
While Pearson might work with some non-normal data, it is questionable if you have binary data (data with two levels). Polychoric correlations (tetrachoric, when both variables are binary) are commonly recommended for that.
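A quick simulated illustration of why Pearson is questionable for binary data: take two normal variables with a known correlation, cut each at zero to make them binary, and Pearson on the binary versions comes out noticeably smaller. The latent correlation of 0.6, the seed and the sample size are assumptions for the sketch. The last step uses the closed form sin(pi/2 * phi), which recovers the latent correlation only in this special case where both variables are split at their medians; a general polychoric estimate needs a maximum-likelihood routine, which is not shown here.

```python
import math
import random
from statistics import fmean


def pearson_r(xs, ys):
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(4)  # fixed seed (assumed)
rho = 0.6               # assumed correlation between the latent normal variables
n = 5000

z1 = [rng.gauss(0, 1) for _ in range(n)]
z2 = [rho * a + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1) for a in z1]

# Dichotomise both variables at zero (the population median)
d1 = [1 if a > 0 else 0 for a in z1]
d2 = [1 if a > 0 else 0 for a in z2]

r_latent = pearson_r(z1, z2)   # Pearson on the continuous variables
phi = pearson_r(d1, d2)        # Pearson on the binary versions (phi coefficient)
recovered = math.sin(math.pi / 2 * phi)  # valid for median splits only

print(f"latent r = {r_latent:.3f}, binary phi = {phi:.3f}, recovered = {recovered:.3f}")
```

The gap between the latent correlation and phi is the attenuation that tetrachoric and polychoric correlations are designed to undo.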