Outliers and correlation

lora

New Member
#1
Hi all

I'm working with a small sample (n = 17) and need to do correlation for a lot of parameters. My questions are:

- Is removing the outliers the only thing I should do to prepare my data for correlation? What about transformation? When should I use it?

- What is the best procedure for correlation if my sample size is 17 and trimming the outliers may affect the results?

Many Thanks
 

JesperHP

TS Contributor
#2
Correlation is a measure of the strength of association between variables, so technically you cannot do "correlation for a lot of parameters" as you suggest. If you truly need help, you will probably have to rephrase the question. Is the problem that you have a lot of different variables in a sample of size 17?

How do you know they are outliers?
 

hlsmith

Less is more. Stay pure. Stay poor.
#3
I believe the poster just wants to run a bunch of pairwise correlations for all of their variables. The correlation procedure will depend on the formatting and/or normality of your data.
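
As a rough sketch of why the choice of procedure matters: Pearson's r is sensitive to a single extreme value, while Spearman's rho works on ranks and is usually less affected. The Python below is only an illustration. The numbers are made up and the function names (`pearson`, `average_ranks`, `spearman`) are my own, not anything from the original poster's data or from SPSS.

```python
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j share this rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Made-up data: y rises with x, but the last y value is extreme.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2, 3, 5, 4, 6, 7, 8, 40]

print("Pearson r:   ", round(pearson(x, y), 3))
print("Spearman rho:", round(spearman(x, y), 3))
```

With data like this the rank-based coefficient stays high while Pearson's r is pulled down by the one extreme point, which is one reason to look at both before deciding to delete anything.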
 

lora

New Member
#4
Yes, I have many variables and need to test whether any of them correlates with another variable.

I'm confused about the best way to prepare the data for correlation when I have outliers, which show up in the plots and in the skewness.

I read in Andy Field's book that there are many ways to handle this, such as transformation, trimming, etc.

For example, after trimming the data I can still see from the histogram and the skewness that the data are not normally distributed. Should I also apply a transformation?

So confused!
That's why I'm asking about the best procedure to prepare the data for correlation when there are outliers or the data are not normal.
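
To make the "check the skewness, then maybe transform" step concrete, here is a minimal sketch in Python. It uses the plain moment-based skewness g1 = m3 / m2^1.5 and a log transform on made-up, right-skewed, strictly positive values. The data and the `skewness` helper are assumptions for illustration only, and SPSS or other packages may report a small-sample-adjusted skewness that differs slightly.

```python
from math import log

def skewness(values):
    """Moment-based sample skewness g1 = m3 / m2**1.5 (no small-sample correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Made-up, right-skewed data; a log transform needs strictly positive values.
raw = [1, 2, 2, 3, 3, 4, 5, 8, 20]
logged = [log(v) for v in raw]

print("skewness before log:", round(skewness(raw), 2))
print("skewness after log: ", round(skewness(logged), 2))
```

Comparing the two numbers (and the histograms) shows whether the transformation actually helped; with very small samples the skewness estimate itself is noisy, so treat it as a rough guide.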
 

Lazar

Phineas Packard
#5
Do scatter plots. With a small n you will get a much better idea by visual inspection than simply running correlations alone.
 

lora

New Member
#6
One more thing I ran into when using SPSS to remove the outliers:
When I need to delete an outlier, should I just replace it with (-9) as missing data, or should I delete the value? If I delete the value, should I also delete the values of all other variables in the same row as this outlier?

for example

x y
2 4
3 7 <---- outlier (so should I remove the whole row, both 3 and 7?)


Sorry, my question might be silly to some of you ;(
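
The difference between marking one value as missing and dropping the whole row can be sketched as follows. This is a hedged Python illustration, not SPSS syntax: `MISSING` stands in for a value flagged as missing (such as one coded -9 and declared missing), the third variable `z` and all numbers are made up, and `pairwise_pearson` is a hypothetical helper name.

```python
from math import sqrt

MISSING = None  # stand-in for a value flagged as missing (hypothetical marker)

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def pairwise_pearson(x, y):
    """Correlate x and y using only rows where both values are present.

    Returns the coefficient and the number of rows actually used.
    """
    pairs = [(a, b) for a, b in zip(x, y)
             if a is not MISSING and b is not MISSING]
    xs, ys = zip(*pairs)
    return pearson(xs, ys), len(pairs)

# Made-up data: the x value in the second row was flagged and set to missing.
data = {
    "x": [2, MISSING, 4, 5, 7],
    "y": [4, 7, 9, 10, 15],
    "z": [1, 3, 2, 5, 6],
}

r_xy, n_xy = pairwise_pearson(data["x"], data["y"])
r_yz, n_yz = pairwise_pearson(data["y"], data["z"])
print("x-y uses", n_xy, "rows; y-z uses", n_yz, "rows")
```

Marking only the flagged value as missing drops that row from the x-y correlation but still lets its y and z values count in the y-z correlation. Deleting the entire row would remove it from every pair, which costs more data when n is only 17.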
 

lora

New Member
#8
hlsmith said: "I believe the poster just wants to run a bunch of pairwise correlations for all of their variables. The correlation procedure will depend on the formatting and/or normality of your data."
Does this mean I only need to remove the outliers, or do I also need to check the shape of the distribution and the skewness?