Normality of a dataset

alfred

New Member
#1
Hello,

In SPSS, what is the difference between the Kolmogorov-Smirnov test under Analyze->Descriptive Stats->Explore and the one under Analyze->Non Parametric Tests->1 sample K-S?

I always thought that the first one includes the Lilliefors correction and is more precise, but in principle the second one could be used to check normality too.

However, I have a very simple dataset (here is the dataset) and I simply want to see if it's normal or not...

If I run the Kolmogorov-Smirnov test under Analyze->Descriptive Stats->Explore I obtain:

Therefore my dataset is not normal (i.e. it is significantly different from a normal distribution).

However, if I run the Kolmogorov-Smirnov test under Analyze->Non Parametric Tests->1 sample K-S I obtain:

Therefore my dataset is normal (i.e. it is not significantly different from a normal distribution).

So, which test should I trust? Just looking at the "numbers", can my dataset be considered normal or not?

Thank you in advance
 

RobH

New Member
#2
I don't know why the two tests are coming out different. I looked at the data on a histogram (go to graphs>histogram and put the gain variable in) and it didn't look very normally distributed, so I would say no.
 

RobH

New Member
#3

alfred

New Member
#4
Thanks again for your feedback. I know that chapter and I have that book (love it) :)
I just couldn't understand how a dataset could give completely opposite results with K-S, with and without the Lilliefors correction. Usually when you run K-S in SPSS with or without the correction, the significance level changes but the conclusion (normal/not normal) stays the same. This is probably an atypical case.
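To illustrate how the same K-S statistic can lead to different p-values, here is a minimal sketch using only the Python standard library. It assumes the uncorrected test compares the data against a normal whose mean and SD are estimated from the sample, and it approximates the Lilliefors correction by Monte Carlo simulation rather than by published tables. The `sample` values are hypothetical (they are not the dataset from this thread), and the function names are made up for the example.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def ks_statistic(sample):
    """K-S distance D between the sample and a normal with the sample's own mean and SD."""
    xs = sorted(sample)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    d = 0.0
    for i, x in enumerate(xs):
        cdf = fitted.cdf(x)
        # Largest gap between the empirical step function and the fitted CDF.
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d

def kolmogorov_p(d, n):
    """Asymptotic p-value that treats the mean and SD as if they were known in advance."""
    z = d * math.sqrt(n)
    total = sum((-1) ** (k - 1) * math.exp(-2 * k * k * z * z) for k in range(1, 101))
    return min(1.0, max(0.0, 2 * total))

def lilliefors_p(d, n, reps=2000, seed=0):
    """Monte Carlo p-value: share of simulated normal samples whose D is at least d."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sim = [rng.gauss(0, 1) for _ in range(n)]
        if ks_statistic(sim) >= d:  # parameters are re-estimated for every simulated sample
            hits += 1
    return hits / reps

# Hypothetical right-skewed data: 30 quantiles of an exponential distribution.
sample = [-math.log(1 - (i + 0.5) / 30) for i in range(30)]
d = ks_statistic(sample)
print("D =", round(d, 3))
print("uncorrected p =", round(kolmogorov_p(d, len(sample)), 3))
print("Lilliefors (simulated) p =", round(lilliefors_p(d, len(sample)), 3))
```

Because estimating the mean and SD from the data pulls the fitted curve toward the sample, the uncorrected p-value tends to be too large. The corrected p-value is smaller for the same D, so a dataset near the threshold can land on opposite sides of 0.05.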
 

noetsi

No cake for spunky
#5
Be warned that this is a very weak test for normality. It only catches non-normality that is really extreme. Using a histogram with a superimposed normal curve, looking to see if the box in a box plot is centered in the data, looking at skewness and kurtosis results, and QQ plots are all better ways to find out whether the data are normal.
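As a rough sketch of what a normal QQ plot shows, the coordinates can be computed with the Python standard library alone. The helper name `qq_points` and the plotting positions `(i + 0.5) / n` are choices made for this example; statistics packages may use slightly different positions.

```python
from statistics import NormalDist

def qq_points(sample):
    """Return (theoretical normal quantile, observed value) pairs for a normal QQ plot."""
    xs = sorted(sample)
    n = len(xs)
    std_normal = NormalDist()  # mean 0, SD 1
    return [(std_normal.inv_cdf((i + 0.5) / n), x) for i, x in enumerate(xs)]

# Hypothetical values, only to show the output format.
for theoretical, observed in qq_points([4.1, 5.0, 5.2, 5.9, 6.3, 7.8, 9.5]):
    print(round(theoretical, 2), observed)
```

If the data are roughly normal, the pairs fall close to a straight line when plotted; curvature or points peeling away at the ends suggests skewness or heavy tails.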
 

ilgr

New Member
#6
I also have a question about my data set. In some topics I read that the skewness and kurtosis statistics cannot exceed 2 or fall below -2, but others divide the statistic by the standard error of the skewness and of the kurtosis, and this outcome must lie between -1.96 and 1.96 (the 95% z-score interval) for the data to be normally distributed. Which one is correct?

The first analysis gives a normal distribution but the second does not.

And how can I do a regression without a normally distributed dataset?

I hope someone can help me, thanks in advance
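The two rules of thumb described in this post can be put side by side in a short sketch. It assumes the bias-adjusted sample skewness and excess kurtosis and the usual large-sample standard error formulas, which to my knowledge match what SPSS reports; the function name and the dictionary keys are invented for the example.

```python
import math
from statistics import mean, stdev

def skew_kurtosis_report(sample):
    """Skewness, excess kurtosis, their standard errors, and statistic/SE ratios."""
    n = len(sample)
    m, s = mean(sample), stdev(sample)
    z3 = sum(((x - m) / s) ** 3 for x in sample)
    z4 = sum(((x - m) / s) ** 4 for x in sample)
    skew = n / ((n - 1) * (n - 2)) * z3
    kurt = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * z4
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))
    # The standard errors depend only on the sample size n.
    se_skew = math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    se_kurt = 2 * se_skew * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))
    return {
        "skewness": skew, "se_skew": se_skew, "z_skew": skew / se_skew,
        "kurtosis": kurt, "se_kurt": se_kurt, "z_kurt": kurt / se_kurt,
    }

# Hypothetical data: the integers 1..100 (symmetric, flatter than a normal).
report = skew_kurtosis_report(list(range(1, 101)))
for name, value in report.items():
    print(name, round(value, 3))
```

The standard errors shrink as n grows, so the ±1.96 rule becomes stricter with larger samples while the ±2 rule on the raw statistic does not change. That is one way the two rules can disagree on the same data.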
 

Dason

Ambassador to the humans
#7
You don't need your data to be normally distributed when you do regression. You want your residuals to be approximately normally distributed. You can't check this until you actually run the regression.
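A small simulated sketch of this point, using only the Python standard library: the predictor below is deliberately skewed, the errors are normal, and the residuals only exist after the line has been fitted. All values are simulated for illustration, and the helper names are hypothetical.

```python
import random
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def residuals(x, y):
    """Observed y minus the fitted value for each case."""
    slope, intercept = fit_line(x, y)
    return [b - (intercept + slope * a) for a, b in zip(x, y)]

rng = random.Random(1)
x = [rng.expovariate(1.0) for _ in range(200)]      # skewed predictor, not normal
y = [2.0 + 3.0 * a + rng.gauss(0, 1) for a in x]    # normal errors around the line
res = residuals(x, y)

print("slope, intercept:", [round(v, 2) for v in fit_line(x, y)])
print("mean residual:", round(mean(res), 6))
```

Here `x` and `y` are both skewed, yet the regression assumptions are met because the errors are normal. It is `res`, not the raw variables, that would go into a histogram or QQ plot.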
 

ilgr

New Member
#8
But for the multicollinearity test, don't I need to know whether the data are normally distributed or not? And that is what I need before running a regression.
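For reference, a common multicollinearity diagnostic, the variance inflation factor (VIF), is computed from the correlations among the predictors and involves no normality test. A minimal sketch for the two-predictor case, where VIF reduces to 1 / (1 - r²); the data and function names are hypothetical.

```python
from statistics import mean

def pearson_r(a, b):
    """Pearson correlation between two equal-length lists."""
    ma, mb = mean(a), mean(b)
    sab = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    saa = sum((p - ma) ** 2 for p in a)
    sbb = sum((q - mb) ** 2 for q in b)
    return sab / (saa * sbb) ** 0.5

def vif_two_predictors(x1, x2):
    """VIF shared by both predictors when the model has exactly two of them."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)

x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]
print(round(vif_two_predictors(x1, x2), 3))
```

With more than two predictors, each VIF comes from regressing one predictor on all the others, but the idea is the same: it describes overlap among predictors, not the shape of their distributions.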