Need some advice on normality and tests

#1
Hi,

First I would like to thank you for this wonderful forum with lots of useful information :)
I have two samples, and I'm trying to determine whether they are normally distributed so I can decide between a parametric and a non-parametric method later.

What is confusing me is that both samples pass the Anderson-Darling test, the Jarque-Bera test, the Kolmogorov-Smirnov test, and the Shapiro-Wilk test. However, when I plot the data in R using hist or plot, it doesn't look normally distributed.

My question is: am I doing something wrong or missing something?

Here are the samples:
Sample 1:
9 9 24 24 27 27 39 45 54 54 54 57 57 57 57 66 66 69 75 81 90 99 102 105 108 120 120 123 24 19

Sample 2:
15 18 24 30 39 60 69 72 87 126 129 132 144 147 156 162 171 171 174 180 186 192 207 210 222 234 237 249 255 279 306 327 330 9 89
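
For reference, here is a sketch of the checks described above using Python/SciPy (the thread's own code was in R via hist/plot, so treat this as an equivalent way to run the same tests, not the original code):

```python
import numpy as np
from scipy import stats

sample1 = [9, 9, 24, 24, 27, 27, 39, 45, 54, 54, 54, 57, 57, 57, 57,
           66, 66, 69, 75, 81, 90, 99, 102, 105, 108, 120, 120, 123, 24, 19]
sample2 = [15, 18, 24, 30, 39, 60, 69, 72, 87, 126, 129, 132, 144, 147,
           156, 162, 171, 171, 174, 180, 186, 192, 207, 210, 222, 234,
           237, 249, 255, 279, 306, 327, 330, 9, 89]

def normality_report(x):
    """Run the tests named in the thread; return p-values where available
    (Anderson-Darling reports a statistic vs. critical values instead)."""
    x = np.asarray(x, dtype=float)
    report = {
        "shapiro_p": stats.shapiro(x).pvalue,
        "jarque_bera_p": stats.jarque_bera(x).pvalue,
        # KS against a normal with the sample's own mean/sd (a common choice,
        # though estimating the parameters makes this version conservative)
        "ks_p": stats.kstest(x, "norm", args=(x.mean(), x.std(ddof=1))).pvalue,
    }
    ad = stats.anderson(x, dist="norm")
    # "passes" at the 5% level if the AD statistic is below the 5% critical value
    report["ad_pass_5pct"] = bool(ad.statistic < ad.critical_values[2])
    return report

r1 = normality_report(sample1)
r2 = normality_report(sample2)
```

A side-by-side histogram or QQ-plot of each sample is what revealed the mismatch described in the post; the numeric report alone would not.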

Thank you for any answers!
 

hlsmith

Not a robit
#2
Yes, these data come up normal when examined with tests of normality. The graphical representations show that the tails present with little peaks (kind of trimodal). Still, based on the tests, and given that the sets have >= 30 observations, I think the normality assumption can hold.

What types of tests are you planning to use? The concept of normality may be a little different in some instances, such as ANOVA, where you would want to look at the normality of the model's error term.
 
#3
Thank you for your reply!

Well, the idea is to test for normality, and from there I have a hypothesis that I set up (related to determining whether there is a difference between these two samples). I will reject or fail to reject the null hypothesis based on the method I use.

If the normality assumption holds, I'm leaning towards a t-test; otherwise, a Mann-Whitney test.

I'm leaning more towards the Mann-Whitney test (rank-based) because of the skewness and kurtosis I can see in the plots. Also, I'm not sure how tied values affect the choice of method.





What I'm still confused about is how the samples can pass the normality tests yet clearly not be symmetric, with some degree of skewness and kurtosis (which I calculated too). Is this possible?
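
The skewness and (excess) kurtosis mentioned above can be computed like this (a Python/SciPy sketch; the thread's own calculations and plots were done in R, so this is just an equivalent):

```python
from scipy import stats

sample1 = [9, 9, 24, 24, 27, 27, 39, 45, 54, 54, 54, 57, 57, 57, 57,
           66, 66, 69, 75, 81, 90, 99, 102, 105, 108, 120, 120, 123, 24, 19]
sample2 = [15, 18, 24, 30, 39, 60, 69, 72, 87, 126, 129, 132, 144, 147,
           156, 162, 171, 171, 174, 180, 186, 192, 207, 210, 222, 234,
           237, 249, 255, 279, 306, 327, 330, 9, 89]

# skewness: 0 for a symmetric distribution
# excess kurtosis (Fisher's definition): 0 for a normal distribution
sk1, ku1 = stats.skew(sample1), stats.kurtosis(sample1)
sk2, ku2 = stats.skew(sample2), stats.kurtosis(sample2)
```

Moderate sample skewness and kurtosis can coexist with non-significant normality tests: the tests answer "is there enough evidence to reject normality?", not "are the sample moments exactly those of a normal?".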
 
Last edited:

hlsmith

Not a robit
#4
In the general scheme of things, these data are not too skewed. I usually use a length-of-stay or weight example: you have most people around one value (e.g., a 2-day hospital stay, or a weight around 165), and with these measurements there is no real ceiling, so you get positively skewed data with very long tails. Those graphs look much more hideous than these.

I believe that to test normality (and to select between your options), you can merge the data sets, calculate the mean, and subtract the mean from all observations. If these differences are normally distributed, then go with the t-test; otherwise, go with the nonparametric test. If doing the t-test, examine the equality of variances and select the right p-value.
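
A sketch of the suggested procedure (merge, subtract the mean, re-test the centered values), assuming SciPy; note that centering on the pooled mean leaves any between-group shift in the "residuals", whereas subtracting each group's own mean is what the regression-residual analogy amounts to:

```python
import numpy as np
from scipy import stats

sample1 = [9, 9, 24, 24, 27, 27, 39, 45, 54, 54, 54, 57, 57, 57, 57,
           66, 66, 69, 75, 81, 90, 99, 102, 105, 108, 120, 120, 123, 24, 19]
sample2 = [15, 18, 24, 30, 39, 60, 69, 72, 87, 126, 129, 132, 144, 147,
           156, 162, 171, 171, 174, 180, 186, 192, 207, 210, 222, 234,
           237, 249, 255, 279, 306, 327, 330, 9, 89]

merged = np.asarray(sample1 + sample2, dtype=float)
residuals = merged - merged.mean()   # center on the pooled mean, as described

# Normality of the centered data decides between the two candidate tests
p_norm = stats.shapiro(residuals).pvalue

if p_norm >= 0.05:
    # equality of variances decides which t-test p-value to report
    equal_var = stats.levene(sample1, sample2).pvalue >= 0.05
    result = stats.ttest_ind(sample1, sample2, equal_var=equal_var)
else:
    result = stats.mannwhitneyu(sample1, sample2, alternative="two-sided")
```

The 0.05 threshold and the use of Shapiro-Wilk/Levene here are assumptions for illustration; the post doesn't pin down which normality or variance test to use at this step.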
 
#5
Thanks again for your feedback :)

It seems that merging both samples into one fails the Anderson-Darling test. Plotting also reveals a much higher degree of skewness (which I expected). Does this still make the t-test an attractive choice? Also, how does the t-test deal with tied values? It seems to me that going with a non-parametric test would be safer.

EDIT: After merging the two samples, calculating the mean, and subtracting it from all the observations, the data is still pretty skewed.
 
Last edited:

hlsmith

Not a robit
#6
If it looks and tests like it is not normal, I would go with the Wilcoxon rank-sum test. I believe it is pretty much the same thing as the Mann-Whitney test.

Ties don't come into play in the t-test, since you are working with the actual values, though nonparametric ranking tests usually have a method for dealing with ties.
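
As a concrete illustration of the rank-based option: SciPy's mannwhitneyu (equivalent to the Wilcoxon rank-sum test for two independent samples) applies a tie correction in its normal approximation, so repeated values like the 54s and 57s in sample 1 are handled automatically (a sketch, not the thread's original code):

```python
from scipy import stats

sample1 = [9, 9, 24, 24, 27, 27, 39, 45, 54, 54, 54, 57, 57, 57, 57,
           66, 66, 69, 75, 81, 90, 99, 102, 105, 108, 120, 120, 123, 24, 19]
sample2 = [15, 18, 24, 30, 39, 60, 69, 72, 87, 126, 129, 132, 144, 147,
           156, 162, 171, 171, 174, 180, 186, 192, 207, 210, 222, 234,
           237, 249, 255, 279, 306, 327, 330, 9, 89]

# method="auto" switches to the tie-corrected normal approximation
# when ties are present, so no manual tie handling is needed
res = stats.mannwhitneyu(sample1, sample2, alternative="two-sided",
                         method="auto")
```

With these two samples the test rejects decisively, which is unsurprising given how far apart their centers are.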
 
#7
If it looks and tests like it is not normal, I would go with the Wilcoxon rank-sum test. I believe it is pretty much the same thing as the Mann-Whitney test.

Ties don't come into play in the t-test, since you are working with the actual values, though nonparametric ranking tests usually have a method for dealing with ties.
Got it. What do you think is a reasonable explanation for both samples passing the normality tests? They are supposed to be accurate, but this seems quite contradictory to me. While I am personally convinced that the distribution is not normal, is there a valid explanation for how the samples can pass these tests? Right now it seems like the graphical methods, such as the QQ-plots and kernel density plots, support non-normality, but the formal tests suggest otherwise.

Also, is there a name for the procedure you suggested of merging samples and subtracting the means to check if the samples come from the same distribution?

Thank you again for your answers.
 

hlsmith

Not a robit
#8
With the listed method you would be testing the residuals for normality, much like in simple and multivariable regression analyses. I would look up how the normality tests are being conducted to understand why they are not significant. I believe that, much like with many other statistical tests, the greater the sample size, the more often the null gets rejected when it is correct to reject it (i.e., the tests have more power with more data); look up the test formulas to verify how this may occur.
 
#9
I know the test formulas are fine, and the samples passed them with p >= 0.05. What I'm still confused about is: if the samples are normally distributed, how can they be skewed at the same time?
 

Dason

Ambassador to the humans
#10
If you fail to reject a normality test that doesn't mean that the sample comes from a normal distribution - it just means that you don't have enough evidence to say that they don't come from a normal distribution. Just like how a guilty person can go free if the prosecutor doesn't provide enough evidence to convince the jury that the defendant is guilty.
 
#11
If you fail to reject a normality test that doesn't mean that the sample comes from a normal distribution - it just means that you don't have enough evidence to say that they don't come from a normal distribution. Just like how a guilty person can go free if the prosecutor doesn't provide enough evidence to convince the jury that the defendant is guilty.
So it seems. I was thinking of arguing that a non-parametric test is the better approach, simply because in this case we are not certain about the distributions.
 
#12
I am not a master of statistical approaches by any means, but you just need to provide rationale to defend your decisions. If you take the best approach you see possible and conduct either parametric or non-parametric tests, just be able to back up why you selected your approach. In many circumstances the tests are going to provide fairly comparable results.
 
#13
I am not a master of statistical approaches by any means, but you just need to provide rationale to defend your decisions. If you take the best approach you see possible and conduct either parametric or non-parametric tests, just be able to back up why you selected your approach. In many circumstances the tests are going to provide fairly comparable results.
True. I have to admit that sometimes I think the samples are 'normal enough' for a t-test. However, I have some arguments for why a non-parametric approach is safer. First, the samples are skewed to some extent, which, if I'm correct, indicates a departure from normality (according to the kernel density plots presented previously). Another argument relates to what you said earlier: we don't have enough evidence to reject normality, but neither can we assume it. The assumptions that should be met for a t-test to perform well, such as normality and homoscedasticity, can also be questioned here.

I did a homoscedasticity test and both samples passed (which makes me more unsure; maybe a t-test is relevant after all). But it seems to me that at best these two samples are "contaminated normal".

What do you think about this in general and related arguments?
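
The thread doesn't say which homoscedasticity test was used, so the choice of Levene's test below is an assumption; it is one common option, and its median-centered (Brown-Forsythe) variant is less sensitive to non-normality than the classical F-test, which matters given the doubts above:

```python
from scipy import stats

sample1 = [9, 9, 24, 24, 27, 27, 39, 45, 54, 54, 54, 57, 57, 57, 57,
           66, 66, 69, 75, 81, 90, 99, 102, 105, 108, 120, 120, 123, 24, 19]
sample2 = [15, 18, 24, 30, 39, 60, 69, 72, 87, 126, 129, 132, 144, 147,
           156, 162, 171, 171, 174, 180, 186, 192, 207, 210, 222, 234,
           237, 249, 255, 279, 306, 327, 330, 9, 89]

# H0: the two samples have equal variances.
# center="median" gives the Brown-Forsythe variant, robust to skewed data.
lev = stats.levene(sample1, sample2, center="median")
```

If the equal-variance hypothesis is rejected, Welch's t-test (ttest_ind with equal_var=False) is the usual fallback on the parametric side.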
 
#14
I just wanted to add some interesting new findings. I ran another variant of the Anderson-Darling test (the non-parametric k-sample test), which tests whether two samples come from the same distribution. Guess what: it fails!
Evidence keeps piling up. Gotta love statistics.
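
The k-sample Anderson-Darling test described above is available in SciPy as anderson_ksamp (the thread doesn't say which implementation was actually used, so this is just an equivalent sketch):

```python
import warnings
from scipy import stats

sample1 = [9, 9, 24, 24, 27, 27, 39, 45, 54, 54, 54, 57, 57, 57, 57,
           66, 66, 69, 75, 81, 90, 99, 102, 105, 108, 120, 120, 123, 24, 19]
sample2 = [15, 18, 24, 30, 39, 60, 69, 72, 87, 126, 129, 132, 144, 147,
           156, 162, 171, 171, 174, 180, 186, 192, 207, 210, 222, 234,
           237, 249, 255, 279, 306, 327, 330, 9, 89]

# H0: both samples come from the same (unspecified) distribution.
# SciPy caps the returned significance_level to [0.001, 0.25] and warns
# when the true p-value falls outside that range, hence the suppression.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    res = stats.anderson_ksamp([sample1, sample2])
```

A rejection here is consistent with the rank-sum result: the two samples differ strongly in location, even if each one individually slips past the one-sample normality tests.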