Normality test in data with multiple treatments


I need to test for normality to decide whether to go the ANOVA route or the Kruskal-Wallis route, and I don't know how to go about it. I have an experiment with the setup shown in the attached image. I have several treatments; within each treatment I may or may not have independent reactors (depending on the variable in question), and within each reactor I have several samples. The object of my research is the DIFFERENCES BETWEEN TREATMENTS, not between reactors. Where and with what data should I test for normality?

1.- I think "Situation A" is simple (I emphasize "think" and "simple"): I take all samples within a given treatment and test them for normality; if they are normal, I move on to the next treatment, take all its samples and test them, and repeat until all treatments are independently normal (ANOVA) or until one is not (transform and repeat the process; if still not normal, use Kruskal-Wallis).

2.- Situation B is much more complex (for me, anyway). As my interest is in the differences between treatments, I would apply my normality analysis at the treatment level, but with what data? The average value of each reactor within the treatment? Or each and every sample within the given treatment (no averages)? Furthermore, if I go for residual analysis, how should I calculate the residuals within each treatment: each sample vs. the average of the reactor it was taken from, or each sample vs. the average of the whole treatment?

For starters, I wish to use Shapiro-Wilk and skewness on the actual numerical values and Q-Q plots constructed with the residuals. Is there any non-visual test for residuals? For example, could I also apply Shapiro-Wilk on the residuals? Would that make sense?

I know there are considerations about sample size, and I know it's a much-debated question whether one should even test for normality because of this, and how to go about it. I just need to know how to proceed within the context of my question and the structure of my experiment. I know it's reprehensible that I don't want to go in depth into the whys and hows of statistics, but I've tried to find an answer and have not been able to get anywhere so far.



The normality assumption applies to the residuals, not to the data itself.
Many people are happy to look at a normal plot of all the residuals as a group. A strong pattern away from the straight line may indicate that the model can be improved, with a transformation, say, rather than by switching to a nonparametric test on a weaker model. A second "eye" test is to plot all the residuals as a group against the predicted values; there should be no obvious pattern. In my opinion, formal tests are of limited value.
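As a rough illustration of those two "eye" tests, here is a minimal sketch using only the Python standard library. It assumes a plain one-way layout in which the predicted value for each sample is its treatment mean. The data, the names `samples`, `anova_residuals` and `qq_points`, and the choice of plotting position are all made up for the example. The (fitted, residual) pairs are what you would put on the residuals-vs-predicted plot, and the Q-Q pairs are what you would put on the normal plot.

```python
from statistics import NormalDist, mean

# Hypothetical measurements: treatment name -> sample values (made-up numbers).
samples = {
    "T1": [4.1, 3.8, 4.4, 4.0],
    "T2": [5.2, 5.6, 4.9, 5.3],
    "T3": [6.8, 7.1, 6.5, 7.0],
}

def anova_residuals(groups):
    """Return (fitted, residual) pairs for a one-way layout."""
    pairs = []
    for values in groups.values():
        fitted = mean(values)  # predicted value = treatment mean
        pairs.extend((fitted, v - fitted) for v in values)
    return pairs

def qq_points(residuals):
    """Pair each sorted residual with a theoretical standard-normal quantile."""
    n = len(residuals)
    std_normal = NormalDist()
    # (i - 0.5) / n is one common plotting position; other conventions exist.
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, sorted(residuals)))

pairs = anova_residuals(samples)      # for the residuals-vs-predicted plot
residuals = [r for _, r in pairs]     # pooled residuals, all treatments together
qq = qq_points(residuals)             # for the normal (Q-Q) plot
```

If a formal test is still wanted, the same pooled `residuals` list could be passed to something like `scipy.stats.shapiro`, keeping in mind the caveat above about the limited value of such tests.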
In situation A, use the residuals from the ANOVA. In situation B, the samples are nested in the reactors, so the values to analyse are the averages of the samples from each reactor (one value per reactor).
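For situation B, the averaging step might look like the sketch below (standard-library Python again). The nested dictionary, its numbers, and the function names `reactor_means` and `residuals_of_means` are hypothetical; the assumption is that each reactor contributes one mean, and that residuals are then taken against the treatment-level mean of those reactor means.

```python
from statistics import mean

# Hypothetical nested data: treatment -> reactor -> sample values (made-up numbers).
nested = {
    "T1": {"R1": [4.1, 3.9, 4.0], "R2": [4.4, 4.6, 4.5], "R3": [3.7, 3.8, 3.9]},
    "T2": {"R1": [5.2, 5.0, 5.1], "R2": [5.6, 5.8, 5.7], "R3": [4.9, 5.0, 5.1]},
}

def reactor_means(data):
    """Collapse the samples of each reactor to a single mean per reactor."""
    return {
        treatment: [mean(values) for values in reactors.values()]
        for treatment, reactors in data.items()
    }

def residuals_of_means(means_by_treatment):
    """Residual of each reactor mean from its treatment-level mean."""
    residuals = []
    for means in means_by_treatment.values():
        centre = mean(means)  # treatment mean of the reactor means
        residuals.extend(m - centre for m in means)
    return residuals

means = reactor_means(nested)        # one value per reactor
res = residuals_of_means(means)      # pooled residuals for the normal plot
```

With only a few reactors per treatment there are few residuals to look at, which is one more reason to pool them across treatments before judging normality.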