Non-parametric test alternative to 2-way ANOVA?

#1
Hello everyone. I recently attempted to run a 2-way ANOVA (using SPSS) but have run into difficulty meeting its assumptions: both Levene's test (equality of variances) and the Shapiro-Wilk test (normality) were significant. The data also do not look normally distributed on a histogram. I'm trying to figure out what would be a good alternative for analysing this data, since both normality and equality of variances appear to be violated.

Here's a summary of my data:
  • I would like to test whether there is an interaction between a person's BMI and their procedure type in their effect on the number of events that occur during the procedure
  • 2 independent variables (BMI, procedure type; both categorical) with 1 continuous (number of events; interval) dependent variable
  • Sample sizes for each group are unequal

I've looked online for a lot of resources but I'm not sure if they apply to what I'm trying to do. I've heard of using other alternatives like Friedman, Welch's, and Brown-Forsythe tests, but I don't know if they're applicable to this scenario. I've also seen recommendations to use the log of my values or to do a robust analysis, but again...very confused as to how to approach this.

So right now I'm pretty lost. Any help would be greatly appreciated. Thank you for your time.
 


Karabiner

TS Contributor
#2
Data need not be normally distributed in ANOVA. In small samples, it is assumed that the residuals are from normally distributed populations, but that is not an important assumption if samples are not small (n > 30 or so). Some real problems seem to be:
a) the categorization of the BMI, which sacrifices many degrees of freedom - why don't you use the BMI as an interval-scaled variable?
b) your dependent variable is a count variable with only a very small range, therefore ANOVA seems inappropriate; you should consider Poisson regression or negative binomial regression instead.
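
To make point b) concrete, here is a rough sketch of how the interaction question could be written as a count model. The notation is illustrative only and not from the thread: i indexes the BMI category, j the procedure type, k the patient, and kappa is the negative binomial dispersion parameter in the common "NB2" parameterization.

```latex
% Poisson log-linear model with a BMI x procedure interaction (illustrative notation)
Y_{ijk} \sim \mathrm{Poisson}(\mu_{ij}), \qquad
\log \mu_{ij} = \beta_0 + \alpha_i + \gamma_j + (\alpha\gamma)_{ij}

% Variance assumptions of the two candidate models
\text{Poisson: } \operatorname{Var}(Y_{ijk}) = \mu_{ij}, \qquad
\text{negative binomial (NB2): } \operatorname{Var}(Y_{ijk}) = \mu_{ij} + \kappa\,\mu_{ij}^{2}
```

Under this sketch, "no interaction" corresponds to all (\alpha\gamma)_{ij} = 0, which would typically be assessed by comparing the models with and without the interaction terms (e.g. with a likelihood ratio test).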

With kind regards

Karabiner

BTW, there are no "nonparametric" alternatives for 2-factorial problems.
 
#3
Hello Karabiner,

Thank you for your helpful reply. I appreciate the information regarding ANOVA. I was under the impression that to run an ANOVA test you needed to meet these assumptions:
  1. Independence of observations – this is an assumption of the model that simplifies the statistical analysis.
  2. Normality – the distributions of the residuals are normal.
  3. Equality (or "homogeneity") of variances, called homoscedasticity — the variance of data in groups should be the same.
By the way that's from the ANOVA Wiki page. I did test the residuals and plot them on a histogram, and they appeared to be very skewed. I also used a Q-Q plot and saw that they were not lining up either, which is why I started to become confused as to what would be the next step in this study if it did not meet normality and homoscedasticity. However, as you mentioned, this is count data and therefore ANOVA would not be appropriate as it is not a test for counts.

When you say that is not an important assumption, do you mean when the total sample size across all groups is n > 30? Or does each combination of factor levels (say, for my example, BMI: <18.5 * Procedure: Endoscopy) have to have n > 30? I ask because I have some groups (let's just say for example BMI: > 50 * Procedure: Colonoscopy) that have fewer than 30 cases in them. Is that an issue even if I decide to use Poisson regression or negative binomial regression?

To answer your other question, unfortunately the people who collected the data did not have access to the actual BMI number: it was instead grouped into a range according to the forms they used. Out of curiosity if I did get BMI to the interval range (say for example I did happen to have this info), then would it change what type of test I would do? If so then what test?

Lastly -- and again I appreciate your assistance with everything -- how would I know whether Poisson or negative binomial is the better test for this study? I ask only because you give me an option for either or, but most flow charts/reference material I've looked at online don't even cover those studies. How would I know going forward which test to use?
 

Karabiner

TS Contributor
#4
"When you say that is not an important assumption do you mean when the total sample size from all groups is n > 30?"
Yes.
"Out of curiosity if I did get BMI to the interval range (say for example I did happen to have this info), then would it change what type of test I would do? If so then what test?"
In that case, you would include BMI as an interval-scaled variable in your analyses instead of BMI as a categorical variable with 8 levels. E.g. in a regression, you would only need 1 variable "BMI" instead of 7 dummy variables which jointly represent "BMI".
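
A small sketch of the "1 variable instead of 7 dummy variables" point, in plain Python. The BMI category labels and the BMI value 32.4 are made up for illustration (the thread only says there are 8 levels), and `dummy_code` is a hypothetical helper, not a library function.

```python
# Hypothetical BMI category labels; the thread only states that there are 8 levels.
bmi_levels = ["<18.5", "18.5-24.9", "25-29.9", "30-34.9",
              "35-39.9", "40-44.9", "45-49.9", ">=50"]


def dummy_code(value, levels):
    """Return k-1 indicator columns for a categorical value.

    The first level acts as the reference category, so it is
    represented by all zeros.
    """
    if value not in levels:
        raise ValueError(f"unknown level: {value!r}")
    return [1 if value == level else 0 for level in levels[1:]]


# Categorical BMI: one patient contributes 7 indicator columns.
row_categorical = dummy_code("30-34.9", bmi_levels)

# Interval-scaled BMI: the same patient contributes a single column.
row_interval = [32.4]  # made-up BMI value

print(len(row_categorical), len(row_interval))
```

Each extra column costs a degree of freedom, and the interaction with procedure type multiplies that cost, which is why the categorized BMI is less efficient when the actual values are available.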

"how would I know whether Poisson or negative binomial is the better test for this study?"
https://stats.stackexchange.com/que...mial-negative-binomial-and-poisson-regression
http://www.math.usu.edu/jrstevens/biostat/PoissonNB.pdf
https://www.theanalysisfactor.com/p...ng-count-model-diagnostics-to-select-a-model/
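
As a quick first look before reading those links: a Poisson model assumes the variance of the counts equals their mean, so a variance much larger than the mean (overdispersion) points toward negative binomial. The sketch below is only a crude per-group heuristic with made-up counts; `dispersion_ratio` is a hypothetical helper, and a proper check would look at the dispersion of the fitted model rather than raw group data.

```python
from statistics import mean, variance


def dispersion_ratio(counts):
    """Sample variance divided by sample mean.

    Values near 1 are in line with the Poisson assumption;
    values well above 1 suggest overdispersion.
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("mean is zero; ratio is undefined")
    return variance(counts) / m


# Made-up event counts for two hypothetical BMI x procedure cells.
cell_a = [0, 1, 1, 2, 0, 1, 2, 1]
cell_b = [0, 0, 7, 1, 0, 12, 0, 3]

for name, counts in [("cell_a", cell_a), ("cell_b", cell_b)]:
    print(name, round(dispersion_ratio(counts), 2))
```

In this made-up example the second cell has a variance several times its mean, which is the kind of pattern where negative binomial regression is usually preferred over Poisson.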

HTH

Karabiner