# Thread: What Test To Use? (Urgent)

1. ## What Test To Use? (Urgent)

I'm trying to determine which method is best by evaluating the methods in different scenarios. Basically I have a 63 x 2 x 4 x 10 design, but the data isn't normal at all (in fact, it's multimodal).

Some of my questions could hypothetically be addressed by grouping some of the factors and then using a Friedman test, but not all of them. Let me go into more detail:

I have 63 different algorithms I'm testing. Each algorithm is applied in 2 different systems. Each algorithm-system combination is then applied in 4 different scenarios, and 10 times in each scenario (I could hypothetically increase 10 to a larger number if needed). Primarily I want to ask the following questions:

"Which algorithm(s) work best overall?" <- main point
"Which algorithm(s) work best in each scenario?" <- secondary point
"Which system is superior?" <- not a terribly important question

Any help is extremely appreciated. I suspect I will need a non-parametric approach, since the normality assumptions are badly violated (the data are bimodal and right-skewed). I'm on a deadline and need to get this completed this week.
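For the grouped-factors idea mentioned above, here is a minimal sketch of a Friedman test using SciPy. The 63 x 80 score matrix is simulated placeholder data, not real results, and collapsing system, scenario, and replicate into 80 blocks is one possible grouping, not the only one:

```python
import numpy as np
from scipy.stats import friedmanchisquare

rng = np.random.default_rng(0)

# Simulated stand-in for the real scores: 63 algorithms (treatments)
# across 80 blocks (2 systems x 4 scenarios x 10 replicates).
scores = rng.random((63, 80))
scores[0] += 0.5  # pretend algorithm 0 performs better

# Friedman test: each argument is one algorithm's scores over the
# same matched blocks; it assumes only that blocks are comparable.
stat, p = friedmanchisquare(*scores)
print(f"Friedman chi-square = {stat:.1f}, p = {p:.3g}")
```

A significant result only says the algorithms differ somewhere; ranking which ones are best still needs post-hoc pairwise comparisons with a multiplicity correction.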

2. When you say the data are non-normal, do you mean the observed y values? We typically don't care whether the y values are normal. We care whether the residuals are normally distributed.

3. Originally Posted by Dason
When you say the data are non-normal, do you mean the observed y values? We typically don't care whether the y values are normal. We care whether the residuals are normally distributed.
Hmm, well a Jarque-Bera test on the residuals from an n-way ANOVA doesn't reject normality. So would this (probably) be the appropriate way to go?

There aren't many papers in my field dealing with statistical tests (unfortunately the field lacks rigor despite requiring a lot of graduate-level math). One of the few papers I have found argues that parametric tests are inappropriate because the sample means are probably not normally distributed (as in my case) and the random variables have unequal variances, which could lead to erroneous post-hoc tests. But it's possible that this paper's argument is flawed; the small amount of statistical rigor I'm trying to apply is still far more than almost every paper in my field.
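For what it's worth, the residual check described above can be sketched with SciPy's Jarque-Bera implementation. The residuals here are simulated stand-ins, and note that failing to reject is only an absence of evidence against normality, not proof of it:

```python
import numpy as np
from scipy.stats import jarque_bera

rng = np.random.default_rng(1)

# Stand-ins for residuals from a fitted model: one well-behaved set,
# one strongly right-skewed set.
normal_resid = rng.normal(size=500)
skewed_resid = rng.exponential(size=500) - 1.0  # centered but skewed

stat_n, p_n = jarque_bera(normal_resid)
stat_s, p_s = jarque_bera(skewed_resid)
print(f"normal residuals: p = {p_n:.3g}")
print(f"skewed residuals: p = {p_s:.3g}")
```

The skewed set is rejected decisively; the normal set typically is not. Pairing the test with a Q-Q plot of the residuals is usually more informative than the p-value alone.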

4. Originally Posted by kanan
Hmm, well a Jarque-Bera test on the residuals from an n-way ANOVA doesn't reject normality. So would this (probably) be the appropriate way to go?

There aren't many papers in my field dealing with statistical tests (unfortunately the field lacks rigor despite requiring a lot of graduate-level math). One of the few papers I have found argues that parametric tests are inappropriate because the sample means are probably not normally distributed (as in my case) and the random variables have unequal variances, which could lead to erroneous post-hoc tests. But it's possible that this paper's argument is flawed; the small amount of statistical rigor I'm trying to apply is still far more than almost every paper in my field.
Always try to be fair and use common sense when conducting your statistics (I guess that's what you mean by rigor).

Many of us on this forum aren't big fans of normality tests. They have limited use when your sample size is small, and again when your sample size is big. What's your sample size? If possible, post quantile plots and/or histograms [if your sample size is huge, you're in the clear].

Anyway, when your residuals are truly normally distributed, it's fine to use an ANOVA.
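The sample-size point can be illustrated with a small simulation: take data that is only mildly non-normal (a t distribution with 10 degrees of freedom, which looks nearly normal on a histogram) and watch the Jarque-Bera p-value as n grows. The numbers are simulated, not a general rule:

```python
import numpy as np
from scipy.stats import jarque_bera

rng = np.random.default_rng(2)

# Mildly non-normal data: t(10) has slightly heavy tails
# (excess kurtosis 1) but is close to normal in practice.
for n in (30, 100000):
    sample = rng.standard_t(df=10, size=n)
    stat, p = jarque_bera(sample)
    print(f"n = {n:>6}: JB p = {p:.3g}")
```

At n = 30 the same distribution usually passes; at n = 100000 it is rejected decisively even though the departure is practically irrelevant, which is why quantile plots are more useful than leaning on the test.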

5. ## Re: What Test To Use? (Urgent)

Thanks for your help, guys. I feel like the more I learn, the more confused I become. Let me describe what I'm trying to do in more detail and see if anybody can answer my questions.

So here is my setup. I have two algorithm types, A and B, run in succession. A has 9 levels and B has 7 levels. Each combination is then applied in conjunction with a framework (F) that has 2 levels. Each of the 9*7*2 = 126 algorithm*framework combinations is tested on 4 different datasets (D). Each dataset is randomly split into 30 partitions (V), so each algorithm combination is evaluated in each partition. I want to know which combinations of A and B are best/worst, which A are best/worst, and which B are best/worst (overall and per D). I don't have any particular hypotheses; I just want to do some sort of ANOVA, with post-hoc tests, to figure out which algorithm combinations work well and which don't.

I thought I had a factorial design with repeated measures, but then I thought it might be better viewed as a split-plot design (each dataset is a plot with each partition being a subplot). I'm not sure exactly what the right way to analyze this data is.

One way to think about this design is as having 120 subjects (the partitions): 30 randomly selected subjects are of type 1, 30 of type 2, 30 of type 3, and 30 of type 4 (the 4 datasets). Each subject is then tested 126 times, once for each A*B*F combination. I want to know which A works best, which B works best, and which A*B combination works best.

The dependent variable is accuracy (a value between 0 and 1), with 1 being perfect accuracy.

Can anybody help me confirm what kind of design I've created and offer advice on the analysis? A few specific questions: (1) Exactly what kind of design do I have (split-plot, or factorial with repeated measures, or are these the same design under different interpretations)? (2) Should I rearrange/group things to improve the design in some way? (3) What kind of post-hoc test(s) are most appropriate? (4) I think A, B, and F can be considered fixed effects, but I'm unsure whether D should be fixed or random, and whether V should be treated as a factor at all.
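While sorting out the exact split-plot/mixed specification, one rank-based sanity check is possible with SciPy alone: treat each partition as a block and compare the 126 A*B*F combinations across blocks, then follow up with matched-pairs tests. All numbers below are simulated placeholders, and in a real analysis the pairwise tests would need a multiplicity correction (e.g. Holm):

```python
import numpy as np
from scipy.stats import friedmanchisquare, wilcoxon

rng = np.random.default_rng(3)

# Simulated accuracies: 126 A*B*F combinations x 120 partitions
# (4 datasets x 30 partitions), each value in [0, 1].
acc = rng.beta(8, 2, size=(126, 120))
acc[0] = rng.beta(30, 2, size=120)  # pretend combination 0 is better

# Omnibus test: do any combinations differ, blocking on partition?
stat, p = friedmanchisquare(*acc)
print(f"Friedman: stat = {stat:.1f}, p = {p:.3g}")

# One post-hoc pairwise comparison (Wilcoxon signed-rank on matched
# partitions); all 126*125/2 pairs would need correction in practice.
w, pw = wilcoxon(acc[0], acc[1])
print(f"combo 0 vs combo 1: p = {pw:.3g}")
```

This deliberately flattens D and V into blocks, so it speaks only to the "overall" question; treating D as a random effect (question 4) would call for a mixed-model tool rather than these rank tests.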

For reporting my per-dataset results, I'm reasonably confident I have a factorial design with repeated measures (right?). It's the combination across datasets that confuses me.

This project is several weeks behind schedule, with the analysis causing the hold-up. I used the time to collect more data, though (hence the increase from 10 to 30 partitions per dataset since my earlier post). Statistical analysis beyond reporting the mean and making some qualitative observations isn't generally done in my field (lame!), so I don't have anybody around to ask these questions.

6. ## Re: What Test To Use? (Urgent)

Great post; it helped me solve my issues too.
