# Thread: Multiple samples of different size and not normally distributed: how to compare/sort?

1. ## The world does not need statisticians. It needs Statistics.

Dason, if you or anyone else try to change my posts again, I will sue you and this forum for calumny, because you are not authorized by law to pretend I said something that I didn't say.
In fact, I'm responsible for the content of any message I write; therefore if you edit them in my place, then I need to safeguard myself.

If you believe my posts are inappropriate, then remove them or put censorship on them and ban me.
In any case, you are not allowed to edit them according to your menstrual/hormonal opinion and feelings.

2. ## Re: Multiple samples of different size and not normally distributed: how to compare/s

A non-parametric test would be appropriate. I think you should use the Kruskal-Wallis test, since you are dealing with more than two samples; the Mann-Whitney test may be simpler if you only want to compare two samples. To a certain extent, differences in sample size are not restrictive in these tests.
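To make the suggestion concrete, here is a minimal sketch of the Kruskal-Wallis H statistic for groups of unequal size. The data are made up for illustration, the function names are my own, and the version below applies no tie correction. The p-value uses the chi-square approximation, which is rough for groups this small, so treat it as a sketch rather than a replacement for your statistics package.

```python
import math
from itertools import chain


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction) for a list of groups."""
    pooled = list(chain.from_iterable(groups))
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)


# Hypothetical data: three groups with 4, 3 and 5 observations.
groups = [[1.2, 3.4, 2.2, 5.1], [4.8, 6.3, 7.7], [2.9, 8.4, 9.0, 7.1, 6.6]]
h = kruskal_wallis_h(groups)

# With 3 groups, H is approximately chi-square with 2 degrees of freedom,
# and the upper tail of that distribution is exp(-H / 2).
p_value = math.exp(-h / 2)
print(f"H = {h:.3f}, approximate p = {p_value:.3f}")
```

Note that the group sizes enter only through the rank sums, which is why unequal sample sizes are not a problem here.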

4. ## Re: Multiple samples of different size and not normally distributed: how to compare/s

How many variables and how many observations do you have?

Do you have any grouping variable or independent variable whose influence you want to check?

(In chemistry and statistics the term “sample” has different meanings, so it is difficult to understand.)

7. ## Re: Multiple samples of different size and not normally distributed: how to compare/s

I am sorry, I don’t understand much. Nomenclature differs between subjects.

In “software engineering”, do you calculate your result from a complex computer model, so that if you repeated the “experiment” you would get exactly the same result?

Your “A, B and C”, which I would call variables: are they experimental settings that can take a “bad” value, a middle value and a “good” value? If each has three levels, like 1, 2, 3, then the experiments could be coded like this:

Code:
```
A B C
1 1 1
1 1 2
1 1 3
1 2 1
1 2 2
1 2 3
1 3 1
…
```
and so on 3*3*3=27 combinations.
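If you want the full list rather than the first few rows, a short sketch like this generates it. It assumes exactly three factors with levels 1, 2, 3, as in the table above; the variable names are my own.

```python
from itertools import product

levels = (1, 2, 3)

# Every combination of the three factor levels, in the same order as the
# table above: C changes fastest, A slowest.
design = list(product(levels, repeat=3))

print("A B C")
for a, b, c in design:
    print(a, b, c)
```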

The D variable is what you calculate or measure: your response variable (= dependent variable).

Parameters in statistics are things like mu and sigma for a population's mean and standard deviation, and alpha and beta in linear regression. Let us use these terms here. Please don’t talk about samples. That obviously just causes confusion. Let us say number of observations.

10. ## Re: Multiple samples of different size and not normally distributed: how to compare/s

Okay, now we have sorted out that you have a number of independent variables (= experimental factors) A, B, C (and maybe even more factors).

You also have a response variable D, which measures “the quality or other characteristics”. Let us call that one Y1, and suppose it is “speed of program execution in seconds”. But you might have other attributes, like Y2: “ease of understanding the user interface for the consumer”, and Y3: “ease of entering data in the software”, and more dependent variables.

So you have several dependent variables Y1, Y2, Y3 and three experimental factors A, B and C.

But you have made many experiments, and they should not all be evaluated together.

Suppose that you had done a bacterial growth experiment (Y1) on sausage with experimental factors A (temperature), B (saltiness) and C (pH). Then you can run that and just evaluate the result. But suppose that you have also done experiments on bacterial growth Y1 in apple juice with factors A, B, C.

Then you cannot join these two experiments, because they are two different biological systems, although they have the same Y1 and A, B, C. They have to be evaluated separately.

In your case you need to separate the different “groups” of experiments into “sausage” and “apple” groups.

Then you run each one in an analysis of variance (anova) with A, B, C as independent factors, one dependent variable at a time: first Y1, then another anova with Y2, and so on.

A, B and C don’t need to be normally distributed, as has been said on this site a million times. (Search for residuals, normal.)

It is the dependent variable Y1 that, conditionally on A, B, C, needs to be normally distributed, or to follow some other known distribution. (Another way to say this is that the residuals should be normal.)

Ignore Duncan's test! It is invalid anyway! It is from the 1950s, when they had no clear idea of what was meant by multiple inference. Tukey's HSD is good, but I think you should ignore multiple inference altogether. It just confuses you. Use the standard significance test from the anova. If the p-value is less than 0.05 then it is statistically significant.

[Multiple inference is used to different degrees in different sectors. In advanced epidemiology it is not used at all. So by ignoring it you are in good company.]

You should check whether Y1 and Y2 are approximately normally distributed. If not, do a transformation, like taking log(Y1) or sqrt(Y2) (square root).
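As a rough illustration of what a transformation does, the sketch below compares the sample skewness of a right-skewed variable before and after sqrt and log. The numbers are invented for the example, and skewness is only one crude indicator; a histogram or a normal probability plot of the residuals is still worth looking at.

```python
import math


def skewness(xs):
    """Sample skewness: third central moment over the 1.5 power of the second."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


# Hypothetical right-skewed response, e.g. execution times in seconds.
y1 = [0.5, 0.7, 0.9, 1.1, 1.4, 1.8, 2.5, 3.9, 7.2, 15.0]

sqrt_y1 = [math.sqrt(y) for y in y1]
log_y1 = [math.log(y) for y in y1]  # requires all values to be positive

for name, values in [("raw", y1), ("sqrt", sqrt_y1), ("log", log_y1)]:
    print(f"{name:>4}: skewness = {skewness(values):.2f}")
```

For these made-up numbers the log pulls in the long right tail more strongly than the square root does, which is the usual pattern for strongly skewed positive data.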

Sorry for writing so long.

13. ## Re: Multiple samples of different size and not normally distributed: how to compare/s

I don't like this forum and erased the content of this post.

14. ## Re: Multiple samples of different size and not normally distributed: how to compare/s

Think again about whether you should split them up into different sausage and apple groups. (I think that) you can’t throw them all in one basket. This may be more sophisticated than you think. In the beginning you talked about many different samples and lots of things….

You don’t need to recode the factors. The software takes care of that. Just declare A, B and C as factors (or whatever they are called in SPSS).

Since you have three factors you have a three-way anova, not one-way.

Since every factor takes only 2 levels, a significant factor means that there is a significant difference between “level 0” and “level 1”.

As training, first fit some simple models to learn the stuff.

Take some very simple data, like 16 observations where you have exactly 2 observations in each cell of the 8 = 2**3 combinations (3 factors with 2 levels each give 8 combinations).

Dependent variable Y1. Run a t-test with just factor A as the grouping variable.

Run a one-way anova with the same factor. That will give the same answer: with two groups, the F statistic is the square of the equal-variance t statistic, so the p-value is the same.
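If you want to convince yourself of this before opening the software, here is a small sketch that computes both statistics by hand. The data are invented, the function names are my own, and the t-test is the classical pooled (equal-variance) version.

```python
import math
from statistics import mean


def pooled_t(x, y):
    """Two-sample t statistic with pooled (equal) variance."""
    nx, ny = len(x), len(y)
    mx, my = mean(x), mean(y)
    ss = sum((v - mx) ** 2 for v in x) + sum((v - my) ** 2 for v in y)
    pooled_var = ss / (nx + ny - 2)
    return (mx - my) / math.sqrt(pooled_var * (1 / nx + 1 / ny))


def one_way_f(groups):
    """F statistic of a one-way anova for a list of groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical Y1 values, split by the two levels of factor A.
y_a0 = [12.1, 11.4, 13.0, 12.7, 11.9, 12.3, 13.4, 11.8]
y_a1 = [14.2, 13.1, 15.0, 13.8, 14.6, 12.9, 14.9, 13.5]

t = pooled_t(y_a0, y_a1)
f = one_way_f([y_a0, y_a1])
print(f"t = {t:.4f}, t squared = {t * t:.4f}, F = {f:.4f}")
```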

When you understand this and feel comfortable with it, go on with the real stuff:

Run model Y1 = A + B + C (main effects only).
Run model Y1 = A + B + C + A*B + A*C + B*C (main effects with 2fi).
Run model Y1 = A + B + C + A*B + A*C + B*C + A*B*C (all effects).

2fi = two-factor interaction.
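To see what the terms in these models are, here is a sketch for a balanced 2**3 design with 2 observations per cell. Everything in it is an assumption for illustration: the data are generated from a formula I made up, and the levels are coded -1/+1 instead of 0/1. With that coding and a balanced design the model columns are orthogonal, so each least-squares coefficient is simply the sum of (column times Y1) divided by the number of observations.

```python
from itertools import product

# Hypothetical data: true A, B and A*B effects, no C effect, and a noise
# term of +0.1 / -0.1 that cancels within each cell.
rows = []
for a, b, c in product((-1, 1), repeat=3):
    for noise in (0.1, -0.1):
        y1 = 10 + 2.0 * a + 1.0 * b + 0.5 * a * b + noise
        rows.append((a, b, c, y1))

# One column per term of the full model.
terms = {
    "A": lambda a, b, c: a,
    "B": lambda a, b, c: b,
    "C": lambda a, b, c: c,
    "A*B": lambda a, b, c: a * b,
    "A*C": lambda a, b, c: a * c,
    "B*C": lambda a, b, c: b * c,
    "A*B*C": lambda a, b, c: a * b * c,
}


def coefficient(term):
    """Least-squares coefficient of one term in a balanced -1/+1 design."""
    return sum(term(a, b, c) * y1 for a, b, c, y1 in rows) / len(rows)


estimates = {name: coefficient(term) for name, term in terms.items()}
for name, value in estimates.items():
    print(f"{name:>5}: {value:+.3f}")
```

Because the columns are orthogonal here, the main-effect estimates do not change when you add the interactions, so all three models give the same numbers for A, B and C. With -1/+1 coding the difference between the two levels of a factor is twice its coefficient. In an unbalanced design this shortcut no longer holds and you need a proper regression or anova routine.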

Skip the Tukey tests!

Yes, anova is based on the normal distribution. But it is fairly robust to deviations from normality. Look at a histogram of the residuals.
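A minimal sketch of that check, assuming the model with all interactions, where the fitted value of each observation is just the mean of its own cell. The Y1 values are invented and the bin width of 0.25 is arbitrary.

```python
import math
from collections import Counter
from statistics import mean

# Hypothetical data: (A, B, C) cell -> two observations of Y1.
data = {
    (0, 0, 0): [9.8, 10.4],
    (0, 0, 1): [10.1, 9.7],
    (0, 1, 0): [11.9, 12.3],
    (0, 1, 1): [12.0, 12.6],
    (1, 0, 0): [13.7, 14.5],
    (1, 0, 1): [14.2, 13.6],
    (1, 1, 0): [16.3, 15.9],
    (1, 1, 1): [15.8, 16.6],
}

# Residual = observation minus the mean of its own cell.
residuals = [y - mean(ys) for ys in data.values() for y in ys]

# Crude text histogram: count residuals per bin of the chosen width.
width = 0.25
bins = Counter(math.floor(r / width) for r in residuals)
for b in sorted(bins):
    low, high = b * width, (b + 1) * width
    print(f"{low:+.2f} to {high:+.2f}: {'#' * bins[b]}")
```

With only 2 observations per cell the residuals come in plus/minus pairs, so the histogram is symmetric by construction. On real data with more observations per cell, or for a model without all the interactions, the picture is more informative.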

It is better if it is balanced, i.e. the same number of observations in each cell, but it can still be estimated for most other configurations.

You asked: “I thought of using an independent-samples t-test on "A=1,B=1,C=1" vs "all remaining observations" with null hypothesis mu_111<=mu_remaining. Does it work?”
No.

Run the above and you will sort this one out too.

17. ## Re: Multiple samples of different size and not normally distributed: how to compare/s

You have a simple and easily interpretable situation: only the main effects are significant. Rerun the model with only those.

Run different models and you will understand this.

Get SPSS to print the parameter estimates; then it will be clearer. Start with the t-test and so on. There are many, many places where you can run anova in SPSS. Try several!

20. ## Re: Multiple samples of different size and not normally distributed: how to compare/s

And what was significant here and what was not?

Can you get the parameter estimates, or the size of the effect of each factor?
