[Prism 5] Gather or Individual Analysis? ANOVA or Kruskal?

#1
To whom it may help:

My research is on cellular/molecular responses after pharmacological administration of hormones, and I'm not quite sure how to perform the statistical analysis.
I have 15 test groups plus my control group (DMEM) and I did my experiments two different times (2 occasions) in triplicate looking for cell proliferation.
As you know, with three replicates, if I try to analyze my data for each occasion separately, I won't have enough samples to do normality tests. That situation forced me to do the Kruskal-Wallis non-parametric test. However, the results are far from what we can actually see in the images and graphs.
I tried to analyze each occasion using a parametric one-way ANOVA and it was a catastrophe. Some results of the first occasion don't match the second one.
Well, as I compare my results based on fold-change, I also tried pooling the results of both occasions. Now, since I'm using 6 different replicates per group, it's possible to perform a normality test: I used the Kolmogorov-Smirnov normality test, then ran a one-way ANOVA and compared all groups with Tukey's post-test. The results were now much better. Nevertheless, I would like to know if this is the best way to analyze my data. Also, is there any advice on how I should analyze my data?

Best regards !
 
#4
I never understood this post.

I have 15 test groups plus my control group (DMEM) and I did my experiments two different times (2 occasions) in triplicate looking for cell proliferation.
I have 15 test groups plus my control group
This suggests to me that there were 16 treatment groups.

and I did my experiments two different times (2 occasions)
Was it a repeated measurement, with the same unit measured twice?

Or were there 3 units that could be randomized to each of the 16 experimental conditions, so that there were 48 = 3*16 units?

As you know, with three replicates, .... I won't have enough samples to do normality tests.
I didn't know that, so I have done that many times.
(...by examining the residuals, not the data itself.)

That situation forced me to do Kruskal-Wallis
You are not forced to do anything, and ANOVA and Kruskal-Wallis are not the only alternatives.

... one-way ANOVA and it was a catastrophe.
That has not happened to me. I hope no-one was hurt.

Some results of the first occasion don't match the second one.
But that has happened to me. Many times! It means that either the data is wrong (not so likely) or that reality is more complicated than we had believed.

Well, as I compare my results based on fold-change
I don't know what that is.

Now, since I'm using 6 different replicates per group,
Before it was 3 or 2. If the number 6 comes from using several measurements of the same unit, it is pseudo-replication.

Nevertheless, I would like to know if this is the best way to analyze my data.
There is (in practice) no "best way"! Maybe it is good, but it is impossible to say that it is the "best".

Sorry to bother.
Sorry for not understanding!
 
#5
"This suggest to me that there was 16 treatment groups"
-Yes.

"Was it repeated measurement? the same unit measured twice?"
-I did my experiments once in June and then ran the same experiment again in July.

"were there 3 units that could be randomized, to each of the 16 experimental conditions, so that there was 48 =3*16 units?"
That's right. I did exactly what you wrote. I have 16 groups. Each group in triplicate = 48 samples.

But how can I compare the residuals? Actually, what are residuals? Mean residuals? Variance residuals? How can I do it in GraphPad Prism 5? Please help.
And the 6 comes from: 3 units from the first occasion (June) and 3 units from the second (July).

Best regards !
 
#6
“My research is on cellular/molecular responses after pharmacological administration of hormones”
What can be the unit of investigation here? I will assume that it is individual patients. What is measured as the response variable?

“I did my experiments once in June and then the same experiment in July.”
Did you take 48 patients in June and randomize them to the 16 treatments, and then take 48 new patients and randomize them to the 16 treatments? (Then it is a replicated experiment in two blocks.) Or did you measure the same patients again in July as in June?

Of course it is not just about how many numbers you have (the 2*48). It would be different if you had taken 16 patients and measured them 3 times in June and then 3 times again in July. You would then have 2*48 numbers, but still just 16 experimental units. I believe that it is very common for researchers to do that kind of pseudo-replication. Of course the interpretation will be different with 16 experimental units as compared to 96.

By the way, did you formally randomize (with random numbers) to the 16 treatments? In my opinion, it is an experiment if you formally randomize. Otherwise it is just an observational study.

And, if you want to compare all the treatments with the control, then it is good to take an extra large sample in the control, since everything is compared to that. The rule is (I hope I remember this correctly) to take the square root of the number of treatments and use that many times as many replicates in the control. The square root of 16 is 4, so you should have 4 times as many replicates in the control as you have in each of the treatment cells.
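The square-root rule above can be sketched in a few lines. This is only an illustration: the function name is made up here, and the rule as usually stated counts only the treatments that are compared against the control, so with 15 test groups the factor would be the square root of 15 (about 3.9) rather than exactly 4.

```python
import math

def control_replicates(n_per_treatment, n_treatments):
    """Square-root rule: control size = replicates per treatment * sqrt(number of treatments)."""
    return n_per_treatment * math.sqrt(n_treatments)

# 15 test groups with 3 wells each, all compared against one control
print(round(control_replicates(3, 15), 1))  # about 11.6, so roughly 12 control wells
```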

“But how can I compare the residuals?”
This is a good moment to study an elementary statistics book. The residual is the measured value minus the “predicted” value. In this case: residual = measured value - mean value (in that treatment group). Any statistical program can do that. Do the standard diagnostic checks that are mentioned in the books. (For example, look at the residuals to see whether they are normally distributed. But skip the Kolmogorov-Smirnov test; that will just confuse you. Look at a histogram instead. Does it look normal? Then it is OK.)
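A minimal sketch of that residual calculation, using only the Python standard library. The group names and readings are made up purely for illustration; they are not data from this thread.

```python
from statistics import mean

# Made-up proliferation readings, three wells per group (illustration only)
data = {
    "control": [1.00, 1.10, 0.90],
    "drug_A": [0.85, 0.95, 0.87],
    "drug_B": [1.40, 1.55, 1.52],
}

def residuals(groups):
    """Return each value minus the mean of its own treatment group."""
    return {name: [v - mean(values) for v in values] for name, values in groups.items()}

res = residuals(data)
# Pool the residuals from all groups before looking at their distribution
pooled = [r for group in res.values() for r in group]
print(len(pooled))
```

The point of pooling is that normality is judged on all residuals together, not on three values per group.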

Plot the June values (the 48) with the treatment group on the x-axis. Do they look strange? Is there any outlier? Generate 48 random numbers and plot them in the same way, to get some feeling for how random numbers can behave. Then do that again and again (say ten times) to experience randomness. I believe that a person with good common sense but limited statistical knowledge should avoid many of the formal statistical tests, and instead do things like this that are obvious and make sense to the person.
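A rough sketch of that exercise, assuming normally distributed noise and printing a text summary instead of drawing a plot. The function name and the choice of mean 0 and SD 1 are arbitrary assumptions for the illustration.

```python
import random

def simulate_plate(n_groups=16, n_reps=3, seed=None):
    """Draw pure-noise values: every group has the same true mean (0) and SD (1)."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(n_reps)] for _ in range(n_groups)]

# Repeat ten times and see how far apart the group means drift by chance alone
for run in range(10):
    groups = simulate_plate(seed=run)
    means = [sum(g) / len(g) for g in groups]
    print(f"run {run}: lowest group mean {min(means):+.2f}, highest {max(means):+.2f}")
```

With only three values per group, the gap between the lowest and highest group mean can look large even though no treatment effect exists in the simulation.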


Maybe the results were different in June and July, so that they are not comparable.

I would just present the results as means for each of the 16 groups with a confidence interval. You could do bar charts with “error lines” for the confidence interval. (So I suggest you skip the Tukey HSD test. It just confuses you.)
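One way to compute such an interval for a single group is sketched below. It assumes each group uses its own standard deviation (a pooled standard deviation across groups is another option), hard-codes the standard two-sided 95% critical values of Student's t for small samples, and uses a made-up triplicate.

```python
from math import sqrt
from statistics import mean, stdev

# Two-sided 95% critical values of Student's t, keyed by degrees of freedom
T_CRIT_95 = {2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571}

def mean_ci95(values):
    """Return (mean, lower, upper) of a 95% confidence interval for the mean."""
    n = len(values)
    m = mean(values)
    half_width = T_CRIT_95[n - 1] * stdev(values) / sqrt(n)
    return m, m - half_width, m + half_width

m, lo, hi = mean_ci95([0.85, 0.95, 0.87])  # made-up triplicate
print(f"{m:.2f} [{lo:.2f}, {hi:.2f}]")
```

With three replicates the critical value is 4.303, so the intervals are wide; that width is an honest picture of how little three wells can tell you.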
 
#7
I think this will be easier to understand
- http://postimg.org/image/pdnxa2xl1/
I normalized my data by computing the "fold-change", that is, how much lower or higher a group is relative to my control. In this specific case, group X is my control (mean = 1). Groups Y and Z are my treatments (Y mean = 0.89 and Z mean = 1.49). That means group Y is 11% lower than my control and group Z is 49% higher. Now, about the residuals in this picture, are they correct? Residual = unit value - mean value? And how can I do the normality test with 3 samples (3 residuals)? Please name at least one test so I can search and learn about it.
Thank you very much !
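For readers unfamiliar with the term, the fold-change normalization described in this post can be sketched as follows. The raw readings are made up for illustration and were chosen so that the treated group comes out at 0.89, like group Y above.

```python
from statistics import mean

def fold_change(values, control_values):
    """Divide each value by the mean of the control group."""
    control_mean = mean(control_values)
    return [v / control_mean for v in values]

# Made-up raw readings (illustration only)
control = [200.0, 210.0, 190.0]
treated = [170.0, 185.0, 179.0]

fc = fold_change(treated, control)
percent_change = (mean(fc) - 1) * 100
print(f"mean fold-change {mean(fc):.2f}, i.e. {percent_change:+.0f}% vs control")
```

By construction the control group itself always has a mean fold-change of exactly 1.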
 

Dason

Ambassador to the humans
#10
The method "fold-change" seems to be strange. I would recommend not to use it.

Maybe this site can be of use. Or this or this.
I'll be honest, I haven't read much of the thread, but fold-change is a common way to talk about things in certain fields (RNA-sequencing experiments are what I'm thinking of), so don't distrust it straight away.
 
#11
This is a good moment to study an elementary statistics book.
Yes, this seems to be a good moment for me to study the elementary part of "fold-change".

But for a user who is not familiar with what "residuals" are, I would (I have changed my mind a little) not recommend using it.

To me it seems more natural to suggest good old analysis of variance (ANOVA).

ANOVA uses the data on an additive scale. It seems that fold-change uses them on a multiplicative scale, since it is a ratio. But if the data show non-constant variance, they can be log-transformed, as is often done with fold-change. If the variance is constant, they can remain on the additive scale. It seems to be an empirical issue. Otherwise take logs. Then, I guess that the result will be the same as or similar to fold-change.

A few Google searches say that fold-change works well for gene expression data, with many genes and few treatments. Here there are 16 treatments.
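The link between the two scales is that the logarithm of a ratio is the difference of the logarithms, so a fold-change becomes an ordinary additive difference after taking logs. A small sketch, where base 2 is an arbitrary choice and the function name is made up:

```python
import math

def log2_fold_change(treated_mean, control_mean):
    """log2 of the ratio, which equals the difference of the two log2 values."""
    return math.log2(treated_mean / control_mean)

# A ratio of 1.49 and its inverse are symmetric around 0 on the log scale
up = log2_fold_change(1.49, 1.0)
down = log2_fold_change(1.0, 1.49)
print(round(up, 3), round(down, 3))
```

On the raw ratio scale "49% higher" and "49% lower" are not mirror images of each other, which is one reason the log scale is often preferred for ratios.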

My research is on cellular/molecular responses after pharmacological administration of hormones....
What was measured?
 
#12
I put some drugs on fibroblasts and epithelial cells cultured in 24-well plates in triplicate (3 units = 3 wells) and evaluate proliferation, migration, and metabolic activity. Fold-change data are really useful and make it easier to see the difference between the control group and the test groups. However, I don't draw any conclusions from this mathematical comparison alone. Instead, I transform my raw data to fold-change data, and if I have 5 replicates per group I run the Kolmogorov-Smirnov normality test, for which 5 units is the minimum; afterwards, depending on the normality results, I perform a parametric or non-parametric analysis. Generally, one-way ANOVA or Kruskal-Wallis suits my experiments. Now, you said some things that caught my attention: 1. Look at the residuals and 2. Compare them. As you said, the residuals are my raw replicate (raw unit) data minus the mean of the raw data. But how? Either way, I'll still have 3 units, for which it is hard to assume a Gaussian distribution. So how do you do that kind of analysis?
 
#13
I don't understand this. Either you use fold-change and draw conclusions from it (and therefore evaluate the randomness in the fold-change), or you don't use it. You can't derive one quantity and make statements about it and then investigate the randomness in another quantity, as I understand it.

I am too lazy to write a textbook here about residuals. You will find it in any textbook and lots on the internet. You can use all the 48 residuals (although only with 32 degrees of freedom). I would prefer a QQ-plot (quantile-quantile plot) to Kolmogorov-Smirnov.
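The 32 comes from 48 values minus 16 estimated group means. The coordinates of a normal QQ-plot can be computed with the Python standard library alone, as in the sketch below; the plotting positions (i - 0.5) / n are one common convention among several, and the residuals are made up for illustration.

```python
from statistics import NormalDist

def qq_points(residuals):
    """Pair each sorted residual with the standard-normal quantile expected at its rank."""
    n = len(residuals)
    ordered = sorted(residuals)
    # Plotting positions (i - 0.5) / n, one common convention among several
    theoretical = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))

n_values, n_groups = 48, 16
print(n_values - n_groups)  # residual degrees of freedom: 32

for q, r in qq_points([-0.04, 0.06, -0.02, 0.10, -0.10, 0.00]):  # made-up residuals
    print(f"{q:+.2f}  {r:+.2f}")
```

If the points fall roughly on a straight line when the second column is plotted against the first, the normality assumption looks reasonable.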

(The 24-well plates seem strange to me now. The numbers mentioned before were 16, 3 and 2.)
 
#14
Dude, please... I do my experiments in 24-well plates in triplicate = 8 groups per plate. With one plate I put 8 different drugs on my fibroblasts. Since I have 16 groups, I use 2 plates. Anyway... whatever. I'm not a statistics specialist, and it looks like neither are you, since you're a user who is not familiar with fold-change data. I really appreciate your attempt to understand my experiment and answer my questions, but I think the next time someone asks for help you should try not to answer in such a coarse way. Thanks anyway.