I am doing a very simple assay to determine if biofilms are reduced when treated with a compound.
Every time I run the assay I run an untreated control. The biofilms are measured in 9 different places per sample in each assay. The total amount of biofilm each day is highly variable, so we wanted to normalize the data each day to the untreated control.
I know very little about statistics, but after talking to some people, normalizing seems to be unnecessary. It also presents the problem of showing graphs where the control has no error bars.
I want to do the correct statistics. I thought a Friedman test was the correct one, but after reading about it I am not sure any more. I want something that will allow me to take into account the difference between the measured biofilm, treated vs. untreated, each day (I believe this is called paired) and give me more statistical power with a low n, because it is fairly time consuming to run each assay. I typically hope to run only 3 assays, for an n of 3.
I can assume that if the treatment works the treated will be lower than the untreated.
The other question I have is what types of manipulation of the data are OK. For instance, I know I cannot take the 9 measurements from each day and use them as separate samples, increasing my n from 3 to 27. But could I add all 9 measurements from a day to get a total instead of averaging them, which is what I do now?
Sample data would look like this.
4 separate assays.
Untreated ; Treated
15 ; 7
12 ; 5
30 ; 15
16 ; 9
There are a couple of ways you could do this, depending on the questions you want answered. Or you could do them all and be a badass. First off, have you looked at the distribution? Do you know that your data aren't normal? Run a Shapiro-Wilk test (I don't know what software you are using); a significant p value there tells you that you are violating the normality assumption.
Are these data independent from each other? I.e., does the amount in one untreated sample affect the others? If they are independent, then the appropriate test would be a simple ANOVA comparing untreated and treated. This would tell you if your treatment is significantly different from your control. However, ANOVA needs the data to be (mostly) normally distributed. Common transformations for biological data are arcsine and log. Look at the distributions and Shapiro-Wilk results for several transformations to see if one clearly does better than the others.
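For the log transformation in particular, here is a minimal sketch using the sample numbers from the question. It assumes one reading of "log transform" for day-matched data: take the log of treated/untreated per day, which is the same as the difference of the logs and amounts to normalizing each day to its own control. The variable names are made up for illustration.

```python
from math import exp, log
from statistics import mean, stdev

# Sample data from the question: one value per assay day.
untreated = [15, 12, 30, 16]
treated = [7, 5, 15, 9]

# log(treated / untreated) == log(treated) - log(untreated), so the
# log transform turns "normalize to the control" into a plain difference.
log_ratios = [log(t / u) for u, t in zip(untreated, treated)]

# Back-transform the mean log ratio to get a geometric-mean fold change.
geo_mean_ratio = exp(mean(log_ratios))

print("log ratios:", [round(r, 3) for r in log_ratios])
print("spread of log ratios (SD):", round(stdev(log_ratios), 3))
print("geometric mean treated/untreated:", round(geo_mean_ratio, 3))
```

A negative log ratio means the treated sample had less biofilm than its control on that day. Whether the log scale is actually a better fit than the raw scale is something to check against your own distributions, not something four numbers can settle.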
A Friedman test would be used if you had a blocking factor. A block is a factor of your experiment whose levels may differ slightly from each other, but where you don't expect enough variance to treat it as a separate factor. In my research, I cut trees and look at their moisture. I expect trees from different areas of the state to be different, but not different enough that I would need to make 'area' its own factor. So 'area' would be my blocking factor. In these cases, you perform your regular ANOVA and look for significance in your block. If it is significant, then you need to treat that factor separately.
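One thing worth checking before settling on a rank-based test with so few assays: with only two conditions (untreated and treated), a Friedman test essentially reduces to a sign test on the paired days, so the smallest p value it can give is capped by the number of assays. The sketch below computes the exact one-sided sign-test p value for the sample data in the question. The function name `sign_test_one_sided` is made up for illustration, and dropping tied days is one common convention, assumed here.

```python
from math import comb

# Sample data from the question: one value per assay day.
untreated = [15, 12, 30, 16]
treated = [7, 5, 15, 9]

def sign_test_one_sided(control, treatment):
    """Exact one-sided sign test.

    Returns P(at least as many days with treatment < control as observed)
    under the null hypothesis that either direction is equally likely.
    Tied days are dropped (an assumed convention).
    """
    diffs = [c - t for c, t in zip(control, treatment) if c != t]
    n = len(diffs)
    k = sum(d > 0 for d in diffs)  # days where treated was lower
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n

p_four = sign_test_one_sided(untreated, treated)            # all 4 assays
p_three = sign_test_one_sided(untreated[:3], treated[:3])   # first 3 only

print("4 assays, all treated lower: p =", p_four)
print("3 assays, all treated lower: p =", p_three)
```

Even when every day goes in the expected direction, the p value is 1/16 with 4 assays and 1/8 with 3, so a purely sign-based test cannot get below 0.05 at those sample sizes.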
Another thing you could do is a matched pairs test (which you mentioned). In this case, each data point would have 2 responses: untreated and treated. You would still be comparing the two, but you would be doing it per sample (I guess per day in your case?). Your data need to be balanced in this case, i.e., every untreated value needs a matching treated value.
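As a rough sketch of what a matched pairs test does with the sample data from the question, the block below computes a paired t statistic by hand using only the Python standard library. The helper name `paired_t` is made up for illustration, and the critical values are the usual one-sided 5% values from a t table, used here because the question says the treated value is expected to be lower.

```python
from math import sqrt
from statistics import mean, stdev

# Sample data from the question: one value per assay day.
untreated = [15, 12, 30, 16]
treated = [7, 5, 15, 9]

def paired_t(control, treatment):
    """Paired t statistic on the per-day differences (control - treatment)."""
    diffs = [c - t for c, t in zip(control, treatment)]
    n = len(diffs)
    t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t_stat, n - 1  # statistic and degrees of freedom

t_stat, df = paired_t(untreated, treated)

# One-sided 5% critical values of Student's t (standard table values).
T_CRIT_ONE_SIDED_05 = {2: 2.920, 3: 2.353}
significant = t_stat > T_CRIT_ONE_SIDED_05[df]

print(f"t = {t_stat:.2f} on {df} degrees of freedom")
print("exceeds one-sided 5% critical value:", significant)
```

If SciPy is available, `scipy.stats.ttest_rel(untreated, treated, alternative="greater")` should give the same statistic together with a p value. Note that the test only uses the per-day differences, so day-to-day swings in total biofilm drop out.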
As for your question about averaging the measurements or treating them separately: treating them as separate samples would be a case of pseudoreplication. You might think of nesting here. Nesting is done when you have a sample label that shows up many times but doesn't refer to the same thing each time... that sounds really confusing.
Let me give you an example. You have 5 cows that each give birth to 5 calves. Each cow's calves are numbered 1-5, but calf #1 from one cow is different from calf #1 from another cow. For you it would be sample 1 from day 1 vs. sample 1 from day 2. Am I correct here? I may have interpreted your question wrong.
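On the specific question of summing the 9 measurements instead of averaging them: as long as every day has exactly 9 measurements, the total is just 9 times the mean, and a paired t statistic does not change when all values are multiplied by a constant. The sketch below illustrates this with simulated numbers. They are made up with a fixed random seed and are NOT real biofilm data; `fake_day` and `paired_t` are hypothetical helper names.

```python
import random
from math import isclose, sqrt
from statistics import mean, stdev

rng = random.Random(0)  # fixed seed; simulated values, not real measurements

def fake_day(center):
    """Nine simulated measurements scattered around a made-up day-level value."""
    return [rng.gauss(center, 2.0) for _ in range(9)]

# Three assay days, nine measurements per sample per day.
untreated_days = [fake_day(c) for c in (15, 12, 30)]
treated_days = [fake_day(c) for c in (7, 5, 15)]

def paired_t(control, treatment):
    """Paired t statistic on the per-day differences."""
    diffs = [c - t for c, t in zip(control, treatment)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

# Collapse each day to ONE number per sample, either by mean or by sum.
t_from_means = paired_t([mean(d) for d in untreated_days],
                        [mean(d) for d in treated_days])
t_from_sums = paired_t([sum(d) for d in untreated_days],
                       [sum(d) for d in treated_days])

n_units = len(untreated_days)  # the day is the unit of replication: 3, not 27

print("t from daily means:", round(t_from_means, 4))
print("t from daily sums: ", round(t_from_sums, 4))
print("same statistic:", isclose(t_from_means, t_from_sums, rel_tol=1e-9))
print("number of independent units:", n_units)
```

So summing is harmless but also buys nothing: either way you end up with one number per sample per day, and n stays at the number of days.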
The main thing here is whether you want to pool your variances. You would do this if the variances in your groups were fairly equal (an F-test will tell you this). If your data are normal (or can be transformed to normal) and you have equal variances, you would do an ANOVA. Don't try to make it complicated for its own sake (a lot of people look for ways to complicate their analysis just because it looks fancy).
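For the equal-variance check, a minimal sketch of the variance-ratio F statistic on the sample data from the question is below. It uses only the standard library, so instead of a p value it compares against a two-sided 5% critical value taken from a standard F table for 3 and 3 degrees of freedom. The variable names are made up for illustration.

```python
from statistics import variance

# Sample data from the question: one value per assay day.
untreated = [15, 12, 30, 16]
treated = [7, 5, 15, 9]

var_untreated = variance(untreated)  # sample variance (n - 1 denominator)
var_treated = variance(treated)

# Put the larger variance on top so the ratio is always >= 1.
f_ratio = max(var_untreated, var_treated) / min(var_untreated, var_treated)

# Upper 2.5% point of F(3, 3) from a standard table (two-sided 5% test).
F_CRIT_3_3 = 15.44
variances_differ = f_ratio > F_CRIT_3_3

print(f"variance untreated = {var_untreated:.2f}")
print(f"variance treated   = {var_treated:.2f}")
print(f"F ratio = {f_ratio:.2f}, exceeds critical value: {variances_differ}")
```

With only 4 values per group this test has very little power, so a non-significant result is weak evidence that the variances really are equal.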