[R] Between subjects repeated measures ANOVA help. Level: Novice.

#1
Dear community,

I am fairly new to the field of statistics and R and I apologise if my problem seems to be too basic.

In my research I have performed a series of measurements on 5 different brands of blocks. Each block has been inspected for deformation under incremental forces (20, 30, 40, 50, 60, 70, 80, 90, 100, 110 and 120 N). The deformation for each force was measured 3 times and the mean values were assigned to each brand for a specific amount of force. I was successful in creating linear regression graphs for these 5 different brands.

Now my wish is to see whether a brand makes a significant difference in deformation values and to perform a post-hoc to compare brands among themselves. In other words to compare the linear regression lines. Sorry if what I am saying makes no sense.

So far, I have tried the following commands:

anova(lm(Deformation~Force*Brand, data=Data), lm(Deformation~Force, data=Data))


and

aov.data = aov(Deformation~Force*Brand, Data)

and got suspiciously low P values (***), which suggests that I might be doing something wrong. I would be grateful if you could help me with this issue.



Force Brand Deformation
20 Brand1 0.65
30 Brand1 1.23
40 Brand1 1.25
50 Brand1 2.39
60 Brand1 2.45
70 Brand1 2.93
80 Brand1 3.13
90 Brand1 3.57
100 Brand1 4.68
110 Brand1 4.84
120 Brand1 5.33
20 Brand2 1.24
30 Brand2 1.11
40 Brand2 1.6
50 Brand2 2.13
60 Brand2 2.69
70 Brand2 3.60
80 Brand2 3.90
90 Brand2 3.99
100 Brand2 4.51
110 Brand2 4.74
120 Brand2 5.98
20 Brand3 1.21
30 Brand3 1.37
40 Brand3 2.56
50 Brand3 2.49
60 Brand3 3.17
70 Brand3 3.33
80 Brand3 3.38
90 Brand3 4.2
100 Brand3 4.22
110 Brand3 5.22
120 Brand3 6.28
20 Brand4 0.92
30 Brand4 0.89
40 Brand4 1.2
50 Brand4 1.67
60 Brand4 1.98
70 Brand4 2.25
80 Brand4 3.8
90 Brand4 4.17
100 Brand4 4.94
110 Brand4 5.4
120 Brand4 5.76
20 Brand5 0.69
30 Brand5 1.26
40 Brand5 1.61
50 Brand5 2.17
60 Brand5 2.07
70 Brand5 3.35
80 Brand5 3.27
90 Brand5 4.13
100 Brand5 4.25
110 Brand5 4.59
120 Brand5 5

Thank you.
 
#4
I don't think this problem is “too basic”. On the contrary, at least for me.

When you have a p-value that is really low, it simply means that something is significant.

There is a statistically significant difference between brands (p = 0.012 < 0.05), so it is significant at the 5% level. Force is also significant, with p < 2*10^-16, which is clearly less than 0.05.

And “blah blah”: with force treated as a factor there is just one observation per brand-by-force cell, so the brand-by-force interaction cannot be estimated and tested.
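A minimal sketch of why the interaction cannot be tested, assuming the same 55 values as in the data listing (rebuilt compactly here; the object name `fit_sat` is only illustrative). With one value per cell, the model with the interaction has as many coefficients as observations, so no degrees of freedom are left for the residuals:

```r
# Rebuild the thread's data: 5 brands x 11 forces, one mean deformation per cell
dat <- data.frame(
  Force = rep(seq(20, 120, by = 10), times = 5),
  Brand = rep(paste0("Brand", 1:5), each = 11),
  Deformation = c(
    0.65, 1.23, 1.25, 2.39, 2.45, 2.93, 3.13, 3.57, 4.68, 4.84, 5.33,  # Brand1
    1.24, 1.11, 1.60, 2.13, 2.69, 3.60, 3.90, 3.99, 4.51, 4.74, 5.98,  # Brand2
    1.21, 1.37, 2.56, 2.49, 3.17, 3.33, 3.38, 4.20, 4.22, 5.22, 6.28,  # Brand3
    0.92, 0.89, 1.20, 1.67, 1.98, 2.25, 3.80, 4.17, 4.94, 5.40, 5.76,  # Brand4
    0.69, 1.26, 1.61, 2.17, 2.07, 3.35, 3.27, 4.13, 4.25, 4.59, 5.00   # Brand5
  )
)
dat$brand.f <- factor(dat$Brand)
dat$force.f <- factor(dat$Force)

# Every brand-by-force cell holds exactly one observation
cell_counts <- table(dat$brand.f, dat$force.f)
print(cell_counts)

# Saturated model: 1 + 4 + 10 + 40 = 55 coefficients for 55 observations
fit_sat <- lm(Deformation ~ brand.f * force.f, data = dat)
length(coef(fit_sat))   # number of coefficients
df.residual(fit_sat)    # residual degrees of freedom
```

With zero residual degrees of freedom there is no error variance to test the interaction against, which is why the additive model is used instead.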

Maybe there are some hierarchical structures here. But for the moment I just assume that it was a completely randomized experiment. I am not sure if I am familiar with the model structure trinker suggests. Suggestions about models and evaluation, anybody?


Code:
dat <-  read.table(header=TRUE, text="
Force  Brand	Deformation
20	Brand1	0.65
30	Brand1	1.23
40	Brand1	1.25
50	Brand1	2.39
60	Brand1	2.45
70	Brand1	2.93
80	Brand1	3.13
90	Brand1	3.57
100	Brand1	4.68
110	Brand1	4.84
120	Brand1	5.33
20	Brand2	1.24
30	Brand2	1.11
40	Brand2	1.6
50	Brand2	2.13
60	Brand2	2.69
70	Brand2	3.60
80	Brand2	3.90
90	Brand2	3.99
100	Brand2	4.51
110	Brand2	4.74
120	Brand2	5.98
20	Brand3	1.21
30	Brand3	1.37
40	Brand3	2.56
50	Brand3	2.49
60	Brand3	3.17
70	Brand3	3.33
80	Brand3	3.38
90	Brand3	4.2
100	Brand3	4.22
110	Brand3	5.22
120	Brand3	6.28
20	Brand4	0.92
30	Brand4	0.89
40	Brand4	1.2
50	Brand4	1.67
60	Brand4	1.98
70	Brand4	2.25
80	Brand4	3.8
90	Brand4	4.17
100	Brand4	4.94
110	Brand4	5.4
120	Brand4	5.76
20	Brand5	0.69
30	Brand5	1.26
40	Brand5	1.61
50	Brand5	2.17
60	Brand5	2.07
70	Brand5	3.35
80	Brand5	3.27
90	Brand5	4.13
100	Brand5	4.25
110	Brand5	4.59
120	Brand5	5")




dat$brand.f <- as.factor(dat$Brand)
dat$force.f <- as.factor(dat$Force)

head(dat)

table(dat$brand.f,dat$force.f)

# Saturated model: with one observation per brand-by-force cell, the
# interaction brand.f:force.f cannot be estimated and tested
summary(lm(Deformation ~ brand.f + force.f + brand.f:force.f, data = dat))

# Additive model with brand and force both as factors
summary(lm(Deformation ~ brand.f + force.f, data = dat))
anova(lm(Deformation ~ brand.f + force.f, data = dat))


boxplot(Deformation~ force.f,data=dat)
boxplot(Deformation~ brand.f,data=dat)

# With Force as a numeric covariate instead of a factor
summary(lm(Deformation ~ brand.f + Force, data = dat))

# Scatter plot with one plotting symbol per brand
plot(Deformation ~ Force, pch = as.numeric(dat$brand.f), data = dat)


#I will leave the rest to the rest of you for regression diagnostics and so on.
#####################################################################
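Since the original question was whether the brands' regression lines differ, the covariate model can be extended into a sequence of nested models. This is only a sketch under the assumption that deformation is roughly linear in force and that the 55 values can be treated as independent; the names `fit_force`, `fit_parallel` and `fit_slopes` are illustrative and not from the thread.

```r
# Rebuild the thread's data: 5 brands x 11 forces, one mean deformation per cell
dat <- data.frame(
  Force = rep(seq(20, 120, by = 10), times = 5),
  Brand = rep(paste0("Brand", 1:5), each = 11),
  Deformation = c(
    0.65, 1.23, 1.25, 2.39, 2.45, 2.93, 3.13, 3.57, 4.68, 4.84, 5.33,  # Brand1
    1.24, 1.11, 1.60, 2.13, 2.69, 3.60, 3.90, 3.99, 4.51, 4.74, 5.98,  # Brand2
    1.21, 1.37, 2.56, 2.49, 3.17, 3.33, 3.38, 4.20, 4.22, 5.22, 6.28,  # Brand3
    0.92, 0.89, 1.20, 1.67, 1.98, 2.25, 3.80, 4.17, 4.94, 5.40, 5.76,  # Brand4
    0.69, 1.26, 1.61, 2.17, 2.07, 3.35, 3.27, 4.13, 4.25, 4.59, 5.00   # Brand5
  )
)
dat$brand.f <- factor(dat$Brand)

# One common line for all brands
fit_force <- lm(Deformation ~ Force, data = dat)
# Parallel lines: common slope, brand-specific intercepts
fit_parallel <- lm(Deformation ~ Force + brand.f, data = dat)
# Separate lines: brand-specific intercepts and slopes
fit_slopes <- lm(Deformation ~ Force * brand.f, data = dat)

# Sequential F tests: row 2 asks whether intercepts differ between brands,
# row 3 asks whether slopes differ between brands
cmp <- anova(fit_force, fit_parallel, fit_slopes)
print(cmp)
```

The command in post #1, which compared `Force*Brand` directly against `Force`, tests intercept and slope differences jointly in one step; splitting it into two steps shows which of the two (if either) is responsible.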


 
#6
@trinker! What repeated measure? Where do you see that?

The OP has averaged over three measurements and shown us that average. So the repeated measures have disappeared.

In my reading! :)
 

Jake

Cookie Scientist
#7
I agree with Greta, this is not a repeated measures problem unless Brand is considered to be a random factor (in which case we would have repeated measurements on each random unit, which is what is classically considered a "repeated measures" situation). Since I assume we are not considering Brands to be random here, this analysis seems as straightforward as Greta indicated.
 
#9
Dear repliers,

I may have phrased my problem in the wrong way. By repeated measures I meant the different forces (20 N, 30 N, 40 N, ...). I wanted to compare deformations over the whole series 20 N - 120 N for different brands so that I can say: "brand has a significant influence on deformation in the whole 20 N to 120 N measurement series"; and "Brand2 showed the least deformation, Brand5 showed the most deformation" throughout measurements.

I want to see if brand influences regression lines significantly.

Once more, I am sorry if I said something utterly stupid.:(
 
#10
@Pickle! Didn't you read and run the code I supplied above?

For others: Is there anybody who wishes to go on with the regression diagnostics, to check whether the assumptions are met and so on?

(And it is an utterly natural question.)
 
#11
Thank you Greta for the code, I will run it in about 2 hours when I get home from work. This is my doctoral thesis that I am working on, your help is greatly appreciated. I wish I could buy you a dinner or something for doing this :D

I hope I can ask you for further assistance when I run the code. Unfortunately, my biomedical background makes me an idiot for statistics.
 
#12
I have run the code and am very pleased with the outcome. However, I still have things that remained unclear.
Let me see if I got this well. This is the ANOVA table:

Code:
Analysis of Variance Table

Response: Deformation
          Df  Sum Sq Mean Sq  F value  Pr(>F)    
brand.f    4   1.810  0.4525   3.7116 0.01162 *  
force.f   10 124.076 12.4076 101.7711 < 2e-16 ***
Residuals 40   4.877  0.1219                     
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
So basically brand.f Pr(>F) 0.01162 means that P is 0.01162, which is <.05 and statistically significant. The force significance is something I am not interested in, because it obviously influences deformation. What do residuals mean?
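For readers following along, the brand row of the table can be reproduced by hand from the sums of squares. This is just arithmetic on the rounded numbers printed above (so the last digits may differ slightly from R's output); the variable names are illustrative.

```r
# Numbers copied from the ANOVA table above
ss_brand <- 1.810; df_brand <- 4
ss_resid <- 4.877; df_resid <- 40

# Mean square = sum of squares / degrees of freedom
ms_brand <- ss_brand / df_brand
ms_resid <- ss_resid / df_resid

# F value = mean square of the effect / residual mean square
f_brand <- ms_brand / ms_resid

# P value = upper tail of the F distribution with (4, 40) degrees of freedom
p_brand <- pf(f_brand, df_brand, df_resid, lower.tail = FALSE)

round(c(MeanSq = ms_brand, ResidMeanSq = ms_resid, F = f_brand, p = p_brand), 4)
```

So the Residuals row supplies the yardstick (the residual mean square) against which the brand and force mean squares are compared.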

My second question regards summary( lm (Deformation ~ brand.f + force.f , dat)), which gives me this table:

Code:
Residuals:
     Min       1Q   Median       3Q      Max 
-0.73582 -0.18482 -0.03636  0.23009  0.61764 

Coefficients:
               Estimate Std. Error t value Pr(>|t|)    
(Intercept)    0.787636   0.182346   4.319  0.00010 ***
brand.fBrand2  0.276364   0.148885   1.856  0.07080 .  
brand.fBrand3  0.452727   0.148885   3.041  0.00415 ** 
brand.fBrand4  0.048182   0.148885   0.324  0.74791    
brand.fBrand5 -0.005455   0.148885  -0.037  0.97096    
force.f30      0.230000   0.220832   1.042  0.30389    
force.f40      0.702000   0.220832   3.179  0.00285 ** 
force.f50      1.228000   0.220832   5.561 1.96e-06 ***
force.f60      1.530000   0.220832   6.928 2.36e-08 ***
force.f70      2.150000   0.220832   9.736 4.15e-12 ***
force.f80      2.554000   0.220832  11.565 2.48e-14 ***
force.f90      3.070000   0.220832  13.902  < 2e-16 ***
force.f100     3.578000   0.220832  16.202  < 2e-16 ***
force.f110     4.016000   0.220832  18.186  < 2e-16 ***
force.f120     4.728000   0.220832  21.410  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 0.3492 on 40 degrees of freedom
Multiple R-squared: 0.9627,     Adjusted R-squared: 0.9497 
F-statistic: 73.75 on 14 and 40 DF,  p-value: < 2.2e-16
This table is rather puzzling. What does it mean? Why is Brand1 not listed at all?
 
#13
I am glad that you are not working for the KGB or the CIA, in that you dare to show us your data. Most posters here seem to deal with government secrets in that they don't dare to show even a small example of their data. Then it is difficult to suggest or advise.

What kind of material is it that is deformed?

If you Edit your above post with the computer printout and highlight it and click on the #-symbol (the code symbol) I believe that it will be easier to read (in courier new) in straight columns.

If you are not familiar with “residuals” you need to study linear regression more. Look at any elementary textbook and search the web. Essentially, a residual is the observed value minus the predicted value.
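As a small illustration of that definition, here is a sketch that computes the residuals by hand for the additive model and compares them with what R returns. It assumes the data from post #1 (rebuilt compactly); `res_by_hand` and `rss` are illustrative names.

```r
# Rebuild the thread's data: 5 brands x 11 forces, one mean deformation per cell
dat <- data.frame(
  Force = rep(seq(20, 120, by = 10), times = 5),
  Brand = rep(paste0("Brand", 1:5), each = 11),
  Deformation = c(
    0.65, 1.23, 1.25, 2.39, 2.45, 2.93, 3.13, 3.57, 4.68, 4.84, 5.33,  # Brand1
    1.24, 1.11, 1.60, 2.13, 2.69, 3.60, 3.90, 3.99, 4.51, 4.74, 5.98,  # Brand2
    1.21, 1.37, 2.56, 2.49, 3.17, 3.33, 3.38, 4.20, 4.22, 5.22, 6.28,  # Brand3
    0.92, 0.89, 1.20, 1.67, 1.98, 2.25, 3.80, 4.17, 4.94, 5.40, 5.76,  # Brand4
    0.69, 1.26, 1.61, 2.17, 2.07, 3.35, 3.27, 4.13, 4.25, 4.59, 5.00   # Brand5
  )
)
dat$brand.f <- factor(dat$Brand)
dat$force.f <- factor(dat$Force)

fit <- lm(Deformation ~ brand.f + force.f, data = dat)

# Residual = observed value - value predicted by the model
res_by_hand <- dat$Deformation - fitted(fit)
all.equal(unname(res_by_hand), unname(resid(fit)))

# Their sum of squares is the "Residuals" Sum Sq in the ANOVA table
rss <- sum(resid(fit)^2)
rss
```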


Brand1 and the lowest level of force (20 N) are not listed as separate estimates because they are absorbed into the “intercept”. They act as the reference levels, and every other coefficient is a difference from them. (Look around. That has been discussed here at the forum in the last few days.)
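A short sketch of what "reference level" means in practice, assuming R's default treatment contrasts and the data from post #1. The new column `brand.ref3` and the objects `ref_cell` and `fit3` are hypothetical names introduced only for this illustration.

```r
# Rebuild the thread's data: 5 brands x 11 forces, one mean deformation per cell
dat <- data.frame(
  Force = rep(seq(20, 120, by = 10), times = 5),
  Brand = rep(paste0("Brand", 1:5), each = 11),
  Deformation = c(
    0.65, 1.23, 1.25, 2.39, 2.45, 2.93, 3.13, 3.57, 4.68, 4.84, 5.33,  # Brand1
    1.24, 1.11, 1.60, 2.13, 2.69, 3.60, 3.90, 3.99, 4.51, 4.74, 5.98,  # Brand2
    1.21, 1.37, 2.56, 2.49, 3.17, 3.33, 3.38, 4.20, 4.22, 5.22, 6.28,  # Brand3
    0.92, 0.89, 1.20, 1.67, 1.98, 2.25, 3.80, 4.17, 4.94, 5.40, 5.76,  # Brand4
    0.69, 1.26, 1.61, 2.17, 2.07, 3.35, 3.27, 4.13, 4.25, 4.59, 5.00   # Brand5
  )
)
dat$brand.f <- factor(dat$Brand)
dat$force.f <- factor(dat$Force)

fit <- lm(Deformation ~ brand.f + force.f, data = dat)

# The first level of each factor is the reference level
levels(dat$brand.f)[1]   # "Brand1"
levels(dat$force.f)[1]   # "20"

# The intercept is the model's prediction for Brand1 at 20 N
ref_cell <- data.frame(
  brand.f = factor("Brand1", levels = levels(dat$brand.f)),
  force.f = factor("20", levels = levels(dat$force.f))
)
c(intercept = unname(coef(fit)["(Intercept)"]),
  predicted = unname(predict(fit, ref_cell)))

# Choosing another reference (here Brand3) changes which brand is "hidden"
dat$brand.ref3 <- relevel(dat$brand.f, ref = "Brand3")
fit3 <- lm(Deformation ~ brand.ref3 + force.f, data = dat)
coef(fit3)[grep("brand", names(coef(fit3)))]
```

The fit itself is the same in both cases; only the comparisons shown in the coefficient table change.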

Did you randomize the experiment and in particular the order of the 5*11=55 experiments?
(And did you use formally random numbers?)
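To make the question concrete, this is roughly what a formally randomized run order for the 5*11 = 55 brand-by-force combinations could look like. It is only a sketch: the seed is arbitrary, the column name `RunOrder` is made up here, and whether a fully random order is physically practical for this loading device is a separate matter.

```r
# All 55 combinations of force and brand
runs <- expand.grid(
  Force = seq(20, 120, by = 10),
  Brand = paste0("Brand", 1:5)
)

# Arbitrary seed, used only so that the example is reproducible
set.seed(2024)

# Assign each combination a random position in the run sequence
runs$RunOrder <- sample(nrow(runs))

# Sort by run order to get the sequence in which to do the experiments
runs <- runs[order(runs$RunOrder), ]
head(runs)
```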


Is there anybody who wants to do the check on residuals and so on for this model (where the variables are created in the code above)?

Code:
summary(lm( Deformation ~ brand.f + force.f  ,dat))
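One possible starting point for such a residual check, sketched with base R only and assuming the data from post #1. The Shapiro-Wilk test is an addition made here for illustration, not something suggested in the thread, and the plots usually tell more than the test does.

```r
# Rebuild the thread's data: 5 brands x 11 forces, one mean deformation per cell
dat <- data.frame(
  Force = rep(seq(20, 120, by = 10), times = 5),
  Brand = rep(paste0("Brand", 1:5), each = 11),
  Deformation = c(
    0.65, 1.23, 1.25, 2.39, 2.45, 2.93, 3.13, 3.57, 4.68, 4.84, 5.33,  # Brand1
    1.24, 1.11, 1.60, 2.13, 2.69, 3.60, 3.90, 3.99, 4.51, 4.74, 5.98,  # Brand2
    1.21, 1.37, 2.56, 2.49, 3.17, 3.33, 3.38, 4.20, 4.22, 5.22, 6.28,  # Brand3
    0.92, 0.89, 1.20, 1.67, 1.98, 2.25, 3.80, 4.17, 4.94, 5.40, 5.76,  # Brand4
    0.69, 1.26, 1.61, 2.17, 2.07, 3.35, 3.27, 4.13, 4.25, 4.59, 5.00   # Brand5
  )
)
dat$brand.f <- factor(dat$Brand)
dat$force.f <- factor(dat$Force)

fit <- lm(Deformation ~ brand.f + force.f, data = dat)

# Standard diagnostic plots: residuals vs fitted, normal Q-Q,
# scale-location, residuals vs leverage
op <- par(mfrow = c(2, 2))
plot(fit)
par(op)

# Standardized residuals: values far outside about +/- 2 deserve a look
std_res <- rstandard(fit)
range(std_res)

# Formal normality check of the residuals (illustrative addition)
sw <- shapiro.test(resid(fit))
sw
```

Look for curvature or a funnel shape in the first plot and for points straying from the line in the Q-Q plot.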
 

Lazar

Phineas Packard
#14
This:
I am glad that you are not working for the KGB or the CIA, in that you dare to show us your data. Most posters here seem to deal with government secrets in that they don't dare to show even a small example of their data. Then it is difficult to suggest or advise.
Thanks Greta!!
 

trinker

ggplot2orBust
#15
When you're posting code, data frames or computer output, it's helpful to wrap this information in code tags by either:
  1. clicking the pound (#) sign icon, or
  2. wrapping it with [NOPARSE]
    Code:
    some code
    [/NOPARSE]

which produces:
Code:
some code
For more see this (LINK)
 
#16
The material that is deformed is commercially pure titanium in all brands. I will study "residuals" tonight and return tomorrow with more knowledge. I do not understand what you mean by "Did you randomize the experiment?". I measured one brand at a time, repeating measurements 3 times for each force deforming a specific brand (3 times 20 N, 3 times 30 N, 3 times 40 N, and so on). I dismantled the whole loading assembly in between measurements. This is not my actual data, it has slightly been modified so that I can put it on a forum, yet it resembles my true data in its large portion. Thank you very much for helping me out with this.

Is there a way that I can, after performing anova, perform a post hoc that says which brand differs the most (with most and least deformation)?
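One common option for such a post hoc is Tukey's HSD on the additive model from post #4. The sketch below assumes that model is adequate and that the 55 values can be treated as independent observations, which later posts in this thread call into question; `fit_aov`, `tk` and `brand_means` are illustrative names.

```r
# Rebuild the thread's data: 5 brands x 11 forces, one mean deformation per cell
dat <- data.frame(
  Force = rep(seq(20, 120, by = 10), times = 5),
  Brand = rep(paste0("Brand", 1:5), each = 11),
  Deformation = c(
    0.65, 1.23, 1.25, 2.39, 2.45, 2.93, 3.13, 3.57, 4.68, 4.84, 5.33,  # Brand1
    1.24, 1.11, 1.60, 2.13, 2.69, 3.60, 3.90, 3.99, 4.51, 4.74, 5.98,  # Brand2
    1.21, 1.37, 2.56, 2.49, 3.17, 3.33, 3.38, 4.20, 4.22, 5.22, 6.28,  # Brand3
    0.92, 0.89, 1.20, 1.67, 1.98, 2.25, 3.80, 4.17, 4.94, 5.40, 5.76,  # Brand4
    0.69, 1.26, 1.61, 2.17, 2.07, 3.35, 3.27, 4.13, 4.25, 4.59, 5.00   # Brand5
  )
)
dat$brand.f <- factor(dat$Brand)
dat$force.f <- factor(dat$Force)

# TukeyHSD() needs an aov fit; the model is the same additive one as before
fit_aov <- aov(Deformation ~ brand.f + force.f, data = dat)

# All 10 pairwise brand comparisons with family-wise adjusted p-values
tk <- TukeyHSD(fit_aov, which = "brand.f")
tk

# Mean deformation per brand (averaged over forces), sorted low to high
brand_means <- sort(tapply(dat$Deformation, dat$brand.f, mean))
brand_means
```

Each row of the output gives the difference between two brand means, a confidence interval and an adjusted p-value; pairs whose interval excludes zero differ at the chosen level.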
 
#17
Did you take one piece of titanium, deform it and then make three measurements on that piece?

Or did you take three pieces of titanium, deform them and make one measurement on each piece?

(The latter is much better. If you have a budget of 100 measurements, then it is much better to take 100 units-of-investigation and make one measurement on each than to take 50 units-of-investigation and make 2 measurements on each. Worse still is 33 units-of-investigation with 3 measurements each.)
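The reasoning behind that budget argument can be sketched with the usual formula for the variance of an overall mean when there is variation between units and variation between repeated measurements within a unit. The variance components below (both set to 1) are purely hypothetical and chosen only for illustration; `var_of_mean` is a made-up helper.

```r
# Variance of the overall mean with n_units units and m_per_unit
# measurements on each:
#   var_between / n_units + var_within / (n_units * m_per_unit)
var_of_mean <- function(n_units, m_per_unit, var_between, var_within) {
  var_between / n_units + var_within / (n_units * m_per_unit)
}

# Three ways of spending a budget of about 100 measurements
designs <- data.frame(units = c(100, 50, 33), per_unit = c(1, 2, 3))

# Hypothetical variance components, both equal to 1
designs$var_mean <- var_of_mean(designs$units, designs$per_unit,
                                var_between = 1, var_within = 1)
designs
```

Repeating a measurement only reduces the within-unit part, while the between-unit part shrinks only with more units, so the design with the most units gives the most precise mean.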

And as I understand it then you continued with the same brand but with a higher force?

And you did not randomize the titanium pieces to the various forces by using random numbers or a lottery, but I guess that you simply assigned them arbitrarily to different forces?

Read about randomization and residuals.

This is not my actual data, it has slightly been modified so that I can put it on a forum, yet it resembles my true data in its large portion.
So you work part time for the KGB after all?
 
#18
I do not believe that they would care much about dentistry ;)
My study material was very expensive, so I only had 2 specimens per brand, one with a smaller diameter and one with a bigger diameter. This data is only for the smaller-diameter specimens. I took one specimen per brand and applied incremental forces from 20 N to 120 N. Then I removed it from the loading device, dismantled the loading device and the specimen, reassembled everything, and loaded it with incremental forces again. I repeated this process once more, resulting in 3 measurements for each force (I did 3 series per specimen). The behaviour was similar within each brand, with a standard deviation mostly below 10% (this is due to the sensitivity of the method). The value in the table for each brand and force is the mean of those 3 series of measurements. I hope this makes sense to you.

Could I go further with my data analysis besides doing anova as mentioned? Or do I simply take the information from the anova table and conclude that the specimen brand significantly influences the deformation and decide on which brands deform more/less from looking at the regression lines?

Additional question: Is the code hereinafter correct to examine residuals?

Code:
# Fit the linear model
lmfit <- lm(Deformation ~ diameter.f + brand.f, data = dataDeformation)
# Show the default diagnostic plots in a 2 x 2 grid
par(mfrow = c(2, 2))
plot(lmfit)
 
#19
Pickle, could you please elaborate on every detail of your study? The statisticians here might need to know the exact procedures you used in order to better understand the design. Please take us through your study step by step. What were the specimens? Orthodontic archwires? Implants? Or composite resin blocks? You have talked of "blocks", so I guess they might be blocks of restorative material. What was the deformation test? Load deflection? Knoop hardness? Or something else? Please also tell us about the procedure of that deformation test, or give some explanatory links. What was the unit of measurement? I would appreciate a complete picture of your experiments, with every detail, in a single organized post. Confidential information can be masked or replaced by other names, but we still need to know a lot more about your study. Did you carry out the load deflection test three times on a single point of each specimen? Or did the point differ each time? If so, was the distance between the three points sufficient, or could testing one point affect the result at a neighboring point? Many more details please. Thanks. :)


edit: these are titanium, ok then no Knoop hardness but please tell us more about the tests, materials, etc.

Another question (from Greta): Did you do any measurement in between? Between "forces" and "hits"?

And another one (again by Greta): Does the block material remain elastic under the maximum force you exerted (120 N force)? Or some permanent shape changes happen in the titanium material after the force reaches one of the loads you applied in your study?
 
#20
The specimens were dental implants. The deformation was measured with an optical method resulting in numbers that software eventually converts into microns (the unit of measurement). The loading device was a vertical press unit combined with a digital dynamometer that detected the force applied. Upon activating the loading device, it would press down on a dental implant. Each specimen (brand) was mounted in the same way so that the pressure was always applied to the same part of the dental implant, so there were no multiple points, and the testing conditions were the same for all brands. Each brand was deformed at 20 Newtons, 30 N, 40 N, and so on, and upon finishing the whole series everything was disassembled and assembled again to repeat the measurements (3 times altogether) so that I could calculate a mean deformation for every amount of force applied to a specific brand.
Thank you for asking for more info about my research, and your willingness to help. I hope that I provided at least a few useful pieces of information and did not talk rubbish as much.