@Pickle, how would you write a stochastic specification for this project? For example, for brand 1, when you are in the second cycle of bumping (20 N to 120 N) and are bumping at 40 N: how is that deformation measurement affected by the previous deformation at 30 N? And by the one before that, at 20 N? And by the one before that, at 120 N? How does previous bumping affect the current measurement?

Greta, I think Pickle assumes (as he/she has already confirmed) that the previous deformations are totally elastic, meaning that at these loads absolutely no permanent deformation remains (or, if some ultra-microscopic deformation does remain, we can ignore it and treat it as non-existent).

Pickle, I personally think your design is limited (like many other studies on expensive materials) but not invalid. You still have Force-Brand pairs that can be used to predict the extent of deformation. As Greta said, it would have been better to apply each Force-Brand combination to a new implant, but that is sometimes not practically possible. So I think your design can be accepted.

Now I think your question is "are your results valid?" I haven't read them yet, but I think: why not? If you have used the correct test and its assumptions are met, why not? OK, exerting 11 x 3 forces at a single point of each implant might be somewhat limiting. If possible, I would rotate the implant at each trial, so that each new force would be applied to a new point (provided the implant design is symmetric, which I think it was). But even in your current design, as long as those forces did not bend the implant and did not change the shape of the loading area (I mean the geometry of the small area at which the force is applied), it could be OK to load the same point repeatedly. I am not concerned about the former, as you already told us that forces up to 120 N do not bend the implant permanently (that is, into the plastic range), even at the microscopic level. But I am not sure whether titanium alloy is malleable under forces up to 120 N.

Let's assume the worst scenario, that is, distortion of the loading area by the applied forces. If the shape of the force bed changes after each trial, you might have some noise in your results. But given your specific design, in which the greater forces are applied after the smaller ones, you can be sure that at least in the first round of trials (from 20 N to 120 N) the changes made to the loading area by the weaker forces would be negligible by the time the greater forces were applied. This is because each force might tend to dig a shallow microscopic hole in the implant neck, and as the force increases, that shallow hole will increase in size (IF the implant is malleable, of course; otherwise everything is fine).

So at least your results in the first round were likely reliable. If the 120 N of the first round distorted the loading area, the next 20 N force of the second round would be applied to a larger area, reducing the pressure exerted on the implant neck (in megapascals). So there might be some error in the values obtained in the second and third rounds of loading.
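To make the pressure argument explicit, here is the relation I have in mind. The symbols are my own notation, not from Pickle's data: F is the applied force, A_1 is the contact area before any distortion, and A_2 is the (possibly enlarged) contact area after the 120 N load.

```latex
p = \frac{F}{A},
\qquad
A_2 > A_1 \;\Longrightarrow\; p_2 = \frac{F}{A_2} < \frac{F}{A_1} = p_1,
\qquad
1\ \mathrm{N/mm^2} = 1\ \mathrm{MPa}
```

So for the same nominal force, say 20 N, a larger contact area in the second round would mean a lower pressure on the implant neck than in the first round.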

In that case, you could discard the other two rounds instead of taking the average values. But you can also evaluate the second and third rounds statistically. Check whether there is a stable pattern of change (for example, all decreasing) from the first trials to the second and from the second trials to the third.

You can use a repeated-measures ANOVA and its post hoc tests, or at least a paired t-test, for this purpose. Put the results of all the implants in rows and separate them by round of force application (a table of 5 x 11 = 55 rows and 3 columns). Then, using a paired t-test, check whether the values change significantly from the first round to the second, and from the second round to the third. If there are no significant changes between the three rounds of force application, it means that the error introduced into your design by ultra-microscopic deformations at each trial is not affecting your results (provided your test power was sufficient, which I guess it would be, given your good sample size).
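A minimal sketch of that round-to-round check, assuming the data sit in a 55 x 3 array (one row per Force-Brand pair, one column per round). The numbers below are synthetic placeholders generated only so the script runs; they are not Pickle's measurements, and the helper name `compare_rounds` is my own. It uses `scipy.stats.ttest_rel` for the paired t-test.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic placeholder data (NOT real measurements):
# 55 rows = 5 brands x 11 force levels, 3 columns = rounds 1..3.
n_pairs = 5 * 11
baseline = rng.normal(loc=100.0, scale=20.0, size=n_pairs)
rounds = np.column_stack(
    [baseline + rng.normal(0.0, 2.0, n_pairs) for _ in range(3)]
)


def compare_rounds(data, i, j):
    """Paired t-test between column i and column j of data.

    Returns (mean change from round i to round j, t statistic, p-value).
    """
    result = stats.ttest_rel(data[:, i], data[:, j])
    mean_change = float(np.mean(data[:, j] - data[:, i]))
    return mean_change, float(result.statistic), float(result.pvalue)


# Round 1 vs round 2, then round 2 vs round 3.
for i, j in [(0, 1), (1, 2)]:
    mean_change, t_stat, p_value = compare_rounds(rounds, i, j)
    print(
        f"round {i + 1} vs round {j + 1}: "
        f"mean change = {mean_change:.3f}, t = {t_stat:.3f}, p = {p_value:.4f}"
    )
```

The sign of the mean change tells you the direction (a negative value means the later round gave smaller deformations), and the p-value tells you whether that change is distinguishable from noise. To use it on real data, replace `rounds` with the actual 55 x 3 table.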

If the test shows that the second round of force application leads to deformations significantly smaller than in the first round, and the same happens in the third round, then I would suggest you drop the second and third rounds of force application from your study.

However, you could still choose to use them and take the average of the three trials for each Force-Brand pair. Using the second and third trials would of course introduce some measurement bias into your results (if there was a significant decrease in your deformations in the second and third trials). In particular, your mean values would be affected (the mean would shift down or up a little). But the correlations between the independent and dependent variables might still be valid. I do not recommend this option, though, even if I think the changes in the mean value would be really small, if detectable at all.

Otherwise, if there were no significant increases or decreases, all three trials are likely valid and you can take their average as the main value for the response of that specific Force-Brand pair. But as a suggestion, I think it would be better not to combine the three trials for each Force-Brand pair, since that would lead to data loss. Although taking their average as your main result is quite good, it omits their variance, which is of course valuable. If you have the data for each round of force application [of course you have], I suggest you use them as separate findings. That would give you a triple-sized sample with much more information.
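To illustrate the difference between averaging and keeping the rounds separate, here is a small sketch using only the Python standard library. The readings, brand names, and units are made up for illustration; only the layout matters.

```python
from statistics import mean, stdev

# Hypothetical deformation readings (arbitrary units), three rounds per
# Force-Brand pair. These values are invented, not real measurements.
readings = {
    ("brand1", 20): [10.2, 10.0, 9.9],
    ("brand1", 30): [15.1, 14.8, 14.9],
    ("brand2", 20): [11.0, 11.3, 10.8],
}


def summarize(values):
    """Mean, standard deviation, and count of the rounds for one pair."""
    return {"mean": mean(values), "sd": stdev(values), "n": len(values)}


# Option A: one averaged value per Force-Brand pair.
# Keeping only the mean would hide the spread reported in "sd".
averaged = {pair: summarize(values) for pair, values in readings.items()}

# Option B: one row per (brand, force, round), so every reading is kept.
long_rows = [
    (brand, force, round_no, value)
    for (brand, force), values in readings.items()
    for round_no, value in enumerate(values, start=1)
]

for pair, summary in averaged.items():
    print(pair, summary)
print(len(long_rows), "rows in the per-round table")
```

Option B keeps the round number as its own column, so the three readings taken on the same implant stay identifiable as repeated measurements in whatever analysis follows.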

I also suggest you report the results of the statistical assessment of the validity of the second and third trials in your thesis (and later, your article) [and also here, if you wish].