# Questions on data complexity beyond my stats knowledge: what stats tests to use?

Hi everyone, sorry for this long post. I tried to be detailed, but if anything is unclear, please let me know. I consulted two stats friends for help but neither was familiar with the type of experimental design described below.

Thank you for your time and expertise in advance. I'd deeply appreciate any input!

---------------------------------- PART 1 ------------------------------------------
Project setup:
Thirty people participated in training on 12 independent topics. The same training model was used for every topic. The participants’ knowledge of each topic was tested before and after the training. The tests for each topic differed in content but had the same structure.

Complication: Because the participants had freedom in deciding how many topics and which topics they wanted training for, not all of them completed the training on all 12 topics. On average, 8 topics were completed per person. The topics completed by each participant also differed; for instance, some finished Topics 1-8, some 4-12, and some nonconsecutive.

Research question: Was the training model effective overall for all the topics?
Table 1 below displays a sample data layout. I need help with data analysis and guidance on what statistical test to choose for analyzing the data.

What I can do is treat the 12 topics independently: group the participants who finished each topic and conduct a matched-samples t-test on their pre/posttest scores. But this requires 12 separate matched-samples t-tests and gives me no information on the effectiveness of the training model across all 12 topics considered together. If 6 of the 12 t-tests show statistical significance whereas the rest do not, I will not be able to answer the research question.
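To make this per-topic approach concrete, here is a minimal sketch in Python using only the Topic 1 and Topic 2 sample rows from Table 1. The helper `paired_t` and the variable names are my own illustrative choices; it returns only the t statistic and the degrees of freedom, so something like `scipy.stats.ttest_rel` would still be needed for a p-value.

```python
import math
import statistics

# (pretest, posttest) pairs copied from the Table 1 sample rows
scores_by_topic = {
    "topic 1": [
        (0.67, 0.67), (0.67, 0.75), (0.5, 0.83), (0.75, 0.83),
        (0.75, 0.83), (0.42, 0.83), (0.67, 0.75), (0.33, 0.67),
        (0.58, 0.92), (0.67, 0.92), (0.67, 0.67),
    ],
    "topic 2": [
        (0.58, 0.58), (0.83, 0.83), (0.67, 0.58), (0.83, 0.75),
        (0.92, 0.92), (0.75, 0.83), (0.67, 0.83),
    ],
}


def paired_t(pairs):
    """Return (t statistic, degrees of freedom) for a matched-samples t-test."""
    diffs = [post - pre for pre, post in pairs]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1


# One separate test per topic, each on a different group of participants
results = {topic: paired_t(pairs) for topic, pairs in scores_by_topic.items()}
for topic, (t_stat, df) in results.items():
    print(f"{topic}: t = {t_stat:.2f}, df = {df}")
```

With the full data this loop would run 12 times, once per topic and each time on a different subset of participants, which is the multiple-tests problem I describe above.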

My questions:

Would it be legitimate to treat the training each participant received, regardless of topic, as a single unit: ignore which pre/posttests they took, calculate each participant's average pretest and posttest scores, and conduct a matched-samples t-test on these averages? To clarify with an example: suppose subject 1 has 6 pairs of pre/posttest scores for the 6 topics he received training on, subject 2 has 5 pairs … and subject 30 has 4 pairs. Would it be valid to ignore the different numbers of test scores per subject and directly compare their mean pretest and posttest scores?
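Here is a minimal sketch of the averaging idea, so it is clear what I am proposing. It uses the Table 1 sample rows for five of the subjects only; the variable names are my own, and the t statistic is computed by hand rather than with a statistics package.

```python
import math
import statistics
from collections import defaultdict

# (subject, topic, pretest, posttest): Table 1 sample rows for five subjects
rows = [
    (21, 1, 0.67, 0.67), (21, 2, 0.67, 0.83), (21, 8, 0.42, 0.83),
    (21, 11, 0.58, 0.58),
    (26, 1, 0.67, 0.75), (26, 7, 0.67, 0.75), (26, 8, 0.67, 0.75),
    (26, 9, 0.33, 0.67), (26, 10, 0.67, 0.92), (26, 11, 0.83, 0.83),
    (2, 1, 0.42, 0.83), (2, 2, 0.67, 0.58), (2, 12, 0.75, 0.83),
    (4, 1, 0.67, 0.75), (4, 7, 0.5, 0.83), (4, 9, 0.58, 0.92),
    (4, 11, 0.67, 0.58), (4, 12, 0.67, 0.83),
    (34, 1, 0.67, 0.67), (34, 2, 0.58, 0.58),
]

# Collect each subject's score pairs, dropping the topic label
by_subject = defaultdict(list)
for subject, _topic, pre, post in rows:
    by_subject[subject].append((pre, post))

# How many topics sit behind each subject's averages
n_topics = {subject: len(pairs) for subject, pairs in by_subject.items()}

# One (mean pretest, mean posttest) pair per subject; from here on the
# differing number of topics per subject is ignored
subject_means = {
    subject: (
        statistics.mean(pre for pre, _ in pairs),
        statistics.mean(post for _, post in pairs),
    )
    for subject, pairs in by_subject.items()
}

# Matched-samples t statistic on the per-subject averages
diffs = [post - pre for pre, post in subject_means.values()]
t_stat = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(len(diffs)))

print("topics per subject:", n_topics)
print(f"t = {t_stat:.2f}, df = {len(diffs) - 1}")
```

Note that a subject with 2 topics and a subject with 6 topics each contribute exactly one pair of averages, which is the part I am unsure about.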

Or how about a two-way repeated measures ANOVA? The score would be the dependent variable, and time (pre/post) and training topic would be the two independent variables. However, these two independent variables are neither within-groups nor between-groups; in addition, there seem to be issues determining the number of subjects for the test. Would this test be valid in this case?
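One way to see the subject-count problem is to tabulate who completed what. Below is a minimal sketch using only the subject/topic pairs visible in Table 1; rows hidden behind the ellipsis are left out, and the variable names are my own.

```python
from collections import defaultdict

# topic number -> subjects with a pre/posttest pair (visible Table 1 rows only)
subjects_per_topic = {
    1: [21, 26, 15, 3, 9, 2, 4, 19, 12, 6, 34],
    2: [34, 11, 2, 29, 8, 27, 21],
    7: [29, 26, 4, 13],
    8: [12, 21, 26],
    9: [26, 4],
    10: [26, 14],
    11: [21, 26, 4, 3],
    12: [14, 2, 4],
}

# Invert the mapping: subject -> set of topics completed
topics_per_subject = defaultdict(set)
for topic, subjects in subjects_per_topic.items():
    for subject in subjects:
        topics_per_subject[subject].add(topic)

# Subjects that have a score in every topic listed above
all_topics = set(subjects_per_topic)
complete_subjects = sorted(
    subject for subject, topics in topics_per_subject.items() if topics == all_topics
)

for topic, subjects in subjects_per_topic.items():
    print(f"topic {topic}: {len(subjects)} subjects")
print("subjects with every listed topic:", complete_subjects)
```

In these rows the subject count ranges from 11 (topic 1) down to 2 (topics 9 and 10), and no subject has a score in every listed topic, which is why I do not know what number of subjects a repeated measures ANOVA would use.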

If neither of the two proposals is valid, could you offer me advice on what statistical test to use for testing the overall effectiveness of the training model? Again, please refer to Table 1 for a sample data layout for the case. The data layout can be reorganized if needed.

Table 1 (if table is unclear, please see attached)

| Subject | Topic of training | Pretest score | Posttest score |
|---|---|---|---|
| 21 | topic 1 | 0.67 | 0.67 |
| 26 | topic 1 | 0.67 | 0.75 |
| 15 | topic 1 | 0.5 | 0.83 |
| 3 | topic 1 | 0.75 | 0.83 |
| 9 | topic 1 | 0.75 | 0.83 |
| 2 | topic 1 | 0.42 | 0.83 |
| 4 | topic 1 | 0.67 | 0.75 |
| 19 | topic 1 | 0.33 | 0.67 |
| 12 | topic 1 | 0.58 | 0.92 |
| 6 | topic 1 | 0.67 | 0.92 |
| 34 | topic 1 | 0.67 | 0.67 |
| 34 | topic 2 | 0.58 | 0.58 |
| 11 | topic 2 | 0.83 | 0.83 |
| 2 | topic 2 | 0.67 | 0.58 |
| 29 | topic 2 | 0.83 | 0.75 |
| 8 | topic 2 | 0.92 | 0.92 |
| 27 | topic 2 | 0.75 | 0.83 |
| 21 | topic 2 | 0.67 | 0.83 |
| ……. | | | |
| 29 | topic 7 | 0.67 | 0.67 |
| 26 | topic 7 | 0.67 | 0.75 |
| 4 | topic 7 | 0.5 | 0.83 |
| 13 | topic 7 | 0.75 | 0.83 |
| 12 | topic 8 | 0.75 | 0.83 |
| 21 | topic 8 | 0.42 | 0.83 |
| 26 | topic 8 | 0.67 | 0.75 |
| 26 | topic 9 | 0.33 | 0.67 |
| 4 | topic 9 | 0.58 | 0.92 |
| 26 | topic 10 | 0.67 | 0.92 |
| 14 | topic 10 | 0.67 | 0.67 |
| 21 | topic 11 | 0.58 | 0.58 |
| 26 | topic 11 | 0.83 | 0.83 |
| 4 | topic 11 | 0.67 | 0.58 |
| 3 | topic 11 | 0.83 | 0.75 |
| 14 | topic 12 | 0.92 | 0.92 |
| 2 | topic 12 | 0.75 | 0.83 |
| 4 | topic 12 | 0.67 | 0.83 |

---------------------------------- PART 2 ------------------------------------------
The project in Part 1 continued:

In fact, the training on each topic covered exercises of four types. For each participant, the training on a given topic used only one exercise type, which was randomly assigned by computer. This means a participant was probably exposed to all four exercise types, but each on different topics.

Research questions: 1) Was each exercise type effective? 2) Which exercise type was the most effective?

What I can perhaps do is handle the four exercise types separately: put the subjects exposed to the same exercise type into one group and run a one-way repeated measures ANCOVA, with training topic as a covariate, test time as the independent variable, and test score as the dependent variable. But this test can only answer the first research question.
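To show the grouping step I have in mind, here is a minimal sketch using the sample rows from Table 2. It is not the ANCOVA itself: it only splits the rows by exercise type and computes the mean gain and a matched-samples t statistic within each group, with no topic covariate. The names `gains_by_type`, `mean_gain`, and `t_by_type` are my own, and type C is missing because its rows are not shown in the sample.

```python
import math
import statistics

# exercise type -> (pretest, posttest) pairs copied from the Table 2 sample rows
pairs_by_type = {
    "type A": [
        (1, 0.88), (0.75, 0.75), (0.5, 0.75), (0.63, 0.5),
        (0.63, 0.75), (0.88, 0.88), (0.5, 0.5),
    ],
    "type B": [
        (0.5, 0.63), (0.63, 0.75), (0.88, 0.88), (0.5, 1), (0.75, 0.88),
        (1, 1), (1, 0.75), (0.38, 0.5), (0.38, 0.75), (0.5, 0.5),
        (0.75, 1), (0.63, 0.63), (0.75, 0.88),
    ],
    "type D": [
        (0.63, 0.75), (0.63, 0.5), (0.88, 0.63), (0.63, 1), (0.63, 0.75),
        (0.38, 0.63), (0.5, 0.63), (0.38, 0.63), (0.38, 0.75), (0.75, 0.38),
        (0.88, 0.75), (0.5, 0.63), (0.5, 1), (0.75, 0.88),
    ],
}

# Gain score (posttest minus pretest) for every row, grouped by exercise type
gains_by_type = {
    ex_type: [post - pre for pre, post in pairs]
    for ex_type, pairs in pairs_by_type.items()
}

mean_gain = {ex_type: statistics.mean(g) for ex_type, g in gains_by_type.items()}

# Matched-samples t statistic within each type (topic is ignored here)
t_by_type = {
    ex_type: statistics.mean(g) / (statistics.stdev(g) / math.sqrt(len(g)))
    for ex_type, g in gains_by_type.items()
}

for ex_type in pairs_by_type:
    print(f"{ex_type}: n = {len(gains_by_type[ex_type])}, "
          f"mean gain = {mean_gain[ex_type]:.3f}, t = {t_by_type[ex_type]:.2f}")
```

One caveat I can already see: the same subject can appear more than once within a type (subject 21 has two type A rows), so the rows inside a group are not independent, and comparing the mean gains across types this way does not amount to a test of which type is most effective.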

My questions:

Please advise on what statistical test to use to answer the second research question or, ideally, both research questions together. Table 2 below is a sample data layout. The data can be reorganized if needed.

Table 2 (if table is unclear, please see attached)

| Subject | Topic of training | Pretest score | Posttest score | Exercise type |
|---|---|---|---|---|
| 21 | topic 1 | 1 | 0.88 | type A |
| 33 | topic 3 | 0.75 | 0.75 | type A |
| 21 | topic 11 | 0.5 | 0.75 | type A |
| 2 | topic 2 | 0.63 | 0.5 | type A |
| 22 | topic 7 | 0.63 | 0.75 | type A |
| 25 | topic 3 | 0.88 | 0.88 | type A |
| 26 | topic 9 | 0.5 | 0.5 | type A |
| …… | | | | |
| 22 | topic 4 | 0.5 | 0.63 | type B |
| 27 | topic 3 | 0.63 | 0.75 | type B |
| 29 | topic 2 | 0.88 | 0.88 | type B |
| 21 | topic 8 | 0.5 | 1 | type B |
| 4 | topic 3 | 0.75 | 0.88 | type B |
| 3 | topic 11 | 1 | 1 | type B |
| 3 | topic 3 | 1 | 0.75 | type B |
| 17 | topic 2 | 0.38 | 0.5 | type B |
| 18 | topic 3 | 0.38 | 0.75 | type B |
| 29 | topic 3 | 0.5 | 0.5 | type B |
| 2 | topic 1 | 0.75 | 1 | type B |
| 19 | topic 1 | 0.63 | 0.63 | type B |
| 30 | topic 3 | 0.75 | 0.88 | type B |
| …… | | | | |
| 34 | topic 2 | 0.63 | 0.75 | type D |
| 26 | topic 10 | 0.63 | 0.5 | type D |
| 14 | topic 12 | 0.88 | 0.63 | type D |
| 12 | topic 8 | 0.63 | 1 | type D |
| 12 | topic 3 | 0.63 | 0.75 | type D |
| 26 | topic 11 | 0.38 | 0.63 | type D |
| 21 | topic 7 | 0.5 | 0.63 | type D |
| 32 | topic 3 | 0.38 | 0.63 | type D |
| 8 | topic 3 | 0.38 | 0.75 | type D |
| 5 | topic 3 | 0.75 | 0.38 | type D |
| 26 | topic 6 | 0.88 | 0.75 | type D |
| 27 | topic 2 | 0.5 | 0.63 | type D |
| 21 | topic 2 | 0.5 | 1 | type D |
| 34 | topic 3 | 0.75 | 0.88 | type D |