I would really appreciate it if someone could check the stats work for my thesis, as I am not entirely sure about the procedures I carried out.

Quick overview of my experiment:
- I investigated 2 methods of using worked examples to study the division & multiplication of fractions in 2 year 7 classrooms, using a pretest, a practice session & a posttest, in addition to a mental effort/perceived difficulty measure (Likert scale).

Fading Method:
Solution steps gradually get omitted, so students received:
(1) Worked Examples (WE) on Multiplication of Fractions (3 steps)
(2) similar problem with 3rd step to-be-solved
(3) similar problem with 2nd & 3rd step to-be-solved
(4) similar problem to be solved entirely
(5) WE on Division of Fractions (3 steps)
(6) similar problem... (same fading procedure)
(7) similar problem...
(8) similar problem...

--> students solve 12 solution steps in total

Interleaved Example-Problem Condition:
(1) WE on Multiplication (3 steps)
(2) Problem to-be-solved (all 3 steps)
(3) WE on Multiplication
(4) Problem to-be-solved
(5) WE on Division
(6) Problem to-be-solved
(7) WE on Division
(8) Problem to-be-solved

--> students also solve 12 steps in total, max. score = 12 (1 point per step)
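
As a sanity check on the two step counts above, the tally can be written out in a few lines of Python. This is only an illustrative sketch: the variable names are made up here, and each list entry is the number of steps a student has to solve on that item (0 for a fully worked example).

```python
STEPS_PER_PROBLEM = 3

# Fading: per operation, one worked example, then problems with
# 1, 2 and finally all 3 steps left to solve. Two operations (x and division).
fading_to_solve = [0, 1, 2, 3] * 2

# Interleaved: worked example and full problem alternate, twice per operation.
interleaved_to_solve = [0, 3, 0, 3] * 2

fading_total = sum(fading_to_solve)
interleaved_total = sum(interleaved_to_solve)

# Steps that are shown as worked-out rather than solved by the student.
fading_worked = sum(STEPS_PER_PROBLEM - s for s in fading_to_solve)
interleaved_worked = sum(STEPS_PER_PROBLEM - s for s in interleaved_to_solve)

print("steps to solve:", fading_total, interleaved_total)
print("steps shown worked:", fading_worked, interleaved_worked)
```

Under this reading of the two item sequences, both conditions involve the same number of to-be-solved steps, so the practice scores are on the same 0 to 12 scale.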

Data I collected:
- experimental group (Fading & Interleaved)
- Pretest score (per question & total)
- Practice session: proportion of correctly answered step 1, 2, & 3 questions, number of total errors & total score (not in proportion format)
- Posttest score (per question & total)
- Mental Effort & Perceived Difficulty scores per participant for pretest, practice session & posttest.
- Pretest & posttest each comprised 4 questions (2 on multiplication & 2 on division), each scored out of 3 (because in the practice condition the problems are broken down into 3 steps as well), max. score 12.
- Mental Effort & Perceived Difficulty measured on 5-point Likert scale after pretest, practice session, & posttest.

Analyses I carried out:

> 2-sample t-test to see if conditions were comparable at pretest --> Levene's test sig. for 2 questions, so I carried out a Mann-Whitney U test (not sig., so conditions were comparable)

> within-subjects comparison (t-test) to see if sample as whole made progress from pre- to posttest (sig.)

> between-subjects comparison to check if the proportions of correct answers to steps 1, 2 & 3, respectively, differed. Levene's test sig. for proportion correct of step 1, so I conducted a Mann-Whitney U test again (sig., fading group scored lower)

> between-subjects comparison of total score (not sig.) & no. of errors (not sig.)

> between-subjects comparison of posttest performance (t-test)

> ANOVA for Likert scale items with experimental condition as between-subjects factor (not sig.)

> Mann-Whitney U test for Likert scale items (unsure why I did both tests...)
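
To make the Mann-Whitney step concrete, here is a minimal Python sketch using only the standard library. It is not the analysis actually run for the thesis: the function name `mann_whitney_u` is made up, the ratings at the bottom are placeholder values rather than real data, and the p-value uses the normal approximation with a tie correction (no continuity correction), so it can differ slightly from what a stats package reports for small samples.

```python
from math import sqrt
from statistics import NormalDist


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U for x, U for y, approximate p-value).
    """
    n1, n2 = len(x), len(y)
    # Pool both groups, remembering group membership (0 = x, 1 = y).
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    tie_term = 0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Tied values share the mean of the ranks i+1 .. j.
        avg_rank = (i + 1 + j) / 2
        for k in range(i, j):
            ranks[k] = avg_rank
        t = j - i
        tie_term += t**3 - t
        i = j
    rank_sum_x = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    u_y = n1 * n2 - u_x
    n = n1 + n2
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1))))
    if sd_u == 0:  # every observation tied: no evidence of a difference
        return u_x, u_y, 1.0
    z = (u_x - mean_u) / sd_u
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return u_x, u_y, p


# Placeholder 5-point Likert ratings, NOT the thesis data.
fading_ratings = [2, 3, 3, 4, 2, 5, 3]
interleaved_ratings = [3, 4, 4, 5, 3, 4, 2]
print(mann_whitney_u(fading_ratings, interleaved_ratings))
```

Because the test works on ranks, it only assumes the ratings are ordinal, which is the usual argument for preferring it to an ANOVA on single Likert items.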

I would be immensely grateful for any feedback!!!

Kind regards, Fran