We have two independent patient groups in an individually randomized controlled trial: "treatment" and control ("standard care"). Scores on a psychometric scale are expected to change from baseline to end-line in both groups. A baseline assessment was conducted using this scale, and the participants were then randomized into one of the two groups. In the control group, the average scale score at baseline is assumed to be 13.4 (SD: 2.63), and we expect the average score at end-line to be 9.0 (SD: 3.98). These assumed values are taken from a reference study. The duration of standard care between baseline and end-line in this group will be 6 months. In the treatment group, the average scale score at baseline is assumed to be 14.5 (SD: 3.22), and we expect the average score at end-line to be 7.3 (SD: 4.15). The treatment duration between baseline and end-line in this group will also be 6 months.

What sample size is required to detect the aforementioned difference of differences (baseline minus end-line) in the mean scores between the two groups, that is, an expected reduction of 7.2 points in the treatment group versus 4.4 points in the control group, a difference of 2.8 points? We need the answer at 80% and at 90% power, at a 0.05 significance level (two-tailed), with an allocation ratio of 1. We would also appreciate it if you could share the formula or a reference reading for the calculation.
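To make the question concrete, here is a minimal sketch of the calculation we have in mind, not a definitive method. It uses the standard normal-approximation formula for comparing two means, applied to the change scores: n per group = (z_{1-alpha/2} + z_{power})^2 * (SD_c^2 + SD_t^2) / delta^2, where the SD of each group's change score is sqrt(SD_base^2 + SD_end^2 - 2 * r * SD_base * SD_end). The baseline to end-line correlation r is not available from our reference study, so the value r = 0.5 below is purely an assumption, and the function names are our own.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sd_change(sd_base, sd_end, r):
    """SD of (baseline - end-line) given the two SDs and their correlation r."""
    return sqrt(sd_base**2 + sd_end**2 - 2 * r * sd_base * sd_end)


def n_per_group(delta, sd_control, sd_treat, power, alpha=0.05):
    """Per-group n for a two-sided two-sample comparison of means
    (normal approximation, 1:1 allocation, unequal variances allowed)."""
    z = NormalDist().inv_cdf
    z_total = z(1 - alpha / 2) + z(power)
    return ceil(z_total**2 * (sd_control**2 + sd_treat**2) / delta**2)


# ASSUMPTION: baseline/end-line correlation; not given in the question.
R = 0.5

# Expected mean changes (baseline minus end-line), from the stated means.
change_control = 13.4 - 9.0   # 4.4
change_treat = 14.5 - 7.3     # 7.2
delta = change_treat - change_control  # difference of differences, 2.8

sd_c = sd_change(2.63, 3.98, R)
sd_t = sd_change(3.22, 4.15, R)

n80 = n_per_group(delta, sd_c, sd_t, power=0.80)
n90 = n_per_group(delta, sd_c, sd_t, power=0.90)
print(f"n per group at 80% power: {n80}")
print(f"n per group at 90% power: {n90}")
```

Under the assumed r = 0.5 this gives roughly 27 per group at 80% power and 36 per group at 90% power, before any allowance for dropout. The result is sensitive to r (a lower correlation inflates the SD of the change scores and hence n), and a t-distribution based calculation would typically add about one participant per group, so we would welcome corrections to this approach.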

