Number of subjects or power for test-retest reliability using ICC

I’m trying to get a paper published on the test-retest reliability of certain measures in spinal deformity patients. In summary, 5 subjects were scored by the same rater on six parameters. Two weeks later, they were scored again by the same rater on the same six parameters. Although I realize that I don’t have many subjects, I calculated intraclass correlation coefficients (ICCs) using a two-way random-effects model for absolute agreement, ICC(2,1) (SPSS 25, IBM Corp., Armonk, NY).
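For reference, here is a minimal sketch of how ICC(2,1) can be computed from the two-way ANOVA mean squares (the Shrout and Fleiss form), treating the two sessions as the "raters". It is only an illustration in Python, not the SPSS procedure I actually ran; the function name `icc_2_1` and the example scores are hypothetical and are not my study data.

```python
import numpy as np

def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    `scores` is an (n_subjects x k_sessions) array-like.
    """
    x = np.asarray(scores, dtype=float)
    n, k = x.shape
    grand = x.mean()
    row_means = x.mean(axis=1)   # per-subject means
    col_means = x.mean(axis=0)   # per-session means

    # Sums of squares for the two-way layout without replication
    ss_rows = k * np.sum((row_means - grand) ** 2)
    ss_cols = n * np.sum((col_means - grand) ** 2)
    ss_total = np.sum((x - grand) ** 2)
    ss_err = ss_total - ss_rows - ss_cols

    bms = ss_rows / (n - 1)              # between-subjects mean square
    jms = ss_cols / (k - 1)              # between-sessions mean square
    ems = ss_err / ((n - 1) * (k - 1))   # residual mean square

    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)

# Hypothetical scores for 5 subjects at test and retest (NOT real data)
example = [[30, 32], [45, 44], [52, 55], [38, 36], [60, 61]]
print(round(icc_2_1(example), 3))
```

With only 5 rows, the between-subjects mean square rests on 4 degrees of freedom, which is why I expect the estimate to be imprecise.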

My reviewer’s question was: “Please discuss the potential impact of accounting for 5 subjects (instead of a larger set) on the computation of the ICC.”
He also included 4 publications:

It would be really helpful if someone could help me answer this question, for example by discussing how the small sample could affect the interpretation of reliability, by helping me calculate the minimum number of subjects needed to reach a given power, or by calculating the power of the analysis as I performed it. I am struggling with the statistics in the reference papers.

