I have a dataset collected over 5 discrete sessions, each held on a different date. In each session, 10 human subjects each trained an artificial intelligence program to perform a set of tasks, and two separate algorithms also trained the same program on the same set of tasks. The sessions are linked: for each human or algorithmic trainer, the trained program resulting from Session N was used as the initial program for Session N+1, so the program's performance could improve across the 5 sessions.

I need to compare the performance of the human trainers (as a group) against the performance of Algorithm 1 and of Algorithm 2, each considered separately. Both algorithms were run 10 times per session, so I have parallel data for all 3 conditions (types of trainer). I am only interested in reporting a single outcome measure: a scalar value indicating how successfully the program performs the set of tasks.
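To make the design concrete, here is a minimal sketch of how I picture the data in long format. The column names, the `trainer_id` scheme, and the `None` placeholder scores are made up for illustration, and I am assuming each of the 10 runs of an algorithm is carried forward across sessions as its own "trainer", just like a human subject.

```python
from collections import Counter
from itertools import product

SESSIONS = range(1, 6)                    # 5 sessions
CONDITIONS = ("human", "algo1", "algo2")  # 3 types of trainer
UNITS_PER_CONDITION = 10                  # 10 subjects, or 10 runs per algorithm


def build_layout():
    """Return one row per (trainer, session) with a placeholder score."""
    rows = []
    units = range(1, UNITS_PER_CONDITION + 1)
    for condition, unit, session in product(CONDITIONS, units, SESSIONS):
        rows.append({
            "trainer_id": f"{condition}_{unit:02d}",  # hypothetical ID scheme
            "condition": condition,
            "session": session,
            "score": None,  # placeholder: the real outcome value goes here
        })
    return rows


rows = build_layout()
per_trainer = Counter(row["trainer_id"] for row in rows)

print(len(rows))                   # 150 observations in total
print(len(per_trainer))            # 30 distinct trainers
print(set(per_trainer.values()))   # {5}: every trainer is measured 5 times
```

Laid out this way, each of the 30 trainers contributes 5 repeated measurements, one per session.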

I am unsure whether I need a longitudinal analysis because the data were collected over 5 sessions; I believe this would mean fitting a general linear mixed model or using generalized estimating equations. Alternatively, can I analyze each of the 5 sessions' datasets separately (e.g. running an ANOVA on Human Trainers vs. Algorithm 1 vs. Algorithm 2 for each session, with post-hoc tests whenever a significant difference is found), given that I am not examining how covariates such as the human subjects' age or gender affected the performance of their trained programs?
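For reference, this is roughly what the per-session option would compute, written as a plain-Python sketch of the one-way ANOVA F statistic for a single session. The scores are invented toy values (3 per condition rather than 10), and the function name is my own. As I understand it, this treats each session as an independent cross-section and does not use the fact that the same trainer appears in all 5 sessions.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up toy scores for one session.
toy_session = {
    "human": [1.0, 2.0, 3.0],
    "algo1": [2.0, 3.0, 4.0],
    "algo2": [3.0, 4.0, 5.0],
}

f_stat, df_b, df_w = one_way_anova_f(list(toy_session.values()))
print(f_stat, df_b, df_w)
```

With my real data (10 scores per condition in a session), the degrees of freedom would be 2 and 27, and the p-value would come from the corresponding F distribution.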

One expert in the field suggested that I apply a Bonferroni correction with a factor of 5, since 5 datasets are being compared. I don't believe this is correct: I am only reporting a single outcome measure for each session, and my understanding is that the Bonferroni correction is intended for cases where *multiple* comparisons are made (multiple hypotheses tested) on the *same* dataset.
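For concreteness, this is what I understand the suggested correction to amount to: with 5 per-session tests and an overall significance level of 0.05, each test would be judged against 0.05 / 5 = 0.01. The sketch below assumes one p-value per session; the p-values are invented and the function name is my own.

```python
def bonferroni(p_values, alpha=0.05):
    """Return the per-test threshold and a significant/not flag per p-value."""
    m = len(p_values)        # number of tests being corrected for
    threshold = alpha / m    # Bonferroni-adjusted per-test threshold
    return threshold, [p <= threshold for p in p_values]


# Made-up p-values, one per session.
session_p_values = [0.004, 0.020, 0.030, 0.008, 0.049]

threshold, significant = bonferroni(session_p_values)
print(threshold)
print(significant)
```

In this toy example all 5 sessions would be significant at the uncorrected 0.05 level, but only Sessions 1 and 4 remain significant with the factor-of-5 correction.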

Thanks in advance for any input you can provide.