but there's a lot of variability from baseline to current performance across the schools (for example, some Group A schools had large increases [30+ points], some had no change, and some had large decreases [-30 points]), so that's why we are currently looking at overall performance across all schools within each group.
You cannot do that. It is a bit surprising that you are analysing data from such
a large study but want to commit a basic mistake that has been discussed
for 50 years or so with regard to such experiments. If "school" has such a large
impact, then the total increase (or decrease) across schools depends mainly
on how many students from strong versus weak schools are included. You'll
have to model the "school" effect in some way; left unaccounted for, it
would add statistical error and probably bias.
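
To illustrate the point about pooling, here is a minimal sketch with made-up numbers (the school names, means, and student counts are hypothetical, not taken from your study). Neither school changes at all, yet the pooled mean moves simply because the mix of tested students shifts toward the stronger school:

```python
# Hypothetical numbers: school -> (mean score, number of students tested).
baseline = {"strong": (80.0, 100), "weak": (50.0, 100)}
current = {"strong": (80.0, 150), "weak": (50.0, 50)}


def pooled_mean(schools):
    """Mean over all students, ignoring which school they attend."""
    total = sum(mean * n for mean, n in schools.values())
    count = sum(n for _, n in schools.values())
    return total / count


# Change in the pooled mean across all students.
pooled_change = pooled_mean(current) - pooled_mean(baseline)

# Change in the mean within each school.
per_school_change = {s: current[s][0] - baseline[s][0] for s in baseline}

print("pooled change:", pooled_change)          # 7.5
print("per-school change:", per_school_change)  # 0.0 for both schools
```

The apparent gain of 7.5 points here is produced entirely by composition, which is why the pooled figure cannot be read as a treatment effect.
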
You could perhaps look into multilevel modeling, with students clustered
within schools, and schools within experimental groups. And you'll probably
want to include additional variables (at the student and at the school level)
that are correlated with performance, in order to reduce random error and bias.
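
As a rough first step before fitting a proper multilevel model, one could treat the school rather than the student as the unit of analysis. The sketch below is not a multilevel model; it is a cruder school-level summary that respects the same clustering. The change scores are invented for illustration, and a full analysis would need a mixed-effects routine (for example statsmodels' MixedLM in Python or lme4 in R) plus the covariates mentioned above.

```python
import math
import statistics

# Hypothetical change in mean score (current minus baseline), one per school.
school_changes = {
    "A": [30.0, 0.0, -30.0, 12.0, 8.0],
    "B": [5.0, -5.0, 10.0, 0.0, -10.0],
}


def summarise(changes):
    """Mean school-level change and its standard error.

    The standard error is based on the number of schools, not students,
    so large between-school variability shows up as large uncertainty.
    """
    mean = statistics.mean(changes)
    se = statistics.stdev(changes) / math.sqrt(len(changes))
    return mean, se


mean_a, se_a = summarise(school_changes["A"])
mean_b, se_b = summarise(school_changes["B"])

# Difference between groups, with schools as the independent units.
diff = mean_a - mean_b
se_diff = math.sqrt(se_a ** 2 + se_b ** 2)

print(f"Group A: {mean_a:.1f} (SE {se_a:.1f})")
print(f"Group B: {mean_b:.1f} (SE {se_b:.1f})")
print(f"Difference A - B: {diff:.1f} (SE {se_diff:.1f})")
```

With changes ranging from -30 to +30 points within one group, the standard error of the difference is large relative to the difference itself, which is the variability a multilevel model would handle explicitly through the school-level variance.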