I'm not sure it makes sense to test whether one F is higher than the other.

What would the null and alternative hypotheses be? What are you ultimately trying to find out?

It would make more sense to me to ask "Does one model fit better than the other?", or even to combine these two comparisons into a 2x2 ANOVA and test whether one difference in means is greater than the other difference in means. That is just an interaction.
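
To make the "difference of differences" idea concrete, here is a minimal sketch in plain Python. The data are made up purely for illustration, the function name `interaction_contrast` is hypothetical, and it assumes the usual equal-variance ANOVA setup with a balanced 2x2 design, where the interaction F is the square of the t statistic for this contrast.

```python
from statistics import mean

def interaction_contrast(cells):
    """cells maps (a, b), with a and b in {0, 1}, to a list of observations."""
    m = {k: mean(v) for k, v in cells.items()}
    # Difference of differences: (B effect at A=1) minus (B effect at A=0)
    estimate = (m[1, 1] - m[1, 0]) - (m[0, 1] - m[0, 0])
    # Pooled within-cell variance, i.e. the ANOVA mean squared error
    sse = sum((x - m[k]) ** 2 for k, v in cells.items() for x in v)
    df_error = sum(len(v) for v in cells.values()) - 4
    mse = sse / df_error
    # Standard error of the contrast with weights +1, -1, -1, +1
    se = (mse * sum(1 / len(v) for v in cells.values())) ** 0.5
    t = estimate / se
    return estimate, t ** 2, df_error  # F = t^2 on (1, df_error) df

# Made-up data (assumption, not from the question)
cells = {
    (0, 0): [1.0, 2.0, 3.0],
    (0, 1): [2.0, 3.0, 4.0],
    (1, 0): [2.0, 3.0, 4.0],
    (1, 1): [5.0, 6.0, 7.0],
}
estimate, f_stat, df_error = interaction_contrast(cells)
print(estimate, f_stat, df_error)
```

With these toy numbers the two simple differences are 3 and 1, so the interaction estimate is 2. The null hypothesis being tested is that the two differences in means are equal, which is a well-defined question in a way that "is one F higher than the other" is not.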

