How do I compare means and select a final model with GEEs?

The factors and levels in my handwriting experiment are:
1. Line spacing (3 mm, 7 mm, 10 mm)
2. Writing instrument (pen, pencil)
3. Age group (children, adults, elderly)
4. Gender
5. Handedness

Basically, I want to compare means to see whether there are any significant differences between levels, which is why I originally used repeated-measures ANOVA. GEE, however, seems to be a kind of regression that describes relationships through the beta coefficients. If the Wald test for a factor is significant, can I conclude that there is a significant difference between the levels of that factor? And can I then carry out pairwise post hoc tests?
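To make the post hoc part concrete, here is a minimal sketch of the kind of pairwise comparison I have in mind: a Wald test on each difference between two coefficient estimates, followed by a Bonferroni correction. The coefficient and covariance values are made-up placeholders, not output from my data, and the function name `wald_pairwise` is my own. The sketch also assumes an identity link and a main-effects-only model, so that a difference between two coefficients corresponds to a difference between two level means.

```python
import math
from itertools import combinations

# Hypothetical estimates for the line-spacing factor (3 mm is the reference level).
# All numbers are illustrative placeholders, NOT results from the experiment.
levels = ["3mm", "7mm", "10mm"]
beta = [0.0, 0.40, 0.90]   # the reference level has coefficient 0 by construction
cov = [                    # assumed robust (sandwich) covariance of the estimates
    [0.0, 0.0, 0.0],
    [0.0, 0.04, 0.02],
    [0.0, 0.02, 0.05],
]

def wald_pairwise(beta, cov, i, j):
    """Wald chi-square test (1 df) of H0: beta[i] == beta[j]."""
    diff = beta[i] - beta[j]
    # Var(b_i - b_j) = Var(b_i) + Var(b_j) - 2 Cov(b_i, b_j)
    var = cov[i][i] + cov[j][j] - 2.0 * cov[i][j]
    chi2 = diff ** 2 / var
    # Upper tail of a chi-square with 1 df: P(|Z| > sqrt(chi2))
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return diff, chi2, p

pairs = list(combinations(range(len(levels)), 2))
results = {}
for i, j in pairs:
    diff, chi2, p = wald_pairwise(beta, cov, i, j)
    p_adj = min(1.0, p * len(pairs))  # Bonferroni adjustment over all pairs
    results[(levels[i], levels[j])] = {"diff": diff, "chi2": chi2, "p": p, "p_adj": p_adj}
    print(f"{levels[i]} vs {levels[j]}: diff={diff:+.2f}, "
          f"chi2={chi2:.2f}, p={p:.4f}, Bonferroni p={p_adj:.4f}")
```

With a non-identity link or with interactions in the model, these differences would be on the scale of the linear predictor rather than simple differences in means, so I am not sure the same shortcut applies there.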

I also have to report goodness-of-fit values. More generally, how should I decide which factors and interactions belong in the final model? There are criteria called QIC and QICC, where a smaller value indicates a better model. When I tried them, however, the QICC kept getting lower as I added more interactions. It makes no sense to include every interaction: the resulting model is too big, with more than 15 interaction terms and a table of parameter estimates that runs to more than three pages.
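For reference, this is how I understand the two criteria, following the quasi-likelihood information criterion proposed by Pan (2001). The notation is my own reading of it, and I am assuming that the QICC in my output corresponds to the simplified version written as QIC_u here.

```latex
\mathrm{QIC}(R)   = -2\,Q\bigl(\hat{\beta}(R);\, I\bigr) + 2\,\operatorname{tr}\bigl(\hat{\Omega}_I \hat{V}_R\bigr)
\qquad
\mathrm{QIC}_u(R) = -2\,Q\bigl(\hat{\beta}(R);\, I\bigr) + 2p
```

Here Q is the quasi-likelihood evaluated under the independence working correlation I, \hat{\beta}(R) are the estimates obtained under working correlation R, \hat{\Omega}_I is the inverse of the model-based covariance under independence, \hat{V}_R is the robust covariance estimate, and p is the number of regression parameters. The second term is the penalty for model size, so a lower value after adding interactions means the fit term improved by more than the penalty.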