Thread: Help with understanding evolutionary conservation

    I am working with a dataset of %amino acid conservation measurements (DV) for 62 ordered glycosylation sites (IV - 62 levels) from 2 compartments (IV - 2 levels, repeated measures) within 5 individuals (IV - 1 level). My null hypothesis is that %conservation is the same in both compartments. If there are significant differences between compartments, I would like to know which sites the differences occur at. To clarify, I do not expect all sites to have the same mean %conservation, just that, overall, there should be roughly the same amount of conservation variation between compartments at each site. I believe this last point is where I am getting confused.

    One issue I have is sample size. If I run the two-way repeated-measures (within-within) ANOVA and look at %conservation per site per compartment, I am left with only 5 measurements per group. With so few points it is nearly impossible for me to assess normality, and the variance of %conservation is not homogeneous across sites. As an alternative, can I pool the different sites together by compartment, increasing n to 324, and just do a paired t-test? That would tell me whether the null of equal conservation between compartments is rejected. Then, to identify which sites are differentially conserved, could I perform a paired t-test for each site, or would I run into multiple comparison issues? This is where things really break down for me.
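
For illustration only, here is a minimal Python sketch (not SPSS) of the per-site comparison and the multiple comparison concern raised above. Everything in it is an assumption: the data are made up, the site names are hypothetical, an exact sign-flip permutation test stands in for the paired t-test, and Holm's step-down method is just one possible correction.

```python
from itertools import product


def sign_flip_p(diffs):
    """Exact two-sided sign-flip permutation p-value for paired differences."""
    observed = abs(sum(diffs))
    flips = list(product((1, -1), repeat=len(diffs)))
    # Count sign assignments whose |sum| is at least as extreme as observed.
    hits = sum(
        1 for signs in flips
        if abs(sum(s * d for s, d in zip(signs, diffs))) >= observed - 1e-12
    )
    return hits / len(flips)


def holm_adjust(pvals):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # Multiply the k-th smallest p by (m - k), keep the sequence monotone.
        running_max = max(running_max, min(1.0, (m - rank) * pvals[i]))
        adjusted[i] = running_max
    return adjusted


# Hypothetical data: site -> (compartment A, compartment B),
# one %conservation value per individual (5 individuals).
data = {
    "site_1": ([90.0, 85.0, 88.0, 92.0, 87.0], [70.0, 72.0, 69.0, 75.0, 71.0]),
    "site_2": ([60.0, 62.0, 58.0, 61.0, 59.0], [61.0, 60.0, 59.0, 60.0, 60.0]),
    "site_3": ([40.0, 45.0, 42.0, 41.0, 44.0], [55.0, 58.0, 57.0, 54.0, 59.0]),
}

sites = list(data)
# One paired test per site, on the within-individual differences.
raw = [sign_flip_p([a - b for a, b in zip(*data[s])]) for s in sites]
adj = holm_adjust(raw)
for s, p, q in zip(sites, raw, adj):
    print(f"{s}: raw p = {p:.4f}, Holm-adjusted p = {q:.4f}")
```

With only 5 individuals there are 2^5 = 32 sign assignments, so the smallest two-sided p-value this exact test can produce is 2/32 = 0.0625, even when all five differences point the same way. Any correction across sites can only raise that value, which is one concrete form of the multiple comparison problem in the question.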

    I am using SPSS because it is the only statistics package I vaguely know. I am attaching the spreadsheet I am using in case that helps explain my issues better. Any suggestions you have are greatly appreciated. Thank you.
