# Two paired t-test question

#### vaved

##### New Member
I feel like this may be a stupid question, but I'd love some clarification.

I have a data set with many categories (e.g. individual sports) that each fit into one of two broader categories (e.g. winter sports and non-winter sports), with data from several years. The point of my research is to compare the two broader categories, hence a two-sample t-test and not ANOVA. Should I:

a) Run the test on one year of data, with the sample values being the categories that fit under each broad category (e.g. sample 1/winter sports: snowboarding, skiing; sample 2/non-winter sports: track, archery, shot put), resulting in a very small sample size for each
b) Add the smaller categories up to make the broad category and use the different years' data as the sample values
c) Something else!

(a) sounds better to me, but like I said, it results in a very small sample size.

I would never want to throw away most of my data, so I would rule out (a). Regarding (b), I doubt you'd want to literally add the observations from the sports together, although averaging them might be an option. However, that covers up sport-to-sport variation which really should be taken into account in your statistical inferences. What I would probably do is use all the original data in a linear model that included a term for year, possibly as a random factor, which would make the model a mixed linear model.
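To make the linear-model suggestion a bit more concrete, here is a minimal sketch in plain Python. It assumes year enters as a fixed term (the simpler of the two options mentioned above, not the random-factor version) and estimates the winter vs. non-winter difference by centering within each year, which gives the same coefficient as fitting the model with one dummy per year. The sport baselines, year shifts, and the function name `group_effect_with_year_term` are all made up for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical data: every sport is observed in every year.
# value = sport baseline + year shift (invented numbers, no noise).
baselines = {
    ("skiing", "winter"): 10.0,
    ("snowboarding", "winter"): 12.0,
    ("track", "other"): 6.0,
    ("archery", "other"): 7.0,
    ("shotput", "other"): 8.0,
}
year_shifts = {2019: 0.0, 2020: 2.0, 2021: 4.0}

# Long format: one row per sport per year.
records = [
    (sport, group, year, base + shift)
    for (sport, group), base in baselines.items()
    for year, shift in year_shifts.items()
]


def group_effect_with_year_term(rows):
    """Estimate beta in: value = year effect + beta * is_winter + error.

    Centering the outcome and the winter indicator within each year
    removes the year effects, so a simple regression on the centered
    values recovers the same beta as the full model with year dummies.
    """
    by_year = defaultdict(list)
    for _sport, group, year, value in rows:
        by_year[year].append((1.0 if group == "winter" else 0.0, value))

    num = den = 0.0
    for obs in by_year.values():
        x_bar = mean(x for x, _ in obs)
        y_bar = mean(y for _, y in obs)
        for x, y in obs:
            num += (x - x_bar) * (y - y_bar)
            den += (x - x_bar) ** 2
    if den == 0.0:
        raise ValueError("each year needs sports from both groups")
    return num / den


beta = group_effect_with_year_term(records)
print(f"estimated winter vs. non-winter difference: {beta:.2f}")
```

This uses every sport in every year, so nothing is thrown away and nothing is summed together. It only produces a point estimate, though. For a test or confidence interval that respects the repeated measurements on each sport, or to treat year as a random factor, you would want a mixed-model routine from a statistics package (for example `mixedlm` in statsmodels, if Python is your tool).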