1. ## equal/unequal sample sizes

I recently submitted a paper on aflatoxin levels in nuts for publication. The study analyzed a total of 106 nuts, divided into four categories: walnuts (n=33), hazelnuts (n=19), pistachios (n=29) and almonds (n=25). The reviewer does not seem to have broad knowledge of study design and statistical analysis, as his critique shows: "In my opinion, every sample should be equal in size, in order to improve the quality of the research." I totally disagree with his opinion, mostly because the statistical analysis that followed did not require equal sample sizes. This was a cohort epidemiological study, not an experimental one. Can somebody back me up with references or explain in detail why I don't need equal sample sizes to do high-quality research?
Or in other words:
Would the research be better if I compared 4 categories with the same sample size (e.g. walnuts=23, hazelnuts=23, almonds=23, pistachios=23) rather than with my actual sample sizes?

2. ## Re: equal/unequal sample sizes

Well it looks like it would be n=19 per group, since hazelnut is the lowest value. But you know your data better than I. Have you ever seen the movie "Best in Show"?

You are correct that many analyses do not require equal sizes. Can you tell us more about the procedures you conducted and your design? What does a cohort epidemiological study of nuts involve? This may help tell us whether you needed equal sizes or not. Most likely you are correct here.

I believe you may get more "power" with equal sizes, but it does not make or break you. If you did pairwise comparisons between groups and one group had fewer observations, that group would potentially have a higher standard error, and you might miss a significant result that you would have found with an equal number of observations in that group (since n is usually in the denominator of the standard error). This could result in faulty conclusions about a particular group. But as long as the tests are well powered, it would not be a big deal. This might encourage you to submit the effect sizes as well.
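To put a rough number on the standard error point, here is a minimal sketch. It assumes a comparison of two group means with a common standard deviation (`sigma`, set to 1 purely for illustration) and a hypothetical total of 46 observations split either evenly or unevenly. None of these values come from the aflatoxin data.

```python
import math


def se_of_difference(n1, n2, sigma=1.0):
    """Standard error of the difference between two group means,
    assuming both groups share the same standard deviation sigma."""
    return sigma * math.sqrt(1.0 / n1 + 1.0 / n2)


# Same total of 46 observations, split evenly vs. unevenly (hypothetical sizes).
se_equal = se_of_difference(23, 23)
se_unequal = se_of_difference(19, 27)

print(f"equal split   (23, 23): SE = {se_equal:.4f}")
print(f"unequal split (19, 27): SE = {se_unequal:.4f}")
print(f"relative increase: {100 * (se_unequal / se_equal - 1):.1f}%")
```

For a fixed total, the even split gives the smallest standard error, but with an imbalance this mild the penalty is under 2%, which is why unequal sizes alone rarely make or break an analysis.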

3. ## Re: equal/unequal sample sizes

Of course 19 would be the lowest value, but I said 23 just as a hypothetical.

To be more precise: The maximum level of aflatoxins in a nut is 4 µg/kg. I am interested in how many nuts have aflatoxins above this critical value. Luckily only 9 nuts had >4 µg/kg aflatoxins (see the attachment Table 1: Almonds=5, Hazelnuts=3, Pistachios=1, Walnuts=0).
A chi-square test was performed to see if there is a statistically significant difference between the nut types (p = 0.0225).
Due to the 0 walnuts and the small number of nuts above 4 µg/kg (5, 3, 1, 0), I chose not to do a pairwise comparison, but just to list the mean, s.d., and median levels.
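For anyone who wants to check the chi-square figure, here is a minimal stdlib-only sketch. It assumes the 4x2 table implied by the numbers above (nuts above the limit vs. the rest of each group) and a plain Pearson chi-square test of independence without continuity correction; the function names are made up for illustration, and the original analysis may have been run differently.

```python
import math

# Counts per nut type: (above 4 ug/kg, at or below 4 ug/kg).
counts = {
    "almonds": (5, 20),
    "hazelnuts": (3, 16),
    "pistachios": (1, 28),
    "walnuts": (0, 33),
}


def chi_square_independence(table):
    """Pearson chi-square statistic and expected counts for an r x c table."""
    rows = list(table.values())
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(col_totals)
    statistic = 0.0
    expected = []
    for row in rows:
        row_total = sum(row)
        exp_row = [row_total * c / grand_total for c in col_totals]
        expected.append(exp_row)
        statistic += sum((o - e) ** 2 / e for o, e in zip(row, exp_row))
    return statistic, expected


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with exactly
    3 degrees of freedom (closed form, valid for df = 3 only)."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


statistic, expected = chi_square_independence(counts)
p_value = chi2_sf_df3(statistic)  # df = (4 - 1) * (2 - 1) = 3
min_expected = min(min(row) for row in expected)

print(f"chi-square = {statistic:.3f}, df = 3, p = {p_value:.4f}")
print(f"smallest expected count = {min_expected:.2f}")
```

The sketch also prints the smallest expected cell count, which is worth looking at whenever one of the categories is this rare.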

4. ## Re: equal/unequal sample sizes

Your program should have recommended Fisher's exact test in lieu of the chi-square test, since with only 9 nuts above the limit the expected counts in those cells are all below 5. Fisher's is an exact test (it may take longer to run), while the chi-square test is its large-sample approximation. Your p-value only tells you that at least one group differs. Statistically you do not know where this difference is, and it would likely get washed away after corrected pairwise comparisons.
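To illustrate the "washed away" point, here is a rough sketch of pairwise two-sided Fisher's exact tests on the counts from post 3, each done on a 2x2 table and adjusted with Bonferroni. The Bonferroni correction and the hand-rolled `fisher_exact_two_sided` helper are my own assumptions for the example; other corrections or a statistics package would work as well.

```python
from itertools import combinations
from math import comb

# Counts per nut type: (above 4 ug/kg, at or below 4 ug/kg).
counts = {
    "almonds": (5, 20),
    "hazelnuts": (3, 16),
    "pistachios": (1, 28),
    "walnuts": (0, 33),
}


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        return comb(row1, k) * comb(row2, col1 - k) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    total = sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


pairs = list(combinations(counts, 2))
results = {}
for g1, g2 in pairs:
    raw = fisher_exact_two_sided(*counts[g1], *counts[g2])
    adjusted = min(1.0, raw * len(pairs))  # Bonferroni over all 6 pairs
    results[(g1, g2)] = (raw, adjusted)
    print(f"{g1} vs {g2}: raw p = {raw:.4f}, Bonferroni p = {adjusted:.4f}")
```

With these counts the smallest raw p-value is almonds vs. walnuts at about 0.012, which Bonferroni pushes to about 0.07, so no single pair stays below 0.05 after correction.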
