Which test to use? Comparing population with subset from this population.

#1
Hi, here is my example:

In the survey, I have 1000 answers, and 70% of respondents answered Yes. Now I take the answers for only a specific subset of this group, let's say men. For example, out of 1000 people there are 400 men, and this subset has 60% positive answers. What I want to know is whether the answers of this subset are significantly different.
I tried to find the proper test, but most tests require the samples to be independent...
I don't know if I should use some proportion test, or whether I can treat the 1000 people as the population, etc. I am really new to statistics, so I am probably asking bad questions, but I would appreciate being pointed in the right direction.

Thanks for any help.
 

hlsmith

Not a robit
#2
Different from women or from the whole? The whole includes the men, so depending on how many men there are, the overall result is just a weighted average of the men's and women's results.
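To make the weighting concrete, here is a minimal Python sketch that backs out the rest of the sample (here, the women) from the overall figures and the subset figures in the first post. The function name `complement_rate` is made up for illustration, and it assumes the percentages quoted are exact.

```python
def complement_rate(n_total, p_total, n_sub, p_sub):
    """Return (size, proportion of Yes) for everyone NOT in the subset.

    Relies on the identity:
        p_total * n_total = p_sub * n_sub + p_rest * n_rest
    """
    n_rest = n_total - n_sub
    yes_rest = p_total * n_total - p_sub * n_sub  # Yes answers outside the subset
    return n_rest, yes_rest / n_rest

# Numbers from the first post: 1000 respondents, 70% Yes; 400 men, 60% Yes.
n_women, p_women = complement_rate(1000, 0.70, 400, 0.60)
print(n_women, round(p_women, 3))
```

So the 70% overall is 0.4 x 60% (men) plus 0.6 x roughly 77% (women), which is why comparing men with the whole is really comparing men with women, diluted.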

If you are comparing the genders, a chi-squared test or Fisher's exact test can be used, the latter for small samples. If this is just a binary comparison, the best bet would be to calculate a rate difference with a confidence interval.
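A rough sketch of both options in Python, using only the standard library and the counts implied by the first post (240 of 400 men and, by subtraction, 460 of 600 women answering Yes). The function name `two_proportion_summary` is hypothetical, and two choices here are assumptions rather than the only way to do it: the interval is a simple Wald interval, and the chi-squared statistic has no continuity correction.

```python
import math

def two_proportion_summary(yes_a, n_a, yes_b, n_b, z=1.96):
    """Compare two independent groups on a yes/no outcome.

    Returns (rate difference, (ci_low, ci_high), chi-squared, p-value).
    """
    p_a, p_b = yes_a / n_a, yes_b / n_b
    diff = p_a - p_b

    # Wald (unpooled) standard error of the difference in proportions.
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    ci = (diff - z * se, diff + z * se)

    # Pearson chi-squared for a 2x2 table, no continuity correction.
    no_a, no_b = n_a - yes_a, n_b - yes_b
    n = n_a + n_b
    chi2 = n * (yes_a * no_b - no_a * yes_b) ** 2 / (
        n_a * n_b * (yes_a + yes_b) * (no_a + no_b)
    )

    # With 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return diff, ci, chi2, p_value

# Men vs. women, counts derived from the percentages in the first post.
diff, ci, chi2, p_value = two_proportion_summary(240, 400, 460, 600)
print(f"difference = {diff:.3f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
print(f"chi-squared = {chi2:.2f}, p = {p_value:.2g}")
```

The two groups compared here are men and women, which do not overlap, so the independence requirement is met. Comparing men against all 1000 respondents would not meet it.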
 

hlsmith

Not a robit
#4
Many people feel compelled to conduct formal tests for trivial, descriptive statistics. In this scenario, I am not sure a test is needed, since any difference depends only on the prevalence of males in the full sample, and the two groups are not independent. For all intents and purposes, just reporting percentages and counts for the group and the subgroup should suffice. Why is a formal test needed, and what would it actually tell you? Running a whole bunch of unnecessary tests in a study usually just drowns out the actually important ones, and it also makes me personally think that false discovery could come into play.
 
#5
Thanks for the answer! Can we say that this subset is significantly different (at the 95% confidence level) from the whole set? I have a comparison with percentages, counts, etc., but I wanted to mark the subsets that are significantly below average...