1. ## Comparing sample data with population data to check representativeness - How?

Dear All,

I have a small problem here and I hope someone can help me. I ran a survey at a university department with around 300 people. I have thus far collected 162 completed surveys via the internet. The questions asked include age, gender, their work function and the number of hours they work per week.

However, I would also like to check the representativeness of the sample compared to the population (the university department). It will be possible to retrieve population data from the university department on age, gender, function and hours worked per week.

The question is, how should I compare them? What techniques can I use to check the representativeness of my sample compared to the population?
Sorry, I am quite a beginner in statistics and I hope you can point me to some literature on the internet where I can learn step by step.

Hope to hear from you. Thanks!

Lee

2. Lee,

List out the parameters of the population that you feel are important to the definition of "representativeness" (e.g., gender and age).

If the population is, say, 65% female and 35% male, then the sample should break down close to that.

For age, break down the population into age categories or actual age numbers, then compare the % of each in the sample to the % of each in the population.

Do the same exercise for "function" and "hours worked per week" and anything else you feel is important.
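To make the category-by-category comparison concrete, here is a minimal Python sketch. The age bands, counts and population shares are made-up illustration values, not Lee's data, and `compare_shares` is a hypothetical helper name. Besides printing the sample % next to the population %, it also computes a chi-square goodness-of-fit statistic, which is one common way to summarise the gaps across all categories in a single number.

```python
def compare_shares(sample_counts, population_shares):
    """Compare sample shares with known population shares.

    sample_counts: dict of category -> number of respondents in the sample
    population_shares: dict of category -> share in the population (sums to 1)
    Returns (rows, chi2, df), where rows holds
    (category, sample share, population share) tuples.
    """
    n = sum(sample_counts.values())
    rows = []
    chi2 = 0.0
    for category, pop_share in population_shares.items():
        observed = sample_counts.get(category, 0)
        expected = n * pop_share  # count expected if the sample mirrored the population
        rows.append((category, observed / n, pop_share))
        chi2 += (observed - expected) ** 2 / expected
    df = len(population_shares) - 1  # degrees of freedom for goodness of fit
    return rows, chi2, df


# Hypothetical numbers for illustration only (162 respondents in total).
sample_counts = {"under 30": 40, "30-49": 80, "50 and over": 42}
population_shares = {"under 30": 0.30, "30-49": 0.45, "50 and over": 0.25}

rows, chi2, df = compare_shares(sample_counts, population_shares)
for category, sample_share, pop_share in rows:
    print(f"{category:12s} sample {sample_share:6.1%}  population {pop_share:6.1%}")
print(f"chi-square = {chi2:.3f} with {df} degrees of freedom")
```

The statistic is then compared with a chi-square critical value for the given degrees of freedom (5.991 for 2 degrees of freedom at the 5% level). A value below the critical value means the sample breakdown does not differ detectably from the population breakdown.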

You could just use judgment in determining whether the sample %'s are "close enough" to the population %'s, or you could run a z-test to statistically compare the sample % to the known population %.
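As a rough sketch of that z-test, the snippet below compares one sample proportion with a known population proportion, using only the Python standard library. The figure of 110 women among 162 respondents is an invented example, and `one_sample_proportion_z` is a hypothetical function name. It uses the plain large-sample formula and ignores the finite population correction, which may matter when the sample is a large fraction of a population of about 300.

```python
import math


def one_sample_proportion_z(successes, n, population_p):
    """Two-sided z-test of a sample proportion against a known population proportion."""
    sample_p = successes / n
    # Standard error under the null hypothesis that the sample comes from the population.
    se = math.sqrt(population_p * (1 - population_p) / n)
    z = (sample_p - population_p) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical example: 110 of 162 respondents are female, population is 65% female.
z, p_value = one_sample_proportion_z(successes=110, n=162, population_p=0.65)
print(f"z = {z:.3f}, two-sided p = {p_value:.3f}")
```

A large p-value (conventionally above 0.05) means the sample percentage is not detectably different from the population percentage on that variable.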

3. Dear John,

Thanks for your reply. I am going to get the actual population data next Tuesday. If you don't mind, I am going to work through this problem and post it here for people to comment on or critique.

Thanks,
Lee

4. Dear all,

Sorry for the late reply. I had difficulty getting the population data over the past few weeks, but now I have it. I did what you advised by comparing the means and variances of the sample and population data. Some of the means and variances are really close and some are not. I have difficulty judging what is "close" and what is not, so perhaps some of you can help me out.

I have attached my little assignment here, as well as the sample and population data in an Excel file. In the Excel file, I have combined the sample and population variables on one sheet and put the definitions of the codes on the second sheet. I hope it will be clear to everybody.

The tricky part is that I'm not sure how to use the z-score and chi-square to further test the representativeness of the sample in SPSS, so I hope to get some pointers from you guys. Feel free to critique.

Many thanks.

