Binomial data, proportion comparison

#1
Hi guys,

I’m trying to find out whether the percentage of students who are satisfied with a service (statistically) significantly changes year-to-year.

This would be easy enough to do with interval data with a t-test (scale of 1-4, looking at the mean), but I would like to know how to compare binary variables (interval data binned into 2 categories: Satisfied vs. Dissatisfied).

Data
Samples are drawn randomly, sizes are approximately 900 cases for each year. Population size is approximately 17,000.

Interval satisfaction data (scale of 1-4) is collapsed into dichotomous satisfaction data (1,2=Satisfied / 3,4=Dissatisfied). Percent Satisfied is reported, based on the dichotomous data.

What I want to find out
Is there a statistically significant difference in the reported Percent Satisfied between 2 given years?

Any advice you could provide would be much appreciated.
 

hlsmith

Less is more. Stay pure. Stay poor.
#2
So you want to compare a dichotomous variable by a dichotomous variable (i.e., 2x2 table notation)? Would you assume independence between the years?
 
#3
Thanks for your response, hlsmith.

I would assume independence. Samples of 900 are randomly drawn each year out of a population of about 17000. Even if some of the same students are drawn in both years, it is expected that any student's answer one year will not affect their answer in a following year.

Should I be doing a matrix? I'm trying to understand the logic around this. Are you suggesting that I should do a goodness-of-fit test?

To rephrase, I'm comparing the proportion of satisfied students in one year vs. the percent of satisfied students in the following year. For example, 85% vs 87%. I want to know if the change is statistically significant.

Thanks :)
 

hlsmith

#4
From your description, it seems that you would use the chi-square test, or Fisher's exact test (if any of the four category groups has 5 or fewer students). So yes, you can draw up a 2 x 2 contingency table for this.

You should be able to find quite a bit of literature on the web about these tests, and please continue to post if you still have questions.
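To make the 2 x 2 suggestion concrete, here is a minimal sketch of a Pearson chi-square test using only the Python standard library. The counts are assumptions for illustration: they are what the 85% vs. 87% example mentioned in this thread would give with 900 respondents per year, not real survey data. The function name `chi_square_2x2` is hypothetical, and no continuity correction is applied.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Shortcut formula for a 2 x 2 table: n * (ad - bc)^2 / (product of the four margins)
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Illustrative counts (assumed): 85% and 87% satisfied out of 900 respondents each
year1_sat, year1_dis = 765, 135
year2_sat, year2_dis = 783, 117

chi2, p = chi_square_2x2(year1_sat, year1_dis, year2_sat, year2_dis)
print(f"chi-square = {chi2:.3f}, p = {p:.3f}")
```

Rows are years and columns are Satisfied / Dissatisfied, so both years' sample sizes enter the calculation.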
 
#6
No, it is not.
1.96*sqrt( (0.85*(1-0.85)/900) + (0.87*(1-0.87)/900)) = 0.03204665

But a t-test/z-test is better.

Sorry for the noobness, but what formula is this (above), and why does it work in this situation?

From what I understand, a t-test wouldn't work on dichotomous data. Is that correct? Or should I assume that because it's also ordinal, then I can apply techniques that apply to ordinal level data?

Cheers.
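
For reference, the expression quoted in this post looks like the 95% margin of error for a difference between two independent proportions: 1.96 times the standard error of the difference. A rough sketch of that calculation, plus the two-proportion z-test the quote mentions, is below. The pooled standard error used for the z-test is one common choice and is an assumption here, as are the variable names.

```python
import math

# Values taken from the example in this thread
p1, n1 = 0.85, 900   # proportion satisfied and sample size, earlier year
p2, n2 = 0.87, 900   # proportion satisfied and sample size, later year

# Standard error of the difference between two independent proportions (unpooled)
se_diff = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# 95% margin of error: this is the quantity computed in the quoted formula
margin = 1.96 * se_diff
diff = p2 - p1
print(f"difference = {diff:.4f}, 95% margin of error = {margin:.8f}")

# Two-proportion z-test using a pooled proportion under the null hypothesis (assumed variant)
pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
se_pooled = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
z = diff / se_pooled
p_value = math.erfc(abs(z) / math.sqrt(2))   # two-sided p-value from the normal distribution
print(f"z = {z:.3f}, two-sided p = {p_value:.3f}")
```

If the observed difference is smaller than the margin of error, the 95% confidence interval for the difference includes zero, which is one way to read the "No, it is not" in the quote.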
 
#7
From your description, it seems that you would use the chi-square test, or Fisher's exact test (if any of the four category groups has 5 or fewer students). So yes, you can draw up a 2 x 2 contingency table for this.

You should be able to find quite a bit of literature on the web about these tests, and please continue to post if you still have questions.

Thank you - I've done a bunch of reading, as you suggested.

From what I understand, you suggest applying the chi-square test with the current year's data as "observed" and the previous year's data as "expected", right? Would the previous year's distribution come into play at any point? I've attempted the chi-square test in SPSS (under non-parametric tests), and I just needed to plug in the previous year's proportions as the expected values. It seems to work; I just wanted to check whether that's the correct way of doing it.
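
As a rough sketch of what the procedure described here presumably computes, the code below runs a one-sample chi-square goodness-of-fit test, with the previous year's proportion plugged in as the expected value. The counts are assumed from the 85% vs. 87% example with 900 respondents, and the function name is hypothetical. Note that this setup treats the previous year's 85% as a fixed, known value, so that year's sample size and sampling error never enter the calculation.

```python
import math

def goodness_of_fit_two_categories(observed_sat, observed_dis, expected_prop_sat):
    """Chi-square goodness-of-fit test for two categories against a fixed expected proportion."""
    n = observed_sat + observed_dis
    expected_sat = n * expected_prop_sat
    expected_dis = n * (1 - expected_prop_sat)
    chi2 = ((observed_sat - expected_sat) ** 2 / expected_sat
            + (observed_dis - expected_dis) ** 2 / expected_dis)
    # Two categories give 1 degree of freedom, so P(X > chi2) = erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Illustrative counts (assumed): 783 of 900 satisfied this year, 85% satisfied last year
gof_chi2, gof_p = goodness_of_fit_two_categories(783, 117, 0.85)
print(f"goodness-of-fit chi-square = {gof_chi2:.3f}, p = {gof_p:.3f}")
```

Because the previous year's figure is itself an estimate from about 900 cases, a test that compares the two samples directly (a 2 x 2 table with years as rows) accounts for the uncertainty in both years. With equal sample sizes, the goodness-of-fit statistic comes out roughly twice as large as the 2 x 2 statistic, so it can overstate the evidence for a change.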