Greetings all,
I have a question about analysing responses from a multiple choice question.
A lot of what we do in our analysis is say which option is picked most often (and therefore ranked #1) out of a multiple choice question.
E.g. which is your favourite colour?
-red -40%
-blue -35%
-green -25%
Red ranks first. But if the margin of error is +/- 5 %, the difference between red and blue is not actually statistically significant, right?
So my question is, what test do you use to show which proportions are significantly higher, aside from eyeballing it? One of my coworkers uses a z-test for proportions, but that is for 2 independent samples... (http://www.dimensionresearch.com/res...ors/ztest.html)
But these aren’t independent samples, right? They’re the same sample!
But it’s not like it’s a pair-wise design either....
Chi-square also works to show that there is a statistical difference between the 3 options, but doesn’t show which ones are different: red and blue don’t have to be significantly different to get an overall significant difference.
Right?
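To make that point concrete, here is a minimal sketch of the overall chi-square test on the colour example. The sample size is an assumption: the post only gives percentages, and n = 400 is simply a round number that roughly matches a +/- 5% margin of error. The null hypothesis (all three colours equally popular) is also just one reasonable choice.

```python
import math

# Assumed counts: 40% / 35% / 25% of a hypothetical n = 400 respondents.
observed = {"red": 160, "blue": 140, "green": 100}
n = sum(observed.values())

# Null hypothesis: every colour is equally likely to be picked.
expected = n / len(observed)

# Pearson chi-square goodness-of-fit statistic.
chi2 = sum((count - expected) ** 2 / expected for count in observed.values())
df = len(observed) - 1

# With exactly 2 degrees of freedom the upper-tail probability is exp(-x / 2).
p_value = math.exp(-chi2 / 2)

print(f"chi2 = {chi2:.2f}, df = {df}, p = {p_value:.5f}")
# prints: chi2 = 14.00, df = 2, p = 0.00091
```

Under these assumptions the overall test is clearly significant, yet it says nothing about whether red beats blue specifically; that needs a pairwise comparison.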
This has actually been driving me nuts.
Is a z-test for proportions legitimate? I've read through a bunch of textbooks and websites and can't find anything that deals with this particular problem.
Thanks!
-Tony
Last edited by TonyW; 11-08-2010 at 11:38 AM.
Hi, no, haven't heard anything... what, three years later?!
how about you? any luck?
Actually you have a multinomial sample. In that case you can still use the normal approximation; of course this time the two sample proportions are no longer independent, so there is a covariance term in the standard error:
Of course under the null hypothesis you could also use the pooled estimate as before.
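For reference, the textbook multinomial result can be sketched as follows. The notation (n for the sample size, p-hat for the sample proportions, p-bar for the pooled estimate) is my own, and this is the standard derivation rather than a reproduction of the expression originally posted here.

```latex
% Two category proportions from one multinomial sample of size n
\operatorname{Cov}(\hat{p}_1, \hat{p}_2) = -\frac{p_1 p_2}{n}

\operatorname{Var}(\hat{p}_1 - \hat{p}_2)
  = \operatorname{Var}(\hat{p}_1) + \operatorname{Var}(\hat{p}_2)
    - 2\operatorname{Cov}(\hat{p}_1, \hat{p}_2)
  = \frac{p_1(1 - p_1) + p_2(1 - p_2) + 2 p_1 p_2}{n}

z = \frac{\hat{p}_1 - \hat{p}_2}
         {\sqrt{\dfrac{\hat{p}_1(1 - \hat{p}_1) + \hat{p}_2(1 - \hat{p}_2) + 2\hat{p}_1\hat{p}_2}{n}}}

% Pooled version under H_0: p_1 = p_2, with \bar{p} = (\hat{p}_1 + \hat{p}_2)/2
\operatorname{Var}_{H_0}(\hat{p}_1 - \hat{p}_2)
  = \frac{2\bar{p}(1 - \bar{p}) + 2\bar{p}^2}{n}
  = \frac{\hat{p}_1 + \hat{p}_2}{n}
```

Because the covariance is negative, subtracting it twice makes the variance of the difference larger than in the independent two-sample case.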
Sorry to bump such an old thread, but I got interested in a similar problem. After googling for hours I found an online calculator to compare percentages from one sample: http://www.quirks.com/resources/Calculator.aspx.
I then spent another few hours trying to figure out how this online calculator works and finally found a formula for the variance that gives the same results as the calculator: http://www.stat.wmich.edu/s160/book/node64.html
I noticed the result is different from the one given by BGM's formula. So the question now is: which one is correct?
I believe the equation you found is correct. BGM's is correct except that the covariance wasn't added twice (it should have a 2 in it). This can be seen in the covariance matrix, which (for two variables x and b) is
|x^2|xb|
|bx|b^2|
The xb and bx terms are the same, so together they contribute 2xb.
... if that makes sense.
Though I do agree with BGM that if you were making a pooled estimate (based on H0: p1=p2) then you'd use the pooled version of the formula instead.
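Putting numbers on it, here is a rough sketch that applies the three standard errors discussed in this thread to the red vs blue example. The sample size n = 400 is an assumption (the original post only gives percentages), and the function names are just illustrative.

```python
import math

def z_same_sample(p1, p2, n):
    """z for p1 - p2 when both shares come from one multinomial sample."""
    # Covariance of the two shares is -p1*p2/n, and it enters twice.
    variance = (p1 * (1 - p1) + p2 * (1 - p2) + 2 * p1 * p2) / n
    return (p1 - p2) / math.sqrt(variance)

def z_pooled(p1, p2, n):
    """Same comparison with the pooled variance under H0: p1 = p2."""
    variance = (p1 + p2) / n
    return (p1 - p2) / math.sqrt(variance)

def z_independent(p1, p2, n):
    """Two-independent-samples formula (ignores the covariance), for contrast."""
    variance = (p1 * (1 - p1) + p2 * (1 - p2)) / n
    return (p1 - p2) / math.sqrt(variance)

def two_sided_p(z):
    """Two-sided p-value from the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))

red, blue, n = 0.40, 0.35, 400  # n is an assumed sample size

for name, fn in [("same sample", z_same_sample),
                 ("pooled", z_pooled),
                 ("independent", z_independent)]:
    z = fn(red, blue, n)
    print(f"{name:12s} z = {z:.3f}  p = {two_sided_p(z):.3f}")
```

With these assumed numbers the same-sample z comes out around 1.16, well short of 1.96, so red and blue would not be significantly different at the 5% level. The independent-samples formula gives a larger z (about 1.46) because it leaves out the covariance, which is why it overstates significance here.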