Thank you in advance.


Also, one of the issues with surveys is that they sometimes draw disproportionately from the two tails of the normal curve (people who are angry, and people who support the idea).

For example, if you take a sample of 20 people and those surveys have a really high intraclass correlation, then you probably don't need an exceptionally large number of sampling units to draw inference. You can just weight them and draw your conclusions. On the other hand, if you sample 20 people and you get 20 vastly different sets of survey responses (unlikely on a 5-point scale), then you need a much larger sample to draw inference from.

Now if you are asking if you can just take the mean and SD of survey responses, you certainly can, but those are not always meaningful.

I can suggest a pretty good text if you would like that can point you in the right direction.

"Are you saying that the survey has issues of sample of convenience because of non-response rates? That's normal, and much of that can be accounted for by proper weighting."

Actually yes, I realized I did not really mean convenience after all, and the issue is indeed more of a nonresponse type. I gave everyone something to try, and they had to rate it on a scale of 1 to 5. They didn't all submit a rating. What sort of weighting would apply here, to what, and how? And absolutely, I would love your resources if you have some to offer! Thank you.

"Now if you are asking if you can just take the mean and SD of survey responses, you certainly can, but those are not always meaningful."

Why wouldn't they be? Is it because of the limited five-point scale I am using? You mentioned: "On the other hand, if you sample 20 people and you get 20 vastly different sets of survey responses (unlikely on a 5 point scale ), then you need a much larger sample to draw inference on." Then couldn't how different the responses are be measured with a measure of spread, such as the variance?
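For what it's worth, computing a measure of spread on 5-point responses is straightforward; the two response sets below are made-up numbers just to show how agreement vs. disagreement shows up in the variance:

```python
from statistics import mean, variance

# Hypothetical 5-point-scale responses: one set mostly agrees, one is spread out.
agreed = [5, 5, 4, 5, 5, 4, 5, 5]
spread = [1, 5, 2, 4, 3, 5, 1, 4]

# Sample mean and sample variance for each response set.
print(mean(agreed), variance(agreed))  # high mean, small variance
print(mean(spread), variance(spread))  # middling mean, large variance
```

Whether that variance is *meaningful* is the sticking point: on an ordinal 1-5 scale the distance between "1" and "2" is not guaranteed to equal the distance between "4" and "5", which is why the mean and SD are computable but not always interpretable.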

Thank you!

How you choose to weight your sample depends on what you care about, and again your target population.

If your research questions do not concern a particular subpopulation, and you only care about general responses, then it's probably not that big of a deal. As long as your response rate is decent (>50%), you should be OK. Now, if you find out there is a reason a certain group did not respond, then you have a coverage issue. But I doubt that.
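If it helps, the basic idea behind weighting for differential response is just to count each group's responses in proportion to the group's actual size rather than its response count. The groups, sizes, and ratings below are all hypothetical:

```python
# Sketch of weighting by group size (all numbers hypothetical).
groups = {
    # group: (people invited, responses received, mean rating of respondents)
    "A": (100, 80, 4.0),
    "B": (100, 20, 2.0),
}

total_invited = sum(n for n, _, _ in groups.values())
total_responses = sum(r for _, r, _ in groups.values())

# Unweighted mean over-represents group A's heavy responders.
unweighted = sum(r * m for _, r, m in groups.values()) / total_responses

# Weighted mean: each group's mean counts in proportion to its invited size.
weighted = sum(n * m for n, _, m in groups.values()) / total_invited

print(unweighted)  # 3.6
print(weighted)    # 3.0
```

Here the two groups are the same size in the population, so the weighted estimate splits the difference, while the raw mean is pulled toward the group that responded more.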

Before I can give much advice on the actual design analysis in terms of weighting, I would need to know whether you are trying to make inference on different populations. Or are you just making inference on the items rated?

If the latter is the case, you can just take your n and run something like a Mann-Whitney U test.
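A minimal sketch of that test, using made-up 1-5 ratings for two items (the ratings here are hypothetical, not from the thread):

```python
from scipy.stats import mannwhitneyu

# Hypothetical 1-5 ratings for two items.
item_a = [5, 4, 5, 5, 4, 5, 3, 5]
item_b = [2, 3, 1, 2, 3, 2, 4, 2]

# Two-sided test of whether one item tends to be rated higher than the other.
stat, p = mannwhitneyu(item_a, item_b, alternative="two-sided")
print(stat, p)
```

The Mann-Whitney test works on ranks, so it sidesteps the question of whether the 1-5 scale has equal intervals, which is part of why it's a natural fit for ordinal survey data.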

I suppose I am not as interested in comparing groups, but more interested in how much data I need to trust a response. If 500 people were given a test item and asked to rate it, they could do so on a scale of 1 to 5 (where 1 is really didn’t like and 5 is really did like). I am not necessarily concerned with what the rating is, in terms of exact value. I am concerned with whether the number of responses I get is enough to trust the rating. How can I be sure if the item is really likable or not? If I get 100 responses and they are all across the board, and the variation is large, then I don’t feel as though I could trust the rating. But if I had 100 responses and they were all a 5, then maybe I could trust the rating. The question I am wondering is whether an item can be considered likable or not. If only five people answered and they all answered 5, maybe it’s enough to say it’s a likable item. Though perhaps these people are the ones who really liked it and it’s not enough to trust the score.

Thank you again.

http://www.math.yorku.ca/SCS/Online/power/

This will tell you how many sampling units you need to detect an effect size of a certain level.

https://www.dssresearch.com/KnowledgeCenter/toolkitcalculators/statisticalpowercalculators.aspx

This one will calculate your power.

Usually a good design will aim for a power of .8 (that is, a type II error rate, beta, of .2).

As for adjusting for non-response bias: you can do that, and it involves adjusting your standard error using a finite population correction (FPC) factor, which changes your confidence intervals a touch. But honestly, if your sample is large enough, you will find that FPCs do not largely affect the estimates.
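For reference, the standard finite population correction multiplies the standard error by sqrt((N - n) / (N - 1)); the population size and sample sizes below are hypothetical:

```python
from math import sqrt

# Finite population correction: SE_adjusted = SE * sqrt((N - n) / (N - 1)).
# Hypothetical numbers: population of 500, 100 responses.
N, n = 500, 100
se = 0.14  # hypothetical unadjusted standard error of the mean rating

fpc = sqrt((N - n) / (N - 1))
print(fpc)       # ~0.895: with a large sampling fraction the correction matters
print(se * fpc)  # adjusted standard error

# With a small sampling fraction the correction is negligible:
print(sqrt((500 - 10) / (500 - 1)))  # ~0.991
```

This is consistent with the point above: once the sample is a small fraction of the population, the factor is close to 1 and barely moves the estimates.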

A second question is statistical power: how likely it is that you will reject the null if in fact you should (another way of putting this is how likely it is that you will not make a type II error). Again, there are tools online, such as G*Power, that deal with this. There is no easy way to know this except by doing such a power calculation.
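To make the idea concrete, here is a minimal sketch of a power calculation for a two-sided one-sample z-test; the 0.5-point shift and SD of 1.2 are hypothetical numbers, not from the thread, and dedicated tools like G*Power handle many more designs:

```python
from math import erf, sqrt

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + erf(x / sqrt(2)))

def power_one_sample_z(delta, sigma, n, z_crit=1.96):
    """Approximate power of a two-sided one-sample z-test at alpha = 0.05."""
    shift = delta * sqrt(n) / sigma
    # Probability the test statistic lands beyond either critical value.
    return norm_cdf(shift - z_crit) + norm_cdf(-shift - z_crit)

# Hypothetical: detect a 0.5-point shift on a 5-point scale with SD 1.2.
for n in (10, 30, 50, 100):
    print(n, round(power_one_sample_z(0.5, 1.2, n), 3))
```

Running this shows power climbing with n, which is exactly the trade-off the online calculators automate: fix the effect size you care about, then find the n that pushes power past .8.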

Third is the issue of non-response. Essentially this deals with whether those who respond to your survey are reflective of the population as a whole. To some extent that depends on who responds and who does not. I have never seen literature that suggests what, specifically, larger samples do to this issue; I am not sure it is knowable.

Finally, some statistical tests are only asymptotically accurate. Skew and other problems can seriously distort the results of these methods unless you have a "large" sample (which is not defined specifically).

All of these concepts are different, and you have to be careful about which one you are addressing.