voluntary survey responses and statistical analysis

#1
Sorry if my search of this forum failed to find a previous discussion of this issue. My simple understanding of surveys for which responses are voluntary is that one cannot validly calculate things like bias, validly adjust for any perceived bias, or determine a coefficient of variation for a population extrapolated from the responses, because the sample (the responses) is non-random.

My understanding seems reinforced by this thread: http://talkstats.com/showthread.php?t=2515&highlight=voluntary+survey

However, I continually see published survey results in which professors whale away at statistical analyses of responses to voluntary surveys. A frequent analysis compares the responses to a survey conducted by phone with the responses to the same survey conducted by mail; by comparing the difference of the means (using a z-test), the authors arrive at a statement about the bias, or lack thereof, in each data-collection method.

I also see attempts to calculate coefficients of variation from the non-random sample in order to determine a 95% confidence range for the population. Given that all responses were voluntary, I don't see how this is valid.
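To make the kind of calculation I'm describing concrete, here is a rough sketch in Python with made-up numbers (the data are purely hypothetical, not taken from any study I've seen):

```python
# Sketch of the two analyses I keep seeing, using invented numbers.
import numpy as np
from scipy import stats

phone = np.array([4.1, 3.8, 5.0, 4.4, 3.9, 4.7, 4.2, 4.6])  # hypothetical phone responses
mail = np.array([3.5, 4.0, 3.2, 3.9, 3.6, 4.1, 3.4, 3.8])   # hypothetical mail responses

# (1) Two-sample z-test on the difference of means, used to claim
#     presence or absence of bias between the collection methods.
se_diff = np.sqrt(phone.var(ddof=1) / len(phone) + mail.var(ddof=1) / len(mail))
z = (phone.mean() - mail.mean()) / se_diff
p = 2 * (1 - stats.norm.cdf(abs(z)))

# (2) Coefficient of variation and a 95% "confidence range" for the
#     population mean, extrapolated from the voluntary responses.
cv = phone.std(ddof=1) / phone.mean()
ci = phone.mean() + np.array([-1.96, 1.96]) * phone.std(ddof=1) / np.sqrt(len(phone))

print(z, p, cv, ci)
```

Both calculations treat the voluntary responses as if they were a random draw from the population, which is exactly the assumption I'm questioning.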

I am summarizing, of course, to make this post easier to follow, and I do not want to embarrass or "out" the author of any study, so I will not identify any specific study in which I've found this. My goal is to understand why this is done.

Are such attempts to use statistics complete nonsense? If so, why do I see them attempted by what appear to be tenured professors at reputable universities? I can't confirm whether the studies were published in peer-reviewed journals, but I can confirm without a doubt that they performed such studies and analyses and that they are findable on the internet.

My key concern is practical: using less-than-perfect data to make reasonable real-world decisions, not satisfying purist theory. Does that need make applying statistics to voluntary responses better than not attempting such calculations at all, or are such attempts pure nonsense no matter what?

Thanks for any insight!
 
#2
I don't really follow your last paragraph.

As I'm sure you know, the fact that something is published does NOT necessarily make it a worthwhile article. For every journal that is peer-reviewed, there is another that will publish anything if the author pays a fee.

I personally think survey research is crap. However, there is a need for this information in some respects, and I understand that others do not feel the same. The problems that arise from sampling, missing data, selective responding, and many other factors make it difficult to draw many reasonable conclusions from surveys. People do it, however, and some do it well.

With the importance of publications for tenure, you will unfortunately see people doing less than academically sound things at times. It's like a monster that is never full.

The onus is on the consumer of research to be able to discern crap research from valid research. Unfortunately, too many people are either not concerned with putting value above volume or were never taught to.

To answer your question, I think tenure is usually the reason people do it. Also, unfortunately, I feel that too many do not realize how ridiculous their research is. It's sad, really.
 

CB

Super Moderator
#3
However, I continually see published survey results in which professors whale away at statistical analyses of responses to voluntary surveys.
I think this is quite an important issue to raise. As I understand it, you're quite right in thinking that inferential methods which assume random sampling are routinely used with samples that are not random. E.g. null hypothesis significance tests and confidence intervals usually assume random sampling in order to generalise sample results to some population.

Kry talks about survey research being crap, but the fact is that this problem is not limited to survey research. In almost any study with human participants, the prospective participants have the right to decline to participate (and a fair chunk usually do). So human samples are almost never truly random - and this applies to experiments as much as it does to surveys.

What makes this issue particularly odd is that one way of looking at null hypothesis significance testing and confidence intervals is as a way of dealing with sampling error. There are certainly other important kinds of error that apply to studies with human participants: social desirability bias, acquiescent and extreme response styles, measurement error in the classical test theory sense, etc. There are ways to deal with these other errors, but by focusing on significance testing we instead spend much more effort dealing with a type of error that arises from a process (random sampling) which hasn't actually occurred in the first place.

I don't know if there's any magic bullet to sort this all out. Bootstrapping and randomisation tests don't necessarily assume random sampling, so they might be one avenue, but as I understand it they still require the assumption that the particular sample is representative of the population one hopes to generalise to. (Others here know a lot more about these types of tests.)
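Just to illustrate what I mean by bootstrapping, here is a minimal percentile-bootstrap sketch in Python (the data are invented, not from any particular study):

```python
# Percentile bootstrap: resample the observed (non-random) sample with
# replacement to get a 95% interval for the mean.  The interval still only
# describes whatever population the sample happens to represent.
import numpy as np

rng = np.random.default_rng(0)
sample = np.array([3.5, 4.0, 3.2, 3.9, 3.6, 4.1, 3.4, 3.8])  # hypothetical responses

boot_means = np.array([rng.choice(sample, size=len(sample), replace=True).mean()
                       for _ in range(10_000)])
lower, upper = np.percentile(boot_means, [2.5, 97.5])
print(lower, upper)
```

Note that nothing in the resampling step fixes the self-selection problem; it only avoids some of the distributional assumptions.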

Maybe the best we can do is to acknowledge that genuine random sampling is usually impossible - but that some non-random and convenience sampling designs are better than others. A study that randomly selects individuals from a carefully defined population and manages to get an 80% response rate from a telephone survey by expending a lot of time and money is clearly not equal to a study where the researcher asks his twitter followers to fill out a survey (and to ask their friends to fill it out too). The latter case is particularly worrying, since friends of a researcher are probably more likely to A) Know what the researcher's hypotheses are, and B) Be keen to "help" confirm them.

Those are my thoughts anyway - I'm sure others will be able to point out some holes in my thinking :)