It would depend on how biased the sample was or wasn't and whether you could speculate the directionality of the bias based on a validation sample using quantitative bias analysis.
Ignore the bootstrap as a possibility.
If the answer is no, then what is the point of stating the p-value (especially if it is, say, 0.10)?
Stop cowardice, ban guns!
I don't know whether it is valid to calculate a p value. What is not valid is to say that a p value from a convenience sample tells you anything at all about a larger population, and usually that is what you are interested in.
People have a habit of confusing sampling with statistics such as p values. They really are very different topics.
"Very few theories have been abandoned because they were found to be invalid on the basis of empirical evidence...." Spanos, 1995
Last edited by woodbomb; 01-13-2016 at 02:48 PM.
Your interpretation of the p-value is correct, but I think Noetsi is referencing the generalizability of a conclusion based on the p-value to another sample (with a different sampling strategy) or to a larger population - with the possibility that they could have different attributes.
It depends what you're trying to use the p value for. p values aren't always used to make inferences about populations. They can also be used to make inferences about causal effects within a sample, with random assignment to conditions.
E.g., you might interpret a p value from a randomized experiment as the probability of obtaining a test statistic as or more extreme than that observed under a null hypothesis that the IV had no effect whatsoever on the sample (and with the auxiliary assumption of a lack of confounding).
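A minimal sketch of that within-sample logic (all numbers invented, not from any real experiment): a permutation test computes the probability of a mean difference at least as extreme as the one observed, under the sharp null that treatment had no effect on any unit.

```python
import random

def permutation_p_value(treatment, control, n_perm=10_000, seed=0):
    """Two-sided permutation p-value for a difference in means, under
    the sharp null that treatment had no effect on any unit."""
    rng = random.Random(seed)
    observed = sum(treatment) / len(treatment) - sum(control) / len(control)
    pooled = list(treatment) + list(control)
    n_t = len(treatment)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # re-randomize the group labels
        diff = (sum(pooled[:n_t]) / n_t
                - sum(pooled[n_t:]) / (len(pooled) - n_t))
        if abs(diff) >= abs(observed):
            hits += 1
    return hits / n_perm

# Invented outcome scores from a hypothetical randomized experiment
treated = [12, 14, 11, 15, 13, 16]
controls = [10, 9, 11, 10, 8, 12]
p = permutation_p_value(treated, controls)
```

Note that nothing here appeals to a larger population: the inference is justified by the random assignment within the sample, not by how the sample was drawn.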
If you're using a p value to make inferences about a larger population, well, you can, but only if the assumptions of the statistical method you're using hold true. For example, let's say we're interested in testing a very simple hypothesis: That the mean of variable Y in population P takes a particular value (say, 10). Then a one-sample t-test allows us to test this hypothesis, and the p value will be "valid" provided that the following assumptions hold true:
1. Observations are independent
2. If we were to conduct repeated samplings, the sample means would have the same expected value as the population mean. (I.e., some sample means will be higher than the true population mean, and some lower, but in the long run over a very large* number of samplings, the average of the sample means would be the same as the population mean).
If you use random sampling, assumption (2) will be true (barring the presence of systematic measurement error). But with convenience sampling, there is no reason whatsoever to think that assumption (2) is true. In fact, it's barely even meaningful: What does it even mean to conduct repeated samplings, if the sampling method is not clearly defined?
TL;DR yes you can interpret p values based on convenience samples, but only by making assumptions, assumptions which are probably false and perhaps even meaningless in a convenience sample. But everyone does so anyway.
*Ok, infinite.
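As a toy illustration of assumption (2) (all numbers invented): under simple random sampling the long-run average of sample means tracks the population mean, while a selection mechanism that favors large values does not.

```python
import random

random.seed(42)

def mean(xs):
    return sum(xs) / len(xs)

# Invented population with true mean near 10
population = [random.gauss(10, 2) for _ in range(10_000)]
pop_mean = mean(population)

# Long-run average of sample means under simple random sampling
random_means = [mean(random.sample(population, 50)) for _ in range(500)]

# A crude stand-in for convenience sampling: units with larger
# values are more likely to respond (purely illustrative mechanism)
weights = [max(x, 0.1) for x in population]
convenience_means = [mean(random.choices(population, weights=weights, k=50))
                     for _ in range(500)]
```

Averaging `random_means` recovers `pop_mean`, while `convenience_means` settles on a systematically higher value - and no amount of repeated sampling under that mechanism fixes it.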
COWBOY,
It appears that we agree as to the meaningless nature of a p-value derived from a convenience sample (in the absence of support for certain assumptions). How odd that everyone does so anyway. As in, "Let's exercise precise reasoning based on unverifiable assumptions!" Whoo hoo!
I was referring to how generalizable your sample is to a larger group (commonly a population). For a convenience sample you cannot generalize to anything outside the convenience sample. It's not that the p value is wrong; it's that the sample it's calculated on tells you nothing about the larger population. So the p value tells you nothing (well, other than what the convenience sample tells you).
Personally I consider convenience samples worthless, although they are commonly used. Systematic errors (for example, people who care strongly about the topic are more likely to respond) are inherent in them.
Yeah, it's interesting. I suspect that a small part of the problem may be the fact that random sampling isn't explicitly an assumption of most statistical analyses, so it's easier to ignore. E.g., if we think of the assumptions of a linear model estimated via OLS:
- Errors have conditional mean zero for any combination of values of the predictors
- Errors have the same variance for any combination of values of the predictors
- Error terms are independent
- Error terms are normally distributed.
Random sampling doesn't appear anywhere on the list. But the first assumption - by far the most important - is almost certainly going to be breached if you don't have random sampling: it could well be the case that you systematically tend to select people who have positive errors for a particular combination of values of the predictors, or whatever. And then your estimates will be biased.
But although people tend to have a hazy idea that random sampling is a good thing, that connection between sampling and assumptions isn't usually drawn explicitly in most texts, so it's easier to ignore.
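A quick simulation of that point (hypothetical data-generating process, invented numbers): if selection into the sample depends on the error term, the OLS slope estimate is biased, even though "random sampling" never appears in the model's assumption list.

```python
import random

random.seed(1)

def ols_slope(xs, ys):
    """Closed-form OLS slope for a simple linear regression."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical data-generating process: y = 2 + 3x + error
xs = [random.uniform(0, 10) for _ in range(20_000)]
errors = [random.gauss(0, 3) for _ in xs]
ys = [2 + 3 * x + e for x, e in zip(xs, errors)]

# Random sample: slope estimate lands near the true value of 3
idx = random.sample(range(len(xs)), 2_000)
slope_random = ols_slope([xs[i] for i in idx], [ys[i] for i in idx])

# "Convenience" selection that depends on the error term: at high
# values of x, only units with positive errors enter the sample,
# so the errors no longer have conditional mean zero
sel = [i for i in range(len(xs)) if xs[i] < 5 or errors[i] > 0]
slope_conv = ols_slope([xs[i] for i in sel], [ys[i] for i in sel])
```

Here `slope_random` sits near 3 while `slope_conv` is pushed well above it - the selection mechanism, not the estimator, is what broke the inference.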
On a different level, I think you can also look at significance testing as a kind of weird social practice. Gerd Gigerenzer calls significance testing "the null ritual", which makes a lot of sense to me. Most researchers aren't quite sure what a significance test actually tells them even when its assumptions are met, and wouldn't care about the answer even if they did; but they do significance tests anyway, because that's what you do when you do a statistical analysis. I.e., it's a ritual - not an investigation.
Random sampling, generalizability, really has nothing to do with statistics per se. That is why it is not part of, say, the Gauss-Markov assumptions. All the assumptions address is whether the method estimates correctly on a given data set. They do not address - statistics generally does not address - whether you can use a statistic calculated on one sample to analyze a different sample or population.
Sampling is a distinct field from statistics and has its own entirely separate set of assumptions. People tend to assume they are the same field when they are not.
Also for consideration is the size of the convenience sample relative to the population, etc.
I am sure there is disagreement on this, but I don't think a convenience sample can ever be used to generalize to a larger population, regardless of N. Nor does having a larger n increase its usefulness (at least in theory). The key is how you sample, not how many you sample. If your sample is biased, having more biased people (in terms of estimating a true mean value - not that they have personal bias) does not make the situation better.
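A small simulation of this point (purely illustrative selection mechanism, invented numbers): the bias of a non-random sampling scheme does not shrink as n grows, whereas the error of a random sample of the same size does.

```python
import random

random.seed(7)

def mean(xs):
    return sum(xs) / len(xs)

# Invented population with true mean near 50
population = [random.gauss(50, 10) for _ in range(200_000)]
pop_mean = mean(population)

# Illustrative convenience mechanism: people above the mean are
# twice as likely to end up in the sample, no matter how many we take
weights = [2 if x > 50 else 1 for x in population]

def biased_sample(n):
    return random.choices(population, weights=weights, k=n)

bias_small = mean(biased_sample(1_000)) - pop_mean
bias_large = mean(biased_sample(50_000)) - pop_mean

# A random sample of the same large size, by contrast, lands
# very close to the population mean
random_error = mean(random.sample(population, 50_000)) - pop_mean
```

`bias_small` and `bias_large` are roughly the same size: collecting fifty times more observations under the same biased mechanism just gives a more precise estimate of the wrong number.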
Equating a convenience sample with a biased sample is unfounded, although it certainly has a tendency to be biased with regard to the population's characteristics. Remember, you can generalize results to members of other comparable samples collected with the same types of strategies. Also, data from convenience samples can provide good information, and p-values can be tested within them.
Say I was only able to sample wealthy people who work at a single type of business; is it not still worthwhile to examine these individuals and their traits?
hl,
Not sure what you mean by "p-values can be tested within them". A p-value can be used to decide whether to reject or fail to reject a null hypothesis, if that's what you mean.
Do you disagree with me (and COWBOY, I think) that a p-value derived from a convenience sample (in the absence of support for certain assumptions) is meaningless (since there is no evidence that the data were obtained from a RANDOM sample)?