Is it meaningful to calculate a p-value based on a convenience sample?

#1
Ignore the bootstrap as a possibility.

If the answer is no, then what is the point of stating the p-value (especially if it is, say, 0.10)?
 

hlsmith

Not a robit
#2
It would depend on how biased the sample was or wasn't and whether you could speculate the directionality of the bias based on a validation sample using quantitative bias analysis.
 

noetsi

Fortran must die
#3
I don't know if it is valid or not to calculate a p value. What is not valid is to say that a p value of a convenience sample tells you anything at all about a larger population. And usually that is what you are interested in.

People have a habit of confusing sampling with statistics like say p values. They really are very different topics.
 
#4
It would depend on how biased the sample was or wasn't and whether you could speculate the directionality of the bias based on a validation sample using quantitative bias analysis.
hl,

I'm assuming that there has been no validation sample and there is no information on bias. With that qualification, what do you think, and why?
 
#5
What is not valid is to say that a p value of a convenience sample tells you anything at all about a larger population. And usually that is what you are interested in.
Can you clarify? Doesn't a p-value state the probability of getting a statistic more extreme than a certain value given the null hypothesis (assuming a classical model)? Perhaps I misunderstand your point.
 

hlsmith

Not a robit
#6
Your interpretation of the p-value is correct, but I think Noetsi is referring to the generalizability of a conclusion based on the p-value to another sample (with a different sampling strategy) or to a larger population, with the possibility that they could have different attributes.
 

CB

Super Moderator
#7
It depends what you're trying to use the p value for. p values aren't always used to make inferences about populations. They can also be used to make inferences about causal effects within a sample, with random assignment to conditions.

E.g., you might interpret a p value from a randomized experiment as the probability of obtaining a test statistic at least as extreme as that observed under a null hypothesis that the IV had no effect whatsoever on the sample (and with the auxiliary assumption of a lack of confounding).
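
To make that interpretation concrete, here is a minimal Python sketch of a randomization (permutation) test. The outcome scores, the group sizes, and the function name `randomization_p_value` are all made up for illustration; nothing here comes from a real study. The only randomness it leans on is the random assignment to conditions, not random sampling from a population.

```python
import random


def randomization_p_value(treated, control, n_shuffles=10_000, seed=0):
    """Two-sided randomization-test p value for a difference in group means."""
    rng = random.Random(seed)
    n_treated = len(treated)
    n_control = len(control)
    observed = abs(sum(treated) / n_treated - sum(control) / n_control)
    pooled = list(treated) + list(control)
    hits = 0
    for _ in range(n_shuffles):
        # Under the null of no effect at all, the labels are arbitrary,
        # so re-deal them at random and recompute the difference.
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:n_treated]) / n_treated
                   - sum(pooled[n_treated:]) / n_control)
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            hits += 1
    # Add-one correction: the observed assignment is itself one possible deal.
    return (hits + 1) / (n_shuffles + 1)


# Hypothetical scores for a convenience sample split at random into two arms.
treated = [12.1, 14.3, 11.8, 15.2, 13.9, 14.8, 12.7, 15.5]
control = [10.2, 11.9, 12.4, 10.8, 11.1, 12.9, 10.5, 11.6]
p_value = randomization_p_value(treated, control)
print(f"randomization p value: {p_value:.4f}")
```

Because the reference distribution comes from re-shuffling the assignment labels, the resulting p value speaks only to these 16 (hypothetical) participants, which is exactly the within-sample inference described above.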

If you're using a p value to make inferences about a larger population, well, you can, but only if the assumptions of the statistical method you're using hold true. For example, let's say we're interested in testing a very simple hypothesis: That the mean of variable Y in population P takes a particular value (say, 10). Then a one-sample t-test allows us to test this hypothesis, and the p value will be "valid" provided that the following assumptions hold true:

1. Observations are independent
2. If we were to conduct repeated samplings, the sample means would have the same expected value as the population mean. (I.e., some sample means will be higher than the true population mean, and some lower, but in the long run over a very large* number of samplings, the average of the sample means would be the same as the population mean).

If you use random sampling, assumption (2) will be true (barring the presence of systematic measurement error). But with convenience sampling, there is no reason whatsoever to think that assumption (2) is true. In fact, it's barely even meaningful: What does it even mean to conduct repeated samplings, if the sampling method is not clearly defined?

TL;DR yes you can interpret p values based on convenience samples, but only by making assumptions, assumptions which are probably false and perhaps even meaningless in a convenience sample. But everyone does so anyway.

*Ok, infinite.
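
A rough simulation sketch of assumption (2), in Python. Everything in it is an assumption made for illustration: the synthetic population, the sample size of 30, and especially the convenience mechanism (units with larger Y are easier to reach), which is just one of endless ways a convenience sample could go wrong. The null hypothesis is set to the true population mean, so any rejection is a false alarm.

```python
import math
import random
from itertools import accumulate
from statistics import mean, stdev

T_CRIT_DF29 = 2.045  # two-sided 5% critical value of Student's t with 29 df


def simulate(draw_sample, null_mean, n_reps=2000):
    """Return (average of the sample means, share of samples rejecting H0)."""
    sample_means = []
    rejections = 0
    for _ in range(n_reps):
        sample = draw_sample()
        m = mean(sample)
        t = (m - null_mean) / (stdev(sample) / math.sqrt(len(sample)))
        sample_means.append(m)
        if abs(t) > T_CRIT_DF29:
            rejections += 1
    return mean(sample_means), rejections / n_reps


rng = random.Random(1)
population = [rng.gauss(10, 2) for _ in range(5000)]  # synthetic population
mu = mean(population)  # H0 "population mean = mu" is true by construction

# Hypothetical convenience mechanism: larger Y means easier to reach.
cum_weights = list(accumulate(math.exp(0.3 * (y - 10)) for y in population))


def random_sample(n=30):
    return rng.sample(population, n)


def convenience_sample(n=30):
    return rng.choices(population, cum_weights=cum_weights, k=n)


random_result = simulate(random_sample, mu)
convenience_result = simulate(convenience_sample, mu)
print(f"population mean:      {mu:.2f}")
print(f"random sampling:      mean of means {random_result[0]:.2f}, "
      f"rejection rate {random_result[1]:.3f}")
print(f"convenience sampling: mean of means {convenience_result[0]:.2f}, "
      f"rejection rate {convenience_result[1]:.3f}")
```

Under random sampling the average of the sample means should sit close to the population mean and the test should reject a true null at roughly its nominal rate; under this particular convenience mechanism neither holds. Note that the simulation only works because the convenience mechanism was written down explicitly, which is precisely what is missing in practice.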
 
#8
COWBOY,

It appears that we agree as to the meaningless nature of a p-value derived from a convenience sample (in the absence of support for certain assumptions). How odd, that everyone does so anyway. As in, "Let's exercise precise reasoning based on unverifiable assumptions!" Whoo hoo!
 

noetsi

Fortran must die
#9
I was referring to how generalizable your sample was to a larger sample (commonly a population). For a convenience sample you cannot generalize to anything outside the convenience sample. It's not that the p value is wrong, it's that the sample it's calculated on tells you nothing. So the p value tells you nothing (well, other than what the convenience sample tells you).

Personally I consider convenience samples worthless although they are commonly used. Systematic errors, for example people who care strongly about the topic are more likely to comment, are inherent in them.
 

CB

Super Moderator
#10
It appears that we agree as to the meaningless nature of a p-value derived from a convenience sample (in the absence of support for certain assumptions). How odd, that everyone does so anyway. As in, "Let's exercise precise reasoning based on unverifiable assumptions!" Whoo hoo!
Yeah, it's interesting. I suspect that a small part of the problem may be the fact that random sampling isn't explicitly an assumption of most statistical analyses, so it's easier to ignore. E.g., if we think of the assumptions of a linear model estimated via OLS:

  1. Errors have conditional mean zero for any combination of values of the predictors
  2. Errors have the same variance for any combination of values of the predictors
  3. Error terms are independent
  4. Error terms are normally distributed.

Random sampling doesn't appear anywhere on the list. But assumption 1 - by far the most important assumption - is almost certainly going to be breached if you don't have random sampling: It could well be the case that you systematically tend to select people who have positive errors for a particular combination of values of the predictors, or whatever. And then your estimates will be biased.
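
Here is a small Python sketch of that breach of assumption 1. The data-generating model (y = 1 + 2x + error), the sample sizes, and the selection rule are all invented for the illustration: in the hypothetical convenience sample, units with x above 5 are only reachable when their error is positive.

```python
import random


def ols_fit(xs, ys):
    """Intercept and slope of a simple least-squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def fit_units(units):
    return ols_fit([u[0] for u in units], [u[1] for u in units])


rng = random.Random(42)

# Synthetic population in which the true slope is exactly 2.
population = []
for _ in range(50_000):
    x = rng.uniform(0, 10)
    error = rng.gauss(0, 2)
    population.append((x, 1 + 2 * x + error, error))

# Random sample: every unit is equally likely to be drawn.
random_units = rng.sample(population, 2000)

# Hypothetical convenience sample: for x > 5 only positive-error units
# can be reached, so the errors no longer have conditional mean zero.
reachable = [u for u in population if u[0] <= 5 or u[2] > 0]
convenience_units = rng.sample(reachable, 2000)

random_intercept, random_slope = fit_units(random_units)
conv_intercept, conv_slope = fit_units(convenience_units)
print(f"true slope: 2.00")
print(f"random sample slope:      {random_slope:.3f}")
print(f"convenience sample slope: {conv_slope:.3f}")
```

The fitting code is identical in both cases and nothing in the output flags a problem; the slope is pulled upward purely by how the units got into the sample.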

But although people tend to have a hazy idea that random sampling is a good thing, that connection between sampling and assumptions isn't usually drawn explicitly in most texts, so it's easier to ignore.

On a different level, I think you can also look at significance testing as a kind of weird social practice. Gerd Gigerenzer calls significance testing "the null ritual", which makes a lot of sense to me. Most researchers aren't quite sure what a significance test actually tells them even when its assumptions are met, and wouldn't care about the answer even if they did; but they do significance tests anyway, because that's what you do when you do a statistical analysis. I.e., it's a ritual - not an investigation.
 

noetsi

Fortran must die
#11
Random sampling, or generalizability, really has nothing to do with statistics per se. That is why it is not part of, say, the Gauss-Markov assumptions. All the assumptions address is whether the method estimates correctly on a given data set. They do not address (and statistics generally does not address) whether you can use a statistic from one sample to analyze a different sample or a population.

Sampling is a distinct field from statistics and has its own entirely separate set of assumptions. People tend to assume they are the same field when they are not.
 

noetsi

Fortran must die
#13
I am sure there is disagreement on this, but I don't think a convenience sample ever can be used to generalize to a larger population regardless of N. Nor does having a larger n increase its usefulness (at least in theory). The key is how you sample, not how many you sample. If your sample is biased, having more biased people (in terms of estimating a true mean value, not that they have personal bias :) ) does not make the situation better.
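
A quick Python sketch of the "more n does not help" point. The population and the selection mechanism are invented: here the chance of ending up in the convenience sample rises with the value being measured, which is only one hypothetical form of bias.

```python
import math
import random
from itertools import accumulate

rng = random.Random(7)
population = [rng.gauss(50, 10) for _ in range(100_000)]  # synthetic values
true_mean = sum(population) / len(population)

# Hypothetical selection: inclusion chance rises with the measured value.
cum_weights = list(accumulate(math.exp(0.05 * (y - 50)) for y in population))


def estimation_errors(n):
    """Errors of the sample mean under random and convenience sampling."""
    random_draw = rng.sample(population, n)
    convenience_draw = rng.choices(population, cum_weights=cum_weights, k=n)
    return (sum(random_draw) / n - true_mean,
            sum(convenience_draw) / n - true_mean)


results = {n: estimation_errors(n) for n in (100, 1_000, 10_000)}
for n, (random_error, convenience_error) in results.items():
    print(f"n = {n:>6}: random error {random_error:+.2f}, "
          f"convenience error {convenience_error:+.2f}")
```

The random-sample error shrinks as n grows, while the convenience-sample error settles near the size of the selection bias instead of near zero. A bigger biased sample mostly buys a more precise estimate of the wrong number.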
 
#14
Equating a convenience sample with a biased sample is unfounded. It definitely has a tendency to be biased with regard to the population's characteristics. Remember you can generalize results to members of other samples that are comparable because they were collected with the same types of strategies. Also, data from convenience samples can provide good information and pvalues can be tested within them.


Say I was only able to sample wealthy people who work at a single type of business; does it hurt to examine these individuals and their traits?
 
#15
Also, data from convenience samples can provide good information and pvalues can be tested within them.
hl,

Not sure what you mean by "pvalues can be tested within them". A p-value can be used to decide whether or not to reject a null hypothesis, if that's what you mean.

Do you disagree with me (and COWBOY, I think) that a p-value derived from a convenience sample (in the absence of support for certain assumptions) is meaningless (since there is no evidence that the data were obtained from a RANDOM sample)?
 

CB

Super Moderator
#16
Random sampling, or generalizability, really has nothing to do with statistics per se. That is why it is not part of, say, the Gauss-Markov assumptions. All the assumptions address is whether the method estimates correctly on a given data set. They do not address (and statistics generally does not address) whether you can use a statistic from one sample to analyze a different sample or a population.

Sampling is a distinct field from statistics and has its own entirely separate set of assumptions. People tend to assume they are the same field when they are not.
I'm sorry noetsi but this is just not true at all, not even a little bit. The defining goal of inferential statistics is to make conclusions about a population on the basis of a sample! And when we talk about "assumptions" of particular statistics, we're talking about the assumptions necessary for the statistics to have particular desirable properties as estimates of population parameters.
 

CB

Super Moderator
#17
I am sure there is disagreement on this, but I don't think a convenience sample ever can be used to generalize to a larger population regardless of N.
Sorry, disagreeing with you again, but yes you can use convenience samples to make generalisations in some cases, and even quite accurate generalisations. Andrew Gelman is one of the big names in this area. See for example this paper using a convenience sample obtained via the Xbox platform to predict election results. This is obviously a complex area, but don't write it off without looking into it.
 
#18
Sorry, disagreeing with you again, but yes you can use convenience samples to make generalisations in some cases, and even quite accurate generalisations.
CowboyBear,

Thanks for the link. I see the following

"The central idea of MRP is to partition the data into thousands of demographic cells, estimate voter intent at the cell level using a multilevel regression model, and finally aggregate the cell-level estimates in accordance with the target population’s demographic composition."
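
The final aggregation step in that passage can be sketched in a few lines of Python. This is only the poststratification half: the cell estimates below are raw cell proportions, whereas MRP proper would get them from a multilevel regression model. The age cells, the counts, the population shares, and the function name `poststratify` are all hypothetical.

```python
def poststratify(cell_estimates, population_shares):
    """Weight cell-level estimates by each cell's share of the target population."""
    total_share = sum(population_shares.values())
    return sum(cell_estimates[cell] * share / total_share
               for cell, share in population_shares.items())


# Hypothetical convenience sample that over-represents young respondents:
# cell -> (respondents, respondents supporting candidate A)
sample = {
    "18-29": (600, 360),
    "30-44": (300, 150),
    "45-64": (80, 32),
    "65+": (20, 6),
}

# Hypothetical demographic composition of the target population.
population_shares = {"18-29": 0.20, "30-44": 0.25, "45-64": 0.35, "65+": 0.20}

# Raw cell proportions stand in for the model-based cell estimates here.
cell_estimates = {cell: yes / n for cell, (n, yes) in sample.items()}

raw_estimate = (sum(yes for _, yes in sample.values())
                / sum(n for n, _ in sample.values()))
adjusted_estimate = poststratify(cell_estimates, population_shares)
print(f"raw sample proportion:   {raw_estimate:.3f}")
print(f"poststratified estimate: {adjusted_estimate:.3f}")
```

The adjustment only corrects for imbalance on the variables that define the cells, and it assumes respondents within a cell resemble non-respondents in that cell. With tiny cells (20 respondents aged 65+ here) the raw proportions get noisy, which is where the multilevel model earns its keep.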

It turns out that I am interested in assessing a test of independence that involved 128 observations classified into a 2 by 4 table. I wonder if the MRP approach is applicable to such a setting. Thoughts?
 

CB

Super Moderator
#19
Umm, I'm not totally sure, but I suspect MRP might be a much more complex tool than you need here. Wouldn't a chi-square test do the trick?
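
For reference, a bare-bones Python sketch of a chi-square test of independence on a 2 by 4 table. The counts are made up (they merely sum to 128 and are not the data being discussed), and the tail-probability helper `chi2_sf_df3` uses a closed form that is valid only for 3 degrees of freedom, which is what a 2 by 4 table gives.

```python
import math


def chi_square_statistic(table):
    """Pearson chi-square statistic for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with exactly 3 df."""
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))


# Hypothetical 2 x 4 table of counts summing to 128 (illustrative only).
table = [
    [20, 15, 18, 11],
    [12, 19, 14, 19],
]
statistic = chi_square_statistic(table)
p_value = chi2_sf_df3(statistic)  # df = (2 - 1) * (4 - 1) = 3
print(f"chi-square = {statistic:.3f}, df = 3, p = {p_value:.3f}")
```

The arithmetic runs the same way however the 128 observations were obtained; whether the resulting p value supports a statement about a population still hinges on how they were sampled.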
 
#20
Umm, I'm not totally sure, but I suspect MRP might be a much more complex tool than you need here. Wouldn't a chi-square test do the trick?
A chi-square was what was used. But there was no reason that I can see to assume a random sample. In short, I see no justification for using the chi-square in the first place.