- Thread starter woodbomb
- Tags convenience p-value significance

People have a habit of confusing sampling with statistics (p values, say). They really are very different topics.

It would depend on how biased the sample was or wasn't, and on whether you could speculate about the direction of the bias based on a validation sample using quantitative bias analysis.

I'm assuming that there has been no validation sample and there is no information on bias. With that qualification, what do you think, and why?

What is not valid is to say that a p value of a convenience sample tells you anything at all about a larger population. And usually that is what you are interested in.

E.g., you might interpret a p value from a randomized experiment as the probability of obtaining a test statistic at least as extreme as that observed, under a null hypothesis that the IV had no effect whatsoever on the sample (and with the auxiliary assumption of a lack of confounding).

If you're using a p value to make inferences about a larger population, well, you can, but only if the assumptions of the statistical method you're using hold true. For example, let's say we're interested in testing a very simple hypothesis: That the mean of variable Y in population P takes a particular value (say, 10). Then a one-sample t-test allows us to test this hypothesis, and the p value will be "valid" provided that the following assumptions hold true:

1. Observations are independent

2. If we were to conduct repeated samplings, the sample means would have the same expected value as the population mean. (I.e., some sample means will be higher than the true population mean, and some lower, but in the long run over a very large* number of samplings, the average of the sample means would be the same as the population mean).

If you use random sampling, assumption (2) will be true (barring the presence of systematic measurement error). But with convenience sampling, there is no reason whatsoever to think that assumption (2) is true. In fact, it's barely even

TL;DR yes you can interpret p values based on convenience samples, but only by making assumptions, assumptions which are probably false and perhaps even meaningless in a convenience sample. But everyone does so anyway.

*Ok, infinite.
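A quick way to see what assumption (2) buys you is to simulate it. The sketch below is illustrative only: the population, the selection rule (units with higher Y are easier to reach), the sample size, and all names are assumptions made up for the example. The null hypothesis is set to the true population mean, so every rejection is a false positive.

```python
import math
import random
import statistics

rng = random.Random(42)

# Hypothetical population of Y values. The null value is the true mean,
# so any rejection below is a false positive.
population = [rng.gauss(10, 2) for _ in range(5000)]
true_mean = statistics.fmean(population)

N = 30          # sample size
T_CRIT = 2.045  # two-sided 5% critical value of the t distribution, 29 df
REPS = 2000     # number of repeated samplings

# Assumed convenience mechanism: higher Y values are more likely to be reached.
cum_weights = []
running = 0.0
for y in population:
    running += math.exp(0.3 * (y - 10))
    cum_weights.append(running)


def random_sample():
    # Simple random sample without replacement.
    return rng.sample(population, N)


def convenience_sample():
    # Selection probability proportional to the assumed weights.
    return rng.choices(population, cum_weights=cum_weights, k=N)


def t_statistic(sample, mu0):
    se = statistics.stdev(sample) / math.sqrt(len(sample))
    return (statistics.fmean(sample) - mu0) / se


def summarize(draw):
    # Average of the sample means and the share of samples rejecting the null.
    means, rejections = [], 0
    for _ in range(REPS):
        s = draw()
        means.append(statistics.fmean(s))
        if abs(t_statistic(s, true_mean)) > T_CRIT:
            rejections += 1
    return statistics.fmean(means), rejections / REPS


random_avg, random_rate = summarize(random_sample)
conv_avg, conv_rate = summarize(convenience_sample)

print(f"true mean: {true_mean:.2f}")
print(f"random:      mean of means {random_avg:.2f}, rejection rate {random_rate:.3f}")
print(f"convenience: mean of means {conv_avg:.2f}, rejection rate {conv_rate:.3f}")
```

Under random sampling the average of the sample means should sit close to the true mean and the rejection rate close to the nominal 5%. Under the assumed selection rule both drift away, which is the sense in which the p value stops being "valid".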

COWBOY,

It appears that we agree as to the meaningless nature of a p-value derived from a convenience sample (in the absence of support for certain assumptions). How odd, that everyone does so anyway. As in, "Let's exercise precise reasoning based on unverifiable assumptions!" Whoo hoo!

Personally I consider convenience samples worthless, although they are commonly used. Systematic errors are inherent in them; for example, people who care strongly about the topic are more likely to comment.

- Errors have conditional mean zero for any combination of values of the predictors
- Errors have the same variance for any combination of values of the predictors
- Error terms are independent
- Error terms are normally distributed.

Random sampling doesn't appear anywhere on the list. But assumption 1 - by far the most important assumption - is almost certainly going to be breached if you don't have random sampling: It could well be the case that you systematically tend to select people who have positive errors for a particular combination of values of the predictors, or whatever. And then your estimates will be biased.
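To make the point about assumption 1 concrete, here is a small simulation sketch. Everything in it is hypothetical: the data-generating model (y = 1 + 2x + error), the selection rule (units with x > 0 and a negative error are mostly missed), and the names. It fits ordinary least squares by hand on the full population and on the selected subset.

```python
import random

rng = random.Random(0)


def ols(xs, ys):
    # Simple-regression OLS: returns (intercept, slope).
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical population: the true intercept is 1 and the true slope is 2.
population = []
for _ in range(20000):
    x = rng.gauss(0, 1)
    e = rng.gauss(0, 1)
    population.append((x, 1.0 + 2.0 * x + e, e))


def selected(x, e):
    # Assumed convenience mechanism: for x > 0, units with negative errors
    # are kept only 20% of the time, so errors no longer average to zero there.
    if x > 0 and e < 0:
        return rng.random() < 0.2
    return True


convenience = [(x, y) for x, y, e in population if selected(x, e)]

full_intercept, full_slope = ols([p[0] for p in population],
                                 [p[1] for p in population])
conv_intercept, conv_slope = ols([c[0] for c in convenience],
                                 [c[1] for c in convenience])

print(f"full population:    intercept {full_intercept:.2f}, slope {full_slope:.2f}")
print(f"convenience subset: intercept {conv_intercept:.2f}, slope {conv_slope:.2f}")
```

The subset is large, so this is not a small-sample problem: the estimates are off because the selection rule makes the conditional mean of the errors depend on x.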

But although people tend to have a hazy idea that random sampling is a good thing, that connection between sampling and assumptions isn't usually drawn explicitly in most texts, so it's easier to ignore.

On a different level, I think you can also look at significance testing as a kind of weird social practice. Gerd Gigerenzer calls significance testing "the null ritual", which makes a lot of sense to me. Most researchers aren't quite sure what a significance test actually tells them even when its assumptions

Sampling is a distinct field from statistics and has its own entirely separate set of assumptions. People tend to assume they are the same field when they are not.

Say I was only able to sample wealthy people who work at a single type of business. Does it hurt to examine these individuals and their traits?

Also, data from convenience samples can provide good information and pvalues can be tested within them.

Not sure what you mean by "pvalues can be tested within them". A p-value can be used to decide whether to reject or fail to reject a null hypothesis, if that's what you mean.

Do you disagree with me (and COWBOY, I think) that a p-value derived from a convenience sample (in the absence of support for certain assumptions) is meaningless (since there is no evidence that the data were obtained from a RANDOM sample)?

I am sure there is disagreement on this, but I don't think a convenience sample can ever be used to generalize to a larger population, regardless of N.

Sorry, disagreeing with you again, but yes you can use convenience samples to make generalisations in some cases, and even quite accurate generalisations.

Thanks for the link. I see the following:

"The central idea of MRP is to partition the data into thousands of demographic cells, estimate voter intent at the cell level using a multilevel regression model, and finally aggregate the cell-level estimates in accordance with the target population’s demographic composition."

It turns out that I am interested in assessing a test of independence that involved 128 observations classified into a 2 by 4 table. I wonder if the MRP approach is applicable to such a setting.

Thoughts?
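For reference, the aggregation half of that description can be sketched in a few lines. This is plain poststratification, not MRP: the multilevel regression step is swapped for raw cell means, and the cells, responses, and population shares are hypothetical numbers chosen only to show the arithmetic.

```python
from collections import defaultdict


def poststratify(sample, population_shares):
    """Weight cell-level means by each cell's share of the target population.

    sample: iterable of (cell, outcome) pairs.
    population_shares: dict mapping cell -> share of the population (sums to 1).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cell, outcome in sample:
        sums[cell] += outcome
        counts[cell] += 1

    estimate = 0.0
    for cell, share in population_shares.items():
        if counts[cell] == 0:
            # Raw cell means cannot handle empty cells; a model would be needed.
            raise ValueError(f"no observations in cell {cell!r}")
        estimate += share * (sums[cell] / counts[cell])
    return estimate


# Hypothetical convenience sample: "young" respondents are over-represented.
sample = ([("young", 1)] * 6 + [("young", 0)] * 2
          + [("old", 1)] * 1 + [("old", 0)] * 1)

# Hypothetical composition of the target population.
population_shares = {"young": 0.4, "old": 0.6}

raw_mean = sum(outcome for _, outcome in sample) / len(sample)
adjusted = poststratify(sample, population_shares)

print(f"raw sample mean:         {raw_mean:.2f}")
print(f"poststratified estimate: {adjusted:.2f}")
```

The adjustment only corrects for imbalance on the variables that define the cells. It still assumes that, within a cell, the people who ended up in the sample look like the people who did not.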

Umm, I'm not totally sure, but I suspect MRP might be a much more complex tool than you need here. Wouldn't a chi-square test do the trick?
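Something like the following would do it. This is a sketch using only the standard library; the counts are hypothetical (chosen to total 128 in a 2 by 4 layout) since the actual table isn't given in the thread. A 2 by 4 table has (2 - 1)(4 - 1) = 3 degrees of freedom, and for 3 degrees of freedom the chi-square tail probability has a closed form, which is what `chi2_sf_df3` uses.

```python
import math


def chi_square_independence(table):
    # Pearson chi-square statistic and degrees of freedom for an r x c table.
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed_count in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed_count - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def chi2_sf_df3(x):
    # Upper-tail probability of a chi-square variable with exactly 3 df.
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))


# Hypothetical 2 by 4 table with 128 observations in total.
observed = [
    [20, 15, 18, 11],
    [10, 17, 14, 23],
]

stat, df = chi_square_independence(observed)
p_value = chi2_sf_df3(stat)  # valid here only because df == 3

print(f"chi-square = {stat:.3f}, df = {df}, p = {p_value:.4f}")
```

Of course, whatever p value comes out is subject to everything said earlier in the thread: it describes the 128 observations under the test's assumptions, and says nothing by itself about how they were obtained.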