# Thread: Is it meaningful to calculate a p-value based on a convenience sample?

1. ## Re: Is it meaningful to calculate a p-value based on a convenience sample?

Originally Posted by noetsi
Random sampling, generalizability, really has nothing to do with statistics per se. That is why it is not part of say the Gauss Markov assumptions. All the assumptions go to is whether the method estimates correctly on a given data base. They do not address, statistics does not address generally, whether you can use a given statistic to analyze a specific sample or population from another sample.

Sampling is a distinct field from statistics and has its own entirely separate set of assumptions. People tend to assume they are the same field when they are not.
I'm sorry, noetsi, but this is just not true at all, not even a little bit. The defining goal of inferential statistics is to draw conclusions about a population on the basis of a sample! And when we talk about the "assumptions" of particular statistics, we're talking about the assumptions necessary for those statistics to have particular desirable properties as estimates of population parameters.

2. ## Re: Is it meaningful to calculate a p-value based on a convenience sample?

Originally Posted by noetsi
I am sure there is disagreement on this, but I don't think a convenience sample ever can be used to generalize to a larger population regardless of N.
Sorry, disagreeing with you again, but yes, you can use convenience samples to make generalisations in some cases, and even quite accurate generalisations. Andrew Gelman is one of the big names in this area. See for example this paper, which uses a convenience sample obtained via the Xbox platform to predict election results. This is obviously a complex area, but don't write it off without looking into it.

3. ## Re: Is it meaningful to calculate a p-value based on a convenience sample?

Originally Posted by CowboyBear
Sorry, disagreeing with you again, but yes you can use convenience samples to make generalisations in some cases, and even quite accurate generalisations.
CowboyBear,

Thanks for the link. I see the following:

"The central idea of MRP is to partition the data into thousands of demographic cells, estimate voter intent at the cell level using a multilevel regression model, and finally aggregate the cell-level estimates in accordance with the target population’s demographic composition."

It turns out that I am interested in assessing a test of independence that involved 128 observations classified into a 2 by 4 table. I wonder if the MRP approach is applicable to such a setting. Thoughts?
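To make the aggregation step in that quote concrete, here is a minimal sketch in Python. It is not the paper's method: it skips the multilevel regression entirely and uses raw cell proportions, so it is plain poststratification rather than MRP. The cells, counts, and population shares are all made up for illustration.

```python
# Hypothetical cells: (label, sample_n, sample_yes, population_share).
# All numbers are invented; population_share is the assumed share of each
# cell in the target population.
cells = [
    ("18-29", 60, 42, 0.20),
    ("30-44", 40, 22, 0.25),
    ("45-64", 20, 8, 0.35),
    ("65+", 8, 2, 0.20),
]


def raw_estimate(cells):
    """Proportion of 'yes' in the sample, ignoring who was sampled."""
    total_n = sum(n for _, n, _, _ in cells)
    total_yes = sum(yes for _, _, yes, _ in cells)
    return total_yes / total_n


def poststratified_estimate(cells):
    """Weight each cell's raw proportion by its population share.

    MRP would replace yes / n with a model-based cell estimate; this
    sketch uses the raw cell proportion instead.
    """
    total_share = sum(share for _, _, _, share in cells)
    weighted = sum((yes / n) * share for _, n, yes, share in cells)
    return weighted / total_share


raw = raw_estimate(cells)
post = poststratified_estimate(cells)
print(f"raw sample proportion:    {raw:.4f}")
print(f"poststratified estimate:  {post:.4f}")
```

In this invented example the sample over-represents the youngest cell, so the two estimates differ. The adjustment is only as good as the assumed population shares and the assumption that respondents within a cell resemble non-respondents in that cell.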

4. ## Re: Is it meaningful to calculate a p-value based on a convenience sample?

Originally Posted by woodbomb
CowboyBear,

Thanks for the link. I see the following:

"The central idea of MRP is to partition the data into thousands of demographic cells, estimate voter intent at the cell level using a multilevel regression model, and finally aggregate the cell-level estimates in accordance with the target population’s demographic composition."

It turns out that I am interested in assessing a test of independence that involved 128 observations classified into a 2 by 4 table. I wonder if the MRP approach is applicable to such a setting. Thoughts?
Umm, I'm not totally sure, but I suspect MRP might be a much more complex tool than you need here. Wouldn't a chi-square test do the trick?
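For reference, here is roughly what a chi-square test of independence on a 2 by 4 table looks like in plain Python. This is only a sketch: the counts are made up (they just sum to 128), and the p-value helper uses the closed-form upper tail of the chi-square distribution that holds for exactly 3 degrees of freedom, so it would not work for other table sizes.

```python
import math

# Hypothetical 2 x 4 table of counts, invented for illustration (total 128).
observed = [
    [20, 15, 18, 11],
    [12, 17, 14, 21],
]


def chi_square_statistic(table):
    """Pearson chi-square statistic for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (obs - expected) ** 2 / expected
    return stat


def chi_square_sf_df3(x):
    """Upper-tail probability of a chi-square variable with exactly 3 df."""
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))


stat = chi_square_statistic(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)  # (2 - 1) * (4 - 1) = 3
p_value = chi_square_sf_df3(stat)
print(f"chi-square = {stat:.2f}, df = {df}, p = {p_value:.3f}")
```

The arithmetic runs the same way however the 128 observations were collected. Whether the resulting p-value can be read as a statement about a wider population is a separate question that the calculation itself does not answer.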

5. ## Re: Is it meaningful to calculate a p-value based on a convenience sample?

Originally Posted by CowboyBear
Umm, I'm not totally sure, but I suspect MRP might be a much more complex tool than you need here. Wouldn't a chi-square test do the trick?
A chi-square was what was used. But there was no reason that I can see to assume a random sample. In short, I see no justification for using the chi-square in the first place.

6. ## Re: Is it meaningful to calculate a p-value based on a convenience sample?

Originally Posted by woodbomb
A chi-square was what was used. But there was no reason that I can see to assume a random sample. In short, I see no justification for using the chi-square in the first place.
Oh right, sorry, I forgot the context of the conversation! I only know about MRP in terms of the very broad concepts - it's possible you could apply it here, but I just don't know enough about the practicalities of implementation to be much help.
