I have read and been told that p-values do not matter when you have the entire population of interest: the effect size you find is the true effect size, even if the p-value says the result is not statistically significant. I have also been told that, when you have the population, p-values do not tell you which variables to keep in the model and which to drop. I am ignoring here the issue of whether a population can change over time.

I am trying to find a citation for this, ideally one online (and not hosted outside the US, since at work we are blocked from foreign sites). I am discussing this with a colleague.


This is what my colleague sent (how this is decided makes a significant practical difference for our organization).

In the context of a regression equation, a p-value indicates the statistical impact of the independent variable as a predictor of the outcome variable, regardless of sample vs population.

For example: Var1 has a p-value of 0.0001; Var2 has a p-value of 0.003; Var3 has a p-value of 0.823.

In this model, variables 1 and 2 should remain in the model, and variable 3 should be removed.

Population vs sample does not apply in this situation. [That is, it does not matter whether you have the population or not.]
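
To make concrete what I mean by "the effect size you find is the true effect size", here is a minimal sketch in Python. Everything in it is an assumption for illustration: the data are made up, and the names `ols_slope`, `population_slope`, and `sample_slopes` are hypothetical. It only shows that a slope computed on every unit of a finite population is a fixed number, while slopes computed on random samples from that population vary around it. It does not by itself settle which variables belong in a model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical finite population of N units (made-up data, illustration only).
N = 500
x = rng.normal(size=N)
y = 0.05 * x + rng.normal(size=N)


def ols_slope(x, y):
    """Slope of the least-squares line of y on x (model with an intercept)."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))


# Computed on every unit: this is the population slope itself, not an estimate.
population_slope = ols_slope(x, y)

# Slopes from repeated simple random samples of n units (without replacement)
# vary around the population slope. That sample-to-sample variation is the
# uncertainty a p-value refers to when the data are a sample.
n = 50
sample_slopes = np.array([
    ols_slope(x[idx], y[idx])
    for idx in (rng.choice(N, size=n, replace=False) for _ in range(1000))
])

print(f"population slope: {population_slope:.4f}")
print(f"sample slopes: mean {sample_slopes.mean():.4f}, "
      f"sd {sample_slopes.std():.4f}")
```

Rerunning the population calculation always returns the same slope, whereas the spread of `sample_slopes` shrinks toward zero as `n` approaches `N`, which is the sense in which sampling error disappears once every unit is observed.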