# What is a population

#### noetsi

##### No cake for spunky
I have long said that I have a population. We are only interested in our customers, and we have all the data for those customers. But you could argue that we have a sample from the population of future customers (who might differ from our existing ones). I am not sure about this one way or the other and am curious what others think. It matters for questions such as whether heteroscedasticity is a problem and whether we should pay attention to p-values.

#### hlsmith

##### Less is more. Stay pure. Stay poor.
If you plan to generalize model estimates to future customers, who may have the same or different attributes as the source superpopulation (since these customers don't exist yet), I would think of this as interpolation or extrapolation, as applicable. Personally, I would use prediction intervals on your estimates if projecting them to future customers and, if the future customers differ enough, weighting methods.
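To make the prediction-interval point concrete, here is a minimal sketch comparing a confidence interval for the mean with a prediction interval for one new observation. Everything in it is an illustrative assumption rather than anything from the original data: the `spend` values are simulated, `mean_intervals` is a hypothetical helper, and it uses a large-sample normal approximation instead of a t quantile.

```python
import math
import random
import statistics


def mean_intervals(values, level=0.95):
    """Return the sample mean, a CI for the mean, and a PI for one new value.

    Assumes roughly normal data and a sample large enough that the normal
    quantile is an acceptable stand-in for the t quantile.
    """
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)

    # Uncertainty about the mean itself shrinks as n grows.
    ci_half = z * sd / math.sqrt(n)
    # Uncertainty about a single future value also includes its own spread.
    pi_half = z * sd * math.sqrt(1 + 1 / n)

    return {
        "mean": mean,
        "ci": (mean - ci_half, mean + ci_half),
        "pi": (mean - pi_half, mean + pi_half),
    }


# Simulated "existing customers" (hypothetical numbers, for illustration only).
random.seed(0)
spend = [random.gauss(100, 20) for _ in range(200)]

result = mean_intervals(spend)
print(result)
```

The prediction interval is always the wider of the two, because it has to cover the variability of an individual future customer and not just the uncertainty in the estimated mean. It still assumes future customers come from the same distribution as the existing ones, which is exactly where weighting would come in if they do not.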

@joeb33050 - I would not necessarily agree with your pharmaceutical example. Clinical trials test the efficacy of drug compounds in subjects who are typically healthier than those in the general population with the disease state or ailment. Trial subjects not only have to meet inclusion criteria (e.g., English speaking, passing a mini mental exam, etc.), but beyond this they can be "healthier and wealthier," given that they likely live in a university town, are educated enough to locate a trial, understand the study materials, and sign up, have the time to make all of the associated visits and complete follow-up documents, and have a stable residence or contact info. These trials, as mentioned, establish the drug's safety in ideal patients and also serve to titrate the dose amounts. Once trials are over and drugs are approved, the common folks with comorbidities and polypharmacy may get prescribed the drug. So trial participants can and do differ from all humans.

#### Karabiner

##### TS Contributor
Maybe thinking about populations as a collection of objects is too physical.
Maybe we should think of variables. We collect some observations on those
variables and try to infer something about them, e.g., whether the mean of
a variable is higher in one group than in another. So we have to think about
what kind of observations and how many of them we should collect, and
how the collection process limits the generalizability of our statements.

With kind regards

Karabiner