# Thread: Using weights in regression

1. ## Using weights in regression

I don't really understand this argument. I thought you always had to use weights in regression when you were not randomly sampling, but the text below argues otherwise. I don't know how you would ever know whether the unweighted error term was correlated with the weights.

Against using weights
In regression models weighting the data will only change the regression coefficients if the unweighted model error terms correlate with the weights. But if this is the case then the model must be mis-specified. Proper specification of the model will eliminate the difference between the unweighted and weighted regression coefficients. And since the unweighted estimates usually have the smallest standard errors these should be used in preference to the weighted estimates.
For using them
(i) Researchers use regression methods for many purposes and it is not always appropriate to use fully specified models. For instance a researcher may wish to develop a prediction model using just a small number of linear terms. In these instances the weighted and unweighted regression coefficients may be very different. In this case it is simpler and safer to assume that the weighted survey model is a better fit to a population model than the unweighted survey model is.
(ii) The arguments about fully specified models may not apply for non-linear models such as logistic regression.
Of course, IMHO all models are misspecified, since no model will ever exactly match reality.
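
To make the two quoted arguments concrete, here is a minimal simulation sketch in plain Python. Everything in it is an assumption for illustration: the population model (y depends on x squared), the inclusion probabilities, and the helper name `wls` are all made up. It fits a misspecified linear model and a correctly specified one, each with and without design weights.

```python
import random

def wls(x, y, w):
    """Weighted least squares for y = a + b*x. Returns (a, b).
    With all weights equal to 1 this is ordinary least squares."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    b = sxy / sxx
    return my - b * mx, b

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population: y depends on x squared, x uniform on [0, 2].
xs, ys, ws = [], [], []
for _ in range(20000):
    x = random.uniform(0.0, 2.0)
    y = 1.0 + 3.0 * x ** 2 + random.gauss(0.0, 1.0)
    p = 0.1 + 0.2 * x          # inclusion probability: high x is oversampled
    if random.random() < p:    # unit is drawn into the sample
        xs.append(x)
        ys.append(y)
        ws.append(1.0 / p)     # design weight = 1 / inclusion probability

ones = [1.0] * len(xs)
x_sq = [x ** 2 for x in xs]

# Misspecified model: y ~ x (ignores the curvature).
_, slope_lin_unw = wls(xs, ys, ones)
_, slope_lin_w = wls(xs, ys, ws)

# Correctly specified model: y ~ x^2.
_, slope_sq_unw = wls(x_sq, ys, ones)
_, slope_sq_w = wls(x_sq, ys, ws)

print("y ~ x   unweighted vs weighted slope:", slope_lin_unw, slope_lin_w)
print("y ~ x^2 unweighted vs weighted slope:", slope_sq_unw, slope_sq_w)
```

With a setup like this I would expect the two slopes for `y ~ x` to differ noticeably, while the two slopes for `y ~ x^2` both land close to the true value of 3. That is the pattern the "against" argument describes; the "for" argument is about what to do when you deliberately keep the simpler `y ~ x` model.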

2. ## Re: Using weights in regression

Except in physics: those folks' models are fairly unbiased because they can control many factors.

It sounds like the first quote may be saying the error terms are correlated with weights, but weights aren't used, so the unweighted model is confounded because it is not controlling for this link. I could be wrong on this.

I don't know much about surveying, but you want random samples to make results generalizable. Sometimes sampling strategies are used to ensure a representative sample. So if you neither randomly sample (e.g., you use a convenience sample) nor weight, the results only generalize to a population that resembles the convenience sample. In other words, you have selection bias.

So you have survey data that is not randomly sampled?

4. ## Re: Using weights in regression

How would you know if the error terms were correlated with the weights?
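
One rough way to check this, as far as I can tell: weighted and unweighted least squares give the same coefficients exactly when the unweighted residuals e satisfy Σ w·e = 0 and Σ w·x·e = 0. So you can fit the unweighted model and look at how strongly its residuals correlate with the weights and with weight times predictor. The sketch below does that in plain Python on simulated data; the data-generating model, the inclusion probabilities, and the helper names (`ols`, `corr`, `weight_check`) are all hypothetical.

```python
import random

def ols(x, y):
    """Ordinary least squares for y = a + b*x. Returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    return my - b * mx, b

def corr(u, v):
    """Pearson correlation between two equal-length lists."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / (suu * svv) ** 0.5

def weight_check(x, y, w):
    """Correlate unweighted residuals with w and with w*x."""
    a, b = ols(x, y)
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    wx = [wi * xi for wi, xi in zip(w, x)]
    return corr(resid, w), corr(resid, wx)

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical sample: true model is quadratic, high x is oversampled.
xs, ys, ws = [], [], []
for _ in range(20000):
    x = random.uniform(0.0, 2.0)
    y = 1.0 + 3.0 * x ** 2 + random.gauss(0.0, 1.0)
    p = 0.1 + 0.2 * x          # inclusion probability
    if random.random() < p:
        xs.append(x)
        ys.append(y)
        ws.append(1.0 / p)     # design weight

x_sq = [x ** 2 for x in xs]

check_lin = weight_check(xs, ys, ws)    # misspecified: y ~ x
check_sq = weight_check(x_sq, ys, ws)   # correctly specified: y ~ x^2

print("y ~ x   corr(resid, w), corr(resid, w*x):", check_lin)
print("y ~ x^2 corr(resid, w), corr(resid, w*x):", check_sq)
```

Correlations clearly away from zero suggest the weights will move the coefficients, which per the quoted argument points to a misspecified model. This is only an eyeball check. A more formal version of the same idea adds the weights and weight-by-predictor interactions as extra regressors and tests whether their coefficients are zero.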

From what I have read or seen, most government surveys use stratification, cluster sampling, or the like rather than simple random sampling. An obvious example is the census in the US. It is impossible, or so it is said, to use simple random sampling. Usually it's because you don't have a sampling frame to sample from; sometimes it would cost too much to generate one.

Our satisfaction survey uses a stratified sample; I was not consulted on its design. They probably did this because certain disability groups are known to respond less than others, so you need to oversample them to get enough responses to analyze.
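
For what it's worth, the weights for a design like that are usually just the inverse of each stratum's sampling fraction. A minimal sketch, with made-up stratum names and counts (none of these numbers come from our survey):

```python
# Hypothetical strata: population size and number of completed responses.
strata = {
    "group_a": {"population": 5000, "respondents": 250},
    "group_b": {"population": 500, "respondents": 100},  # oversampled group
}

# Each respondent "stands for" population / respondents people in the stratum.
weights = {
    name: s["population"] / s["respondents"] for name, s in strata.items()
}

# Sanity check: the weights summed over respondents recover the population size.
weighted_total = sum(weights[name] * s["respondents"] for name, s in strata.items())

print(weights)
print(weighted_total)
```

The oversampled group ends up with the smaller weight, so it does not dominate population-level estimates even though it is overrepresented in the sample.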
