1. ## Theoretical Regression Problems

What is the term for the situation where your outcome variable is truncated, and that truncation may be dampening your results?
For example, you use a dichotomous outcome in your regression analysis instead of a continuous one.

2. ## Re: Theoretical Regression Problems

I think you might be thinking of restriction of range. (Though models for truncated DVs are also a thing).

3. ## Re: Theoretical Regression Problems

I have never heard a term for this, but if the variation in the DV [or for that matter the IV] is extremely limited it will impact the slopes [they will be lower than they actually should be]. It has to be pretty extreme for this to matter.

Sometimes when they speak of turning a natural interval variable into a categorical variable, normally frowned on, they talk about "loss of information."

4. ## Re: Theoretical Regression Problems

Originally Posted by noetsi
I have never heard a term for this, but if the variation in the DV [or for that matter the IV] is extremely limited it will impact the slopes [they will be lower than they actually should be]. It has to be pretty extreme for this to matter.
Yeah, I think it is restriction of range you're thinking of (though I believe it biases the correlations/standardised slopes, not the unstandardised ones).

5. ## Re: Theoretical Regression Problems

On page 61 of "Using Multivariate Statistics" by Tabachnick and Fidell, it says in part:

"Sample correlations may be lower than population correlations when there is restricted range in sampling of cases or very uneven splits in the categories of dichotomous variables... A falsely small correlation between two continuous variables is obtained if the range of responses to one or both of the variables is restricted in the sample." On the next page they continue: "The correlation between a continuous variable and a dichotomous variable, or between two dichotomous variables (unless they have the same peculiar splits) is also too low if most (say over 90%) responses to the dichotomous variable fall into one category."

This does not seem to apply to standardized slopes which you would normally not use for dichotomous variables anyway. The passage is in a chapter on data cleanup, not on regression per se, although logically it applies to regression too.

6. ## Re: Theoretical Regression Problems

Originally Posted by noetsi
This does not seem to apply to standardized slopes which you would normally not use for dichotomous variables anyway.
Yep, though a Pearson's correlation is itself a standardised slope (it's the standardised slope from a simple linear regression).

7. ## Re: Theoretical Regression Problems

I never realized that was the case. It's often stated that you use Pearson for interval variables, Spearman for ordinal, and polychoric for binary data. In practice that is way too simple. For example, Pearson assumes a linear relationship, and two variables having a curvilinear relationship won't fit this well [although I don't know if Spearman or polychoric will either].

8. ## Re: Theoretical Regression Problems

Originally Posted by noetsi
I never realized that was the case. It's often stated that you use Pearson for interval variables, Spearman for ordinal, and polychoric for binary data. In practice that is way too simple. For example, Pearson assumes a linear relationship, and two variables having a curvilinear relationship won't fit this well [although I don't know if Spearman or polychoric will either].
Yeah you're right, it's more complicated than that rule suggests.

But yeah, the "Pearson's correlation = standardised slope" thing is a nice property. It shows a bit more clearly what the magnitude of Pearson's correlation actually tells you (i.e., for a one standard deviation increase in one variable, the expected change in the other, in standard deviations, is r).

10. ## Re: Theoretical Regression Problems

So what correlation do you use for non-linear relationships? I have long wondered.

I think as long as the p-value is high enough you can use Pearson.

11. ## Re: Theoretical Regression Problems

Spearman's rho is good for monotonic but non-linear relationships (although it's really describing just the strength of the relationship and not exactly its form).

For relationships that aren't monotonic you'd need a more complex model (e.g., quadratic regression, piecewise regression, spline models, loess, etc.). That in turn means you won't really be able to summarise the model in the form of a single number in the way you can with a correlation (though I suppose you might still report the R^2 as a summary of the strength of the relationship in some cases).

13. ## Re: Theoretical Regression Problems

Essentially, if you want to model non-linear relationships, you do a regression with quadratic, cubic, etc. terms.

14. ## Re: Theoretical Regression Problems

A couple of comments. I have also heard it referred to as a "loss of information". If you are referring to turning a continuous variable into a binary or categorical one, you can use the term dichotomized or discretized, as applicable.

Is this standardized correlation also why the R^2 can be interpreted on the percentage scale?
