Thread: Theoretical Regression Problems

  1. #1
    Points: 4, Level: 1
    Level completed: 7%, Points required for next Level: 46

    Posts
    1
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Theoretical Regression Problems




    What is the term for when your outcome variable is truncated, and that may be dampening your results?
    For example, when you use a dichotomous outcome in your regression analysis instead of a continuous one.

  2. #2
    TS Contributor
    Points: 18,889, Level: 87
    Level completed: 8%, Points required for next Level: 461
    CowboyBear's Avatar
    Location
    New Zealand
    Posts
    2,062
    Thanks
    121
    Thanked 427 Times in 328 Posts

    Re: Theoretical Regression Problems

    I think you might be thinking of restriction of range. (Though models for truncated DVs are also a thing).
    Matt aka CB | twitter.com/matthewmatix

  3. #3
    Fortran must die
    Points: 58,790, Level: 100
    Level completed: 0%, Points required for next Level: 0
    noetsi's Avatar
    Posts
    6,532
    Thanks
    692
    Thanked 915 Times in 874 Posts

    Re: Theoretical Regression Problems

    I have never heard a term for this, but if the variation in the DV [or for that matter the IV] is extremely limited it will impact the slopes [they will be lower than they actually should be]. It has to be pretty extreme for this to matter.

    Sometimes when they speak of turning a natural interval variable into a categorical variable, normally frowned on, they talk about "loss of information."
    "Very few theories have been abandoned because they were found to be invalid on the basis of empirical evidence...." Spanos, 1995

  4. #4
    CowboyBear

    Quote Originally Posted by noetsi View Post
    I have never heard a term for this, but if the variation in the DV [or for that matter the IV] is extremely limited it will impact the slopes [they will be lower than they actually should be]. It has to be pretty extreme for this to matter.
    Yeah, I think it is restriction of range you're thinking of (though I believe it biases the correlations/standardised slopes, not the unstandardised ones).
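
    To make the restriction-of-range point concrete, here is a minimal sketch on simulated data. Everything in it is an assumption made for illustration (the true slope of 0.5, the sample size, and restricting on the predictor by keeping only cases with x > 0); none of it comes from real data. Under those assumptions the unstandardised slope should stay close to 0.5 while Pearson's r shrinks.

```python
import random
import statistics as st

def slope_and_r(x, y):
    """Return (unstandardised OLS slope of y on x, Pearson r)."""
    mx, my = st.fmean(x), st.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sxx, sxy / (sxx * syy) ** 0.5

# Simulated population: y = 0.5 * x + noise (all values are made up)
rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(20000)]
y = [0.5 * a + rng.gauss(0, 1) for a in x]

full_slope, full_r = slope_and_r(x, y)

# Restrict the range of the predictor: keep only cases with x > 0
kept = [(a, b) for a, b in zip(x, y) if a > 0]
xr, yr = zip(*kept)
restr_slope, restr_r = slope_and_r(xr, yr)

print(f"full sample:      slope = {full_slope:.3f}, r = {full_r:.3f}")
print(f"restricted range: slope = {restr_slope:.3f}, r = {restr_r:.3f}")
```

    This only covers selection on the predictor. Selecting cases on the outcome instead can bias the unstandardised slope as well.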

  5. #5
    noetsi

    On page 61 of "Using Multivariate Statistics" by Tabachnick and Fidell it says, in part:

    "Sample correlations may be lower than population correlations when there is restricted range in sampling of cases or very uneven splits in the categories of dichotomous variables... A falsely small correlation between two continuous variables is obtained if the range of responses to one or both of the variables is restricted in the sample." On the next page they continue: "The correlation between a continuous variable and a dichotomous variable, or between two dichotomous variables (unless they have the same peculiar splits), is also too low if most (say over 90%) responses to the dichotomous variable fall into one category."

    This does not seem to apply to standardized slopes, which you would normally not use for dichotomous variables anyway. It is in a chapter on data clean-up, not on regression per se, although logically it applies to that too.

  6. #6
    CowboyBear

    Quote Originally Posted by noetsi View Post
    This does not seem to apply to standardized slopes which you would normally not use for dichotomous variables anyway.
    Yep, though Pearson's correlation is itself a standardised slope (it's the standardised slope from a simple linear regression).

  7. #7
    noetsi

    I never realized that was the case. It's often stated that you use Pearson for interval variables, Spearman for ordinal, and polychoric for binary data. In practice that is way too simple. For example, Pearson assumes a linear relationship, and two variables with a curvilinear relationship won't fit this well [although I don't know if Spearman or polychoric will either].

  8. #8
    CowboyBear

    Quote Originally Posted by noetsi View Post
    I never realized that was the case. It's often stated that you use Pearson for interval variables, Spearman for ordinal, and polychoric for binary data. In practice that is way too simple. For example, Pearson assumes a linear relationship, and two variables with a curvilinear relationship won't fit this well [although I don't know if Spearman or polychoric will either].
    Yeah you're right, it's more complicated than that rule suggests.

    But yeah, the Pearson's correlation = standardised slope thing is a nice property. It shows a bit more clearly what the magnitude of Pearson's correlation actually tells you (i.e., for a one standard deviation increase in one variable, the expected change in the other, in standard deviation units, is r).
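
    A quick way to check this property is to z-score both variables and fit the simple regression. The sketch below does that on simulated data; the data-generating values and the helper names are made up for illustration.

```python
import random
import statistics as st

def ols_slope(x, y):
    """Slope from a simple linear regression of y on x."""
    mx, my = st.fmean(x), st.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def pearson_r(x, y):
    """Pearson correlation computed from sums of squares and cross-products."""
    mx, my = st.fmean(x), st.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def zscores(v):
    """Standardise to mean 0 and (sample) standard deviation 1."""
    m, s = st.fmean(v), st.stdev(v)
    return [(a - m) / s for a in v]

# Simulated data on arbitrary raw scales (values are made up)
rng = random.Random(2)
x = [rng.gauss(50, 10) for _ in range(500)]
y = [3.0 + 0.2 * a + rng.gauss(0, 4) for a in x]

r = pearson_r(x, y)
raw_slope = ols_slope(x, y)                    # in raw units of y per unit of x
std_slope = ols_slope(zscores(x), zscores(y))  # in SD units of y per SD of x

print(f"Pearson r          = {r:.6f}")
print(f"standardised slope = {std_slope:.6f}")
print(f"raw slope * sx/sy  = {raw_slope * st.stdev(x) / st.stdev(y):.6f}")
```

    All three printed values should agree up to floating-point error, since the standardised slope is just the raw slope rescaled by sx/sy.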

  10. #9
    noetsi

    So what correlation do you use for non-linear relationships? I have long wondered.

    I think as long as the p value is high enough you can use Pearson.

  11. #10
    CowboyBear

    Spearman's rho is good for monotonic but non-linear relationships (although it's really describing just the strength of the relationship and not exactly its form).
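
    As a small illustration, Spearman's rho is just Pearson's r computed on the ranks. The sketch below uses a made-up monotonic but strongly curved relationship (y = exp(x) on an evenly spaced grid, chosen purely for illustration) and hand-rolled helpers, so it needs nothing beyond the standard library.

```python
import math
import statistics as st

def pearson_r(x, y):
    """Pearson correlation computed from sums of squares and cross-products."""
    mx, my = st.fmean(x), st.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def ranks(v):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    out = [0.0] * len(v)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the ranks."""
    return pearson_r(ranks(x), ranks(y))

# Monotonic but clearly non-linear relationship (illustrative only)
x = [i / 10 for i in range(51)]   # 0.0, 0.1, ..., 5.0
y = [math.exp(a) for a in x]

r = pearson_r(x, y)
rho = spearman_rho(x, y)
print(f"Pearson r    = {r:.3f}")
print(f"Spearman rho = {rho:.3f}")
```

    Because the ordering of x and y is identical, rho comes out at 1, while Pearson's r is pulled below 1 by the curvature.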

    For relationships that aren't monotonic you'd need a more complex model (e.g., quadratic regression, piecewise regression, spline models, loess, etc.). That in turn means you won't really be able to summarise the model in the form of a single number in the way you can with a correlation (though I suppose you might still report the R^2 as a summary of the strength of the relationship in some cases).
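
    Here is a rough sketch of the quadratic-regression option on simulated U-shaped data. The data-generating model (y = x^2 plus noise), the noise level and the small hand-written solver are all assumptions for illustration; in practice you would use your stats package's regression routine.

```python
import random
import statistics as st

def solve(a, b):
    """Solve the small linear system a * coef = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    coef = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * coef[c] for c in range(r + 1, n))
        coef[r] = (m[r][n] - s) / m[r][r]
    return coef

def pearson_r(x, y):
    """Pearson correlation computed from sums of squares and cross-products."""
    mx, my = st.fmean(x), st.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Non-monotonic (U-shaped) relationship; all values are made up
rng = random.Random(4)
x = [i / 10 for i in range(-30, 31)]
y = [a * a + rng.gauss(0, 0.5) for a in x]

# Least squares for y = b0 + b1*x + b2*x^2 via the normal equations
cols = [[1.0] * len(x), x, [a * a for a in x]]
xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
xty = [sum(p * q for p, q in zip(ci, y)) for ci in cols]
b0, b1, b2 = solve(xtx, xty)

fitted = [b0 + b1 * a + b2 * a * a for a in x]
my = st.fmean(y)
ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
ss_tot = sum((yi - my) ** 2 for yi in y)
r2 = 1 - ss_res / ss_tot

r = pearson_r(x, y)
print(f"Pearson r         = {r:.3f}")
print(f"quadratic fit R^2 = {r2:.3f}  (b2 = {b2:.3f})")
```

    In this setup Pearson's r lands near zero even though y depends strongly on x, while the R^2 from the quadratic fit picks up the relationship.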

  13. #11
    noetsi

    Essentially, if you want to model non-linear relationships, you do regression with a quadratic, cubic, etc. term.

  14. #12
    Omega Contributor
    Points: 38,334, Level: 100
    Level completed: 0%, Points required for next Level: 0
    hlsmith's Avatar
    Location
    Not Ames, IA
    Posts
    6,998
    Thanks
    398
    Thanked 1,186 Times in 1,147 Posts

    Re: Theoretical Regression Problems


    A couple of comments: I have also heard it referred to as a "loss of information". If you are referring to turning a continuous variable into a binary or categorical one, you can use the term dichotomized or discretized, as applicable.
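
    To give a feel for how much information dichotomizing can throw away, here is a minimal simulation sketch. The data-generating model and both cut-points (a median split and a roughly 90/10 split) are made up for illustration only.

```python
import random
import statistics as st

def pearson_r(x, y):
    """Pearson correlation computed from sums of squares and cross-products."""
    mx, my = st.fmean(x), st.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Simulated continuous predictor and outcome (values are made up)
rng = random.Random(3)
n = 20000
x = [rng.gauss(0, 1) for _ in range(n)]
y = [0.5 * a + rng.gauss(0, 1) for a in x]

median_cut = st.median(y)
top10_cut = sorted(y)[int(0.9 * n)]   # roughly the 90th percentile of y

# Dichotomize the outcome two ways: an even split and a very uneven split
y_median = [1 if b > median_cut else 0 for b in y]
y_uneven = [1 if b > top10_cut else 0 for b in y]

r_cont = pearson_r(x, y)
r_median = pearson_r(x, y_median)
r_uneven = pearson_r(x, y_uneven)

print(f"continuous outcome:  r = {r_cont:.3f}")
print(f"median split:        r = {r_median:.3f}")
print(f"90/10 split:         r = {r_uneven:.3f}")
```

    In this setup the correlation drops after the median split and drops further with the uneven split, which lines up with the Tabachnick and Fidell passage quoted earlier in the thread about very uneven splits.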


    Is this standardized correlation also why the R^2 can be interpreted on the percentage scale?
    Stop cowardice, ban guns!
