
Thread: Query: Using independent variables in regression.

  1. #1

    When I add an independent variable to a regression model, for example "income" with health status as the outcome, I can treat income as either categorical or continuous. How do I decide? Income is not my variable (factor) of interest; I am looking at the association between sleep duration and health status. Thank you for the help.

  2. #2

    You should use income as a continuous / numerical variable for two reasons: 1.) If each income value in your data frame is treated as a level of a factor, you end up with a huge number of levels, and this costs a lot of degrees of freedom (that is, the ratio of data points to regression parameters becomes really bad), and 2.) you would not use all the information available in this variable, such as order and proportions. So if you want to use income as a covariate (and that is what I understood), use it as a numerical predictor. Centring this predictor can also help, mainly by making the intercept easier to interpret.
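
To make the degrees-of-freedom point concrete, here is a minimal sketch in Python. The income values are simulated (an assumption, not data from this thread), and the variable names are made up for illustration. It counts how many coefficients each coding of income would need and shows what centring does.

```python
import random

random.seed(0)

# Hypothetical data: 200 respondents with annual income in whole dollars.
incomes = [round(random.lognormvariate(10.8, 0.6)) for _ in range(200)]

# Treating every distinct income value as a factor level needs one dummy
# coefficient per level, minus one reference level.
n_levels = len(set(incomes))
params_as_factor = n_levels - 1

# Treating income as numeric needs a single slope.
params_as_numeric = 1

print(f"observations:            {len(incomes)}")
print(f"coefficients as factor:  {params_as_factor}")
print(f"coefficients as numeric: {params_as_numeric}")

# Centring: subtract the mean so the intercept refers to a person with
# average income rather than to an income of zero.
mean_income = sum(incomes) / len(incomes)
centred = [x - mean_income for x in incomes]
print(f"mean of centred income:  {sum(centred) / len(centred):.6f}")
```

With nearly every respondent reporting a different income, the factor coding uses up almost as many coefficients as there are observations, while the numeric coding uses one.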

  3. The Following User Says Thank You to mmercker For This Useful Post:

    Tilipa (12-29-2015)

  4. #3

    Is there an argument to be made that treating income as a categorical variable allows for non-linear effects on the dependent variable? I understand that you can do this with a continuous variable by adding a squared term, but maybe that assumes a certain functional form. What if the effect of income on health varies by the level of income? For example, what if increases in income have large effects on health for those with low income, small effects on health for those with high income, and no effects on health for those with income in between? (I'm not saying this is plausible, just asking the best way to account for this in a model.)

    My inclination is to plot the data to get a sense of what the relationship looks like between income and health, and then choose a transformation of income that is consistent with that picture. Income is trickier than some variables because it tends to be highly skewed.
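
One way to "plot" without a graphics library is to print mean health per income bin. The sketch below uses simulated data with an assumed saturating shape, and `binned_means` is a hypothetical helper written for this example, not a library function.

```python
import math
import random

random.seed(1)

# Hypothetical data: skewed income and a health score that rises quickly at
# low incomes and flattens out (an assumed shape, for illustration only).
n = 500
income = [random.lognormvariate(10.8, 0.7) for _ in range(n)]
health = [10 * math.log(x) + random.gauss(0, 3) for x in income]

def binned_means(x, y, n_bins=5):
    """Sort by x, split into equal-sized bins, return (mean x, mean y) per bin."""
    pairs = sorted(zip(x, y))
    size = len(pairs) // n_bins
    out = []
    for b in range(n_bins):
        start = b * size
        stop = len(pairs) if b == n_bins - 1 else start + size
        chunk = pairs[start:stop]
        out.append((sum(p[0] for p in chunk) / len(chunk),
                    sum(p[1] for p in chunk) / len(chunk)))
    return out

bins = binned_means(income, health)
for mean_x, mean_y in bins:
    print(f"income ~ {mean_x:10.0f}   mean health = {mean_y:6.2f}")

# Slope between neighbouring bins: roughly constant values suggest a straight
# line, shrinking values suggest a concave transform such as log or sqrt.
slopes = [(y2 - y1) / (x2 - x1) for (x1, y1), (x2, y2) in zip(bins, bins[1:])]
print("slope between bins:", [f"{s:.2e}" for s in slopes])
```

Equal-sized bins (quantile bins) are used here because income is skewed; equal-width bins would leave the top bins nearly empty.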

  5. The Following User Says Thank You to ErikB For This Useful Post:

    Tilipa (12-29-2015)

  6. #4

    Originally posted by ErikB:
    Is there an argument to be made that treating income as a categorical variable allows for non-linear effects on the dependent variable?
    Non-linearity implies that there is an order in the predictor variable, which is not the case for a categorical variable. You could form income classes, with each class represented by a factor level, and you could then treat this factor as an ordinal variable...

    But the most natural way to model a nonlinear relationship would be to treat income X as a continuous predictor and then test different nonlinear terms (the classical way would be to test some polynomials and anything else that would make sense). In the case of income, some kind of saturation dependency would make sense. The most appropriate dependency can be selected, e.g., by the significance of the corresponding predictors or by the AIC value of the model.
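
The AIC comparison described above can be sketched as follows. This is only an illustration under stated assumptions: the data are simulated with a saturating (logarithmic) income effect, income is measured in units of 10,000 to keep the squared term well scaled, and `solve` and `ols_aic` are hypothetical helpers written for this example. In practice you would more likely use a statistics package.

```python
import math
import random

random.seed(2)

# Hypothetical data with a saturating income effect (an assumption made only
# so that the comparison has something to find). Income is in units of 10,000.
n = 300
income = [random.lognormvariate(1.5, 0.6) for _ in range(n)]
health = [50 + 8 * math.log(x) + random.gauss(0, 2) for x in income]

def solve(a, b):
    """Solve the linear system a @ coef = b by Gaussian elimination."""
    k = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, k):
            f = m[r][col] / m[col][col]
            for c in range(col, k + 1):
                m[r][c] -= f * m[col][c]
    coef = [0.0] * k
    for r in reversed(range(k)):
        s = sum(m[r][c] * coef[c] for c in range(r + 1, k))
        coef[r] = (m[r][k] - s) / m[r][r]
    return coef

def ols_aic(columns, y):
    """Fit y on an intercept plus the given columns; return the Gaussian AIC."""
    rows = [[1.0] + list(vals) for vals in zip(*columns)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    coef = solve(xtx, xty)
    rss = sum((yi - sum(c * v for c, v in zip(coef, r))) ** 2
              for r, yi in zip(rows, y))
    # AIC up to an additive constant; the +1 counts the error variance.
    return len(y) * math.log(rss / len(y)) + 2 * (k + 1)

candidates = {
    "linear":    [income],
    "quadratic": [income, [x ** 2 for x in income]],
    "log":       [[math.log(x) for x in income]],
}
aic = {name: ols_aic(cols, health) for name, cols in candidates.items()}
for name, value in sorted(aic.items(), key=lambda kv: kv[1]):
    print(f"{name:10s} AIC = {value:8.2f}")
```

Lower AIC is better. Because the data here were generated from a log relationship, the log model should come out ahead of the straight line; with real data the ranking is an empirical question.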

    Originally posted by ErikB:
    My inclination is to plot the data to get a sense of what the relationship looks like between income and health, and then choose a transformation of income that is consistent with that picture
    This is a good idea: look at a scatterplot first, and it will give you a guide to how this dependency should be modelled.

  7. The Following User Says Thank You to mmercker For This Useful Post:

    Tilipa (12-29-2015)

  8. #5

    Originally posted by ErikB:
    Is there an argument to be made that treating income as a categorical variable allows for non-linear effects on the dependent variable?(...) Income is trickier than some variables because it tends to be highly skewed.
    Income IMHO belongs to the variables for which a log-transformation
    should routinely be considered. It is not useful in every study population,
    of course, but in many. E.g. diminishing marginal utility of income is
    a well-known phenomenon.
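
A small numeric sketch of what the log transformation implies. The coefficients `a` and `b` are made up for illustration and do not come from any fitted model. Under health = a + b * ln(income), each doubling of income adds the same amount, so a fixed raise matters less the higher the income.

```python
import math

# Hypothetical coefficients for health = a + b * ln(income); the numbers are
# made up for illustration and do not come from any fitted model.
a, b = 20.0, 5.0

def predicted_health(income):
    return a + b * math.log(income)

# Doubling income adds b * ln(2) to the prediction at every income level.
gain_low = predicted_health(40_000) - predicted_health(20_000)
gain_high = predicted_health(200_000) - predicted_health(100_000)
print(f"gain from 20k to 40k:   {gain_low:.3f}")
print(f"gain from 100k to 200k: {gain_high:.3f}")

# The same absolute raise of 20,000 is worth far less at a high income.
raise_low = predicted_health(40_000) - predicted_health(20_000)
raise_high = predicted_health(120_000) - predicted_health(100_000)
print(f"+20k at 20k:  {raise_low:.3f}")
print(f"+20k at 100k: {raise_high:.3f}")
```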

    With kind regards

    K.

  9. The Following User Says Thank You to Karabiner For This Useful Post:

    Tilipa (12-29-2015)
