
Thread: Discretization of a continuous variable

  1. #1

    Hi guys,

    I'm analyzing a study of patients and their risk factors for developing a certain side effect.
    If I enter age as a continuous variable into a forward stepwise logistic regression, sex and the patient's diagnosis have significant ORs, but age is not included.
    However, suppose I believe children will have more side effects, so I discretize age into a nominal variable called kids (<16 vs. >16 years). When I enter this variable instead of age into the stepwise logistic regression, it is significant and has an amazing OR.

    How can this be? I could have chosen any arbitrary age cutoff and tried it out...
    What is the real result?

    A detailed explanation of this difference would be much appreciated.
    Amir.

  2. #2
    hlsmith

    You need to plot the relationship. Perhaps it is not linear, or not even monotonic.

    I typically use the trade-off between sensitivity and specificity to determine the best cutoff for a continuous variable, but first you would want to understand the relationship between the variables.

    You can also look at the OR for an increase of more than 1 unit in age. It should not affect the significance, but I believe it affects the size of the OR.
    Stop cowardice, ban guns!
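
One low-tech way to look at the relationship before fitting anything is to tabulate the observed event rate in age bands. The sketch below is only illustrative: the function name `event_rate_by_bin`, the 10-year band width, and the toy data are assumptions and do not come from the thread.

```python
from collections import defaultdict

def event_rate_by_bin(ages, events, width=10):
    """Return {bin_start: (n_patients, event_rate)} for age bands of `width` years."""
    counts = defaultdict(lambda: [0, 0])  # bin_start -> [n_patients, n_events]
    for age, event in zip(ages, events):
        start = int(age // width) * width
        counts[start][0] += 1
        counts[start][1] += event
    return {start: (n, k / n) for start, (n, k) in sorted(counts.items())}

# Made-up data: 1 = side effect occurred, 0 = it did not.
ages = [3, 8, 12, 20, 25, 40, 41]
events = [1, 1, 0, 0, 0, 1, 0]

rates = event_rate_by_bin(ages, events)
for start, (n, rate) in rates.items():
    print(f"age band starting at {start}: n={n}, event rate={rate:.2f}")
```

If the rates jump or bend across bands instead of changing steadily, a single linear age term is probably a poor description of the data.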

  3. The Following User Says Thank You to hlsmith For This Useful Post:

    ted00 (03-27-2015)

  4. #3

    So you're suggesting an ROC curve? Well, I usually do that in cases where the continuous variable is indeed significant in the regression...
    I didn't understand the remark about the OR for more than a 1-unit increase. SPSS didn't give an OR for age, since it's not significant and didn't enter the analysis in the stepwise method.

  6. #5
    noetsi

    I don't know why this is occurring, and I am not a big fan of stepwise, but it is almost always a bad idea to convert a continuous variable into a categorical one. You lose information in the process.

    I think what hlsmith is suggesting is that rather than looking at the OR for a 1-year increase in age, you look at the OR for a five- or ten-year increase in age, or whatever your specific units are.
    "Very few theories have been abandoned because they were found to be invalid on the basis of empirical evidence...." Spanos, 1995

  7. #6

    Quote Originally Posted by noetsi
    I don't know why this is occurring, and I am not a big fan of stepwise, but it is almost always a bad idea to convert a continuous variable into a categorical one. You lose information in the process.

    I think what hlsmith is suggesting is that rather than looking at the OR for a 1-year increase in age, you look at the OR for a five- or ten-year increase in age, or whatever your specific units are.
    How do I do it in SPSS?

  8. #7
    hlsmith

    I'm not an SPSS user, but if the option is not apparent, one method might be to round all values (e.g., to the nearest 5-year increment).

    I was just trying to say, ignoring your stepwise approach, that you could use the trade-off between sensitivity (SEN) and specificity (SPEC) to find a cutoff if you go that route, since you seem to be asking how to find the best way to discretize your data. Yes, the accuracy value from the ROC curve could be used as well.
    Stop cowardice, ban guns!
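
The sensitivity/specificity trade-off described in this post is often summarized with Youden's J statistic (sensitivity + specificity - 1). Here is a minimal sketch, assuming the rule "age below the cutoff predicts the side effect" (the original poster's hypothesis); the name `youden_cutoff` and the data are hypothetical. Keep in mind that a cutoff found by searching the same data tends to look better than it really is, which is the worry raised in post #1.

```python
def youden_cutoff(values, outcomes):
    """Return (cutoff, J) maximizing sensitivity + specificity - 1.

    Rule assumed here: value < cutoff predicts the event.
    Requires at least one event and one non-event.
    """
    n_pos = sum(outcomes)
    n_neg = len(outcomes) - n_pos
    best_cutoff, best_j = None, -1.0
    for c in sorted(set(values)):
        # True positives: events correctly flagged by "value < c".
        tp = sum(1 for v, o in zip(values, outcomes) if v < c and o == 1)
        # True negatives: non-events correctly left unflagged.
        tn = sum(1 for v, o in zip(values, outcomes) if v >= c and o == 0)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:
            best_cutoff, best_j = c, j
    return best_cutoff, best_j

# Made-up data: 1 = side effect occurred, 0 = it did not.
ages = [5, 8, 12, 15, 20, 30, 45, 60]
side_effect = [1, 1, 1, 0, 0, 1, 0, 0]

cutoff, j = youden_cutoff(ages, side_effect)
print(f"cutoff={cutoff}, J={j:.2f}")  # cutoff=15, J=0.75
```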

  9. #8
    noetsi

    SAS has a specific way to change the unit, and I am sure SPSS does as well (although I don't work with it).

    If all else fails, you can divide age by, say, 5 or 10 and import that into SPSS. Five years, or ten, would then be a one-unit change.
    "Very few theories have been abandoned because they were found to be invalid on the basis of empirical evidence...." Spanos, 1995

  10. The Following User Says Thank You to noetsi For This Useful Post:

    hlsmith (03-26-2015)

  11. #9
    CowboyBear

    Quote Originally Posted by XPeriment
    If I enter age as a continuous variable into a forward stepwise logistic regression
    Stop right there. Stepwise regression is pretty much always a bad idea (see here, here, and here).

    To quote Andrew Gelman (last link above):
    "Stepwise regression is one of these things, like outlier detection and pie charts, which appear to be popular among non-statisticians but are considered by statisticians to be a bit of a joke. For example, Jennifer and I don’t mention stepwise regression in our book, not even once."

  12. The Following User Says Thank You to CowboyBear For This Useful Post:

    ted00 (03-27-2015)

  13. #10
    spunky

    Quote Originally Posted by CowboyBear
    Stop right there. Stepwise regression is pretty much always a bad idea (see here, here, and here).

    "Jennifer and I don’t mention stepwise regression in our book, not even once."
    Nothing to add here. Just helping to emphasize that stepwise regression is a bad idea.
    for all your psychometric needs! https://psychometroscar.wordpress.com/about/

  14. #11

    I don't think stepwise is the issue here.
    I tried using ENTER instead of stepwise.
    Age:
    B = -0.003, p = 0.443, OR = 0.997 [0.989-1.005]

    This leads to a very different conclusion from the one I get when dividing age into children and adults...
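
To make the earlier suggestion about 5- or 10-year increments concrete: for a linear term, the odds ratio for a k-unit increase is exp(k*B), which equals the per-unit OR raised to the power k, and the confidence limits transform the same way. The sketch below applies that to the figures reported in this post; the helper name `rescale_odds_ratio` is hypothetical, and because the inputs are rounded to three decimals the results are approximate.

```python
import math

def rescale_odds_ratio(b, ci_low, ci_high, k):
    """Convert a per-1-unit logistic result to a per-k-unit odds ratio and CI."""
    return math.exp(k * b), ci_low ** k, ci_high ** k

# Values as reported above for age (per 1 year).
or10, low10, high10 = rescale_odds_ratio(-0.003, 0.989, 1.005, 10)
print(f"OR per 10 years: {or10:.3f} [{low10:.3f}-{high10:.3f}]")
# OR per 10 years: 0.970 [0.895-1.051]
```

The interval still contains 1, as expected: rescaling a predictor by a constant changes the size of the OR but not the Wald test or its p-value.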

  15. #12
    maartenbuis

    The answer was given before, see #2: the effect of age is non-linear. Going from 1 to 2 years is different from going from 14 to 15 years, or 42 to 43 years, or 91 to 92 years. If you add age linearly to your model, you assume that all these 1-year increments have the same effect. Unsurprisingly, that is almost never true.

    You should also expect a large difference in the coefficient/odds ratio because the units of your variables are radically different: age compares people 1 year apart, while kids compares children with adults. You obviously cannot compare these results directly.
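
Both points can be illustrated with a small, fully made-up data set in which the risk is a step function of age: 50% below 16 and 20% from 16 upward (the thread does not say how age 16 itself was coded, so that boundary is an assumption). The sketch fits a one-predictor logistic regression by Newton-Raphson using only the standard library; `fit_logistic` is a hypothetical helper, not the routine SPSS uses.

```python
import math

def fit_logistic(x, y, iters=25):
    """Fit logit(p) = b0 + b1*x by Newton-Raphson.

    Returns (b1, standard error of b1, log-likelihood)."""
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p          # gradient with respect to b0
            g1 += (yi - p) * xi   # gradient with respect to b1
            h00 += w              # information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    # Information matrix from the last pass, taken just before the final
    # (by then negligible) update.
    se_b1 = math.sqrt(h00 / det)
    loglik = 0.0
    for xi, yi in zip(x, y):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
        loglik += math.log(p) if yi else math.log(1.0 - p)
    return b1, se_b1, loglik

# Made-up cohort: 10 patients at each age 1..80;
# 5 of 10 have the side effect below 16, 2 of 10 from 16 upward.
ages, events = [], []
for age in range(1, 81):
    n_events = 5 if age < 16 else 2
    for i in range(10):
        ages.append(age)
        events.append(1 if i < n_events else 0)
kids = [1 if a < 16 else 0 for a in ages]

b_age, se_age, ll_age = fit_logistic(ages, events)
b_kids, se_kids, ll_kids = fit_logistic(kids, events)
or_age, or_kids = math.exp(b_age), math.exp(b_kids)

print(f"age  (per year):      OR={or_age:.3f}, log-lik={ll_age:.1f}")
print(f"kids (child vs adult): OR={or_kids:.3f}, log-lik={ll_kids:.1f}")
```

In this toy data the kids odds ratio is 4 (odds of 1 for children against 0.25 for adults), while the per-year odds ratio for age stays close to 1 because it describes a single one-year step. The step model also reaches a higher log-likelihood, since a straight line on the logit scale cannot reproduce the jump at 16.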

  16. #13
    CowboyBear

    Quote Originally Posted by XPeriment
    I don't think stepwise is the issue here.
    I tried using ENTER instead of stepwise.
    Age:
    B = -0.003, p = 0.443, OR = 0.997 [0.989-1.005]

    This leads to a very different conclusion from the one I get when dividing age into children and adults...
    I'm not saying that the use of stepwise regression is to blame for age having a non-linear effect. I'm just saying that you would be better off selecting your model using a method other than stepwise regression.

  17. #14
    ted00

    Quote Originally Posted by CowboyBear
    See here, here, and here
    Thank you for those, very useful

    I particularly like this quote from the comments on Gelman's blog:
    "There is yet another problem with Stepwise Regression; a big one. It encourages you not to think."
    The mathematical explanation of a statistical procedure is really just pseudo-code, which we can make operational by translating it into real computer code. --B. Klemens

  18. #15
    hlsmith

    Yeah, it is a process. It took me 8 hours to build a model the other day (testing assumptions, interactions, random effects, model fit, parsimony selection). If I had just run an automated process, it would not have known the relationships in those data or their context.
    Stop cowardice, ban guns!
