
Thread: Logistic Regression

  1. #1
    Points: 8, Level: 1
    Level completed: 15%, Points required for next Level: 42

    Posts
    2
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Logistic Regression




    Hello

    I am faced with a problem that looks like a clear case for logistic regression.
    A binary dependent variable needs to be predicted from a set of independent variables. However, almost all of the independent variables are categorical, and most of them have a large number of categories. Coding each category of every variable by hand would be highly laborious. What is the usual approach when an independent variable is categorical and has a very large number of categories?

    Requesting all statistical modelers to suggest an approach...really stuck on this one!

    Best Regards
    Abhijeet

  2. #2
    Omega Contributor
    Points: 38,334, Level: 100
    Level completed: 0%, Points required for next Level: 0
    hlsmith
    Location
    Not Ames, IA
    Posts
    6,998
    Thanks
    398
    Thanked 1,186 Times in 1,147 Posts

    Re: Logistic Regression

    It depends on the program you are using. SAS will automatically create the indicator variables if the variable is entered in the CLASS statement. What is your sample size, and how many observations are in each of the two groups?
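    For readers without such a statement available, here is a minimal sketch of what automatic coding of a categorical predictor amounts to. It assumes plain reference (dummy) coding with the first level in sorted order as the reference; actual software defaults may differ. The function name `dummy_code` and the toy `region` data are hypothetical, not from this thread.

```python
def dummy_code(values, reference=None):
    """Return (kept_levels, rows) of 0/1 indicators, omitting one reference level."""
    levels = sorted(set(values))
    if reference is None:
        reference = levels[0]  # assumption: first sorted level is the reference
    kept = [lv for lv in levels if lv != reference]
    # One indicator per non-reference level; the reference level is all zeros
    rows = [[1 if v == lv else 0 for lv in kept] for v in values]
    return kept, rows


# Hypothetical predictor with three categories
region = ["north", "south", "east", "south", "north"]
names, rows = dummy_code(region)
for value, row in zip(region, rows):
    print(value, dict(zip(names, row)))
```

    A variable with k categories yields k - 1 indicator columns, so a 30-level predictor adds 29 parameters to the model. That is why the number of observations in each outcome group matters.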
    Stop cowardice, ban guns!

  3. #3
    Points: 8, Level: 1
    Level completed: 15%, Points required for next Level: 42

    Posts
    2
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Logistic Regression

    Quote Originally Posted by hlsmith View Post
    It depends on the program you are using. SAS will automatically create the indicator variables if the variable is entered in the CLASS statement. What is your sample size, and how many observations are in each of the two groups?
    The total number of records is more than 32,000. As an example, one of the independent variables has more than 30 categories, and the others have more than 20.

  4. #4
    TS Contributor
    Points: 12,227, Level: 72
    Level completed: 45%, Points required for next Level: 223
    rogojel
    Location
    I work in Europe, live in Hungary
    Posts
    1,470
    Thanks
    160
    Thanked 332 Times in 312 Posts

    Re: Logistic Regression

    Hi,
    maybe you could code them with numbers and treat them as a quasi-continuous variable? With such a high number of distinct categories this could work.
    regards
    rogojel
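
    A minimal sketch of the numeric coding suggested above, assuming the categories have a meaningful order. The helper `ordinal_codes` and the example levels are hypothetical. Note that this forces equal spacing between adjacent levels, which only makes sense for ordered categories.

```python
def ordinal_codes(values, ordered_levels):
    """Map each category to its 1-based position in an explicitly ordered list."""
    position = {level: i + 1 for i, level in enumerate(ordered_levels)}
    return [position[v] for v in values]


# Hypothetical ordered categories, lowest to highest
levels = ["low", "medium", "high", "very high"]
codes = ordinal_codes(["medium", "low", "very high"], levels)
print(codes)
```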

  5. #5
    Fortran must die
    Points: 58,790, Level: 100
    Level completed: 0%, Points required for next Level: 0
    noetsi
    Posts
    6,532
    Thanks
    692
    Thanked 915 Times in 874 Posts

    Re: Logistic Regression


    To some extent it depends on whether the data are ordinal or nominal. If a variable is ordinal with 20-plus levels, it may well be close to interval-scaled, so you could treat it that way. If it is truly nominal, then you probably want to collapse multiple levels into one before you do any analysis. So you might code levels 1-6 as 1, 7-13 as 2, etc. To do this you need some theoretical, or at least common-sense, reason for the groupings you create. This is often done with age, for example.
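
    The collapsing step could be sketched as follows. The first two bands (1-6 and 7-13) come from the example above; the third band ending at 20 and the function name `collapse` are assumptions for illustration only. In practice the cut points should come from subject-matter reasoning.

```python
from bisect import bisect_left

# Inclusive upper edge of each band: 1-6 -> 1, 7-13 -> 2, 14-20 -> 3
# (the third band is a hypothetical continuation of the "etc.")
UPPER_BOUNDS = [6, 13, 20]


def collapse(level, upper_bounds=UPPER_BOUNDS):
    """Map an integer-coded level to a 1-based collapsed group."""
    if level < 1 or level > upper_bounds[-1]:
        raise ValueError(f"level {level} is outside the defined bands")
    # bisect_left returns the index of the first bound >= level
    return bisect_left(upper_bounds, level) + 1


original = [1, 6, 7, 13, 14, 20]
collapsed = [collapse(v) for v in original]
print(collapsed)
```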
    "Very few theories have been abandoned because they were found to be invalid on the basis of empirical evidence...." Spanos, 1995
