
Thread: Logistic Regression

  1. #1

    Logistic Regression




    Hello,

    I'm currently working on a churn modelling exercise that uses logistic regression. The challenge I'm facing is that the model is not returning good results. My hypothesis is that the churn rate in my data set is so low that the logistic regression cannot pick up anything meaningful.

    Some context:
    - The sample population is 250,000, with about 0.3% (approx. 700) churners each month over a 3-month period
    - Predictors include education, marriage, gender, income level etc.; most of them require dummy coding

    My questions are therefore:
    - Is the number of churners in the data set too small for any meaningful logistic regression?
    - If that's the case, would it be OK to remove some of the non-churner records from the sample population to create a dataset that contains a larger proportion of churners?
    - Is there an ideal ratio of churners to non-churners in the dataset that allows for a meaningful logistic regression?

    Help much appreciated!

  2. #2
    rogojel

    Re: Logistic Regression

    Hi,
    I think you could use the idea of a case-control study:

    https://en.m.wikipedia.org/wiki/Case-control_study

    Others here might be able to help you set this up correctly.
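
    To make the case-control idea a bit more concrete, here is a rough sketch in plain Python, not a definitive recipe. It assumes you keep every churner and a random fraction of the non-churners, then fit the logistic regression on that smaller sample. Under that sampling scheme the slope coefficients are still estimated consistently, but the intercept (and so the predicted probabilities) is shifted by the log of the sampling ratio and needs to be corrected back. The function names and the `keep_fraction` parameter are made up for illustration.

```python
import math
import random


def downsample_non_churners(rows, keep_fraction, seed=0):
    """Keep every churner and a random fraction of the non-churners.

    rows: list of (features, churned) tuples with churned in {0, 1}.
    keep_fraction: probability of keeping each non-churner (hypothetical name).
    """
    rng = random.Random(seed)
    return [r for r in rows if r[1] == 1 or rng.random() < keep_fraction]


def corrected_intercept(fitted_intercept, keep_fraction):
    """Undo the intercept shift caused by sampling only the non-churners.

    Assumes churners were kept with probability 1 and non-churners
    with probability keep_fraction.
    """
    return fitted_intercept + math.log(keep_fraction)


def corrected_probability(p_sampled, keep_fraction):
    """Map a probability predicted on the downsampled data back to the
    scale of the full population (same assumption as above)."""
    odds = p_sampled / (1.0 - p_sampled) * keep_fraction
    return odds / (1.0 + odds)


# Toy illustration with invented data: 10 churners among 1000 customers.
rows = [((i,), 1 if i < 10 else 0) for i in range(1000)]
sample = downsample_non_churners(rows, keep_fraction=0.1, seed=1)
print(len(sample), sum(y for _, y in sample))
print(corrected_probability(0.5, keep_fraction=0.01))
```

    The point of the correction is that a predicted churn probability of 0.5 on a sample where only 1% of non-churners were kept corresponds to roughly 1% on the full population.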

    regards

  3. #3
    Karabiner

    Re: Logistic Regression


    I guess that this is a "logistic regression for rare events" problem.
    I don't know whether 0.3% might ever be a problem in itself,
    but I know that at least the number of events is crucial for
    logistic regression. As long as you do not use too many
    covariates in your model, you'll probably be ok with n=700
    (or 3*700?) events. http://statisticalhorizons.com/logis...or-rare-events.
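
    As a rough way to check "not too many covariates", one could count the parameters that dummy coding creates and compare them with the number of events. The sketch below assumes the commonly quoted rule of thumb of roughly 10 events per estimated parameter, which is a heuristic and not a hard limit. The number of levels per predictor is invented for illustration, since the original post does not give them.

```python
def dummy_parameters(levels_per_predictor):
    """Number of coefficients needed when each categorical predictor
    with k levels is dummy coded into k - 1 indicator variables."""
    return sum(k - 1 for k in levels_per_predictor.values())


def events_per_parameter(n_events, n_parameters):
    """Events (churners) available per estimated coefficient."""
    return n_events / n_parameters


# Hypothetical level counts; replace with the real ones from the data.
levels = {"education": 5, "marriage": 3, "gender": 2, "income_level": 6}
n_params = dummy_parameters(levels)

for n_events in (700, 3 * 700):
    print(n_events, n_params, round(events_per_parameter(n_events, n_params), 1))
```

    With these made-up level counts there are 12 parameters, so even 700 events gives far more than 10 events per parameter.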

    Just my 2pence

    K.
