
Thread: Logistic regression, sample sizes and bootstrapping

  1. #1

    Logistic regression, sample sizes and bootstrapping

    Hello, this is my first post. I am currently working on a multivariate logistic model for my PhD, but I have a problem regarding the sample sizes of my groups:

    - The "success" (1) event group has a sample size of 249 distinct observations.
    - The "non-success" (0) event group has a sample size of 48,957; it is a significant part of the population and many times larger than the "success" group.

    So when I fit a multivariate logistic regression model, two things happen:
    - The p-values of the model coefficients are always significant at alpha = 1%.
    - The variation in fitted predicted probabilities between the groups becomes very small, even when the independent variables are good predictors.

    So it was suggested that I bootstrap the model. Here is my question:

    - Should I do it the more usual way, taking smaller random samples of arbitrary size from the complete sample (both groups) with replacement (or without replacement in this case)?

    - Or should I keep the small "success" group constant and then add an equal number of randomly picked "non-success" cases, different in each sample?

    In both cases this is not exactly the usual bootstrap, since I wish to draw sub-samples from a larger group, the original big sample, and perhaps what I want is not a bootstrap at all. My idea is to minimize the discrepancy between the two group sizes. Is this theoretically correct?
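For concreteness, the two sampling schemes described above could be sketched as draws of row indices. This is only an illustration under stated assumptions: the function names `subsample_both` and `balanced_subsample`, the sub-sample size of 5,000, and the layout of the indices (successes first) are hypothetical and not taken from the actual data.

```python
import random

def subsample_both(n_total, size, rng, replace=True):
    """Option 1: draw `size` row indices from the complete sample (both groups)."""
    if replace:
        return [rng.randrange(n_total) for _ in range(size)]
    return rng.sample(range(n_total), size)

def balanced_subsample(success_idx, failure_idx, rng):
    """Option 2: keep every success and add an equal number of
    non-successes, drawn without replacement."""
    return list(success_idx) + rng.sample(failure_idx, len(success_idx))

rng = random.Random(0)

# Assumed layout: rows 0..248 are successes, the remaining rows are non-successes.
n_success, n_failure = 249, 48_957
success_idx = list(range(n_success))
failure_idx = list(range(n_success, n_success + n_failure))

opt1 = subsample_both(n_success + n_failure, 5_000, rng)   # arbitrary size (assumption)
opt2 = balanced_subsample(success_idx, failure_idx, rng)

print(len(opt1), len(opt2))
```

Note that under option 2 half of every sub-sample is a success by construction, so the intercept of a model fitted to it no longer reflects the event rate of the full sample, and predicted probabilities computed from it would be shifted upward unless that is corrected for.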

    The final objective is to obtain an "average" model with the mean coefficients and use it to calculate the probability of the "non-success" cases actually being "successes".

    Or does anybody have another idea for this question?
    Last edited by Edu; 05-06-2014 at 07:22 PM.

  2. #2
    TS Contributor
    maartenbuis

    Re: Logistic regression, sample sizes and bootstrapping

    I would just stick with your original model. I don't see why you would consider significant results and small differences in predicted probability a problem. Your results just sound like an accurate representation of your data to me.
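A quick back-of-the-envelope check with the group sizes from the first post may help show why small predicted probabilities are expected here. The odds ratio of 3 below is a made-up value used only for illustration, not an estimate from the poster's model.

```python
import math

n_success = 249
n_failure = 48_957

# Overall event rate in the sample: about half a percent.
base_rate = n_success / (n_success + n_failure)

# In an intercept-only logistic model, the intercept is the log-odds of the base rate.
baseline_logit = math.log(n_success / n_failure)

def prob_from_logit(z):
    """Inverse logit: convert log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical predictor that triples the odds of success (assumed value).
shifted = prob_from_logit(baseline_logit + math.log(3.0))

print(f"base rate: {base_rate:.4f}")
print(f"probability after tripling the odds: {shifted:.4f}")
```

Even a predictor that triples the odds moves the probability only from roughly 0.5% to roughly 1.5%, so a small spread in fitted probabilities is compatible with strong predictors when the event is this rare.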

  3. #3
    Omega Contributor
    hlsmith

    Re: Logistic regression, sample sizes and bootstrapping

    I agree with maartenbuis. In addition, the bootstrap is often run with a sampling rate of 1, meaning each resample is the same size as the whole sample but drawn with replacement. Going that general route does not seem like it would add anything to your analysis, unless for some reason you think your original sampling design was systematically flawed.
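For reference, a bootstrap with a sampling rate of 1 might look roughly like the sketch below. It is a minimal illustration, not the poster's model: the data are synthetic, there is a single predictor, and the small Newton-Raphson fitter `fit_logistic` is a hypothetical stand-in for whatever statistics package is actually in use.

```python
import math
import random
import statistics

def sigmoid(z):
    """Numerically stable inverse logit."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(x, y, max_iter=50, tol=1e-8):
    """Fit y ~ b0 + b1 * x by Newton-Raphson; returns (b0, b1)."""
    b0 = b1 = 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = sigmoid(b0 + b1 * xi)
            w = p * (1.0 - p)
            g0 += yi - p            # score for the intercept
            g1 += (yi - p) * xi     # score for the slope
            h00 += w                # entries of the information matrix
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det < 1e-12:             # degenerate resample, stop updating
            break
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if max(abs(d0), abs(d1)) < tol:
            break
    return b0, b1

# Synthetic data (assumption): true intercept -1.5, true slope 1.0.
rng = random.Random(42)
n = 500
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [1 if rng.random() < sigmoid(-1.5 + 1.0 * xi) else 0 for xi in x]

b0_hat, b1_hat = fit_logistic(x, y)

# Sampling rate 1: each resample has n rows, drawn with replacement.
n_boot = 100
boot_slopes = []
for _ in range(n_boot):
    idx = [rng.randrange(n) for _ in range(n)]
    _, b1_star = fit_logistic([x[i] for i in idx], [y[i] for i in idx])
    boot_slopes.append(b1_star)

boot_slopes.sort()
se_boot = statistics.stdev(boot_slopes)
# Rough 95% percentile interval from the sorted replicates.
ci_low, ci_high = boot_slopes[2], boot_slopes[97]

print(f"slope estimate: {b1_hat:.3f}, bootstrap SE: {se_boot:.3f}")
print(f"approximate 95% interval: ({ci_low:.3f}, {ci_high:.3f})")
```

The resamples only describe the sampling variability of the coefficients around the original fit; they do not change the event rate in the data, which is why this route adds little if the original model is already acceptable.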
