
Thread: Help with variable selection in logistic regression using a small dataset

  1. #1





    Hello.
    I have a relatively small data-set looking at a binary outcome of death after a medical procedure.

    There are 102 patients total in the data-set, and 26 deaths.

    I am interested in looking at correlates of death. I first calculated univariate odds ratios, and have a list of 11 factors which are statistically significant. Some of them are probably related to each other (such as fluid intake during the procedure, fluid output during the procedure, and the net fluid during the procedure).

    Since there are only 26 events, I'm not sure how to best approach choosing which variables to put in a multivariable logistic model. If I'm less conservative I guess I can choose 1 variable per 5 outcomes, but that only allows me to choose 5 variables to put in the model.
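    For reference, the arithmetic behind the events-per-variable rule of thumb can be sketched in a few lines of Python. This is only an illustration: the function name `max_predictors` is hypothetical, and the 10-events-per-variable threshold is the commonly quoted conservative version of the rule, included here for comparison with the 1-per-5 figure above.

```python
def max_predictors(n_total, n_events, events_per_variable):
    """Rough cap on model terms under an events-per-variable rule of thumb."""
    # The limiting count is the rarer of the two outcome classes.
    limiting = min(n_events, n_total - n_events)
    return limiting // events_per_variable


# Numbers from the post: 102 patients, 26 deaths.
n_total, n_events = 102, 26
for epv in (5, 10):  # liberal and conservative thresholds (assumed for illustration)
    print(f"EPV {epv}: at most {max_predictors(n_total, n_events, epv)} predictors")
```

    With these numbers the liberal threshold allows 5 predictors and the conservative one allows 2.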

    I have been exploring various methods of forward/backward selection, but I'm not having much success.

    Given the relatively small number of outcomes, is fitting a logistic model inappropriate? If not, how should I approach choosing which variables to include in a logistic model?

    I am using Stata for analysis.

    Thank you for your help with this!

  2. #2
    hlsmith

    Re: Help with variable selection in logistic regression using a small dataset


    You are probably looking at 2 predictors at most. The problem is generating internally and externally valid information. Using 5 predictors would likely leave you with subgroups of only 1 or 2 patients, which are dubiously representative of all comparable patients. Choosing variables that are both statistically and clinically significant is your best bet.
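    To make the subgroup point concrete, here is a rough Python sketch. It assumes every predictor is binary and that patients spread evenly across covariate patterns, neither of which is stated in the original post; real data will be lumpier, so some patterns would be even sparser. The function name `average_cell_counts` is made up for illustration.

```python
def average_cell_counts(n_total, n_events, n_binary_predictors):
    """Average patients and events per covariate pattern, assuming binary predictors."""
    # k binary predictors define 2**k distinct covariate patterns (subgroups).
    cells = 2 ** n_binary_predictors
    return cells, n_total / cells, n_events / cells


# Numbers from the thread: 102 patients, 26 deaths.
for k in (2, 5):
    cells, patients, deaths = average_cell_counts(102, 26, k)
    print(f"{k} predictors: {cells} patterns, "
          f"{patients:.1f} patients and {deaths:.1f} deaths per pattern on average")
```

    Under those assumptions, 5 predictors spread the 26 deaths over 32 patterns, which is less than one death per pattern on average.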
