
Thread: Selecting Variables for Multiple Regression (Univariate Significance Levels)

  1. #1
    Points: 44, Level: 1
    Level completed: 88%, Points required for next Level: 6

    Posts
    4
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Selecting Variables for Multiple Regression (Univariate Significance Levels)




    Hi All! I really hope someone can answer my question.

    I am building multiple linear regressions and I am testing salient variables one by one at the univariate level to determine whether I should include them.

    However, what are the currently accepted p-value limits for including or excluding variables at the univariate level? I have gotten some very conflicting advice, which is why I turned to this site. At first I was using p < 0.05, but then I was told to use 0.20, and later I read about p-values ranging from 0.25 to 0.50. So I am very confused. If I continue to select variables at the 0.05 level, is there any literature that can justify this?

    Thanks,

    Pranaphish

  2. #2
    Points: 115, Level: 2
    Level completed: 30%, Points required for next Level: 35

    Posts
    16
    Thanks
    0
    Thanked 3 Times in 3 Posts

    Re: Selecting Variables for Multiple Regression (Univariate Significance Levels)

    Go with stepwise regression. It will keep only the significant variables. A cut-off of 0.05 is generally used.

  3. #3
    Human
    Points: 7,379, Level: 57
    Level completed: 15%, Points required for next Level: 171
    GretaGarbo's Avatar
    Posts
    862
    Thanks
    286
    Thanked 299 Times in 268 Posts

    Re: Selecting Variables for Multiple Regression (Univariate Significance Levels)

    Stepwise regression is controversial. I suggest NOT using it. Stepwise=unwise!
    Think for yourself instead!

  4. #4
    R must die
    Points: 24,995, Level: 95
    Level completed: 65%, Points required for next Level: 355
    noetsi's Avatar
    Posts
    4,535
    Thanks
    278
    Thanked 730 Times in 700 Posts

    Re: Selecting Variables for Multiple Regression (Univariate Significance Levels)

    Stepwise has significant problems despite its use (likely because those who use it are unaware of the issues).

    My advice is not to use bivariate relationships, period, to decide whether to enter a variable, because bivariate relationships commonly have little to do with the marginal relationships (the unique variance explained by a given predictor) in a multiple regression model. The best strategy is theory, or what makes sense substantively. An alternative is to run various models and choose the one with the lowest BIC (or alternatively run them all and remove the ones that are not significant, although some disagree with that strategy as well).

    Another problem with using univariate criteria is that it ignores interaction effects.
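    A minimal sketch of the "run various models and pick the lowest BIC" idea, under stated assumptions: the data are simulated, the names x1, x2, x3 are hypothetical, and BIC is computed as n*ln(RSS/n) + k*ln(n), which drops constants that are the same for every model fitted to the same data. It uses only the Python standard library, so the least-squares fit is done by hand rather than with a statistics package.

```python
import itertools
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols_rss(columns, y):
    """Fit y on an intercept plus the given columns; return the residual sum of squares."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    xtx = [[sum(X[i][j] * X[i][k] for i in range(n)) for k in range(p)] for j in range(p)]
    xty = [sum(X[i][j] * y[i] for i in range(n)) for j in range(p)]
    beta = solve(xtx, xty)
    return sum((y[i] - sum(X[i][j] * beta[j] for j in range(p))) ** 2 for i in range(n))


def bic(columns, y):
    """BIC up to an additive constant; k counts the intercept plus the slopes."""
    n = len(y)
    k = len(columns) + 1
    return n * math.log(ols_rss(columns, y) / n) + k * math.log(n)


# Simulated data (an assumption for illustration): x1 and x2 matter, x3 is pure noise.
random.seed(1)
n = 200
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
x3 = [random.gauss(0, 1) for _ in range(n)]
y = [2.0 * a + 1.5 * b + random.gauss(0, 1) for a, b in zip(x1, x2)]

candidates = {"x1": x1, "x2": x2, "x3": x3}

# Score every non-empty subset of the candidate predictors.
scores = {}
for size in range(1, len(candidates) + 1):
    for names in itertools.combinations(candidates, size):
        scores[names] = bic([candidates[nm] for nm in names], y)

best = min(scores, key=scores.get)
for names, score in sorted(scores.items(), key=lambda kv: kv[1]):
    print(names, round(score, 1))
print("lowest BIC:", best)
```

    With 20 candidate variables an exhaustive search like this covers over a million subsets, so in practice the candidate models would come from theory rather than from every possible combination.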
    This was not what we did in logistic regression. Rather, we transformed the conditional expected value, and made that a linear function of X. This seems odd, because it is odd..

  5. #5
    Test of Gnomality
    Points: 13,909, Level: 76
    Level completed: 65%, Points required for next Level: 141
    hlsmith's Avatar
    Posts
    2,618
    Thanks
    162
    Thanked 435 Times in 424 Posts

    Re: Selecting Variables for Multiple Regression (Univariate Significance Levels)

    The cut-off you use depends on the situation. Often the inclusion level for entry (candidate) variables will be higher (e.g., 0.20), but the significance level in the final model will drop back down (e.g., 0.05). This gives the covariates the opportunity to commingle for an instant.

    But overall this is typically situation and discipline based.
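    To make the two entry levels concrete, here is a rough sketch of univariate screening at 0.05 and at 0.20. Everything in it is an assumption for illustration: the data are simulated, the names x1 to x5 are hypothetical, and the p-value uses a normal approximation to the t statistic (reasonable for a sample of 200, less so for small samples), because the Python standard library has no t distribution.

```python
import math
import random


def univariate_p(x, y):
    """Approximate two-sided p-value for the slope in a simple regression of y on x.

    The t statistic is computed from the correlation r and then referred to a
    standard normal distribution, which is an approximation for large n.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    t = r * math.sqrt((n - 2) / (1 - r * r))
    return math.erfc(abs(t) / math.sqrt(2))


# Simulated data (an assumption): x1 has a strong effect, x2 a weak one,
# and x3 to x5 have no effect at all.
random.seed(0)
n = 200
names = ["x1", "x2", "x3", "x4", "x5"]
predictors = {name: [random.gauss(0, 1) for _ in range(n)] for name in names}
y = [
    1.0 * predictors["x1"][i] + 0.15 * predictors["x2"][i] + random.gauss(0, 1)
    for i in range(n)
]

p_values = {name: univariate_p(col, y) for name, col in predictors.items()}

# Candidate sets under the two entry levels discussed in this thread.
entry_005 = [name for name, p in p_values.items() if p < 0.05]
entry_020 = [name for name, p in p_values.items() if p < 0.20]

for name, p in p_values.items():
    print(name, round(p, 4))
print("enter at 0.05:", entry_005)
print("enter at 0.20:", entry_020)
```

    Anything that enters at 0.05 also enters at 0.20, so the looser level can only add candidates; whether the extra candidates survive is then decided in the final model.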
    Disregard the number of posts I have on this forum => I likely have no idea what I am writing about!

  6. #6
    R must die
    Points: 24,995, Level: 95
    Level completed: 65%, Points required for next Level: 355
    noetsi's Avatar
    Posts
    4,535
    Thanks
    278
    Thanked 730 Times in 700 Posts

    Re: Selecting Variables for Multiple Regression (Univariate Significance Levels)

    Except, IMHO, a lot of disciplines know so little about statistics that their behavior is doubtful. Having come from one of those disciplines myself (public management), it took me a long time to realize that just because something came from a journal did not mean the writer actually understood the method. I now know I did tons of stuff that was just flat-out bad practice.

  7. #7
    Points: 44, Level: 1
    Level completed: 88%, Points required for next Level: 6

    Posts
    4
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Selecting Variables for Multiple Regression (Univariate Significance Levels)

    Would it help if I said I was testing for main effects of each variable in the regression model and not the interaction of the variables? In my final multiple regression models I use p < .05 for significance, but I just want to make sure you know I am asking about the step before, when I am selecting the variables. I have a huge variable set, so I need to narrow it down. So please make your case for .05 or .2 at the univariate level. References would be appreciated! Thanks!

  8. #8
    R must die
    Points: 24,995, Level: 95
    Level completed: 65%, Points required for next Level: 355
    noetsi's Avatar
    Posts
    4,535
    Thanks
    278
    Thanked 730 Times in 700 Posts

    Re: Selecting Variables for Multiple Regression (Univariate Significance Levels)

    You really should not be testing for the main effects if there is a significant interaction effect, because the meaning of the main effects is then doubtful. Automatically throwing out interactions is not a good idea, even if it is admittedly commonly done.

    But to repeat what I said before, I don't think it is valid to use the bivariate numbers to tell you which variables to include in the model, because the effects in a multiple regression model are commonly very different. Variables that are strong in a bivariate comparison may be very weak in regression. The best way to select is theory or what the existing literature says on this topic. Or, in the worst case, what makes sense to you to include.
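    A small simulation can illustrate how a variable that looks strong on its own may be weak in a multiple regression. This is only a sketch under assumptions: the data are simulated, x2 is constructed as a noisy copy of x1, and only x1 actually drives y. The two-predictor fit solves the 2x2 normal equations on centered data, using the Python standard library only.

```python
import random

# Simulated data (an assumption): x2 is a proxy for x1 and has no effect of its own.
random.seed(2)
n = 500
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [a + random.gauss(0, 0.5) for a in x1]
y = [2.0 * a + random.gauss(0, 1) for a in x1]


def centered(v):
    """Subtract the mean so the intercept drops out of the normal equations."""
    m = sum(v) / len(v)
    return [a - m for a in v]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


c1, c2, cy = centered(x1), centered(x2), centered(y)

# Bivariate (simple regression) slope of y on x2 alone.
bivariate_slope_x2 = dot(c2, cy) / dot(c2, c2)

# Multiple regression of y on x1 and x2: solve the 2x2 normal equations.
s11, s12, s22 = dot(c1, c1), dot(c1, c2), dot(c2, c2)
s1y, s2y = dot(c1, cy), dot(c2, cy)
det = s11 * s22 - s12 * s12
b1 = (s22 * s1y - s12 * s2y) / det
b2 = (s11 * s2y - s12 * s1y) / det

print("slope of y on x2 alone:     ", round(bivariate_slope_x2, 3))
print("x1 coefficient, joint model:", round(b1, 3))
print("x2 coefficient, joint model:", round(b2, 3))
```

    In this setup x2 would pass any univariate screen, yet once x1 is in the model its coefficient shrinks toward zero, which is the gap between bivariate and marginal relationships described above.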

  9. #9
    Points: 44, Level: 1
    Level completed: 88%, Points required for next Level: 6

    Posts
    4
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Selecting Variables for Multiple Regression (Univariate Significance Levels)

    Noetsi - I wish I could take your advice because it would be much easier; however, I have to make a choice and justify it for a research project. There are only two choices, .2 or .05 at the bivariate level, because the majority of the variables (20 variables) could all be included based on the literature. However, I have read that only 5-6 variables should be used in the model. The reason I am testing main effects is that I am looking at the relationship of one variable in the model but had to control for others, for the sake of doing it the way my teacher wanted it done. So yeah... there's that!

  10. #10
    Points: 44, Level: 1
    Level completed: 88%, Points required for next Level: 6

    Posts
    4
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Selecting Variables for Multiple Regression (Univariate Significance Levels)


    Padashri - Are you saying to use 0.05 at the bivariate level?
