
Thread: Forward selection of non significant variables

  1. #1
    Hi,

    In SAS Enterprise Miner, I trained a logistic regression using forward selection with the AIC criterion.

    I grouped the rare levels of the categorical variables. One of these variables was selected by the algorithm, but none of its category coefficients was statistically significant (that is, significantly different from 0).

    Why would the algorithm select such a variable if none of its categories is significant? Does anyone know of a statistical explanation?

    The test level is .05, and the p-values for the categories are around .93.

    Thank you for your help,
    Marco

  2. #2
    Dason

    Do you have the history of when each variable was added? It's possible for a variable to look very important when it enters the model but drop to non-significance after other variables are added. Forward selection doesn't drop variables once they've been added, so that might be what happened here. Is there a reason you're using a stepwise procedure in the first place, though?
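
    A related point, offered as a rough sketch rather than a description of what Enterprise Miner does internally: if the forward step adds a variable whenever doing so lowers AIC, then the entry decision is not made at the .05 level at all. Adding a term with k parameters lowers AIC exactly when the likelihood-ratio statistic exceeds 2k, so the implied entry level is P(chi-square with k df > 2k), which for a 1-df term is about .157. The helper name `chi2_upper_tail` below is my own, and the code only uses Python's standard library.

```python
import math

def chi2_upper_tail(x, df):
    """P(chi-square with integer df > x), via the incomplete-gamma recurrence
    Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1), with a = df/2, z = x/2."""
    z = x / 2
    if df % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(z))  # closed form for df = 1
    else:
        a, q = 1.0, math.exp(-z)             # closed form for df = 2
    while 2 * a < df:
        q += z ** a * math.exp(-z) / math.gamma(a + 1)
        a += 1
    return q

# AIC = -2 * logL + 2 * (number of parameters), so adding k parameters
# lowers AIC only if the likelihood-ratio statistic is larger than 2 * k.
for k in (1, 2, 3, 5, 10):
    implied_alpha = chi2_upper_tail(2 * k, k)
    print(f"df = {k:2d}: AIC entry threshold LR > {2 * k:2d}, implied alpha = {implied_alpha:.4f}")
```

    So a variable can enter under AIC without being significant at .05, and how lenient the rule is depends on how many levels the categorical variable has.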

  3. #3
    hlsmith

    Run a traditional model using the selected variables, look at the Type I and then the Type III effects for the variable, and report back what you see.


    PS: Suppose I have a categorical variable, say insurance type. If I enter the categorical variable into the model and one group is highly predictive of, say, death, why shouldn't the model keep the full categorical variable around? It will, unless I dummy code each level myself as a yes/no variable. Try that as well if you are unsatisfied that the categorical variable as a whole is sticking around.
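
    To illustrate why the test of the whole variable and the per-level coefficient tests can disagree, here is a small sketch with made-up counts (not the original poster's data). It assumes a logistic model with a single categorical predictor, where the fitted probabilities are just the observed proportions, so no fitting routine is needed. Each dummy coefficient is a log odds ratio against the reference level "A", which is deliberately rare here, so its Wald tests are weak even though levels "B" and "C" clearly differ from each other.

```python
import math

# Hypothetical (events, non-events) per level; "A" is a rare reference level.
counts = {"A": (5, 5), "B": (40, 60), "C": (60, 40)}
reference = "A"

def log_lik(events, non_events):
    """Binomial log-likelihood at the observed proportion (the MLE)."""
    p = events / (events + non_events)
    return events * math.log(p) + non_events * math.log(1 - p)

# Wald test of each dummy coefficient (log odds ratio vs. the reference level).
ref_e, ref_ne = counts[reference]
wald_p = {}
for level, (e, ne) in counts.items():
    if level == reference:
        continue
    beta = math.log(e / ne) - math.log(ref_e / ref_ne)
    se = math.sqrt(1 / e + 1 / ne + 1 / ref_e + 1 / ref_ne)
    wald_p[level] = math.erfc(abs(beta / se) / math.sqrt(2))  # two-sided normal p-value

# Joint likelihood-ratio test of the whole variable against the intercept-only model.
ll_full = sum(log_lik(e, ne) for e, ne in counts.values())
total_e = sum(e for e, _ in counts.values())
total_ne = sum(ne for _, ne in counts.values())
ll_null = log_lik(total_e, total_ne)
lr_stat = 2 * (ll_full - ll_null)
df = len(counts) - 1
joint_p = math.exp(-lr_stat / 2)  # chi-square upper tail; this shortcut is valid only for df == 2

# AIC = -2 * logL + 2 * (number of parameters, including the intercept).
aic_null = -2 * ll_null + 2 * 1
aic_full = -2 * ll_full + 2 * (1 + df)

print("Wald p-values vs. reference:", {k: round(v, 3) for k, v in wald_p.items()})
print("Joint LR test: stat =", round(lr_stat, 3), "df =", df, "p =", round(joint_p, 4))
print("AIC without / with the variable:", round(aic_null, 2), "/", round(aic_full, 2))
```

    In this toy example the variable lowers AIC and the joint test rejects at .05, while neither individual coefficient comes close to significance. Changing the reference level to a larger group would change the per-level p-values but not the joint test.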
