
Thread: Adaptive Lasso and CrossValidation for SNV selection

  #1  Question: Adaptive Lasso and CrossValidation for SNV selection




    Hello everyone,

    I have 17,000 variables (SNV frequencies, a fair number of which are zero) for 40 patients. Each patient is labelled by their response to a treatment: 13 responses and 27 no-responses. I want to extract a subset of SNVs with strong predictive power.

    Because the set of variables is so large, there are strong correlations among the predictors, which is why I'm considering the adaptive lasso. I used the glmnet R package, with ridge coefficients as the initial estimates for the adaptive weights, and the following R code:

    Code: 
    library(cvTools)
    library(glmnet)
    ## x: 40 x 17000 matrix of SNV frequencies
    ## y: factor with levels "response" / "noresponse"
    ## parallel = TRUE needs a registered backend, e.g. doParallel::registerDoParallel()

    err.test.response <- c()
    err.test.noresponse <- c()
    nbiters <- 50

    for (i in 1:nbiters) {
      ## k folds
      kflds <- 8
      flds <- cvFolds(length(y), K = kflds)

      pred.test <- c()   ## predicted classes
      class.test <- c()  ## real classes

      for (j in 1:kflds) {
        ## cvFolds records the fold of each *permuted* index, so map back through $subsets
        idx.test <- flds$subsets[flds$which == j, 1]
        ## Train
        x.train <- x[-idx.test, ]
        y.train <- y[-idx.test]
        ## Test
        x.test <- x[idx.test, , drop = FALSE]
        y.test <- y[idx.test]

        ## Adaptive weights vector from an initial ridge fit
        cv.ridge <- cv.glmnet(x.train, y.train, family = 'binomial', alpha = 0,
                              standardize = FALSE, parallel = TRUE, nfolds = 7)
        ridge.coefs <- as.numeric(coef(cv.ridge, s = cv.ridge$lambda.min))[-1]  ## drop intercept
        w3 <- 1 / abs(ridge.coefs)^1
        w3[w3 == Inf] <- 999999999

        ## Adaptive lasso
        cv.lasso <- cv.glmnet(x.train, y.train, family = 'binomial', alpha = 1,
                              standardize = FALSE, parallel = TRUE, type.measure = 'class',
                              penalty.factor = w3, nfolds = 7)
        ## Prediction
        pred.test <- c(pred.test, predict(cv.lasso, newx = x.test, s = 'lambda.1se', type = "class"))
        class.test <- c(class.test, as.character(y.test))
      }

      ## Per-class prediction error for this repeat
      err.test.noresponse <- c(err.test.noresponse,
                               1 - sum(pred.test == "noresponse" & class.test == "noresponse") /
                                   sum(class.test == "noresponse"))  ## noresponse error vector
      err.test.response <- c(err.test.response,
                             1 - sum(pred.test == "response" & class.test == "response") /
                                 sum(class.test == "response"))      ## response error vector
    }

    mean(err.test.noresponse)  ## Mean noresponse prediction error
    mean(err.test.response)    ## Mean response prediction error

    Is this a sound way to run an external cross-validation to evaluate the adaptive lasso's predictive power on my data?

    My results are not conclusive at all: I get mean(err.test.noresponse) = 0.15 and mean(err.test.response) = 0.88, so the model fails to identify the responders. Do you have any idea why the results are so poor and how I could improve them?
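
    In case it is useful, here is the kind of sanity check that could be added after the loop (just a rough sketch reusing pred.test, class.test, flds, kflds and y from the code above):

    Code: 
    ## Rough sketch: confusion matrix for the last repeat, plus how many of the
    ## 13 "response" patients end up in each test fold
    table(truth = class.test, predicted = pred.test)

    responders.per.fold <- sapply(1:kflds, function(j) {
      idx <- flds$subsets[flds$which == j, 1]
      sum(y[idx] == "response")
    })
    responders.per.fold  ## with 13 responders over 8 folds, some folds hold only 1 or 2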

    Thanks for your help and your ideas,

    Corentin

  #2  rogojel (TS Contributor)

    Re: Adaptive Lasso and CrossValidation for SNV selection

    Hi,
    as a quick idea: have you considered principal component analysis?
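
    Something along these lines, just as a rough sketch (x and y as in your post; keeping 5 components is an arbitrary placeholder):

    Code: 
    ## Rough sketch: PCA followed by ordinary logistic regression on a few components
    pc <- prcomp(x, center = TRUE, scale. = FALSE)   # at most 40 components for 40 patients
    scores <- pc$x[, 1:5]                            # scores on the first 5 components

    dat <- data.frame(y = y, scores)
    fit <- glm(y ~ ., data = dat, family = binomial) # y: factor "response"/"noresponse"
    summary(fit)

    The components are uncorrelated by construction, which sidesteps the collinearity, although each component mixes all 17,000 SNVs rather than selecting a subset of them.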

    regards

  #3  hlsmith (Omega Contributor)

    Re: Adaptive Lasso and CrossValidation for SNV selection

    I can't recall what SNV stands for, but I am guessing something like single nucleotide ..., or some other kind of gene marker. So you have 17,000 markers and you want to see whether any are related to your binary outcome, with the lesser outcome group being 13 patients, or about 33%. The lasso is a shrinkage/selection procedure, so it is better suited than ridge for your purpose: its prior belief, loosely speaking, is that every variable has zero predictive value, and the data have to pull the coefficients away from zero.
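
    To make the selection point concrete, a minimal sketch (fit on your full x and y only to illustrate; the exact counts will differ):

    Code: 
    ## Minimal sketch: ridge shrinks but keeps every coefficient, the lasso zeroes most of them out
    library(glmnet)

    cv.r <- cv.glmnet(x, y, family = "binomial", alpha = 0)  # ridge
    cv.l <- cv.glmnet(x, y, family = "binomial", alpha = 1)  # lasso

    sum(coef(cv.r, s = "lambda.1se") != 0)  # ridge: essentially all 17,001 coefficients non-zero
    sum(coef(cv.l, s = "lambda.1se") != 0)  # lasso: typically only a handful survive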


    You did CV, which is great, but with 8 folds each model trains on about 35 observations and is tested on only 5, so roughly 33% of 5, about 1.7 observations, would come from the lesser outcome group; and how often was a given explanatory variable even present in those one or two cases? You can probably inspect your folds to make sure each one has some representation of the lesser outcome group. I am guessing the sample size was your hindrance. Is the lasso a common approach in similar gene studies with sparse binary vectors (sparse data combinations)? Also, how common were the gene markers; were most of them 0s, is that what you meant? I don't know the answer, but are elastic nets (e-nets) better with this?
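
    For example, a stratified split, sketched here with caret::createFolds (an extra package, not in your code), keeps a share of the 13 responders in every test fold:

    Code: 
    ## Sketch: stratified 8-fold split on the outcome factor
    library(caret)

    set.seed(1)
    flds.strat <- createFolds(y, k = 8, list = TRUE, returnTrain = FALSE)

    sapply(flds.strat, function(idx) table(y[idx]))  # responses / no-responses per test fold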


    Also, I think most of the regularization models were developed for linear (continuous) outcomes and predictors, so their application to binary outcomes is still a little developmental. Were all of your betas pretty much zero? Side note: you are ruling out interaction terms, but is that reasonable?
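
    To check that, a quick sketch reusing the cv.lasso object from your code:

    Code: 
    ## Sketch: which coefficients survive the adaptive lasso at lambda.1se?
    b <- coef(cv.lasso, s = "lambda.1se")       # sparse (p + 1) x 1 matrix, intercept first
    selected <- rownames(b)[as.vector(b != 0)]
    setdiff(selected, "(Intercept)")            # SNVs with non-zero coefficients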


    I would say, look for a comparable research question and see how it was addressed. I am interested in your question, in that I need to do regularization in a couple of months with a binary dependent variable, so update this thread as appropriate to help others. My project won't be as sparse as yours, in that I won't be looking at a mountain of binary predictors.

  #4  hlsmith (Omega Contributor)

    Re: Adaptive Lasso and CrossValidation for SNV selection


    Did glmnet automate the selection of the penalization parameter, or did you need to specify different values, rerun the CV-fold procedure, and examine the misclassification? Ah, I now see the lambda.1se line. Another possibility is that there is simply no relationship.
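
    For reference, a quick sketch of what cv.glmnet exposes (cv.lasso as fitted in your code):

    Code: 
    ## Sketch: cv.glmnet builds the lambda grid itself and reports the CV curve
    plot(cv.lasso)          # misclassification error vs log(lambda); both lambdas are marked
    cv.lasso$lambda.min     # lambda minimizing the CV misclassification error
    cv.lasso$lambda.1se     # largest lambda within one standard error of that minimum
    min(cv.lasso$cvm)       # CV error at lambda.min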
