
Thread: Evaluation of the most suitable model between LASSO and Forward stepwise selection.

  1. #1




    Hello. Please bear in mind that I'm a beginner.

    I need to assess which of the following two model specifications is more suitable, and explain why.
    This is what I obtained by running the tools I had in R:

    FORWARD STEPWISE SELECTION:

    [Image: forward stepwise selection output, not preserved]

    LASSO REGRESSION:


    This shows, for each value of lambda on the path, the number of nonzero coefficients (Df) and the fraction of deviance explained by the model (%Dev).

    fit.lasso
    ##
    ## Call: glmnet(x = x.train, y = y.train, family = "gaussian")
    ##
    ## Df %Dev Lambda
    ## [1,] 0 0.00000 1.135000
    ## [2,] 5 0.03258 1.034000
    ## [3,] 9 0.09986 0.942000
    ## [4,] 17 0.22030 0.858300
    ## [5,] 17 0.34360 0.782100
    ## [6,] 17 0.44590 0.712600
    ## [7,] 17 0.53090 0.649300
    ## [8,] 17 0.60150 0.591600
    ## [9,] 17 0.66000 0.539000
    ## [10,] 17 0.70870 0.491200
    ## [11,] 17 0.74900 0.447500
    ## [12,] 17 0.78260 0.407800
    ## [13,] 17 0.81040 0.371500
    ## [14,] 17 0.83350 0.338500
    ## [15,] 17 0.85270 0.308500
    ## [16,] 17 0.86860 0.281100
    ## [17,] 17 0.88180 0.256100
    ## [18,] 17 0.89280 0.233300
    ## [19,] 17 0.90190 0.212600
    ## [20,] 17 0.90950 0.193700
    ## [21,] 17 0.91580 0.176500
    ## [22,] 17 0.92100 0.160800
    ## [23,] 17 0.92530 0.146500
    ## [24,] 17 0.92890 0.133500
    ## [25,] 17 0.93190 0.121700
    ## [26,] 17 0.93440 0.110900
    ## [27,] 17 0.93640 0.101000
    ## [28,] 17 0.93810 0.092030
    ## [29,] 17 0.93950 0.083860
    ## [30,] 17 0.94070 0.076410
    ## [31,] 17 0.94170 0.069620
    ## [32,] 17 0.94250 0.063430
    ## [33,] 17 0.94320 0.057800
    ## [34,] 17 0.94370 0.052660
    ## [35,] 17 0.94420 0.047990
    ## [36,] 17 0.94460 0.043720
    ## [37,] 17 0.94490 0.039840
    ## [38,] 17 0.94520 0.036300
    ## [39,] 18 0.94540 0.033070
    ## [40,] 18 0.94560 0.030140
    ## [41,] 18 0.94580 0.027460
    ## [42,] 18 0.94590 0.025020
    ## [43,] 18 0.94600 0.022800
    ## [44,] 18 0.94610 0.020770
    ## [45,] 18 0.94620 0.018930
    ## [46,] 18 0.94620 0.017250
    ## [47,] 19 0.94630 0.015710
    ## [48,] 19 0.94630 0.014320
    ## [49,] 19 0.94640 0.013050
    ## [50,] 19 0.94640 0.011890
    ## [51,] 19 0.94640 0.010830
    ## [52,] 19 0.94650 0.009868
    ## [53,] 19 0.94650 0.008992
    ## [54,] 19 0.94650 0.008193
    ## [55,] 19 0.94650 0.007465
    ## [56,] 19 0.94650 0.006802
    ## [57,] 19 0.94650 0.006198

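    These are the value of lambda that minimizes the cross-validated error (lambda.min) and the largest lambda whose error is within one standard error of that minimum (lambda.1se):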

    cv.fit$lambda.min
    ## [1] 0.006801877
    cv.fit$lambda.1se
    ## [1] 0.03983874
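
For reference, cv.glmnet picks both values from the cross-validation curve: lambda.min is the lambda with the lowest mean CV error, and lambda.1se is the largest lambda whose error stays within one standard error of that minimum. A minimal sketch that recomputes both by hand, assuming simulated stand-in data (the thread's x.train and y.train are not shown, so the data and the names ending in .manual are made up for illustration):

```r
library(glmnet)

# Simulated stand-in data (an assumption; not the poster's data)
set.seed(1)
n <- 200; p <- 20
x <- matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("V", 1:p)))
y <- as.vector(x[, 1:17] %*% rep(1, 17) + rnorm(n))

cv.fit <- cv.glmnet(x, y, family = "gaussian")

# lambda.min: the lambda with the smallest mean cross-validated error
i.min <- which.min(cv.fit$cvm)
lambda.min.manual <- cv.fit$lambda[i.min]

# lambda.1se: the largest lambda whose CV error is at most
# (minimum error + one standard error at the minimum)
threshold <- cv.fit$cvm[i.min] + cv.fit$cvsd[i.min]
lambda.1se.manual <- max(cv.fit$lambda[cv.fit$cvm <= threshold])

# Compare the hand-computed values with the ones stored by cv.glmnet
c(manual = lambda.min.manual, glmnet = cv.fit$lambda.min)
c(manual = lambda.1se.manual, glmnet = cv.fit$lambda.1se)
```

Because lambda.1se is never smaller than lambda.min, it applies at least as much shrinkage and usually keeps fewer variables.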

    coef() returns the coefficient estimates for each variable at a chosen lambda, here at lambda.min and at lambda.1se (a dot marks a coefficient shrunk exactly to zero):

    coef(cv.fit, s="lambda.min")
    ## 21 x 1 sparse Matrix of class "dgCMatrix"
    ## 1
    ## (Intercept) 0.036065536
    ## V1 0.980640994
    ## V2 1.026499043
    ## V3 0.993698752
    ## V4 0.985566688
    ## V5 0.951476162
    ## V6 0.994761436
    ## V7 1.033719423
    ## V8 1.024623825
    ## V9 1.006812111
    ## V10 1.020157370
    ## V11 1.012844843
    ## V12 1.009196928
    ## V13 1.019196655
    ## V14 1.023799094
    ## V15 0.970762354
    ## V16 0.993270801
    ## V17 0.953188089
    ## V18 -0.009839538
    ## V19 0.029612698
    ## V20 .

    coef(cv.fit, s="lambda.1se")
    ## 21 x 1 sparse Matrix of class "dgCMatrix"
    ## 1
    ## (Intercept) 0.04177626
    ## V1 0.94452267
    ## V2 0.99316613
    ## V3 0.95798195
    ## V4 0.94838097
    ## V5 0.92422061
    ## V6 0.96073814
    ## V7 1.00319146
    ## V8 0.98976810
    ## V9 0.97213574
    ## V10 0.98853023
    ## V11 0.97560603
    ## V12 0.97945382
    ## V13 0.98106038
    ## V14 0.98638349
    ## V15 0.93706507
    ## V16 0.95779553
    ## V17 0.91915907
    ## V18 .
    ## V19 .
    ## V20 .

    Finally, I calculated the test MSE of the lasso fit and of the full model, to check whether the lasso improved the accuracy of the predictions:
    y.test.lasso = predict(cv.fit, newx=x.test, s="lambda.min")
    # s=0 is not on the fitted lambda path; without exact=TRUE glmnet does not refit,
    # so this is probably the fit at the smallest path lambda rather than exact OLS
    y.test.full = predict(cv.fit, newx=x.test, s=0)
    pred.errors.lasso = y.test - y.test.lasso # selected (lasso) model prediction errors
    pred.errors.full = y.test - y.test.full # full model prediction errors

    mse.lasso = mean(pred.errors.lasso^2)
    mse.full = mean(pred.errors.full^2)

    mse.lasso
    ## [1] 0.9839925
    mse.full
    ## [1] 0.983884
    100 * (1- mse.lasso/mse.full) # percent gain in accuracy due to shrinkage
    ## [1] -0.01102601


    If you have read this far, THANK YOU. Could you help me assess which of the two models is more suitable?

  2. #2
    hlsmith

    What is the MSE for the forward selection model? I ask because the number of predictors seems comparable. What is the purpose of the models: out-of-sample prediction?

  3. #3

    I don't have an MSE for the forward stepwise selection. The image I posted is the only result I have, so I guess it should be possible to compare the two using AIC for the forward selection and MSE for the lasso, but I have no idea how to do it.

    Yes, the purpose is out-of-sample prediction: which one gives the best specification of the model.
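
One way to get a number that is directly comparable with mse.lasso is to refit the forward-selection model on the training data and score it on the same test set. A minimal base-R sketch, assuming simulated stand-in data (the thread's x.train, y.train, x.test and y.test are not shown) and assuming AIC as the stepwise criterion through step(); the names df.train, df.test, fwd.fit and mse.forward are made up for illustration:

```r
# Sketch: test-set MSE for an AIC-based forward stepwise model (base R only).
# Simulated stand-in data (an assumption; not the poster's data).
set.seed(1)
n <- 200; p <- 20
x <- matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("V", 1:p)))
y <- as.vector(x[, 1:17] %*% rep(1, 17) + rnorm(n))  # 17 active predictors (assumed)

# Hold out the last 50 observations as a test set
train <- seq_len(n) <= 150
df.train <- data.frame(y = y[train], x[train, ])
df.test  <- data.frame(y = y[!train], x[!train, ])

# Forward selection: start from the intercept-only model and add terms by AIC
null.fit <- lm(y ~ 1, data = df.train)
full.formula <- reformulate(colnames(x), response = "y")
fwd.fit <- step(null.fit,
                scope = list(lower = ~ 1, upper = full.formula),
                direction = "forward", trace = 0)

# Out-of-sample MSE, on the same scale as mse.lasso
mse.forward <- mean((df.test$y - predict(fwd.fit, newdata = df.test))^2)
mse.forward
```

AIC is computed on the training data only, so it cannot be set against a test MSE directly; scoring both models on the same held-out observations puts them on the same footing.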

  4. #4
    rogojel


    Hi,
    why not simply run cross-validation on both the forward model and the lasso and compare the results? What am I missing?
    BTW, I would include the lasso at lambda.1se as well; to me it seems a better choice in general than the minimum.

    regards
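
The comparison suggested here could be sketched as follows: the same folds are used for the forward stepwise model and for the lasso at both lambda.min and lambda.1se, so the three cross-validated MSEs are directly comparable. This is only a sketch under assumptions: the data are simulated stand-ins for the poster's x and y, the stepwise criterion is assumed to be AIC via step(), and K = 5 folds is an arbitrary choice.

```r
library(glmnet)

# Simulated stand-in data (an assumption; not the poster's data)
set.seed(1)
n <- 200; p <- 20
x <- matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("V", 1:p)))
y <- as.vector(x[, 1:17] %*% rep(1, 17) + rnorm(n))

K <- 5
fold <- sample(rep(1:K, length.out = n))   # random fold label per observation
sq.err <- matrix(NA_real_, n, 3,
                 dimnames = list(NULL, c("forward", "lasso.min", "lasso.1se")))
full.formula <- reformulate(colnames(x), response = "y")

for (k in 1:K) {
  test <- fold == k
  df.train <- data.frame(y = y[!test], x[!test, ])
  df.test  <- data.frame(x[test, ])

  # Forward stepwise (AIC), selected using the training folds only
  fwd <- step(lm(y ~ 1, data = df.train),
              scope = list(lower = ~ 1, upper = full.formula),
              direction = "forward", trace = 0)

  # Lasso, with lambda tuned by an inner cross-validation on the training folds
  cv.fit <- cv.glmnet(x[!test, ], y[!test], family = "gaussian")

  # Squared prediction errors on the held-out fold
  sq.err[test, "forward"]   <- (y[test] - predict(fwd, newdata = df.test))^2
  sq.err[test, "lasso.min"] <- (y[test] - as.vector(
    predict(cv.fit, newx = x[test, ], s = "lambda.min")))^2
  sq.err[test, "lasso.1se"] <- (y[test] - as.vector(
    predict(cv.fit, newx = x[test, ], s = "lambda.1se")))^2
}

cv.mse <- colMeans(sq.err)   # cross-validated MSE per method
cv.mse
```

Both selection procedures are rerun inside every fold, so the held-out observations never influence which variables are chosen.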
