
Thread: How to choose best robust regression model?

  1. #1
    consuli

    Hello.

    There are several robust regression methods, such as LAR (a.k.a. LAV, LAD, L1-norm) regression, quantile regression, M-estimators, ... They are considered especially appropriate for data that do not fulfill the five OLS conditions.

    Most of the robust regression literature I have read argues abstractly, via the breakdown point, about which robust estimator should generally be preferred.

    The rest of the robust regression literature I have read argues that the best robust estimator depends on the closest comparable theoretical distribution. E.g. the LAR estimator will most probably be the best robust estimator for an approximately Laplace distribution (although it has a worse breakdown point than the quantile estimator / M-estimator).

    Question:
    How do I choose the best robust regression model from multiple robust estimators for data that (graphically obviously) do not fulfill the OLS conditions?
    Last edited by consuli; 02-26-2017 at 07:26 AM.
    Prediction is very difficult, especially about the future. (Niels Bohr)

  2. #2
    Dason

    Which conditions are being violated?
    I don't have emotions and sometimes that makes me very sad.

  3. #3
    consuli

    Quote Originally Posted by Dason View Post
    Which conditions are being violated?
    In robust regression problems, and especially in mine, the constant variance assumption is heavily violated, in combination with skewed residuals.

  4. #4
    rogojel

    hi,
    Maybe you could just use generalized least squares with an appropriate variance structure (package nlme in R)?

    regards
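
A minimal sketch of the suggestion above, assuming simulated data (the data frame `sim` and its coefficients are made up for illustration) whose error standard deviation grows with the predictor. It fits `gls()` from nlme once with constant variance and once with a `varPower()` variance function. Note that this addresses the non-constant variance only, not the skewness.

```r
library(nlme)  # recommended package shipped with standard R installations

set.seed(1)
n <- 200
x <- runif(n, 1, 10)
# Hypothetical data: error standard deviation grows proportionally with x
y <- 2 + 0.5 * x + rnorm(n, sd = 0.3 * x)
sim <- data.frame(x = x, y = y)

# Constant variance versus Var(e_i) = sigma^2 * |x_i|^(2 * delta)
fit_const <- gls(y ~ x, data = sim)
fit_power <- gls(y ~ x, data = sim, weights = varPower(form = ~ x))

# Same fixed effects and same fitting method, so the AIC values are comparable
AIC(fit_const, fit_power)
coef(fit_power)
```

With the same fixed effects and the default REML fit in both calls, the two AIC values can be compared directly. Other variance functions such as `varExp()` or `varConstPower()` could be swapped in the same way.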

  5. #5
    consuli

    Quote Originally Posted by rogojel View Post
    hi,
    Maybe you could just use generalized least squares with an appropriate variance structure (package nlme in R)?

    regards
    I already have parameter estimates from LAR regression and quantile regression. Of course, further nlme parameter estimates may be interesting, too.

    But my question is which criterion shows me which robust regression model (or rather, which set of estimates) is best. R^2 and correlation do not work on robust problems.

  6. #6
    rogojel

    Quote Originally Posted by consuli View Post
    I already have parameter estimates from LAR regression and quantile regression. Of course, further nlme parameter estimates may be interesting, too.

    But my question is which criterion shows me which robust regression model (or rather, which set of estimates) is best. R^2 and correlation do not work on robust problems.
    Hi,
    I would use cross validation.

    regards

  7. #7
    GretaGarbo

    But my question is which criterion shows me which robust regression model (or rather, which set of estimates) is best.
    When someone asks about "the best", one starts to think: best by what optimality criterion?

    But what is the problem here? Do you simply need to switch distribution, e.g. to a gamma or log-normal distribution (skewed and heteroscedastic)?

  8. #8
    hlsmith

    Good point, Greta. OP, can we see what the data or the residuals look like? Thanks.
    Stop cowardice, ban guns!

  9. #9
    consuli

    Concluding from your answers: there is no generally accepted goodness-of-fit measure for robust regression problems, right? Even if your answer is "no", that will answer my question for the short term.

  10. #10
    rogojel

    Quote Originally Posted by consuli View Post
    Concluding from your answers: there is no generally accepted goodness-of-fit measure for robust regression problems, right?
    I think this is a fair statement even for "normal" OLS multiple regression.
    I believe that using something like the RMSE measure with cross-validation is the least controversial way to pick a model.

    regards
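
The cross-validation idea can be sketched as follows. This is only an illustration under stated assumptions: the data are simulated with artificial outliers, the helper `cv_errors()` is a made-up name, and `rlm()` from MASS (a Huber M-estimator) stands in for whichever robust fit is of interest. Any fitting function with a `predict()` method could be passed in instead.

```r
library(MASS)  # rlm(): Huber M-estimator; ships with standard R installations

# k-fold cross-validation for a data frame with columns x and y;
# returns out-of-fold RMSE and median absolute error
cv_errors <- function(fit_fun, data, k = 5) {
  folds <- sample(rep(seq_len(k), length.out = nrow(data)))
  err <- numeric(nrow(data))
  for (i in seq_len(k)) {
    test <- folds == i
    fit <- fit_fun(data[!test, ])
    err[test] <- data$y[test] - predict(fit, newdata = data[test, ])
  }
  c(rmse = sqrt(mean(err^2)), medae = median(abs(err)))
}

set.seed(42)
n <- 200
x <- runif(n, 0, 10)
y <- 1 + 2 * x + rnorm(n)
bad <- sample(n, 20)
y[bad] <- y[bad] + 30   # hypothetical contamination: 10% gross outliers
sim <- data.frame(x = x, y = y)

set.seed(1)
cv_ols <- cv_errors(function(d) lm(y ~ x, data = d), sim)
set.seed(1)             # same fold assignment for both models
cv_rlm <- cv_errors(function(d) rlm(y ~ x, data = d), sim)
rbind(ols = cv_ols, rlm = cv_rlm)
```

The two columns need not agree. Squared error is itself sensitive to outlying points in the held-out fold, so a robust summary such as the median absolute error may rank the models differently from RMSE.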

  11. #11
    consuli

    To make this discussion a little more fact-based, I have written a small R program that calculates OLS, GLM (with Gamma), robust linear regression (lmRob) and quantile regression (median, i.e. LAR) parameter estimates on two robust datasets from the package robustbase. It also calculates R^2, Pearson correlation, BIC and MSE.

    Code: 
    library("robustbase")  # example datasets: pension, salinity
    library("robust")      # lmRob()
    library("quantreg")    # rq()

    # Column-wise mean squared error; y is a vector, est a matrix with one column per model
    mse <- function(y, est) {
      colMeans((y - est)^2)
    }

    data(pension, package = "robustbase")
    data(salinity, package = "robustbase")
    str(pension)
    str(salinity)

    # Select robust dataframe: first column is the target, second the predictor
    df <- setNames(pension[, c(2, 1)], c("y", "x"))
    # df <- setNames(salinity[, c(2, 4)], c("y", "x"))

    plot(y ~ x, data = df, cex = .5, col = "blue", xlab = "predictor", ylab = "target")

    lmmod       <- lm(y ~ x, data = df)
    glmgammamod <- glm(y ~ x, data = df, family = Gamma(link = "identity"))
    lmrobmod    <- lmRob(y ~ x, data = df)
    rqmod       <- rq(y ~ x, data = df, tau = 0.5)  # tau = 0.5 is median (LAR) regression

    # Coefficients, one column per model (built without masking lm() and rq())
    coefs <- cbind(lm = coef(lmmod), glmgamma = coef(glmgammamod),
                   lmrob = coef(lmrobmod), rq = coef(rqmod))

    # Calc Estimates: fitted values for every model
    est <- cbind(1, df$x) %*% coefs

    # Goodness of Fit
    cor(df$y, est, method = "pearson")
    # Comparison with R^2
    summary(lmmod)$r.squared

    mse(df$y, est)

    BIC(lmmod)
    BIC(glmgammamod)
    # BIC(lmrobmod)  # BIC not available
    # BIC(rqmod)     # BIC not plausible

    # Bias Test: mean of the target versus mean of the fitted values
    mean(df$y)
    colMeans(est)

    # Coefficients
    coefs
    Both datasets give the same results:
    R^2 and Pearson correlation do not discriminate between the models.
    BIC is only available for OLS and GLM.
    MSE always prefers the OLS solution (which, however, is not plausible, as these are special datasets favouring robust regression).

    I have also tested other robust datasets. Always the same implausible results.
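
One reason the in-sample MSE result is expected: OLS is defined as the fit that minimises the residual sum of squares, so no other linear fit on the same predictors can score better on that measure. A small sketch on simulated data (hypothetical heavy-tailed errors; `rlm()` from MASS is used only as a stand-in robust fit) illustrates this:

```r
library(MASS)  # rlm(): Huber M-estimator, used here only as a stand-in robust fit

set.seed(7)
x <- runif(100, 0, 10)
y <- 1 + 2 * x + rt(100, df = 2)   # hypothetical heavy-tailed errors
fit_ols <- lm(y ~ x)
fit_rob <- rlm(y ~ x)

# Mean squared residual on the same data the model was fitted to
in_sample_mse <- function(fit) mean(residuals(fit)^2)
c(ols = in_sample_mse(fit_ols), rlm = in_sample_mse(fit_rob))
```

The OLS value is never larger than the other one, whatever the data look like, so this comparison says nothing about which estimator is more appropriate.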


  13. #12
    hlsmith

    I am aware that robust regression exists, but have not used it. I find it hard to believe that there aren't better resources for you. I will keep my eyes open in case I fortuitously stumble across something.

    Would it be possible for you to simulate a dataset very close to yours, with known parameters and assumptions, to test all the above approaches?

  14. #13
    consuli

    Quote Originally Posted by hlsmith View Post
    I find it hard to believe that there aren't better resources for you.
    Any helpful links are highly appreciated.

    Quote Originally Posted by hlsmith View Post
    Would it be possible for you to simulate a dataset very close to yours, with known parameters and assumptions, to test all the above approaches?
    I don't know how to do that. It would be helpful if (mathematical) guidance were provided on how to specify the increasing variance and skewness in the datasets, and then how to simulate them. If a clear mathematical concept is laid out, I am pretty confident I can program it in R.
    Last edited by consuli; 03-01-2017 at 12:56 PM.
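
One possible way to set up such a simulation, sketched under assumptions that are not from the thread: the model is y = b0 + b1 * x + sd(x) * e, where e is a shifted exponential error (mean 0, variance 1, skewness 2) and sd(x) = s0 + s1 * x increases linearly with the predictor. The function name `simulate_skewed()` and all parameter values are hypothetical.

```r
# Simulate y = b0 + b1 * x + sd(x) * e with skewed, heteroscedastic errors
simulate_skewed <- function(n, b0 = 1, b1 = 2, s0 = 0.5, s1 = 0.5) {
  x <- runif(n, 0, 10)
  e <- rexp(n, rate = 1) - 1   # shifted exponential: mean 0, variance 1, skewness 2
  sd_x <- s0 + s1 * x          # error standard deviation increases linearly in x
  data.frame(x = x, y = b0 + b1 * x + sd_x * e)
}

set.seed(123)
sim <- simulate_skewed(5000)
fit <- lm(y ~ x, data = sim)
coef(fit)                      # compare with the known b0 = 1, b1 = 2

# Residual spread in the lower and upper half of x shows the increasing variance
r <- residuals(fit)
sds <- tapply(r, sim$x > 5, sd)
sds

# Sample skewness of the residuals after dividing by the true sd(x)
z <- r / (0.5 + 0.5 * sim$x)
skew <- mean(z^3) / mean(z^2)^1.5
skew
```

One caveat with this design: because the errors are skewed and their scale depends on x, the conditional mean and the conditional median are different lines. Here the mean is b0 + b1 * x, while the median is b0 + b1 * x + sd(x) * (log(2) - 1). OLS and median (LAR) regression therefore aim at different "true" parameters, which has to be taken into account when judging which estimator recovers them best.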

  15. #14
    hlsmith

    A win-win. Check out this link on robust and quantile regression, which has a simulation example:

    https://t.co/xlZpoeeLCX

  16. #15
    consuli


    Thanks for the robust regression article from regression pope Fox.

    As far as I could follow the article, it says nothing about a robust goodness-of-fit measure, nor about how to reproduce skewed residuals (which would be necessary to reproduce the robust regression datasets with known parameters, as you suggested).

    However, it solved another problem I had. :-D
