
Thread: backwards stepwise multiple regression, collinearity. Rejected paper.

  1. #1 potatopatch

    Hi helpful people!

    My research paper was recently rejected, and some of the feedback I received related to the statistical tests done/not done. I would like help clarifying what I could do differently, as the feedback was not very informative.
    I am attempting to see which baseline characteristics (my independent variables) can predict who will improve the most in my dependent variable after an intervention. As it is not published yet I won't give too many details, but a similar example would be trying to decide whether any baseline characteristics in humans (such as muscle mass, age, gender, alcohol use, pulse rate, etc.) can predict improvement in 100 m foot-race times after undergoing a strength exercise program. I have a cohort of about 100 individuals, all undergoing the same intervention.
    To test this, I collected data on all my baseline values and measured participants' 100 m times before and after the strength exercise program. I then built a multiple regression model where I included previously known confounders and my baseline characteristics of interest, and used backwards stepwise removal of non-significant regressors to end up with a model of 3 independent variables significantly associated with improvement in 100 m race times. For the sake of argument, let's make up the following: gender, thigh muscle mass, and smoking status (yes/no).

    I was asked/critiqued on the following (again examples are made up);

    1. Type of sports shoe is a well-known determinant of 100 m race times; was improvement in race times still associated with baseline thigh muscle mass after adjusting for choice of sports shoe?

    - Type of sports shoe was one of the independent variables included in my multiple regression model; however, it was not significant when included with the other independent variables, so it was removed in the backward stepwise process. Is any other statistical test more appropriate to run?

    2. Could collinearity explain the results, as several of the independent variables are likely to be similar?

    - I ran collinearity diagnostics in SPSS and did not get any VIF values over 4 (only one independent variable had a VIF of 4; the rest were under 3).
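    For reference, the VIF that SPSS reports for predictor j is just 1/(1 - R²_j), where R²_j comes from regressing predictor j on all the other predictors. A minimal sketch with invented variables (x1, x2, x3 are made up; x3 is deliberately built to correlate with x1) shows the computation explicitly:

```python
import random

random.seed(1)
n = 100
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
x3 = [0.8 * v + random.gauss(0, 0.6) for v in x1]  # correlated with x1

def ols_r2(y, xs):
    """R^2 of regressing y on the columns in xs (with intercept),
    solving the normal equations by Gaussian elimination."""
    cols = [[1.0] * len(y)] + xs
    k = len(cols)
    # build X'X and X'y
    a = [[sum(cols[i][t] * cols[j][t] for t in range(len(y))) for j in range(k)]
         for i in range(k)]
    b = [sum(cols[i][t] * y[t] for t in range(len(y))) for i in range(k)]
    # Gaussian elimination with partial pivoting
    for i in range(k):
        p = max(range(i, k), key=lambda r: abs(a[r][i]))
        a[i], a[p] = a[p], a[i]
        b[i], b[p] = b[p], b[i]
        for r in range(i + 1, k):
            f = a[r][i] / a[i][i]
            for c in range(i, k):
                a[r][c] -= f * a[i][c]
            b[r] -= f * b[i]
    beta = [0.0] * k
    for i in reversed(range(k)):
        beta[i] = (b[i] - sum(a[i][c] * beta[c] for c in range(i + 1, k))) / a[i][i]
    yhat = [sum(beta[i] * cols[i][t] for i in range(k)) for t in range(len(y))]
    ybar = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, yhat))
    ss_tot = sum((yi - ybar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

predictors = {"x1": x1, "x2": x2, "x3": x3}
for name, col in predictors.items():
    others = [v for key, v in predictors.items() if key != name]
    vif = 1 / (1 - ols_r2(col, others))
    print(f"VIF({name}) = {vif:.2f}")
```

    Here x2 comes out near 1 while x1 and x3 come out near 3, so the poster's values of 4 and under are in the commonly tolerated range, though cutoffs are a matter of convention.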

    3. Discuss regression to the mean as an explanation for my results.

    - I concede that regression to the mean likely plays a part in which individuals improved the most/least, but I don't see how it impacts the baseline characteristics in a significant way, other than that these individuals are given greater weight in the results since they show the biggest change. I divided my cohort into tertiles based on improvement in race times and did not find that the tertiles differed in baseline values on any of my independent variables of interest.

    Any input much appreciated!

  2. #2 hlsmith

    What exactly was the outcome (time, percent change, etc.)?

    Did you have initial time in the model?

    I believe it may be the JAMA instructions for authors that have an excellent description of modelling before/after change. I will look up the source tomorrow.

    How many independent variables did you initially have in the model?

    We will tease things out, but perhaps some of your issues may be related to your descriptive write-ups. Did they give you a flat-out rejection or major/minor revisions?

    A control group would have been nice, and perhaps still doable!


  4. #3 CowboyBear

    Quote Originally Posted by potatopatch View Post
    As it is not published yet I won't give too many details, but a similar example would be trying to decide whether any baseline characteristics in humans (such as muscle mass, age, gender, alcohol use, pulse rate, etc.) can predict improvement in 100 m foot-race times after undergoing a strength exercise program.
    I think it might be better if you just told us what the actual research is about. Many researchers nowadays actively put up preprints of their work long before publication or acceptance; empirical research isn't Game of Thrones, and putting up spoilers doesn't hurt anyone. Perhaps if we knew more about what was actually going on, we might be able to give more informative feedback?

    I included previously known confounders and my baseline characteristics of interest and used backwards stepwise removal of non-significant regressors to end up with a model of 3 independent variables significantly associated with improvement in 100 m race times.
    Stepwise regression is pretty universally seen as a bad idea these days; it results in biased parameter estimates. In your case in particular, it will probably lead you to exclude confounding variables that should actually be in the model (but happen not to be statistically significant due to your fairly small sample size). If I were reviewing the paper, I would want you to just include the known confounders and forget the stepwise process.
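    To make the bias concrete, here is a small simulation (purely illustrative, not the poster's data): even when every candidate predictor is pure noise, keeping the strongest-looking one out of twelve inflates its apparent effect size well beyond what a single pre-specified predictor would show.

```python
import random

random.seed(0)

def pearson_r(x, y):
    """Plain Pearson correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

n, n_predictors = 100, 12
selected = []
for _ in range(500):                      # repeat the whole "study" 500 times
    y = [random.gauss(0, 1) for _ in range(n)]
    # every candidate predictor is pure noise: the true correlation is 0
    rs = [abs(pearson_r([random.gauss(0, 1) for _ in range(n)], y))
          for _ in range(n_predictors)]
    selected.append(max(rs))              # selection keeps the strongest-looking one

typical_null_r = 1 / n ** 0.5             # rough spread of r under the null (~0.10)
mean_selected = sum(selected) / len(selected)
print(f"typical |r| for one pre-specified noise predictor: ~{typical_null_r:.2f}")
print(f"average |r| of the selected 'best' predictor:      {mean_selected:.2f}")
```

    The selected predictor looks roughly twice as strong as any single noise predictor should; the same selection effect biases stepwise coefficients and p-values upward.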


  6. #4 potatopatch

    Thank you for your replies.

    I had 12 IVs enter the model: all the well-established confounders in the field (8) and my 4 IVs of interest. The outcome was change in time, and I did not have initial time in the model; however, neither of the reviewers commented on this (unless that is what the regression-to-the-mean point was about, though they did not word it that way). I was debating whether to include it.

    I used backwards stepwise selection because I felt 12 was too many IVs for a reliable model. But I do see your point about how it may exclude an important confounder. What is a better way: just pick the most important confounders and 1-2 new IVs of interest and enter them into a multiple regression model without any stepwise selection?

  7. #5 CowboyBear

    Quote Originally Posted by potatopatch View Post
    What is a better way: just pick the most important confounders and 1-2 new IVs of interest and enter them into a multiple regression model without any stepwise selection?
    12 predictors in a model with 100 observations is a bit complex, but not unreasonably so. You can just include them all. Certainly don't now try to decide which IVs to include after you've already seen the results.

  8. #6 ondansetron

    If you're not directly interested in the 8 that are believed to already be important, you could try "combining" those 8 via principal components analysis to reduce the dimensions you're dealing with. You could include the principal components in the model and focus on the 4 new variables.
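    A minimal sketch of this idea, with three invented correlated confounders and the first principal component extracted by power iteration (a simple way to get the leading eigenvector of the correlation matrix; a real analysis would use a library routine and likely keep more than one component):

```python
import random

random.seed(2)
n = 100
base = [random.gauss(0, 1) for _ in range(n)]
# three correlated "confounders" that share a common underlying component
confounders = [[b + random.gauss(0, 0.5) for b in base] for _ in range(3)]

def standardize(col):
    m = sum(col) / len(col)
    s = (sum((v - m) ** 2 for v in col) / (len(col) - 1)) ** 0.5
    return [(v - m) / s for v in col]

z = [standardize(c) for c in confounders]
p = len(z)
# correlation matrix of the standardized confounders
corr = [[sum(z[i][t] * z[j][t] for t in range(n)) / (n - 1) for j in range(p)]
        for i in range(p)]

# power iteration converges to the leading eigenvector (the PC1 loadings)
w = [1.0] * p
for _ in range(200):
    w = [sum(corr[i][j] * w[j] for j in range(p)) for i in range(p)]
    norm = sum(v * v for v in w) ** 0.5
    w = [v / norm for v in w]

# PC1 scores: one summary column to use in place of the three confounders
pc1 = [sum(w[i] * z[i][t] for i in range(p)) for t in range(n)]
print("PC1 loadings:", [round(v, 2) for v in w])
```

    Because the invented confounders are equally correlated, the loadings come out roughly equal; `pc1` would then enter the regression as a single adjustment term.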


  10. #7 hlsmith

    A reference below.


    Subsection on change:
    http://www.fharrell.com/2017/04/stat...re.html#change


  12. #8 potatopatch

    Thanks for all the input so far! I will sink my teeth into the reference you provided tonight, but at a quick glance most of the criteria for calculating change from baseline actually seem OK in my study.

    I have now made one model with all the IVs, and I have another question. I am required to report the R-squared value of the model, which in this case is rather high at .65; however, many of the IVs in the model are not significant (p-values of .8, .6, etc.). Is this a problem? I feel that the R-squared value is not very informative for such a model.

  13. #9 hlsmith

    Building a model is a process, and not everyone is going to agree on approaches. Ideally you write out exactly what you plan to do before you even begin, so that "investigator degrees of freedom" don't come into play. Such judgment comes with experience. Case in point: unless some of the confounders are really cemented into the literature, I am not sure I would keep the 0.8 variable around. Options include removing the variable and seeing whether any other coefficients (suspected relationships) change.


    There are adjusted R^2 measures, which control for the number of variables in the model. The traditional R^2 becomes inflated as more variables are added to the model, so adjusted R^2 is more appropriate.
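    The usual formula is adj R^2 = 1 - (1 - R^2)(n - 1)/(n - p - 1). Plugging in the numbers from this thread (R^2 = .65, roughly 100 observations, 12 predictors; the exact n is assumed here):

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2: n = observations, p = predictors (excluding the intercept)."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# the poster's situation: R^2 = .65 with 12 predictors and ~100 observations
print(round(adjusted_r2(0.65, 100, 12), 3))  # noticeably lower than .65
```

    The penalty grows with p, so the gap between R^2 and adjusted R^2 is itself a rough warning about model complexity.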

  14. #10 ondansetron

    Quote Originally Posted by potatopatch View Post
    I am required to report the R-squared value of the model, which in this case is rather high at .65; however, many of the IVs in the model are not significant (p-values of .8, .6, etc.). Is this a problem? I feel that the R-squared value is not very informative for such a model.
    You should (or could) ignore p-values for variables that are "well established" by prior literature or theory. It doesn't make good sense, nor is it judicious, to look at p-values (or CIs) for the "significance" of variables that are "known" in advance to be "important".


  16. #11 CowboyBear

    Quote Originally Posted by ondansetron View Post
    If you're not directly interested in the 8 that are believed to already be important, you could try "combining" those 8 via principal components analysis to reduce the dimensions you're dealing with. You could include the principal components in the model and focus on the 4 new variables.
    True, but keep in mind that if you do this, the model doesn't actually include controls for the 8 potential confounding variables; it just controls for a smaller set of principal components derived from those variables. Jake (one of our forum regulars) recently co-wrote an article discussing how "controlling" for potential confounds that you haven't measured perfectly can cause quite serious problems: http://journals.plos.org/plosone/art...l.pone.0152719

  17. #12 ondansetron

    Quote Originally Posted by CowboyBear View Post
    True, but keep in mind that if you do this then the model doesn't actually include controls for the 8 potential confounding variables - it just controls for a smaller subset of principal components based on these original variables. Jake (one of our forum regulars) recently co-wrote an article discussing how "controlling" for potential confounds that you haven't measured perfectly results in quite serious problems: http://journals.plos.org/plosone/art...l.pone.0152719
    Yes, that's worth noting, as the PCA will lose some of the information in the underlying variables when they're combined. I was thinking that if the OP believed the "confounders" to be highly similar variables, it might be worth reducing the dimensions to capture as much of that information as possible. As always, approaching the research question with a few different techniques and validation methods will help assess the stability of the conclusions. I'll have to read the article now!


  19. #13 hlsmith

    odan's suggestion reminded me of the use of a propensity score term (kind of). However, there is not one single variable of interest here, and I am not familiar enough with PCA to know what ramifications can occur if you have multiple endogeneity, perhaps unfaithfulness, etc.

    It also comes down to drawing a causal diagram illustrating the relationships and their signs!

  20. #14 potatopatch


    I have another question. After reading your excellent reference on change from baseline, I followed some of the links provided there and ended up at

    http://biostat.mc.vanderbilt.edu/wik.../MeasureChange

    They recommend many things to ensure you are choosing a reasonable effect measure of change, among them:

    "Plot difference in pre and post values vs. the average of the pre and post values (Bland-Altman plot). If this shows no trend, the simple differences are adequate summaries of the effects, i.e., they are independent of initial measurements."

    I am only familiar with the Bland-Altman plot as a way of validating that two techniques measure the same thing; here it is used in "reverse". I tried this with my values and did not find a significant association, which suggests the simple differences are independent of the initial measurements. Is this a common technique to use? Should I report it in my paper as some evidence that regression to the mean is limited in my study (along with the other criteria listed in the first link)?
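    For intuition, a quick numeric version of that check (invented numbers, not the study data): compute the post-minus-pre difference and the pre/post average, then look for a trend. Scenario A improves everyone by about the same amount; scenario B makes improvement depend on the starting level, which shows up as a much stronger difference-vs-average correlation. Note the "flat" scenario still gives a slightly nonzero correlation because the average itself contains the change:

```python
import random

random.seed(3)

def pearson_r(x, y):
    """Plain Pearson correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def diff_vs_avg_r(pre, post):
    """Correlation of (post - pre) with (pre + post) / 2: the trend a
    Bland-Altman-style plot of change against average would show."""
    diff = [b - a for a, b in zip(pre, post)]
    avg = [(a + b) / 2 for a, b in zip(pre, post)]
    return pearson_r(avg, diff)

n = 100
pre = [random.gauss(15.0, 1.0) for _ in range(n)]        # e.g. baseline 100 m times

# scenario A: everyone improves by ~0.5 s regardless of baseline
post_flat = [t - 0.5 + random.gauss(0, 0.3) for t in pre]
# scenario B: slower starters improve more (improvement depends on baseline)
post_trend = [0.7 * t + 4.0 + random.gauss(0, 0.3) for t in pre]

r_flat = diff_vs_avg_r(pre, post_flat)
r_trend = diff_vs_avg_r(pre, post_trend)
print(f"level-independent change: r = {r_flat:+.2f}")
print(f"level-dependent change:   r = {r_trend:+.2f}")
```

    In scenario B the clear negative trend signals that simple differences are not an adequate summary, which is exactly the situation the quoted advice is screening for.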

    Cheers for any input
