This is an important issue that I have historically done little work on.

"Variables entered in the development model were selected using stepwise backward-elimination approach, starting with all previously reported significant predictors found theoretically and practically to be associated with 90-day employment outcome. All variables significant at the p < 0.05 level were included in the model."

OK, so stepwise selection is bad (the people who wrote this are smart, as their article shows, so their use of stepwise puzzles me). One could use the lasso instead. They also drop variables from the model, as I think most practitioners do. But they split their data into two pieces and used the second piece to test the predictions of the model built on the first piece. Is dropping variables this way valid when the model is then checked on a held-out piece like that?

"To examine the performance and goodness of fit of the model, we evaluated measures of overall performance, calibration and discrimination. Overall performance was evaluated using predictive accuracy, Nagelkerke R2 and Brier score statistics. Predictive accuracy assessed how well the model predicted the likelihood of an outcome for an individual client. The Nagelkerke R2 quantified the percentage of the outcome variable (90-day employment) explained by predictors in the model. The Brier score quantified differences between actual outcomes and their predicted probabilities, that is, the mean square error (Steyerberg et al., 2010). The Brier score ranges from 0 to 0.25, values close to 0 indicate a useful model and values close to 0.25 a non-informative or worthless model (Steyerberg et al., 2010)."

I have heard doubtful things about the use of R squared in logistic regression, and I am not familiar with the Brier score at all (I had never heard of it before last night). What do others think about using this form of R squared or the Brier score to evaluate a model?
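To make the two statistics concrete for myself, here is a minimal sketch in plain Python of how I understand them. The outcomes `y` and predicted probabilities `p` are made-up illustration values, not data from the article, and the function names are my own. Nagelkerke R2 is normally computed from the fitted model's likelihood; here I simply plug in the given probabilities, which is an assumption for illustration. One thing the arithmetic shows is that the Brier score of a model that always predicts the base rate is base_rate * (1 - base_rate), so the 0.25 "worthless" benchmark applies only when the outcome rate is 50%.

```python
import math

def brier_score(y, p):
    """Mean squared difference between 0/1 outcomes and predicted probabilities."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)

def log_likelihood(y, p):
    """Bernoulli log-likelihood of the outcomes under the predicted probabilities."""
    return sum(yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
               for yi, pi in zip(y, p))

def nagelkerke_r2(y, p):
    """Cox-Snell R2 rescaled by its maximum attainable value."""
    n = len(y)
    base_rate = sum(y) / n
    ll_null = log_likelihood(y, [base_rate] * n)   # intercept-only model
    ll_model = log_likelihood(y, p)
    cox_snell = 1 - math.exp(2 * (ll_null - ll_model) / n)
    max_cox_snell = 1 - math.exp(2 * ll_null / n)
    return cox_snell / max_cox_snell

# Hypothetical outcomes (1 = employed at 90 days) and predicted probabilities.
y = [1, 0, 1, 1, 0, 0, 1, 0]
p = [0.9, 0.2, 0.7, 0.6, 0.3, 0.1, 0.8, 0.4]

base_rate = sum(y) / len(y)
brier_model = brier_score(y, p)                      # about 0.075
brier_null = brier_score(y, [base_rate] * len(y))    # 0.25, since base rate is 0.5
scaled_brier = 1 - brier_model / brier_null          # share of null error removed
r2 = nagelkerke_r2(y, p)

print(f"Brier (model): {brier_model:.3f}")
print(f"Brier (base rate only): {brier_null:.3f}")
print(f"Scaled Brier: {scaled_brier:.3f}")
print(f"Nagelkerke R2: {r2:.3f}")
```

If this is right, comparing the Brier score against the base-rate-only value (or reporting the scaled version) seems more informative than comparing it against a fixed 0.25.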

I am going to ask about calibration and discrimination next.