# Thread: Relative impact of variables on DV

1. ## Relative impact of variables on DV

I keep circling back to this issue, this time with a twist. I am arguing with some sports statisticians (who use statistics very differently from the rules I have come to know) about which matters more for scoring runs: the ability to hit, or the ability to hit and walk. Essentially, the argument boils down to this: if you have to trade off hitting ability against the combined ability to hit and walk, which is better for scoring runs? For example, would you prefer someone who batted .250 with an OBP of .270 (OBP includes both walks and hits), or someone who batted .230 with an OBP of .290?

Initially I was going to regress runs on walks, singles, doubles, triples, and home runs and see which has the greatest slope. But I don't think this really answers the question above. I think a better way would be to see what proportion of runs could be assigned to each of these categories. I am also not sure whether I have to treat these five ways of getting on base as levels of a categorical variable (with a reference category omitted). I don't think so, since each one is essentially an interval variable.
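To make the two ideas in this post concrete, here is a minimal sketch that regresses runs on the five event counts and then splits the fitted runs into a share per event type. Everything in it is an assumption for illustration: the data are simulated (not real baseball data), the "true" run values are made up, and the model has no intercept (that is, it assumes zero events means zero runs), which is what lets the fitted runs decompose exactly into per-event pieces.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated team-season counts. These are NOT real baseball data.
n_teams = 200
events = ["walks", "singles", "doubles", "triples", "home_runs"]
means = np.array([500.0, 950.0, 280.0, 30.0, 160.0])  # assumed average counts
X = rng.poisson(means, size=(n_teams, len(events))).astype(float)

# Made-up run values per event, used only to generate the fake outcome.
assumed_values = np.array([0.30, 0.45, 0.75, 1.00, 1.40])
runs = X @ assumed_values + rng.normal(0.0, 20.0, size=n_teams)

# Least-squares fit of runs on the five counts, without an intercept.
coef, *_ = np.linalg.lstsq(X, runs, rcond=None)

# With no intercept, total fitted runs = sum over events of slope * total count,
# so each event's piece of that sum can be read as its share of fitted runs.
contrib = coef * X.sum(axis=0)
shares = contrib / contrib.sum()

for name, b, s in zip(events, coef, shares):
    print(f"{name:10s} slope={b:6.3f}  share of fitted runs={s:6.1%}")
```

Note that the slope and the share answer different questions: the slope is runs per additional event, while the share also depends on how often the event happens. A rare event can have the largest slope and still a small share.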

So is there a way to find out what proportion of runs each of these categories (e.g., singles) contributed, controlling for the others? A second question concerns an argument they made about how to address this. They argued that because R squared (for runs scored) is higher for OBP than for hits, OBP is a stronger contributor to (predictor of) runs. My argument was that since hits are part of the calculation of OBP, all this shows is that another variable (walks) was added to the model; it does not show that OBP is more important than hits for prediction at all.
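The R squared argument can be checked on simulated data. The sketch below is only an illustration under stated assumptions: the data are fake, raw counts of hits and walks stand in for batting average and OBP (real OBP also involves at-bats, hit-by-pitch, and sacrifice flies), and `r_squared` is a helper written for this example. It compares three models: hits alone, hits and walks as separate predictors, and a single "times on base" predictor (hits plus walks).

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated team-season totals. These are NOT real baseball data.
n = 300
hits = rng.normal(1400.0, 60.0, n)
walks = rng.normal(500.0, 50.0, n)
# Made-up run values, used only to generate the fake outcome.
runs = 0.55 * hits + 0.30 * walks + rng.normal(0.0, 25.0, n)

def r_squared(predictors, y):
    """R squared of an OLS fit of y on the given predictors plus an intercept."""
    X = np.column_stack([np.ones(len(y))] + list(predictors))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    centered = y - y.mean()
    return 1.0 - (resid @ resid) / (centered @ centered)

r2_hits = r_squared([hits], runs)                # hits only
r2_hits_walks = r_squared([hits, walks], runs)   # hits and walks, separate slopes
r2_on_base = r_squared([hits + walks], runs)     # one combined predictor

print(f"hits only:        {r2_hits:.3f}")
print(f"hits + walks:     {r2_hits_walks:.3f}")
print(f"combined on-base: {r2_on_base:.3f}")
```

The two-predictor model can never fit worse than hits alone, because it contains that model as a special case. The combined predictor is the two-predictor model with both slopes forced to be equal, so it cannot beat the two-predictor model either. A higher R squared for the combined measure than for hits is therefore expected whenever walks carry any information about runs, and by itself it says nothing about whether a hit or a walk is worth more.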

2. ## Re: Relative impact of variables on DV

I have never played around with baseball data, but it always seems to breach the independence assumption, and I wonder whether anyone ever controls for clustering, as in mixed models?

Are you talking about RBIs? If you are walked, runners typically do not advance. But if you get a hit, runners are more likely to advance. Runners always advance on home runs, but those are rarer. Baseball data is goofy, in my opinion. You can always use the partial R-squared for each variable in the model, I guess.
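For what it's worth, here is a rough sketch of the partial R-squared idea: for each predictor, refit the model without it and see what fraction of the leftover error that predictor removes. The data are simulated (not real baseball data), only three event types are included to keep it short, and `sse` is a helper written for this example.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated team-season counts. These are NOT real baseball data.
n = 250
names = ["walks", "singles", "home_runs"]
X = np.column_stack([
    rng.normal(500.0, 50.0, n),
    rng.normal(950.0, 60.0, n),
    rng.normal(160.0, 25.0, n),
])
# Made-up run values, used only to generate the fake outcome.
runs = X @ np.array([0.30, 0.45, 1.40]) + rng.normal(0.0, 20.0, n)

def sse(predictors, y):
    """Residual sum of squares from an OLS fit of y on predictors plus an intercept."""
    design = np.column_stack([np.ones(len(y)), predictors])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    return float(resid @ resid)

sse_full = sse(X, runs)

# Partial R-squared for predictor j: (SSE without j - SSE full) / SSE without j.
partial_r2 = {}
for j, name in enumerate(names):
    sse_reduced = sse(np.delete(X, j, axis=1), runs)
    partial_r2[name] = (sse_reduced - sse_full) / sse_reduced

for name, value in partial_r2.items():
    print(f"{name:10s} partial R-squared = {value:.3f}")
```

Each value is the share of the variance left unexplained by the other predictors that this one picks up, so the values do not have to add up to the overall R-squared.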
