Yesterday, I unveiled the best pitcher expected walk percentage equation yet. By simply looking at the percentage of total pitches that are thrown for a strike and the rate at which the strikes thrown are put into play, we can get a pretty good idea of what a pitcher’s walk percentage should be. I was literally in the middle of typing up today’s post putting the equation to work on 2013 data to get an idea of which pitchers should have a higher or lower BB% when another light bulb went off.
Felix Hernandez had appeared on the list of pitchers with a higher xBB% than BB%. Checking his Str% and I/Str trends on Baseball Reference explained why: his I/Str was at a career low, which was pushing his xBB% up. Obviously, if a batter fails to put a strike in play, the at-bat continues and the opportunity for a base on balls still exists. However, a low I/Str also illustrates a pitcher’s dominance. If batters are unable to put a pitcher’s strikes into play, you would assume that pitcher has a high strikeout percentage. In Felix Hernandez’s case, his strikeout percentage sits at a career high. No wonder his I/Str is at a career low: batters are striking out rather than putting the ball in play!
So now I’m thinking that a pitcher’s strikeout percentage must have some impact on his walk rate. Back I went to my data set, bringing in K% and adding that variable to the two already in the initial equation. The resulting R-squared improved by a meaningful amount, and better yet was the effect on Felix Hernandez’s xBB%. But first, here is the updated regression equation.
xBB% = 0.7598 + (-0.7300 * Str%) + (-0.5729 * I/Str) + (-0.2341 * K%)
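To make the equation easier to play with, here is a minimal Python sketch that plugs numbers into it. The coefficients are copied from the equation above. The function name and the sample pitcher are hypothetical, and I am assuming every rate is entered as a fraction (0.63 for 63%), which is what the size of the intercept suggests.

```python
# Coefficients copied from the xBB% equation above.
INTERCEPT = 0.7598
COEF_STR = -0.7300    # Str%: strikes as a share of total pitches
COEF_I_STR = -0.5729  # I/Str: share of strikes put into play
COEF_K = -0.2341      # K%: strikeout percentage


def expected_walk_rate(str_pct, i_str, k_pct):
    """Return xBB% as a fraction.

    Assumption: all three inputs are fractions (0.63 means 63%).
    """
    return (INTERCEPT
            + COEF_STR * str_pct
            + COEF_I_STR * i_str
            + COEF_K * k_pct)


# Hypothetical pitcher, not real 2013 data.
xbb = expected_walk_rate(str_pct=0.63, i_str=0.28, k_pct=0.20)
print(f"xBB% = {xbb:.1%}")
```

Because all three coefficients are negative, raising any one input while holding the other two fixed lowers the expected walk rate. That is why a career-high K% pulls xBB% down even when I/Str is low.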
While it’s usually not a good idea to add another variable just for the heck of it, I think the addition of strikeout percentage is necessary. Under the old equation, Hernandez’s xBB% was 8.1%, while under the new one, it is just 7.2%. That’s a difference of nearly a full percentage point, and I feel much more comfortable with the latter, which is right in line with his career walk percentage. Also important to note is that his Str% is actually lower than in any of his previous three seasons, which is the likely explanation for his expected walk percentage not being even better.
If you recall, the R-squared from yesterday’s equation was 0.73, so this update provides a small but meaningful gain. As usual, it still appears that there is work to be done. But I am happier with this equation than with yesterday’s, so next week I can get back to the task I initially started and use the equation to look at 2013 data.
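For anyone who wants to check an R-squared figure like the 0.73 quoted above against their own data, here is a minimal sketch of how the statistic is calculated from actual and predicted walk rates. The function name and the sample numbers are made up purely for illustration; they are not from my data set.

```python
def r_squared(actual, predicted):
    """Share of the variance in `actual` that `predicted` explains."""
    mean_actual = sum(actual) / len(actual)
    # Squared error left over after using the predictions.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Squared error from simply guessing the average every time.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Made-up walk rates (as fractions) for four imaginary pitchers.
actual_bb = [0.060, 0.075, 0.090, 0.105]
predicted_bb = [0.065, 0.070, 0.092, 0.100]
print(round(r_squared(actual_bb, predicted_bb), 3))
```

A value of 1 means the predictions match the actual rates exactly, while 0 means they do no better than guessing the average walk rate for every pitcher.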