Using WPA to grade bullpen management, part one

For my money, one of the greatest sabermetric successes of the last 10 years has been the incredible number of applications for win expectancy. We have graphs that track a team’s chances of winning a game in real time. We have leverage index tables that act as multipliers for intense, game-changing situations. And we have live running totals of every player’s contributions to a team win, known as Win Probability Added, or WPA.

The relative lack of conceptual complexity is what sets WPA apart from almost every other statistic in the deep end of the sabermetric toolbox. If you wanted to explain wOBA to a statistical novice, you’d run into some trouble. Statistics like wOBA or UZR or even OPS sit firmly on a theoretical plane, hidden away from reality by several layers of abstraction. Runs are not actual, physical runs in the RBI sense, but theoretical constructs of what a run would be in a perfect spherical vacuum. It’s absolutely for the best, as these statistics pare away excess and correct for biases in the data set. But when you’re trying to explain to someone that Mike Trout’s wOBA wasn’t .409 runs or .409 hits per at bat, but just .409, it’s natural that some people would have a conceptual problem.

But WPA is different. Anyone can understand it, because it’s a natural mathematical extension of how we all watch the game. If a player hits a double that boosts his team’s chances of winning from 60 to 80 percent, he gets credited with 0.20 WPA. A full 1.00 WPA isn’t a theoretical win (as in WAR), but an actual win formed from a composite of real game events. To borrow a phrase from every physicist I’ve ever known, WPA is elegant in its simplicity.
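The bookkeeping behind that double can be sketched in a few lines of Python. This is only an illustration of the idea, not how any site actually computes it: the function name and the extra season events are made up, and real win probabilities come from win expectancy tables rather than hand-typed numbers.

```python
def wpa_for_event(win_prob_before: float, win_prob_after: float) -> float:
    """Credit for one event: the change in the team's chance of winning."""
    return win_prob_after - win_prob_before


# The double from the text: 60 percent before, 80 percent after.
double = wpa_for_event(0.60, 0.80)

# A season total is just the running sum of per-event credits.
# These (before, after) pairs are hypothetical, for illustration only.
season_events = [(0.60, 0.80), (0.50, 0.45), (0.30, 0.42)]
season_wpa = sum(wpa_for_event(before, after) for before, after in season_events)

print(round(double, 2), round(season_wpa, 2))
```

Negative events (the second pair above) subtract from the total, which is why a player's WPA can finish the season below zero.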

And like most stats, WPA spurred a chain reaction of ancillary concepts and metrics. A quick perusal of the Win Probability section of FanGraphs shows dozens of related statistics, all stemming from basic WPA. I find the Clutch statistic unusually satisfying, as it precisely quantifies something that’s inherently nebulous. It’s very simple: it measures a player’s performance in context and compares it to his context-independent performance in a bubble. A player who hit better when it mattered will end up with a positive Clutch score, and vice versa. Again, it’s a very mathematically elegant solution to an idea that’s simple in concept, yet difficult to work with analytically.

Study after study has shown that clutch performance is (mostly) a non-repeatable skill, and as such, the Clutch metric isn’t used very often. Batters generally can’t turn it up when they need to. Weak hitters are sometimes forced to come up to the plate when the game is on the line, and there’s not a whole lot that the hitter can do. But what about relievers? Unlike hitters, a bullpen can collectively demonstrate a repeatable clutch skill, simply because the manager gets to decide when to use stronger arms. If a bullpen needs a clutch strikeout in a tight spot, the manager can tilt the odds to make it happen. So why not use Clutch score to evaluate and grade a manager’s in-game tactics?

To get there, we have to start with a few modifications. The formula for Clutch score is below.

Clutch = (WPA / pLI) - WPA/LI


To put it in more understandable terms, total WPA is first divided by pLI, the player’s average leverage index, so that his full season’s leverage is equal to the average. Unleveraged, context-independent performance (WPA/LI) is then subtracted. The result shows how much better a player was in high-leverage situations, as compared to his average performance level. The pLI denominator in the first term is an adjustment that treats the player as though he had appeared in an equal number of high- and low-leverage situations. It’s a tiny adjustment for most batters, save for the handful of players who are often deployed situationally in pinch-hit appearances.

It is a necessary adjustment, however, because Clutch score is attempting to measure the difference in performance between a high-leverage situation and the hitter’s usual performance level. For the purpose of example, let’s create an imaginary player who had 0.40 WPA on the year. He finished the year with two plate appearances, both home runs. One came in a perfectly average situation with a leverage index of 1.0, which was worth 0.10 WPA. The other was a clutch game-changer with a leverage index of 3.0, worth 0.30 WPA. Since he hit exactly the same way in both clutch situations and non-clutch situations, his Clutch score should be precisely zero. But without that pLI adjustment, his Clutch score would be 0.2, simply because he was dealt more than his fair share of high-intensity situations. Batters obviously can’t choose where and when to hit, so the pLI denominator removes it from the equation.
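The arithmetic for that imaginary player can be checked with a short Python sketch. It assumes that pLI is the simple average of the leverage index over his plate appearances and that WPA/LI is the sum of each event's WPA divided by its own leverage index; the function and variable names are hypothetical.

```python
def clutch_scores(plate_appearances):
    """Return (clutch, unadjusted) for a list of (wpa, leverage_index) pairs.

    clutch     = WPA / pLI - WPA/LI   (with the pLI adjustment)
    unadjusted = WPA - WPA/LI         (without it)
    """
    total_wpa = sum(wpa for wpa, _ in plate_appearances)
    # Assumption: pLI is the plain mean of leverage index across appearances.
    pli = sum(li for _, li in plate_appearances) / len(plate_appearances)
    # Context-independent performance: each event deflated by its own leverage.
    wpa_li = sum(wpa / li for wpa, li in plate_appearances)
    return total_wpa / pli - wpa_li, total_wpa - wpa_li


# The two home runs from the example: (WPA, leverage index).
clutch, unadjusted = clutch_scores([(0.10, 1.0), (0.30, 3.0)])
print(round(clutch, 2), round(unadjusted, 2))
```

Here total WPA is 0.40, pLI is 2.0, and WPA/LI is 0.20, so the adjusted score is 0.40 / 2.0 - 0.20 = 0 while the unadjusted score is 0.40 - 0.20 = 0.2, matching the figures in the text.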

For relievers, however, that pLI correction covers up exactly what we’re trying to measure. Managers choose when to deploy relievers, so if we remove the correction factor from the Clutch equation (leaving simply WPA minus WPA/LI), we arrive at a metric that combines reliever performance with manager tactics. It gives a total number of how many wins the bullpen added by performing well in the clutch situations they were purposefully given.
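As a rough sketch, the modified score is a single subtraction. The example below assumes the three figures listed for each reliever in this article are WPA, WPA/LI, and their difference, which the arithmetic bears out; the function name is hypothetical.

```python
def leverage_inclusive_score(wpa: float, wpa_li: float) -> float:
    """Clutch without the pLI correction: total WPA minus WPA/LI."""
    return wpa - wpa_li


# (WPA, WPA/LI) pairs taken from the 2012 figures listed in this article.
relievers = {
    "Jim Johnson": (5.35, 2.18),
    "Heath Bell": (-3.03, -0.63),
}

scores = {
    name: round(leverage_inclusive_score(wpa, wpa_li), 2)
    for name, (wpa, wpa_li) in relievers.items()
}
print(scores)
```

Because usage is no longer divided out, a reliever who pitches well mostly in high-leverage spots scores high, and one who pitches poorly in those spots scores low.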

The top ten relievers in 2012 who squeezed the most WPA out of a combination of timely performance and efficient bullpen management are below.

Player             WPA     WPA/LI   WPA - WPA/LI
Jim Johnson        5.35    2.18     3.17
Fernando Rodney    4.82    2.46     2.36
Steve Cishek       2.68    0.36     2.32
Mike Adams         2.61    0.33     2.28
Vinnie Pestano     3.54    1.27     2.27
Craig Kimbrel      4.15    1.98     2.17
Rafael Soriano     2.95    0.80     2.15
Tom Wilhelmsen     2.63    0.53     2.10
Luke Gregerson     2.49    0.50     1.99
Darren O’Day       3.42    1.47     1.95

Jim Johnson had the second-highest average leverage index (pLI) of all qualified relievers in 2012, and he put it to good use by chipping in 3.17 additional wins to the Orioles’ total. Most relievers were bunched up within one win above or below zero, but Jim Johnson stood far and away above the crowd.


The other end of the spectrum, consisting of relievers who performed poorly in the large number of clutch situations their managers gave them, also exhibited a pretty large deviation from zero.

Player                WPA      WPA/LI   WPA - WPA/LI
Jon Rauch             -0.66    0.20     -0.86
Brian Duensing        0.51     1.37     -0.86
Matt Thornton         -0.32    0.61     -0.93
Kameron Loe           -1.17    -0.22    -0.95
Jim Miller            -0.78    0.19     -0.97
Scott Atchison        0.21     1.25     -1.04
Francisco Rodriguez   -1.01    0.15     -1.16
Matt Albers           -1.11    0.51     -1.62
Alfredo Aceves        -2.55    -0.40    -2.15
Heath Bell            -3.03    -0.63    -2.40

In the next article, I’ll isolate manager tactics from player performance, and combine individual player scores into a single number for each bullpen as a group.

Very interesting article, look forward to part two!

Pizza Cutter

I’m intrigued to see where this is going…

Hey Dan, that’s an interesting approach to this issue. As you say, your approach combines both the usage of the reliever and his performance in the mix. I would think the best way to take the manager decision out would be plain old Clutch, so I’m interested to see where you’re going. It’s worth saying that WPA/LI is a very interesting stat in and of itself. It’s basically a measure of the situational win a player contributed, regardless of the LI of the situation. It’s unique and hard to explain, but very powerful. Also, this is a good time to…