This week, I’ve talked about the retrospective price of WAR on an aggregate level. What I haven’t studied is the retrospective price of WAR by position. I thought this was particularly important in light of my finding that positional adjustments didn’t matter much for arbitration salaries. Players who played tougher defensive positions were underpaid in arbitration, relative to those who played easier defensive positions. As it turns out, WAR has been much more expensive at some positions than at others.
|Position||N||2007-11 $/WAR||2007 $/WAR||2008 $/WAR||2009 $/WAR||2010 $/WAR||2011 $/WAR|
The elephant in the room, of course, is the astonishing $16.4 million per WAR paid to relief pitchers. Let’s ignore that for a moment and simply look at hitters. It’s clear that the further you move toward the easier end of the defensive spectrum, the more likely you are to be overpaid. Of course, that could be a mirage, because players can be moved to first base, outfield or even to designated hitter if they’re aging poorly. Because of that, I looked only at the price paid in the first year of deals. The same result held: hitters who play tougher positions are underpaid on a per-WAR basis.
|Position—First Year Only||N||2007-11 $/WAR|
Not only are first base, outfield and designated hitter overpaid either way, but so are relievers. In fact, relievers are paid nearly twice as much for the same WAR as other players. Does that mean that relievers are overpaid? Maybe not. That all hinges on WAR being correctly specified.
There are two different WAR statistics commonly available on the web, and they agree more often than they disagree. The other WAR, which Sean Smith (a.k.a. “Rally”) invented, is called rWAR and is available at Baseball Reference. Smith’s rWAR has a much higher replacement level for starting pitchers and a much lower replacement level for relievers. As a result, using rWAR suggests relievers are paid about 30% more than comparable producers elsewhere on the roster, while fWAR suggests they are paid nearly twice as much.
Does that mean that rWAR has a more accurate view of relievers than our WAR at FanGraphs? Not really. It just shows how complicated reliever evaluation can be. If we run the same analysis on Baseball Prospectus WARP, we would get answers even more extreme than the examples here. Relievers accumulated 89.4 fWAR this past season, but only about 50 WARP. Relievers might be overpaid by an even larger factor if we used Baseball Prospectus’ numbers, which means FanGraphs might represent a sort of a centrist’s view on reliever value.
This undoubtedly requires more research, but I’ll propose one early possibility about the price of relievers: For a market failure this large to exist, there would need to be incentive problems at some level. The first thing I thought to check was whether general managers are building better bullpens to save their jobs. After all, GMs don’t give out contracts to maximize their chance of winning a division or a World Series. They give out contracts to maximize their own value, which is inextricably tied to keeping their jobs.
Between 1994 and 2011, there were 40 GMs fired among 532 team seasons. Below, I have a breakdown of hitter, starter and reliever WAR, looking separately at teams whose GM was fired and teams whose GM was not fired. In that table, we can see that the average team whose GM was fired had half as much WAR from relievers as the average team whose GM was not fired. Overall, the average ranking of bullpens that belonged to GMs who later were fired was 21.6.
|WAR Group||Avg. if GM Not Fired||Avg. if GM Fired||Ratio||Rank if GM Fired|
What this suggests is that general managers have watched one another be fired for having bad bullpens — but they’ve seen peers be forgiven for having bad lineups or poor starting pitching. It seems that when a team falters because of poor hitting or bad pitching, the farm system, the scouts, the manager or the players get blamed. When a bullpen implodes, it’s the general manager’s fault.
I tried to run a couple of regressions to figure out if there was anything else we could learn that would prove or disprove this hypothesis. First, I ran a regression of a team’s wins on its WAR from hitters, starters and relievers.
Wins = 42.2 + .95*WAR_hitters + 1.03*WAR_SP + 1.36*WAR_RP
It does look like relievers might be undervalued, according to WAR. Still, the 95% confidence interval for the coefficient on reliever WAR is (1.12, 1.60), so this could be a small difference. In reality, the fact that high-strikeout relievers have particularly good BABIPs could mean that their FIPs slightly underrate their skill levels.
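To make the coefficients concrete, here is a minimal Python sketch that simply plugs numbers into the fitted equation above. The coefficients come from the equation; the team's WAR totals are hypothetical inputs chosen only for illustration.

```python
# Coefficients copied from the fitted wins equation in the article.
INTERCEPT = 42.2
COEF_HITTERS = 0.95
COEF_SP = 1.03
COEF_RP = 1.36

def predicted_wins(war_hitters, war_sp, war_rp):
    """Apply the fitted linear equation to one team-season."""
    return (INTERCEPT
            + COEF_HITTERS * war_hitters
            + COEF_SP * war_sp
            + COEF_RP * war_rp)

# Hypothetical team: these WAR totals are illustrative, not real data.
base = predicted_wins(20.0, 12.0, 3.0)
more_pen = predicted_wins(20.0, 12.0, 4.0)

# Extra predicted wins from one additional reliever WAR.
print(round(more_pen - base, 2))
```

If WAR were perfectly calibrated, each coefficient would be close to 1. The reliever coefficient of 1.36 means one more reliever WAR lines up with about 1.36 more wins in this fit.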
But then I ran a logit regression to test the probability of a GM being fired as a function of his team’s WAR from hitters, starters and relievers. The logit regression isn’t linear, which makes it difficult to read too much into the exact numbers, but the coefficient on reliever WAR is extraordinarily large when compared to the coefficients on hitter WAR and starting pitcher WAR.
GM_fired = f(-.133 - .044*WAR_hitters - .093*WAR_SP - .203*WAR_RP)
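For readers who want to turn that equation into a probability, here is a rough sketch. It assumes f is the standard logistic function, 1 / (1 + e^-z), which is the usual choice for a logit model but is not spelled out above. The WAR inputs are hypothetical.

```python
import math

# Coefficients copied from the fitted logit equation in the article.
INTERCEPT = -0.133
COEF_HITTERS = -0.044
COEF_SP = -0.093
COEF_RP = -0.203

def fired_probability(war_hitters, war_sp, war_rp):
    """Assumes f is the standard logistic function 1 / (1 + e^-z)."""
    z = (INTERCEPT
         + COEF_HITTERS * war_hitters
         + COEF_SP * war_sp
         + COEF_RP * war_rp)
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical team-seasons that differ only in bullpen WAR.
weak_pen = fired_probability(20.0, 12.0, 0.0)
strong_pen = fired_probability(20.0, 12.0, 6.0)
print(f"weak bullpen: {weak_pen:.3f}, strong bullpen: {strong_pen:.3f}")
```

Because the function is not linear, the change in firing probability from one extra WAR depends on where the team starts, which is why the raw coefficients are hard to read on their own.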
Using the ratios of the coefficients in these two equations isn’t a perfect mathematical approach, but it does tell a compelling story:
The coefficients of WAR on winning are comparable, but the coefficients of WAR on being fired are very different. Reliever performance factors into a GM’s ability to keep his job far more than the batters he signs.
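The ratio comparison can be reproduced with a few lines of Python. This is a sketch of one way to do it, scaling every coefficient by the hitter coefficient from the same equation; the choice of hitters as the baseline is my assumption for illustration.

```python
# Coefficients copied from the two fWAR equations in the article.
wins_coef = {"hitters": 0.95, "starters": 1.03, "relievers": 1.36}
fired_coef = {"hitters": -0.044, "starters": -0.093, "relievers": -0.203}

def relative_to_hitters(coefs):
    """Scale each coefficient by the hitter coefficient (assumed baseline)."""
    base = coefs["hitters"]
    return {group: value / base for group, value in coefs.items()}

wins_ratio = relative_to_hitters(wins_coef)
fired_ratio = relative_to_hitters(fired_coef)

for group in wins_coef:
    print(f"{group:10s} wins x{wins_ratio[group]:.2f}  fired x{fired_ratio[group]:.2f}")
```

On these coefficients, a reliever WAR counts for roughly 1.4 times a hitter WAR in the wins equation, but roughly 4.6 times a hitter WAR in the firing equation.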
On the other hand, when I run the same two regressions for rWAR, we see entirely different results.
Wins = 42.2 + .94*rWAR_hitters + .89*rWAR_SP + .97*rWAR_RP
GM_fired = f(-.636 - .074*rWAR_hitters - .094*rWAR_SP - .043*rWAR_RP)
The same tables are quite different, as well.
|rWAR Group||Avg. if GM Not Fired||Avg. if GM Fired||Ratio||Rank if GM Fired|
Interpreting rWAR numbers is difficult, because I’m making the same adjustment for draft picks that I do for fWAR. When I use fWAR to develop a way to evaluate draft pick costs, the replacement level yields a pretty natural 10% discount rate. Teams behave as though they value wins a year from now 10% less than wins this year. So even though they somewhat value the WAR from future draft picks, they value it significantly less than WAR in the coming year. Using rWAR, though, I actually got a negative discount rate. That seems unlikely, since teams probably don’t value future wins more than present wins, but that’s what the model suggested. Not only that, but the price per win would probably not be linear using rWAR. It would be slightly lower for superstar players. In essence, the model would need to be reworked to accommodate rWAR, and that might suggest that the fWAR results are more likely to be accurate.
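To show what a discount rate does to the value of future wins, here is a small sketch. It assumes the 10% discount compounds each year, so a win t years out counts as 0.9 to the power t of a win today. That compounding form is my assumption, and the WAR stream is hypothetical.

```python
DISCOUNT_RATE = 0.10  # the fWAR-based estimate quoted in the article

def discounted_war(war_by_year, rate=DISCOUNT_RATE):
    """Value a stream of WAR, where index 0 is the coming season.

    Assumes the discount compounds, so a win t years out counts
    as (1 - rate) ** t of a win today.
    """
    return sum(war * (1.0 - rate) ** t for t, war in enumerate(war_by_year))

# Hypothetical stream: 2 WAR in each of the next three seasons.
stream = [2.0, 2.0, 2.0]
print(round(discounted_war(stream), 2))              # 10% discount
print(round(discounted_war(stream, rate=-0.05), 2))  # a negative rate
```

With a positive rate, the three seasons are worth less than the raw 6 WAR. With a negative rate they are worth more, which is the implausible behavior the rWAR version of the model implied.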
Like the previous two articles, it’s OK if you come away with more questions than answers. We’re dealing with a rough approximation of an increasingly complicated labor market, and definitive conclusions are hard to reach. What these tables do suggest is that there’s room for improvement either in the way general managers put together their teams or in sabermetrics as a whole. As we’re discovering, the answer is probably somewhere in between.