The Strike Zone is (Still) Getting More Consistent

Not long ago, I pointed out a couple of hilarious game strike zones called by Sean Barber and Clint Fagan. Both umpires called balls on pitches well within the usual zone, and both umpires called strikes on pitches somewhere around the shins. They were awful displays of umpire judgment, but after that, Barber called a much better zone in his next game, and far more importantly, both Barber and Fagan are Triple-A umpires and not regular major-league umpires. The regulars are better than the prospects, just like we see with the players.

And about those regulars — I’ve pointed out in the past that they seemed to be calling more consistent strike zones. One of the neat things about a post like that is that it can be updated, and now that we’ve got a few hundred games finished in 2014, I come bearing some further encouraging news.

I don’t intend to get into too much detail — the numbers can mostly speak for themselves. We’ll look at this in two ways. One leans on a homespun metric I’ve written about before that comes from the FanGraphs plate-discipline leaderboards. We know how many strikes and pitches there have been, league-wide. We know the rate of pitches in the PITCHf/x strike zone, and we know the rate of swings at pitches out of the PITCHf/x strike zone. What this allows for is a simple calculation of expected strikes, which can then be compared to the number of actual observed strikes. In the table below, you’ll see a couple meaningful columns. The first is the difference between actual strikes and expected strikes, per 1,000 called pitches. The second is the difference between actual strikes and expected strikes, per game.

Season  Diff/1000  Diff/G
2010    -31        -5.0
2011    -24        -3.8
2012    -16        -2.5
2013    -13        -2.1
2014    -5         -0.8

Per full game, back in 2010, there were five fewer strikes than you’d expect based on the PITCHf/x strike zone. So far this year we’ve got a difference under 1, and the table shows a steady improvement. There are still fewer strikes than the automated zone wants there to be, but the gap now is practically nothing, relative to what it’s been.
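
For anyone who wants to play along at home, here is a minimal sketch of how an expected-strikes figure could be put together from those leaderboard rates. It assumes that every pitch in the PITCHf/x zone should end up a strike, and that every swing at a pitch out of the zone ends up a strike too. The function names and the league totals are made up for illustration; they are not FanGraphs data, and this is not necessarily the exact calculation behind the table.

```python
def expected_strikes(pitches, zone_rate, o_swing_rate):
    """Strikes implied by the PITCHf/x zone (assumed formula):
    every in-zone pitch, plus every swing at an out-of-zone pitch."""
    in_zone = pitches * zone_rate
    chased = pitches * (1.0 - zone_rate) * o_swing_rate
    return in_zone + chased


def strike_gap(actual_strikes, pitches, called_pitches, games,
               zone_rate, o_swing_rate):
    """Actual minus expected strikes, scaled per 1,000 called pitches and per game."""
    diff = actual_strikes - expected_strikes(pitches, zone_rate, o_swing_rate)
    return {
        "diff_per_1000_called": 1000.0 * diff / called_pitches,
        "diff_per_game": diff / games,
    }


# Hypothetical league totals, for illustration only.
gap = strike_gap(
    actual_strikes=61_000,
    pitches=100_000,
    called_pitches=53_000,
    games=340,
    zone_rate=0.45,
    o_swing_rate=0.30,
)
print(gap)
```

A negative number means fewer strikes than the automated zone would want, which is the direction every season in the table points.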

Now, that doesn’t say enough on its own. What if umpires are calling way too many balls in the zone, and way too many strikes out of it? What if the mistakes are just balancing out? For the second approach, we’ll use data available at Matthew Carruth’s StatCorner. Carruth determines a strike zone as it’s actually called by the league-average umpire. Let’s look at the year-to-year rates of balls in the zone, and strikes out of the zone:

Year  zBall%  oStrike%
2010  15.2%   7.9%
2011  15.3%   7.3%
2012  14.5%   7.2%
2013  14.0%   6.9%
2014  12.9%   7.6%

In the early going, oStrike% has picked up a bit, perhaps because of an increased emphasis on catcher pitch-framing. Or perhaps because of something else. It’s a small increase, and it represents a return, for now, to a range it’s occupied before. Look over at the zBall% column. Umpires have called fewer balls in the zone than ever, and the improvement from last year, for now, is more than a full percentage point. I’m sure some of this is noise, because some of every sample is noise no matter how big, but we’re talking about a couple hundred baseball games. Just because it’s too early to say anything conclusively doesn’t mean it’s too early to be encouraged.

Even with the improvement, there are a lot of balls called in the zone, and there are a lot of strikes called outside of it. You’re never, ever, ever going to get strike-zone perfection, not as long as it’s up to people, particularly a lot of somewhat older people. But if you accept that there are going to be mistakes, you should be happy with any signs of improvement, and the strike zone now seems to be more consistent than ever. The best alternative to a perfect strike zone is a consistent strike zone, and though what we’re talking about is incremental, most of the umpires in the big leagues have been doing this for eons. It’s a little amazing they’ve collectively been able to make these adjustments, probably in part due to PITCHf/x feedback.

And there also might just be better catchers now, more skilled and thus more convincing receivers. Over the course of the season, we’ll monitor that oStrike%. But if better receiving makes for fewer balls called in the zone, it can’t be that much of a bad thing. Those who dream of an automated strike zone won’t find much to be happy about in this. One mistake might be one mistake too many. But home-plate umpires are evolving, and even if they never evolve into pitch-calling robots, this is far better than no growth at all.

Jeff made Lookout Landing a thing, but he does not still write there about the Mariners. He does write here, sometimes about the Mariners, but usually not.

57 Responses to “The Strike Zone is (Still) Getting More Consistent”

  1. Mr baseball says:

    I remember most of the stat community ripped Selig for installing QuesTec in ballparks. Seems they did a 180 on this.

  2. Andrew McGee says:

    I would hope this is exactly what was in mind when PITCHf/x was designed. Sure, even it could use some improvement, but one hopes with reasonable feedback between umps and the PITCHf/x developers (back and forth) everyone gets better.

  3. X says:

    Wow, 13% error rate in the strike zone? That’s four times less error than a coin flip! Let me break out the “Mission Accomplished” banner.

  4. Trillage says:

    Great stuff. I’m confused about one thing, though: “You’re never, ever, ever going to get strike-zone perfection, not as long as it’s up to people, particularly a lot of somewhat older people.” That last part: why would umpires being “older people” make perfection less likely? Aren’t younger people similarly imperfect? Is it because older people are more likely to have failing eyesight? I would think that the experience of established umpires would more than compensate for that, as positioning is more important for calling balls and strikes than eyesight simply.

    • Just an eyesight thing. Happy to be wrong.

      • Northhampstonstead says:

        You actually made the point earlier in the article that “prospects,” i.e., the umpires who tend to be younger and less experienced, tend to make more mistakes. I don’t think that eyesight matters much when you’re talking about an apple-sized object in an invisible box over the plate. Experience probably matters a lot more, just as you suggested.

        • In theory you could have experienced young umpires, and they would probably be a little better than our current experienced older umpires. It was strictly a remark about age, if you could hold experience equal. It would probably be a subtle thing anyway.

        • dls says:

          In theory you could have technology do it, and be 100% accurate.

      • Baltar says:

        You’re not wrong, Jeff. As an older person (over 2/3 of a century), I recognize that I have a lot of problems beyond eyesight. My mind doesn’t work nearly as well as it did 1/3 of a century ago. I’m talking about the difference between Albert Einstein and George W. Bush. My quickness–what quickness?
        And…and…I was thinking about something. What was it?

  5. SucramRenrut says:

    The real problem is that the better technology used on broadcasts probably leads most people to think the game is being called less consistently than in the past. The pitchfx on the Toronto broadcast makes umps look bad all the time.

    • Za says:

      Is it actually drawing from the PITCHf/x system, though? Or is it a proprietary system?

  6. Jason Powers says:

    I’m normally not PC or one to police it, but Mr. Sullivan, you should change the older-person line unless you’ve definitively done some analysis on said umpires that equates error rate to age. Sure, as we get older our reaction times and performances decline in athletic endeavors, as Dr. Fair from Yale and others have determined. But is umpiring skill tied to age, and is it supposed that one’s visual acuity is errant due to just getting older? Moreover, how good does one’s eyesight have to be? 20/20? Better? Corrected to? You were sloppy with that line…as another noticed. Ageism happens enough in society and sport without adding to it with an overgeneralization for a population whose judgment and rule knowledge might be more decisive than just the hope we can just let technology do all the decision making for us.

  7. BG says:

    This article estimates that over 20 percent of pitches are incorrectly called. That’s around 60 wrong calls a game. Does anyone else think that we should eventually automate ball and strike calls?

  8. BG says:

    Actually it does. If you look at the last row of the second table, and add up the percentages for missed ball calls and missed strike calls, you will see that 20.5% of all pitches are called incorrectly. As for number of pitches in an average game, just google it.

    • Roughly 13% of called pitches in the zone are balls.

      Roughly 7-8% of called pitches out of the zone are strikes.

      You can’t just add those together. And not all pitches are called. Some pitches are swung at! Almost half of the pitches.

      • Jason B says:

        But in an effort to find fault, or support for the position I already staked out regardless of the data, I don’t have time to quibble with your silly maths!

    • Tangotiger says:

      In case someone else thinks they should add those numbers, consider this illustration. In a game of 280 pitches, you’ll have 140 pitches called. Let’s say there are 70 of those in the strike zone and 70 outside (all numbers for illustrative purposes only). Of the 70 in the strike zone, 13% are badly called, or 9 pitches. Of the 70 out of the strike zone, 7% are badly called, or 5 pitches.

      That gives us 14 pitches out of the 140 called pitches, or 10%.

      The original commenter, rather than taking a weighted average of 13% and 7%, instead added them, thereby doubling the error rate. And instead of applying it to called pitches, applied it instead to all pitches, thereby doubling the quantity. And so, his estimate of 60 pitches is 4 times larger than it should be.

      The lesson? Practice safe math.
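
      The illustration above can be sketched in a few lines of Python. The 70/70 split of called pitches is the same illustrative assumption used in the comment, the rates are rounded from the 2014 row of the article's second table (about 13% of called pitches in the zone ruled balls, about 7% of called pitches out of the zone ruled strikes), and the function name is hypothetical.

```python
def called_pitch_errors(called_in_zone, called_out_zone, z_ball_rate, o_strike_rate):
    """Weighted count and rate of missed calls among called pitches only."""
    missed = called_in_zone * z_ball_rate + called_out_zone * o_strike_rate
    return missed, missed / (called_in_zone + called_out_zone)


# Illustrative split: 280 pitches, 140 of them called, half in the zone.
missed, rate = called_pitch_errors(70, 70, z_ball_rate=0.13, o_strike_rate=0.07)

# The mistaken shortcut: add the two rates and apply the sum to every pitch.
naive_missed = (0.13 + 0.07) * 280

print(round(missed), round(rate, 2), round(naive_missed / missed, 1))
```

      The weighted version lands at about 14 missed calls out of 140 called pitches, while the shortcut overshoots by roughly a factor of four.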

  9. brentdaily says:

    This is really interesting, thanks Jeff.

    While many balk at the value of catcher framing, the effect of umps getting better is actually 33% more impactful on the run environment than the best framers over the past couple years. (e.g. Grandal and Conger each “stole” ~3 strikes/gm in ’13). Since 2010 umps have improved by 4 strikes/gm.

    I think the umps deserve a lot of credit for improving as well as a chapter in our discussion about the reduced run environment/increase in Ks we’re experiencing right now.

    • Warning Track Power says:

      Except that embracing framers is a part of the listed period and we cannot account for the effect of better framers on the data, especially since the actual improvement during the period is on pitches in the zone that are called balls, probably at the edges of the zone, which is what good framing does in the first place.

      There was an article here where a catcher admitted that his organization provided him with framing data and he tried to work on his areas of deficiency, with better results. I’d be inclined to give the credit to him rather than just take the data above and assume that the umps are just calling better.

  10. Schuxu says:

    So with umpires getting better, how does this change the value of a good framer? Does he become more valuable or less compared to other catchers?

    • soupman says:

      i wouldn’t say that the umps are necessarily “getting better”, instead that they are more often basing b/s calls on the rulebook definition of the zone.

      there are a number of explanations as to why this could be the case.

  11. Kevin says:

    Robot umps now! Rallying cry of the people. And the robots.

  12. jerry weinstein says:

    It would be interesting to see the effect of a hitter’s & pitcher’s status/resume on strikes & balls. Do the upper echelon more established pitchers get more marginal strikes called when facing a rookie hitter and the same for the established hitter vs. the rookie pitcher? It would be nice to see that quantified if such a bias exists.

  13. gmbristol says:

    Feel like we should start a PAC to automate balls and strikes. There is no reason to tolerate this sort of persistent error in judgement. Let the players decide the game. The umps should be irrelevant to the outcome. Which is clearly not the case, even if they are getting better.

    • The Ancient Mariner says:

      We should also start a movement to replace the players with robots! There’s no reason to tolerate their persistent errors in judgment either.

      • Sam Cro says:

        Or we can get auto-bots to replace The Ancient Mariner and have it reply to posters with stupid comments!

  14. YankeeGM says:

    I’ve always thought that with technology as it is now, the home plate ump could be given a ball/strike counter that simply vibrated when the pitch is a strike. This would maintain the illusion that umps are calling balls and strikes while actually getting the calls right.

    • Hank G. says:

      “I’ve always thought that with technology as it is now, the home plate ump could be given a ball/strike counter that simply vibrated when the pitch is a strike. This would maintain the illusion that umps are calling balls and strikes while actually getting the calls right.”

      An alternative would be to monitor the umpire’s calls and send a painful, but non-lethal shock every time he makes the wrong call. This would probably not be as accurate as your suggestion, but has the potential to be much more entertaining.

    • Baltar says:

      And the benefit of maintaining the illusion is …?

    • joser says:

      You need an ump to call plays at the plate anyway (and examine scuffed balls and break up mound conferences and hitter-batter altercations), so there’s going to be a guy there. I actually would like a simple display in the ump’s mask showing the PitchFX determination (this could be as minimal as four red LEDs in a cross shape indicating balls that missed the zone in a particular dimension, and a central LED to indicate a strike). The umps would be told they had no requirement to call the pitch according to the indicator (after all, PitchFX does fail occasionally) but human nature being what it is, they would tend to do so — whether consciously (especially in those cases when they aren’t certain, which may correspond with the “blown” calls that bug us the most) or not. Pitch-by-pitch feedback will train them, either to make better calls or to just rely on PitchFX — and either way consistency wins.

  15. Sam Cro says:

    If we can land a man on the moon or place a bomb up the butt of Terrorists, we can use technology to call Balls and Strikes. Buy the Umps some Google Glasses and tie it into PITCHf/x strike zone, get going on this.

  16. Hank G. says:

    Automating the calls on balls and strikes would probably have a lower error rate than human umpires, but even if it didn’t it would surely be more consistent. Batters would have to adapt to only one strike zone rather than one for each individual umpire.

    • Jason B says:

      As long as by “one strike zone” we mean one “as defined by the rulebook strike zone” FOR EACH BATTER. Which would have to be recalibrated slightly from batter to batter, or stored within the automated strike zone system for each individual, as the knees and shoulders of a 5’7 player would not be in the same location as a 6’5 batter.

      (Hopefully no one starts with that nonsense of “same exact strike zone regardless of each player’s size” canard, which is ridiculous on its face; “Hey, I’m ten inches shorter than that last dude, of course I should have to swing at balls at my neck level to compensate!”)

      • a says:

        A human already does the recalibration for pitch f/x, don’t see why the same thing couldn’t be done for this.

        • Jason B says:

          Totally. It’s an easy issue to identify and solve. I just remember in one of the earlier discussions of automating the strike zone, the asinine notion of “hey just keep it the same for everyone!” surfaced.

  17. pft says:

    I saw an article recently showing 14% of called pitches incorrect, but maybe it was talking about outside the zone.

    Are you using the actual strike zone or the “typical” strike zone (which is the zone umps usually call despite being larger than the official zone)?

  18. pft says:

    I think we should just let the catcher call balls and strikes. If the other team does not like the calls we can have a bench clearing brawl. This will help enforce the honesty of the catcher, especially for big hitters and big teams, and provide a lot of excitement when the catcher cheats.
