Strike Zone Generosity and Team Pitching Success

Everything, ultimately, has to come down to runs. Or wins, I suppose, but wins and runs are strongly correlated. By boiling measures and evaluations down to runs, we’re given an understanding of how much they matter at the end of the day. We know how to value a guy who hits a lot of home runs. We know how to value another guy who’s said to be great in the field. Runs and wins are at the core of performance analysis, because runs and wins are what teams are trying to add to get better.

When you talk about catcher pitch-framing, you generally end up talking about the difference between a ball and a strike. It might seem like a missed call here and there shouldn’t matter (these are just individual pitches!), but each call does matter, and as they pile up, they matter more. Toward the end of last season, Joe Maddon said something to the effect of Jose Molina saving his team 50 runs or so because of his receiving. Catchers are ranked on their framing by runs saved or cost, and this is calculated by using the run-value difference between a ball and a strike. Each season, the best framers seem to be tens of runs better than the worst framers. When you’re talking about tens of runs, you’re talking about a significant effect.

But the actual effect, presumably, is quite complicated, in the way that park factors are quite complicated. You can look at framing in isolation, and that’s how you can end up with differences of tens of runs. But it isn’t as simple as just taking a pitch and making it a ball or a strike. That pitch will have an effect on the next pitch, which will have an effect on the next pitch, and so on. Perhaps, after a ball, a strike is more likely to follow. Perhaps, after a strike, a ball is more likely to follow. How much does framing, or the size of the strike zone, make a difference on a team level, when you include the whole picture?

Following, you will see charts. Preceding the charts is this explanation of what you’re looking at. For the trillionth time, I’ve used FanGraphs’ plate-discipline data to come up with expected strikes totals for individual team pitching staffs from 2008-2012. I then calculated the difference between actual strikes and expected strikes, and put that on a per-game scale, where the average game has about 78 called pitches. I then adjusted the numbers to set the year-to-year league average at zero. A Diff/Game of 1.0 refers to about one extra strike per game. A Diff/Game of -3.0 refers to about three fewer strikes per game. I calculated every team’s Diff/Game for every year since 2008, and then I plotted some performance metrics against them. The charts begin now.
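As a rough illustration of the Diff/Game steps just described, here is a minimal Python sketch. It is not the code behind the charts: the row keys (`team`, `year`, `actual_strikes`, `expected_strikes`, `called_pitches`) are hypothetical names, the expected-strike totals are taken as given rather than derived from plate-discipline data, and the yearly league average is assumed to be a simple unweighted mean across teams.

```python
from collections import defaultdict

# Approximate called pitches in an average game, as quoted in the text.
CALLED_PITCHES_PER_GAME = 78


def diff_per_game(rows):
    """Return {(team, year): Diff/Game}, centered so each year averages zero.

    Each row is a dict with hypothetical keys: team, year, actual_strikes,
    expected_strikes, called_pitches.
    """
    raw = {}
    for r in rows:
        # Extra strikes per called pitch, then scaled to a 78-pitch game.
        per_pitch = (r["actual_strikes"] - r["expected_strikes"]) / r["called_pitches"]
        raw[(r["team"], r["year"])] = per_pitch * CALLED_PITCHES_PER_GAME

    # ASSUMPTION: league average = unweighted mean of team values per year.
    by_year = defaultdict(list)
    for (_team, year), value in raw.items():
        by_year[year].append(value)
    year_mean = {year: sum(vals) / len(vals) for year, vals in by_year.items()}

    # Subtract the yearly mean so the league average sits at zero.
    return {key: value - year_mean[key[1]] for key, value in raw.items()}
```

Under this sketch, a staff that gets 10 more strikes than expected over 780 called pitches comes out one strike per game ahead before the league adjustment.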

[Chart: team ERA- plotted against Diff/Game, 2008-2012]

Looking at team ERA- against Diff/Game, we see a downward trend, but the relationship is weak. Still, the average ERA- of the 15 worst teams in Diff/Game is 105. The average ERA- of the 15 best teams in Diff/Game is 96. Our slope is -1.8: for each additional strike per game, ERA- goes down by nearly two points. There’s something here, as noisy as it is.
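For anyone who wants to reproduce this kind of summary, the two numbers quoted above (a best-fit slope and the averages of the 15 lowest and 15 highest teams) can be sketched in plain Python. This assumes the line is an ordinary least-squares fit, which the article doesn't state outright; the function names are made up for illustration, and `x` and `y` stand for each team-season's Diff/Game and ERA-.

```python
def ols_slope(x, y):
    """Ordinary least-squares fit of y on x; returns (slope, intercept, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


def extreme_group_means(x, y, k=15):
    """Mean of y for the k lowest-x and the k highest-x observations."""
    pairs = sorted(zip(x, y), key=lambda p: p[0])
    lowest = [yv for _, yv in pairs[:k]]
    highest = [yv for _, yv in pairs[-k:]]
    return sum(lowest) / k, sum(highest) / k
```

A slope near -1.8 paired with a small `r_squared` would match the description here: a real-looking tilt inside a wide cloud of points.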

[Chart: team FIP- plotted against Diff/Game, 2008-2012]

Of course, we know that ERA and thus ERA- can be noisy. FIP- offers a little stabilization, and here we see a slightly stronger relationship, albeit still a weak one. The slope of our line is about -1.5, but the data sure looks scattered.

[Chart: team xFIP- plotted against Diff/Game, 2008-2012]

And here’s our strongest relationship of the three, where we just isolate strikeouts, walks, and fly balls. The slope, again, is about -1.5, and the average of the 15 worst teams in Diff/Game is 104 while the average of the 15 best teams in Diff/Game is 96. There’s a lot of noise here, still, and that’s to be expected, but it sure feels like we’re not measuring nothing. By this measure, at least, getting a more generous strike zone is helping pitching staffs prevent runs. When you put it that way it sounds silly and obvious, but one also has to notice the fairly weak relationship. In between the pitches that can be framed, there are a lot of other pitches, and it isn’t clear what effect a well-framed or poorly framed pitch has on the next pitch in the sequence.

Potentially of interest:

[Chart: team strike rate plotted against Diff/Game, 2008-2012]

We hardly see any relationship between strike rate and Diff/Game. Yeah, the slope of the line is positive, but the correlation is tissue paper. This hints at the interconnectedness of it all, and speaks to the danger of thinking about pitch-framing in isolation. Not all pitches are borderline pitches.

For the record, if we go all the way back to the first chart, with ERA-, recall that the slope of the line is about -1.8. Over a full season, this works out to a little over ten runs saved for each additional strike per game over average. Since 2008, 34 teams have finished with a Diff/Game of at least 1.0, while nine teams have finished with a Diff/Game of at least 2.0. Two teams have finished at 3.0 or higher. At the other end, 30 teams have finished at -1.0 or lower, with eight teams at -2.0 or lower and one team below -3.0. None of the worst 15 teams in Diff/Game finished with an average or better FIP-, which seems worth noting.
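The step from a slope of -1.8 to "a little over ten runs" can be sketched as back-of-the-envelope arithmetic. ERA- is indexed so that 100 is league average, so one point is about one percent of a league-average team's earned runs. The baseline of 650 earned runs per team-season is an assumption for illustration, not a figure from the article, and park adjustments are ignored.

```python
ERA_MINUS_SLOPE = -1.8            # ERA- points per extra strike per game (from the text)
ASSUMED_BASELINE_EARNED_RUNS = 650  # ASSUMPTION: rough league-average team season


def runs_saved_per_season(diff_per_game,
                          slope=ERA_MINUS_SLOPE,
                          baseline_runs=ASSUMED_BASELINE_EARNED_RUNS):
    """Convert a Diff/Game value into approximate earned runs saved per season."""
    era_minus_change = slope * diff_per_game  # negative means fewer runs allowed
    # One ERA- point is treated as 1% of the baseline earned-run total.
    return -era_minus_change / 100 * baseline_runs
```

With these inputs, one extra strike per game lands somewhere between 11 and 12 runs, in the same neighborhood as the figure above; a different baseline moves the answer proportionally.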

The top team of the last five years in Diff/Game? The 2009 Atlanta Braves. The runner-up? The 2011 Atlanta Braves. In third place? The 2008 Atlanta Braves. The 2010 Atlanta Braves show up in sixth, and the 2012 Atlanta Braves show up in seventh. Brian McCann and David Ross have been all right, at least in this department. And also in many other departments too.




Jeff made Lookout Landing a thing, but he does not still write there about the Mariners. He does write here, sometimes about the Mariners, but usually not.

9 Responses to “Strike Zone Generosity and Team Pitching Success”

  1. NatsFan73 says:

    Inasmuch as cherry-picking data is bad, if you were to imagine your first three graphs without the extreme handful of samples on the left and right ends, what little relationship there is almost vanishes. Perhaps this is because there may only be a relationship when a team gets A LOT of the calls going one way or the other. An occasional bad call is like an occasional bad pitch. You can work around it and it’s probably not going to make a huge difference in the end. But if you’re consistently getting bad calls you’re going to end up giving up more runs than you’d like.

    • Doug Lampert says:

      I’m looking at them and trying to ignore the end-points and lines, and I’m pretty sure I still see a weak correlation if you remove the 3 or so points furthest to the right or left. xFIP- especially is a tight enough grouping that I’m almost sure removing the extreme right or left won’t have much effect.

  2. Braves1995 says:

    And if you cherry pick data to ignore the 9th inning of Game 5 of the NLDS, the Nationals had a great season last year.

  3. The 50-run comment from Maddon appears to have come from a tweet by Max Marchi, the Baseball Prospectus and former Hardball Times writer who has led the way in researching raw framing data. He’s not released any complete data this season (maybe teams are paying him for it now?), but I imagine 50 runs tops the league.

    With respect to your earlier article, I wonder if increasing umpire efficiency has made Molina even more valuable. As the 50-run figure seems to demonstrate, improvements by the umpires may not have affected Molina’s framing talent.

  4. James says:

    Jeff, first, I really enjoy the work you guys do but sometimes, as in this case, I wish you would take it one fairly easy step further. R^2 tells you something about strength of correlation, but for a rigorous statistical interpretation it’s fairly useless. What could be done here is to do a regression analysis where you compare your data sets to the null hypothesis that the slope of each line is zero. Then you can choose whatever criterion you want for statistical significance (p<0.05,p<0.01,etc.) and the test tells you within that confidence either the null hypothesis is true (a lot of noise) or false (woohoo, it's a real trend!)

  5. I disagree that we do a good job of valuing fielders. I’d go with slightly better than nothing…

    • Baltar says:

      Advanced fielding statistics are a long ways from perfect, but a lot better than anything else in the history of the game–especially “eyes.”
      I wish critics would stop knocking it and suggest something better instead.

      • I don’t have a better solution. But that shouldn’t stop me, or anyone, from acknowledging the limitations of our tools.

        Btw, our eyes are a pretty decent tool, when aggregated, at evaluating defense.

  6. Jon S. says:

    I dig the use of Science. As much as I like seeing fellow Chem majors find success outside of the discipline, I think seeing someone find a use for that book learnin’ while being great at something not chemistry is even better.
