Strike Zone Generosity and Team Pitching Success

March 1, 2013

Everything, ultimately, has to come down to runs. Or wins, I suppose, but wins and runs are strongly correlated. By boiling measures and evaluations down to runs, we’re given an understanding of how much they matter at the end of the day. We know how to value a guy who hits a lot of home runs. We know how to value another guy who’s said to be great in the field. Runs and wins are at the core of performance analysis, because runs and wins are what teams are trying to add to get better.

When you talk about catcher pitch-framing, you generally end up talking about the difference between a ball and a strike. It might seem like a missed call here and there shouldn’t matter (these are just individual pitches!), but each call does matter, and as they pile up, they matter more. Toward the end of last season, Joe Maddon said something to the effect of Jose Molina saving his team 50 runs or so because of his receiving. Catchers are ranked on their framing by runs saved or cost, and this is calculated by using the run-value difference between a ball and a strike. Each season, the best framers seem to be tens of runs better than the worst framers. When you’re talking about tens of runs, you’re talking about a significant effect.

But the actual effect, presumably, is quite complicated, in the way that park factors are quite complicated. You can look at framing in isolation, and that’s how you can end up with differences of tens of runs. But it isn’t as simple as just taking a pitch and making it a ball or a strike. That pitch will have an effect on the next pitch, which will have an effect on the next pitch, and so on. Perhaps, after a ball, a strike is more likely to follow. Perhaps, after a strike, a ball is more likely to follow. How much does framing, or the size of the strike zone, make a difference on a team level, when you include the whole picture?

Following, you will see charts. Preceding the charts is this explanation of what you’re looking at. For the trillionth time, I’ve used FanGraphs’ plate-discipline data to come up with expected strikes totals for individual team pitching staffs from 2008-2012. I then calculated the difference between actual strikes and expected strikes, and put that on a per-game scale, where the average game has about 78 called pitches. I then adjusted the numbers to set the year-to-year league average at zero. A Diff/Game of 1.0 refers to about one extra strike per game. A Diff/Game of -3.0 refers to about three fewer strikes per game. I calculated every team’s Diff/Game for every year since 2008, and then I plotted some performance metrics against them. The charts begin now.
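For anyone who wants to follow along, here is a minimal sketch of the Diff/Game calculation described above. It is not the exact spreadsheet behind the charts: the expected-strike totals are taken as given, the field names are hypothetical, and scaling by each team's own called-pitch count is an assumption about how the per-game conversion works.

```python
from collections import defaultdict
from statistics import mean

# "About 78 called pitches" per game is the figure quoted in the article.
CALLED_PITCHES_PER_GAME = 78


def diff_per_game(rows):
    """Return {(team, year): Diff/Game}, centered on each year's league average.

    Each row is a dict with hypothetical keys: team, year, actual_strikes,
    expected_strikes, called_pitches.
    """
    raw = {}
    by_year = defaultdict(list)
    for r in rows:
        # Extra (or missing) strikes per called pitch, scaled to one game.
        per_pitch = (r["actual_strikes"] - r["expected_strikes"]) / r["called_pitches"]
        value = per_pitch * CALLED_PITCHES_PER_GAME
        raw[(r["team"], r["year"])] = value
        by_year[r["year"]].append(value)

    # Set the year-to-year league average at zero.
    league_avg = {year: mean(vals) for year, vals in by_year.items()}
    return {key: value - league_avg[key[1]] for key, value in raw.items()}
```

Centering within each year means a team is only compared with the rest of the league in that same season, so any league-wide drift in the called zone drops out.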

Looking at team ERA- against Diff/Game, we see a downward trend, but the relationship is weak. Still, the average ERA- of the 15 worst teams in Diff/Game is 105. The average ERA- of the 15 best teams in Diff/Game is 96. Our slope is -1.8; that is, for each additional strike per game, ERA- goes down by nearly two points. There’s something here, as noisy as it is.
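The two summaries used here, a trend-line slope and the averages of the 15 teams at each end, can be sketched as follows. The article doesn't say how the line was fit, so ordinary least squares is an assumption, and the function names are made up for illustration.

```python
from statistics import mean


def ols_slope(xs, ys):
    """Least-squares slope of ys on xs (assumed fitting method)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


def tail_averages(xs, ys, n=15):
    """Average y for the n lowest-x and the n highest-x team-seasons."""
    pairs = sorted(zip(xs, ys))  # sort team-seasons by Diff/Game
    worst = mean(y for _, y in pairs[:n])
    best = mean(y for _, y in pairs[-n:])
    return worst, best
```

Here `xs` would be each team-season's Diff/Game and `ys` the matching ERA-, FIP-, or whatever metric is being charted.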

Of course, we know that ERA and thus ERA- can be noisy. FIP- offers a little stabilization, and here we see a slightly stronger relationship, albeit still a weak one. The slope of our line is about -1.5, but the data sure looks scattered.

And here’s our strongest relationship of the three, where we just isolate strikeouts, walks, and fly balls. The slope, again, is about -1.5, and the average of the 15 worst teams in Diff/Game is 104 while the average of the 15 best teams in Diff/Game is 96. There’s a lot of noise here, still, and that’s to be expected, but it sure feels like we’re not measuring nothing. By this measure, at least, getting a more generous strike zone is helping pitching staffs prevent runs. When you put it that way it sounds silly and obvious, but one also has to notice the fairly weak relationship. In between the pitches that can be framed, there are a lot of other pitches, and it isn’t clear what effect a well-framed or poorly framed pitch has on the next pitch in the sequence.

Potentially of interest:

We hardly see any relationship between strike rate and Diff/Game. Yeah, the slope of the line is positive, but the correlation is tissue paper. This hints at the interconnectedness of it all, and speaks to the danger of thinking about pitch-framing in isolation. Not all pitches are borderline pitches.

For the record, if we go all the way back to the first chart, with ERA-, recall that the slope of the line is about -1.8. Over a full season, this works out to a little over ten runs saved for each additional strike per game over average. Since 2008, 34 teams have finished with a Diff/Game of at least 1.0, while nine teams have finished with a Diff/Game of at least 2.0. Two teams have finished at 3.0 or better. At the other end, 30 teams have finished at -1.0 or worse, with eight teams at -2.0 or worse and one team below -3.0. None of the worst 15 teams in Diff/Game finished with an average or better FIP-, which seems worth noting.
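The conversion from an ERA- slope to season runs is quick arithmetic, sketched below. ERA- is scaled so that 100 is league average, so a slope of -1.8 means roughly 1.8 percent fewer earned runs per extra strike per game. The 650 earned runs for a league-average team-season is an illustrative assumption, not a figure from the data above.

```python
def runs_saved_per_strike(slope_era_minus, league_avg_earned_runs):
    """Season runs saved per additional strike per game.

    slope_era_minus: ERA- points per extra strike per game (negative is better).
    league_avg_earned_runs: earned runs allowed by an average team in a season
        (an assumed input, not taken from the article).
    """
    return -slope_era_minus / 100 * league_avg_earned_runs


# With the assumed 650 earned runs, a -1.8 slope works out to about 11.7 runs.
estimate = runs_saved_per_strike(-1.8, 650)
```

Plug in a different run environment and the estimate moves with it, but anything in the neighborhood of a typical season lands a little over ten runs.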

The top team of the last five years in Diff/Game? The 2009 Atlanta Braves. The runner-up? The 2011 Atlanta Braves. In third place? The 2008 Atlanta Braves. The 2010 Atlanta Braves show up in sixth, and the 2012 Atlanta Braves show up in seventh. Brian McCann and David Ross have been all right, at least in this department. And in many other departments, too.

9 Comments


NatsFan73

11 years ago

Inasmuch as cherry-picking data is bad, if you were to imagine your first three graphs without the extreme handful of samples on the left and right ends, what little relationship there is almost vanishes. Perhaps this is because there may only be a relationship when a team gets A LOT of the calls going one way or the other. An occasional bad call is like an occasional bad pitch. You can work around it and it’s probably not going to make a huge difference in the end. But if you’re consistently getting bad calls you’re going to end up giving up more runs than you’d like.

Doug Lampert

Reply to NatsFan73

I’m looking at them and trying to ignore the end-points and lines, and I’m pretty sure I still see a weak correlation if you remove the 3 or so points furthest to the right or left. xFIP- especially is a tight enough grouping that I’m almost sure removing the extreme right or left won’t have much effect.
