## Probabilistic Pitch Framing (part 1)

Let’s take a look at some recent pitches and assess the framing job by the catchers.

Exhibit A: pitch #4 in this sequence from Freddy Garcia to Lucas Duda, as framed by Gerald Laird.

Hey, great framing job, Gerald Laird! That pitch was clearly a rulebook ball and you got a strike called for your pitcher. 1 point for you.

Exhibit B: pitch #3 in this sequence from Jeff Samardzija to Joey Votto, as framed by Dioner Navarro.

Boo, terrible framing job, Dioner Navarro! You just cost your pitcher what was clearly a rulebook strike! -1 point for you.

To the best of my knowledge, this is how most pitch-framing calculations currently work. We check to see if the pitch was in the zone, and give the catcher a positive or negative credit for pitches that were called differently from how they “should” have been called. But is that really answering the right question?

Consider the two (extreme, cherry-picked) examples above. In example A, a pitch was called a strike that was just off the outside corner of the plate to a left-handed hitter on a 3-0 count. It is almost certainly the case that no one in the ballpark was surprised at the result of that pitch. After all, we know that the strike zone as it is called to left-handed hitters extends a bit off the corner, and that on 3-0 counts the umpire tends to extend the strike zone a bit anyway. So should Gerald Laird get full credit for getting that pitch called a strike?

Exhibit B is the exact opposite case in many ways. We had an 0-2 count on a left-handed hitter, and the pitch was near the top of the strike zone. Given that the strike zone as it is called shrinks somewhat in an 0-2 count, and that it is shifted away to a left-handed hitter, the catcher was unlikely to get that call. So should Dioner Navarro get a full demerit for that pitch being called a ball?

Let’s do some crude calculations. The pitch to Duda was 0.974 feet from the center of the plate, and 2.01 feet off the ground. Since the start of the 2012 season, there have been (according to the best data I can find) 203 pitches to left-handed hitters in a 3-0 count that fell between 0.9 and 1.2 feet from the center of home plate (in the right-handed batter’s box) and ended up between 1.6 and 2.4 feet off the ground. *Over 77% of those pitches (157/203) were called strikes.* Laird shouldn’t get much credit at all for that frame job, right?

Similarly, let’s explore exhibit B. The pitch to Votto was 0.671 feet from the center of the plate and 3.341 feet off the ground. I can find 89 pitches that fell between 0.47 and 0.83 feet from the center of the plate (inside to a lefty, of course) and ended up between 3 feet and the top of the strike zone to left-handed hitters in an 0-2 count. Of these, 50 (56%) were called balls. So should we really be penalizing Dioner Navarro all that much for that frame job?
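The crude counts above amount to filtering called pitches by batter handedness, count, and a rectangular location box, then taking the share called strikes. Here is a minimal Python sketch of that idea. The `CalledPitch` record and its field names are hypothetical (not an actual pitch f/x schema), and the toy pitches are made up purely to show the call.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class CalledPitch:
    """One taken pitch. Hypothetical layout, not a real pitch f/x schema."""
    px: float            # horizontal offset from the center of the plate, feet
    pz: float            # height above the ground, feet
    batter_hand: str     # "L" or "R"
    balls: int
    strikes: int
    called_strike: bool  # True if the umpire called it a strike


def called_strike_rate(
    pitches: Iterable[CalledPitch],
    *,
    batter_hand: str,
    balls: int,
    strikes: int,
    px_range: Tuple[float, float],
    pz_range: Tuple[float, float],
) -> Tuple[int, int, Optional[float]]:
    """Return (called strikes, total pitches, rate) inside the location box.

    The rate is None when no pitch matches, so an empty bucket is never
    mistaken for a 0% strike rate.
    """
    lo_x, hi_x = px_range
    lo_z, hi_z = pz_range
    matching = [
        p for p in pitches
        if p.batter_hand == batter_hand
        and p.balls == balls
        and p.strikes == strikes
        and lo_x <= p.px <= hi_x
        and lo_z <= p.pz <= hi_z
    ]
    total = len(matching)
    strikes_called = sum(p.called_strike for p in matching)
    rate = strikes_called / total if total else None
    return strikes_called, total, rate


# Made-up pitches, only to exercise the function. These are not real data.
toy = [
    CalledPitch(1.00, 2.0, "L", 3, 0, True),
    CalledPitch(1.10, 2.2, "L", 3, 0, True),
    CalledPitch(0.95, 1.7, "L", 3, 0, False),
    CalledPitch(1.00, 2.0, "R", 3, 0, True),   # wrong handedness, excluded
    CalledPitch(1.00, 2.0, "L", 0, 2, False),  # wrong count, excluded
    CalledPitch(0.50, 2.0, "L", 3, 0, True),   # outside the box, excluded
]
k, n, rate = called_strike_rate(
    toy, batter_hand="L", balls=3, strikes=0,
    px_range=(0.9, 1.2), pz_range=(1.6, 2.4),
)
print(k, n, rate)
```

A hard-edged box like this is sensitive to where the edges are drawn and to small samples, which is part of why the approximation is only a crude one.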

As I hinted above, we have been answering the wrong question. We shouldn’t be comparing what a catcher did to the rulebook strike zone. We should be comparing what a catcher did *to the probability that the call would have gone the way it did anyway.* It doesn’t matter what the actual strike zone is; all that matters is how the umpires are calling it. This turns the calculation from a binary one (like the calculation of fielding percentage) to a probabilistic one (like the calculation of plus/minus). Under this system, Laird would have received a credit of 0.23 (1 minus the 0.77 chance of a strike call) for his frame, and Navarro a demerit of 0.44 (1 minus the 0.56 chance of a ball call) for his framing.
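One way to write that credit down: score each called pitch as the outcome (1 for a called strike, 0 for a ball) minus the estimated probability of a strike call. The sketch below assumes that scoring rule and plugs in the crude counts from this post. The function name `framing_credit` is made up for illustration, and the formal system in part 2 may differ.

```python
def framing_credit(called_strike: bool, p_strike: float) -> float:
    """Outcome (1 = called strike, 0 = ball) minus the estimated strike probability.

    Positive values credit the catcher; negative values are demerits.
    """
    if not 0.0 <= p_strike <= 1.0:
        raise ValueError("p_strike must be a probability between 0 and 1")
    return (1.0 if called_strike else 0.0) - p_strike


# Exhibit A: called strike; 157 of 203 comparable pitches were called strikes.
laird = framing_credit(True, 157 / 203)

# Exhibit B: called ball; 50 of 89 comparable pitches were called balls,
# so the estimated strike probability is 1 - 50/89.
navarro = framing_credit(False, 1 - 50 / 89)

print(f"Laird:   {laird:+.2f}")    # +0.23
print(f"Navarro: {navarro:+.2f}")  # -0.44
```

With this rule, a pitch that almost always goes the catcher’s way earns almost nothing, and a pitch that is a true coin flip is worth half a point either way.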

In part 2 of this series, we will actually go about constructing the formal system to do this so we don’t have to do crude approximations like the ones above (spoiler: it will look a lot like the excellent work Matthew did here). There will be math, yes, but there will also be lots of pretty pictures and maybe even an animated gif! In part 3, we will actually apply this system to see which catchers have done the best frame jobs since the start of 2012 (assuming I can associate catcher data to my pitch f/x data by then).

Huge thanks to MLB for making the pitch f/x data freely available (seriously, how awesome is that?), Mike Fast for teaching me how to make a pitch f/x database, and Brooks Baseball for making the images in this post. Also, thanks to you for reading this post and adding helpful, insightful comments below.

Some notes relayed to me in private:

1. My focus on the “rulebook” strike zone is wrong-headed. Ben Lindbergh uses Zone and O-Zone, which are slightly more sophisticated. Still, the point is the same: it’s just a binary “was it in the zone” test, and not what I’m proposing.

2. It appears someone *has* done what I will present, almost identically (kinda upset I didn’t know this before doing all this work). But I haven’t seen it written up on the web anywhere, so I’ll at least write up what I’m doing so that it will be freely available. And who knows, maybe there are subtle differences between the two.

I don’t care about what has been written up somewhere and is not available on the web; this is a great article, Kudzu. It should be reprinted in the main FanGraphs section, and you should be hired as a FanGraphs writer.

I can’t wait for your remaining articles.

Your approach is good, but I don’t understand why you only talked about one pitch for each of the at bats. In Exhibit A, you should be looking at pitches 1, 2, and 4. Pitch 3 is far enough outside that any called strikes in that location are probably just noise. In Exhibit B, pitches 1, 2, and 3 are all called pitches close enough to the zone to be given probabilistic values in your metric.

In determining the probability, you may also want to consider the pitch type and the handedness of the pitcher as well as the location and the count. This will make your sample sizes smaller, of course, but the extra accuracy may be worth it. You also should review Max Marchi’s articles on pitch framing if you haven’t already. He now writes for BP, but his initial article on framing may have been written for the Hardball Times.

Part 2 is coming out soon and will have more details, but *all* pitches received by a catcher will be given probabilistic values in the metric. I only talked about one pitch per at-bat here because I was cherry-picking the extreme ones.

Thanks for the comments.