Poking Some More at the Effects of Receiving

Used to be the hipster thing was to talk about pitch-framing, or pitch-receiving, and how it’s more important than it’s been given credit for. That was all well and fun, but people have a pretty good idea now, as the concept has gone borderline mainstream. And it turns out we don’t actually know that much about the effects, since it’s not as simple as calculating the difference between a ball and a strike. Of course, all else being equal, a good receiver is more valuable than a bad one, but we don’t know how much more valuable. The new hipster thing is to talk about receiving realistically. To distrust the idea of a guy being worth something like 50 runs above average. I live in Portland so you can trust me on my evaluation of hipster things.

Over the rest of this post, not everything is figured out. You could argue that very little is figured out, and so much more research could be done. Research by people with more time and way better technical skills. But I’ve decided to mess around with some numbers, and I’ll try to make this as reader-friendly as possible. I’m not going to lay out for you the true effects of good or bad pitch-receiving. Hopefully this’ll just make you think a little, before you think about something else.

Central to this post will be a home-brewed statistic known as Diff/100. In the past, I’ve used Diff/1000, and this is just that divided by ten because why not? Diff/100 is derived from numbers readily available here at FanGraphs. It’s the difference, per 100 called pitches, between actual strikes and expected strikes, based on zone rate and out-of-zone swing rate. Diff/100 is adjusted to set the league average every year at zero. A positive Diff/100 means a pitcher or team got more strikes than expected. A negative Diff/100 means a pitcher or team got fewer strikes than expected. A catcher who’s bad at receiving, like Ryan Doumit, would contribute to a negative Diff/100. I’ve written this paragraph so many times that I don’t know how many more times it’ll need to be written. Probably all of the times.
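For anyone who wants to play along at home, here is a minimal sketch of how a Diff/100 calculation might look in Python. The exact formula isn't spelled out in this post, so treat the details as assumptions: expected strikes are taken to be every pitch in the zone plus every out-of-zone pitch that drew a swing, called pitches are taken to be the pitches that weren't swung at, and the league adjustment is a plain unweighted average. The function and parameter names are made up for illustration.

```python
def raw_diff_per_100(pitches, strikes, zone_rate, o_swing_rate, swing_rate):
    """Actual minus expected strikes per 100 called pitches (unadjusted).

    Assumptions (not confirmed by the post):
      expected strikes = in-zone pitches + out-of-zone pitches swung at
      called pitches   = pitches the batter did not swing at
    """
    expected = pitches * (zone_rate + (1.0 - zone_rate) * o_swing_rate)
    called = pitches * (1.0 - swing_rate)
    return 100.0 * (strikes - expected) / called


def center_on_league(raw_values):
    """Shift raw values so the group average sits at zero.

    Assumption: a simple unweighted mean stands in for the league average.
    """
    mean = sum(raw_values) / len(raw_values)
    return [value - mean for value in raw_values]


# Placeholder inputs for illustration only, not a real pitcher or team.
example = raw_diff_per_100(
    pitches=1000, strikes=640, zone_rate=0.5, o_swing_rate=0.3, swing_rate=0.45
)
```

With those placeholder inputs, the expected strike count is 650 against 640 actual strikes, so the raw value comes out negative before any league adjustment.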

The first thing I looked at was simple, and on the team level. I split seasons and looked at every team from 2008 through 2012. For each team season, I calculated Diff/100, then I looked at the relationship between that and ERA-, FIP-, and xFIP-. This seems to call for a table:

Stat     r       Slope
ERA-     -0.2    -1.5
FIP-     -0.2    -1.2
xFIP-    -0.3    -1.2

Correlations exist, and as Diff/100 increases, ERA-, FIP-, and xFIP- decrease, slowly. Think of the slope as the gain or loss per one extra strike (per 100 called pitches). The highest Diff/100 in the sample belongs to the 2009 Braves, at +3.9. The lowest Diff/100 in the sample belongs to the 2011 Indians, at -3.9. Between extremes, that’s a difference of nearly eight strikes per 100 called pitches. But we don’t know why we might be seeing what we’re seeing. Pitchers, of course, have some effect on the way they’re received, and a staff with good command might come out looking better than a staff with worse command. I decided to dig into individuals, and now this gets a little more complicated. I promise I’ll be gentle.
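The r and slope columns in the table above are the sort of thing you get from a plain least-squares fit. Here is a rough sketch of that calculation, assuming an ordinary Pearson correlation and a simple one-variable regression (the post doesn't say which tools were actually used). The toy numbers are placeholders, not real team data. The last two lines just multiply the ERA- slope from the table by the gap between the two extreme teams, which is a face-value reading that the weak correlation argues against leaning on.

```python
import math


def pearson_r_and_slope(xs, ys):
    """Return (r, slope) for a least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx
    return r, slope


# Toy placeholder numbers, NOT real team seasons.
toy_diff = [-2.0, -1.0, 0.0, 1.0, 2.0]
toy_era_minus = [104.0, 100.0, 101.0, 99.0, 96.0]
r, slope = pearson_r_and_slope(toy_diff, toy_era_minus)

# Face-value gap implied by the table's ERA- slope across the sample extremes.
gap_in_diff = 3.9 - (-3.9)
implied_era_minus_gap = -1.5 * gap_in_diff
```

Read literally, the slope would put the 2009 Braves and the 2011 Indians a bit under 12 points of ERA- apart on receiving alone, which is exactly the kind of number to be suspicious of given how loosely the points follow the line.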

I went to the pitcher leaderboards, split seasons between 2008 and 2012, and set a minimum of 100 innings pitched. For every individual pitcher season, I calculated Diff/100. Then, for every pitcher who threw at least 100 innings in consecutive seasons, I calculated the change in Diff/100, along with the changes in ERA-, FIP-, and xFIP-. This left me with a pool numbering 387. Then I sorted the numbers by change in Diff/100, looking for the biggest changes both positive and negative.
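The pairing step described above can be sketched like this. It's a minimal version under stated assumptions: the input is a list of dictionaries with made-up key names (name, year, ip, diff100, era_minus, fip_minus, xfip_minus), and a pitcher counts once for every pair of back-to-back seasons that both clear the innings minimum.

```python
def season_to_season_changes(seasons, min_ip=100.0):
    """Build one record per pitcher per pair of consecutive qualifying seasons.

    `seasons` is a list of dicts with hypothetical keys:
    name, year, ip, diff100, era_minus, fip_minus, xfip_minus.
    """
    stats = ("diff100", "era_minus", "fip_minus", "xfip_minus")
    # Keep only seasons that clear the innings minimum, keyed by (name, year).
    qualified = {(s["name"], s["year"]): s for s in seasons if s["ip"] >= min_ip}
    changes = []
    for (name, year), first in sorted(qualified.items()):
        second = qualified.get((name, year + 1))
        if second is None:
            continue  # no qualifying season the following year
        change = {"name": name, "years": (year, year + 1)}
        for stat in stats:
            change["d_" + stat] = second[stat] - first[stat]
        changes.append(change)
    return changes
```

A pitcher with three straight qualifying seasons shows up twice in the output, once for each adjacent pair, which is how a five-year window can produce a pool in the hundreds.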

For example, between 2011 and 2012, Derek Lowe’s Diff/100 dropped by an incredible 10.2. Between 2008 and 2009, Mark Hendrickson’s Diff/100 increased by an also incredible 8.2. The way I figure, individual pitchers will have roughly constant command. Command will, of course, vary, but this is the best I can do to isolate the receiving component. Let’s look now at another table, isolating the 20 pitchers with the biggest Diff/100 drops, and the 20 pitchers with the biggest Diff/100 gains. Shown are their average season-to-season changes in ERA-, FIP-, and xFIP-.

Pitchers    Δ Diff/100    Δ ERA-    Δ FIP-    Δ xFIP-
20 drops    -4.7          10        8         7
20 gains    4.7           -5        -3        -2
Average     0.1           4         3         2

Unsurprisingly, the pitchers with the biggest drops in Diff/100 got worse. Meanwhile, the pitchers with the biggest gains in Diff/100 got better. On average, season-to-season ERA- increased by four points. For the 20 biggest drops, ERA- increased by ten points. For the 20 biggest gains, ERA- decreased by five points. And so on in that fashion. Sure enough, we don’t see no effect. Controlling for pitcher identity, it looks like receiving can make a real difference.
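The grouping behind that comparison is simple enough to sketch. This assumes each record is a dictionary of season-to-season changes with hypothetical keys (d_diff100, d_era_minus, d_fip_minus, d_xfip_minus), sorts by the change in Diff/100, and averages the two ends plus the whole pool.

```python
def average_changes_at_extremes(changes, group_size=20):
    """Average the stat changes for the biggest Diff/100 drops and gains.

    `changes` is a non-empty list of dicts with hypothetical keys:
    d_diff100, d_era_minus, d_fip_minus, d_xfip_minus.
    """
    stats = ("d_diff100", "d_era_minus", "d_fip_minus", "d_xfip_minus")
    ordered = sorted(changes, key=lambda c: c["d_diff100"])

    def averages(group):
        return {s: sum(c[s] for c in group) / len(group) for s in stats}

    return {
        "drops": averages(ordered[:group_size]),   # most negative changes
        "gains": averages(ordered[-group_size:]),  # most positive changes
        "all": averages(ordered),                  # whole pool, for reference
    }
```

The "all" row matters more than it looks. Since the whole pool got a little worse on average, the fair comparison for each group is against that baseline rather than against zero.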

But the correlations are very, very weak. Here’s an example chart, plotting the change in xFIP- against the change in Diff/100. And the change in xFIP- has the strongest correlation with the change in Diff/100. The r value is -0.12.


Look at the line and you see a trend. Look at the points and you see a bunch of points. It’s not that there’s no effect. It’s that the effect is small, and there’s a lot that goes on with pitchers. According to the slope, for each additional strike per 100 called pitches, xFIP- drops by about 0.6 points. But this is noisier than Motörhead at an airport.
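To put a rough number on "noisy," here is a small sketch that turns the r value into a share of variance explained and a back-of-the-envelope significance check. It leans on assumptions worth saying out loud: that the chart uses the full pool of 387 pitcher pairs, that those pairs are independent (they aren't quite, since some pitchers appear more than once), and that a normal approximation to the t distribution is close enough at this sample size.

```python
import math


def variance_explained(r):
    """Share of variance in one variable accounted for by the fitted line."""
    return r * r


def t_statistic(r, n):
    """t statistic for testing whether a correlation differs from zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


def two_sided_p_normal(t):
    """Two-sided p-value using a normal approximation to the t distribution."""
    return math.erfc(abs(t) / math.sqrt(2.0))


r = -0.12  # from the post
n = 387    # ASSUMPTION: the chart uses the full pool of 387 pitcher pairs
r_squared = variance_explained(r)
t = t_statistic(r, n)
p = two_sided_p_normal(t)
```

Under those assumptions, the line accounts for roughly 1.4% of the variation in the change in xFIP-, and the t statistic lands around -2.4. That fits the picture of an effect that probably isn't zero but explains very little of what happens to any individual pitcher.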

There’s no question in my mind that there’s a big gap between the best receivers and the worst ones. I don’t see much reason to believe a good receiver can save a pitcher season, or a bad receiver can cripple one. Presumably, if a pitcher has a bad receiver, he’ll find a means of compensating. If a pitcher has a good receiver, he might end up throwing more pitches out of the zone. Receiving matters. Of course it matters. It can’t not matter. But there’s a lot of work to do on figuring out how much it matters, and so as exciting as it is to look at, we should probably proceed with caution, just to be safe.

Jeff made Lookout Landing a thing, but he does not still write there about the Mariners. He does write here, sometimes about the Mariners, but usually not.

13 Responses to “Poking Some More at the Effects of Receiving”

  1. channelclemente says:

    There’s no effect. Look at the residuals, and it ought to support that fact.

  2. Timothy says:

    R² of 1.46% means we shouldn’t care at all about the slope of the line. At the tails maybe, but for the vast majority it has no effect.

  3. tz says:

    Portland = hipster nirvana. THAT explains why I can’t stand my daughter’s boyfriend

  4. wes says:

    It’s likely that two or three influential points are responsible for what little slope there is–I’d suspect the two points on the left immediately above the line (-11, 17 and -7, 12) and the one point immediately below the line on the right (8.5, -7). Take those away and any slope likely disappears.

  5. Neil says:

    The previous posters are right about the absence of a real effect here. But I’ve been interested in this since my little league coach taught me to frame pitches and I love your posts on the subject.

    Here’s what I’m thinking. You need control variables. Pitch framing has a small effect and it’s lost behind other factors that have large effects. Things like HR rate, change in velocity, etc. You want to control for everything else that matters so that you get an unbiased estimate for the effect of Diff/100.

    Keep it coming. These posts occupy my brain more than most.

    • Scott says:

      Seconded. While it’s easy to point out potential flaws in the methodology, they are just that: potential flaws. I am still willing to believe in a non-zero probability of some of those outliers being significant. It is common practice in hard science to drop the most anomalous data points in an otherwise tight set, chalking it up to experimental error, et cetera. However, I am not convinced that the outliers here aren’t in fact the most interesting potential case studies for further investigation.

  6. shapular says:

    Is this regression even significant? What’s the p-value?

  7. Hitler But Sadder says:

    Those sluts over at Baseball Prospectus would beg to differ.

  8. sambf says:

    Totally ignoring the question of whether this is statistically significant, and assuming–arbitrarily–that pitch framing counted for all of the change in ERA- and no more, then ~5 points in ERA- between average and very good means ~.20 points in ERA, which is ~30 runs per season per team: roughly in line with the 50 run number quoted for Molina.

    (Though, again, this is ignoring whether this correlation is actually real and causal.)

  9. Greg says:

    Interesting article, but isn’t BABIP influenced by count? I would assume players forced to protect the strike zone and swing at pitches outside the zone, on average, make at least somewhat worse contact. I can’t remember where to find league batted ball splits by count, but if that’s correct, then FIP and xFIP really aren’t ideal here.
