All we want are numbers that matter, beyond the numbers that matter. Wins and losses are already in the books, and in certain cases teams have already significantly changed their own playoff odds, but we’re all waiting for the point at which we can do some meaningful analysis. The sample sizes thus far are incredibly small, and this is a big reason why people are paying so much attention to pitcher fastball velocities — that’s one of the only things that stabilizes almost immediately. Velocity is entirely up to the one guy. The numbers that stabilize fastest tend to be the numbers relying on the fewest players.
But you can also look at numbers that build sample sizes quickly. Like, say, pitch-by-pitch numbers, since there are hundreds of pitches in each game. What that suggests is that it’s not too early to look at 2014 pitch-framing statistics, and there’s evidence that framing performance carries over well even in small samples. And in the early, early going, Yankees pitchers have worked with the most favorable strike zone, while Cubs pitchers have done the very opposite of that. I hope you like framing content, because this summer we’re probably going to beat it to death. So, actually, I hope you don’t like framing content? Whatever, here comes data.
Following is a table of called strikes above or below the expected average, given pitch locations. Data’s shown on the team level, and it comes courtesy of Matthew Carruth’s pitch-framing model. The average run value of an extra strike is in the vicinity of 0.14 runs, but pitches get called strikes and balls under varying circumstances, so at this point I’m personally more interested in just the strikes totals.
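If you do want to put a rough run figure on a strikes total, the arithmetic is just a multiplication. Here is a minimal sketch, assuming a flat 0.14 runs per extra strike, which glosses over the count and context effects mentioned above. The function name and the team totals are made up for illustration and are not taken from the table.

```python
# Assumption: every extra called strike is worth the same ~0.14 runs,
# which ignores count and game context.
RUNS_PER_EXTRA_STRIKE = 0.14


def extra_strikes_to_runs(extra_strikes, runs_per_strike=RUNS_PER_EXTRA_STRIKE):
    """Convert strikes above (+) or below (-) expectation into approximate runs."""
    return extra_strikes * runs_per_strike


# Hypothetical totals, for illustration only.
for label, strikes in [("Team A", 10), ("Team B", -10)]:
    print(label, round(extra_strikes_to_runs(strikes), 2))
```

Ten extra strikes comes out to roughly a run and a half, which is why the raw strikes totals are the more interesting thing this early.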
“Sample” refers to the number of called pitches, those being balls or strikes not swung at. Missing, of course, is data from the two games played in Sydney. Samples range from just a little over 400 to a little over 700, so it’s clearly early — I don’t need to tell you it’s early — but while all these numbers need to be regressed, so far the Cubs have been giving away the most strikes, while the Yankees have been earning the most extra strikes. The Cubs are in last by seven pitches, and while the Yankees are in first by just three, their sample is also 95 pitches smaller than Houston’s.
Is what we see surprising? Not that much. Francisco Cervelli has been a good receiver in the past, and the Yankees brought in Brian McCann, who’s long been one of the best. Welington Castillo has a history of being below-average. The Rays, Brewers, Padres, and Pirates are up near the top, as we’d expect. It’s not a shock to see the Rockies, Twins, and Dodgers around the bottom. Of course — of course — there’s noise here, and the order is going to change between now and the end of the season, but it’s remarkable to me how quickly some of these framing numbers are getting toward their own level.
Carruth measures strikes above or below average per game. I found all the individual catchers who caught at least 1,000 called pitches a year ago, and who have also caught at least 100 called pitches this year. Today is April 9, and there are more than 150 games left for every team, but we can already see some pretty good agreement, even without any sort of adjustment made for pitcher identities:
Already, we get an r value of 0.60. For the sake of comparison, I looked at every hitter who batted at least 200 times last year, and who has batted at least 30 times this year. For walk rate, there’s an r value of 0.34. For strikeout rate, there’s an r value of 0.43. For ISO, there’s an r value of 0.33. For wRC+, there’s an r value of 0.12. We know not to think too much about early-season wRC+: it doesn’t relate well to the recent past, so it doesn’t have much predictive power. There’s already a fairly strong relationship between last year’s framing numbers and this year’s, and that suggests it isn’t too soon to put some stock in them.
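For anyone who wants to run the same kind of check, here is a rough sketch of the comparison: filter to catchers who meet both sample minimums, then compute a Pearson r between last year’s and this year’s per-game rates. The field names and the catcher records are hypothetical placeholders, not Carruth’s data or his code.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)


def qualified_pairs(catchers, min_last=1000, min_this=100):
    """Keep (last-year rate, this-year rate) for catchers meeting both minimums."""
    return [
        (c["rate_last"], c["rate_this"])
        for c in catchers
        if c["called_last"] >= min_last and c["called_this"] >= min_this
    ]


# Hypothetical records: strikes above average per game, plus called-pitch samples.
catchers = [
    {"rate_last": 1.2, "rate_this": 1.5, "called_last": 5200, "called_this": 310},
    {"rate_last": -0.8, "rate_this": -0.2, "called_last": 4100, "called_this": 250},
    {"rate_last": 0.3, "rate_this": 0.9, "called_last": 900, "called_this": 400},  # fails last-year minimum
    {"rate_last": 0.1, "rate_this": -0.4, "called_last": 3300, "called_this": 180},
]

pairs = qualified_pairs(catchers)
last_year, this_year = zip(*pairs)
print(len(pairs), "catchers, r =", round(pearson_r(last_year, this_year), 2))
```

The same function works for the hitter comparison. Swap in walk rate or wRC+ and change the minimums to 200 and 30 plate appearances.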
Let’s do a quick comparison. Here are two charts, the first showing called balls and strikes thrown by the Yankees, and the second showing called balls and strikes thrown by the Cubs. The black zone box is just a simple approximation, and not to be read into too deeply.
Almost everything in the zone thrown by the Yankees has been a strike, and there’s considerable extension to the left, presumably capturing some of those lefty strikes. With the Cubs, there are more balls in the zone, and there are fewer strikes out of it. Carruth compares data to the strike zones as they’re actually called by umpires. On the team level, the Yankees have had about 6% in-zone balls, and about 11% out-of-zone strikes. The Cubs have had about 20% in-zone balls, and about 7% out-of-zone strikes. It’s early, but not too early to think that could be meaningful.
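The two percentages are simple to tally once every called pitch has a location and a call. The sketch below is a simplification: it assumes a fixed rectangular zone, much like the black box in the charts, whereas Carruth compares against the zone as umpires actually call it. The zone dimensions, the function names, and the sample pitches are all assumptions for illustration.

```python
def in_zone(px, pz, half_width=0.83, bottom=1.5, top=3.5):
    """True if a pitch crosses inside an assumed rectangular zone (units: feet).

    px is horizontal distance from the center of the plate, pz is height.
    The default dimensions are rough placeholders, not an official zone.
    """
    return abs(px) <= half_width and bottom <= pz <= top


def zone_rates(pitches):
    """Return (in-zone ball rate, out-of-zone strike rate) for called pitches.

    pitches: iterable of (px, pz, is_called_strike) tuples.
    A rate is None when there are no pitches in that bucket.
    """
    in_total = in_balls = out_total = out_strikes = 0
    for px, pz, is_strike in pitches:
        if in_zone(px, pz):
            in_total += 1
            if not is_strike:
                in_balls += 1
        else:
            out_total += 1
            if is_strike:
                out_strikes += 1
    in_zone_ball_rate = in_balls / in_total if in_total else None
    out_zone_strike_rate = out_strikes / out_total if out_total else None
    return in_zone_ball_rate, out_zone_strike_rate


# Hypothetical called pitches: (px, pz, is_called_strike).
sample = [
    (0.0, 2.5, True),
    (0.0, 2.5, False),   # ball called on a pitch inside the box
    (2.0, 2.5, True),    # strike called on a pitch outside the box
    (2.0, 2.5, False),
    (2.0, 2.5, False),
    (2.0, 2.5, False),
]
print(zone_rates(sample))
```

With a fixed box like this, the rates will differ from Carruth’s, since his zone bends to match how umpires call it for lefties and righties.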
As always, I have to note that there’s a pitcher-command component at play here, and I don’t think it would be unreasonable to suggest Yankees pitchers, on average, have had and will continue to have better command than Cubs pitchers. Also, it’s early enough that there could be some umpire influence in the data. But I’m comfortable asserting this much: based on the data we already have, it looks like the Yankees will pitch to a favorable strike zone. And it looks like the Cubs will not. And in between the present extremes, there are other numbers worth paying attention to. We’ll revisit this pretty often over the course of the year, but for the time being, there’s probably more substance in this leaderboard than in most of the other ones you can find around.