Would be pretty interested in seeing what the avg. BABIP skill level was of opposing batters in games Slaught caught vs. games LaValliere caught. The two catchers would never have a truly random sample of the Pirates overall schedule, as opposing teams with a lot of LHP starters would draw Slaught usually, for example. If these LHP-heavy teams happened to have a high-BABIP offense, the apparent correlation between catcher and BABIP could show up.
Also, the analysis assumes that both catchers had team-average defense around them. This is probably not the case if there were any other frequent platoons on the Pirates. I.e., it’s possible that Slaught had better teammate defense than LaValliere did if, on average, the Pirates had better defenders who were RHBs.
If you’d normally require p < 0.05 with one comparison, with 14 comparisons you might require p < .0036. If you'd use p < .2, then with 14 comparisons you'd need p < .016. Although .2/14 (about .014) will get you a rough estimate, I think it's probably better to do 1 - (1 - .2)^(1/14).
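As a quick check on that arithmetic, here is a minimal Python sketch comparing the rough alpha/m estimate (usually called the Bonferroni correction) with the 1 - (1 - alpha)^(1/m) formula (usually called the Šidák correction, which assumes the comparisons are independent). The function names are made up for illustration.

```python
def bonferroni_threshold(alpha: float, m: int) -> float:
    """Rough per-comparison cutoff: alpha divided by the number of comparisons."""
    return alpha / m


def sidak_threshold(alpha: float, m: int) -> float:
    """Per-comparison cutoff from 1 - (1 - alpha)^(1/m); assumes independent tests."""
    return 1.0 - (1.0 - alpha) ** (1.0 / m)


if __name__ == "__main__":
    # 14 is the number of comparisons discussed in the comment above.
    for alpha in (0.05, 0.2):
        rough = bonferroni_threshold(alpha, 14)
        exact = sidak_threshold(alpha, 14)
        print(f"alpha={alpha}: alpha/14 = {rough:.4f}, 1-(1-alpha)^(1/14) = {exact:.4f}")
```

With 14 comparisons the two cutoffs come out to roughly .0036 vs. .0037 at the .05 level and .0143 vs. .0158 at the .2 level, so the rough estimate is always a little stricter.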
So, if I'm interpreting your numbers correctly, there's really no evidence of a catcher effect.
Strike Three: Do MLB Umpires Express Racial Bias in Calling Balls and Strikes?
07/01/2011 | 12:33 pm
Our paper on discrimination in baseball has finally been published (the June issue of the American Economic Review). While it received a lot of media and scholarly comment in draft, the final version contained a whole new section. The general idea is that those discriminated against will alter their behavior to mitigate the impacts of discrimination on themselves. But while reducing the impacts, these changes are not costless. For example, if you’re an Hispanic pitcher and think that the white umpire is against you, you’ll change your pitches. Where will you throw? How will you throw?
The paper shows that the pitcher will avoid giving the umpire a chance to use his discretion in judging a pitch. More pitches go into the strike zone, more are clearly balls. More are fastballs, fewer curves and change-ups. A rational response, but by avoiding the umpire’s discrimination the pitcher makes it easier for the batter to hit the ball or to walk. Here’s the abstract:
Major League Baseball umpires express their racial/ethnic preferences when they evaluate pitchers. Strikes are called less often if the umpire and pitcher do not match race/ethnicity, but mainly where there is little scrutiny of umpires. Pitchers understand the incentives and throw pitches that allow umpires less subjective judgment (e.g., fastballs over home plate) when they anticipate bias. These direct and indirect effects bias performance measures of minorities downward. The results suggest how discrimination alters discriminated groups’ behavior generally. They imply that biases in measured productivity must be accounted for in generating measures of wage discrimination.
“That was a lot of work to prove nothing. How did this make it to the Big Blog?”
Statistically/scientifically “disproof” is often just as important as “proof.” This really shows that catchers don’t have much effect on pitchers. Those p-values are very high given the number of comparisons made, as was pointed out before. The only significant results in these tests are the pitcher differences (which is to be expected).
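To put a number on why the count of comparisons matters, here is a small Python sketch. It assumes the 14 comparisons mentioned earlier in the thread are independent and that there is no real catcher effect at all; the function name is hypothetical.

```python
def familywise_error_rate(alpha: float, m: int) -> float:
    """Chance that at least one of m independent null tests shows p < alpha."""
    return 1.0 - (1.0 - alpha) ** m


if __name__ == "__main__":
    # 14 comparisons, each judged at the usual .05 level
    print(f"{familywise_error_rate(0.05, 14):.3f}")  # prints 0.512
```

Under those assumptions, there is about a 51% chance that at least one comparison looks "significant" at .05 purely by luck.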
Only concern is this: You had a large sample of games with two very different catchers. But what you didn’t have was a large sample of pitchers. I still think it is possible that SOME pitchers are more comfortable with a specific catcher. If Varitek and Salty caught 100 pitchers, you’d see no statistical difference. But maybe Beckett would be one of the ones who does much better with Varitek because he’s very comfortable with him. Also I would suspect that pitchers may perform worse with a catcher they are not at all familiar with. I.e., maybe Boston’s struggles were an adjustment period to Salty, same as Lincecum struggling to deal with going from Molina to Posey last year and now Posey to a new guy yet again this year.
Jim Leyland is a lover of platoons. In 1990, for example, there were three L/R platoons in use by Pittsburgh. Slaught and LaValliere split time behind the plate, starting 61 and 87 games, respectively. Likewise, first basemen Sid Bream (L/L) and Gary Redus (R/R) started 100 and 58 games, respectively. And at the hot corner, Wally Backman (L/R) split time with Jeff King (R/R) 68 games to 86. As one would expect, the LaValliere/Bream/Backman and Slaught/Redus/King trios appear more often than not in the 1990 Pirates defensive lineups (http://www.baseball-reference.com/teams/PIT/1990-lineups.shtml). It should also be noted that Redus was pulled late in over 2/3 of his 1B starts, while Backman completed fewer than 50% of his starts.
The Pirates’ pennant push in 1991 was messier from a platoon standpoint. LaValliere started 100 games to Slaught’s 53, but this season saw LaValliere complete just 77 of his 100 starts. Orlando Merced replaced Sid Bream as the lefty half of the 1B platoon, starting 95 games at the position (completing 85 and playing in some capacity in an additional 10 for 105 total). Redus (43 starts, 33 CG) remained the primary righty platoon member, with Lloyd McClendon (15 starts, 11 CG) also spending a bit of time there. Third base and right field were messy that year, with Bobby Bonilla spending the first month as the everyday RF, shifting to everyday 3B in May, and platooning with R/R John Wehner at 3B and L/R Gary Varsho in RF until the late August acquisition of everyday 3B Steve Buechele, at which point Bonilla moved firmly back to RF. LaValliere/Merced and Slaught/Redus match up in defensive lineup frequency. King, Bonilla and Buechele all had stretches of being the primary 3B, while Bonilla/Wehner was also a short-term platoon. Switch-hitting Mitch Webster also picked up 15 RF starts from mid-May through the end of June (.565 OPS for his 2nd of 3 1991 teams).
More of the same in 1992, as LaValliere and Merced each started 87 games, while Slaught started 65 behind the plate and righties Redus and King combined for 61 starts. Buechele was the everyday 3B until a midseason trade freed up the spot for King, who gave some time to Wehner, but not in a significant L/R platoon. The post-Bonilla era in RF saw 9 starters, of whom 6 amassed between 100 and 400 innings played at the position.
Through these 3 seasons, Jose Lind and Jay Bell were constants up the middle, while Bonds and Van Slyke manned 2/3 of the OF. Right field and third base saw a lot of fluctuation. And first base was as much a pure platoon situation as the catcher spot. There’s definitely a chance that these other platoons had an effect on the BABIP numbers, but it’s also worth considering how any other teams’ L/R platoons might have affected Drabek/Walk as opposed to Tomlin/Smiley/Smith. Those splits were out of the starting catcher’s control, as well as the 1B’s, thus leaving to chance whether a better defensive C and 1B (LaValliere/Bream) would be in the lineup or not against a team stacked with lefties against Drabek/Walk or the better 3B (not Bonilla or Backman) would be in there against a righty-heavy lineup facing Tomlin/Smiley/Smith. That could certainly have an effect that is outside the study of simple SP/starting catcher relationships and would merit further study once we have better defensive metrics derived from early-90s game footage.
There have been a couple of great articles recently on THT that use PITCHf/x data (which is much more granular than what you’re doing here) and found that a catcher with good strike-framing abilities can have an impact of up to a full additional win above replacement over the course of a season (which is shorter for catchers). Before you ask, yes, they normalized for pitcher strike zone, batter strike zone, and umpire strike zone.
As mentioned above, those catcher p-values are all insignificant. I don’t mean to “pour on the haterade,” because I really thought this was an interesting read, but the reason they’re insignificant is rather subtle, and even people with serious training in statistics often mess this up. Here’s a basic explanation of the mistake: http://xkcd.com/882/
Comment by biscuit pants — August 12, 2011 @ 7:34 pm