How Do We Assign Credit for Catching Base Stealers?
So, who’s in charge here? In 2004, Yadier Molina made his major league debut alongside Chris Carpenter’s Cardinals debut, the beginning of one of the most successful batteries in recent baseball history. In the decade since, Carpenter has owned a 65 percent caught stealing rate; during the same time, Molina has caught roughly 40 percent of would-be base stealers. While Carpenter had a quick move to the plate and excellent control, many would argue that it was Molina’s presence behind the plate that led to his success (in all facets of the game). But could it be the other way around? Did Molina benefit from Carpenter’s presence on the mound? If so, how much?
As with most interactions in baseball analysis, there is no clear way to distribute credit for the outcome. It is much debated how the value of a caught stealing should be divided among the three parties involved (the pitcher, catcher and baserunner). How much of a caught stealing should we grant to the pitcher on the mound? Does the credit lean toward the pitcher or the catcher? However unclear the picture looks on the surface, the answer depends on numerous, closely related factors.
Say Carpenter (if he were still active) is on the mound and the Cardinals manage to put away Billy Hamilton on an attempted steal of second. In this situation, we assume that Molina is the main cause, because his talent at throwing runners out is well-known, tested, and consistent across many different battery mates. In singular cases like this, it is easy to say which party is at fault or deserves credit: Molina is one of the few guys with the ability to throw out Hamilton. Whether through scouting information (pop times, move-to-mitt, the runner’s jump) or play-by-play data, we know Molina has the tangibles and the record to pull it off.
The Molina-Hamilton showdown is the most extreme example I could think up. But imagine the normal spectrum where average pitcher meets average catcher meets average baserunner. There, it isn’t quite as easy to determine credit objectively. This is usually where we use modeling to make sense of it.
Where Baseball Meets Basketball Analytics
Most of my research revolves around the battery dynamic — as I call it “The Battery Effect” — or the idea that there are so many confounding variables in a single battery interaction (a wild pitch or a stolen base attempt) that it’s wrong to assign all credit to one party.
For some reason, baseball decided to credit all caught steals and passed balls to the catcher and all wild pitches to the pitcher, instead of divvying responsibility “n-times” among “n-parties”. My previous research and the correlations suggest that pitchers “control” almost every battery outcome, but as you’ve heard before, correlation does not always equal causation. So I’ve been searching for a better way to adjust for “The Battery Effect” and assign credit more responsibly between the pitcher and catcher.
Nik Oza, founder of the Georgetown Sports Analysis, Business and Research Group, turned my attention to a popular basketball analytics model, Regularized Adjusted Plus-Minus (RAPM). RAPM adjusts each player’s plus-minus (the cumulative score margin while the player is on the court) to reflect the quality of his teammates and opposition, while accounting for home court advantage. Without RAPM, plus-minus (+/-) suffers from many of the same problems we observe in the battery setting, like collinearity between teammates and lineups. These problems are inherent in any team sport, where the line between one player’s contributions and another’s influence is blurred. While baseball for the most part is a collection of individual contributions, perhaps no other situation is more similar to that of a team sport than the exchanges among the pitcher, the catcher and the baserunner, so a RAPM framework in this setting makes sense. The question is, who affects whom? What are the player’s true contributions independent of his teammates?
Note: Feel free to skip ahead past this section if you’re not interested in the mathematical details.
Introducing Regularized Adjusted Plus-Minus For Battery-Related Outcomes
The idea behind any RAPM framework, like the one suggested by Nik Oza, is a ridge regression, which penalizes coefficients and regresses them towards zero. The penalty factor is known as lambda; the lambda that minimizes the error of the model is chosen, and it can be computed through a program like R. The ridge regression can adjust for collinearity of variables and decrease the variability of the prediction while sacrificing some validity (introducing bias).
Say that pitcher A has pitched only with catcher B behind the plate, or a majority of pitcher C’s innings have come with catcher D: the ridge regression does a good job of adjusting for the relationship between the two battery mates. With that in mind, each pitcher, catcher and baserunner becomes an independent variable in our model, with their respective coefficients reflecting the impact of their presence on a battery outcome. However, because the ridge regresses towards zero, in the future I would like the model to take on a more Bayesian approach by regressing with some prior (determined by pitcher handedness) in mind and a minor league prior for rookies. This can be achieved without a RAPM framework: in the past, Jared Cross, of Steamer Projections, has used a Mixed Effects Model to assign credit among battery mates while adjusting for batting order.
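To make the shrinkage concrete, here is a minimal sketch of a ridge fit in plain Python. It is not the glmnet code used for this article; the toy data, the two-column layout (a pitcher indicator and a catcher indicator that usually appear together) and the helper names `solve` and `ridge_fit` are all assumptions for illustration.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ridge_fit(rows, y, lam):
    """Coefficients minimizing ||y - X b||^2 + lam * ||b||^2 (no intercept)."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    for i in range(p):
        xtx[i][i] += lam  # the ridge penalty is added to the diagonal
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)


# Made-up opportunities: columns are [pitcher_A present, catcher_B present].
# The two mostly appear together, which is the collinearity problem.
rows = [[1, 1], [1, 1], [1, 1], [1, 0], [0, 1]]
y = [1, 0, 1, 1, 0]  # 1 = caught stealing on that opportunity (made up)

for lam in (0.0, 1.0, 10.0):
    print(lam, ridge_fit(rows, y, lam))
```

As lambda grows, both coefficients are pulled towards zero, which is the bias-for-variance trade described above. A real fit would pick lambda by cross-validation rather than by hand.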
Many other variables could affect a stolen base attempt, in addition to the battery or baserunner:
 Outs
 Score margin
 Base state
 Inning
 Wild pitch
 Passed ball
 Pickoff
 Pitcher handedness
 Hitter handedness
With those factors in place, I added them to the model and pulled all opportunities from the last three years of Retrosheet play-by-play. I grouped each line of my data by opportunity, where a runner was on first or second with no lead runner ahead. Later, I created dummy variables for every possible baserunner, pitcher and catcher. With the other factors above, I plugged and chugged these numbers through the R package “glmnet”, adopting and amending Jacob Frankel’s code used to calculate RAPM found here. (My code can be found here for replication’s sake.)
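For readers unfamiliar with dummy variables, the sketch below shows one way to turn a row per opportunity into indicator columns for the runner, pitcher and catcher. The records, the field names and the function `build_design_matrix` are hypothetical; the article’s own pipeline was written in SQL and R against Retrosheet data.

```python
ROLES = ("runner", "pitcher", "catcher")


def build_design_matrix(opportunities):
    """One-hot encode the runner, pitcher and catcher of each opportunity."""
    columns = sorted({(role, opp[role]) for opp in opportunities for role in ROLES})
    index = {col: i for i, col in enumerate(columns)}
    matrix = []
    for opp in opportunities:
        row = [0] * len(columns)
        for role in ROLES:
            row[index[(role, opp[role])]] = 1  # mark who was involved
        matrix.append(row)
    return columns, matrix


# Made-up opportunities with placeholder player labels.
opportunities = [
    {"runner": "runner_1", "pitcher": "pitcher_A", "catcher": "catcher_B"},
    {"runner": "runner_2", "pitcher": "pitcher_A", "catcher": "catcher_B"},
    {"runner": "runner_1", "pitcher": "pitcher_C", "catcher": "catcher_D"},
]

columns, matrix = build_design_matrix(opportunities)
for col in columns:
    print(col)
for row in matrix:
    print(row)
```

Each row has exactly three ones, one per role. The situational factors listed above (outs, score margin and so on) would be appended as extra columns.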
The response I sought was caught steals per opportunity (what we will call adjCS+/-). I applied a similar framework to stolen base attempts, to get adjSBA+/-. We don’t want stolen base attempts in the denominator of our response variable, because we don’t want to assume the same frequency of attempts for all parties involved. Instead, we know that attempting a stolen base against one catcher/pitcher tandem is much different than against another. For this reason, I grouped by opportunity, or the number of times a baserunner was in a selected base state against the battery. I included only base states where the runner could steal the base ahead of him, given no one was directly in front of him on the basepaths.
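The difference between the two denominators is easy to see with raw counts. The sketch below uses Yadier Molina’s CS, SBA and Opp figures from the catcher table in the Results section. These are unadjusted rates, not the regression coefficients adjCS+/- and adjSBA+/-, and the variable names are my own.

```python
# Raw counts from Yadier Molina's row in the catcher table.
caught_stealing = 160
attempts = 399
opportunities = 9866

cs_per_attempt = caught_stealing / attempts            # the familiar CS%
cs_per_opportunity = caught_stealing / opportunities   # the per-opportunity basis
attempts_per_opportunity = attempts / opportunities    # how often runners even try

print(f"{cs_per_attempt:.1%}")            # 40.1%
print(f"{cs_per_opportunity:.4f}")        # 0.0162
print(f"{attempts_per_opportunity:.4f}")  # 0.0404
```

Per attempt, a catcher is credited only when runners choose to test him. Per opportunity, the rarity of those attempts is part of the measurement.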
Back to the Carpenter dilemma. As we mentioned, Carpenter did have immense success keeping runners from stealing, with 0.25 caught steals per stolen base attempt above average. However, when converting his plus-minus into an expected caught stealing percentage (xCS%), we see a large differential between his observed success and his adjusted performance. The difference between his CS% and his xCS% is nearly 20 percentage points (65 percent versus 48 percent). So when adjusting for the quality of his backstop (mostly Molina), his adjusted performance nets three runs fewer than his career track record suggests. (Note: to derive xCS% I used caught stealings divided by attempts as the response variable; the rest of the article uses caught stealings divided by opportunities to keep it consistent with SBA+/-.)
Oh yeah, that’s three runs over a decade. So even when observing the most extreme drop-off between unadjusted CS% and expected CS% for a pitcher, we don’t see even half a win as a consequence. Simply put, the pitcher does not accumulate enough stolen base attempts against him (unless his reputation is awful) for his expected performance to mean much. While a pitcher may “control” most battery outcomes, it’s not what they are sought out or selected to do, and for good reason. Meanwhile, some catchers and their pitchers are not as far off in contribution as the eye would suggest.
The opposite is true of a catcher. He accumulates many more attempts against him by virtue of his job: to sit behind the plate for a thousand-plus innings a year. However, a catcher’s value is not only in his arm but in his ability to reduce attempts through his reputation; the same is true for certain feared pitchers. So it is also important to introduce a statistic that objectively defines a player’s reputation, or the effective number of stolen base attempts added or subtracted per opportunity (adjSBA+/-).
Overall, the predictability of both adjSBA+/- and adjCS+/- was weaker than that of previous seasons’ caught stealing percentage. I didn’t expect this method to be predictive in the first place. My reasoning is that if this were an independent measure of pitcher/catcher defensive contribution, then year-to-year caught stealing percentage would not reflect their “defensive skill”, since the remaining influences of last year’s similar battery and similar environment will remain.
However, I expected that when evaluating team switchers, their independent evaluation metrics would overtake caught stealing percentage in predictability. To test this, I took all team switchers from 2003-2013 and ran a regression of year two caught stealing percentage on year one adjCS+/- and adjSBA+/-. Only 40 catchers had switched teams from 2003-2013 with at least 500 opportunities with runners on the base paths. The year-to-year relationship between their caught stealing percentages was a mere 2 percent, compared to 8 percent between adjCS+/- and CS%. So, in all honesty, this is not a predictive tool to begin with. Descriptively, adjSBA+/- could explain 70 percent of the variation in a pitcher’s CS-runs in the same year, but there were few other relationships of interest. Instead, I find interest in its use as a descriptive tool to tell us who affected the battery’s performance overall, which we will come back to later.
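For anyone who wants to repeat the team-switcher check, a one-predictor version can be sketched as below. It assumes the percentages quoted above are R-squared values, which for a single predictor is the squared correlation. The numbers are invented and the function name `r_squared` is my own.

```python
def r_squared(x, y):
    """R^2 of a one-predictor linear regression (squared Pearson correlation)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Made-up values for five hypothetical catchers who changed teams.
year_one_adj_cs = [0.010, -0.004, 0.002, -0.008, 0.006]
year_one_cs_pct = [0.35, 0.22, 0.30, 0.25, 0.28]
year_two_cs_pct = [0.31, 0.27, 0.24, 0.22, 0.30]

print("adjusted metric vs. next-year CS%:", r_squared(year_one_adj_cs, year_two_cs_pct))
print("CS% vs. next-year CS%:", r_squared(year_one_cs_pct, year_two_cs_pct))
```

With only 40 qualifying catchers, any such comparison will be noisy, so small gaps between the two values should not be over-read.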
Results
Below are the 2011-2013 numbers. (Note: Retrosheet removes pickoffs from caught stealings, so caught stealing percentage is without pickoffs while the regression included them. See Mark Buehrle for the discrepancy between the two.*)
Of course, “RunsAdded” is just an estimate, mostly because of a rough estimation of SBA value, which can be debated in the comment section.
Here I took adjCS+/- and adjSBA+/- and factored in the number of opportunities each catcher/pitcher had (meaning runners in our selected base states). Multiplying these metrics by opportunities gives us a rough aggregate of their contribution (CSAdded and SBAAdded). The rough assumption in turning this into runs is the value of an SBA added or subtracted. In general, the fewer stolen bases the better, since we are not testing the probability of a success or failure. So I took the difference between SBAAdded and CSAdded and multiplied by the value of a stolen base. Say that a player added two caught stealings and -50 stolen base attempts. Then, technically, he added -52 stolen bases. I know this is a rough way of assigning a run value, but it will do for now until we can objectively define the value of a stolen base attempt added or subtracted, which I would like to be related to individual catcher/pitcher break-even points, much like this framework.
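The arithmetic in that paragraph can be sketched as follows. I read the worked example as two caught stealings added and 50 stolen base attempts subtracted, which is an assumption about sign. The run value per stolen base is a hypothetical placeholder, since the article does not state the figure it used, and the function names are my own.

```python
# Hypothetical placeholder: the article does not give its run value per stolen base.
ASSUMED_RUN_VALUE_PER_SB = 0.2


def aggregate(rate_per_opportunity, opportunities):
    """Turn a per-opportunity plus-minus into a counting total."""
    return rate_per_opportunity * opportunities


def stolen_bases_added(cs_added, sba_added):
    """Difference between attempts added and caught stealings added."""
    return sba_added - cs_added


def runs_saved(cs_added, sba_added, run_value=ASSUMED_RUN_VALUE_PER_SB):
    """Assumed sign convention: fewer stolen bases means more runs saved."""
    return -stolen_bases_added(cs_added, sba_added) * run_value


net = stolen_bases_added(cs_added=2, sba_added=-50)
print(net)                  # -52
print(runs_saved(2, -50))   # positive, because stolen bases were taken away
```

Swapping in a different run value, or separate values for attempts and caught stealings, only changes the last step.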
Catchers 2003-2013

Name  CSAdded  SBAAdded  RunsAdded  CS  SBA  Opp  CS%* 
Yadier Molina  1  196  49.8  160  399  9866  40% 
Ivan Rodriguez  10  116  35.8  122  397  8299  31% 
Matt Wieters  16  145  25.8  92  322  5318  29% 
Gerald Laird  37  3  23.4  116  327  4202  35% 
Miguel Montero  14  53  22.1  81  303  5320  27% 
Ramon Hernandez  22  25  20.6  121  516  7464  23% 
Salvador Perez  14  42  19.3  33  93  1672  35% 
Jose Molina  31  3  19.1  79  248  3143  32% 
Joe Mauer  10  102  18.6  96  376  6979  26% 
Kenji Johjima  18  19  16.9  67  204  3781  33% 
————  ————–  ————–  ————–  ————–  ————–  ————–  ————– 
Paul Lo Duca  14  87  12.8  108  468  4980  23% 
Alex Avila  0  54  13.2  79  329  3684  24% 
Chris Iannetta  26  10  14.3  60  330  4566  18% 
Geovany Soto  10  34  14.9  81  380  4597  21% 
Victor Martinez  5  57  17.5  101  515  6135  20% 
Mike Piazza  7  66  21.5  32  241  1969  13% 
Brian McCann  15  56  23.9  132  655  8016  20% 
A.J. Pierzynski  37  9  26.0  135  832  11788  16% 
Jason Kendall  11  81  27.6  149  665  9882  22% 
Michael Barrett  28  48  30.2  54  324  4205  17% 
Pitchers 2003-2013

Name  CSAdded  SBAAdded  RunsAdded  CS  SBA  Opp  CS%* 
Carlos Zambrano  0  73  18.4  27  65  1667  42% 
Mark Buehrle  10  98  17.7  11  53  2260  21% 
Jon Garland  7  46  15.9  33  65  1634  51% 
Zack Greinke  7  37  13.7  33  75  1544  44% 
Livan Hernandez  11  19  11.7  46  124  1890  37% 
Chris Carpenter  5  33  11.3  28  42  1218  67% 
James Shields  2  33  9.9  29  74  1536  39% 
Justin Verlander  9  17  9.8  39  120  1604  33% 
Bartolo Colon  0  38  9.5  17  34  1015  50% 
Rich Harden  7  18  9.2  19  43  628  44% 
————–  ————–  ————–  ————–  ————–  ————–  ————–  ————– 
John Lackey  9  68  11.4  44  217  1976  20% 
Ted Lilly  7  33  12.6  9  128  1440  7% 
Chris Young  1  55  13.3  10  106  561  9% 
Jose Contreras  11  82  13.7  25  163  1011  15% 
Brandon Webb  9  81  14.1  32  166  1298  19% 
Carl Pavano  58  14.3  26  140  1157  19%  
Cole Hamels  5  47  15.0  16  134  1389  12% 
Tim Lincecum  9  46  17.2  17  145  1341  12% 
Tim Wakefield  21  132  19.3  44  248  1569  18% 
A.J. Burnett  2  113  26.7  36  264  1874  14% 
These numbers are on a counting basis, so what about on a pure rate basis?
Below are the leaderboards from 2011-2012:
Best and Worst Pitcher Reputation 

Name  ADJSBA+/-
Clayton Richard  0.139 
Johnny Cueto  0.086 
Nathan Eovaldi  0.08 
Josh Tomlin  0.077 
Doug Fister  0.065 
———  ——— 
Alex White  0.073 
Jeff Niemann  0.08 
Tommy Hanson  0.091 
Scott Feldman  0.096 
John Lackey  0.097 
Best and Worst Pitcher Caught Steals Added 

Name  ADJCS+/-
Alfredo Simon  0.039 
Josh Collmenter  0.037 
Jair Jurrjens  0.036 
Michael Pineda  0.035 
Rodrigo Lopez  0.030 
———  ——— 
Chris Archer  0.020 
Brandon Morrow  0.021 
Nathan Eovaldi  0.024 
Alfredo Aceves  0.030 
Clayton Richard  0.034 
Best and Worst Catcher Reputation 

Name  ADJSBA+/-
Salvador Perez  0.033 
Chris Stewart  0.029 
Matt Wieters  0.028 
Humberto Quintero  0.025 
Yadier Molina  0.024 
———  ——— 
Hank Conger  0.017 
Eli Whiteside  0.019 
Jose Lobaton  0.020 
Alex Avila  0.025 
Rob Brantly  0.039 
Best and Worst Catcher Caught Steals Added 

Name  ADJCS+/-
Yan Gomes  0.020 
Salvador Perez  0.010 
Josh Phegley  0.008 
Yorvit Torrealba  0.007 
Nick Hundley  0.007 
———  ——— 
Martin Maldonado  0.007 
Derek Norris  0.007 
Ryan Doumit  0.008 
Chris Stewart  0.010 
Joe Mauer  0.010 
In the Scope of the Battery
As my previous research has shown, a pitcher’s adjusted CS performance correlates with the battery’s CS performance at a 1.5-to-1 ratio compared with the catcher’s past CS performance. With this in mind, I also used our proxy of reputation to see which battery mate had more of an impact on the number of stolen base attempts that took place under their watch.
When comparing CS+/- and SBA+/- with the battery numbers, the following was found:
Comparing CS+/- & SBA+/- With Battery Numbers

Regression  Position  R^2
CS% Battery~adjCS+/-  Catcher  0.002
CS% Battery~adjCS+/-  Pitcher  0.021
SBA% Battery~adjSBA+/-  Catcher  0.034
SBA% Battery~adjSBA+/-  Pitcher  0.566
The interesting note here is that it is the pitcher “reputation proxy” (adjSBA+/-) that correlates best with the actual SBA% of the battery, and the margin is not even close. For anyone who thinks that a base runner steals off the pitcher, this is more evidence in their favor.
Next Steps
When I return to this topic, I’d like to see Bayesian priors for both pitchers and catchers. These can be based off pitcher handedness and we have plenty of minor league catching data to build from. I think the inclusion of the above will improve the predictive value of this framework.
Meanwhile, I’d like to see dummy variables connected to pitch location, and swing/no swing. These factors have an effect on caught stealing percentage. On this front, pickoffs would need to be removed — which would require a link between PITCHf/x and Retrosheet.
I attached all the code here (R code, SQL code), and the data is in this Google doc. So build on it if you like. I hope my next article focuses on what this means for the base runners involved.
References & Resources
 Thanks to Jared Cross, for his help with R Code and Method, and to Nik Oza, for his consult and his idea to adapt RAPM for the running game.
 Greg Rybarczyk, The Hardball Times, Stolen base attempts: an algorithm for allocating run value
 Jacob Frankel, Hickory High, How To Calculate RAPM
 Tangotiger, Evaluating Catchers
I don’t see pitchouts included in your list of other factors. Did you consider them at all?
I did not consider them yet. I understand the problems that presents, and I am working towards adding PITCHf/x data to incorporate pitch type, location and swing/no swing, and to remove pitchouts. Before I return to it, I’d like to study whether pitch location/pitch type is solely a function of the pitcher. In the meantime, the SQL and R code is attached if anyone wants to take up including pitch variables.
Max – Retrosheet shows pitchouts in the Pitch Sequence list. It also shows pickoff throws which is another variable you might want to consider.
fewer overall stolen base attempts which reduce runs scoring on extra base hits with the runner in motion.