# Stolen base attempts: an algorithm for allocating run value

It is customary to credit a runner if he runs on the pitch and reaches the next base safely without the benefit of any major misplays by the defense. We call this a stolen base.

But how much credit does the runner really deserve? After all, there are always other players involved in a stolen base attempt, and frequently these other players are more responsible than the runner for the advancement, or the out that results. The pitcher may ignore the runner and allow a long lead, or a walking lead, or he may execute a very slow delivery to home plate. The catcher may bobble the pitch, or execute a slow exchange and release; his throw may be off target, or weak. The fielder at the play base may drop or miss the throw, or may fail to apply the tag.

This article presents an algorithm for logically dividing credit on stolen base attempts among the participating players, sharing the run value of the play result based on the quality of their performances.

To keep things simpler, this algorithm will cover only situations with a sole runner on first base who attempts to steal second base, where the play does not result in a passed ball or wild pitch, and where there is no defensive error or any additional advancement by the runner beyond the play base. Other plays involving these situations have their own algorithms, which will be discussed elsewhere.

The algorithm will also consider contributions from only the runner, pitcher and catcher, leaving consideration of the fielder’s contribution for future discussion.

### Play value

How much is a successful steal of second base worth to the offense? How much does a failed attempt at second base cost the offense? We will answer these questions using run expectancy (RE) values. Run expectancy refers to the expected number of runs scored in the remainder of the inning from each of the 24 base-out states. Here are the RE numbers for 2012:

The run value of any play can be determined by calculating the change in run expectancy from the initial to the final state. So, with a runner on first and none out, a successful steal of second base changes the RE from 0.858 to 1.073 (the run expectancy for a man on second, none out), a change of +0.215 runs. If the attempt is unsuccessful and the runner is thrown out, the RE changes from the initial 0.858 to a final value of 0.263 (the run expectancy for bases empty, one out), a net change of -0.595 runs. With one out, the play values are +0.144 runs for success and -0.411 runs for failure. With two outs, the values are +0.097 and -0.221 runs.
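The zero-out arithmetic above can be reproduced in a few lines. This is a minimal sketch that uses only the three RE values quoted in this section; the constant and function names are illustrative assumptions, not part of the original method.

```python
# 2012 run expectancy values quoted in the text (names are illustrative).
RE_FIRST_0_OUT = 0.858    # runner on first, none out
RE_SECOND_0_OUT = 1.073   # runner on second, none out
RE_EMPTY_1_OUT = 0.263    # bases empty, one out


def run_value(re_before: float, re_after: float) -> float:
    """Run value of a play: change in run expectancy from initial to final state."""
    return re_after - re_before


sb_value = run_value(RE_FIRST_0_OUT, RE_SECOND_0_OUT)
cs_value = run_value(RE_FIRST_0_OUT, RE_EMPTY_1_OUT)

print(f"SB Value: {sb_value:+.3f}")  # +0.215
print(f"CS Value: {cs_value:+.3f}")  # -0.595
```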

A brief aside here: it is important to keep in mind that the RE values listed above are an aggregation of all major league data, so the precise run expectancies for a situation may (will) be different, depending on the players involved. Figuring out how the odds shift in particular situations is one of the things a good manager does. Figuring out the “centerline” odds and making them available to the manager is one of the things a good analyst does.

So, how does one go about deciding if the potential reward of a stolen base is worth the risk? Let’s do the math. The “break-even point” (BEP) is the success rate at which the run value gained on successes and the run value lost on failures balance each other. It is given by the equation:

BEP = CS Value / (CS Value – SB Value)

For zero outs: BEP = (-0.595)/(-0.595 – 0.215) = 0.735 = 73.5%

So, if one can exceed a 73.5 percent success rate, attempting to steal second with none out will be beneficial in the long term. If not, one would be advised to not try for the steal, although making an attempt from time to time when the odds say not to will help to keep opposing teams from becoming too accurate in anticipating one’s tactical moves.
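As a quick check, the same formula can be applied to the one-out and two-out play values quoted earlier. The sketch below assumes nothing beyond those numbers; the function and dictionary names are illustrative.

```python
def break_even_rate(sb_value: float, cs_value: float) -> float:
    """BEP = CS Value / (CS Value - SB Value)."""
    return cs_value / (cs_value - sb_value)


# (SB Value, CS Value) for a steal of second, keyed by outs, as quoted in the text.
PLAY_VALUES = {
    0: (0.215, -0.595),
    1: (0.144, -0.411),
    2: (0.097, -0.221),
}

for outs, (sb, cs) in PLAY_VALUES.items():
    print(f"{outs} out: BEP = {break_even_rate(sb, cs):.1%}")
# 0 out: BEP = 73.5%
# 1 out: BEP = 74.1%
# 2 out: BEP = 69.5%
```

At the break-even rate, the expected run value of an attempt is zero by construction, which is why success rates above it add runs in the long term.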

With the groundwork now laid, we can move on to discussing the attribution of credit/blame for the outcome of stolen base attempts. When trying to allocate performance value on a play, we first must identify the participating players. For a stolen base attempt at second base, there are four participants: the runner, pitcher, catcher and a fielder (we will treat any advances or putouts that take place after the main play as separate plays). Let’s consider each of the participants for a moment.

### The participants

The **runner** is the most important player in any stolen base attempt, in the sense that he (among the involved players) decides when there will be an attempt, and of course he can unilaterally decide not to attempt a steal as well. Another important aspect of the runner’s involvement is that the runner’s performance forms one complete side of the confrontation: the runner’s reaction to the pitcher’s first move begins the play, and his initial touch of second base ends the play (if a tag hasn’t ended it sooner). The elapsed time between the pitcher’s first move and the runner’s touching second base is the key metric for the runner.

The **pitcher’s** delivery time to home plate governs the first portion of the defensive side of the stolen base attempt. The pitcher’s performance is relatively independent, in that the pitcher generally cannot alter his delivery or pitch selection based on the actions of the runner. The pitcher’s impact on the runner’s lead and/or jump, including the influence of the handedness of the pitcher, is significant, but will be discussed elsewhere.

The **catcher** exerts a huge influence on stolen base attempts, naturally. Unlike the runner and pitcher, the catcher does not begin his performance from a “clean slate”; he inherits a different situation on every stolen base attempt, based on the pitcher’s delivery time and the first portion of the runner’s sprint to second base. The catcher’s performance is encapsulated in the amount of time between his first touching the pitch and the arrival of his throw in the glove of the fielder covering second base. The accuracy of his throw is, of course, important, but will be discussed elsewhere.

The **fielder’s** task is simple, if not always easy: He catches the throw and applies the tag. In this discussion, we will assume the fielder catches the throw and applies the tag, and we will not consider the value he provides by doing so; analysis of the fielder’s contribution will be discussed elsewhere.

### Calculating individual player values

A sampling of stolen base attempts from the 2011-13 seasons was analyzed, with times measured for segments of the play corresponding to the performances of the runner and pitcher. The data for successful and unsuccessful attempts were separated, and probability density functions (PDFs) were fit for each category. The PDFs were then weighted and combined to yield plots which show the likelihood of success vs. the runner’s and pitcher’s times.
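The article does not spell out how the PDFs were weighted and combined, so the following is only one plausible reading: weight each fitted density by its share of attempts and take the successful share of the combined density at a given time. The normal shape, the fit parameters and the weights below are all hypothetical placeholders, not values from the sample.

```python
import math


def normal_pdf(x: float, mean: float, sd: float) -> float:
    """Density of a normal distribution (an assumed shape for the fitted PDFs)."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def safe_probability(t, safe_fit, out_fit, safe_weight, out_weight):
    """Share of the weighted density at time t that comes from successful attempts."""
    safe_density = safe_weight * normal_pdf(t, *safe_fit)
    out_density = out_weight * normal_pdf(t, *out_fit)
    return safe_density / (safe_density + out_density)


# Hypothetical (mean, sd) fits for runner times in seconds, for illustration only.
SAFE_FIT = (3.45, 0.15)
OUT_FIT = (3.65, 0.15)
SAFE_WEIGHT, OUT_WEIGHT = 0.74, 0.26  # assumed shares of safe and out attempts

for t in (3.30, 3.55, 3.80):
    p = safe_probability(t, SAFE_FIT, OUT_FIT, SAFE_WEIGHT, OUT_WEIGHT)
    print(f"runner time {t:.2f} s -> Safe% {p:.1%}")
```

With these placeholder fits, faster runner times map to a higher Safe%, which is the qualitative shape one would expect from a runner time chart.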

**Runner time chart:**

**Pitcher time chart:**

Note: due to the limited size of the sample, these plots should be considered approximate, and those who wish to make use of this algorithm should avail themselves of a larger sample of data, ideally a full season or more. However, the effectiveness of the algorithm is not dependent on the precision of the charts, and the focus of this discussion will remain on the algorithm.

Upon measuring the runner or pitcher’s time, and using the appropriate chart to convert the time to a “Safe %,” the weighted value of the performance is calculated by multiplying the Safe% by the SB Value, multiplying (1-Safe%) by the CS Value, and adding the two numbers.

**Runner’s Value:** The first value contribution to be calculated is that of the runner. It is determined as follows:

- Measure the runner’s time, which is the time elapsed between the pitcher’s first move and the runner touching second base. Even if the runner is tagged out, the runner’s time is counted to the instant he touches second base.
- Consult the runner’s time chart and find the corresponding Safe% for the runner’s time.
- Multiply the Safe% by the SB Value, multiply (1-Safe%) by the CS Value, and add the two numbers. This is the Runner’s Value.
- Example (using values for zero outs): for runner’s time = 3.26 seconds, the corresponding Safe% is 90.0%. Multiply 90.0% by +0.215, and add (1-90.0%) times -0.595, which equals +0.134 runs. This is the Runner’s Value. A positive number indicates a favorable contribution for the runner (adding runs), while a negative number indicates an unfavorable contribution (reducing runs).

**Pitcher’s Value:** Next, the pitcher’s value is determined, as follows:

- Measure the pitcher’s time, which is the time elapsed between the pitcher’s first move and the pitch touching the catcher’s glove.
- Consult the pitcher’s time chart and find the corresponding Safe % for the Pitcher’s Time.
- Multiply the Safe% by the SB Value, multiply (1-Safe%) by the CS Value, and add the two numbers. This is the Pitcher’s Value.
- Example: for pitcher’s time = 1.33 seconds, the corresponding Safe% is 68.5 percent. Multiply 68.5% by +0.215, and add (1-68.5%) * -0.595, which equals -0.040 runs. This is the Pitcher’s Value. The negative number here indicates a favorable result for the pitcher (reducing runs).
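The weighting step shared by the runner and pitcher calculations can be sketched as a single function. The Safe% inputs are taken directly from the two examples above; looking them up from the time charts is left out, and the function name is an illustrative assumption.

```python
SB_VALUE = 0.215   # steal of second, none out
CS_VALUE = -0.595  # caught stealing at second, none out


def weighted_value(safe_pct: float,
                   sb_value: float = SB_VALUE,
                   cs_value: float = CS_VALUE) -> float:
    """Safe% times SB Value, plus (1 - Safe%) times CS Value."""
    return safe_pct * sb_value + (1.0 - safe_pct) * cs_value


runner_value = weighted_value(0.900)   # runner's time of 3.26 s
pitcher_value = weighted_value(0.685)  # pitcher's time of 1.33 s

print(f"Runner's Value:  {runner_value:+.3f}")   # +0.134
print(f"Pitcher's Value: {pitcher_value:+.3f}")  # -0.040
```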

**Catcher’s Value:** Finally, the catcher’s value is determined, as follows:

- The Catcher’s Value is calculated as the overall run value of the play result (i.e. SB Value or CS Value) minus the sum of the Runner’s Value and Pitcher’s Value.
- Example: given the inputs above (Runner’s Value = +0.134 runs, Pitcher’s Value = -0.040 runs), the Catcher’s Value will depend on whether the runner is safe or out at second base. If the runner is safe, the Catcher’s Value = +0.215 runs – (+0.134 runs) – (-0.040 runs) = +0.121 runs. If the runner is out at second, the Catcher’s Value = -0.595 runs – (+0.134 runs) – (-0.040 runs) = -0.689 runs.
- If the runner steals second base with a very fast time and the pitcher’s delivery to home is extremely slow, the sum of the Runner’s Value and Pitcher’s Value could, in an extremely rare instance, exceed the SB Value. The Catcher’s Value would then be negative (i.e. reducing runs, a good defensive contribution), which would not make sense on a play where the runner was safe and the catcher had essentially no impact. When this happens, the Catcher’s Value is set equal to zero, and the Pitcher’s Value is adjusted so that the total play value equals the SB Value.
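The catcher step, including the rare adjustment described in the last bullet, might look like the sketch below. The `Allocation` class and `allocate` function are illustrative names, and the zero-out play values are used as defaults.

```python
from dataclasses import dataclass


@dataclass
class Allocation:
    runner: float
    pitcher: float
    catcher: float

    @property
    def total(self) -> float:
        return self.runner + self.pitcher + self.catcher


def allocate(runner_value: float, pitcher_value: float, safe: bool,
             sb_value: float = 0.215, cs_value: float = -0.595) -> Allocation:
    """Catcher's Value = play value minus (Runner's Value + Pitcher's Value)."""
    play_value = sb_value if safe else cs_value
    catcher_value = play_value - (runner_value + pitcher_value)
    if safe and catcher_value < 0:
        # Rare case: runner plus pitcher already exceed the SB Value.
        # Zero out the catcher and let the pitcher absorb the difference.
        catcher_value = 0.0
        pitcher_value = play_value - runner_value
    return Allocation(runner_value, pitcher_value, catcher_value)


print(f"{allocate(0.134, -0.040, safe=True).catcher:+.3f}")   # +0.121
print(f"{allocate(0.134, -0.040, safe=False).catcher:+.3f}")  # -0.689
```

In every branch the three values sum to the run value of the play result, which is the property the allocation is meant to preserve.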

If the runner is safe in our example, the credit/blame is allotted as follows:

- Runner’s Value: +0.134 runs
- Pitcher’s Value: -0.040 runs
- Catcher’s Value: +0.121 runs
- Total Run Value: +0.215 runs

If the runner is out in our example, the credit/blame is allotted as follows:

- Runner’s Value: +0.134 runs
- Pitcher’s Value: -0.040 runs
- Catcher’s Value: -0.689 runs
- Total Run Value: -0.595 runs

Note that the runner and pitcher get the same credit in both instances, because they delivered the same performances. The catcher’s credit depends on whether he was able to receive a pitch at time = +1.33 seconds, and get it to second base in time for the tag to be applied before time = +3.26 seconds. This is a tough play for a catcher to make. Setting his expected value to zero gives a break-even point of 0.121 / (0.121 + 0.689), or about 15 percent: if he can throw out runners on a play like this more than 15 percent of the time, his performance is adding value to his team.
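The catcher’s break-even point follows from setting his expected value to zero: if p is the rate at which he throws the runner out, then p times his value when OUT plus (1 - p) times his value when SAFE must equal zero. A minimal sketch, with an illustrative function name:

```python
def catcher_break_even(value_if_safe: float, value_if_out: float) -> float:
    """Caught-stealing rate p at which p*out + (1 - p)*safe equals zero."""
    return value_if_safe / (value_if_safe - value_if_out)


p = catcher_break_even(0.121, -0.689)
print(f"Catcher break-even: {p:.1%}")  # 14.9%
```

Because the gap between the two catcher values always equals SB Value minus CS Value (0.810 runs with none out), the break-even rate depends only on how much value the runner and pitcher have already claimed.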

**Boundary cases:**

To satisfy ourselves that this algorithm delivers sensible values, let’s consider some boundary plays (using values for zero outs).

**Fast runner, slow pitcher:** Runner’s time = 3.30 seconds -> 87 percent safe -> +0.110 runs. Pitcher’s time = 1.65 seconds -> 82 percent safe -> +0.067 runs. Catcher’s Value = +0.038 runs if SAFE, -0.772 runs if OUT. This fits: With a fast runner and slow pitcher delivery, the catcher gets a huge amount of credit if he throws the runner out, but only a small penalty for failing to do so.

**Fast runner, fast pitcher:** Runner’s time = 3.30 seconds -> 87 percent safe -> +0.110 runs. Pitcher’s time = 1.26 seconds -> 60 percent safe -> -0.109 runs. Catcher’s Value = +0.214 runs if SAFE, -0.596 runs if OUT. The runner’s excellent performance and the pitcher’s excellent performance cancel each other out, leaving the outcome of the play in the hands of the catcher.

**Slow runner, slow pitcher:** Runner’s time = 3.88 seconds -> 63 percent safe -> -0.082 runs. Pitcher’s time = 1.76 seconds -> 84 percent safe -> +0.082 runs. Catcher’s Value = +0.215 runs if SAFE, -0.595 runs if OUT. Again, the runner’s performance and the pitcher’s performance balance each other, rendering the catcher’s performance decisive.

**Slow runner, fast pitcher:** Runner’s time = 3.75 seconds -> 67 percent safe -> -0.052 runs. Pitcher’s time = 1.22 seconds -> 51 percent safe -> -0.185 runs. Catcher’s Value = +0.452 runs if SAFE, -0.358 runs if OUT. With a slow runner and fast pitcher delivery, the catcher has an easier-than-usual task, and thus merits a big penalty if he allows the stolen base; if he guns the runner down, he gets less credit than in most situations, since the runner and pitcher have essentially done some of his work for him.

**Average runner, average pitcher:** Runner’s time = 3.56 seconds -> 74 percent safe -> +0.001 runs. Pitcher’s time = 1.40 seconds -> 73 percent safe -> -0.001 runs. Catcher’s Value = +0.215 runs if SAFE, -0.595 runs if OUT. Both the runner and the pitcher have delivered performances that are essentially at the break-even point, which of course means that the catcher’s performance will decide the outcome.

### What about “deterrence”?

Some pitchers (typically left-handed ones) are known for their deceptive delivery, which makes it difficult for a runner to detect whether the pitcher is going home or coming over to first; this, of course, makes runners less willing to attempt a stolen base, since they don’t want to be picked off if they read the pitcher’s motion incorrectly. This apparent ability to deter stolen base attempts is usually regarded as a positive feature for a pitcher.

However, it is important to keep in mind that pitchers like this do not deter stolen bases; they deter stolen base attempts, and stolen base attempts end in both positive and negative results for both sides. In 2012, there were 3,229 successful stolen bases and 1,136 runners caught stealing, for a success rate of 74.0 percent. The break-even success rate in 2012, based on the frequency of RE24 states during stolen base attempts and the values of stolen bases and caught stealings, was about 74.7 percent. In aggregate, then, major league teams in 2012 stole bases at a success rate within a percentage point of break-even, meaning the overall run value from stolen base attempts is near zero.
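The league-wide comparison is simple arithmetic on the 2012 totals quoted above. In this sketch the 74.7 percent break-even figure is taken from the text as given rather than recomputed, since it depends on state frequencies not listed here; the names are illustrative.

```python
SB_2012 = 3229             # successful stolen bases
CS_2012 = 1136             # caught stealing
LEAGUE_BREAK_EVEN = 0.747  # break-even rate as stated in the text


def success_rate(stolen_bases: int, caught_stealing: int) -> float:
    """Fraction of stolen base attempts that succeeded."""
    return stolen_bases / (stolen_bases + caught_stealing)


rate = success_rate(SB_2012, CS_2012)
gap_points = (rate - LEAGUE_BREAK_EVEN) * 100

print(f"Success rate: {rate:.1%}")                             # 74.0%
print(f"Gap to break-even: {gap_points:+.1f} percentage points")  # -0.7
```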

If the average run value of a stolen base attempt is zero, then there is no value, positive or negative, in deterring attempts, on average. A pitcher who generally discourages attempts will allow fewer stolen bases, but he will also benefit from fewer caught stealings, and the net value will be essentially zero. Therefore, no value is attributed to a pitcher for stolen base attempts that do not occur.

### Future considerations

There are lots of areas where this stolen base attempt algorithm can be expanded. First of all, the performance values of the participating players can be subdivided, to provide additional insights on specific aspects of their play.

- The runner’s performance value can be divided into lead, jump, run, and slide.
- The pitcher’s performance value can be divided into release time, pitch time/speed and handedness (as it pertains to delaying the runner’s jump).
- The catcher’s performance value can be divided into exchange/release time, throw accuracy and throw power.

We discussed earlier that deterrence of steal attempts, such as might come from a pitcher having a very deceptive pitching motion, would not be assigned value, based on the similarity of the break-even rate and the actual success rate. However, a deceptive motion may not always completely deter attempts; it may instead hamper them, as measured by a shorter lead allowed, and/or a slower jump allowed. Future elaboration of the stolen base algorithm may include allotting a portion of the responsibility (run credit) for the runner’s lead and jump to the pitcher, which should allow better modeling of pitchers with deceptive deliveries.

Some other situations that were excluded from this discussion of the basic algorithm can be covered in the future. For example, stolen base attempts at third base, double steals and steals of second with a runner on third who stays put each have their own algorithms. Stolen base attempts where the pitch is off-target and not caught cleanly by the catcher can be considered. Wild throws, and the value added (or lost) by the fielder at the play base can be considered.

There is a lot to consider when diving deep on valuation of player performances; we are only at the very beginning.

Greg – I also think that you may have some selection bias problems. If a pitcher is able to get the ball to the catcher in 1.20 seconds and still have 40% of the attempted steals successful then he either has a really bad catcher or only very fast base runners are attempting to steal on him.

Sean,

I went in with some preconceived notions about the shape of those curves as well, and what I found wasn’t what I expected.

I’d definitely encourage you to pull up MLB’s video archive, search on key word stolen base, and time a bunch of attempts that fit my criteria (i.e. 2nd base, no other runners, no wild pitch, etc.) You’ll understand how fast runners frequently get nailed and slow runners frequently reach safely a lot better if you do this yourself…

Will do, Greg. Thanks for the research!