## Most Effective Stealers of All Time

Dave Cameron recently had an interesting discussion of the greatness of Rickey Henderson here at FanGraphs. The post left a lot of angles to uncover, some of which require manipulation of data beyond my present capabilities. With that said, one of the astonishing things was that Rickey was caught stealing 42 times one year, nearly negating 130 successful steals (at least from a conventional sabermetrics standpoint).

So what became interesting to me was who the most effective stealers of all time were. A truly deep look, complete with changing WPA calculations for each situation, would be impossible. (Getting caught stealing 3rd with 2 out while down 2 in the 8th would be a disaster, while getting caught stealing 2nd with 2 outs up 10 runs would be trivial, except for the beaning that might follow next game.) And in some sense, each team is different. It might be less advantageous to steal with Barry Bonds at the plate than with a line-drive hitter such as Barry Larkin, so the calculation isn’t perfect.

But I thought I would take a stab at it. I take the breakeven rate cited in Moneyball: a 70% success rate is required just to break even. Each unsuccessful steal is, of course, a needless out. Each successful steal is therefore the equivalent of 3/7ths of an out avoided (if we keep the sanctity of the 70% breakeven rate).

A total of 7 successful steals is as good as 3 avoided outs, but 3 unsuccessful steals is still worth 3 needless outs. So someone that steals 70 bases and is caught 30 times is no more valuable from a steals standpoint than someone that attempts 0 steals. He may add value on the basepaths in other ways, but we are strictly looking at steals here.
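The bookkeeping above can be sketched in a few lines. This is only an illustration of the 3/7ths rule as described; the function name `net_outs_avoided` and the use of exact fractions are assumptions made for the sketch, not something taken from the spreadsheet.

```python
from fractions import Fraction

# 70% break-even rate, as taken from Moneyball in the post
BREAKEVEN = Fraction(7, 10)


def net_outs_avoided(sb, cs, breakeven=BREAKEVEN):
    """Outs 'avoided' by successful steals minus outs given away by caught stealings.

    Hypothetical helper: at a 70% break-even rate, each steal is credited
    with (1 - 0.7) / 0.7 = 3/7 of an out avoided, and each CS costs one out.
    """
    out_credit_per_sb = (1 - breakeven) / breakeven  # 3/7 at 70%
    return sb * out_credit_per_sb - cs


# 70 steals against 30 caught stealings nets out to exactly zero
print(net_outs_avoided(70, 30))  # 0

# Rickey's 130 SB / 42 CS season still comes out ahead, though not by much
print(float(net_outs_avoided(130, 42)))
```

A positive result means the runner was ahead of the break-even line; zero means his steals added nothing, as in the 70-for-100 example.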

With that said, I took a look at the past 50 years (starting with 1959), and took every season where someone had more than 50 steals. As for why I chose 1959: beyond giving a round number, it should be noted that before 1912 data on times caught stealing was unavailable, and even through the 20’s the data was highly unreliable. It just so happened that the 30’s, 40’s and 50’s were not a running era, so we can say the modern running game began 50 years ago.

So who is the most effective stealer of all time by season?

It wasn’t Rickey.

Vince Coleman, in his second year, had 107 steals and was only caught 14 times. He could have been caught 30 extra times and still been ahead of the game. What is amazing, though, is that in spite of that massive edge, he was probably worth something in the neighborhood of 5.9 extra runs by my (conservative) calculation.

I figured that, over 26.5 outs per game (close to the true average: home teams sometimes need only 24 outs, but there is the long-tail risk of 12+ inning games where each team needs 36+ outs), if you score 4.9 runs per game, each out saved is worth just under .2 runs. (If you score 5 runs and need 25 outs to do it, each out would be worth exactly .2 runs. I realize that, for leverage reasons, you could safely increase this value; it’s just not something I felt comfortable calculating with limited resources.) Then, taking a slightly modified runs-to-wins formula (just about 10 runs = 1 win), I can say that Vince Coleman’s steals were worth .6 wins, conservatively. Alternate calculations factoring in leverage could double this number, so the best base-stealing season of all time was worth only .6-1.2 wins.
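For anyone who wants to check the arithmetic, here is a minimal sketch of the runs and wins conversion. The function names are made up for illustration, and the flat 10 runs per win is only an approximation of the "slightly modified" formula mentioned above, which is not spelled out in the post.

```python
def steal_runs(sb, cs, runs_per_game=4.9, outs_per_game=26.5, breakeven=0.70):
    """Estimate runs added by a season of steal attempts (hypothetical helper).

    Each out is valued at runs_per_game / outs_per_game, and each successful
    steal is credited with (1 - breakeven) / breakeven outs avoided.
    """
    runs_per_out = runs_per_game / outs_per_game          # just under .2
    net_outs = sb * (1 - breakeven) / breakeven - cs      # outs avoided minus outs wasted
    return net_outs * runs_per_out


def steal_wins(runs, runs_per_win=10.0):
    """Convert runs to wins using roughly 10 runs = 1 win (an approximation)."""
    return runs / runs_per_win


coleman_1986 = steal_runs(107, 14)
print(round(coleman_1986, 1))              # 5.9
print(round(steal_wins(coleman_1986), 2))  # 0.59

# The same function applied to a 56-for-86 season (56 SB, 30 CS)
print(round(steal_runs(56, 30), 1))        # -1.1
```

Leverage is ignored entirely here, which is why the result sits at the conservative end of the .6-1.2 win range.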

What were the top 5 seasons after Vince overall?

Well, glad you asked! I can send the whole spreadsheet to all who ask, but I thought I’d post here first.

- Vince Coleman 1986: 107 Steals, 14 CS
- Maury Wills 1962: 104 Steals, 13 CS
- Rickey Henderson 1983: 108 Steals, 19 CS
- Rickey Henderson 1988: 93 Steals, 13 CS
- Vince Coleman 1987: 109 Steals, 22 CS
- Tim Raines 1983: 90 Steals, 14 CS

Who were the worst offenders on the basepaths among prolific modern stealers?

Steve Sax in 1983 got caught 30 times in 86 attempts. That’s good for -1.1 runs. Bill North and Alex Sanchez rounded out the leadfooted three.

Overall, though, prolific stealers were generally net positive. Only 8 out of 213 seasons on this list had negative expected value.

The first conclusion I draw is that prolific basestealers, on the whole, are a net positive, even if it is a small positive. Obviously, there are individual situations where runners take off when they shouldn’t, but the large majority of prolific stealing seasons had high success rates. A runner with a poor success rate, of course, will be reined in by his coaches well before he could make this list.

—————–

Now, one more point. What is a true breakeven point? In the Moneyball Era, with AL teams in 2001 averaging 4.9 runs a game, 70% might be accurate. But in a dead ball era, a caught stealing might well have been more acceptable, because Albert Pujols wasn’t around to drive you home from first.

It then dawned on me that you could crudely model the breakeven rate for steals as a function of runs scored per game.

4.9 runs per game = 70% breakeven

3.6 runs per game = 60% breakeven rate

and so on…

The values fly apart at the extremes, but if your team is scoring 8 runs a game, it would only make sense to steal if you had Usain Bolt standing at first, since odds are the next guy might well get on base as well.

So in the spreadsheet I am sharing, you can change the average runs per game scored. If you move the average to 4.4 runs per game, Vince Coleman in 86 is still the best season of all time, but some other seasons creep up in value. (The breakeven at 4.4 runs per game with my crude calculator is 66%.)
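The spreadsheet’s exact formula isn’t given here, but a straight line through the two anchor points above (3.6 runs per game at 60%, 4.9 at 70%) reproduces the 66% figure at 4.4 runs per game. The sketch below assumes that linear form; `breakeven_rate` is a hypothetical name, not the spreadsheet’s.

```python
def breakeven_rate(runs_per_game):
    """Crude break-even steal rate as a linear function of runs per game.

    Assumption: a straight line through the two points quoted in the post,
    (3.6 R/G, 60%) and (4.9 R/G, 70%). The real spreadsheet may differ.
    """
    x0, y0 = 3.6, 0.60
    x1, y1 = 4.9, 0.70
    slope = (y1 - y0) / (x1 - x0)  # about 7.7 percentage points per run
    return y0 + slope * (runs_per_game - x0)


print(round(breakeven_rate(4.4), 2))  # 0.66
print(round(breakeven_rate(8.0), 2))  # climbs quickly in a very high run environment
```

A line like this has no natural ceiling, so it should only be trusted for run environments reasonably close to the two anchor points.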

Anyway, for anyone who wants the spreadsheet, post in the comments with a way of getting it to you — or, if FanGraphs is willing to let me upload it, I would be happy to do so.

Put in context, Vince Coleman’s 1985, when only 4.07 runs per game were scored, looks extra impressive. It is a shame that he wasn’t great with the bat, and something of a buffoon to boot, because once he got on the bases, he legitimately did alter the outcome of games.


Great stuff! I hope you contribute more in the future.

This is great. Can you email a copy of the spreadsheet to mhudlow28@yahoo.com?

I actually figured more of the high base stealing totals would have a negative value, so that’s good to know. I suspect the worst totals were from the older days (Ruth went 17 for 38 one year). It might be harder to figure the worst single seasons, since it would require finding anyone with 5-10 attempts. But any idea who the worst basestealer ever is by this measure? Babe Ruth and Lou Gehrig were both pretty bad (123/240, 103/203) in high run scoring times, and the records on CS are spotty, but those are my guesses at “worst stealers of all time”.

You had me at Dave Cameron. Keep up the thought-provoking work!

I’m not sure why you didn’t just use the linear weight values. A stolen base is worth around .20 runs which would make a caught stealing -.46 using a 70% break-even rate. Vince’s 1986 comes out at +15 using these figures while BaseballProjection.com, which also includes the non-stealing aspects of baserunning, has him at +17.

Thanks for doing the research for this one! I would have actually guessed it was Coleman, b/c he was just ridiculous in his heyday.

You might consider posting your data at Editgrid.com (my tool of choice) or at Google docs. You can upload .xls files and make them world-readable, then just share the link in your post.

mh: i will send you my spreadsheet momentarily

jay: i understand the .2 is the conventional projection, and I understand the basic methodology behind it, but that methodology itself is subject to the conditions of the offense behind you, the pitcher, etc. i tried (and didn’t exactly succeed) to derive a value from a more ground-up approach, knowing that at the end of the day any analysis run would be somewhat limited.

aweb: agreed on the babe ruth. i wish we had better data. i dont know if it was a relic of the small ball era that people took off so willingly and low rates were acceptable.

redsoxtalk: i will try to get to editgrid tonight if i can. if mh uploads first i wouldnt be upset.

thanks!

In that event, Tango has custom linear weights by team for the years 1919-2000 here. Vince again comes out at +15 and there’s no environment where he comes out worse than +12.

Nice work. This is more a question for the powers-that-be. I thought of it as soon as I saw the title, and then you briefly addressed it. It might be very difficult to put together historically, but would Fangraphs be able to track players’ WPA accrued purely from stolen base attempts? Maybe call it sbWPA?

I think there was a similar discussion at the Book blog a while ago comparing Raines to Rickey and some other stealers, looking at other factors that affect the value of the stolen base. A sbWPA framework would allow us to compare base robbers in their dynamic contexts. I’d think situational running would be more measurable than situational hitting. The breakeven point varies based on game state, and WPA captures that game state information.

To all: I was struggling with edit grid. Admittedly I didn’t get home till pretty late (11), and by then had a fair number of competing tasks to complete. I did email the spreadsheet to the fangraphs email address and i am holding out hope they can help me upload. I am starting something more ambitious shortly that may well end in failure, but will require lots of data to manually compile and manipulate.

Thanks for the encouragement!

Roughly speaking, the breakeven point can be calculated as follows:

1. Take the runs per game

2. Divide that by 2.2

3. That gives you SB per CS breakeven point

4. To turn that into a rate, simply do x/(x+1)

So, if you have 4.4 runs per game, that becomes 2 SB per 1 CS, for a 66.7% breakeven point.

With 5 runs per game, that’s 69.4% as the breakeven.

I haven’t really tested it much, but it’s an easy enough rule of thumb to carry around.
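The four steps of that rule of thumb translate directly into code. A minimal sketch, with a made-up function name; the 2.2 divisor is simply the constant given in the steps above.

```python
def rule_of_thumb_breakeven(runs_per_game):
    """Break-even steal rate from the runs-per-game rule of thumb.

    Steps 1-3: runs per game divided by 2.2 gives the SB needed per CS.
    Step 4: x / (x + 1) turns that ratio into a success rate.
    """
    sb_per_cs = runs_per_game / 2.2
    return sb_per_cs / (sb_per_cs + 1)


print(round(rule_of_thumb_breakeven(4.4), 3))  # 0.667
print(round(rule_of_thumb_breakeven(5.0), 3))  # 0.694
```

Unlike a straight line, this form can never exceed 100%, so it behaves more sensibly in extreme run environments.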