The Luckiest and Unluckiest Pitchers According To Base Runs

On June 3rd, Marlins pitcher Henderson Alvarez threw an 88-pitch shutout against the Rays, scattering eight hits without issuing a walk. On July 11th, Alvarez again gave up eight hits without issuing a walk, but he lasted only five innings after surrendering six runs. While the circumstances surrounding these two starts aren’t completely the same, they do a good job of illustrating the phenomenon of cluster luck.

Cluster luck, originally discovered and coined by Joe Peta in his book Trading Bases, essentially tells us how lucky a team has been by measuring the difference between the number of runs it would be expected to score, based on its power (total bases) and base runners (hits and walks), and the number of runs it actually scored. In his July start above, Alvarez was a victim of poor sequencing: he allowed his hits in bunches rather than spreading them out over the course of his start. For a more complete (and easier to understand) definition and some real-world examples, check out this and this.

What I will attempt to do in this article is find a way to accurately estimate how many runs a pitcher should have allowed, and therefore what his run average should look like, and then pinpoint pitchers who have been lucky or unlucky so far this season. Basically, I am trying to normalize a pitcher’s RA by adjusting for sequencing and cluster luck.

Fortunately for me, the heavy lifting for part one has already been done thanks to David Smyth. His metric, Base Runs (BsR), was developed and popularized in the early 1990s and is an extraordinarily simple yet accurate way of estimating runs allowed using standard box score statistics. Base Runs for pitchers takes four inputs (innings pitched, hits, walks, and home runs), which are converted into four factors: A, B, C, and D. The final formula looks like A*B/(B+C)+D. For a lengthier piece on Base Runs, its properties, and its pros and cons, consult this and this.
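The article gives only the overall shape of the formula, not the coefficients, so the following is a minimal sketch rather than the exact calculation used here. It assumes one commonly cited simplified pitcher variant of Base Runs, in which A counts base runners, B is an advancement factor, C is outs, and D is home runs. The total-bases estimate and every numeric coefficient below are assumptions, and the function name `base_runs` is hypothetical.

```python
def base_runs(outs: int, hits: float, walks: float, home_runs: float) -> float:
    """Estimate runs allowed using the Base Runs form A*B/(B+C) + D.

    All coefficients are assumptions taken from one commonly cited
    simplified pitcher variant; the article does not list its own.
    """
    if outs <= 0:
        raise ValueError("outs must be positive")

    # Assumption: total bases are not among the four inputs, so they are
    # approximated from hits and home runs.
    total_bases = 1.12 * hits + 4 * home_runs

    a = hits + walks - home_runs  # A: base runners (excluding home runs)
    b = (1.4 * total_bases - 0.6 * hits - 3 * home_runs + 0.1 * walks) * 1.1  # B: advancement
    c = outs                      # C: outs recorded (3 per full inning)
    d = home_runs                 # D: runs that score automatically

    # Runners times the share of them that score, plus home runs.
    return a * b / (b + c) + d


if __name__ == "__main__":
    # Hypothetical stat line: 200 innings (600 outs), 180 H, 50 BB, 20 HR.
    print(round(base_runs(600, 180, 50, 20), 2))
```

Writing C in outs rather than innings keeps the function independent of how innings pitched are notated.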

I took these statistics, along with run average, for every pitcher in the majors through July 12th and calculated each pitcher’s expected runs allowed by Base Runs. I then converted that figure to a Base Run Average, or BsRA, and took the difference between BsRA and his actual RA. I also calculated each pitcher’s RA- and BsRA- by dividing his RA or BsRA by the league RA or BsRA and multiplying by 100 (for reference, the league RA is 4.14 and the league BsRA is 4.19). By taking the difference between the two, (BsRA-)-(RA-), we can figure out how many more runs, as a percentage of league average, the pitcher should have allowed.
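Converting a runs total into a per-nine-innings average needs one small bit of care: innings pitched are written in baseball notation, where the digit after the decimal point counts outs (126.1 means 126 and one third innings). A minimal sketch of that conversion, with hypothetical helper names `ip_to_outs` and `per_nine`:

```python
def ip_to_outs(ip: str) -> int:
    """Convert innings pitched in baseball notation ("126.1") to outs."""
    whole, _, partial = ip.partition(".")
    partial_outs = int(partial or 0)
    if partial_outs not in (0, 1, 2):
        raise ValueError(f"invalid innings-pitched value: {ip!r}")
    return int(whole) * 3 + partial_outs


def per_nine(runs: float, ip: str) -> float:
    """Scale a runs total (actual or Base Runs) to a nine-inning rate."""
    # Nine innings is 27 outs, so the rate is runs per out times 27.
    return runs * 27 / ip_to_outs(ip)


if __name__ == "__main__":
    # Hypothetical input: 41 runs over 126.1 innings.
    print(round(per_nine(41, "126.1"), 2))
```

Passing innings as a string avoids treating 126.1 as the decimal number 126.1, which would understate the innings slightly.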

In the tables below you’ll see I’ve given this stat the name Luck%. It is admittedly a poor name, since we’re dealing with percentages and I’m sure the differences aren’t completely due to luck, but it will have to do until I think of something better. For example, Max Scherzer’s RA- is 80.92 (RA of 3.35/league RA of 4.14), meaning he has allowed runs at around 81% of the league average, but his BsRA- is 88.62 (BsRA of 3.71/league BsRA of 4.19), meaning he should have allowed runs at around 89% of the league average. We then get a Luck% of 88.62-80.92=7.71, so relative to league average Scherzer should have allowed runs at a rate about 7.71 percentage points higher than he actually has.
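The Scherzer arithmetic can be sketched in a few lines. The function names `index_vs_league` and `luck_pct` are hypothetical. Note that feeding in the rounded figures quoted above gives roughly 88.5 for BsRA- and roughly 7.6 for Luck%, slightly below the 88.62 and 7.71 in the text, presumably because the article worked from unrounded values.

```python
def index_vs_league(rate: float, league_rate: float) -> float:
    """Express a run average as a percentage of the league figure."""
    return 100 * rate / league_rate


def luck_pct(ra: float, bsra: float, league_ra: float, league_bsra: float) -> float:
    """Luck% as defined in the article: (BsRA-) minus (RA-).

    Positive values mean the pitcher allowed fewer runs than Base Runs
    expects; negative values mean he allowed more.
    """
    return index_vs_league(bsra, league_bsra) - index_vs_league(ra, league_ra)


if __name__ == "__main__":
    # Scherzer's rounded figures and the league averages quoted in the text.
    print(round(index_vs_league(3.35, 4.14), 2))       # RA-
    print(round(index_vs_league(3.71, 4.19), 2))       # BsRA-
    print(round(luck_pct(3.35, 3.71, 4.14, 4.19), 2))  # Luck%
```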

Whew. Now we can get to the names.

First, the top ten qualified pitchers whose numbers have been most positively affected by cluster luck.

Name               IP     RA    BsRA  BsRA-  RA-    Luck%
Mark Buehrle       126.1  2.92  3.95  94.3   70.5   23.7
Wei-Yin Chen       104.0  4.24  5.19  123.8  102.4  21.4
Jason Vargas       125.0  3.38  4.23  101.0  81.6   19.4
Zack Greinke       118.2  3.11  3.91  93.4   75.1   18.2
Alfredo Simon      116.2  2.78  3.50  83.5   67.1   16.3
Josh Beckett       103.2  2.60  3.30  78.9   62.8   16.1
Masahiro Tanaka    129.1  2.71  3.41  81.5   65.5   16.0
Yordano Ventura    101.2  3.36  4.03  96.2   81.2   15.0
Chris Young        105.1  3.16  3.81  91.0   76.3   14.7
Henderson Alvarez  120.0  3.23  3.85  91.8   78.0   13.8

I like this list because it is very diverse. We have pitchers who have been pleasant surprises this season but who we all know aren’t really that good (Vargas and Simon), older pitchers experiencing a late-career resurgence (Beckett and Buehrle), great pitchers (Greinke and Tanaka) and not-so-great pitchers (Chen), hard throwers (Alvarez) and soft throwers (Young), high-strikeout and low-strikeout pitchers, and so on. It’s good to see that not just one type of pitcher is affected, which gives me confidence that cluster luck does play a significant role in a pitcher’s numbers even this late in the season.

Now on to the top ten pitchers whose numbers have been most negatively affected by cluster luck.

Name               IP     RA    BsRA  BsRA-  RA-    Luck%
Anibal Sanchez     94.2   3.52  2.44  58.2   85.0   -26.8
Matt Garza         124.1  4.42  3.37  80.4   106.8  -26.3
Justin Masterson   98.0   6.06  5.09  121.4  146.4  -25.0
Tyler Skaggs       91.0   4.65  3.78  90.2   112.3  -22.2
Charlie Morton     119.1  4.15  3.36  80.1   100.2  -20.1
Roenis Elias       112.0  4.94  4.33  103.2  119.3  -16.1
Jorge De La Rosa   102.2  4.91  4.32  103.2  118.6  -15.4
Edwin Jackson      105.1  6.07  5.53  132.0  146.6  -14.7
Jose Quintana      119.1  3.85  3.31  79.1   93.0   -13.9
Hiroki Kuroda      116.1  4.64  4.19  100.0  112.1  -12.1

This is a slightly less diverse list. Most of these guys are having disappointing seasons, but perhaps they haven’t been as bad as we think. Four of these guys have a worse-than-average RA but a better-than-average BsRA (or a perfectly average one in the case of Kuroda). Then there’s Anibal Sanchez, who might just be one of the most underrated pitchers in baseball, as his BsRA is seventh in all of baseball.
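As a quick check on that count, the unlucky table can be filtered for pitchers whose RA- is above 100 (worse than league average) while their BsRA- is at or below 100. This is only a sketch over the ten rows listed above; the variable and function names are hypothetical.

```python
# (name, BsRA-, RA-) copied from the unlucky table above.
UNLUCKY = [
    ("Anibal Sanchez", 58.2, 85.0),
    ("Matt Garza", 80.4, 106.8),
    ("Justin Masterson", 121.4, 146.4),
    ("Tyler Skaggs", 90.2, 112.3),
    ("Charlie Morton", 80.1, 100.2),
    ("Roenis Elias", 103.2, 119.3),
    ("Jorge De La Rosa", 103.2, 118.6),
    ("Edwin Jackson", 132.0, 146.6),
    ("Jose Quintana", 79.1, 93.0),
    ("Hiroki Kuroda", 100.0, 112.1),
]


def worse_ra_but_not_worse_bsra(rows):
    """Names with RA- above 100 but BsRA- at or below 100."""
    return [name for name, bsra_minus, ra_minus in rows
            if ra_minus > 100 and bsra_minus <= 100]


if __name__ == "__main__":
    print(worse_ra_but_not_worse_bsra(UNLUCKY))
```

The comparison uses "at or below 100" for BsRA- so that Kuroda, who sits exactly at league average, is included.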

So what does Luck% end up telling us about a pitcher? We know that pitchers have little control over what happens after a ball is put in play, but what we’re doing here is figuring out which pitchers have been victimized by poor sequencing. Perhaps we can look at Luck% the same way we look at BABIP: if the measure is abnormally high or low compared to a pitcher’s career rate, and the pitcher hasn’t made a substantial change in his mechanics or pitch repertoire, perhaps some regression is in order.

So is Anibal Sanchez due for a spectacular second half? Maybe not. A myriad of factors could be influencing his low Luck%. We know that in general offense goes up when runners are on base, and Sanchez could be especially susceptible to allowing runs to score in bunches. He has a slow move to the plate, potentially allowing more runners to steal and get into scoring position. Perhaps his stuff is less effective from the stretch due to a breakdown in mechanics. Maybe he focuses too much attention on the runners on base and not enough on the batter at the plate. I really don’t know.

I only have half a season of data on 100 or so pitchers, so obviously more research is needed. One could find the correlation between Luck% and peripheral stats such as K% and BB%, or find year-to-year correlations for Luck% to find out how much of the variation is actually luck and how much is skill. I’d definitely be intrigued by those results, and I’ll likely revisit these numbers when the season ends.
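A year-to-year check of that kind could be sketched as follows, assuming Luck% has already been computed for two seasons and stored in dictionaries keyed by pitcher name. The function names and the dictionary layout are hypothetical, and no real multi-season data is included here. A correlation near zero would suggest Luck% is mostly noise, while a clearly positive one would point toward a repeatable skill.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 long")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def year_to_year_r(luck_year1: dict, luck_year2: dict) -> float:
    """Correlate Luck% across two seasons for pitchers present in both."""
    shared = sorted(luck_year1.keys() & luck_year2.keys())
    return pearson_r([luck_year1[name] for name in shared],
                     [luck_year2[name] for name in shared])
```

The same `pearson_r` helper would work for comparing Luck% against K% or BB% within a single season.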

I’m still relatively new to performing this kind of analysis, so any constructive criticism would be greatly appreciated, as would a pointer if you’ve seen something like this done elsewhere on the internet. If you have suggestions for any improvements (especially the name) or further research, I’d love to hear them. If you think I majorly screwed up somehow, I’d love to hear about it too.


This passes the smell test. I’d be interested to see the results with weighted offensive statistics as opposed to total bases.


Aidan – this is very cool stuff. I think your approach using BaseRuns is a great intermediate step between FIP-type metrics and ERA/RA9 type metrics.

I think you’re wise to caveat that the term “Luck” may be misleading, since items related to basestealing and baserunning are buried in your metric alongside the cluster luck effect. Rany Jazayerli has a fantastic piece on Buehrle that breaks down the mystery of his run-prevention skills (though I’d have to say Buehrle is an extreme outlier in this regard):


There is also the issue of pitchers who are worse from the stretch than from full windup. Would be interesting to see xFIP differential with men on vs. empty for these pitchers and compare it to MLB average differential. You’d be able to isolate, to some extent, a pitcher’s change in pitching ability (physical or mental), if any, as opposed to his baserunning prevention skills.

paul dreyfus

I would say that cluster luck applies better to hitters than to pitchers. I think that “scattering hits” is more of a skill than a sign of luck: Good pitchers make mistakes to one batter, then refocus and improve their pitching after that. Bad pitchers collapse after mistakes. I think with hitting, the ability to hit in clusters is a) a reflection of a pitcher losing it and b) the overall fortune of the team against good pitchers. Thinking this through, it would be interesting to look at team hitting cluster luck vs. the “cluster luck” (or hit-scattering skill) of the pitchers they face. I’d be willing to bet the Giants faced pitchers with very poor “cluster luck” (read: inability to scatter hits) until June, and that after that, pitchers have been much better at not collapsing with men on base vs. the Giants.


I see what you’re saying and think there may be something to it. Not so much that pitchers can “will” themselves to pitch better than normal after they’ve struggled, but just that the better pitchers seldom come unraveled while others might fall apart.

Also, if a pitcher loses effectiveness on a pitch in the middle of a game, feels a twinge, or just wears down, it can worsen his effectiveness over a series of plate appearances. This is another area where pitchers drive more of the clustering than hitters do, and this could also be something that certain individual pitchers are more prone to than others.




I’m surprised Lance Lynn didn’t make the unlucky list. Whenever he gives up a run, it seems like it usually turns into a crooked number.