The Odds of Hitting for the Cycle

Last week, Mike Trout hit for the cycle. When asked for a comment, manager Mike Scioscia said, “If I’m a betting man, I’ve got to believe there’s another cycle in his career somewhere.” That got me wondering.

Whenever I was in a math class where probability was being discussed, the question often in the back of my mind was, “How can this be applied to baseball?” One of the things I love the most about baseball is how well it lends itself to situations of probability, compared to most sports. I’m not sure what that says about me. Anyway, I figured this would be the perfect opportunity to refresh my memory (and hopefully some of yours) on how to crunch the numbers on situations like this. Don’t worry — the principles work on useful things other than just calculating the odds of that gimmicky achievement we call the cycle.

OK, let’s get right down to the math. This won’t be too hard, really. Kind of long, but hopefully worth learning.

First example: say a batter gets a hit 40% of the time overall, and makes an out the remaining 60% of the time. Let’s break down the odds of how 2 plate appearances of his will turn out:

Possibility | PA #1 Result | PA #2 Result | PA #1 Odds | PA #2 Odds | Combined Odds
#1 | Hit | Hit | 40% | 40% | 16%
#2 | Hit | Out | 40% | 60% | 24%
#3 | Out | Hit | 60% | 40% | 24%
#4 | Out | Out | 60% | 60% | 36%
Total | | | | | 100%

For example, the odds of both PAs resulting in hits are 40% multiplied by 40%, which equals 16%, or 0.16. But there are two ways (“permutations”) to get one hit and one out between the two PAs, and each has a 24% chance of occurring… so, together, there’s a 48% chance this player will bat .500 over his two PAs. The remainder is the 36% chance of going 0-for-2.
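Here is a minimal Python sketch of that same two-PA example, for anyone who would rather let a computer do the bookkeeping. It is only an illustration: the variable names are made up, and it assumes each PA is independent with the same 40/60 split.

```python
from itertools import product

# Per-PA probabilities from the example: 40% hit, 60% out.
p = {"Hit": 0.4, "Out": 0.6}

# Enumerate every ordered pair of PA results and multiply the per-PA odds.
sequences = {seq: p[seq[0]] * p[seq[1]] for seq in product(p, repeat=2)}

# Group the ordered sequences by how many hits they contain.
by_hits = {}
for seq, prob in sequences.items():
    hits = seq.count("Hit")
    by_hits[hits] = by_hits.get(hits, 0.0) + prob

for hits in sorted(by_hits, reverse=True):
    print(f"{hits} hit(s) in 2 PAs: {by_hits[hits]:.0%}")
```

The two 24% sequences (hit then out, out then hit) land in the same bucket, which is where the 48% comes from.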

The example is a really simple one… it gets a lot more complicated when you’re dealing with, say, 7 PAs, and considering the odds of a single, double, triple, etc. This being math and all, of course there are formulas you can use as shortcuts for coming up with the number of permutations. The formula for coming up with the total number of permutations is: n^r (n to the power of r), where n is how many types of things we’re considering (in the simple example, it’s 2 — hits and outs) and r is how many events we’re looking at (2 PAs in the example). 2^2 = 4 total permutations here. If we were considering singles, doubles, triples, homers, and outs as the only possible outcomes (there are 5 of them), and were analyzing the possible ways these could be arranged in a span of 7 PAs, the answer would be 5^7 = 78,125. So, yeah, that wouldn’t be fun to calculate by hand.
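As a quick sanity check on n^r, here is a small Python sketch (the function name is hypothetical) that computes the count directly and then confirms it by brute force, generating every possible 7-PA sequence.

```python
from itertools import product

def count_sequences(n_outcomes, n_events):
    """Total ordered sequences when repeats are allowed: n ** r."""
    return n_outcomes ** n_events

# Simple example: 2 outcome types (hit, out) over 2 PAs.
print(count_sequences(2, 2))  # 4

# Five outcome types over 7 PAs.
outcomes = ["1B", "2B", "3B", "HR", "Out"]
print(count_sequences(len(outcomes), 7))  # 78125

# Brute-force cross-check: generate every 7-PA sequence and count them.
brute_force = sum(1 for _ in product(outcomes, repeat=7))
print(brute_force == count_sequences(len(outcomes), 7))  # True
```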

That formula, by the way, is specifically for situations where repeats are allowed (a.k.a. “with replacement”); since there’s nothing making it impossible for a hitter to get several outs in a row, we can use it here. However, when it comes to counting the permutations of one specific combination of results (e.g. 1 hit and 1 out over 2 PAs), there’s another formula we should consider: r!/(r1! * r2! * … * rn!). By the way, I saw this formula written with n’s instead of r’s, but I think that’s just confusing, since the variable here is the number of events. The exclamation mark stands for factorial, which tells you to multiply that number by all the positive integers that come before it; e.g. 4! = 1 * 2 * 3 * 4 = 24 (in Excel, =FACT(4) will do the trick). The r in the numerator is the total number of events, and the different r’s in the denominator represent how many instances there are of each type of event. I think that could use an example:

So if we’re talking about a cycle happening over the course of six plate appearances, since the cycle is achieved in only four of those PAs, we have two “spare” PAs to consider. Let’s simplify the possible outcomes to 1B, 2B, 3B, HR, and non-hits. Possibilities for those two spares include:

  • 2 singles
  • 1 single, 1 double
  • 1 single, 1 non-hit
  • 2 non-hits

… and you can imagine the rest. But let’s look at each of those. If the two spares are both singles, then there are a total of three singles in the six-PA sample. There are only one each of doubles, triples, and HR, and no non-hits in that situation.

Since 1! and 0! both equal 1, we can ignore everything but singles in the denominator. Written out in full, though, it’d look like 6!/(3! * 1! * 1! * 1! * 0!)… notice the different r’s in the denominator add up to the big r in the numerator. Simplifying down, the formula we end up with is 6!/3! = 120 permutations. That means there are 120 possible sequences of 6 PAs that could result in 3 singles, 1 double, 1 triple, and 1 HR. You’ll see why that’s relevant in a second.
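The r!/(r1! * r2! * … * rn!) formula is easy to wrap in a few lines of Python. This is just a sketch; the function name `arrangements` is an invented one, and the counts are passed in the order singles, doubles, triples, HR, non-hits.

```python
from math import factorial

def arrangements(counts):
    """Distinct orderings of a combination: r! / (r1! * r2! * ... * rn!)."""
    total = factorial(sum(counts))  # r! where r is the total number of PAs
    for c in counts:
        total //= factorial(c)      # divide out the repeats of each type
    return total

# 3 singles, 1 double, 1 triple, 1 HR, 0 non-hits over 6 PAs.
print(arrangements([3, 1, 1, 1, 0]))  # 120

# The simple example: 1 hit and 1 out over 2 PAs.
print(arrangements([1, 1]))  # 2
```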

OK, let’s say we’re dealing with a hitter who singles in 20% of his PAs, doubles in 5%, triples in 1%, and homers in 9%. We start finding the odds of him hitting the aforementioned combination by multiplying: .2 * .2 * .2 * .05 * .01 * .09 = 0.00000036. Not very likely, right? Well, that’s really the probability of each individual arrangement of that combination, and we just found there are 120 of them. So multiply that result by 120 to show that he has an overall 0.0000432 chance (or 0.00432%) of hitting 3 singles, 1 double, 1 triple, and 1 HR over the span of 6 PA.

If we’re talking about a 6-PA sequence with 2 singles, 2 doubles, 1 triple, and 1 HR, that’s 6!/(2! * 2!) = 180 permutations. The odds are therefore .2 * .2 * .05 * .05 * .01 * .09 * 180 = 0.0000162.

If it’s 2 singles, 1 double, 1 triple, 1 HR, and 1 non-hit, then there’s only one repeat, and it’s 6!/2! = 360 permutations. The odds of a “non-hit” are 1 – .2 – .05 – .01 – .09 = 0.65. So our odds are .2 * .2 * .05 * .01 * .09 * 0.65 * 360 = 0.0004212. A lot likelier than going 6-for-6, right?

Finally, let’s consider the case where the 2 PAs other than the cycle are both non-hits. The non-hits are the repeat this time, so it’s again 6!/2! = 360 possibilities. Now, .2 * .05 * .01 * .09 * .65 * .65 * 360 = 0.0013689. Our likeliest way to get a cycle by far.
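The four worked cases above all follow the same recipe: count the arrangements, then multiply by the odds of any one arrangement. Here is a minimal Python sketch of that recipe using the example hitter’s rates; the names `RATES` and `combo_probability` are invented for illustration, and it assumes every PA is independent with fixed rates.

```python
from math import factorial, prod

# Example hitter from the text: 20% 1B, 5% 2B, 1% 3B, 9% HR, the rest non-hits.
RATES = {"1B": 0.20, "2B": 0.05, "3B": 0.01, "HR": 0.09}
RATES["NonHit"] = 1 - sum(RATES.values())  # 0.65

def combo_probability(counts):
    """Odds of a given combination of results, in any order."""
    n_pas = sum(counts.values())
    arrangements = factorial(n_pas)
    for c in counts.values():
        arrangements //= factorial(c)
    # Probability of one specific ordering of this combination.
    one_sequence = prod(RATES[kind] ** c for kind, c in counts.items())
    return arrangements * one_sequence

cases = [
    {"1B": 3, "2B": 1, "3B": 1, "HR": 1},               # about 0.0000432
    {"1B": 2, "2B": 2, "3B": 1, "HR": 1},               # about 0.0000162
    {"1B": 2, "2B": 1, "3B": 1, "HR": 1, "NonHit": 1},  # about 0.0004212
    {"1B": 1, "2B": 1, "3B": 1, "HR": 1, "NonHit": 2},  # about 0.0013689
]
for counts in cases:
    print(counts, f"{combo_probability(counts):.7f}")
```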

Now you have to repeat the process for combos involving things like 3 triples (not likely), 2 doubles & 2 HRs, etc., but hopefully you get the point. The permutation counts follow the same patterns, but the odds calculations will differ. Once you’ve figured them all out, add them up, and that gives you the total odds of that player hitting for the cycle given that many PAs.
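Adding up every qualifying combination by hand gets tedious fast, so here is a rough Python sketch that loops over all the ways to split a given number of PAs among the five outcome types and keeps only the splits with at least one single, double, triple, and HR. The function name `cycle_probability` is hypothetical, the rates are the example hitter’s, and independence between PAs is assumed throughout.

```python
from itertools import product
from math import factorial, prod

RATES = {"1B": 0.20, "2B": 0.05, "3B": 0.01, "HR": 0.09}
RATES["NonHit"] = 1 - sum(RATES.values())
HITS = ("1B", "2B", "3B", "HR")

def cycle_probability(n_pas, rates=RATES):
    """Total odds of at least one of each hit type in exactly n_pas PAs."""
    kinds = list(rates)
    total = 0.0
    # Try every way of assigning a count (0..n_pas) to each outcome type.
    for counts in product(range(n_pas + 1), repeat=len(kinds)):
        if sum(counts) != n_pas:
            continue  # counts must add up to the number of PAs
        tally = dict(zip(kinds, counts))
        if any(tally[h] == 0 for h in HITS):
            continue  # missing a single, double, triple, or HR: not a cycle
        arrangements = factorial(n_pas)
        for c in counts:
            arrangements //= factorial(c)
        total += arrangements * prod(rates[k] ** c for k, c in tally.items())
    return total

for n in range(3, 8):
    print(n, "PAs:", f"{cycle_probability(n):.6%}")
```

Brute force like this is fine for 7 or fewer PAs; a cleverer counting method would be needed if the loop ever became a bottleneck.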

The next step is doing the same sort of procedure for different given PA levels. You know that a cycle is impossible if you only get 3 PA in a game, so you can skip that. At 4 PAs, it’s really simple: there’s only one combination that allows you to get the cycle, and there are no repeats in it, so there are 4! = 24 permutations of {1B, 2B, 3B, HR}. At 5 PAs, we’ll either have 1 non-hit in the mix (5! = 120 permutations) or 1 repeat of a 1B, 2B, 3B, or HR (5!/2! = 60 permutations for each of those four possible repeats). And so on. After that, we need to find out how likely a player is to get 4 PAs, 5 PAs, 6 PAs, etc. in a game. I just did an analysis on 2012 Retrosheet data and found (by lineup position):

# of PAs in Game Leadoff 2nd 3rd 4th 5th 6th 7th 8th 9th Average
2 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.1% 0.1% 0.1% 0.0%
3 0.2% 0.5% 1.4% 3.2% 5.8% 10.7% 17.0% 25.7% 34.8% 11.0%
4 44.5% 52.6% 60.1% 65.7% 69.6% 69.9% 67.7% 62.7% 56.2% 61.0%
5 48.6% 41.5% 34.2% 27.8% 21.8% 17.0% 13.3% 9.8% 7.4% 24.6%
6 5.5% 4.3% 3.4% 2.7% 2.3% 1.9% 1.6% 1.4% 1.2% 2.7%
7 0.9% 0.7% 0.6% 0.5% 0.4% 0.4% 0.3% 0.3% 0.3% 0.5%
8 0.3% 0.2% 0.2% 0.1% 0.2% 0.1% 0.1% 0.1% 0.1% 0.2%
9 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0%
10 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0%

As you can see, hitting towards the top of a lineup can make a huge difference in how often a player will get those crucial 5+ PA games.
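To turn the per-PA-count odds into a per-game number, weight each one by how often that many PAs actually happens. The Python sketch below does that with the “Average” column of the table above. Two loud assumptions: it uses the example hitter’s rates from earlier (not real MLB averages), and it computes the odds for each PA count with an inclusion-exclusion shortcut rather than adding up every combination, which should give the same answer with less code. All names are made up for illustration.

```python
from itertools import combinations

# Example hitter from the text (hypothetical rates, not MLB averages).
HIT_RATES = {"1B": 0.20, "2B": 0.05, "3B": 0.01, "HR": 0.09}

# "Average" column of the 2012 table: PAs in game -> share of games.
# Games with 2-3 PAs cannot produce a cycle; 8+ PAs are ignored, as in the text.
PA_SHARE = {4: 0.610, 5: 0.246, 6: 0.027, 7: 0.005}

def cycle_given_pas(n_pas, hit_rates=HIT_RATES):
    """P(at least one of each hit type in n_pas PAs), via inclusion-exclusion."""
    rates = list(hit_rates.values())
    total = 0.0
    for k in range(len(rates) + 1):
        # Alternately add and subtract the odds that k chosen hit types never occur.
        for missing in combinations(rates, k):
            total += (-1) ** k * (1 - sum(missing)) ** n_pas
    return total

def cycle_per_game(pa_share=PA_SHARE):
    """Weight the odds at each PA count by how often that PA count occurs."""
    return sum(share * cycle_given_pas(n) for n, share in pa_share.items())

p = cycle_per_game()
print(f"Per-game cycle odds for the example hitter: {p:.4%}")
print(f"About one cycle every {1 / p:,.0f} games")
```

Swapping in a different column of the table for `PA_SHARE` shows how much the lineup spot matters.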

The Results

Using the 2012 PA breakdowns from above and MLB averages for 2010 through last week, I found the average odds of hitting for the cycle for a hitter with an “average” lineup spot to be about 0.0044% per game, or about once every 23,000 games. Well, maybe you can bump those odds up a little bit, because I didn’t consider the results of 8 PAs in a game and beyond. For a leadoff hitter (with the same MLB average stats, not with typical leadoff-hitter stats), it would be about 0.0071% per game, or close to once every 14,000 games.

It turns out that Trout does indeed appear to be the likeliest batter in the majors to hit for the cycle, assuming a neutral lineup position. He’s been hitting 2nd in the lineup recently, but if he were hitting first, you might figure him for a (relatively) whopping 0.0375% chance of the cycle per game, or better than once per 2,700 games, based on his career rates. Add in the fact that he’s in a good lineup (and should therefore get more PAs than average) and things look even better for him. But even if we optimistically put him at a cycle per 2,500 games, that’s about once every 16 seasons, on average. Since triples are the hardest part of hitting for the cycle, we have to wonder how easy it will be for a bulky Trout to leg out a triple as he advances in age. It’s not going to get any easier, that’s for sure. So, sure, there’s a pretty good chance he’ll have another cycle, relative to most players, but it’s probably a 50-50 shot, at best.
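For the curious, here is how a per-game chance can be converted into “once per N games” and into a rough career-level chance. This is a back-of-the-envelope Python sketch: it assumes every game is independent with the same odds, and the 2,000 remaining games is a made-up figure for illustration, not a projection.

```python
def games_per_cycle(p_game):
    """Average number of games per cycle at a fixed per-game chance."""
    return 1 / p_game

def chance_of_at_least_one(p_game, games):
    """Chance of one or more cycles over a stretch of independent games."""
    return 1 - (1 - p_game) ** games

# Leadoff figure from the text: 0.0375% per game.
print(f"{games_per_cycle(0.000375):,.0f} games per cycle")

# Optimistic one-per-2,500 rate over a hypothetical 2,000 remaining games.
REMAINING_GAMES = 2000  # assumption for illustration only
print(f"{chance_of_at_least_one(1 / 2500, REMAINING_GAMES):.0%}")
```

Under those assumptions the second number comes out a little above one half, and any decline in his triples rate would pull it lower.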

Oh, and 4th place on that list, by the way, is Bryce Harper, at better than once per 3,600 games. The projections see him hitting fewer triples than he showed us last year, though. Here are your top 25 by chance of a cycle per game (based on 2010-present historical numbers, signified by “H”, or on the average of the updated Steamer and ZiPS 2013 projections, signified by “P”, with a 400 PA minimum):

Player Leadoff (H) Leadoff (P) Mid-lineup (H) Mid-lineup (P)
Mike Trout 0.03754% 0.04023% 0.02331% 0.02495%
Tyler Colvin 0.03452% #N/A 0.02133% #N/A
Carlos Gonzalez 0.03360% 0.03402% 0.02090% 0.02112%
Bryce Harper 0.02824% 0.01958% 0.01748% 0.01212%
Carl Crawford 0.02759% 0.01398% 0.01712% 0.00867%
Jose Reyes 0.02351% 0.02053% 0.01461% 0.01276%
Josh Hamilton 0.02349% 0.01411% 0.01459% 0.00871%
Dexter Fowler 0.02271% 0.02206% 0.01405% 0.01364%
Seth Smith 0.02102% 0.00653% 0.01300% 0.00404%
Yoenis Cespedes 0.02065% 0.01379% 0.01279% 0.00852%
Shane Victorino 0.02062% 0.01106% 0.01276% 0.00685%
Ryan Braun 0.02028% 0.01888% 0.01262% 0.01172%
Corey Hart 0.02027% #N/A 0.01256% #N/A
Carlos Gomez 0.02012% 0.03331% 0.01245% 0.02067%
Todd Frazier 0.01996% 0.01141% 0.01234% 0.00704%
Curtis Granderson 0.01885% 0.01216% 0.01163% 0.00750%
Robinson Cano 0.01842% 0.00807% 0.01145% 0.00501%
Logan Morrison 0.01800% #N/A 0.01111% #N/A
Peter Bourjos 0.01797% #N/A 0.01110% #N/A
Will Venable 0.01792% 0.01449% 0.01107% 0.00894%
Brett Lawrie 0.01754% 0.01707% 0.01086% 0.01054%
Stephen Drew 0.01736% 0.01452% 0.01072% 0.00895%
Troy Tulowitzki 0.01699% 0.01187% 0.01055% 0.00738%
Andrew McCutchen 0.01676% 0.01215% 0.01039% 0.00753%
Melky Cabrera 0.01604% 0.01220% 0.00997% 0.00757%

Here are some actual, historical stats:

Cycles

And a downloadable spreadsheet for you, if you want to see all of my calculations (watch out — it’s not pretty, and the numbers are from last week):

Some Caveats:

  • We can’t really be sure how relevant a player’s past rates are to their future odds (especially triples, since there are so few of them). You can try out Steamer or ZiPS projections in place of past performance, if you download the spreadsheet.
  • Park and lineup effects can make a big difference (changing teams can change the odds).
  • I’m sure PA frequency breakdowns aren’t entirely consistent between years, yet mine are based on only 2012 data.
  • I only worked this out through 7 PAs in a game. Obviously, if you get 8 or 9 PAs in a game, your odds of a cycle go up considerably… but that rarely happens, especially over 9 innings.
  • Triples rates are basically the deciding factor here, and they are hard to predict.





Steve is a robot created for the purpose of writing about baseball statistics. One day, he may become self-aware, and...attempt to make money or something?


Thufir
Guest
3 years 4 months ago

Babe Herman laughs at your calculations. From the grave.

Excellent post, my money is on Trout.

JC
Guest
3 years 4 months ago

Really cool stuff, it’s cool when stats breakdown stuff is shown on here, thanks.

Mike
Guest
3 years 4 months ago

Awesome piece… Just the kind of math I was in the mood for after a long weekend.

Would it be fair to say, based partially on your last sentence (Triples rates are the deciding factor) that the 3 things that matter most, in order, would be 1) Speed, 2) in-play% and 3) better-than-non-existent power?

As I see it, if a guy has wheels, puts the ball in play often enough, and can fathomably put one over the wall, he’s a candidate for the cycle.

Dave
Guest
3 years 4 months ago

Awesome post. It would be fun to do the same things for perfect games, no-hitters, 4-homer games, etc.

Jake
Guest
3 years 3 months ago

Calculating the odds of a perfect game are actually pretty straightforward. All you need to know is the odds of any hitter reaching base and raise it to the 27th power. I tried this just for fun and found a few interesting things to consider.

1) You need to subtract IBB% from OBP, because a perfect game assumes there are no IBBs.
2) You need to add the percentage of plate appearances that end in an error to OBP.
3) An error can be recorded on a pop foul and a pitcher can still through a perfect game (yes?)
4) Throwing errors due to players attempting to throw an advancing runner out are hard to remove from base data.

I came up with roughly 0.00000000001% chance of a perfect game occuring(on average).

Jake
Guest
3 years 3 months ago

I really need to proofread before I click “Post Comment”. Woof.

Jake
Guest
3 years 3 months ago

And check my math, because I totally did that backwards. You need to determine the odds of a plate appearance ending in an out. I recalculated and came up with 0.002%. That means that there should be roughly one perfect game per decade.

A fifth thing to consider: These odds assume that the team who throws the perfect game also score a run before there are extra innings (I’m looking at you ’95 Expos).

Gabriel Syme
Guest
3 years 3 months ago

It also assumes all pitchers are of average ability, and that their skill level on a given day does not vary.

Ian R.
Guest
3 years 3 months ago

In response to #3, yes, an error that does not allow a baserunner (i.e. a dropped foul ball) does not spoil a perfect game. As far as I’m aware that’s never actually happened in a perfecto at the major league level, but it could.

Jason H
Guest
3 years 4 months ago

This all assumes that the likelihood of batter outcomes are the same in each plate appearance. But of course, they vary depending upon lots of things, especially the pitcher they are facing. In actual individual games, the odds must go up and down. It would be interesting to know the magnitude of the variance. It would also be interesting to know how the odds change during a game depending on the events of the game. For example, if you hit your HR, 3B and 2B early, there is a good chance your team has scored a lot of runs, and you might be facing mop up relievers later in the game, etc.

Evan DS
Guest
3 years 4 months ago

This is true. This analysis should only work on a large scale, that is, something along the lines of: if there have been x player games this year, what is the likelihood of there having been greater than 1 cycle hit? Percentages on the order of these, near a hundredth of a percent, are kind of meaningless for dealing with one or two games.

Spit Ball
Guest
3 years 3 months ago

Works both ways though. You might be facing mop-up relievers; you might be facing Mariano Rivera. He did quite a bit of math to come to the conclusion he did. There will always be questions with this stuff. That’s part of what makes it fun.

evan
Guest
3 years 4 months ago

since hitting for the cycle is the only thing i care about, i have no choice but to insist that tyler colvin is the second best player in all of baseball.

Ransom
Member
3 years 4 months ago

Agreed. Anticipating this column, I had already traded for Colvin in my 1 category cycle fantasy baseball league. It’s a keeper league, so now I’m pretty happy with that move.

evan
Guest
3 years 3 months ago

unless you traded trout for him, good move.

i’m in a similar league, but it includes one pitching category.

balks.

TB
Guest
3 years 4 months ago

Nice post! Reminded me a bit of the perfect game odds discussion in Tango’s forum.

bcholm
Guest
3 years 4 months ago

So what are/were the odds of Aaron Hill hitting for the cycle twice last year? Especially given that it was, well . . . Aaron Hill?

Pinstripe Wizard
Member
3 years 4 months ago

Nerd!

chuckb
Member
3 years 3 months ago

Isn’t it easier, and at least as accurate, to take the number of times a hitter has hit for the cycle divided by the total number of games played to determine its probability? Using the play index at baseball-reference, I found that players have hit for the cycle 239 times since 1916. Dividing that by the number of games played since 1916 gives one a pretty strong sample with which to work.

Todd
Guest
3 years 3 months ago

If you wanted to get a sense of the probability that any player will hit for the cycle in a given season, this may be useful. But the analysis here is better for dealing with specific players. Additionally, the raw number of cycles over a 100 year period could be misleading since it is not adjusted for context (i.e., cycles are more common in some eras than in others).

river-z
Guest
3 years 3 months ago

Congratulations on all that crazy math, which impresses me, though I am sad to say I didn’t understand much of it. The reason it seems to me that Trout is likely to hit for another cycle sometime is that he gets quite a few triples, the hardest part of the cycle. He has 6 already this season.

Ballfan
Guest
3 years 3 months ago

cool post – but you guys are missing the most incredible part of Trout’s cycle

he posted negative WPA – first time ever a player hit for the cycle, and did not contribute to the win….amazing

http://blogs.thescore.com/mlb/2013/05/23/mike-trout-cycle-wpa-zero-angels/

Nickname Damur
Guest
3 years 3 months ago

Bit of trivia: In 2234 games in the major leagues, John Olerud hit 13 triples. He hit for the cycle twice.

Jason H.
Guest
3 years 3 months ago

Question: How many times did he hit for the cycle minus the triple? Is hitting for the cycle twice unexpected given that he only hit 13 career triples?

Jason B.
Guest
3 years 3 months ago

On a related note – it irks me when announcers say a player is “a triple short of the cycle”. While technically true, it’s just not that notable of an accomplishment, since it’s the rarest/hardest part that they’re lacking. “Well gee, we almost got the weather cycle this September day – we had sunshine, high winds, and rain, we were just lacking the snow!” or “I almost completed the cycle – I played little league, high school, and college ball, just didn’t make the majors!” No crap! You’re not special!

/Digression ended/

Dan
Guest
3 years 3 months ago

Bengie Molina hit for the cycle in 2010, still the most incredible thing I have ever seen. The odds of him even hitting a triple at that point must have been one-in-a-million.

Tim A
Guest
3 years 3 months ago

One other interesting variable is that players look to cycle. Once you have that HR and triple it gets a lot easier, and players will sit at first on a double to complete it. I don’t know how this would affect things, but I have to assume that when your random distribution gives you the hardest parts, the odds spike. I would wonder how many times last season Trout hit HR 3B in his first 4 PA so that he was hunting for hits that were easier for him to complete. If for instance he had 5 hard side cycle starters last season, then I would put the odds at like 5% he completes one. I would maybe put it at 1/4 the original odds you proposed, since he will get at least 3-5 games this season with HR/3B, and since he will want to cycle if they fall early, he will try for it whenever distribution puts him in a position to. If you have the other 3, you’re hitting for the fourth part. One other curiosity: has anyone hit for the cycle with an inside-the-park homer as the HR?

Joe
Guest
3 years 3 months ago

The last “inside-the-park” cycle was apparently Harry Danning of the New York Giants, on June 15, 1940. He was a catcher of all positions!

http://www.fangraphs.com/statss.aspx?playerid=1002978&position=C

Jays_Sask
Guest
3 years 3 months ago

Players are also more likely to hit a triple if that is the last hit they need because they will often try to stretch doubles into triples in that situation.

shapular
Guest
3 years 3 months ago

I’m being pretty nitpicky, but odds and probability are not the same thing.

MrKnowNothing
Guest
3 years 3 months ago

Trout will do it five more times in his career, at least. Because he’s Mike Trout and he’s awesome.

65Kyle08
Member
3 years 3 months ago

like Jay Gatsby im here to f*ck your b*tch
put moves on her like Will Smith in Hitch

Zach
Guest
3 years 3 months ago

So how would you figure out what the probability is, given that Mike Trout homered and tripled in his first two at bats, that he will hit for the cycle that game?

Jack Meough
Guest
3 years 3 months ago

Steve,
what are the chances andrew mccutchen will hit for the cycle tonight?

Thanks,
Mr. Meough

Eric Hainline
Guest
3 years 3 months ago

If Mike Trout “…gets a hit 40% of the time overall, and makes an out the remaining 60% of the time…” I will be thrilled beyond belief, and not care at all if he ever hits for another cycle.

Aaron Steindler
Guest
3 years 3 months ago

I’m not sure that you can calculate the odds of a perfect game that simply. I understand the concept, but why not use the starting pitcher’s on base percentage against instead of the average hitter’s OBP? Or perhaps finding the percentage of Quality starts and multiplying by the OBP against of the quality starts? I think there are several approaches to making this determination, and I don’t know if they are self consistent.

gdc
Guest
3 years 3 months ago

Seems that someone has looked at probability using random distribution here as opposed to actual. With the streakiness of players it should help here but hurt things like extra-long hitting streaks. Just like the odds of striking out 20 batters would be pretty thin if teams had even distribution of talent but are more likely with the existence of the Astros.

FHOFRI
Guest
3 years 3 months ago

Ever consider that the probabilities of two or more hits are not independent of each other? Meaning, a player gets a hit or two, generates momentum (gets “hot”), and the probability of hitting for the cycle increases. Basically capitalizing on early game success. In this way, you could think about it as a conditional probability (not just multiplying the two probabilities together). Great post, great read, cheers.

chief00
Member
3 years 3 months ago

Steve, this is a great piece. I’m a trained history, geography, and theology guy, so most of the math is lost on me but I enjoyed the article!

A (very shallow) thought occurred to me as I’ve read the article and the comments: since, say, 1990, being a Blue Jay at some point in your career significantly increases the probability of hitting for the cycle. I’m thinking of Jeff Frye (a cycle I witnessed!), Aaron Hill (x2), Dave Winfield, John Olerud (x2), Bengie Molina, Paul Molitor, Tony Fernandez, Jeff Kent, Jose Reyes, Fred Lewis, Melky Cabrera, Orlando Hudson, and Kelly Johnson. 81 cycles since 1990, 15 by players who played with the Jays at some point in their career.

Or maybe I’m looking at it backwards: if you’ve hit for the cycle, the probability that you have been/will become a Blue Jay is significantly increased. That makes it a management issue. :)

If only there was some genuine correlative factor. Well, perhaps I shouldn’t swim in the deep end of the pool…

Jason B.
Guest
3 years 3 months ago

Probably just AA seeking out new market inefficiencies…those cycle guys are way undervalued…

Homer Simpson
Guest
3 years 3 months ago

Me no function beer well without.

Scott Youngbauer
Guest
2 years 4 months ago

What would be the odds that someone would hit for the cycle in their very first collegiate game as a freshman? I know someone who accomplished this.

bill mason
Guest
2 years 4 months ago

My son hit for the cycle in order in 4 at-bats. What are the odds of that? It’s his junior year.
