﻿ Tools: Game and Series Win Probabilities | The Hardball Times

# Tools: Game and Series Win Probabilities

Just how frequently would the Astros beat an All-Star team? (via Keith Allison)

Last month, FanGraphs implemented its very cool Game Odds system, which estimates the chance of a team winning a particular game while factoring in the fact that home teams win about 54 percent of the time. A couple of months earlier, I’d shared a tool with the Hardball Times/FanGraphs crew that aimed to do basically the same thing, though using a different method. Well, here I am, finally, with an article to introduce you to that tool. For fellow lovers of probability, I’m including a second tool of mine that complements the first by figuring out what specific game win probabilities mean for a team’s overall chance of winning a series.

FanGraphs’ Game Odds factors in the 54 percent Home Field Advantage (HFA) by simply adding four percentage points to the home team’s chance of winning. That makes sense if both teams are completely league-average, .500-level teams, as the 54 percent HFA is the league average. But to take things to a ridiculous extreme, a home team that would be expected to win 97 percent of the time before taking home field advantage into account couldn’t be expected to win 97 percent + 4 percent = 101 percent of the time. Of course, it’s not realistic to expect one major league team to beat another 97 percent of the time; I’d imagine even the Astros could beat an All-Star team more than three percent of the time. Even so, you can see how there are going to be diminishing returns on home field advantage toward the extremes. That’s why I think there’s a more accurate (though more complicated) way to do the calculations than to simply add four percentage points to the home team’s chances. I believe that way might be the odds ratio formula.

### Astros vs. All-Stars

Just for fun: what would be the expected result of the Astros playing an All-Star team? Well, first of all, what is an All-Star team’s chance of beating a league average team? In other words, what would be an All-Star team’s expected winning percentage over a season (or more)? So that existing teams don’t lose their best players to this team, let’s assume the roster is composed of exact duplicates of the majors’ best players. Off the top of my head, I’ll guess they’d likely win somewhere between 70 percent and 80 percent of the time against an average team. Let’s go with 75 percent. Let’s also presume that the Astros’ .364 record as of this writing represents their true winning percentage, even though FanGraphs’ projections page has their rest-of-season win percentage at a more conservative .429.

So if the All-Stars would beat a .500 team 75 percent of the time, how often would they beat a .364 team? The log5 formula popularized by Bill James, which FanGraphs’ Game Odds uses, would say 83.98 percent of the time, to be precise. That’s without home field advantage being considered. If the All-Stars were the home team, FanGraphs’ system would put their chances of winning at 87.98 percent; if they were visiting, that would drop to 79.98 percent. My tool, which you’ll see shortly, assuming an HFA of 54 percent, would instead predict these replicants to win 86.02 percent of their home games vs. the Astros, versus 81.70 percent away. It’s a difference between the two methods of around two percentage points in either direction: not huge, but worth considering.
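As a quick check on the log5 figures above, here is a minimal Python sketch of log5 plus a flat four-point home bump, which is how this article describes the Game Odds adjustment. The function names are illustrative assumptions and this is not FanGraphs’ actual code.

```python
def log5(a: float, b: float) -> float:
    """Chance that a team with win rate a beats a team with win rate b,
    ignoring home field advantage."""
    return a * (1 - b) / (a * (1 - b) + (1 - a) * b)


def additive_hfa(neutral: float, is_home: bool, bump: float = 0.04) -> float:
    """Apply a flat home-field bump: add it for the home team, subtract it
    for the visitor. Nothing stops the result from leaving the 0-1 range."""
    return neutral + bump if is_home else neutral - bump


# The article's guesses, not measured values.
all_stars, astros = 0.75, 0.364

neutral = log5(all_stars, astros)
print(f"neutral: {neutral:.4f}")                       # neutral: 0.8398
print(f"home:    {additive_hfa(neutral, True):.4f}")   # home:    0.8798
print(f"away:    {additive_hfa(neutral, False):.4f}")  # away:    0.7998

# The flat bump breaks down at the extremes: 0.97 + 0.04 is more than 1.
print(additive_hfa(0.97, True) > 1.0)                  # True
```

The last line is the "101 percent" problem in miniature: a flat bump has no way of tapering off as the neutral-site probability approaches 100 percent.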

### The Math

The odds ratio formula, the use of which was introduced to me by a number of sabermetric all-stars in a piece on batter-pitcher match-ups last year, is one of my favorite tools nowadays. It’s an offspring of the famous Bayes’ Theorem held dear by many a statistician, and a more sophisticated sibling of the aforementioned log5 formula. The advantage the odds ratio formula (or at least this type of odds ratio formula) has over log5 is that it can take into account league averages, and can therefore apply to circumstances other than the one log5 was developed for.

What is that circumstance? Well, take a look at the formula:

log5: Chance of Team A winning against Team B = A*(1 - B) / (A*(1 - B) + (1 - A)*B)

…where “A” and “B” are the respective overall winning percentages of Teams A and B.

Now compare that to the odds ratio used in my calculations (in a simplified version):

Chance of Team A winning against Team B = 1 / (1 + B*(1 - A)*(1 - H) / A / H / (1 - B))

…where “A” and “B” are the respective overall winning percentages of Teams A and B, and “H” is Team A’s Home Field Advantage (by default, A’s HFA is 54 percent if it’s the home team, or 46 percent if visiting).

In an easier-to-read form:

$$P(\text{A beats B}) = \frac{1}{1 + \dfrac{B\,(1-A)\,(1-H)}{A\,(1-B)\,H}}$$

With a little algebra, you can see that if you were to take the (1-H) and the H out of the odds ratio equation, you’d end up with the log5 formula. Or, if you had H = 0.5, then 1-H would also equal 0.5; the two would cancel each other out, and the odds ratio formula would again give the same result as log5.

What’s happening in the formula, in a nutshell, is this:

• (1-A) is the basic chance of Team A losing
• B is the basic chance of B winning
• (1-H) is the basic chance of a home team losing

If Team A is the home team, then all three of these events have to happen simultaneously for Team A to lose the game, right? That means you have to multiply them all out, and together they combine to relate to the chance of Team A losing. By the way, by “basic chance,” I mean as an overall average, i.e., not specific to this match-up.

Meanwhile:

• A is the basic chance of Team A winning
• (1-B) is the basic chance of Team B losing
• H is the basic chance of a home team winning

Multiplied out together, the combination relates to the chance of Team A winning. That means the equation overall is equal to:

1 / (1 + (A’s losing factor) / (A’s winning factor))

The two factors together make up the L:W ratio for Team A. For example, if Team A has a “true” .620 record, Team B has a true record of .550, and home field advantage for Team A is 60 percent, then the losing factor comes to about .0836, and the winning factor comes to about twice that much, at .1674. You could simplify this to a 1:2 L:W ratio (or 2:1 W:L), or you could run it through the rest of the formula to convert that to a 66.7 percent chance of A winning the match-up.

The rest of the equation simply serves to convert the odds ratio into a probability. The result of the losing factor divided by the winning factor can range between 0 (when there’s no chance of losing) and infinity (when there’s no chance of winning), meaning the end result of the overall equation can range only between zero and 100 percent.

Has home field advantage been integrated into the odds ratio formula in this way before? I’m sure it has, but I bet you haven’t seen a handy tool to do it like this before:
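For readers who prefer code to spreadsheets, the losing-factor and winning-factor calculation described above can be sketched in Python. This is a sketch, not the spreadsheet tool itself; the function names (`factors`, `odds_ratio_win_prob`, `log5`) are illustrative assumptions.

```python
def log5(a: float, b: float) -> float:
    """Plain log5, with no home field term."""
    return a * (1 - b) / (a * (1 - b) + (1 - a) * b)


def factors(a: float, b: float, h: float) -> tuple[float, float]:
    """Return (losing factor, winning factor) for Team A.

    a, b: overall winning percentages of Teams A and B
    h:    Team A's home field advantage (e.g. 0.54 at home, 0.46 away)
    """
    losing = (1 - a) * b * (1 - h)
    winning = a * (1 - b) * h
    return losing, winning


def odds_ratio_win_prob(a: float, b: float, h: float = 0.54) -> float:
    """Chance of Team A beating Team B.

    Algebraically the same as 1 / (1 + losing / winning), rearranged so a
    winning factor of zero does not cause a division by zero.
    """
    losing, winning = factors(a, b, h)
    return winning / (winning + losing)


# The worked example from the text: A = .620, B = .550, H = 60 percent.
losing, winning = factors(0.62, 0.55, 0.60)
print(f"{losing:.4f} {winning:.4f}")                   # 0.0836 0.1674
print(f"{odds_ratio_win_prob(0.62, 0.55, 0.60):.3f}")  # 0.667

# The All-Stars (.750) vs. the Astros (.364), at home and away.
print(f"{odds_ratio_win_prob(0.75, 0.364, 0.54):.4f}")  # 0.8602
print(f"{odds_ratio_win_prob(0.75, 0.364, 0.46):.4f}")  # 0.8170
```

Setting `h=0.5` makes the home field terms cancel, so `odds_ratio_win_prob(a, b, 0.5)` agrees with `log5(a, b)`, as the algebra suggests.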

### Exact Series Breakdown Tool

This follow-up tool shows the expected outcomes of series that are best of three, five or seven games, according to how many wins are needed for a team to clinch the series (two, three or four, respectively). As before, the white cells are the ones you can overwrite.

If the chance of a team winning each game in the series is identical, then there is no point in using this tool; the binomial distribution method used in the first tool will give you identical results. However, to provide a ridiculous example of a situation where this tool will give you much more accurate estimates than simply running the overall average win rate through the binomial distribution method: say your team is made up of Terminator androids from the future that run out of batteries after a couple of games in the series. In a five-game series, they have a 100 percent chance of winning each of the first two games, but zero chance of winning each of the next three. Overall, this team would have zero chance of winning the series, right? Taking the average win rate of (100% + 100% + 0% + 0% + 0%)/5 = 40% and feeding it into the first tool would tell you the team has a 31.7 percent chance of winning the series. The second tool will get it right, though, at zero percent.
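The Terminator example can be reproduced with a short Python sketch that tracks every live series score game by game. The spreadsheet’s internal logic isn’t shown in this article, so this is one assumed way to get an exact breakdown, and the names `series_breakdown` and `series_win_prob` are hypothetical.

```python
def series_breakdown(game_probs, wins_needed):
    """Return {(result, games_played): probability} for a best-of series.

    game_probs[i] is our team's chance of winning game i + 1, if it is played.
    result is "win" or "loss" from our team's point of view.
    """
    if len(game_probs) != 2 * wins_needed - 1:
        raise ValueError("need one probability per possible game")

    # live[(w, l)] = chance the series stands at w wins and l losses
    # with neither team having clinched yet.
    live = {(0, 0): 1.0}
    outcomes = {}
    for game, p in enumerate(game_probs, start=1):
        next_live = {}
        for (w, l), prob in live.items():
            for nw, nl, nprob in ((w + 1, l, prob * p), (w, l + 1, prob * (1 - p))):
                if nw == wins_needed:
                    key = ("win", game)
                    outcomes[key] = outcomes.get(key, 0.0) + nprob
                elif nl == wins_needed:
                    key = ("loss", game)
                    outcomes[key] = outcomes.get(key, 0.0) + nprob
                else:
                    next_live[(nw, nl)] = next_live.get((nw, nl), 0.0) + nprob
        live = next_live
    return outcomes


def series_win_prob(game_probs, wins_needed):
    """Total chance of our team winning the series."""
    breakdown = series_breakdown(game_probs, wins_needed)
    return sum(p for (result, _), p in breakdown.items() if result == "win")


# Androids that are unbeatable for two games, then out of batteries.
print(f"{series_win_prob([1.0, 1.0, 0.0, 0.0, 0.0], 3):.3f}")  # 0.000

# The same team flattened to its 40 percent average win rate.
print(f"{series_win_prob([0.4] * 5, 3):.3f}")                  # 0.317
```

With a constant per-game probability the result matches the binomial approach, which is why the two only diverge when the game-by-game chances differ.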

On a more practical level, this tool becomes more useful when there are lopsided starting pitching matchups on both sides. It also does a better job of accounting for any other significant game-to-game differences, such as home field advantage.
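To illustrate the home field point, here is a hedged Python sketch that feeds game-by-game odds ratio probabilities into a series calculation. The 2-3-2 home pattern and the two .500 teams are assumptions chosen for illustration, and the brute-force enumeration is swapped in for brevity; it is not the spreadsheet’s method.

```python
from itertools import product


def game_prob(a: float, b: float, h: float) -> float:
    """Odds ratio chance of Team A beating Team B, with A's home field term h."""
    winning = a * (1 - b) * h
    losing = (1 - a) * b * (1 - h)
    return winning / (winning + losing)


def series_win_prob(game_probs) -> float:
    """Chance of winning a majority of the games.

    Enumerates every win/loss sequence as if all games were played; playing
    on after a clinch does not change which team wins the series.
    """
    n = len(game_probs)
    needed = n // 2 + 1
    total = 0.0
    for results in product((True, False), repeat=n):
        if sum(results) >= needed:
            prob = 1.0
            for won, p in zip(results, game_probs):
                prob *= p if won else 1 - p
            total += prob
    return total


# Hypothetical best-of-seven in a 2-3-2 format: Team A hosts games 1, 2, 6, 7.
home_pattern = [True, True, False, False, False, True, True]
probs = [game_prob(0.5, 0.5, 0.54 if at_home else 0.46) for at_home in home_pattern]

print(f"{series_win_prob(probs):.4f}")
```

For two evenly matched teams, the side with the extra home game comes out slightly above 50 percent for the series, but by less than the four-point edge it holds in any single home game.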

I tried to make this tool with a logic that can be expanded to larger series lengths, should any of you have some Excel skills and feel like doing it yourselves.  The tools can be downloaded via the green and white Excel icons at the bottom of each.

### Next on the Agenda

[Probably] coming in the future will be another tool that attempts to make predictions based on run-scoring and run-prevention traits of the teams involved, along with some historical testing.  Until then, happy statisticking!

Steve is a robot created for the purpose of writing about baseball statistics. One day, he may become self-aware, and...attempt to make money or something?
Guest
tz

I will not play with this at work today.
I will not play with this at work today.
I will not play with this at work today.
I will not play with this at work today.
I will not play with this at work today.
I will not play with this at work today.
I will not play with this at work today.
I will not play with this at work today.
.
.
.
.

Guest
Cliff Blau

In his 1984 Baseball Abstract, Bill James showed that the home-field advantage was proportionate to the team’s record. Thus, a team that won 60% of its games had a larger home-field advantage than a team which won 40% of its games. Jim Heg expanded on this in the February 1989 issue of the Baseball Analyst: https://sabr.box.net/shared/static/gc4uubnd5a5mzg1xig8o.pdf

Editor
Member

Awesome, Cliff. Thanks for the reminder. This is a study that’s definitely worth updating and trying to incorporate into Steve’s spreadsheet.

Guest

Nice. I’m a hockey analyst that built a very similar tool for the NHL playoffs that worked out the home-ice implications of the NHL’s 2-2-1-1-1 series format. Two big implications: 1) the team starting the series at home gets a huge advantage by winning the opening 2 games; 2) if the team starting the series on the road is better, sweeps or six-game series wins are more likely than five- or seven-game outcomes.

Guest
Alan McIntire
Bill James’ formula assumes an infinite number of teams in the league. There should be a correction for the number of teams in the league. Starting simple, consider a 2 team league, where one team won 60% of the games and the other won 40%. Bill James’ formula gives the win to loss ratio as (0.6*0.6)/(0.4*0.4), or a 36 to 16 ratio, when the actual ratio is 0.6 to 0.4. To get the ACTUAL power ratings for a two team league, take the win/loss ratio to the 1/2 power. For the two team league, (0.6/0.4)^…