Game theory and baseball, part 1: concepts

Game Theory is the study of strategic decision-making, which lends itself very well to baseball. At the player level, the manager level, and the general manager level, individuals make decisions based on beliefs about the expected actions of others, all of which can be better understood using game theory. Having studied game theory for a decade and published sabermetrics for five years, I figured it was about time to apply the former to the latter.

A game-theoretic framework will justify some common baseball strategies, but it will often suggest that teams and players should behave differently. Before any of the specific applications can be discussed, though, we need a few concepts in place. Today’s article covers the basic game theory concepts that we’ll need going forward, with a simple application to illustrate each one. This gives us a broader framework for discussing pitch selection over the next few articles.

Strategies, best responses, Nash Equilibrium, and dominant strategies

The principal difference between game theory and other economics is the incorporation of other people’s actions into determining one’s own. A strategy is defined as an option that a player can choose at any point in time, conditional on the information that he has at the time. In a simultaneous move game, the strategies are just the allowable actions.

In baseball, a set of strategies could be all possible lineups that a manager could choose at the beginning of the game. Choosing lineups is a simultaneous move game that managers play at the beginning of the game. Strategic substitutions during the game, by contrast, make up a sequential move game, because each manager moves after seeing what the other has done.

The most important concept in game theory to understand is “best responses,” as incorporated in a Nash Equilibrium. A Nash Equilibrium is an outcome to a game such that all players have “best responded” to each other’s strategies. In other words, no one would regret his strategy, based on the revelation of other players’ strategies.

In previous work, I discussed why slotting was an effective way to keep bonuses down for drafted players. Let’s use a one-shot version of that game (with no future drafts to follow) as an example and see why slotting would not work without future drafts.

Suppose that the Dodgers and the Giants, two high-spending division rivals, are fighting for the National League West, and they are considering either:

(A) going over slot for a highly touted first-round pick out of high school and paying a large bonus
(B) paying the recommended slot bonus for a college senior in the first round

There are a few similar players at each spot, too, so the Dodgers and Giants can each select a high school or college player as they see fit. And let’s suppose for simplicity that all contracts are secretly hammered out in advance (i.e. decisions are not actually sequential). Both teams must decide whether to work out a contract with a highly touted high school prospect well in advance without knowing what the other team will do.

Assuming all else equal, the added advantage in future division championships leads to payoffs as follows:

Table 1A

Dodgers \ Giants      | College draftee | High-school draftee
College draftee       | 5, 5            | 1, 7
High-school draftee   | 7, 1            | 3, 3

In the above table, we have the sets of strategies available to each team represented by the rows (for the Dodgers) and by the columns (for the Giants). In other words, the Dodgers get to pick the row that we end up in (making the Dodgers the “row player”) and the Giants pick the column (making them the “column player”).

Each cell lists the Dodgers’ payoff first, followed by the Giants’ payoff. So, for instance, the payoff to the Dodgers of picking a high-school draftee is seven if the Giants pick a college draftee, but the Giants’ payoff would be just one.

To figure out what the teams will do, we need to figure out what the best response would be to each pick. Let’s be the Dodgers here. If the Giants pick a college draftee, which payoff is higher—choosing a college draftee or a high-school draftee? The payoff of seven for a high-school draftee exceeds the payoff from drafting a college player of five—so the Dodgers would prefer a high-school draftee if the Giants pick a college draftee.

What about if the Giants pick a high-school draftee? In that case, the Dodgers’ payoff to a college draftee would be just one, while a high-school draftee would be worth a payoff of three. The Dodgers would be better off picking a high-school draftee in either case.

This means that the Dodgers have a dominant strategy: their best response is the same for any Giants action. The Giants’ decision is symmetric, so they also have a dominant strategy. Both teams pick high-school draftees and get payoffs of three, and we have a Nash Equilibrium, since neither team would regret its strategy upon learning the strategy of the other.
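
The dominant-strategy check above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the payoffs are copied from Table 1A, and the names `PAYOFFS`, `ACTIONS`, and `dominant_strategy` are made up for this example. It uses weak dominance (ties allowed), which makes no difference for these payoffs.

```python
# Payoffs from Table 1A, keyed by (Dodgers pick, Giants pick).
# Each value is (Dodgers payoff, Giants payoff).
PAYOFFS = {
    ("college", "college"): (5, 5),
    ("college", "high school"): (1, 7),
    ("high school", "college"): (7, 1),
    ("high school", "high school"): (3, 3),
}
ACTIONS = ["college", "high school"]


def dominant_strategy(player):
    """Return an action that is a best response to every opponent action.

    player 0 is the row player (Dodgers), player 1 the column player (Giants).
    Returns None if no such action exists.
    """
    def payoff(own, other):
        key = (own, other) if player == 0 else (other, own)
        return PAYOFFS[key][player]

    for own in ACTIONS:
        # "own" must do at least as well as every alternative,
        # whatever the other team picks.
        if all(payoff(own, other) >= payoff(alt, other)
               for other in ACTIONS for alt in ACTIONS):
            return own
    return None


print("Dodgers:", dominant_strategy(0))
print("Giants:", dominant_strategy(1))
```

Both calls should report the high-school pick, matching the reasoning in the text.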

Sometimes, it’s easiest to underline and bold the best responses so that you can find the best answer. An equilibrium occurs when both numbers are underlined and bolded in a cell.

Table 1B

Dodgers \ Giants      | College draftee | High-school draftee
College draftee       | 5, 5            | 1, 7*
High-school draftee   | 7*, 1           | 3*, 3*

(* marks a best response)
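
The marking procedure can also be done mechanically. The sketch below is a hypothetical helper, not anything from the original analysis: it flags a cell as an equilibrium when the row player’s payoff is the best in its column and the column player’s payoff is the best in its row, which is the same test as “both numbers are marked.” The names `pure_nash_equilibria` and `DRAFT_GAME` are assumptions for illustration.

```python
def pure_nash_equilibria(payoffs, row_actions, col_actions):
    """Return the cells in which both players are best-responding.

    payoffs maps (row_action, col_action) -> (row_payoff, col_payoff).
    """
    equilibria = []
    for r in row_actions:
        for c in col_actions:
            row_pay, col_pay = payoffs[(r, c)]
            # The row player's payoff is marked if no other row does
            # better in this column.
            row_best = row_pay == max(payoffs[(alt, c)][0] for alt in row_actions)
            # The column player's payoff is marked if no other column
            # does better in this row.
            col_best = col_pay == max(payoffs[(r, alt)][1] for alt in col_actions)
            if row_best and col_best:
                equilibria.append((r, c))
    return equilibria


# Table 1A: (Dodgers pick, Giants pick) -> (Dodgers payoff, Giants payoff).
DRAFT_GAME = {
    ("college", "college"): (5, 5),
    ("college", "high school"): (1, 7),
    ("high school", "college"): (7, 1),
    ("high school", "high school"): (3, 3),
}
ACTIONS = ["college", "high school"]

print(pure_nash_equilibria(DRAFT_GAME, ACTIONS, ACTIONS))
```

For the draft game, the only cell returned should be the one where both teams take a high-school draftee.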

Extensive form vs. normal form

The table format above is called a “normal form,” but there is another way of displaying games, which is called the extensive form. This is most useful for sequential games because it shows the order of decision-making.

Let’s revise some assumptions. Say the Dodgers pick first, and then the Giants choose after observing the Dodgers’ pick. In the normal form, simultaneous-move game above, I fudged this assumption by pretending that everything can be negotiated in advance, but let’s take that back.

Figure 1A


Let’s break down the extensive form above. There are three decision nodes:

(I) Dodgers choose high school or college.
(II) Giants observe Dodgers’ choice of high school and choose high school or college.
(III) Giants observe Dodgers’ choice of college and choose high school or college.

If the Dodgers choose high school, and the Giants see this and choose high school, they both get a payoff of three (blue for the Dodgers, orange for the Giants). If the Dodgers choose high school, and the Giants choose college having observed this, the Dodgers get a payoff of seven and the Giants get a payoff of one. And so on. Now, let’s solve this game.

Backwards induction

Another important concept in game theory is backwards induction. This is a concept in sequential games that describes how players solve for the equilibrium by working out all decisions backwards from the end of the game to the beginning.

To find the equilibrium using backwards induction, you figure out all of the “final decision” moments during the extensive form game. That means all of the “decision nodes” where a player has a new “information set” for the last time. These are Decision Node II (after the Dodgers have picked high school, when the Giants must pick high school or college) and Decision Node III (after the Dodgers have picked college, when the Giants must pick high school or college). In each of these instances, the Giants will pick high school, for the same reasons as before.

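Knowing this, we backwards induce to the Dodgers’ decision. The Dodgers know that if they pick high school, they will end up with a payoff of three, and if they pick college, they will end up with a payoff of one. So the Dodgers pick high school, the Giants follow with high school, and both teams get a payoff of three.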
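
As a rough sketch, the two steps of backwards induction for this game look like the following in Python. The payoffs are the same ones used in Table 1A; the function name `backward_induction` and the variable names are hypothetical, chosen only for this example.

```python
# (Dodgers pick, Giants pick) -> (Dodgers payoff, Giants payoff)
PAYOFFS = {
    ("college", "college"): (5, 5),
    ("college", "high school"): (1, 7),
    ("high school", "college"): (7, 1),
    ("high school", "high school"): (3, 3),
}
ACTIONS = ["college", "high school"]


def backward_induction():
    """Solve the sequential draft game from the last mover to the first."""
    # Step 1: at each Giants decision node (one per observed Dodgers pick),
    # the Giants choose the reply that maximizes their own payoff.
    giants_reply = {
        dodgers: max(ACTIONS, key=lambda giants: PAYOFFS[(dodgers, giants)][1])
        for dodgers in ACTIONS
    }
    # Step 2: the Dodgers anticipate those replies and choose the pick
    # that maximizes their own payoff.
    dodgers_pick = max(
        ACTIONS, key=lambda dodgers: PAYOFFS[(dodgers, giants_reply[dodgers])][0]
    )
    return dodgers_pick, giants_reply


pick, replies = backward_induction()
print("Giants' reply at each node:", replies)
print("Dodgers' pick:", pick)
print("Payoffs:", PAYOFFS[(pick, replies[pick])])
```

Note that the solution records the Giants’ reply at both nodes, including the one that is never reached, since a full strategy has to say what a player would do at every decision node.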

Sometimes it is easiest to draw over the top of the extensive form of the game with bright-colored lines to show the decisions that will be made. In the end, the equilibrium will be such that you can follow a red line from the top to the bottom.

Figure 1B


Below, you can see how we draw the extensive form of a simultaneous-move game. This is not always useful, but it is another way of looking at the game. If the Giants do not know what the Dodgers have selected by the time they have to make their decision (as in my original example), we draw a dashed line between the two possible decision nodes at which the Giants might reside. We label the nodes and the dashed line between them “II.”

Figure 2A


We’ll use red lines to solve this problem again, only we require that the Giants make the same decision regardless of what the Dodgers picked … because, after all, they don’t know what the Dodgers picked! Here is the solution: as above, both pick high-school draftees and they get payoffs of three.

Figure 2B


This is just the tip of the iceberg when it comes to “prisoner’s dilemma” games and various assumptions that can be made, but it is sufficient for now. Next, let’s consider mixed strategies.

Mixed strategies

Nash Equilibria are named after John Nash, because he proved that there are solutions to all games that meet certain criteria. However, the “solutions” sometimes entail “mixed strategies.” Mixed strategies are characterized by selecting probabilities of taking a given action. For example, a (dumb) mixed strategy above could entail choosing “college” 70 percent of the time and “high school” 30 percent of the time.
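
To see why that 70/30 mix is a poor choice in the draft game, here is a small sketch that computes the Dodgers’ expected payoff from a mixed strategy. It reuses the Table 1A payoffs; `dodgers_expected` is a hypothetical helper name, and exact fractions are used only to avoid floating-point noise.

```python
from fractions import Fraction

# (Dodgers pick, Giants pick) -> (Dodgers payoff, Giants payoff), from Table 1A.
PAYOFFS = {
    ("college", "college"): (5, 5),
    ("college", "high school"): (1, 7),
    ("high school", "college"): (7, 1),
    ("high school", "high school"): (3, 3),
}


def dodgers_expected(prob_college, giants_pick):
    """Dodgers' expected payoff when they pick college with prob_college."""
    return (prob_college * PAYOFFS[("college", giants_pick)][0]
            + (1 - prob_college) * PAYOFFS[("high school", giants_pick)][0])


mix = Fraction(7, 10)  # "college" 70 percent of the time
for giants_pick in ("college", "high school"):
    mixed = dodgers_expected(mix, giants_pick)
    pure = dodgers_expected(Fraction(0), giants_pick)  # always high school
    print(f"Giants pick {giants_pick}: 70/30 mix = {mixed}, always high school = {pure}")
```

Because high school is a dominant strategy, putting any weight on college lowers the Dodgers’ expected payoff against either Giants pick.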

So, let’s suppose that a batter is up and is deciding whether to swing or take on a full count, and let’s suppose the pitcher is deciding whether to throw a ball or strike. If the batter swings at a strike, he’ll get a hit and win the game; if he swings at a ball, he’ll miss and strike out, losing the game. Naturally, taking strike three will also lose the game, and let’s just say that taking a ball will effectively win the game because Superman is on deck. Thus, the effect on each player’s team’s record is reflected in the table below, with the pitcher’s payoff listed first:

Table 2A

Pitcher \ Batter | Swing | Take
Strike           | -1, 1 | 1, -1
Ball             | 1, -1 | -1, 1

For the sake of brevity, I will save the calculations for this for another article, but this game requires mixed strategies to find the Nash Equilibrium. Why is this true? You can see it when you realize that in every cell of the table above, one player wishes he had selected the other action; no cell involves both players “best-responding.” This may be easier to see by underlining and bolding the correct best responses for the batter and the pitcher.

Table 2B

Pitcher \ Batter | Swing  | Take
Strike           | -1, 1* | 1*, -1
Ball             | 1*, -1 | -1, 1*

(* marks a best response)

The way to view this game properly is to define strategies by “p” and “q” such that the batter selects “Swing” with probability “q,” and the pitcher selects “Strike” with probability “p.”

Table 2C

Pitcher \ Batter                | Swing (with probability “q”) | Take (with probability “1-q”)
Strike (with probability “p”)   | -1, 1                        | 1, -1
Ball (with probability “1-p”)   | 1, -1                        | -1, 1

In the end, the only equilibrium is for the batter to swing 50 percent of the time and take 50 percent of the time, and for the pitcher to throw 50 percent strikes and 50 percent balls. They will each win half the time, and neither player could be any better off by selecting a different strategy. The batter knows that the pitcher is throwing 50 percent strikes, so he’ll win half the time whether he always swings or never swings. The same logic applies to the pitcher facing a batter who swings half the time.
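
The 50/50 claim can be checked numerically. The sketch below is not the full calculation, just a quick check under the payoffs of Table 2A: it computes each player’s expected payoff against a mixing opponent and shows that the batter is indifferent only when the pitcher throws strikes half the time. The function names are hypothetical. A payoff of 0 corresponds to winning half the time, since a win is worth 1 and a loss is worth -1.

```python
from fractions import Fraction

# Table 2A: (pitcher action, batter action) -> (pitcher payoff, batter payoff)
PAYOFFS = {
    ("strike", "swing"): (-1, 1),
    ("strike", "take"): (1, -1),
    ("ball", "swing"): (1, -1),
    ("ball", "take"): (-1, 1),
}


def batter_expected(action, p):
    """Batter's expected payoff when the pitcher throws a strike with probability p."""
    return (p * PAYOFFS[("strike", action)][1]
            + (1 - p) * PAYOFFS[("ball", action)][1])


def pitcher_expected(action, q):
    """Pitcher's expected payoff when the batter swings with probability q."""
    return (q * PAYOFFS[(action, "swing")][0]
            + (1 - q) * PAYOFFS[(action, "take")][0])


def batter_best_reply(p):
    """Return the batter's best pure action against p and its expected payoff."""
    best = max(("swing", "take"), key=lambda action: batter_expected(action, p))
    return best, batter_expected(best, p)


half = Fraction(1, 2)
# At 50/50 both of the batter's actions, and both of the pitcher's, pay the same.
print(batter_expected("swing", half), batter_expected("take", half))
print(pitcher_expected("strike", half), pitcher_expected("ball", half))

# A pitcher who throws strikes 70 percent of the time can be exploited.
print(batter_best_reply(Fraction(7, 10)))
```

When the pitcher leans toward strikes, the batter’s best reply is to always swing, and his expected payoff rises above zero. That is why any split other than 50/50 cannot be part of an equilibrium in this game.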


The pitch-selection example above is an important one, and pitch selection is one of the areas of baseball that has barely been analyzed. As I will show in the coming articles, players are probably not picking optimal strategies in terms of pitch selection, and they could improve their winning percentages by following a more strategic framework.

Tomorrow: Introduction to pitch selection

Matt writes for FanGraphs and The Hardball Times, and models arbitration salaries for MLB Trade Rumors. Follow him on Twitter @Matt_Swa.
Alan Nathan

Interesting article, Matt.  I look forward to reading the next one in the series.