Archive for Strategy

No Pitch Is an Island: Pitch Prediction With Sequence-to-Sequence Deep Learning

One of the signature dishes of baseball-related machine learning is pitch prediction, in which the analysis aims to predict what type of pitch will be thrown next in a game. The strategic advantage of knowing what a pitcher will throw beforehand is evident from the lengths teams go to (both legal and illegal) to gain such information. Analysts who tackle the problem with data have taken various approaches in the past, but their work shares some commonalities:

  • Supervised learning is incorporated with numerous variables (batter-handedness, count, inning, etc.) to fit models on training data, which are then used to make predictions on test data.
  • The models are fit on a pitcher-by-pitcher basis. That is, algorithms are applied to each pitcher individually to account for their unique tendencies and repertoire. Results are reported as an aggregate of all these individual models.
  • There is a minimum cut-off for the number of pitches thrown. In order for a pitcher’s work to be considered they must have crossed that threshold.

An example can be found here. The goal of this study is not to reproduce or match those strong results, but to introduce a new, natural-fitting ingredient that can improve on their limitations. The most constraining restriction in other works is the sample size requirement; by only including pitchers with substantial histories, the scope of the pitch prediction task is drastically reduced. We hope to produce a model capable of making predictions for all pitchers regardless of their individual sample size. Read the rest of this entry »


Frankenstein and the Rays’ Sister City Concept

In 2018, the Tampa Bay Rays introduced the Opener, a novel concept in which a relief pitcher started a game with the purpose of shutting down an offense in the first few innings. The Opener would then hand the ball to a bulk pitcher, who went three-to-four innings before giving way to the usual bullpen corps.

When the Rays introduced the Opener strategy, many in baseball thought it was blasphemy. Starting pitchers have defined roles, and pitching staffs have been ordered this way for generations. How dare the Rays upset the natural order of roles, titles, and statistics?

When analysts looked at the Rays roster, however, they quickly understood what the team was doing. By not recognizing a “pitching rotation,” the Rays were looking a level deeper. They were stacking pitchers on a per-game basis, with the intent to win each game and hence build enough wins to make the playoffs. Once it was understood, the Opener was applauded and eventually copied throughout the league.

Besides being a sly way to neutralize lineups, the Opener represented the “Rays Way” amidst financial necessity. The team could not afford a typical major league rotation of four or five quality starters. Relief pitchers are cheaper and easier to find. They couldn’t find five aces, so they built ace performances using multiple relievers, with the additional bonus of paying them less. If you can’t find a hundred-million-dollar starter, build one. Read the rest of this entry »


Are Third Base Coaches Too Hesitant in Sacrifice Fly Situations?

Imagine you are coaching third base. Your team is at bat with a runner on third and one out. There is a flyball caught in marginally shallow left field. You think your runner has about a 50/50 chance of scoring if you send him. Do you send him?

Many of you would probably say no. This is a risky call. There is a 50% chance the runner would be out, which would be a huge momentum killer. Furthermore, if he gets caught and your team loses by a run, you are going to be the person blamed by the media.

My hypothesis is that third base coaches are leaving runs on the table. Over the past four seasons, runners on third scored 98% of the time when sent in sac fly situations, suggesting that coaches send them only when they are highly confident of success. I hypothesize they won’t send runners unless they feel they have at least an 80% chance of scoring, but my analysis says runners should be sent even with much lower chances. Read the rest of this entry »
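The break-even point behind this hypothesis can be sketched with run expectancy. This is a minimal illustration, not the article's analysis: it assumes the runner on third is the only baserunner, that a runner thrown out ends the inning, and that the goal is to maximize expected runs in the inning. The function name and the two run expectancy values are hypothetical placeholders, not measured figures.

```python
def break_even_send_probability(re_hold, re_after_score):
    """Smallest success probability at which sending the runner pays off.

    re_hold: expected runs for the rest of the inning if the runner
        stays at third (two outs after the catch).
    re_after_score: expected runs for the rest of the inning once the
        runner has scored (two outs, bases empty in this simplified setup).
    A runner thrown out at home is the third out, so that branch is worth 0.
    """
    # Expected runs if sent: p * (1 + re_after_score) + (1 - p) * 0
    # Expected runs if held: re_hold
    # Setting the two equal and solving for p gives the break-even point.
    return re_hold / (1.0 + re_after_score)


# Placeholder values for illustration only (hypothetical, not measured).
RE_HOLD = 0.35
RE_AFTER_SCORE = 0.10

p_star = break_even_send_probability(RE_HOLD, RE_AFTER_SCORE)
print(f"break-even send probability: {p_star:.2f}")
```

Under these assumptions, any estimated chance of scoring above the break-even value favors sending the runner. Real decisions would also depend on the score, the inning, and who is due up next.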


Pitch Mix Effectiveness

In a previous project, I attempted to determine what types of pitches are most effective in 1-2 and 0-2 counts, based on a suspicion that wasting pitches is not inherently strategic. I did this by analyzing league-average wOBA values of different types of pitches in and out of the strike zone. The findings showed that, on average, breaking and off-speed pitches outside of the zone were the most effective pitches to throw in order to minimize wOBA in both 0-2 and 1-2 counts.

While using league-average data produced some interesting results, I was still unsatisfied, since trying to project pitching strategy to a single pitcher doesn’t work when the data is league-wide. My goal was then to write an algorithm that could use a specific pitcher’s career pitching history to analyze the results of each of their pitches and determine every pitcher’s most effective pitch mix.

After a long time writing and editing code, I believe I have written a script that can do just that: evaluate each pitcher who has thrown more than 1,250 pitches since the start of 2019 and determine the wOBA value of each of their pitches at every count. Read the rest of this entry »
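A rough sketch of the kind of grouping described above follows. It is not the author's script: the record keys (`pitcher`, `balls`, `strikes`, `pitch_type`, `woba_value`) are assumed names, and it assumes each record already carries a wOBA value attributed to that pitch.

```python
from collections import defaultdict


def woba_by_count_and_pitch(pitches, min_pitches=1250):
    """Average wOBA value per (pitcher, count, pitch type).

    pitches: iterable of dicts with the hypothetical keys
        'pitcher', 'balls', 'strikes', 'pitch_type', 'woba_value'.
    Only pitchers with more than min_pitches total pitches are kept.
    """
    pitches = list(pitches)

    # First pass: count total pitches per pitcher for the sample-size filter.
    totals = defaultdict(int)
    for p in pitches:
        totals[p["pitcher"]] += 1

    # Second pass: accumulate [sum of wOBA values, number of pitches] per group.
    sums = defaultdict(lambda: [0.0, 0])
    for p in pitches:
        if totals[p["pitcher"]] <= min_pitches:
            continue
        key = (p["pitcher"], (p["balls"], p["strikes"]), p["pitch_type"])
        sums[key][0] += p["woba_value"]
        sums[key][1] += 1

    return {key: total / n for key, (total, n) in sums.items()}


# Tiny made-up sample; the threshold is lowered so the example has output.
sample = [
    {"pitcher": "A", "balls": 0, "strikes": 2, "pitch_type": "SL", "woba_value": 0.0},
    {"pitcher": "A", "balls": 0, "strikes": 2, "pitch_type": "SL", "woba_value": 0.9},
    {"pitcher": "A", "balls": 1, "strikes": 2, "pitch_type": "FF", "woba_value": 0.5},
    {"pitcher": "B", "balls": 0, "strikes": 2, "pitch_type": "CH", "woba_value": 0.0},
]
table = woba_by_count_and_pitch(sample, min_pitches=2)
for key, value in sorted(table.items()):
    print(key, round(value, 3))
```

For each pitcher and count, the pitch type with the lowest average would be the candidate "most effective" pitch, though small groups deserve caution.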


Using Clustering To Generate Bullpen Matchups

In today’s game, reliever usage may be more important than ever. As starters go less deep into games, more emphasis is placed on bullpen strategy to survive the mid-to-late innings. Teams can use data to streamline this process, strategizing relief pitcher usage based on their pitch repertoires and batter ability. My goal is to produce a matchup tool that can potentially give us some insight as to how the big league teams “play the matchups.”

The basis of a bullpen matchup recommender will be at the pitch level: what types of pitches does a particular hitter struggle against, and how do they align with what a particular pitcher throws? To do this, I will first use clustering methods in order to redefine pitcher arsenals based on pitch flight characteristics. Matchups will then be selected according to which pitcher is expected to perform the best against a given batter, optimizing pitcher strengths against batter weaknesses.
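One simple way the matching step might work is sketched below. It is not necessarily the author's approach. It assumes each reliever is summarized by the share of his pitches falling in each cluster and each batter by his wOBA against each cluster; the function names, reliever labels, and all numbers are hypothetical.

```python
def expected_woba(pitcher_mix, batter_woba_by_cluster):
    """Usage-weighted wOBA a batter would be expected to post against a pitcher.

    pitcher_mix: dict mapping cluster id -> share of the pitcher's pitches.
    batter_woba_by_cluster: dict mapping cluster id -> batter wOBA vs. that cluster.
    """
    return sum(
        share * batter_woba_by_cluster[cluster]
        for cluster, share in pitcher_mix.items()
    )


def best_reliever(bullpen, batter_woba_by_cluster):
    """Name of the reliever with the lowest expected wOBA against the batter."""
    return min(
        bullpen,
        key=lambda name: expected_woba(bullpen[name], batter_woba_by_cluster),
    )


# Made-up inputs for illustration only.
bullpen = {
    "Reliever 1": {0: 0.7, 1: 0.3},
    "Reliever 2": {1: 0.6, 2: 0.4},
}
batter = {0: 0.400, 1: 0.300, 2: 0.250}

print(best_reliever(bullpen, batter))
```

The weighted average treats pitch usage as fixed. A fuller tool could let the pitcher shift toward the clusters the batter handles worst.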

Data

To conduct this research I used available Statcast data from 2016-2021 (through this year’s trade deadline). My variables of interest are as follows: pitch location (plate_x & plate_z), perceived pitch speed derived from release extension (effective_speed), pitch movement (pfx_x & pfx_z), spin rate (release_spin_rate), and the newly introduced spin axis (spin_axis). I elected to include spin axis in order to account for how the batter may see the pitch as it’s released. All in all, the variables selected measure the stuff and location of each pitch so that we may classify them more accurately beyond the basic pitch type labels. After cleaning this dataset and removing outliers, I was ready to move on to the modeling process. Read the rest of this entry »


Which Pitch Should Be Thrown Next?

There are few things I enjoy in baseball more than the pitcher vs. hitter dynamic. Everyone likes to see highlight plays like a great catch or a mammoth home run, but those plays are few and far between. I believe that the tension created in a drawn-out plate appearance is where baseball is most enjoyable. Every pitch is meaningful, and the strategy of the game is on full display. The pitcher is trying to decide the best way to get the hitter to produce an out and the hitter is doing everything he can to thwart the pitcher.

This dynamic of baseball has always fascinated me. I was curious how pitchers and catchers decided which pitch was correct to throw in a situation. There are plenty of tools available to them that were not readily available when I was a child, like heat maps made from pitch-tracking data, but they show results without the context of what previous pitches were thrown in the plate appearance. Heat maps provide useful data, but the real art of pitching is being able to set up a hitter to take advantage of their weaknesses. If a pitcher throws the same pitch in the same location every time, eventually the hitter is going to catch on and change his strategy accordingly. So which sequence of pitches is the most effective at retiring hitters? This is the question I attempted to answer with this article. Read the rest of this entry »


Jake McGee: The One-Pitch Pitcher

One of the newest members of the San Francisco Giants, lefty reliever Jake McGee, is coming off one of his best years in the major leagues throwing one pitch: a fastball. Seemingly by magic, McGee twirled a fastball 97% of the time he threw in 2020 on the way to a 2.66 ERA, 0.836 WHIP, and 11 strikeouts for every walk. I will be taking an in-depth look into McGee’s success and failure over his career, which might give better insight as to how he can continue to perform and how a major league reliever can succeed with only one pitch.

McGee was drafted in 2004 by the Tampa Bay Rays and made his major league debut with them in 2010. After his first full season in 2011, McGee posted extremely strong numbers in 2012, 2014, and 2015 with an ERA+ (it will become clear why I use ERA+) of 148 and a K/BB of 5.02 within those four seasons. After the 2015 campaign, McGee was traded along with Germán Márquez to the Colorado Rockies in exchange for Corey Dickerson and Kevin Padlo.

McGee immediately regressed in Colorado, as his ERA+ went from 163 to 103 (ERA+ adjusts for ballparks, which is particularly useful at Coors Field) and his K/BB sank from 6 to 2.38 in the transition from the Rays to the Rockies (2015-2016). Of course, some of this decline can be attributed to the difficult conditions of Colorado, but there is also evidence that McGee’s style of pitching contributed to his decline in performance. Following 2016, McGee remained a strong-yet-aging reliever and was ultimately released by the Rockies in July of 2020.

Four days later, McGee signed with the Los Angeles Dodgers and proceeded to outperform even his 27-year-old self with an incredible season. McGee finished in the 99th percentile in K%, 96th in BB%, 95th in xERA, and 95th in xwOBA. So what exactly was the cause of this change and what did McGee do to get there? Read the rest of this entry »


A Lineup Construction Experiment

Who should bat second? This question has been debated quite a bit in recent years, as the modern approach has become to slot the best hitter in the 2-hole to increase their total plate appearances in a season. Others argue that the second hitter, like the leadoff man, should be a table-setter and the goal should be to get the best hitters to the plate with runners on base. So which is more valuable: getting your best hitter to the plate with men on or getting them to the plate more often? A simple experiment suggests that we are wasting a lot of energy arguing either side, and it would be time better spent thinking about other elements of lineup construction.

Overview

I created nine fictional players that will be referred to by position. I arbitrarily provided probabilities for the players based on seven possible plate appearance outcomes: single, double, triple, homer, walk, hit by pitch, and out. To simulate the lineup playing a game, I used a simple base-to-base style (the runners on base move up the same number of bases as the batter). An oversimplification of play to be sure, but the goal is to get an approximation of potential lineups relative to each other. Each lineup “plays” 100,000 nine-inning games so that the run distribution is virtually identical on multiple simulations. Read the rest of this entry »
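The setup above can be sketched as a small Monte Carlo simulation. This is a minimal version under the stated base-to-base rule, not the author's code. It assumes walks and hit-by-pitches advance every runner one base, like a single. The outcome probabilities for the example player are made up.

```python
import random

# Bases gained by the batter and, in this simplified model, by every runner.
BASES = {"single": 1, "double": 2, "triple": 3, "homer": 4, "walk": 1, "hbp": 1}


def advance(runners, n):
    """Move the batter and every runner up n bases.

    runners: tuple of bools for (first, second, third).
    Returns (new_runners, runs_scored).
    """
    positions = [i + 1 for i, on in enumerate(runners) if on] + [0]  # 0 = batter
    moved = [pos + n for pos in positions]
    runs = sum(1 for pos in moved if pos >= 4)
    new_runners = tuple(base in moved for base in (1, 2, 3))
    return new_runners, runs


def simulate_game(lineup, rng, innings=9):
    """Runs scored in one game; lineup is a list of outcome->probability dicts."""
    runs, batter = 0, 0
    for _ in range(innings):
        outs, runners = 0, (False, False, False)
        while outs < 3:
            probs = lineup[batter % len(lineup)]
            outcome = rng.choices(list(probs), weights=list(probs.values()))[0]
            batter += 1  # the batting order carries over between innings
            if outcome == "out":
                outs += 1
            else:
                runners, scored = advance(runners, BASES[outcome])
                runs += scored
    return runs


def average_runs(lineup, games=100_000, seed=0):
    """Mean runs per game over many simulated games."""
    rng = random.Random(seed)
    return sum(simulate_game(lineup, rng) for _ in range(games)) / games


# Hypothetical player profile; nine identical hitters just to show usage.
player = {"single": 0.15, "double": 0.05, "triple": 0.005, "homer": 0.03,
          "walk": 0.08, "hbp": 0.01, "out": 0.675}
print(average_runs([player] * 9, games=2_000))
```

Comparing two batting orders means calling `average_runs` on each ordering of the same nine players and looking at the difference in runs per game.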


Pitch Count Efficiency is Undervalued

During Game 6 of the World Series, Kevin Cash infamously replaced his cruising starting pitcher, Blake Snell, with reliever Nick Anderson. Anderson would give up the lead before registering an out, and the Los Angeles Dodgers won the Series for the first time in 32 years.

Heavily criticized by many, both in the moment and in hindsight, the move is representative of the new direction many clubs have been heading. This calculated, analytics-heavy approach to reliever usage has caused both a major shift in the value of relievers and a steady increase in the number of pitchers used in games.

The steady rise in pitchers used per game, paired with the decline in average pitches and innings thrown by starters, raises the question: how should pitch count factor into removing pitchers from games? If starters are removed because they are facing the top of the order for the third time rather than because they are fatigued or have seen a decline in their outing performance, is it important to pass on hittable pitches in order to drive pitch count up? Alternatively, is there value in being a pitcher who can record outs quickly if, by the time Mookie Betts comes to the plate in the 6th inning, the threat of impending doom will chase an ace at 73 pitches out of the game? Read the rest of this entry »


The 3-0 Count Dilemma

While it might not appear so, baseball games constantly reflect economic thinking, particularly the mathematical models of game theory. Game theory shows up in many ways, but a classic example is the prisoner’s dilemma. Imagine a police officer is interrogating two suspects accused of robbing a bank together. The police officer has some evidence to put them in jail, but a confession would go a long way. Each suspect is contemplating confessing to the crime. If both suspects keep quiet, they will each receive five years in jail. If one suspect confesses and the other keeps quiet, the one who kept quiet will receive 20 years in jail while the suspect who confessed will receive just one year. If both confess, they each receive 10 years in jail. A choice that is best for a suspect no matter what the other suspect does is called a dominant strategy. The resulting combination of decisions, from which neither suspect can do better by changing their choice alone, is called the Nash Equilibrium. By using game theory, we come to the conclusion that each suspect should confess to the crime, meaning they will each get 10 years in prison. I won’t go much into why this is the case, but feel free to research more about game theory and the Nash Equilibrium on your own.
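For readers who want to check that conclusion, here is a short sketch that encodes the sentences from the example and searches for a dominant strategy and for the Nash Equilibrium. The sentence lengths come from the example above. The function and variable names are just illustrative.

```python
from itertools import product

ACTIONS = ("quiet", "confess")

# Years in jail for (suspect A, suspect B), taken from the example above.
YEARS = {
    ("quiet", "quiet"): (5, 5),
    ("quiet", "confess"): (20, 1),
    ("confess", "quiet"): (1, 20),
    ("confess", "confess"): (10, 10),
}


def dominant_strategy(player):
    """Action that gives the given suspect (0 or 1) a strictly shorter
    sentence than any alternative, whatever the other suspect does.
    Returns None if no such action exists."""
    def years(mine, theirs):
        pair = (mine, theirs) if player == 0 else (theirs, mine)
        return YEARS[pair][player]

    for action in ACTIONS:
        alternatives = [a for a in ACTIONS if a != action]
        if all(years(action, t) < years(alt, t)
               for alt in alternatives for t in ACTIONS):
            return action
    return None


def pure_nash_equilibria():
    """Action pairs where neither suspect can shorten their own sentence
    by switching while the other suspect's choice stays fixed."""
    result = []
    for a, b in product(ACTIONS, repeat=2):
        a_ok = all(YEARS[(a, b)][0] <= YEARS[(alt, b)][0] for alt in ACTIONS)
        b_ok = all(YEARS[(a, b)][1] <= YEARS[(a, alt)][1] for alt in ACTIONS)
        if a_ok and b_ok:
            result.append((a, b))
    return result


print(dominant_strategy(0), dominant_strategy(1))  # confess confess
print(pure_nash_equilibria())  # [('confess', 'confess')]
```

Confessing wins in both comparisons: one year instead of five if the other suspect stays quiet, and 10 years instead of 20 if the other confesses. That is why both suspects end up confessing, even though staying quiet together would leave each with only five years.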

What does this have to do with baseball? We can think of each pitch as a game, with the pitcher and batter in the roles of the two suspects. Instead of confessing to a crime, the pitcher is contemplating throwing a pitch in the strike zone while the batter is contemplating swinging. Unlike the prisoner’s dilemma, however, a pitch to a batter has no Nash Equilibrium in which each player sticks to a single choice, so the combination of decisions is constantly changing. If the batter’s dominant strategy is to swing, then pitchers will throw more balls outside the batter’s reach. If the pitcher’s dominant strategy is to throw a ball, then the batter will take more pitches.

We could observe this thought process for every pitch thrown. However, let’s look at one situation: the 3-0 count. If you are the batter, it might seem obvious to take the pitch. The worst-case scenario is you end up with a 3-1 count. If you are the pitcher, it might seem obvious to throw an easy strike. You do not want to walk the batter, and you know the batter doesn’t want to swing and risk giving you an easy popup that wastes a good count. So I guess the batter should take every pitch and the pitcher should throw the ball right down the middle every time. Read the rest of this entry »