- FanGraphs Sabermetrics Library - http://www.fangraphs.com/library -

# Heat Maps: What They Show, and Mistakes to Avoid

When David Appelman dropped his newest bomb on us the other day and announced that you could now find customizable heat maps here at FanGraphs, I think it’s safe to say that most of us saber-nerds had our minds blown. Personally, I’ve always admired the work that Dave Allen and other Pitch F/x gurus have done, yet being unskilled in the art of SQL and R, I figured this was a type of analysis that would always be beyond my abilities. Following in the footsteps of other FanGraphs updates, though, this analysis has now been democratized and made available to even the newest of saber newbies. You don’t have to know how to string together code or manipulate huge data sets: all you need is a mouse and a pointer finger.

But heat maps are like any other tool: before you can add them to your toolbox, you have to understand how to use them. Pitch F/x data can be a tricky thing to interpret, and many experienced saberists (myself included) have made mistakes because they didn’t know what they could and couldn’t do with that data. What exactly are heat maps? What do they show, and how should we use them? Let’s go exploring:

Heat maps are rather intuitive: quite simply, a heat map is a strike zone plot that shows how often a pitcher throws a pitch in a certain location. To use David Appelman’s example, if you want to know where Mariano Rivera throws his cutter against lefties (or righties), this is the tool for you. These specific heat maps won’t tell you anything about a pitch’s movement, velocity, or effectiveness – they’re strictly plots of pitch location – but that doesn’t mean they’re without their uses. By looking at heat maps for pitchers, you can learn how a pitcher’s repertoire varies between lefties and righties, where in the zone he attacks lefties and righties (and with what pitches), and whether he hits the corners or leaves many pitches over the plate.
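If it helps to see the idea in code, a heat map is essentially a two-dimensional count of pitch locations. The snippet below is only a rough sketch, not how FanGraphs builds its charts: the function names, the plotting window, the grid size, and the pitch locations are all made up for illustration, and I'm assuming each pitch is a (horizontal, height) pair measured in feet.

```python
import random

def bin_pitches(pitches, x_range=(-2.0, 2.0), z_range=(0.0, 5.0), nx=8, nz=10):
    """Count pitches per grid cell; grid[row][col], with row 0 at the bottom."""
    grid = [[0] * nx for _ in range(nz)]
    x_lo, x_hi = x_range
    z_lo, z_hi = z_range
    for x, z in pitches:
        if not (x_lo <= x < x_hi and z_lo <= z < z_hi):
            continue  # skip pitches outside the plotted window
        col = min(int((x - x_lo) / (x_hi - x_lo) * nx), nx - 1)
        row = min(int((z - z_lo) / (z_hi - z_lo) * nz), nz - 1)
        grid[row][col] += 1
    return grid

def to_frequencies(grid):
    """Convert raw counts into the share of plotted pitches in each cell."""
    total = sum(sum(row) for row in grid)
    return [[count / total if total else 0.0 for count in row] for row in grid]

# Synthetic locations for illustration only; this is NOT real Pitch F/x data.
rng = random.Random(0)
pitches = [(rng.gauss(-0.5, 0.6), rng.gauss(2.5, 0.7)) for _ in range(1000)]

grid = bin_pitches(pitches)
freq = to_frequencies(grid)
```

Coloring each cell by its value in `freq` gives you the "heat." Notice that nothing in here knows about movement, velocity, or results, which is exactly the limitation described above.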

When you look at one of these heat maps, you’re looking at the strike zone from the catcher’s perspective. This is a detail that confuses many people, since we’re used to watching baseball from the pitcher’s perspective, but this is how all Pitch F/x charts are set up by default. If you need a visual to understand, here’s my (very) rough attempt:

(I know, I know, I have mad Paint skills. This is years of practice, folks.)
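If you're working with raw locations and would rather see things the way the TV camera does, the fix is just a mirror image. This is a small sketch under an assumption I'm stating up front: horizontal location is measured from the center of the plate, so flipping the view is the same as flipping the sign. The function name is hypothetical.

```python
def to_pitcher_view(pitches):
    """Mirror horizontal locations left-to-right; heights stay the same.

    Assumes each pitch is (x, z) with x measured from the center of the
    plate in the catcher's view. That convention is an assumption here.
    """
    return [(-x, z) for x, z in pitches]

# Made-up example: one pitch a foot to the catcher's right, belt high.
catcher_view = [(1.0, 2.5)]
pitcher_view = to_pitcher_view(catcher_view)
```

Applying the function twice gets you back to where you started, which is a handy sanity check that you haven't flipped a chart one time too many.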

Now that you know what Heat Maps are and what they show you, let’s discuss some common analytical mistakes to avoid:

– Pitch Classifications: As Dave Allen told me, “The graphs are only as good as the pitch classifications.” There are hundreds of thousands of pitches thrown over the course of a baseball season, and therefore there are hundreds of thousands of lines of data that need to be classified and organized. The people at MLB Advanced Media (MLBAM) are responsible for creating the algorithm that sorts pitches – this is a cutter, this is a fastball, this is a slider, etc. – and that algorithm has changed and improved over the years.

Due to these changes in the algorithm, you should be very careful when looking at a pitcher’s heat maps over time. While it may look like a pitcher has dramatically changed his pitch selection and location over the past few years, that’s likely the result of his pitches being reclassified. For example, look at Mariano Rivera’s fastballs and cutters over time. It looks like Rivera used to throw few cutters and lots of fastballs, but we all know that’s not true: Rivera has always predominantly thrown a cutter. Those cutters were just classified as fastballs in the past.

– Sample Sizes: As with all statistics or graphs, you should always be careful about drawing conclusions from a small amount of data. This shouldn’t be a concern with starting pitchers, who throw a large number of innings and pitches each season, but I’d hesitate before using these heat maps to draw conclusions about pitchers who only had a few starts in one season or pitched a small number of innings in relief. These heat maps are good at showing us large, overarching trends, and it’s difficult to have a “trend” if you’ve only pitched a small number of innings.
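To put a rough number on the sample size worry, here is a back-of-the-envelope sketch. It treats every pitch as an independent coin flip for "landed in this cell or not," which is a simplifying assumption (real pitch sequences aren't independent), and the 10% share and the pitch totals are made up purely for illustration.

```python
import math

def share_with_error(count_in_cell, total_pitches):
    """Return (share, standard error) for the fraction of pitches in one cell.

    Uses the usual binomial formula sqrt(p * (1 - p) / n), which assumes
    independent pitches. Treat the result as a rough guide only.
    """
    if total_pitches <= 0:
        raise ValueError("total_pitches must be positive")
    p = count_in_cell / total_pitches
    se = math.sqrt(p * (1 - p) / total_pitches)
    return p, se

# Hypothetical totals: a few relief outings, a half season, a full season.
for n in (50, 500, 3000):
    p, se = share_with_error(round(0.10 * n), n)
    print(f"{n:>5} pitches: share {p:.2f}, give or take {se:.3f}")
```

The uncertainty shrinks with the square root of the number of pitches, so a hot spot built from a handful of relief appearances is much shakier than the same hot spot built from a full season of starts.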

– Over-Smoothing: Creating a perfect heat map that displays useful data is more an art form than a science. Since David Appelman gave us all control over the definition and color scheme of the heat maps, you can make the exact same data look a variety of ways. For example, here are a number of different ways to look at Rivera’s 2010 cutter usage versus lefties:

Each map is displaying the exact same data, yet each one looks vastly different and tells you different information. You want to make the data look pretty by smoothing out the heat map and making the color gradient flow, but at the same time you don’t want to over-smooth your map and remove all valuable information, like in the first two charts on the top row. It takes time to find the happy medium – the one that looks best while also still remaining true to the underlying data.
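For the curious, here is one way to see over-smoothing in action. I don't know what smoothing method the FanGraphs heat maps use under the hood, so this sketch swaps in a plain Gaussian blur as a stand-in; the function name, the grid, and the `sigma` values are all hypothetical.

```python
import math

def gaussian_smooth(grid, sigma):
    """Blur a 2D grid of counts with a Gaussian kernel (sigma is in cells).

    Weights are renormalized near the edges, so a flat grid stays flat,
    but the grand total is not preserved exactly.
    """
    if sigma <= 0:
        return [row[:] for row in grid]
    radius = max(1, int(3 * sigma))
    offsets = range(-radius, radius + 1)
    weights = [math.exp(-(d * d) / (2 * sigma * sigma)) for d in offsets]

    def blur_1d(values):
        out = []
        for i in range(len(values)):
            num = den = 0.0
            for d, w in zip(offsets, weights):
                j = i + d
                if 0 <= j < len(values):
                    num += w * values[j]
                    den += w
            out.append(num / den)
        return out

    rows, cols = len(grid), len(grid[0])
    horiz = [blur_1d(row) for row in grid]  # blur along each row
    vert = [blur_1d([horiz[r][c] for r in range(rows)]) for c in range(cols)]
    return [[vert[c][r] for c in range(cols)] for r in range(rows)]

# Made-up grid: every pitch lands in the middle cell.
spike = [[0.0] * 7 for _ in range(7)]
spike[3][3] = 1.0

light = gaussian_smooth(spike, 0.5)  # hot spot is still obvious
heavy = gaussian_smooth(spike, 3.0)  # hot spot is smeared across the zone
```

Same data, two very different pictures: the lightly smoothed grid still has a clear peak in the middle, while the heavily smoothed one spreads that single hot spot over most of the grid. The happy medium is somewhere in between.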

– Over-Stating Results: Remember, these heat maps will only show you information on pitch location – NOT on pitch movement, velocity, or effectiveness. They can tell you which pitches a pitcher throws against each hand and where those pitches are normally located, but you shouldn’t use these charts to make exaggerated claims. If you’re trying to understand why a pitcher has been effective or ineffective in certain situations, it’s best to look not just at these heat maps, but also at pitching statistics and all the Pitch F/x charts available. Pitchers are notoriously tricky to properly evaluate, so if you think you’ve found a simple solution/answer to a pitcher’s problems, you’re probably wrong.

These heat maps can be a great evaluative help when used properly, and if nothing else, they’re a heck of a lot of fun to look at. Keep all of the above caveats in mind if you try using them for analysis, but if you’re just looking to have some fun, enjoy!

For more information, check out the new Heat Maps page in the FanGraphs Library.