﻿ On curveballs | The Hardball Times

On curveballs

The advent of the PITCHf/x system has opened so many new doors to baseball analysis, as I’m sure you know by now. With great resources like Josh Kalk’s player cards and PITCHf/x tool, everyone has easy access to pitch classifications, usage, speeds, movement graphs, and a number of other useful things. There are so many aspects of pitching, though, that have received very little coverage over the past year.

One of these things is location. Fellow THT-er John Walsh wrote an excellent piece regarding fastball location a few months back and took a preliminary look at change-ups a few weeks ago, but I’ve yet to see much regarding other pitches. Location is a very important part of pitching, which is why I’d like to take a look at how location affects the effectiveness of the various other pitches in a pitcher’s repertoire. Today, let’s look at the curveball.

Types of curveballs

Before we get into location, though, we should take note that there are different types of curveballs.

The 12-to-6 curve is the one that most fans seem to be familiar with, many because of the hype Barry Zito’s curve got while he was with the A’s. The name refers to positions on a clock face, so this is a curveball that has lots of vertical movement but little horizontal. Another type of curve, the 11-to-5, also refers to a clock, and has more horizontal movement but not necessarily as much vertical. There are also curveballs without a lot of vertical movement but with a good deal of horizontal movement. They’re somewhere between a slider and a 12-to-6 or 11-to-5 curve. We’ll call these “slurvy” curves.

Using PITCHf/x data, I’ll classify curveballs into four groups: 12-to-6 curve, 11-to-5 curve, the more “slurvy” type of curve, and the crappy curve that doesn’t move much more than a ball thrown without spin. Here’s a table breaking down how I defined these four types of curves.

```
+---------+-----------------+----------------+
| Type    | Horiz. Movement | Vert. Movement |
+---------+-----------------+----------------+
| 12-to-6 |     < 4         |      < -4      |
| 11-to-5 |     > 4         |      < -4      |
| Slurvy  |     > 4         |      > -4      |
| Crappy  |     < 4         |      > -4      |
+---------+-----------------+----------------+
```
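For readers who want to apply these cutoffs to their own data, here is a minimal sketch of the classification in Python. It is not the code used for this study. It assumes that horizontal movement is given as a positive magnitude, that negative vertical movement means the ball drops, and that the crappy curve is the one with little movement in both directions, as described above. The function name and the handling of values that land exactly on a cutoff are my own choices for illustration.

```python
def classify_curve(horiz, vert):
    """Label a curveball by its movement, using the cutoffs of 4 and -4.

    horiz: horizontal movement as a positive magnitude (assumption).
    vert:  vertical movement; negative means the ball drops (assumption).
    Values exactly on a cutoff fall on the smaller-movement side here;
    the article does not say how such ties were handled.
    """
    big_horiz = horiz > 4
    big_drop = vert < -4
    if big_drop:
        return "11-to-5" if big_horiz else "12-to-6"
    return "slurvy" if big_horiz else "crappy"


if __name__ == "__main__":
    # Made-up movement values, one per category.
    for horiz, vert in [(2, -8), (6, -6), (6, -2), (1, -1)]:
        print(horiz, vert, classify_curve(horiz, vert))
```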

If it's easier to understand, here's where these curves might appear on the traditional movement graphs that accompany most PITCHf/x articles.

Which type of curveball is best, though? Some scouts will argue that the Zito-esque 12-to-6 curve is best because it has tons of vertical movement and can be used against both lefties and righties. Others will argue that the 11-to-5 break is preferable because it can have the same vertical movement as the 12-to-6 curve but also can have ridiculous horizontal movement. Much-hyped Clayton Kershaw throws one. Still others will argue the benefits of throwing a harder, slurvy ball, as Francisco Rodriguez does.

So what is actually best? Here are the runs100 values (used to measure the quality of a pitch, introduced by John Walsh here) for each of these types of curveballs broken down by whether the pitcher is facing a same or opposite-handed batter (as well as the overall values). Remember that negative is good for runs100.

```
+---------+-----------------+-----------------------+-----------------------+
| Type    | runs100 Overall | runs100 vs. Same Hand | runs100 vs. Opp. Hand |
+---------+-----------------+-----------------------+-----------------------+
| All CBs |      -0.77      |         -0.68         |        -0.85          |
| 12-to-6 |      -0.37      |         -0.96         |         0.48          |
| 11-to-5 |      -1.00      |         -0.93         |        -1.10          |
| Slurvy  |      -0.80      |         -0.60         |        -1.02          |
| Crappy  |      -0.68      |         -0.84         |        -0.53          |
+---------+-----------------+-----------------------+-----------------------+
```
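A breakdown like this one can be sketched in a few lines of Python. The sketch assumes that runs100 is the average run value per pitch scaled to 100 pitches, and that each pitch already carries a run value (its change in run expectancy). The function name and the tuple layout are hypothetical, and this is not the code behind the table.

```python
from collections import defaultdict


def runs100_by_group(pitches):
    """Average run value per 100 pitches, split by curve type and matchup.

    pitches: iterable of (curve_type, same_hand, run_value) tuples, where
    run_value is an assumed per-pitch change in run expectancy.
    Returns {curve_type: {"overall": x, "same": y, "opp": z}}; a matchup
    with no pitches is simply absent from the inner dict.
    """
    totals = defaultdict(lambda: defaultdict(lambda: [0.0, 0]))
    for curve_type, same_hand, run_value in pitches:
        for key in ("overall", "same" if same_hand else "opp"):
            cell = totals[curve_type][key]
            cell[0] += run_value  # running sum of run values
            cell[1] += 1          # pitch count
    return {
        curve_type: {key: 100.0 * s / n for key, (s, n) in cells.items()}
        for curve_type, cells in totals.items()
    }


if __name__ == "__main__":
    # Made-up pitches, only to show the shape of the input and output.
    sample = [
        ("12-to-6", True, -0.02),
        ("12-to-6", True, 0.00),
        ("12-to-6", False, 0.01),
    ]
    print(runs100_by_group(sample))
```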

We first see that even a crappy curveball is an effective pitch with a -0.68 runs100 value. For context, a four-seam fastball's runs100 value is 0.43. This likely has something to do with the unique movement a curveball gets (only a slider has much of a chance of moving into the bottom right quadrant on our movement chart) and the reserve with which pitchers throw it (very few pitchers throw the curve more than 15 or 20 percent of the time—that is less than once per at-bat).

Looking at the league as a whole, though, we see that there is a disadvantage to throwing a 12-to-6 curve, which has the worst overall value at -0.37. Crappy curveballs are next worst (I found it strange that they weren't worse than the 12-to-6 variety; they are essentially 12-to-6 curves without much vertical movement). 11-to-5 curves grade out best.

Confirming conventional wisdom, though, the 12-to-6 curve is best used against a same-handed batter but terrible against opposite-handed batters (again, surprising that it does worse here than the crappy curve). 11-to-5 curves actually turn out to be the most effective against opposite-handed batters, followed closely by slurvy curves.

So for starting pitchers, this data would indicate that it is probably best to throw an 11-to-5 or slurvy curve (assuming it's thrown to all batters). If a starting pitcher, however, sees that the 12-to-6 curve is effective against same-handed batters and uses the pitch strictly against those batters, then one could argue that the 12-to-6 is a comparable choice to an 11-to-5 or slurvy curve. Of course, the pitcher would also need a quality pitch to throw to opposite-handed batters, and the 11-to-5 curve is almost identical to the 12-to-6 against same-handed batters and the best of the group against opposite-handed batters.

I saw the potential for some bias here, though. What if there are more practitioners of, say, the 12-to-6 curve, and a lot of them don't throw it particularly well? That would drag down the overall results. So I decided to run the study again with different criteria so that only pitchers with what could be considered a great curveball are included. Here are the criteria.

```
+---------+-----------------+----------------+
| Type    | Horiz. Movement | Vert. Movement |
+---------+-----------------+----------------+
| 12-to-6 |       < 4       |     < -8       |
| 11-to-5 |       > 7       |     < -4       |
| Slurvy  |       > 7       |     > -4       |
+---------+-----------------+----------------+
```

And here are the results.

```
+---------+-----------------+
| Type    | runs100 Overall |
+---------+-----------------+
| 12-to-6 |     -1.34       |
| 11-to-5 |     -1.17       |
| Slurvy  |     -0.81       |
+---------+-----------------+
```

So for elite pitches (or close enough to elite to give us a decent sample size), we see that 12-to-6 and 11-to-5 curves are significantly better than their league average counterparts and that there is now little difference between them (the 12-to-6 curve is actually best). They are both, however, significantly better than the slurvy curve, which is roughly the same as the league average.

In this article on K-Rod, though, he says that one of the keys to his slurvy curve is its speed (he throws it just over 80 mph). If we look only at slurvy curves thrown over 77 mph, the runs100 value improves to -1.18, almost the same as the 11-to-5 curve.

So overall, we might give the advantage to an 11-to-5 curve unless you throw an elite curveball, in which case they are all similar but the 12-to-6 curve has a small advantage.

Components of an effective curveball

Because sample sizes would be pretty small if we split curveballs up, going forward we will have to lump all curveballs together. At the end of the year, maybe I'll combine all 2007 and 2008 data and look at each type separately.

Fellow THT-er Josh Kalk already penned a nice article on curveball effectiveness, but he used 2007 data. I'd highly recommend reading that article before proceeding if you haven't already.

I ran the 2008 data to make sure things have stayed the same, and they mostly have. The difference in speed between a pitcher's fastball and curve had the highest correlation with runs100 (0.36 — the lower the speed differential, the better), followed by the hump size (0.29—the smaller the better). Josh found these two to be most important as well. Josh found the horizontal value where the hump occurs to be quite important too, though it wasn't as strong in 2008 (0.11). The difference in horizontal release point was actually third at 0.21.

The sample sizes here aren't incredibly large (I included all pitchers with at least 200 curves and 200 fastballs), and correlations aren't the most elaborate process, but these should give us a good starting point for looking at location.
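For anyone who wants to repeat this kind of check, here is a minimal sketch of the two steps involved: keep only pitchers who meet the 200-pitch minimums, then compute a plain Pearson correlation between one curveball component and runs100. The dictionary keys (`n_curves`, `n_fastballs`) and the function names are hypothetical, and the sketch does not claim to reproduce the numbers above.

```python
import math


def qualified(pitchers, min_pitches=200):
    """Keep pitchers with at least min_pitches curves and fastballs.

    pitchers: list of dicts with hypothetical keys 'n_curves' and
    'n_fastballs'.
    """
    return [
        p for p in pitchers
        if p["n_curves"] >= min_pitches and p["n_fastballs"] >= min_pitches
    ]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)
```

With per-pitcher speed differentials in one list and per-pitcher runs100 values in another, `pearson(speed_diffs, runs100_values)` would give the kind of figure quoted above. A positive value means smaller differentials go with lower (better) runs100.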

Location

Now that we know what curveball components correlate with runs100, we can use the highest scoring ones to examine location. We'll first look at speed differential.

Notes on reading the charts and the data included in the charts:
To increase the sample sizes, I broke the charts down into four zones instead of nine like John Walsh did in his fastball piece. I also included both 2007 and 2008 data up until this point.

Furthermore, I lumped both right-handed and left-handed pitchers together and made the zones relative. The first graph is shown from the perspective of a right-handed batter facing a right-handed pitcher, and the second is shown from the perspective of a left-handed batter facing a right-handed pitcher. To simplify, the batter would be standing to the left of the left chart and to the right of the right chart. This was a necessary step to take to get sample sizes up to a healthy level, and it keeps things succinct as some readers have requested.

Finally, since the sample sizes aren't incredibly large, I removed all pitches that fell within a small slice in the middle of the zone (the slice had a width equal to one-fourth the total width of the strike zone and a height equal to one-fourth the total height of the strike zone). This still left me with big enough samples and made me more confident that one zone didn't have a disproportionate number of pitches that were thrown closer to the middle, skewing the results.
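The zone setup described in these notes can be sketched roughly as follows. The assumptions are mine: `px` is the horizontal location in feet, positive toward the catcher's right; `pz` is the height in feet; the default strike zone bounds are placeholder values; the excluded slice is centered in the zone; and pitches outside the zone are binned by which side of center they land on. None of this comes from the study's own code.

```python
def relative_quadrant(px, pz, bats, zone=(-0.83, 0.83, 1.5, 3.5)):
    """Return 'high-inside', 'high-away', 'low-inside', 'low-away', or None.

    px:   horizontal location in feet, positive toward the catcher's right
          (assumption).
    pz:   height in feet.
    bats: 'R' or 'L', the side the batter hits from.
    zone: (left, right, bottom, top) bounds; the defaults are placeholders.
    None means the pitch fell in the excluded middle slice.
    """
    left, right, bottom, top = zone
    center_x = (left + right) / 2
    center_z = (bottom + top) / 2
    # The slice is one-fourth of the zone's width and height, so it extends
    # one-eighth of each dimension on either side of the center.
    in_slice_x = abs(px - center_x) <= (right - left) / 8
    in_slice_z = abs(pz - center_z) <= (top - bottom) / 8
    if in_slice_x and in_slice_z:
        return None
    # A right-handed batter stands on the catcher's left (negative px here),
    # so "inside" flips with the batter's side.
    inside = px < center_x if bats == "R" else px > center_x
    vertical = "high" if pz >= center_z else "low"
    return f"{vertical}-{'inside' if inside else 'away'}"
```

Making the zones relative to the batter this way is what lets pitches to left-handed and right-handed batters be pooled into the same four buckets.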

Location by speed differential

So what can we learn from these charts?

Well, first, it seems that speed differential matters very little when facing an opposite-handed batter.

All of the dots are between -2 and -4 runs100 (or very close to it) except for the 0-11 mph pitches in the low and away section and the 17+ mph pitches in the low and inside section. The sample sizes here are relatively large, so that probably doesn't explain all of it. It's interesting to note, but the only thing that's very much out of place is that 17+ mph dot. So if you throw a curveball with a huge speed differential, don't throw it down and in to an opposite-handed batter.

Now let's look at the charts to same-handed batters, where things get more interesting. Throwing the ball down and in seems like the worst place to throw it. High and inside is a great location if you have a small speed differential and one of the worst locations if you have a high speed differential. High and away looks like it is generally a good idea, but the lower the speed differential the better. The same goes for down and away, though pitches with very low speed differential seem to break the trend and do a little worse (but still decent).

Location by hump size

Keep in mind when looking at these two charts that hump size is influenced by curveball speed, so there is a good deal of overlap. Because of that, we'll keep this section short. You could skip it if you wish.

Against opposite-handed batters, as with speed differential, hump size doesn't matter much. The very large humps do a little more damage when thrown everywhere but up and in. The other three hump sizes all seem to cluster pretty similarly.

Against same-handed batters, things differ a little from the speed differential graph. Up and in, the 0 to .40 foot hump is incredibly effective at -7 runs100 (didn't even fit on the chart). It's also interesting to note that the largest hump is effective here as well while the middle two are less effective. This could be random fluctuation, although the middle two have over 500 and 1,000 data points (respectively) and the ends each have over 200. Down and away looks pretty much the same except that the smallest hump is more effective than the smallest speed differential is. Down and in and up and away look relatively the same.

Location by vertical movement

Against opposite-handed hitters, the more vertical movement the better down and away, though the differences aren't large. Down and in, those without much vertical movement do worse. Up in the zone, the dots seem to be all over the place. The best combination seems to be lots of vertical movement high and in at -5.7 runs100.

Against same-handed hitters, throwing away is generally the best idea, especially if you have a lot of vertical movement. Throwing down and inside seems like the worst idea, especially if you don't generate much vertical movement. Interestingly, we see the complete opposite trend high and in. Pitches without much vertical movement actually do very well here, second only to those with lots of movement down and away.

Location by horizontal movement

When looking at horizontal movement, we can draw most of the same conclusions we did with vertical movement (probably not a great surprise given the similarity we found between 12-to-6, 11-to-5, and slurvy curves).

Against same-handed batters, throwing to the outer half of a zone seems like a good idea regardless of movement. Down and inside is effective with lots of movement but very ineffective without any. High and inside, we again see the opposite with lots of movement being bad and little movement being good. Really good, actually, at -4.5 runs100 (the most effective dot on the chart).

Against opposite-handed batters, things are smoother than with vertical movement but generally give the same impression. It seems the only way to gain an advantage here is to throw a curve with lots of horizontal movement high and away.

Location - Nine zones
We can draw some more conclusions by splitting the strike zone into nine compartments, although we can't break it down by speed, hump, or movement because sample sizes would be too small. We can look at all curveballs, though, which is shown below:

I've sort of hinted throughout that location didn't seem to matter much when facing an opposite-handed batter. Once we break it down into nine zones, though, this doesn't seem to be entirely true. Here, we see that it is best to throw to the lower two-thirds away or to the upper two-thirds inside. The middle of the zone is the worst place to throw, obviously, but at -2.0 runs100 it is still an effective pitch and close to the other four zones.

Now take a read of this quote from Keith Hernandez’s book, Pure Baseball, quoted in John Walsh's piece on the work.

The right-handed pitcher facing the right-handed batter wants to throw the breaking ball on the outside corner. Why? If any breaking ball misses the target, it's usually to the left, outside, as the right-handed pitcher sees the plate. Locating the breaking ball inside in this righty-righty matchup is even tougher psychologically because the pitcher has to aim almost behind the batter. So the tendency is even more to miss to the left. And if you aim at the inside corner but miss to the left, where does that leave the pitch? Over the inner half—the heart of the plate...

Here's the rule: When the catcher sets up over the inside corner in a righty-righty or lefty-lefty matchup, gasoline is on its way—cheese, cheddar, the fastball—because the catcher is not going to request either the change-up or the breaking pitch on the inside corner.

Keith was probably referring more to the slider, but I think his opinion would remain the same for a slurvy curve or even an 11-to-5 curve since he refers to it as a breaking ball and the concept still seems to apply.

Looking at the above chart, Keith's quote seems to make sense. While the inside curve is very effective in the upper two-thirds of the zone, it is awful down and in. And if the pitcher does miss and it lands in the middle of the zone, we see that it won't be very effective at all. The outer third of the plate is, as Keith said, the best place to throw the curve.

I should note, however, that 29 percent of in-zone curves land in the inner third of the zone against same-handed hitters. That seems like a high number for what Keith essentially calls mistake pitches, especially considering that the middle third (where those mistake pitches will land) makes up 47 percent of in-zone curves. Pitches away have the lowest percentage at just 25 percent.
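Shares like these are simple to tabulate once each in-zone pitch is assigned to a third of the plate relative to the batter. The sketch below is one way to do that, under the assumptions that `px` is the horizontal location in feet (positive toward the catcher's right) and that the zone edges are the placeholder values shown. The function names are hypothetical.

```python
from collections import Counter


def horizontal_third(px, bats, left=-0.83, right=0.83):
    """Return 'inner', 'middle', or 'outer' for an in-zone pitch, else None.

    px:   horizontal location in feet, positive toward the catcher's right
          (assumption).
    bats: 'R' or 'L'.
    left, right: zone edges; the defaults are placeholder values.
    """
    if not (left <= px <= right):
        return None
    width = (right - left) / 3
    index = min(int((px - left) / width), 2)  # 0, 1, 2 from the catcher's left
    if bats == "L":
        index = 2 - index  # a lefty stands on the catcher's right
    return ("inner", "middle", "outer")[index]


def third_shares(pitches):
    """Percentage of in-zone pitches in each third.

    pitches: iterable of (px, bats) pairs.
    """
    labels = (horizontal_third(px, bats) for px, bats in pitches)
    counts = Counter(label for label in labels if label is not None)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: 100.0 * n / total for label, n in counts.items()}
```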

Location and sequencing exercise

I recently found an interesting quote from Gary Sheffield regarding curveball sequencing and location and wanted to take a quick look. First, the quote:

It's about pitching up in the zone with the fastball, then dropping the curveball down in the zone, changing your eye level. If the guy has the same motion and release point and he can make the curveball look like the fastball coming out of his hand, that's the key.

Is this true? Well, while a regular old curveball has a runs100 value of -0.77, a curveball that lands in the upper half of the strike zone has a runs100 value of -2.79.

Examining Sheffield's situation, a curveball that lands in the lower half of the strike zone that follows a fastball that lands in the upper half of the strike zone has a runs100 value of -2.93 (I excluded all situations where there were two or three balls following the first fastball in a crude attempt to limit bias). Not a whole lot of difference.

Sheffield's quote, however, combined with a comment to Josh's curveball article by reader Ike, got me thinking. Ike's quote:

Most pitchers tend to work down in the zone with their fastballs, but come upstairs every so often. A good knee-buckler would give the appearance of a high fastball before having the bottom drop out of it.

So instead of looking at a low curveball following a high fastball, what if we look at a high curveball following a high fastball? Here, the runs100 value jumps all the way to -4.70. Sheffield seems to be right that deception is important, but it seems that a high curveball is more easily confused with a high fastball, and that this confusion matters more than making the hitter change his eye level.
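The sequencing comparison can be sketched as a scan over consecutive pitches within each at-bat. Several things here are assumptions for illustration: the pitch dictionaries and their keys (`type`, `pz`, `balls`, `run_value`), the placeholder zone heights, the reading of "following" as "immediately following", and the reading of the ball-count exclusion as "skip the curve if the count already has two or three balls". Only height is checked, not whether the pitch was over the plate.

```python
def zone_half(pz, bottom=1.5, top=3.5):
    """Return 'upper' or 'lower' for a pitch at height pz, or None if it is
    above or below the zone. The bounds are placeholder values."""
    if not (bottom <= pz <= top):
        return None
    return "upper" if pz >= (bottom + top) / 2 else "lower"


def curves_after_fastballs(at_bats, fb_half, cb_half, max_balls=1):
    """Collect run values of curves that immediately follow a fastball.

    at_bats: list of at-bats, each a list of pitch dicts with hypothetical
             keys 'type' ('FB' or 'CB'), 'pz', 'balls' (ball count before
             the pitch), and 'run_value'.
    fb_half, cb_half: 'upper' or 'lower', the required zone halves.
    max_balls: curves thrown with more balls than this are skipped.
    """
    selected = []
    for pitches in at_bats:
        for prev, cur in zip(pitches, pitches[1:]):
            if prev["type"] != "FB" or cur["type"] != "CB":
                continue
            if cur["balls"] > max_balls:
                continue
            if zone_half(prev["pz"]) == fb_half and zone_half(cur["pz"]) == cb_half:
                selected.append(cur["run_value"])
    return selected


def runs100(run_values):
    """Average run value per 100 pitches, or None for an empty sample."""
    if not run_values:
        return None
    return 100.0 * sum(run_values) / len(run_values)
```

Calling `runs100(curves_after_fastballs(at_bats, "upper", "lower"))` would correspond to Sheffield's scenario, and swapping in `"upper", "upper"` to the high-curve-after-high-fastball case.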

It is interesting to note that Sheffield does seem to be correct about release point being important (horizontal release point, anyway) and about making the curveball seem like a fastball coming in (smaller hump).

Concluding thoughts

That about wraps it up. At the request of some readers, I tried not to include too many of the unimportant details, although I know this article ran kind of long. I know there was a lot of information in here, but that was sort of the point of it, to see where curveballs are most effective. There was just so much data and so many interesting things to look at. As always, your feedback is appreciated, as are any suggestions for future improvements.

References & Resources
MLBAM and Sportvision's PITCHf/x data was used throughout this article. Josh Kalk's collection of this data, corrections to it, and pitch classifications were also used. Josh also helped me in several other fashions, so a big thanks to Josh!

Potential Issues
Here are a few issues that might have influenced the results of this study. Make of them what you will and keep them in mind when considering the implications of the results drawn in this article.

As I briefly mentioned in the article, runs100 isn't a perfect metric. Because it is partially context driven, it doesn't perfectly isolate the raw effectiveness of a pitch. It also doesn't make any adjustments for BABIP, though hopefully when we reach high enough sample sizes that will even out. I'm not quite sure some of the sample sizes used here reached that point, though. The methodology is still pretty sound, however, and it's better than anything else we have to work with right now.

The raw PITCHf/x data still has kinks in it (though Josh Kalk's corrections should go a long, long way in accounting for that), and the pitch classifications used to identify the curveballs aren't 100 percent perfect. They are quite good though, especially since curveballs are one of the easier pitches to classify (though it's still possible a few sliders were classified as curves and vice versa).

The classifications of curveballs as 12-to-6, 11-to-5, etc. obviously aren't perfect either. These were done by me, and I made some very general, sweeping distinctions between them. This should have served our purposes, though.

I don't think there are too many other large issues. All sample sizes consisted of at least 200 pitches unless otherwise noted, and most were much higher than that.