Rider, slurve and… Titanic

It was said that I threw, basically, five pitches—fastball, slider, curve, change-up, and knockdown. I don’t believe that assessment did me justice, though. I actually used about nine pitches—two different fastballs, two sliders, a curve, change-up, knockdown, brushback and hit batsman.

Bob Gibson, Stranger to the Game.

Since PITCHf/x day one, I have been thinking about how to accurately label each pitch thrown. MLBAM’s Ross Paul came up in 2008 with a solution that does the job (fairly well) in real time, but when people write articles about a single pitcher (or a small number of them), they usually perform their own ad hoc classification.

Currently Ross is working on improving the accuracy of his classifying method and we should expect the entire PITCHf/x database to provide labels more similar to the ones we produce on a pitcher-by-pitcher basis (I suppose).

Thus, we just have to wait. End of the article.

Well, not really.

I have a sinking fastball to either side of the plate, a cutter (which changes the direction of my fastball so it breaks instead of sinking) to either side of the plate, a curveball I throw at three speeds and three angles, a straight change—using the same arm speed and position as a fastball but with grip and release that slows it dramatically, and change-ups to different locations that I throw off my sinker and which look like batting practice fastballs.

Orel Hershiser, Out of the Blue.

Since PITCHf/x day one, I have been thinking about the buckets into which pitches must fall. Currently we have:
Fastball
Four-seam fastball
Two-seam fastball
Sinker
Change-up
Curve
Slider
Cutter
Splitter
Knuckle

Are these categories all we need? Do we need all of them? A.J. Burnett throws a hard curve, clocked at above 80 mph; Chris Carpenter’s number two is a 12-6 deuce, much slower (75 mph), with a great arc (-8.9 inches of vertical movement). I’m not sure I want to classify them as the same pitch.

I’m fine with classifying them both as curveballs when writing a top 10 list, like “here are the 10 best curveballs in MLB ranked by Run Value.” I’m not so comfortable, however, when evaluating batters’ abilities against certain types of pitches. For example, when facing right-handed pitchers, Mark Reynolds seems to have success against hard curves (+0.87 runs per 100 pitches in 2009), but suffers against uncle-charlies (-0.26). On the other hand, Pablo Sandoval holds his own facing slow curves (+0.24), while being helpless against tight ones (-0.76). I don’t care whether Burnett and Carpenter both call their pitch simply a curveball: Hitters are definitely seeing two different animals and reacting to them in different ways with different degrees of success.
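The per-100-pitch figures quoted above are rates: the run value a batter accumulates against one bucket of pitches, scaled to 100 pitches. Here is a minimal sketch in Python, assuming every pitch already carries a run value (how that value is derived is outside the scope of this piece). The function name and the sample numbers are made up for illustration.

```python
def runs_per_100(run_values):
    """Batter's run value per 100 pitches for one bucket of pitches."""
    if not run_values:
        raise ValueError("no pitches in this bucket")
    return 100.0 * sum(run_values) / len(run_values)


# Hypothetical per-pitch run values (positive = good for the batter).
sample = [0.05, -0.03, 0.00, 0.02]
rate = runs_per_100(sample)
```

Splitting "curve" into two buckets simply means calling this once per bucket instead of once for all curves, which is why the same hitter can come out positive on one and negative on the other.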


So how many buckets?

This is a question I tried to answer before plugging data into the statistical software. You always have to know what you’re doing before pushing that button, and you should avoid the Texas Sharpshooter’s fallacy. (For those not willing to follow the wiki link, he is the guy who used to shoot, then paint the target around the hole produced by his bullet.)

Fastballs.
Four-seamers and two-seamers are terms that suggest how a pitcher grips the ball, but batters are more focused on how the ball behaves after it has left the hurler’s fingers. And so was the vernacular in the past, as Rob Neyer and Bill James illustrate in their Guide to Pitchers. I expected to find sinking fastballs (the two-seamers), rising fastballs (I know they don’t actually rise; they’re those with a high positive vertical movement), riding fastballs (significant horizontal movement to the throwing side) and cutters (moving to the pitcher’s glove side). Thus, four different pitches, maybe one more if you think the high-90s heaters need to be classified in a league of their own.

Curveballs.
My guess was two, as shown in the example.

Changes of pace.
Let’s not consider Jamie Moyer for now, otherwise we should say three, four … 10?
Probably two again. One harder, with horizontal movement similar to the fastball; the other one more of a pure slow pitch. Then there are splitters and forkballs. Thus, there might be as many as four different change-of-pace pitches, provided that splitters and forkballs, besides having their own names and ways of being delivered, also behave in a peculiar way on their flight toward the plate.

Sliders.
The slider made me stop for a while to consider the big picture: Maybe we can classify all pitches in 10-15 buckets, but chances are that, when we perform cluster analysis on all the pitchers’ pitches, they form a continuum that’s hard to separate into a few categories. Slurves, anyone? Slutters?

Odds and ends.
We are left with knuckleballs, pitches coming from extreme angles (think Chad Bradford), and other strange beasts (bloopers, palmballs, shuutos, gyros).

So when I launched my favorite statistical software, I was expecting to get all the pitches classified in 10 to 15 groups.

A shortcut.

There’s no way to perform a cluster analysis on the millions of pitches in the PITCHf/x database. I took a shortcut that has many limitations, but I believe it can be used for a first take on the subject. Using the MLBAM classification, I took each pitcher’s average fastball, change-up, and so on, then performed the cluster analysis on these average pitches.
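The shortcut boils down to a group-and-average step that runs before any clustering. Here is a minimal sketch in Python. The field names (`pitcher`, `mlbam_type`, `speed`, `h_mov`, `v_mov`) and the sample rows are assumptions made for illustration, not actual PITCHf/x column names or real measurements.

```python
from collections import defaultdict
from statistics import mean


def average_pitches(pitches):
    """Collapse individual pitches to one average row per (pitcher, MLBAM label)."""
    groups = defaultdict(list)
    for p in pitches:
        groups[(p["pitcher"], p["mlbam_type"])].append(p)

    averages = {}
    for key, rows in groups.items():
        averages[key] = {
            "n": len(rows),  # sample size, useful for later filtering
            "speed": mean(r["speed"] for r in rows),
            "h_mov": mean(r["h_mov"] for r in rows),
            "v_mov": mean(r["v_mov"] for r in rows),
        }
    return averages


# Hypothetical rows: two fastballs and one curve from a made-up pitcher.
sample = [
    {"pitcher": "A", "mlbam_type": "FF", "speed": 90.0, "h_mov": -4.0, "v_mov": 9.0},
    {"pitcher": "A", "mlbam_type": "FF", "speed": 92.0, "h_mov": -6.0, "v_mov": 11.0},
    {"pitcher": "A", "mlbam_type": "CU", "speed": 75.0, "h_mov": 6.0, "v_mov": -6.0},
]
averages = average_pitches(sample)
```

Each resulting row is one "average pitch" and becomes a single observation for the clustering step. One of the limitations is visible right here: anything MLBAM mislabeled is averaged in with the rest.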

Following are results for 2009 right-handed pitchers.

Results

Fourteen buckets came out of the* cluster analysis (using speed, horizontal and vertical movement, and release point as the classifying variables). So far we are quite in line with our initial hypotheses.

* In studies like this there’s not one classification, so it should read “the cluster analysis I’ve chosen to show.” Later I will hint at results from other takes on the issue.

The following table shows how pitcher/pitch combinations translate from the MLBAM classification to the one produced by the analysis I performed. I included only those combinations whose group, according to the clustering algorithm, is nearly certain (probability > 95 percent).

                                      "new classification"
                        1   2   3   4   5   6   7   8   9  10  11  12  13  14
  MLBAM
  Change-up            59   1   6   0   0   0   1   1 148   2   7   3   0   0
  Curve                 0 113  17   0   0   8   1   0   0 140   9   0   4   0
  Cutter                0   0   1   1   0   0  57   0   0   0   0   5   0   0
  Fastball              0   0   0   1   4   0  27  14   5   0   5   7   0   9
  Four-seam fastball    0   0   0  10   6   0   6  36   0   0   7  14   0  82
  Knuckle               1   0   0   0   0   4   0   0   0   0   0   0   0   0
  Sinker                0   1   0   0  16   0   2  29   0   0   0   0   0   0
  Slider                0   6 262   0   0   3   8   0   1   1   1   0  41   0
  Splitter              0   1   0   0   0   0   6   0   3   0   0   1   0   0
  Two-seam fastball     0   0   1   1  23   0   4 152   0   0   0   0   0   1
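The filtering and counting behind a table like the one above can be sketched as follows. This assumes the clustering step returns, for each pitcher/pitch combination, one membership probability per cluster; the article does not name the algorithm, so that data shape, the function name and the sample rows are all illustrative assumptions.

```python
from collections import Counter


def confident_crosstab(rows, threshold=0.95):
    """Count (MLBAM label, cluster) pairs whose top probability exceeds threshold.

    rows: iterable of (mlbam_label, [probability for cluster 1, cluster 2, ...]).
    """
    table = Counter()
    for label, probs in rows:
        best = max(range(len(probs)), key=probs.__getitem__)
        if probs[best] > threshold:
            table[(label, best + 1)] += 1  # clusters are numbered from 1
    return table


# Hypothetical membership probabilities for three pitcher/pitch combinations.
rows = [
    ("Curve", [0.01, 0.98, 0.01]),   # confidently cluster 2
    ("Curve", [0.50, 0.40, 0.10]),   # ambiguous, dropped
    ("Slider", [0.00, 0.03, 0.97]),  # confidently cluster 3
]
crosstab = confident_crosstab(rows)
```

Keep in mind that the ambiguous combinations are excluded from the counts, so the table describes only the clear-cut cases.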

To make sense of the new classification, some familiar labels have to replace the numbers from 1 to 14. To do that we need another table, reporting the average characteristics of each of the 14 pitches.

  class. speed   h.mov.  v.mov.
       1 79.5   -5.3     6.9
       2 81.2    3.1    -3.8
       3 84.5    1.7     2.3
       4 96.1   -7.8     6.6
       5 90.8   -9.2     1.2
       6 70.5    3.3     2.8
       7 89.8    0.1     6.5
       8 89.8   -9.4     7.0
       9 84.3   -7.3     4.2
      10 75.1    6.0    -6.0
      11 84.2   -6.2    -3.8
      12 86.6   -3.9    10.3
      13 78.5    5.3     1.3
      14 92.8   -4.1    10.5
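As a rough way to play with these averages, one can ask which of the 14 group centers a given pitch sits closest to. This is a nearest-center lookup, not the clustering used for the article. It also ignores release point (not reported in the table) and uses plain Euclidean distance on unscaled mph and inches, which is a simplifying assumption. The two query pitches are hypothetical.

```python
import math

# Group averages copied from the table above: (speed in mph, h.mov., v.mov.)
CENTROIDS = {
    1: (79.5, -5.3, 6.9),
    2: (81.2, 3.1, -3.8),
    3: (84.5, 1.7, 2.3),
    4: (96.1, -7.8, 6.6),
    5: (90.8, -9.2, 1.2),
    6: (70.5, 3.3, 2.8),
    7: (89.8, 0.1, 6.5),
    8: (89.8, -9.4, 7.0),
    9: (84.3, -7.3, 4.2),
    10: (75.1, 6.0, -6.0),
    11: (84.2, -6.2, -3.8),
    12: (86.6, -3.9, 10.3),
    13: (78.5, 5.3, 1.3),
    14: (92.8, -4.1, 10.5),
}


def nearest_cluster(speed, h_mov, v_mov):
    """Return the number of the group whose average is closest to the pitch."""
    pitch = (speed, h_mov, v_mov)
    return min(CENTROIDS, key=lambda c: math.dist(pitch, CENTROIDS[c]))


# Hypothetical pitches: a slow, big-breaking curve and a high-90s fastball.
slow_curve = nearest_cluster(75.0, 6.0, -8.9)
heater = nearest_cluster(96.0, -8.0, 7.0)
```

With these made-up inputs the slow curve lands in group 10 and the fastball in group 4.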

They put a radar gun on the kid’s fastball a few minutes ago. […] Ninety-three point four miles per hour. That’s how they tell you speed now. They don’t try to show it to you: ‘smoke,’ ‘hummer,’ ‘the high hard one.’ I miss the old clichés. They had life. Who wants to hit a fastball with a decimal point when he can tie into somebody’s ‘heat’?

William Least Heat Moon, Blue Highways.

The curveballs we outlined in our initial example are pitch No. 2 (the tight one) and No. 10 (the slow one).

No. 4 is clearly a fastball, one that very few pitchers can throw, stopping the radar gun in the high 90s. In this bucket fall the well-known heaters of Joel Zumaya and Kyle Farnsworth.
There are several other fastballs in the above tables. No. 8, which makes up most of MLBAM’s two-seamers and two-thirds of the sinkers, actually shows, on average, a vertical movement similar to that of the heaters (No. 4); what differentiates it from nearly all the other pitches is the great horizontal movement (to the throwing-arm side). Some of the pitchers throwing this one are Joba Chamberlain, Joe Nathan, John Lackey and the Cardinals’ one-two starters.

No. 5 is the other pitch with a lot of horizontal movement. Again, the speed is good for a fastball and MLBAM sees it as either a two-seamer or a sinker, and the very low vertical movement value confirms the sinking action. Here are the sinkers of Brandon Webb, Fausto Carmona and Roy Halladay.

No. 12 and No. 14 share the highest vertical movement, though they come at very different speeds. MLBAM sees both of them mainly as four-seamers. The former pitch has Paul Byrd, Livan Hernandez and Trevor Hoffman among its adepts; the latter Joakim Soria, Grant Balfour and Brad Penny.

Finally, No. 7 leaves the pitchers’ arms at around 90 mph and has no lateral movement; this means that the right-handed batter sees the pitch as tailing toward the outside corner, like a slider or a cutter. The velocity and a look at MLBAM’s classification indicate we are dealing with the latter. If you need another clue, yes, Mariano Rivera is in there.

They call that a cut fastball now, but it’s what we used to call a sailer.

Charlie Metro, quoted in The Neyer/James Guide to Pitchers.

Sliders (as defined by MLBAM’s algorithm) go into two buckets. No. 3 shows lateral movement similar to the cutter (No. 7) but lower vertical break; No. 13 breaks more like curves (No. 2 and No. 10) in the horizontal plane, and it’s also clocked at less than 80 mph. We might consider this last pitch as a slurve: the sliders of Jason Schmidt, Bronson Arroyo and Francisco Rodriguez are to be found here.

Changes of pace are split between groups No. 1 and 9. The former runs into right-handed batters like some of the fastballs (No. 7 and No. 8) and is very slow; the latter travels in the mid-80s and exhibits more lateral movement. I wonder whether the pitches also differ in the way they are thrown (circle vs straight change?).

Speaking of slower pitches, there isn’t a cluster identifying the splitters—they end up classified as No. 7, together with the cutters, or as No. 9, the “hard” change-ups.

We are left with a couple of groups. No. 11 collects a mix of change-ups, curves and fastballs. What stands out in the average measures of the pitch is the negative vertical movement. Once we look at the players delivering this kind of pitch, Chad Bradford, Brad Ziegler, Cla Meredith, Peter Moylan, … well you get the point: They are mainly low-arm-angle hurlers.

Finally, No. 6 mixes various really slow pitches. The knuckleballs are there, along with a few leftovers from the submarine/sidearm group and some roundhouse curves (Bronson Arroyo’s and Koji Uehara’s). In case you are wondering whose knuckleball ends up in group No. 1, among the slow change-ups, well, that’s how some of the pitches thrown by the Red Sox’ Dusty Brown were classified by MLBAM.

Let’s name them!

Now the LORD God had formed out of the ground all the beasts of the field and all the birds of the air. He brought them to the man to see what he would name them; and whatever the man called each living creature, that was its name. So the man gave names to all the livestock, the birds of the air and all the beasts of the field.

Gen 2:19-20

Okay, before laying out a few caveats, future planning and concluding observations, let’s give a name to those numbers. I’m going to suggest one or more for each pitch type, and I expect you to either approve one of them or suggest something else in the comments section.

No. 1 – Slow change or, as they used to say in the past, simply slow ball.
No. 2 – Hard curve, tight curve.
No. 3 – Slider.
No. 4 – Heater (hummer, blazer…).
No. 5 – Sinker.
No. 6 – Floater, junk, feather.
No. 7 – Cutter, sailer.
No. 8 – This one tails to the throwing-arm side. I would suggest tailing fastball, but according to Neyer and James, a pitch from a righty that runs into a right-handed batter used to be called a riding fastball.
No. 9 – I really don’t like the terms hard change and slow change, so I expect good suggestions from you for this and No. 1.
No. 10 – Slow curve, drop curve.
No. 11 – Low-arm-angle pitches. What do we call them as a group? Sidearmers? Submariners?
No. 12 – Okay, this is a fastball that’s not quite fast (high 80s), but stays up. I go with rising fastball.
No. 13 – Slurve.
No. 14 – Similar to No. 12, but 4-5 mph faster. Hopper comes to my mind.

Alternate classifications.

The classification I chose to present is the one I found easier to interpret and more in line with the initial hypothesis, so yes, I did some Texas sharpshooting in the end! However, alternate clusterings (obtained with different parameter settings and choices of explanatory variables) produced similar results. In particular, removing the release point information (which many have shown to be inconsistent/unreliable) doesn’t prevent the clustering algorithm from detecting the submariners/sidearmers. That, together with removing the pitcher/pitch combinations with a sample size of less than 30, produced a classification in 19 groups.

The differences from what I outlined in the previous section consisted of:
three change-ups instead of two (two different hard change-ups are identified: one that stays up, with +6 in. of vertical movement, and one that rides into right-handed batters, with -8 in. of horizontal movement);
a third group for the reeeeally slow curves, leaving the pitchers’ arms in the low 70s;
the low-arm-angle pitches split into two groups, with velocity being their main separator (87 mph vs. 78 mph);
one more fastball, something between the hopper (slower), the rising fastball (less “rise”) and the riding one (faster but with a smaller tail); a straight fastball?;
one group mixing fast change-ups and slow fastballs (this is the hardest to digest, but we should have anticipated such a beast; think about the change-ups that look like batting practice fastballs in Hershiser’s quote).
The rest of the classification matches with the 14 groups previously described.
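The data preparation for this alternate run (dropping small samples and leaving out the release point) can be sketched like this. It assumes one dictionary per pitcher/pitch combination with a pitch count `n`; the field names `release_x` and `release_z` and the sample values are hypothetical.

```python
def prepare_alternate(averages, min_pitches=30, drop=("release_x", "release_z")):
    """Keep combinations with at least min_pitches and strip the dropped fields."""
    kept = {}
    for key, row in averages.items():
        if row["n"] < min_pitches:
            continue  # sample too small to trust the average
        kept[key] = {k: v for k, v in row.items() if k not in drop}
    return kept


# Hypothetical averaged rows for two pitcher/pitch combinations.
averaged = {
    ("A", "FF"): {"n": 412, "speed": 91.0, "h_mov": -5.0, "v_mov": 10.0,
                  "release_x": -1.8, "release_z": 6.1},
    ("A", "KN"): {"n": 12, "speed": 68.0, "h_mov": 1.0, "v_mov": 2.0,
                  "release_x": -1.7, "release_z": 6.0},
}
prepared = prepare_alternate(averaged)
```

The threshold and the list of dropped fields are parameters, so the 14-group and 19-group setups differ only in how this step is called.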

Some examples of repertoires.

Doc Halladay’s repertoire, according to this extended labeling of pitches, consists of a riding fastball, a sinker, a cutter, a hard change-up and a slurve.
Josh Beckett gets tagged with heater, sinker, cutter, a slow curve and a fast change-up. Justin Verlander’s classification is quite easy and not much different from MLBAM’s: high heat, slider, tight curve, riding change-up. His two-seamers get labeled as riding fastballs.

The pitchers who really put this classification to a severe test are those continuously varying release angles. Arroyo’s fastballs either rise or run into batters (the rider into righties, the cutter into lefties); his slider gets tagged as a slurve, his curve ends up among the floaters and his change-up is considered a slow one. Jeff Weaver comes out with a slider, a sinker, a cutter, a rider, a hard change and a slow curve.

To-do list.

Classifying pitcher/pitch combinations using average values and starting with MLBAM’s labeling exposes the results to many shortcomings. A better, but way longer, approach would be to first perform a cluster analysis on each pitcher (going game by game would be even better), then perform the “meta-classification” on what comes out of that.
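The two-stage idea can be sketched as follows. To be clear about the assumptions: this uses a bare-bones k-means as a stand-in (the article does not name a clustering algorithm), it starts from the first k points rather than a careful initialization, the number of pitches per pitcher is taken as given, and the pitch values are made up. Each pitch is a (speed, horizontal movement, vertical movement) tuple.

```python
import math


def kmeans(points, k, iters=50):
    """Tiny k-means with a deterministic start from the first k points."""
    centers = [list(p) for p in points[:k]]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda c: math.dist(p, centers[c]))
            clusters[i].append(p)
        new_centers = [
            [sum(col) / len(members) for col in zip(*members)] if members else centers[i]
            for i, members in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # assignments have stopped changing
        centers = new_centers
    return centers


def two_stage(pitches_by_pitcher, k_per_pitcher, k_meta):
    """Stage 1: cluster each pitcher's pitches. Stage 2: cluster the pooled centers."""
    pooled = []
    for pitcher, pitches in pitches_by_pitcher.items():
        pooled.extend(kmeans(pitches, k_per_pitcher[pitcher]))
    return kmeans(pooled, k_meta)


# Hypothetical data: two pitchers, each with a fastball and a curve.
data = {
    "A": [(92, -5, 9), (75, 6, -6), (93, -4, 10), (76, 5, -7)],
    "B": [(91, -6, 8), (74, 7, -5)],
}
meta_centers = two_stage(data, {"A": 2, "B": 2}, k_meta=2)
```

Here the meta step recovers one fastball group and one curve group from the pooled per-pitcher centers. On real data the hard parts are choosing the number of pitches for each pitcher and deciding how to weight the pooled centers.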

Some research is needed to assess whether the pain of going through all the work has any value. Other than checking whether some hitters perform well on the slow curves, but poorly on the tight ones, I believe it would be interesting to check if some combinations make pitchers better: Is riding fastball/hard change a better one-two sequence than heater/slow change? Which one is the better addition for a pitcher who already possesses a slider: a tight curve or a roundhouse one?

Furthermore, I would like to look at injury data in the near future. Are the pitchers with a hopping fastball more likely to make trips to the DL than those gifted with a tailing motion in their number one? (completely made-up example).

Opera naturale è ch’uom favella;
ma così o così, natura lascia
poi fare a voi secondo che v’abbella.

Dante, Par. XXVI, 130-132

Translation: “A natural action is it that man speaks;
But whether thus or thus, doth nature leave
To your own art, as seemeth best to you.”

Meanwhile, let’s get some fun out of this lengthy article. Let’s have some major leaguers make the calls on the pitches’ names.

Tug McGraw:
No. 8 Bo Derek fastball (“nice little tail”);
No. 7 Cutty Sark (“it sails”);
No. 5 Titanic (“it sinks”).

Bill Lee:
No. 2 Toilet seat (“They [Bert Blyleven, Nolan Ryan and Camilo Pascual] threw curveballs that were called “toilet seats.” A lot of hitters would buckle both knees when they first saw it. It gave the appearance that they were on the throne taking a cr**”);
No. 6 Pus.

Satchel Paige:
No. 8 Midnight rider;
No. 1 Nothin’ ball;
No. 14 Jump ball;
No. 6 Bat dodger.

Ricky “Wild Thing” Vaughn:
No. 4 Terminator.

Now it’s your turn, in the comments section below.

References & Resources
Data.
PITCHf/x data from MLBAM.

Books
Bill James, Rob Neyer – The Neyer/James Guide to Pitchers: An Historical Compendium of Pitching, Pitchers, and Pitches
Bill Lee and Jim Prime – Baseball Eccentrics: The Most Entertaining, Outrageous, and Unforgettable Characters in the Game.
Orel Hershiser, Jerry B. Jenkins – Out of the Blue: Orel Hershiser.
Bob Gibson, Lonnie Wheeler – Stranger to the Game: The Autobiography of Bob Gibson.
William Least Heat Moon – Blue Highways: A Journey into America.


Mike Fast
Aryn, you don’t think there’s any value in knowing, for example, “C.J. Wilson added a cutter in 2009 that gives him a fastball to throw to each side of the plate with movement off the plate. This helps him pitch inside to RHB without getting hit around so hard and explains why his platoon splits improved in 2009 and why he has a good chance to succeed as a starter in 2010.” I’m glossing over some details in my paraphrased C.J. Wilson analysis there, but I don’t see how you get that sort of understanding of pitchers without knowing what…
Gilbert

Since Norris Hopper is not a pitcher, it would be consistent to name a pitch for him, in the baseball oxymoron tradition of DH Cecil Fielder and all the players named White.

Zack

The slow, breaking side-arm delivery is often referred to as a frisbee slider.  Think Jeff Nelson.

Matt Lentzner
This all brings up the question of how we should be classifying pitches. Is it by how it was held or by how it was perceived by the batter? For example, if a sidearmer throws a four-seam *grip* fastball the batter is going to see sinker. Was it a four-seamer or a sinker? I think I would tend to agree with Max here that it should be by perception or result and not by the mechanics of it. That means we can’t call something a four-seam fastball since that just confuses things by concentrating on mechanical issues. Likewise we can’t…
lincolndude

Very nice article.  I especially like the graph that points out varying ability to hit the two different types of curves.

I think, though, that we’ll have to be careful when we use this information to determine how well hitters do against pitches in each bucket.

For example, wouldn’t Duchscherer’s fastball and Lincecum’s change both fall into bucket 9?  Hitters are going to have vastly different experiences trying to hit these two pitches because of the mix they come with.

Aryn

Mike,
The vagueness is in how much the name indicates the effect. Might as well focus instead on the physical components of the pitch, and qualify a player’s capacity to throw/hit them directly.

What’s more important: that it’s a “curve”? Or that it’s an 80 mph ball that breaks down and in? Not that I want to hear a sports announcer spout off PITCHf/x during the game, but from an analytical perspective, there is little reason to abstract it with an arbitrary label.

M

Impressive article. Any ideas how we can “guess” what kind of pitches that pitchers threw before this excellent PitchFx Data?

Mike Fast

M, we read the Neyer/James Guide to Pitchers. :)

Luis

Satchel also had a Long Tom fastball if I recall.

ElBonte

#1 sounds like what I mostly hear called a straight change

#9 sounds like a circle change or maybe a power change

It might help to see a more complete list of who is throwing some of those pitches.

Aryn

I guess I’m a bit vague on the value added of classifying pitches.

Outside of being able to make the statement: “Player A sucks at hitting Pitch B from Pitcher C”.

It seems like an extra unnecessary step.

jim

among others satch also had short tom and i believe the five day creeper

Nathaniel Dawson

Very Excellent!

Much improved over lumping a lot of pitches together as if they were the same pitch.

What about reports from some batters that they can see the spin on pitches? That may have an effect on how a hitter perceives a pitch independent of its actual movement, i.e., a hitter might see “slider spin” but be faced with different movement from one pitcher to the next.

Patrick
Aryn, Since the way many of these pitches are THROWN is similar, I’d disagree – It’s very important to be able to label them, as a way of understanding what pitchers do on the mound, rather than just evaluate individual pitches. While I see your point & it’s a good one, remember that you’re forgetting that these aren’t created by some randomizer – Each pitch type is created in a distinct way, and while it’s true that some amount of what’s lumped together (OK, maybe a lot ) wasn’t thrown in the same way as other pitches in that group… …
Kenny

Could this data be made available? I would enjoy going through it to figure out what your system has to say about my favorite pitchers.

Sal Paradise
Patrick, I think the jury’s still out on that. After all, is a batter bad at hitting a hard curve, or is he bad at hitting a ball that drops down and away from the hitter (assuming same-handedness)? Does it matter more how the pitcher throws it, or to know that with the pitches he does throw, pitch X is the best type to get this type of batter out? Is a 4-seam fastball thrown sidearm going to get significantly different results from a standard angle sinker with the same speed/drop? For all batters? Some batters? Is the effect consistent…
marc
Thoroughly enjoyed the article. To echo several of the other comments, I’ve always considered number 9 the circle change. I know the “circle” does indicate the grip of the ball, but it also (obtusely) describes the motion of the ball. Maybe. Having a special designation for top-level fastballs is a great idea. It’s equally interesting to see how few pitchers reach that mark. I am also a fan of “cutter,” since it describes the action of the ball. I would resist losing the category “knuckleball” if only because of its history. A knuckleball certainly floats, but it is not junk.…
Max Marchi
Thank you for all the comments. Kenny, I definitely plan to make the data available, as soon as I’m more comfortable with individual classifications. Aryn (and Patrick and Sal), I thought about simply removing all the labeling and just using the speed/movement characteristics of pitches; but when a batter faces a pitcher, he doesn’t take into account all the pitches thrown by every pitcher, but only those pertaining to the one he is facing (plus maybe some information coming from comparables); thus I believe the discrete way should not be abandoned. Increasing the number of labels seemed to me a…
Greg
A four-seam fastball thrown by a pitcher with a sidearm slot is not going to look or behave like a sinker or a two-seam fastball thrown by a pitcher with a more traditional three-quarters arm slot. A pitcher’s arm slot can affect the movement of a pitch and the way a batter picks up the ball out of the pitcher’s hands. But a pitcher’s fingers are going to be on top of the ball regardless of the arm angle. This is true even in submarine pitchers. If a pitcher fails to keep his fingers on top of the ball, then…