Hudson & Perez: BIS & Pitchf/x
I thought it’d be interesting to see how Baseball Info Solutions (BIS) pitch type data matches up with Pitchf/x’s pitch type identification data for just the two starters in the Nationals opener. Before I begin, note that Pitchf/x has a new field in its data called “type_confidence”, which appears to be a measure of how accurate its classification is. All the Pitchf/x aggregates were found on Josh Kalk’s player cards.
Tim Hudson
Fastball - BIS: 70.5% (89.3mph) | Pitchf/x: 71.8% (91.9mph)
Slider - BIS: 19.2% (84.5mph) | Pitchf/x: 18.0% (86.9mph)
Changeup - BIS: 5.1% (78.2mph) | Pitchf/x: 7.7% (80.5mph)
Cutter - BIS: 2.6% (86.5mph) | Pitchf/x: 2.6% (89.1mph)
Splitter - BIS: 2.6% (78.0mph) | -------------------------
Upon closer inspection, the pitch logs are incredibly similar for Hudson, with BIS and Pitchf/x disagreeing on a mere 3 pitches. Two pitches classified as changeups by Pitchf/x were classified as splitters by BIS. The third was classified as a fastball by Pitchf/x, but as a slider by BIS. That third pitch had the lowest Pitchf/x type_confidence of any pitch in the game.
Odalis Perez
Fastball - BIS: 59.7% (86.0mph) | Pitchf/x: 78.8% (88.7mph)
Cutter - BIS: 28.4% (83.6mph) | -------------------------
Changeup - BIS: 10.4% (79.3mph) | Pitchf/x: 4.4% (81.5mph)
Curve - BIS: 1.5% (74.0mph) | -------------------------
Slider - -------------------- | Pitchf/x: 14.5% (86.5mph)
Splitter - -------------------- | Pitchf/x: 2.9% (85.3mph)
Needless to say, things were not as rosy for Perez. There were 27 pitches in disagreement, not counting 3 pitches that disagree because either BIS or Pitchf/x did not identify them. Nine pitches that BIS classified as cutters were classified as sliders by Pitchf/x. The average type_confidence for the other 18 pitches in disagreement was .56, compared with .68 for pitches in agreement.
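For anyone who wants to repeat this kind of comparison, here is a minimal sketch of the bookkeeping in Python. The pitch log, its labels, and its confidence values are all made up for illustration, and `compare` is a hypothetical helper name; none of this is actual BIS or Pitchf/x data or tooling.

```python
from statistics import mean

# Hypothetical pitch log: (BIS label, Pitchf/x label, Pitchf/x type_confidence).
# None marks a pitch that one source did not identify. All values are made up.
pitches = [
    ("fastball", "fastball", 0.90),
    ("cutter", "slider", 0.60),
    ("changeup", "fastball", 0.50),
    ("fastball", None, None),
    ("slider", "slider", 0.80),
]

def compare(log):
    """Collect type_confidence for agreeing and disagreeing pitches.

    Pitches that either source left unidentified are counted separately
    and excluded from both lists.
    """
    agree, disagree, unidentified = [], [], 0
    for bis, pfx, conf in log:
        if bis is None or pfx is None:
            unidentified += 1
        elif bis == pfx:
            agree.append(conf)
        else:
            disagree.append(conf)
    return agree, disagree, unidentified

agree, disagree, unidentified = compare(pitches)
print(len(disagree), "disagreements;", unidentified, "unidentified")
print("mean type_confidence (agree):", round(mean(agree), 2))
print("mean type_confidence (disagree):", round(mean(disagree), 2))
```

Keeping the unidentified pitches out of both lists mirrors the way the 3 non-identified pitches are set aside above.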
For what it’s worth, Perez claims to throw a fastball, cutter, changeup and curveball.
That’s all I got for now. I haven’t had time to take a hard look into why the pitches might have been classified differently by using any of the break data, and nothing is standing out as different about those pitches at first glance.
Update: Josh Kalk over at The Hardball Times took an in-depth look at Pitchf/x’s pitch classification for Tim Hudson’s first start.
John Beamer said,
April 2, 2008 @ 3:58 am
Great analysis, Dave. It is really important that we can cross-reference the data sources. Is the gameday pitch analysis done by sight or through an algorithm?
David Appelman said,
April 2, 2008 @ 9:24 am
Thanks John. I’d guess the gameday pitch type is done with an algorithm since they have the type_confidence stat.
ultxmxpx said,
April 2, 2008 @ 12:42 pm
Thanks for this.
Before seeing this I had analyzed Odalis’ pitches from the pitch f/x data as 44% fb, 38% ct, 12% ch, 1% sl. But if Perez says he throws a curveball, the “slider” must be a curveball, even if it is a pathetic one. The pfx_x value did appear to be too positive, so once that’s fixed gameday will probably call more cutters (or at least call more sliders). There’s not much of a difference between Odalis’ fastball and changeup, so it’s hard to draw the line between the two. The division between the cutter and fastball is not clear either. All in all, he’s a hard pitcher to analyze.
Mike Fast said,
April 2, 2008 @ 2:44 pm
Here’s how I classify the pitches from Odalis Perez:
sv_id type
080330_202246 fastball
080330_202313 cutter
080330_202337 fastball
080330_202359 cutter
080330_202418 cutter
080330_202453 fastball
080330_202508 fastball
080330_202525 fastball
080330_202541 changeup
080330_202606 changeup
080330_202632 cutter
080330_202657 changeup
080330_202716 changeup
080330_202741 changeup
080330_202821 fastball
080330_202858 fastball
080330_202913 cutter
080330_202934 fastball
080330_202952 cutter
080330_203012 fastball
080330_203040 cutter
080330_205149 cutter
080330_205241 fastball
080330_205257 fastball
080330_205327 cutter
080330_205410 fastball
080330_205438 cutter
080330_205452 fastball
080330_210209 cutter
080330_210222 fastball
080330_210258 fastball
080330_210310 changeup
080330_210328 changeup
080330_210431 fastball
080330_210507 curveball
080330_211307 cutter
080330_211322 changeup
080330_211351 fastball
080330_211407 changeup
080330_211424 fastball
080330_211458 cutter
080330_211511 cutter
080330_211526 fastball
080330_211612 fastball
080330_211631 cutter
080330_211651 changeup
080330_211707 cutter
080330_211740 cutter
080330_211755 cutter
080330_211815 changeup
080330_211831 cutter
080330_211849 cutter
080330_211931 fastball
080330_211954 cutter
080330_212019 changeup
080330_212100 fastball
080330_212123 fastball
080330_212156 cutter
080330_213101 fastball
080330_213115 cutter
080330_213157 cutter/changeup?
080330_213255 cutter
080330_213315 fastball
080330_213342 cutter
080330_213437 fastball
080330_213521 fastball
080330_213629 fastball
080330_213645 cutter
080330_213704 fastball
The x-z component of the spin rate is a very useful tool for classifying Perez’s pitches. His cutters have a low spin rate in the x-z plane, and his change-ups can then be separated from his fastballs based on speed (as well as spin parameters). I have 41% fastballs, 39% cutters, 17% changeups, and 1% curves.
Perez’s curveball is pretty slurvy. I’m comfortable calling it a curveball based on the speed and his personal identification, but its spin characteristics are somewhere between slider and curve.
I am unsure about the pitch with Sportvision ID of 080330_213157. It has some characteristics of both a cutter and a changeup. If I had to choose, I’d say it was a cutter.
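As a rough illustration of the rule described in this comment (low x-z spin means cutter, then speed separates changeups from fastballs), here is a short Python sketch. The two thresholds and the `classify` function are hypothetical placeholders, not values or code from the comment, and the sketch ignores the curveball and the other spin parameters mentioned above.

```python
# Hypothetical cutoffs for illustration only; they are NOT taken from the
# comment above and would need to be tuned against real Pitchf/x data.
XZ_SPIN_CUTTER_MAX = 1000.0  # rpm, assumed upper bound for "low" x-z spin
CHANGEUP_SPEED_MAX = 82.0    # mph, assumed fastball/changeup dividing line

def classify(speed_mph, xz_spin_rpm):
    """Label a pitch by x-z spin first, then by speed."""
    if xz_spin_rpm < XZ_SPIN_CUTTER_MAX:
        return "cutter"
    if speed_mph < CHANGEUP_SPEED_MAX:
        return "changeup"
    return "fastball"

# Made-up example pitches: (speed in mph, x-z spin in rpm).
for speed, spin in [(88.5, 1800.0), (84.0, 600.0), (80.0, 1700.0)]:
    print(speed, spin, classify(speed, spin))
```

A pitch that sits near both cutoffs, like the one with Sportvision ID 080330_213157, is exactly where a hard-threshold rule of this kind becomes a judgment call.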
David Appelman said,
April 2, 2008 @ 3:08 pm
Mike, from your list, BIS differs on 14 pitches, mostly pitches you identify as cutters, plus a few changeups, that BIS identifies as fastballs. Two pitches you identify as changeups were classified as cutters by BIS.
ultxmxpx said,
April 2, 2008 @ 4:25 pm
I classified the pitches the exact same as Mike, except I called the cutter/changeup mystery pitch a fastball.