﻿ Batted balls and cheese | The Hardball Times

# Batted balls and cheese

Stringers are in every major league park, and most levels of minor league ball, too. They manually record various aspects of the game as it progresses. If you’re watching an MLB.com Gameday feed, you’re seeing a combination of PITCHf/x data (speed, location, pitch type, etc.) and stringer observation (where the ball went, how it got there, who fielded it, etc.). There’s a level of detail that’s not often discussed—or present—in the Gameday information, that could provide assistance in evaluating pitches, and the pitchers who throw them.

Two weeks ago, I looked at the variation, or bias, shown by stringers when classifying batted balls. I take interest in such things since I’d like to calculate pitch-by-pitch run values based on the type of batted ball, not the outcome. In other words, I’m interested in line drives and fly balls, not outs and hits directly. While Gameday consistently provides classifications for line drives, fly balls, pop-ups and grounders, it sometimes adds a nice little descriptor: soft or sharp. If you look closely, you’ll find that it is possible, according to some stringers, to bunt the ball sharply, or even hit a sharp pop-up.

### Refresher on batted balls

Since clicking the link to an old article (two full weeks!) is taxing, here’s a breakdown of batted ball types, with the likelihood and value of the various outcomes. (Outcomes are hits and outs; values are Linear Weights.)

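| | Home Run | Single | Double | Triple | Out |
|---|---|---|---|---|---|
| Line Drive | .022 | .524 | .174 | .015 | .224 |
| Ground Ball | .000 | .219 | .018 | .001 | .693 |
| Fly Ball | .119 | .057 | .082 | .013 | .597 |
| Pop Up | .000 | .013 | .014 | .000 | .975 |
| LW | 1.468 | .489 | .768 | 1.052 | -.289 |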
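To make the idea of a run value per batted ball type concrete, here is a minimal sketch that multiplies each outcome probability in the table by its linear weight and sums the results. The names `LW`, `PROBS` and `run_value` are made up for illustration. The rows do not all sum to exactly one, so anything outside the five listed outcomes is treated as worth zero runs, which is an assumption rather than something the table states.

```python
# Linear weights (runs) per outcome, copied from the LW row of the table.
LW = {"HR": 1.468, "1B": 0.489, "2B": 0.768, "3B": 1.052, "Out": -0.289}

# Outcome probabilities per batted ball type, copied from the same table.
PROBS = {
    "Line Drive":  {"HR": 0.022, "1B": 0.524, "2B": 0.174, "3B": 0.015, "Out": 0.224},
    "Ground Ball": {"HR": 0.000, "1B": 0.219, "2B": 0.018, "3B": 0.001, "Out": 0.693},
    "Fly Ball":    {"HR": 0.119, "1B": 0.057, "2B": 0.082, "3B": 0.013, "Out": 0.597},
    "Pop Up":      {"HR": 0.000, "1B": 0.013, "2B": 0.014, "3B": 0.000, "Out": 0.975},
}


def run_value(probs, weights=LW):
    """Probability-weighted sum of linear weights for one batted ball type.

    Assumption: outcomes not listed in the table contribute zero runs.
    """
    return sum(p * weights[outcome] for outcome, p in probs.items())


RUN_VALUES = {bb_type: run_value(p) for bb_type, p in PROBS.items()}

for bb_type, value in RUN_VALUES.items():
    print(f"{bb_type:<12}{value:+.3f}")
```

Under these assumptions, line drives and fly balls come out with positive run values, while grounders and pop-ups come out negative.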

### Contact types

Now let’s layer on the cheddar cheese: soft or sharp, or none of the above. Home runs are never sharp or soft, and any play that has an error results from a normal batted ball. Allegedly. I give home runs their own contact tag; errors (and bunts) I usually ignore.


| event | contact | # |
|---|---|---|
| Bunt | soft | 6 |
| Bunt | normal | 2,550 |
| Bunt | sharp | 1 |
| Error | normal | 1,203 |
| Fly ball | soft | 1,109 |
| Fly ball | normal | 27,867 |
| Fly ball | sharp | 165 |
| Fly ball Home run | homer | 3,924 |
| Ground ball | soft | 752 |
| Ground ball | normal | 46,718 |
| Ground ball | sharp | 1,209 |
| Line drive | soft | 1,644 |
| Line drive | normal | 18,787 |
| Line drive | sharp | 812 |
| Line drive Home run | homer | 471 |
| Pop-up | soft | 147 |
| Pop-up | normal | 8,475 |
| Pop-up | sharp | 1 |
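The share of each batted ball type that gets a soft or sharp tag can be derived directly from these counts. The sketch below is one way to do it, assuming "tagged" means soft or sharp and leaving the home run rows out entirely. `COUNTS`, `tag_rate` and `OVERALL_TAG_RATE` are illustrative names, not anything from Gameday.

```python
# Counts copied from the table; home run rows are left out because
# homers carry their own contact tag.
COUNTS = {
    "Bunt":        {"soft": 6,    "normal": 2550,  "sharp": 1},
    "Error":       {"normal": 1203},
    "Fly ball":    {"soft": 1109, "normal": 27867, "sharp": 165},
    "Ground ball": {"soft": 752,  "normal": 46718, "sharp": 1209},
    "Line drive":  {"soft": 1644, "normal": 18787, "sharp": 812},
    "Pop-up":      {"soft": 147,  "normal": 8475,  "sharp": 1},
}


def tagged(contact_counts):
    """Number of balls tagged soft or sharp (assumed definition of 'tagged')."""
    return contact_counts.get("soft", 0) + contact_counts.get("sharp", 0)


def tag_rate(contact_counts):
    """Fraction of balls of one type that carry a soft or sharp tag."""
    return tagged(contact_counts) / sum(contact_counts.values())


TAG_RATES = {event: tag_rate(c) for event, c in COUNTS.items()}

# Pooled rate over every non-homer batted ball, bunts and errors included.
OVERALL_TAG_RATE = (
    sum(tagged(c) for c in COUNTS.values())
    / sum(sum(c.values()) for c in COUNTS.values())
)

for event, rate in TAG_RATES.items():
    print(f"{event:<12}{rate:.4f}")
print(f"{'Overall':<12}{OVERALL_TAG_RATE:.4f}")
```

Pooling bunts and errors in with the four main types is a choice made here for simplicity; dropping them changes the overall rate slightly.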

Not every park (i.e., stringer) tags batted balls at the same frequency. The deeper we go, the more we need HITf/x.

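| Park | Tag Freq. |
|---|---|
| sln | .0957 |
| bal | .0815 |
| flo | .0766 |
| nya | .0741 |
| was | .0732 |
| nyn | .0684 |
| bos | .0671 |
| phi | .0670 |
| kca | .0635 |
| cin | .0629 |
| atl | .0625 |
| mil | .0576 |
| chn | .0574 |
| tor | .0542 |
| sfn | .0520 |
| tba | .0514 |
| cha | .0448 |
| cle | .0442 |
| col | .0395 |
| oak | .0391 |
| sea | .0388 |
| tex | .0371 |
| min | .0364 |
| ana | .0324 |
| det | .0293 |
| pit | .0274 |
| ari | .0206 |
| hou | .0198 |
| sdn | .0196 |
| lan | .0159 |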

AT&T is smack on the average (.0524). The difference between Busch III and Chavez Ravine is six-fold. That’s a problem, but I’ll forge ahead.
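The six-fold figure is just the highest park frequency divided by the lowest. A small sketch of that arithmetic, using the park codes and frequencies from the table (`PARK_TAG_FREQ` and the other variable names are illustrative):

```python
# Tag frequency by park, copied from the table.
PARK_TAG_FREQ = {
    "sln": 0.0957, "bal": 0.0815, "flo": 0.0766, "nya": 0.0741, "was": 0.0732,
    "nyn": 0.0684, "bos": 0.0671, "phi": 0.0670, "kca": 0.0635, "cin": 0.0629,
    "atl": 0.0625, "mil": 0.0576, "chn": 0.0574, "tor": 0.0542, "sfn": 0.0520,
    "tba": 0.0514, "cha": 0.0448, "cle": 0.0442, "col": 0.0395, "oak": 0.0391,
    "sea": 0.0388, "tex": 0.0371, "min": 0.0364, "ana": 0.0324, "det": 0.0293,
    "pit": 0.0274, "ari": 0.0206, "hou": 0.0198, "sdn": 0.0196, "lan": 0.0159,
}

# Parks with the most and least frequent soft/sharp tagging.
most_tagged = max(PARK_TAG_FREQ, key=PARK_TAG_FREQ.get)
least_tagged = min(PARK_TAG_FREQ, key=PARK_TAG_FREQ.get)

# How many times more often the busiest stringer tags than the quietest one.
spread_ratio = PARK_TAG_FREQ[most_tagged] / PARK_TAG_FREQ[least_tagged]

print(f"{most_tagged} / {least_tagged} = {spread_ratio:.2f}")
```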

### What’s a sharp liner worth to ya?

Breaking down the batted balls by contact type (and ignoring home runs), here are the event probabilities by batted ball type.

| Line Drive | # | Single | Double | Triple | Out(s) |
|---|---|---|---|---|---|
| all | 21,243 | .537 | .178 | .015 | .211 |
| normal | 18,787 | .527 | .188 | .016 | .213 |
| sharp | 812 | .413 | .209 | .018 | .249 |
| soft | 1,644 | .707 | .052 | .001 | .167 |

Line drives are the most likely to be tagged: nearly 12 percent. The sharp line drive yields more outs than the other types. It also gets fewer singles and more extra-base hits. The soft line drive is turned into fewer outs, more singles and far fewer extra-base hits. All of that is intuitive, apart from the sharp liners being turned into more outs. I can speculate about the human factors involved, but I’ll leave that for the comments.

| Pop Up | # | Single | Double | Triple | Out(s) |
|---|---|---|---|---|---|
| all | 8,623 | .013 | .014 | .000 | .975 |
| normal | 8,475 | .011 | .014 | .000 | .977 |
| sharp | 1 | .000 | .000 | .000 | 1.000 |
| soft | 147 | .122 | .034 | .000 | .844 |

I suppose the soft pops are the bloops over the infield. Less than 2 percent of pop-ups are tagged, so not much to see here.

| Ground Ball | # | Single | Double | Triple | Out(s) |
|---|---|---|---|---|---|
| all | 48,679 | .220 | .018 | .001 | .644 |
| normal | 46,718 | .198 | .016 | .001 | .666 |
| sharp | 1,209 | .775 | .123 | .008 | .065 |
| soft | 752 | .697 | .003 | .000 | .198 |

About 4 percent of grounders are tagged. Not surprisingly, the sharp grounders have good outcomes; so good, in fact, that they’re the best of the lot. Home runs not included, of course. A soft grounder is a good thing, too. This is the only type of the four that has more sharps than softs.

| Fly Ball | # | Single | Double | Triple | Out(s) |
|---|---|---|---|---|---|
| all | 29,141 | .064 | .092 | .015 | .622 |
| normal | 27,867 | .038 | .092 | .015 | .642 |
| sharp | 165 | .042 | .333 | .103 | .352 |
| soft | 1,109 | .707 | .069 | .002 | .162 |

Fly balls are tagged about as often as grounders, but lean heavily toward soft over sharp when tagged. Ground balls are tagged more on the sharp side, but the majority isn’t overwhelming. Sharp fly balls are, essentially, extra-base hits. Soft fly ball outcomes are very similar to the outcomes for soft line drives and soft ground balls. I wonder if a soft fly ball and a soft line drive are actually the same thing.
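One rough way to put a number on "what’s a sharp liner worth" is to apply the linear weights from the refresher table to the outcome rates in the four contact tables. The sketch below does that for non-home-run balls only. It assumes that outcomes missing from the tables (errors and the like) are worth zero runs, and the names `LW`, `CONTACT_PROBS` and `CONTACT_RUN_VALUES` are illustrative. The sharp pop-up row is a sample of one, so its value means little.

```python
# Linear weights for the non-homer outcomes, from the refresher table.
LW = {"1B": 0.489, "2B": 0.768, "3B": 1.052, "Out": -0.289}
OUTCOMES = ("1B", "2B", "3B", "Out")

# (single, double, triple, out) rates per batted ball type and contact tag,
# copied from the four tables in this section.
CONTACT_PROBS = {
    ("Line Drive", "normal"):  (0.527, 0.188, 0.016, 0.213),
    ("Line Drive", "sharp"):   (0.413, 0.209, 0.018, 0.249),
    ("Line Drive", "soft"):    (0.707, 0.052, 0.001, 0.167),
    ("Pop Up", "normal"):      (0.011, 0.014, 0.000, 0.977),
    ("Pop Up", "sharp"):       (0.000, 0.000, 0.000, 1.000),
    ("Pop Up", "soft"):        (0.122, 0.034, 0.000, 0.844),
    ("Ground Ball", "normal"): (0.198, 0.016, 0.001, 0.666),
    ("Ground Ball", "sharp"):  (0.775, 0.123, 0.008, 0.065),
    ("Ground Ball", "soft"):   (0.697, 0.003, 0.000, 0.198),
    ("Fly Ball", "normal"):    (0.038, 0.092, 0.015, 0.642),
    ("Fly Ball", "sharp"):     (0.042, 0.333, 0.103, 0.352),
    ("Fly Ball", "soft"):      (0.707, 0.069, 0.002, 0.162),
}


def run_value(rates):
    """Probability-weighted linear weights; unlisted outcomes count as zero."""
    return sum(rate * LW[outcome] for outcome, rate in zip(OUTCOMES, rates))


CONTACT_RUN_VALUES = {key: run_value(r) for key, r in CONTACT_PROBS.items()}

# Print from most to least valuable.
for (bb_type, contact), value in sorted(
    CONTACT_RUN_VALUES.items(), key=lambda item: item[1], reverse=True
):
    print(f"{bb_type:<12}{contact:<8}{value:+.3f}")
```

By this measure the sharp grounder ranks first, and the soft fly ball and soft line drive land close to each other, which fits the suspicion that they are the same kind of batted ball.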

### Conclusion

We really need HITf/x. Well, we do have some data: 15,000 batted balls from April. Next week I’ll wrap up this series by comparing HITf/x data to various stringer tags—batted ball type and contact.

References & Resources
Batted ball classifications from MLBAM’s Gameday, data from 2009 MLB regular season through Sept. 13

Guest
Brian Cartwright

…and the problem is even more extreme in the minor leagues. My Gameday data is 2006-2009, but shows the same patterns at the major league level.

Guest
Stevenell

Wow, these are even more biased than the normal classifications. Obviously, a stringer wants to point out that a guy hit a “sharp” line drive when he makes an out, but doesn’t worry about it as much if he gets a hit.

Same thing with “soft” ground balls. They want to make sure it is known that it was a swinging bunt, so they make sure to tag it. If the person was thrown out, they might not think to tag it as such.

At least those are my theories.  Looking forward to the hitf/x.

Guest
Colin Wyers
I think that over time the data quality issues with Hit F/X will prove more malleable than the data quality issues with human stringer data, although I could be wrong. And once you actually track the ball along the entire flight path (which is part of the overall DRE they were showing us in SF) the issues with Hit F/X become a moot point. But yes, distance, vector and hang time will tell you pretty much everything you want to know. I know you at BIS are starting to track that, and I believe MGL is working on a project…
Guest
Alan Nathan
The problems with hitf/x that various of you refer to are indeed tractable. A couple of months ago, I started work on a technique to correct the hitf/x data for the fact that the ball is tracked over a region that does not include the contact point, then extrapolated to the contact point assuming constant velocity. The velocity is not constant, but the change in velocity (i.e., the acceleration) can be estimated and the data corrected. When this happens, then the reported data will more accurately reflect the velocity of the ball (magnitude and direction) at the impact point.…
Guest
Harry Pavlidis

jedlovec3 – no, I did not, but I should. Maybe that will go into the next follow-up.

Alan – you’re retired, allegedly, so get to work on this! You have the autonomy.

Guest
fjm(anuel)

I echo Stevenell’s sentiments.  Namely, that there may be an inherent bias in the classification of balls that are turned into outs, so that many “sharp” line drives that are hits are unreported as sharp.

Guest
Harry Pavlidis

I agree with Colin. The human factor is far less tractable than the problems with HITf/x