Jeremy Hellickson and Re-Defining BABIP

There aren’t many mainstream writers out there who are willing to tackle sabermetric concepts with players, so I have to give Marc Topkin from the Tampa Bay Times a tip of the hat for mentioning BABIP to Jeremy Hellickson. Not only did he mention a sabermetric statistic to a player, but he brought one up that makes Hellickson look bad:

“Yea, I just got lucky on the mound,” Jeremy Hellickson says dryly. “A lot of lucky outs.” […]

“I hear it; it’s funny,” Hellickson said, not quite sure of the acronym. “I thought that’s what we’re supposed to do, let them put it in play and get outs. So I don’t really understand that. When you have a great defense, why not let them do their job? I’m not really a strikeout pitcher; I just get weak contact and let our defense play.”

First of all, I have to agree with Craig Calcaterra on this one: I couldn’t give a rat’s patootie if Hellickson knows about or understands BABIP. Sabermetrics is a field most helpful to front office personnel and managers, and while some players find it useful, players don’t need to be saberists in order to be good players. And anyway, it’s never going to be the successful players that stumble upon sabermetrics; it’s always going to be the borderline players, the ones looking for any sort of possible advantage to help them get ahead. So should I be annoyed that Hellickson is pooh-poohing BABIP? No, not in the least. Good on him.

Instead, this article caught my eye for a different reason: it refers to BABIP primarily as a measure of luck. Hellickson had a low BABIP, which therefore meant he was lucky on balls in play last year. Any pitcher with a low BABIP is therefore “lucky”, and any pitcher with a high BABIP is “unlucky”.

This is a common perception about BABIP, and one that used to be in favor among sabermetric circles. Heck, I subscribed to this philosophy three or four years ago, and I used “luck” as a quick way of describing BABIP to the uninitiated. But these days, that’s an outdated mindset and, quite frankly, misleading.

BABIP is one of the most important sabermetric concepts, but it’s also one of the most misunderstood. What does BABIP tell us? What doesn’t it tell us? Let’s explore.
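
For reference, BABIP is conventionally computed as hits minus home runs, divided by at-bats minus strikeouts minus home runs plus sacrifice flies. A minimal Python sketch of that standard formula follows; the function name and the sample stat line are made up for illustration and are not any real pitcher’s numbers.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (hits - home_runs) / balls_in_play


# Hypothetical stat line, for illustration only.
print(round(babip(hits=150, home_runs=20, at_bats=700, strikeouts=120, sac_flies=5), 3))
```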

In the FanGraphs Library, I state that there are four main variables that affect a player’s BABIP: defense, talent level, skill set, and luck. Obviously, pitchers who play in front of good defenses are going to have more of their balls in play turned into outs, and that’s especially true if managers know how to position their defenders optimally (see: Maddon, Joe). That’s an easy variable to see, and it’s why the Rays had an MLB-best .265 team BABIP last season. Not a single starting pitcher on the Rays last season had a BABIP higher than .284, considerably lower than the MLB average rate of .293.

Next, you have to consider how a player’s talent level affects their BABIP. If I were to step into a major league game, there’s no way I’d post a .300 BABIP; players would rip the ball off me, and I’d be lucky to get an out. The only pitchers who will survive in the majors are those that are good enough to retire major league hitters at a league-average rate. Pitchers may see their talent levels decrease if they’re playing through an injury, and their BABIPs will typically spike as a result.

Different skill sets can also influence BABIP rates. Pitchers with high strikeout rates tend to generate weaker contact and, therefore, allow fewer hits on balls in play. The same is generally true of relievers, as they can dial up the intensity over shorter outings. Also, fly balls fall in for hits less often than ground balls, and extreme ground ball pitchers (55% and above) are better at inducing ground balls that are easy for their defense to turn into outs.

And then, after all these other variables, comes luck. There are always going to be times when a pitcher performs well, but weakly hit balls sneak through the infield and multiple bloop hits fall in back to back. That happens; it’s part of the game. Over the course of a full season, some players may find themselves on Lady Luck’s bad (or good) side more often than others, and that’s something we simply can’t quantify. In my opinion, though, luck has a lot less to do with BABIP fluctuations than we tend to assume.

So instead of the word “luck”, I prefer to use “random variation” to describe BABIP. If a pitcher posts an abnormal BABIP, it doesn’t necessarily matter why that player’s BABIP was so extreme — it’s still liable to regress. Based on all the above criteria, certain pitchers will regress toward different means. Strikeout pitchers will do slightly better than non-strikeout pitchers. Players pitching in front of good defenses should have lower BABIPs than players on defensively-challenged teams. These variables give each pitcher a different “expected” BABIP, with most major-league starters falling within the .275-.300 range.
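
One rough way to picture regressing toward a pitcher-specific mean is a weighted average of the observed BABIP and the expected one, with the weight on the observed figure growing with the number of balls in play. The sketch below assumes that simple shrinkage form; the `ballast` constant, the 500 balls in play, and the function name are illustrative assumptions, not values from this article or any published study.

```python
def regressed_babip(observed, expected, balls_in_play, ballast=2000):
    """Shrink an observed BABIP toward a pitcher-specific expected BABIP.

    `ballast` is a hypothetical number of balls in play at which the observed
    and expected values get equal weight. It is an assumption, not a
    researched constant.
    """
    weight = balls_in_play / (balls_in_play + ballast)
    return weight * observed + (1 - weight) * expected


# Same observed BABIP, two different expected means (strong vs. average defense).
# 500 balls in play is an assumed round number for one season.
print(round(regressed_babip(0.223, 0.270, 500), 3))
print(round(regressed_babip(0.223, 0.293, 500), 3))
```

The point of the sketch is only that the projection depends on which mean you regress toward, not on why the observed number was extreme.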

If you look closely at Jeremy Hellickson, his .223 BABIP from last year doesn’t seem quite that scary going forward. He’s playing in front of a spectacular defensive team, so we should already expect his BABIP to be closer to .270-.280 than .290. Not only that, but Hellickson is an extreme fly ball pitcher, and fly balls fall in for hits less often than grounders. He also has shown a tendency in his young career to generate an extreme number of infield pop-ups, a la Jered Weaver (.276 career BABIP). While Hellickson may not be a high strikeout pitcher, these variables suggest we should expect a BABIP closer to .270 than .290 — a third less regression than he’d see if he regressed all the way to league-average.
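
The “a third less regression” figure can be checked with numbers already given: a .223 BABIP, a .293 league average, and a .270 expected mark. A quick Python check, with variable names of my own choosing:

```python
observed = 0.223        # Hellickson's BABIP last year
league_average = 0.293  # MLB average cited above
expected = 0.270        # the expectation argued for here

full_regression = league_average - observed  # gap to league average
partial_regression = expected - observed     # gap to the expected mark
reduction = 1 - partial_regression / full_regression

print(f"{full_regression:.3f} vs {partial_regression:.3f}, {reduction:.0%} less")
```

That is 47 points of regression instead of 70, or roughly a third less.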

It may look simple at first blush, but BABIP is actually one of the more complex sabermetric statistics. It’s not nearly as simple or cut-and-dried as many make it out to be, and I wouldn’t be surprised if we decrease the importance of “luck” even further once HITf/x data becomes available (if it ever does). So the next time you see or hear someone refer to BABIP as a luck statistic, be sure to mention that luck has little to do with it: it’s random variation.






Steve is the editor-in-chief of DRaysBay and the keeper of the FanGraphs Library. You can follow him on Twitter at @steveslow.


Ender
Guest
4 years 5 months ago

BABIP also fluctuates pretty significantly based on pitch selection, being ahead or behind in the count and park factors. It is a hard one to nail down which is why you tend to regress it towards a player’s recent average more than the league average.

Josh
Guest
4 years 5 months ago

There hasn’t been a more must-read article on Fangraphs recently than this one. It’s insane how often BABIP gets misused.

44
Guest
4 years 5 months ago

Yes, Steve gets a standing O on this one

Clark
Guest
4 years 5 months ago

Agree. Best article I’ve read on here in a long while.

williams .482
Member
4 years 5 months ago

I agree. You should link to this in the Library BABIP page.

JimNYC
Guest
4 years 5 months ago

My only problem with this article was his saying that something he believed three or four years ago is now “outdated.” It normally takes decades (if not centuries) of peer-reviewed study to determine whether a theory is valid or not, but hey, let’s decide over the course of a few short years what’s true and what’s not, by all means.

MrKnowNothing
Guest
4 years 5 months ago

OK. You wait a few more centuries before buying into whether various baseball theories are valid or not.

baty
Guest
4 years 5 months ago

That’s true, but what’s important is that this article acknowledges a need to not present truth within something that we don’t yet understand well enough.

James Gentile
Member
4 years 5 months ago

Yeah, if DIPS theory still holds true by 2312, I’ll finally cave.

cs3
Member
4 years 5 months ago

sure JimNYC, let’s wait around for a period of time longer than baseball itself has even been around to start trying to quantify anything.

Michael
Guest
4 years 5 months ago

You seem to be referring to science theories. The reason science theories take so long to validate or change is because you need the data sets which take a long time to develop (especially things like cancer treatments, etc.) Baseball has much of the data in front of you currently. Not to mention what he is really saying is that originally DIPS may have suggested that variation in BABIP was just noise and assumed luck. More recently with line drive rates and other more in depth knowledge about how balls are put in play the theory SHOULD BE TWEAKED.

bstar
Guest
4 years 5 months ago

I certainly hope all the other authors on this site read this article also. They’re just as guilty of assigning “luck” to variances in BABIP as anyone.

Dekker
Guest
4 years 5 months ago

There’s also the extreme minority of pitchers like Marcum, Jurrjens, Cain that have unimpressive peripherals, batted ball types, and mediocre defenses but can still generate weak contact in their careers. Babip is a great tool, but it’s not the end all be all that explains pitching.

Evan
Guest
4 years 5 months ago

I’m not sure why people keep classifying the Giants defense as mediocre. Since Cain debuted in 2005 the Giants team UZR is first in MLB and almost 50% ahead of 2nd place.

Baltar
Guest
4 years 5 months ago

Man, that’s an amazing statistic. You blew me away.

Roel Torres
Guest
4 years 5 months ago

“And anyway, it’s never going to be the successful players that stumble upon sabermetrics.”

Zack Greinke quote from the New York Times, Nov. 17, 2009: “‘That’s pretty much how I pitch, to try to keep my FIP as low as possible,’ Greinke said.”

The article: http://www.nytimes.com/2009/11/18/sports/baseball/18pitcher.html

Tom
Guest
4 years 5 months ago

And if he didn’t know what FIP was he would be pitching differently?

And if he’s intentionally pitching for a flyball instead of a groundball (which was one of the examples given)…. that has nothing to do with FIP; and is a questionable strategy even in a big park with a good fielder

It’s a nice story….

vivalajeter
Guest
4 years 5 months ago

Roel is just saying that a successful player has stumbled upon sabermetrics. He’s not claiming that it made Greinke a better pitcher.

jim
Guest
4 years 5 months ago

thanks for your insight…

billybob
Guest
4 years 5 months ago

As I recall, it was Brian Bannister who got familiar with it and then explained it to Greinke. Greinke is passively agreeing: it was Bannister who was the more intense user.

jrogers
Member
4 years 5 months ago

Bannister “was the more intense user” — I like how you just made sabermetrics sound like a drug. :)

Bryan
Guest
4 years 5 months ago

Wait a minute…

“Also, fly balls fall in for hits less often than ground balls, and extreme ground ball pitchers (55% and above) are better at inducing ground balls that are easy for their defense to turn into outs.”

And then regarding Hellickson:

“Not only that, but Hellickson is an extreme fly ball pitcher, and fly balls fall in for hits less often than grounders.”

What is it?

Bryan
Guest
4 years 5 months ago

Meaning, are extreme ground ball pitchers just as good at getting outs as fly ball pitchers? Is it a case of both extremes?

Derek
Guest
4 years 5 months ago

You almost said it, but didn’t, so I’ll mention two things regarding the babip of pitchers with high strikeout rates. It is very rare to be both a strikeout pitcher and a groundball pitcher: so the outs that a strikeout pitcher doesn’t get via strikeout are going to be flyballs, thus lowering his babip. Also, with many strikeouts comes a smaller sample size of balls in play in general, so that there is less opportunity for bloops, etc., to fall in for hits. Obviously the converse argument is there as well, but it still gives cause to more variation than a pitcher who relies on and receives more balls in play.

Nivra
Guest
4 years 5 months ago

Don’t you mean “It is very rare to be both a strikeout pitcher and an extreme groundball pitcher?”

There are plenty of strikeout pitchers with good groundball rates. Just not many in the super-extreme Lowe area.

Hamels, Halladay, Lester, King Felix, Lincecum all come to mind.

Derek
Guest
4 years 5 months ago

Yes.

BigNachos
Guest
4 years 5 months ago

Since Pedro Martinez’s 1999 and 2000 seasons, the 2 most dominate seasons for a pitcher in the history of baseball, I’m utterly convinced that BABIP is somewhere around 99.9999% luck. From 99 to 00, his BABIP had a 100(!) point swing. His HR/FB rate also had a similar swing, though in an opposite direction of his BABIP. This stuff just fluctuates wildly depending on luck, and this article failed to convince me otherwise.

A pitcher like Hellickson would be wise to learn this, so that he could work on regaining the strikeout ability he had in the minors to compensate for when his luck inevitably runs out.

RationalSportsFan
Guest
4 years 5 months ago

Not necessarily disagreeing with your conclusion, but a stat can be susceptible to massive swings within small sample sizes without being 99.9999% luck. Perhaps we just need a bigger sample than a year or two of BABIP data to reach a predictive level.

BigNachos
Guest
4 years 5 months ago

Uh, aren’t massive swings in small sample sizes the definition of luck? Anyway, we can agree that one year samples of BABIP are essentially meaningless.

In other words, Hellickson should stop deluding himself that he’s somehow the single pitcher in MLB history to sustain a .220 BABIP and instead work on improving his approach and increasing his strikeouts since his BABIP will increase (and Tampa’s defense won’t always be that good).

RationalSportsFan
Guest
4 years 5 months ago

“aren’t massive swings in small sample sizes the definition of luck?”

Not really. Consider the following example (from the world of basketball): Imagine player X makes 70% of his free throws over his career. When we take one 10 shot sample, however, he makes 4 of 10. We decide to take another 10 shot sample, and he makes 10.

It would be wrong to conclude (on the basis of these two tests) that free throws are 99.9999% luck. Rather, we should expect those random variations when our samples are small enough.

So, the question remains: what is a proper sample size for BABIP? I am not sure. And perhaps it will turn out (with more study) that BABIP is 99.9999% luck. However, we can’t conclude based just off wild swings in certain players’ BABIP from year-to-year. Perhaps we should actually EXPECT a certain number of players each year to have their BABIP fluctuate that much.

Agreed on Hellickson though (unless he has some hit/fx data that says he is the best pitcher in the league at creating weak contact or something).
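
The free-throw example can be made concrete with the binomial distribution. This is a minimal sketch, assuming each shot is an independent attempt at a fixed 70% rate, which is a simplification of real shooting:

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k makes in n independent shots at rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Chance a true 70% shooter makes 4 or fewer of 10, and chance he makes all 10.
p_four_or_fewer = sum(binom_pmf(k, 10, 0.7) for k in range(5))
p_perfect = binom_pmf(10, 10, 0.7)

print(f"P(4 or fewer of 10) = {p_four_or_fewer:.3f}")
print(f"P(10 of 10)         = {p_perfect:.3f}")
```

Under that assumption each outcome turns up a few percent of the time, so neither is shocking for a true 70% shooter, and neither says anything about free throws being luck.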

Josh
Guest
4 years 5 months ago

Luck [luhk] noun: good fortune; advantage or success, considered as the result of chance.

Random fluctuations that benefit you = good luck.

baty
Guest
4 years 5 months ago

@ josh

But I think the point is that the word “luck” and its use at times in the past has been more superficial than analytical.

good fortune?
advantage?
result of chance?

It’s all muddy if you ask me, because its significance can be almost anything in any circumstance.

The more you avoid using a word like that within SABR analysis, the more you come to terms with the, still, billions of unknowing and immeasurable truths out there in baseball SABR analysis.

jim
Guest
4 years 5 months ago

did you even read the article, or did you just come immediately to the comments to post your troll-like comment?

Aaron
Guest
4 years 5 months ago

-Dominant-
Dominate is a verb

bstar
Guest
4 years 5 months ago

@Bignacho, Greg Maddux in ’94 and ’95 was better than Pedro in ’99. Look at their ERA+:

Maddux ’94 – 271
Maddux ’95 – 262
Pedro M ’99 – 243

Okra
Member
4 years 5 months ago

“Pitchers with high strikeout rates tend to generate weaker contact and, therefore, allow fewer hits on balls in play.”

Hellickson is very hard to measure when it comes to strikeouts, given that he’s only pitched a little over 200 innings thus far in MLB. For his career, he has an elite SwStr% of 10.2%, but a rather pedestrian mark of 5.99 K/9.

Should we expect his strikeout numbers to increase and match his swinging strike %, or is it common for pitchers to have K/9 & SwStr rates that disagree?

Chad
Member
4 years 5 months ago

Is there not an xBABIP for pitchers based on hit types allowed?

channelclemente
Guest
4 years 5 months ago

The facet of Matt Cain’s performance is how a RHP induces LHBs to hit 0.185 against him. Today, it seems his contract talks with the Giants went way south. Maybe he’s a Yankee ‘in slow transit’ in 2013?

vivalajeter
Guest
4 years 5 months ago

This is a nice, well-written article, and I agree with the main points. But it also touches upon a couple of things that make people skeptical of sabermetrics.

1) “This is a common perception about BABIP, and one that used to be in favor among sabermetric circles.” A lot of sabermetric-oriented fans can be pretty arrogant with their opinions (a ‘holier than thou’ attitude). With BABIP, sabermetric circles were wrong VERY recently. There were articles that completely chalked up BABIP fluctuations to luck. Heck, I first heard about it while reading a Neyer article on espn.com and I distinctly remember him calling it luck. Now we know better, and we have other issues that impact BABIP. What’s to say we’re not wrong about other things, and 2 years from now xFIP will be looked upon very differently? It can be hard for a non-stats guy to have faith in sabermetrics when we talk so confidently about an issue, only to be wrong a couple years later.

2) “So instead of the word “luck”, I prefer to use “random variation” to describe BABIP.” I know what you’re saying – but for the average person, this seems like spin. “It’s not racism, we prefer to call it profiling!”

Paul
Guest
4 years 5 months ago

I think both of these points are dead-on. I too do not question Steve’s sincerity, but this felt a little Twilight Zonish as I was reading it.

When exactly did the SABR community decide BABIP has very little to do with luck? As Viva states, I see articles on FG and elsewhere all the time claiming that if a guy had a low BABIP one year, he’ll regress the next because he got lucky. It’s the entire Alex Gordon regression argument.

And since I used the example of a hitter instead of a pitcher, doesn’t this new line of thinking completely eliminate the meme that pitchers have no control over balls in play? It was always nonsensical, but apparently we are now at this point.

I’ve seen a lot of backpedaling in this community over the past year. Claims that not considering traditional scouting methods was ditched by SABR eons ago (riiiiiight), that BABIP fluctuation might be due to fielders being tired on the road for one pitcher and not for others on their own team (Verlander), defenders can magically turn a bum of a pitcher into a consistent number 3 in the toughest division in baseball (Garza), and now this. In all cases, the presentation was in the vein of “yeah, we changed our minds about this stuff long ago.” Well, you might want to tell your writers, and somebody should also issue a bulletin to the army of “stats writers” at local papers and numerous blogs, who I guarantee are writing this very moment about BABIP being due completely to luck. Sorry, but for this concept to be so “misunderstood,” it would have had to be perpetuated over a very long period of time.

Perhaps someone should have sent out the memo before now.

JDA
Guest
4 years 5 months ago

Oh dear. Where to begin with this one…

HasonJeyward
Guest
4 years 5 months ago

While saberists could tone down the snark, there should never be any hesitation about admitting you’re wrong and correcting your hypothesis. That’s exactly what sabermetrics is all about. It doesn’t matter if “non-stats guys” have faith in sabermetrics or not – their faith is inconsequential. Sabermetrics is about generating better results. If a hypothesis is tweaked and yields better results, it is a success.

What is the alternative? Rigidly holding to a certain mindset despite contrary evidence? I don’t think that’s the path we want to go down in order to get “converts” from the lay folk.

bstar
Guest
4 years 5 months ago

And since using FIP to determine pitcher WAR has NOT yielded better results, man up and admit it’s wrong.

bstar
Guest
4 years 5 months ago

@vivalajeter, why not avoid the Christmas rush and go ahead and be skeptical of FIP now? What are you waiting on?? It’s almost certainly not going to last the test of time. You can’t just look at three outcomes of pitchers and use it to determine pitcher WAR.

Tom
Guest
4 years 5 months ago

The bottom line on this…. even if you adjust Hellickson for various effects (defense, flyball pitcher, park, popup rates) a .223BABIP still screams regression and it is only a matter of how much. I applaud the author for pointing out that it’s not just about “luck”, but this is probably the wrong example to use.

If he doesn’t get his K rate up his ERA should jump rather significantly. He had a 4.44 FIP and a 4.72 xFIP (Brad Penny and Wade Davis were the only qualified AL pitchers with a worse xFIP)

I think TB’s reputation for young pitching bleeds a bit into the evaluation of some of the pitchers. Wade Davis has been one of the worst starting pitchers in baseball over the last 2 years for guys with 300+ innings, yet folks continue to say he’s a valuable trade piece or is one of their 348 legit starting options. (I think they should either put him in the pen where his stuff seems to play up a bit or see if they can sucker some opposing GM into thinking he’s a good starter)

Ender
Guest
4 years 5 months ago

“even if you adjust Hellickson for various effects (defense, flyball pitcher, park, popup rates) a .223BABIP still screams regression and it is only a matter of how much”

True, but his age and experience level also screams improvement in the peripherals which will offset some of it. Hellickson will likely have slightly worse results than last year but with better background skills that keep the regression from being as bad as people assume it will be.

Tom
Guest
4 years 5 months ago

His ERA was nearly 2 runs lower than his xFIP… do you have any idea what kind of K rate he’d need at his GB ratios to support an ~3 ERA?

Brandon Morrow had a similar GB rate, similar walk rate and a massively higher K rate… and had a 3.53 xFIP (4.72 ERA). Hellickson is not going to be striking out >10/9IP; even with a better defense and a much more friendly home park I would not be surprised to see his ERA jump 0.7-1 run higher; possibly more if his K rates don’t improve.

Ender
Guest
4 years 5 months ago

His K/9 was over 9 in the minors and his BB/9 was under 3 and he most likely will trend towards those numbers in both directions. It won’t surprise me to see him end up with a high 7 K/9 and high 2 BB/9. Something like a Colby Lewis type of player given the high FB%, only with a pitcher’s park instead of a hitter’s park so not as many of the HR fly out. While his 2011 stats don’t support his ERA you also have to keep in mind his 2011 stats were way under expectations given his minor league track record as well.

There is definitely a gray area between him being an ace and being terrible. He’ll get a high 3 to low 4 ERA most likely, with ok K, decent W and a decent WHIP. How valuable you find that is up to what you value most. He isn’t really being drafted like an ace though.

Tom
Guest
4 years 5 months ago

A high 3- low 4 ERA is a rather major regression from someone who was under 3.0 last year; your comment was “slightly worse results”…. 1 run is slightly worse? I hate to think what a moderate regression would be.

In essence you are validating my point… even if you project the K rate improving significantly, BB rate improving significantly; he’s still due for a rather significant regression from last year’s results (+1 run is not “slight”). If his K rate doesn’t improve it will be a massive correction…

It doesn’t seem like you are disagreeing with me, just creating a strawman (“As bad as people think”), and trying to put a positive spin on it. I never said or thought he was an ace, nor do I think he’s a fringe starter….my point was/is that he is due for a significant regression on both BABIP and ERA.

Ender
Guest
4 years 5 months ago

I am looking at this from a fantasy standpoint and yes I think his ERA jumping to a high 3 is a lot less of a difference than a lot of people seem to think we’ll see. I don’t think he approaches last year’s xFIP but I also don’t think he repeats the ERA from last year. I haven’t seen anyone that thinks he’ll repeat under a 3.20 ERA but I have seen people who say he will be well over a 4 ERA and the truth is somewhere in between.

Tom
Guest
4 years 5 months ago

This is not a fantasy baseball article… I think there is another section of Fangraphs for that.

His performance, even if he improves his K and BB rates, still screams regression, no? If they don’t improve, it might be a massive correction. Forget the strawman of what “a lot of people” think, do you not believe a nearly 1 run change in ERA is significant?

It seems you are disagreeing with “those people’s” perception, so perhaps instead of applying that response to my comments, maybe you should respond to those folks. Honestly I don’t care about the fantasy spin… I care about his expected performance this year

James Gentile
Member
4 years 5 months ago

I like this quote: “When you have a great defense, why not let them do their job?”

It’s so innocent and adorable. Like an idealistic young collegian studying Marxism for the first time.

bstar
Guest
4 years 5 months ago

He has a point. BABIP/FIPheads act like every time a ball is put into play, it’s a slap in the face to the defense, like they shouldn’t have to be bothered to field balls since it’s the pitchers’ job to strike people out. Give me a break.

James Gentile
Member
4 years 5 months ago

Dude, what is a “BABIP/FIPhead”? Or are you making up a whole race of people like Star Trek? Are they like the Ferengi? Do they have big ears and endlessly cite the Rules of Acquisition, too? Tell me more about this imaginary race of people you’ve invented…

bstar
Guest
4 years 5 months ago

They are blind followers that BABIP is all luck and FIP should be used to determine WAR. Kinda sad I had to explain this.

Andrew
Guest
4 years 5 months ago

If Hellickson pitched in front of a poor defense he’d be John Lackey.

JoeC
Guest
4 years 5 months ago

This is ignorant. When we eventually get HitFX data, you’ll see why. Hitters LAUNCH the ball off Lackey. That’s why he blows. Hitters are making much weaker contact against Hellickson. That’s why he doesn’t.

Andrew
Guest
4 years 5 months ago

It’s ignorant to point out that they had the exact same xFIP last year?

I like how you feel comfortable making factual statements based on information that you admit isn’t even available yet.

So, wild speculation vs hard evidence, looks like I win this round.

Woodrum's UZR Article
Guest
4 years 5 months ago

are you implying you have access to hit fx data?? do share.

bstar
Guest
4 years 5 months ago

No, it’s ignorant to suggest one guy who has a 2.95 ERA and another who has one over 6 are really the same pitcher because the former has a better D behind him. You didn’t win anything.

Pat
Guest
4 years 5 months ago

I’m pretty new to the sabermetric side of baseball, so i apologize in advance if this question seems trivial or completely off base. After reading the above article, shouldn’t it be more important to compare BABIP of certain pitchers or pitcher types and see if there is an average among those types and how that would influence their other peripherals… i.e. separating the starters from the relievers, guys with K/9 of greater than 7, guys with K/9 of 5-7, etc. It seems to me that while we are still comparing fruit to fruit, it is still apples to oranges.

Paul
Guest
4 years 5 months ago

You’re completely correct, and it’s why so many people with interest in SABR concepts have objected to not just the “BABIP is all about luck” line, but also calls for Player A to “regress” based solely on BABIP. It’s just lazy thinking, which is not characteristic of so many really bright people who work in or share interest in SABR.

So now we’re back to actually listening to the hitting coach of Joey Bats who predicted something close to his breakout, or actually considering that Kevin Seitzer might know something when he says Alex Gordon can absolutely repeat last season. Doesn’t mean they’re correct, or that their opinions should be relied on completely, but in the past (this will apparently stop as of now), they have been attacked and mocked. And it’s all because of this silly little stat that now has nothing to do with luck.

bstar
Guest
4 years 5 months ago

Hell yes it’s lazy, but it’s been done the entirety of this year by FANGRAPHS WRITERS.

baty
Guest
4 years 5 months ago

Agreed… Someone earlier mentioned the “inconsequential faith of a non-statistician”. There’s a better way of spending time with this idea…

Non-statisticians sometimes have trouble believing in SABR baseball concepts because they appear invisible, but statisticians help the non-statistician seek and understand much more about what’s occurring on a baseball field.

There are non-statisticians out there that have the ability to understand the meaning of SABR concepts without completely understanding the math concept it’s built from. This is a very important perspective for the SABR community to value and use in testing and measuring their concepts…

Because many are not statisticians, they can’t articulate (argue) very well (in math terms) when they sense a math point of view is misguided, acute, and or obtuse. But they can tell from time to time when something isn’t quite right and needs to be investigated from another point of view.

It’s a pretty important exchange… Keeps everyone in check…

Paul
Guest
4 years 5 months ago

Thanks for this. Great comment.

obsessivegiantscompulsive
Guest
4 years 5 months ago

Great article, finally, somebody here writes on this seemingly taboo topic, and notes the Emperor’s clothes.

This leads to a much more sanguine conclusion: WAR calculations for some pitchers are way off of where they should be, based on these factors, because most WAR assumes league average BABIP regression somewhere in their calculations. Same with FIP and anything else assuming DIPS is inviolate.

It would be helpful if reference sites like here and Baseball-Reference.com would provide applications which allow users to fiddle around with the valuation formulas like WAR, where there is disagreement over how certain aspects of those formulas work. Or if that is too hard, then at least provide the methodology used at their sites so that readers can make their own custom calculations, since MLB data is generally freely available.

This issue about BABIP isn’t even a new phenomenon either, Tom Tippett noted around ten years ago that there were pitchers who were capable of keeping the BABIP on their pitches significantly lower than the league mean, yet there hasn’t been any acknowledgement that I’ve seen anywhere until this article that these pitchers produce significant value and yet are undervalued by all sabermetric value calculations.

I have found that most sabermetricians are as stodgy as the old baseball men who were first derided when the saber-movement first began. Which, I guess, is human nature, but still ironic.

Hank
Guest
Hank
4 years 5 months ago

The other hidden issue with pitcher WAR is the park corrections. As far as I know these are done with park run factors, which can vary from HR factors. I’m also not sure if Fangraphs considers the handedness of the pitcher (and the potential impact on parks that have significantly different left and right park factors); though that is probably more of a secondary effect.

For example take Fenway which has a significant positive run factor, but a below average HR factor. Unless I’m mistaken (which is certainly possible!) this helps the pitcher in the adjustment as he gets credit for pitching in a higher run environment, yet the pitcher is also ‘helped’ from pitching in a HR suppression park (as FIP is used in the WAR calculation and that obviously is dependent on HR’s allowed)

This also doesn’t need to be opposing park factors; just a park where there is a significant delta between the run factor and the HR factors. Off the top of my head SF, Bos, NY come to mind as a few examples.

Toffer Peak
Member
Toffer Peak
4 years 5 months ago

1. rWAR at Baseball-Reference.com uses RA so just go ahead and use that if you don’t like FIP based WAR.

2. The “discoverer” of DIPS himself (Voros McCracken) noted that some pitchers, particularly knuckleballers, have lower than normal BABIP so it really is not a new idea. BABIP = All luck, is really just something that newbie SABR people have spread around as a misunderstanding. Most of the people who have done the research and writing have known for a long time that BABIP is influenced by a bunch of factors other than just luck.

That being said I still think that most studies have shown that luck accounts for about 50% of the variation in annual BABIP fluctuation, i.e., most pitchers’ annual BABIPs fall in the range of .260-.330 whereas skill ranges are only ~.280-.315.
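To put a rough size on the "luck" part of that range, here is a minimal sketch that treats every ball in play as an independent coin flip with a fixed hit probability. The .295 true BABIP and the 560 balls in play per season are illustrative assumptions, not figures from the comment, and real batted balls are not truly independent trials.

```python
import math


def binomial_sd(true_babip: float, balls_in_play: int) -> float:
    """One standard deviation of observed BABIP from chance alone,
    assuming each ball in play is an independent trial."""
    return math.sqrt(true_babip * (1.0 - true_babip) / balls_in_play)


# Assumed sample sizes: a reliever's season, a starter's season, several seasons.
for bip in (150, 560, 3700):
    sd = binomial_sd(0.295, bip)
    print(f"{bip:>5} BIP: about +/- {sd:.3f} around the true rate (one SD)")
```

Under these assumptions a single starter season carries a chance-only spread of roughly 19 points of BABIP in either direction, which is why one-year figures swing more widely than any plausible skill range.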

bstar
Guest
bstar
4 years 5 months ago

So you’re calling the majority of fangraphs writers “newbies”.

Jim
Guest
4 years 5 months ago

i’ll throw out 4 pitchers who got “lucky” last year or the year before that. Matt Cain, Jered Weaver, Hellickson, Clay Buchholz. What do they have in common? all have been strikeout pitchers in the past. all are dominant pitchers who happened to have a lower k rate and therefore have high FIP’s. while i’m a stat guy myself, i also happened to play baseball. not all batted balls are the same. logic says that a dominant pitcher pitching to contact is going to give up weaker contact than your average joe. here are your poster boys.

Chike
Member
Chike
4 years 5 months ago

I enjoyed this article. I wonder if the author would consider writing something similar for BABIP and what it means for hitters. I know LD%/GB%/FB% has something to do with hitter BABIP, but I’d be curious to find out what effect things like speed score, park factor, defensive alignment, or situational position factors (hitting leadoff vs. cleanup) have on BABIP.

Come to think about it, BABIP alone shouldn’t be taken as a measure of luck. The Jeremy Hellickson example shows expected BABIP is just as or even more important than BABIP itself. If it hasn’t happened already, the sabermetric community should get together and devise a standardized way of measuring xBABIP for pitchers and hitters. Then, we could use (xBABIP – BABIP) to really see just how much random variation applies to a pitcher’s success.

Baltar
Guest
Baltar
4 years 5 months ago

Whoa! Once the sabermetric community agrees to standards, what happens to all the innovative thinking that is the value of sabermetrics?

dave
Guest
dave
4 years 5 months ago

People still talk about Pythagorean Expectation vs. Actual Won-Loss records as though it owes to nothing but luck, don’t they? I’ve never understood that conclusion either.

Nice article. BABIP is a stat just like any other and definitely worth consulting when you’re looking at a fella’s numbers. After all I’ve heard about Hell-boy and such, this guy hasn’t been all that impressive so far, to me.

I think everything being done with pitch f/x and fielding f/x is great, and eventually they may produce some great unifying theory that will include BABIP. I’ll remain skeptical between now and then.

grazie,
dc

Baltar
Guest
Baltar
4 years 5 months ago

If the difference between a team’s W-L and Pythag isn’t luck, then what is it? Do some teams have special abilities to shift their net runs scored to close games?

James Gentile
Member
4 years 5 months ago

Teams with strong bullpens have been shown to have an ability to out-perform their pythags.

bstar
Guest
bstar
4 years 5 months ago

So have teams managed by Mike Scioscia.

Antonio Bananas
Guest
Antonio Bananas
4 years 5 months ago

Why don’t we just measure how hard balls are hit? Harder hit balls vs softly hit balls are all balls in play. However, it’s not luck how hard they hit it, it’s where they hit it that’s (sort of) luck. You pitch someone inside and have your D shifted, and the guy grounds out into the shift, I don’t think that’s luck.

Fun article though, I just wish our stats would evolve more. We have the equipment to do this.

Dan Greer
Guest
Dan Greer
4 years 5 months ago

You’re absolutely correct, which is why HitFx is going to be groundbreaking, should it ever be publicly available.

NS
Guest
NS
4 years 5 months ago

The Rays have been doing this for awhile.

Paul
Guest
Paul
4 years 5 months ago

I think the reason for the dramatic shift that has taken place recently in SABR with respect to sacred cows like BABIP, is that most of the really great talent went off to work for teams. And realized that what they’d been doing, while solid with what they had to work with, was laughable. And then they called up guys still working for outlets like FG and said something like, “Dude, that piece you just wrote was really readable, but if you only had the technology, you’d be embarrassed.”

Baltar
Guest
Baltar
4 years 5 months ago

I think you stated your point too extremely, but there is something to it. I really wonder what kinds of metrics teams have available for their GM’s and managers?
Of course, it’s a moot point if those people ignore them and use their “gut,” which I think is the norm.

fergie348
Guest
fergie348
4 years 5 months ago

I’m not sure what Tango is doing with HitFX or if this is top secret stuff at this point, but what I would love to see eventually is the rating of each ball in play by contact type and trajectory. This would serve three purposes: to rate a pitcher’s ability to induce weak contact, to rate a batter’s ability to create hard contact and to inform us as to how well positioned and able to execute the defenses truly are. Once we can analyze these three factors with game data I think the results will be truly groundbreaking. Right now, we have data that rely too heavily on the ‘three true outcomes’ and other data that essentially aggregate elements by the eventual outcomes of the play.

DonDraper
Guest
DonDraper
4 years 5 months ago

An example of the author’s point… I’m sure there are others, but one that stands out for me is Maddux.

Greg Maddux had BABIPs of <.250 during his Cy Young seasons. Maybe because of this, his ERAs are about 1 run lower than his FIPs.

Are we to believe that over 800 innings his FIP is more accurate than his ERA based on his "low" BABIPs? Of course not. Luck played a part but a much smaller part than his skill.

CJ
Guest
CJ
4 years 5 months ago

Nobody, and I mean nobody, who is worth their salt would take FIP over ERA over a career. I’m not sure the threshold, but 800+ innings, I’d take park-adjusted ERA (prefer RA, to be honest) over FIP, SIERA, xFIP, etc. etc, and so would most people.

The problem is that over a career, the breaks even out. You play in front of good and bad defenses. Screaming liners get caught and bloop hits fall in for doubles. Then, you use ERA to capture EXACTLY what you’re talking about, BABIP suppression, HR/FB suppression.

The problem is that, year-to-year, FIP predicts ERA better than ERA does. This seems to imply, at least to me, that FIP is something more innate about a pitcher than ERA, which further implies that the stuff filtered out of FIP is either a) less skill-based or b) too small a sample over 200 innings (probably a combination of both).

Baltar
Guest
Baltar
4 years 5 months ago

I don’t see anything magic about ERA as the ultimate criterion for a pitcher over the long run. If some other stat has a consistently better predictive ability, then why should it be used only for one year’s data and not for a career?

CJ
Guest
CJ
4 years 5 months ago

There’s a point where ERA predicts ERA (urgh, should be RA) better than FIP does, because after awhile low BABIPs or high BABIPs or HR/FB suppression starts to look less and less like a possible fluke.

I mean, it’s like if a callup hits for a .400 wOBA in 50 PAs (so it’s like demonstrating low BABIP). You go “neat, but no way I’m projecting him to hit for .400 wOBA next year, because he hit .396 and has no business doing so with a 9% LD rate” or something (you anticipate regression based on peripherals and the current distribution of MLB talent).

But then, you see a guy who’s hit .400 wOBA for the better part of a decade, with peripherals well in line with that, and you feel quite confident projecting him for .400 wOBA (assuming health, playing time, ageing, etc).

BABIP is the same deal. One year of Maddux and his low BABIP isn’t enough to say “this kid is magic, and I anticipate he will have this BABIP FOREVER”. Ten years? There’s probably something going on there.

bstar
Guest
bstar
4 years 5 months ago

Extremely well-said, CJ. By the way, you’re basically saying anyone who takes pitcher fWAR seriously is way off. I agree with you, but that’s most of the fangraphs community you’re talking about. Most people on here aren’t even looking at bWAR. @Baltar, agreed FIP is a nice little predictive tool, but how does a predictive tool help us to evaluate a player’s career better than ERA+?

CJ
Guest
CJ
4 years 5 months ago

Eh…

I’m not sure where I stand on fWAR for pitchers in a one-year sample. Both sides have arguments: FIP-proponents say that FIP demonstrates actual skill, dissenters say that ignoring what happens is… daft. But over a career, bWAR all the way.

I think your opinion of FIP depends on how much you link predictiveness to the concept of skill; as well as demanding full descriptiveness from your stats.

As a question, where do the runs from fWAR go because of BABIP normalisation? They have to go somewhere.

papasmurf
Guest
papasmurf
4 years 5 months ago

Nice article. BABIP is certainly not entirely luck. Yes, it’s luck to the extent of the placement of the batted balls. E.g. two liners that are hit equally hard can have different outcomes. One could be caught for an out with the other one, placed just three feet to the right of the other one, zips past the SS for a basehit.

However, inducing weak contact is at least partially a skill based on a pitcher’s ability to consistently make pitches that are hard to hit due to a combination of velocity, movement, and location.

I’ve seen too many articles online that either dismiss a strong season or assume a comeback for a pitcher/hitter based on “lucky” or “unlucky” BABIP alone.

Baltar
Guest
Baltar
4 years 5 months ago

I don’t totally disagree with you, but I’m not ready to concede that “inducing weak contact” is a skill until it is measured and proves predictable.
That may happen some day, just as with dubious skills such as “calling a good game” or “clutch hitting.”
Sabermetrics, after all, is not a body of knowledge but simply the application of scientific principles to baseball.

Tom
Guest
Tom
4 years 5 months ago

Because there’s a difference between evaluating past results and predicting the future.

If you had a model that could predict future stock market performance would you use it to evaluate the past 10 years, and say the stock prices should have been different so we’ll ignore what they were because it was luck/fear/some random crisis occurring?

Tom
Guest
Tom
4 years 5 months ago

Ughh…. reply fail… ignore the comment above. Sorry

papasmurf
Guest
papasmurf
4 years 5 months ago

Would you agree though that softly hit balls are more likely to be converted into outs than hard-hit balls? And would you agree that certain pitches are more difficult to hit than others not because of luck but because some pitches are inherently more difficult to hit hard (whether it’s because of superior location, movement, and/or velocity)?

That is not to say that a pitcher who can make a tough pitch consistently isn’t going to be hit hard at all, but it is less likely that his pitches will be hit harder than inferior pitches.

pft
Guest
pft
4 years 5 months ago

BABIP is not rocket science. It’s just the percentage of fair balls in the park that fall for hits instead of being converted to outs.

The original saber concept was that it was almost entirely due to luck. This made no sense and I used to get blasted for saying maybe a high BABIP was due to the pitcher being hit hard, and low BABIP being due to weak contact.

Now sabermetricians acknowledge it’s not all luck, but how much is skill/execution and how much is luck is not really known, but they try to estimate it.

BABIP is a useful statistic, but its concept has been intuitive for as long as baseball has been played. Bloops dropping for hits, line drives being outs, are all attributed to good or bad luck, depending on your viewpoint. Line drives going for hits are good hitting or bad pitching. Medium GB going for hits are good (bad) luck and/or bad fielding/positioning.

Paul
Guest
Paul
4 years 5 months ago

All great points. My main beef in all this is your reference to getting blasted for simply questioning the conventional SABR wisdom. In the sense that blind conformity seemed to follow objective analysis, it may have done more harm than good. Some will counter that they should not have been so snarky. But I guess we’ll just disagree on the definition of snark, because an awful lot of it looked just plain mean to me.

CJ
Guest
CJ
4 years 5 months ago

Look.

http://www.fangraphs.com/leaders.aspx?pos=all&stats=pit&lg=all&qual=y&type=8&season=2011&month=0&season1=2011&ind=0&team=0&rost=0&players=0&sort=11,a

Hellickson has the lowest BABIP of any qualified starter. Unless you think that BABIP is a number that is set in stone at birth, he WILL regress.

Who would make an even-money bet that Hellickson leads the league in BABIP next year? (Also, jeez, 80+% strand rate).

Doc Irysch
Guest
Doc Irysch
4 years 5 months ago

Has anyone looked at types of pitches and weak FB and GB rates? I would think that pitches that have movement on them would be more difficult to hit squarely and hard as opposed to straight fastballs.

chris
Guest
chris
4 years 5 months ago

Not that Steve didn’t write a good article, he did. But I thought it was pretty common knowledge in SABR circles at this point that BABIP is directly affected by the types of balls that were hit off the pitcher? Maybe I just haven’t seen it, but it’s been a while since I read an article that basically said “BABIP=luck”

Also, Whenever you see a BABIP number that completely tips the scale in one direction or another, is it not reasonable to assume that regression is coming?

Paul
Guest
Paul
4 years 5 months ago

So glad for this comment. Perhaps within a tight circle of SABRists, this is so.

But here’s where you’re missing it. If “regression is coming,” is it because the defense is different? Are they cutting the grass shorter this season? Move in the fences? Strike zone change? Guy picked up a splitter? No, we just say, BABIP was low, regression follows. In other words, the explicit, unstated rationale for regression is luck.

How can luck with respect to BABIP be such a minor influence, as Steve states, but regression is almost universally expected?

CJ
Guest
CJ
4 years 5 months ago

No. It doesn’t matter what the rationale is. Unless the guy’s BABIP is exactly equal to the mean OR BABIP is 100% skill-dependent (so everyone has constant BABIPs for all eternity) he WILL regress. You don’t have to worry about calling it anything.

Look at the BABIP leaderboard. He had the lowest BABIP of any qualified starter. I mean, regression to the mean is present in things like test scores, which most people don’t consider luck.

Paul
Guest
Paul
4 years 5 months ago

You’re just confusing concepts. First, if it’s just a math problem, nobody is going to care. What exactly does noting that regression is likely accomplish? I don’t think anybody is saying that BABIP regression is not a true phenomenon. What Steve quite clearly said in the piece is that very little of it is due to luck. And that is quite a large change from accepted SABR wisdom (again, with the apparent exception of insiders who are just now sharing their enlightenment).

First of all, rationale is always important in statistics. Pretty sure that was first day in STAT201. In this case, of course we want to know his true talent level. Nobody cares what his BABIP is next year. As you noted above, it’s irrelevant over a large sample anyway. We want to know how good he is, so we need to know what to attribute BABIP fluctuations to.

In comparing test taking to batted balls, you’re really confusing two concepts, which underlies the BABIP conundrum. Test taking is a specifically individual activity where the test-taker is completely responsible for the outcome.

BABIP does not equal regression to the mean because now apparently we are going to place values on additional variables instead of just calling it luck (or random variation) And that’s appropriate.

MGL
Guest
4 years 5 months ago

How much of BABIP is random fluctuation/variation/luck (they are all the same thing) completely depends on the sample size. That happens to be true about ANY sample statistic. However, each statistic has a “baseline” percentage of luck given a certain number of opportunities (sample size). That baseline is largely determined by the spread of true talent in the population you are looking at with respect to the statistic in question.

BABIP is still one of the most luck-driven statistics that we routinely measure in baseball. Yet, in a large enough sample, the percentage of luck in BABIP approaches zero. In a small enough sample, the percentage of luck approaches 100.

So it is ridiculous to talk about how much luck there is, relative to defense, and pitching skill, without reference to the sample size.

Hellickson is a horrible example if you want to opine on how wrong sabermetrics was/is in characterizing BABIP as mostly luck. For one thing, we are dealing with a 1 year sample and the bottom line is that most (like the lion’s share) of a 1 year sample of BABIP IS luck. The typical (not exact, but a good approximation) number of BIP that you need to regress a pitcher’s BABIP 50% toward a certain mean (that mean could be for all pitchers in general or it could be for all pitchers given the K rate, and GB/FB rate of the pitcher in question – if it is the former, then the number of BIP for the 50% regression point is larger than in the latter case) is 3,700. That is around 6 years for a full time starter.

So, for one year, you would regress BABIP around 88% toward the mean. In Hellickson’s case, that might be around .280 including his fly ball rate and his defense (assuming the projected defense behind him is excellent again this year). That is around an estimated true talent BABIP of .273 (regressing .223 88% toward .280). If that is not the definition of a .223 being “mostly luck” then you have a different definition of “mostly luck” than I do, which is fine. They are only words and they don’t change the fact that our best estimate of Helly’s true BABIP is around .273 from everything we know.

The difference between a BABIP of .223 and .273 is over 1 run per 9 innings!

BTW, when we talk about BABIP in small samples being mostly luck we are not saying that some pitchers are not allowing more softly hit balls than other pitchers in those samples. We are saying that, one, the “luck” in a small sample of BIP consists of where batted balls happen to land AND the luck involved in how hard those BIP are hit, and two, that the range in skill among pitchers with respect to how hard they allow balls to be hit is small – that is why the range of true talent BABIP is also small! So while some pitchers DO have the actual ability to allow their BIP to be hit softer or harder than the average pitcher, that ability has a very narrow range among MLB pitchers. So, while most of the spread in observed BABIP we see in one season or two (or even 5, since after 5 seasons, we still regress more than 50%, which sort of fits the definition of “most” – greater than 50% that is) is random variance, so is most of the hardness or softness of those batted balls. Again, the “luck” is not only manifested in where they land but also in how hard they are hit…
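The arithmetic behind the .273 estimate can be checked with a short sketch. The three inputs (.223 observed, .280 mean, 88% regression) are the figures quoted in the comment; the function name is hypothetical and chosen only for illustration.

```python
def regressed_estimate(observed: float, mean: float, fraction_toward_mean: float) -> float:
    """Move the observed rate the given fraction of the way toward the mean."""
    return observed + fraction_toward_mean * (mean - observed)


# Figures quoted in the comment: .223 observed, .280 assumed mean, 88% regression.
estimate = regressed_estimate(0.223, 0.280, 0.88)
print(f"estimated true-talent BABIP: {estimate:.3f}")
print(f"gap to the observed figure:  {estimate - 0.223:.3f}")
```

The estimate lands about 50 points above the observed .223, which is the gap the comment describes as worth over a run per nine innings.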

Conrad
Guest
Conrad
4 years 5 months ago

Awesome stuff, MGL. Thanks for the clarification(s).

MGL
Guest
4 years 5 months ago

“Also, Whenever you see a BABIP number that completely tips the scale in one direction or another, is it not reasonable to assume that regression is coming?”

Regression is ALWAYS coming (unless the observed sample mean is exactly equal to the mean of the population you want to regress towards). In fact the regression PERCENTAGE is the exact same amount given the sample size of the observed BABIP.

As I said above, it takes around 3700 BIP to regress an observed BABIP 50% toward the mean. One year BABIP for a starter should be regressed almost 90% toward a mean. 2 years, around 75%. Etc. The formula is regression = 3700/(3700+BIP).

It is as simple as that. All the rest of the discussion about luck or not luck is just semantics and adds nothing to the discussion.
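A minimal sketch of how the regression amount shrinks with sample size, assuming the 3,700-BIP half-point quoted above. Here the fraction moved toward the mean is written as 3700/(3700+BIP), the form that gives 50% at 3,700 BIP and lands near the one-year and two-year percentages quoted; the weight left on the observed figure is its complement, BIP/(3700+BIP). The 560 balls in play per starter season is an assumption for illustration.

```python
HALF_POINT_BIP = 3700       # BIP at which regression toward the mean is 50% (quoted above)
ASSUMED_BIP_PER_SEASON = 560  # illustrative figure for a full-time starter


def fraction_toward_mean(bip: float, half_point: float = HALF_POINT_BIP) -> float:
    """Share of the gap between observed BABIP and the mean to regress away."""
    return half_point / (half_point + bip)


for seasons in (1, 2, 4, 6):
    bip = seasons * ASSUMED_BIP_PER_SEASON
    toward_mean = fraction_toward_mean(bip)
    print(f"{seasons} season(s), {bip} BIP: regress {toward_mean:.0%} toward the mean, "
          f"keep {1 - toward_mean:.0%} of the observed gap")
```

Changing the assumed balls in play per season moves the percentages by a few points, but the shape is the same: the regression never reaches zero, it only gets smaller as the sample grows.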

vivalajeter
Guest
vivalajeter
4 years 5 months ago

I’m curious – why (3700+BIP)? Where’d the 3700 come from?

Tom
Guest
Tom
4 years 5 months ago

MGL…. great point.

I think the article spends a lot of time talking about .290 vs .270 and spends more time trying to identify what value to regress TO (league average or some “Tampa Bay” adjusted #) and doesn’t bother to look at the actual magnitude of the regression.

And as you point out it also completely ignores the sample size of the data and what role that plays in how to view/regress the data.

I suspect the player and organization involved has some impact on the analysis. I wonder if this was Bronson Arroyo whether there would be so much effort on what the BABIP baseline should be and almost no time on the actual regression to it.

James Gentile
Member
4 years 5 months ago

I’m sort of confused by this article and the reaction it’s getting. It rather clumsily comes to the conclusion that “I wouldn’t be surprised if we decrease the importance of “luck” even further once HITF/x data becomes available”, which may or may not be true, but using Hellickson’s .223 BAbip as an example is a terrible idea, and it’s only going to confuse people.

A regression from .223 back to .270-.290, which the author expects, after accounting for Hellickson’s batted ball profile and defense is still very much gigantic, and yet, I get the impression that some of the readers feel Slowinski is declaring Hellickson as some sort of BAbip defyer.

I, personally, can’t remember the last time I heard anyone of consequence refer to BAbip as “ALL luck”. It seems to me that this article is attacking a strawman or 15 year-olds

akex
Guest
akex
4 years 5 months ago

Hey, ah Michael Salfino copied you on Yahoo! and did the same exact story.

Andrew
Guest
Andrew
4 years 5 months ago

Lots of places have discussed this, because it was a news story in that Tampa paper. By that logic, you could say that Fangraphs is ripping off Marc Topkin.

West
Guest
West
4 years 5 months ago

My BABIP for #23 in Roulette is well above league average, I’m due for big losses.

Gallopinghost
Guest
Gallopinghost
4 years 5 months ago

Is there a good accessible source or previously compiled data on team BABIP stats?

delv
Guest
delv
4 years 5 months ago

“This is a common perception about BABIP, and one that used to be in favor among sabermetric circles. Heck, I subscribed to this philosophy three or four years ago, and I used “luck” as a quick way of describing BABIP to the uninitiated. But these days, that’s an outdated mindset and, quite frankly, misleading.”

Revisionist history at its finest. Slowinski as recently as last year was embracing the “BABIP luck dragons” misconception. Now he claims to have been enlightened for years.

http://www.draysbay.com/2011/3/4/2029409/what-is-babip
See: first comment, and all through the rest of the DRaysBay site last year. Only now that it’s clear that the genius Rays FO does not think that way does he change his perspective. What a joke.

Llewdor
Member
Llewdor
4 years 5 months ago

That’s all luck is. Luck is unexplained variance.

Baltar
Guest
Baltar
4 years 5 months ago

You’re absolutely correct. However, it’s better to use terms such as random variation, just because “lucky” is too easily perceived as a quality of a person rather than the result of random variation.

Paul
Guest
Paul
4 years 5 months ago

You can call it luck or you can call it unexplained variance or you can call it random variation. The key is how that conclusion is used, which I think is the crux of the article and most comments here.

It’s not enough to just say that Justin Verlander had an unexplained variance in his BABIP last year. The difference in whether or not it was influenced by luck or other factors, or if we just say it’s unexplainable and don’t endeavor to come up with explanations is probably somewhere around $5m per year on average. In other words, all those fancy metrics that try to give us an idea of a guy’s “true talent.”

Again, the implication by many, many, many people in the SABR community for many years has been, “Low BABIP means the guy is not as talented as his traditional stats say he is.” Had only the less definitive “may” been used all along, we probably would not be having this discussion.

Sean
Guest
Sean
4 years 5 months ago

I analyzed xBABIP diff (xBABIP – actual BABIP) and xFIP for pitchers stats over the past 5 years and the outliers are interesting:

Top 10, Low xFIP, unlucky with BABIP: MadBum, Niese, Brian Wilson, Brett Anderson, Joba, Greinke, Edward Mujica, Smoltz, Lidge, Qualls

Top 10 High xFIP, lucky with BABIP: Chuck James, Owings, Chris Young, Armando Galarraga, Glavine, Sowers, Kyle Kendrick, Kenny Rogers, Zito, Washburn.

fergie348
Guest
fergie348
4 years 5 months ago

It seems to me that we’re missing some data in regards to batted ball types that could shed light on how lucky or unlucky a pitcher or hitter is over a period of time.

Just give me reliable scouting on batted ball velocity (not actual, on a relative scale say of 1-5 where one is weak contact and 5 is that the ball was ripped) and trajectory and maybe we can break BABIP into component parts that are actually meaningful for evaluating the pitcher, the hitter *and* the defense. Maybe we have SHBABIP, WHBABIP and HHBABIP when all this shakes out.

WWMcClyde
Guest
WWMcClyde
4 years 5 months ago

Truly excellent job, Steve. Increased my understanding of BABIP exponentially.

bstar
Guest
bstar
4 years 5 months ago

This is my favorite Fangraphs article ever.

Charlie Procknow
Guest
Charlie Procknow
4 years 5 months ago

I couldn’t agree more with this article. BABIP is way more than just luck.

That said, I’m still staying away from Hellickson this year in my fantasy drafts. A fly-ball pitcher who walks that many batters, and is bound for a regression (however large or small), is going to get burned eventually.

George S
Guest
George S
4 years 5 months ago

Question/Idear:

A few people have noted that Hellickson’s SwStr% is high and his K/9 is lower than would be expected given this SwStr%. Is it possible that this discrepancy has manifested itself as weakly hit balls in play, leading to a suppressed BABIP? Next year, we’ll still expect his BABIP to regress, but won’t we also expect his K/9 to increase? Could these two regressions balance out, and isn’t this really “luck” but used in a positive way towards Hellickson? I.e. he’s been unlucky with respect to K/9, although this bad luck is perceived as good luck with respect to BABIP. Good idea? Bad idea?

CJ
Guest
CJ
4 years 5 months ago

Reasonable, though I’m not sure about the relative magnitudes of the shifts. It’s been documented that high K/9 pitchers suppress BABIPs anyway.

In a personal WAG, I’d say that no, it won’t cancel. The amount of BABIP regression that we should expect (and by this I mean: Hellickson may be the second coming of Glavine, but WE DON’T KNOW YET, so we regress him like everyone else) is really, really big. The amount of K/9 we can see change… is not likely to prevent him going up to .250-.270 at the very least.

Couple this with a laughably high strand rate and I think you’ll see a pitcher get better w.r.t peripheral stats and post an ERA about a run to two runs higher than last year.

Tom
Guest
Tom
4 years 5 months ago

Yes, but no….The magnitude of the two don’t match up. CJ’s analysis seems spot on… To put some #’s to it and very crudely model this:

The difference between .223 and say .270 (giving generous credit for TB defense and being a FB pitcher) would mean taking 96 outs on all balls in play and converting them into hypothetical strikeouts (559 balls in play, 125 hits in play would need to get to 463 balls in play with the same # of hits to get a .270 BABIP)

That would put him at 213 K’s in 189 IP or a 10.1K/9IP pace.

His SwStr% last year was between Scherzer and Romero who were in the 7-8.5K/9 range. There is not a clear cut conversion from SwStr % to K rate but he would not likely be getting to 9k/9 with that SwStr rate (9K/9 is pretty elite company for a starter – there were only 7 starters in baseball who were above that)

Just playing with some #’s even if you theorize his K rate should be around 8.5k/9 (which seems very generous – King Felix had a rate like that last year) there’s still at least 25-35 points of BABIP unaccounted for (and that’s also assuming a lower baseline for the TB defense). While that sounds small that is still a significant regression while giving ample credit for defense and a very significantly improved strikeout rate

The idea is potentially good, but it doesn’t match up in terms of magnitude.
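The crude model above can be reproduced in a few lines. The 559 balls in play, 125 hits, 189 innings and .270 target come from the comment; the 117 actual strikeouts is not stated there and is inferred from the comment's own numbers (213 hypothetical strikeouts minus the 96 converted outs).

```python
hits_in_play = 125
balls_in_play = 559
innings = 189
actual_strikeouts = 117  # inferred: 213 hypothetical K's minus 96 converted outs
target_babip = 0.270

# Holding hits fixed, how many balls in play would produce the target BABIP?
bip_needed = round(hits_in_play / target_babip)
outs_converted = balls_in_play - bip_needed

# Treat every removed ball in play as an out that became a strikeout instead.
hypothetical_strikeouts = actual_strikeouts + outs_converted
k_per_9 = hypothetical_strikeouts * 9 / innings

print(f"observed BABIP:          {hits_in_play / balls_in_play:.3f}")
print(f"balls in play needed:    {bip_needed}")
print(f"outs converted to K's:   {outs_converted}")
print(f"hypothetical K/9:        {k_per_9:.1f}")
```

The sketch only swaps outs for strikeouts and leaves hits untouched, so it measures how large a strikeout gain would be needed to explain the whole BABIP gap, not how likely such a gain is.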

George S
Guest
George S
4 years 5 months ago

Excellent. So assuming he can net a more modest 7.5 K/9, it could possibly temper the effects of a BABIP regression to the tune of 15 to 20 points, by no means accounting for all of the enormous gap between .223 and .270.

Thanks for the responses!

Urban Shocker
Guest
Urban Shocker
4 years 5 months ago

Let me summarize the state of BABIP research since I am so sick of seeing it abused. ‘Pitchers display very little control over BABIP. Except when they do.’

Paul
Guest
Paul
4 years 5 months ago

Pass. How did it take 135 comments to get here?

Larry Yocum
Guest
Larry Yocum
4 years 5 months ago

Isn’t it a little too early to decide where Hellickson should be on the BABIP scale? I agree that different pitchers will demonstrate different numbers ala Matt Cain, but didn’t we get this same argument last season with Trevor Cahill? I wasn’t touching Cahill last season after his .236 BABIP the year before even if he did have a plus defense and seemingly produced weak grounders. And then 2011 happened and Cahill’s BABIP jumped to .302 and he was one of the biggest disappointments of the season from a fantasy perspective. Hellickson looks an awful lot like this year’s Cahill to me with his low K/9, low BABIP and high strand rate. He will still be a nice pitcher, but expect an ERA closer to 4 than what he put up last season.

tigerfan1984
Guest
tigerfan1984
4 years 5 months ago

I’ve been saying BABIP wasn’t luck to everyone who would listen for years. If you look at the covariance between types of balls hit and other factors a trained statistician would never make such a ridiculous claim.
