Beating Up the Sabermetric Strawman

It might come as a shock to FanGraphs readers, but not everyone has a use for the statistics we use on this site. Plenty of fans, and probably even some people more closely connected to the game, are comfortable with their personal observations. That’s fine. Baseball is a game, and the game is meant, first and foremost, to entertain. We all enjoy it in our own ways, and, as long as it doesn’t involve harming others, no one should disparage anyone else’s way of enjoying it. By hearing them out we might even find new ways to enjoy the game ourselves. But enjoyment is not at all the same as evaluation. That’s where we’ve run into some issues.

Earlier this year, brothers Alan and Seymour Hirsch published a book called The Beauty of Short Hops, in which they argue that a reliance on sabermetrics ignores the finer aspects of the game. Yesterday on Grantland, science writer Jonah Lehrer argued a similar point. While he does give a nod to the virtues of sabermetrics, he spends most of his words talking about what the stats do not tell us. The crux, as with any article worth publishing, comes towards the middle:

But sabermetrics comes with an important drawback. Because it translates sports into a list of statistics, the tool can also lead coaches and executives to neglect those variables that can’t be quantified. They become so obsessed with the power of base runs that they undervalue the importance of not being an asshole, or having playoff experience, or listening to the coach. Such variables are the sporting equivalent of a nice dashboard. They can’t be quantified, but they still count.

Lehrer offers up this loaded assertion without one iota of evidence. He tells the story of the NBA Champion Dallas Mavericks and how one of their statistically bereft players helped change the tone of the series. But he doesn’t show that somehow the Heat threw aside intangibles while the Mavericks embraced them. It is patently ridiculous to think that any front office has thrown subjective analysis in the garbage solely in favor of statistics. Of course they consider if a player is an asshole and all of those other things Lehrer listed. They all affect a player’s performance on the field, and a front office’s job is to put the best possible team on the field.

Perhaps Lehrer’s assertion can be more appropriately tied to analysts, such as your gracious hosts here at FanGraphs. But even then it falls short. Yes, we argue mostly from a statistical standpoint, and oftentimes we make no mention of intangibles or even scouting aspects of the game. That does not mean that we devalue or neglect them. Rather, we’re focusing on one aspect of the argument. Other writers focus on other aspects. Sometimes these arguments will contradict each other. That’s fine, and even good in most instances. It gives us more to discuss, and with more discussion we can get closer to the heart of the matter.

This all takes us back to the age-old stats vs. scouting argument, which is actually one big misstatement. As many before me have noted, stats are not in conflict with scouting. They’re two different tools that teams can use in evaluating players. Stats are merely a record of what happened on the field. They’re sometimes put in greater context, but they’re the results of the game nonetheless. Scouting is a more subjective, closer look at those same events, and it does include other aspects such as a player’s makeup. Taken together they can provide a team with a reliable player evaluation. Either part without the other, though, can miss important factors.

Lehrer and the Hirsch brothers have done little except beat up on a strawman with a big target on its back. Plenty of people dislike statistics, and so writing about the perils of statistical evaluation is sure to catch some attention. Yet both of their arguments miss the point on many counts. Lehrer’s might be the worse offender, because he provides no evidence for his claims. No one — no one worth reading, at least — discounts the immeasurable side of the game. Since it is immeasurable, though, it is difficult to find evidence to back claims. Therefore, many statistically minded writers opt not to deal with it at all.

Think of it like FIP. We all know that a pitcher who leaves a ball over the middle of the plate is probably going to get hit hard. And yet, we removed balls in play from the formula anyway. Why? Because we want to know one specific thing about the pitcher: how he performs in terms of the events he most directly controls. Stats-based analysts and writers know that a player's attitude can affect both him and his teammates. But we're more interested in the events as they occurred on the field. It's not one to the neglect of the other. There are other writers and analysts who can better cover the other aspects of the game.
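
For the curious, the FIP idea fits in a few lines of code. What follows is a minimal sketch of the standard formula in Python; the league constant is an assumption here (roughly 3.10 in recent seasons), since in practice it is set each year so that league-average FIP matches league-average ERA:

    def fip(hr, bb, hbp, k, ip, league_constant=3.10):
        # Only the events the pitcher most directly controls: home runs,
        # walks, hit batters, and strikeouts. Balls in play are left out
        # of the formula on purpose.
        return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + league_constant

    # Hypothetical line, invented for illustration:
    # 20 HR, 50 BB, 5 HBP, 200 K over 210 IP
    print(round(fip(hr=20, bb=50, hbp=5, k=200, ip=210), 2))  # ~3.22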







Joe also writes about the Yankees at River Ave. Blues.


111 Responses to “Beating Up the Sabermetric Strawman”

  1. Colin says:

    Thanks, that was a terrible article. His weak example is especially so, since most statistical metrics love the Mavericks and their crunch-time heroics.


    • mcbrown says:

      Yes, that was a terrible example. As I understand it, the Mavericks embraced advanced statistical analysis to help optimize their situational lineups, with (obviously) positive results. It doesn’t take anything away from the many great personal storylines behind their championship to add the true storyline of HOW the Mavs got there.

      What drivel.


    • ademing says:

      Yes, I love that his only evidence to support his cause was of the most sabermetric forward team in the NBA.


      • Sean O'Neill says:

        I thought that was the Rockets?


      • Bigmouth says:

        Sean, I think he meant “one of…” And he’s right.

        Ironically, Lehrer makes the same error the Mavericks did several years ago against the Warriors. They made lineup decisions based on +/- scores from a handful of games, rather than over a full season. As a result, the Dubs torched the Mavs.


  2. Joel says:

    What would an internet be without an obligatory Richard Feynman reference:

    http://www.youtube.com/watch?v=ZbFM3rn4ldo&feature=player_embedded


    • JimNYC says:

      What would it be? Maybe an actual enjoyable place to hang out, rather than a tiresome den of math nerds?

      Seriously, I’m all for proper statistical analysis in a baseball context, but people who mention Feynman’s name actively enjoy math, and that just creeps me out.


  3. As someone of slight-ish build, I appreciate the existence of strawmen, as they’re one of the few things I CAN beat up.


    • What about Snake Drank? Is that a measurable intangible? What about when a player urinates on his own hands so he doesn’t have to use batting gloves? I think stats come up woefully short in measuring the value of such things.


      • Yirmiyahu says:

        Or grit. Stats can’t measure grit. I’m talking about dirt, literally.

        I love filthy, pine tar covered batting helmets.

        http://sports.espn.go.com/espn/page2/story?page=lukas/041021


      • Preston says:

        The so-called luck and intangibles that people say sabermetrics ignores are, I find, actually ignored by old-school analysts. One example is an anecdote in the book that refers to a ball hitting a bird. How can you measure that in stats? Well, the answer is you can't by traditional means. But by looking at anomalous BABIPs (either high or low) you can see what role luck is playing in that player's performance (you can further parse that information to see if the change in BABIP is due to more/fewer line drives, GBs or FBs). The point being you can take luck out of the analysis when predicting future performance. In evaluating prospects you can use stats to either confirm or deny scouts' assertions about a player. Guys who are "toolsy" don't always perform, and guys that are "gritty" sometimes do. While players like Nick Swisher or Dustin Pedroia may not have been impressive-looking prospects, when using an analytical, stats-based approach, like both the A's and Sox use, you see that they both produce, which is much more important than talking about having a pretty swing, being fluid, athletic, strong or having 5-tool potential.
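
        For what it's worth, the BABIP piece of that is just arithmetic. Here's a rough sketch of the standard calculation in Python, with made-up numbers purely for illustration:

            def babip(hits, hr, ab, k, sf):
                # Batting average on balls in play: non-HR hits divided by
                # at-bats that actually put the ball in play (plus sac flies).
                return (hits - hr) / (ab - k - hr + sf)

            # Hypothetical season line, invented for illustration only
            print(round(babip(hits=160, hr=20, ab=550, k=100, sf=5), 3))  # ~0.322

        A figure far above or below the league norm (roughly .300) is the cue to dig into the batted-ball mix before projecting the performance forward.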


      • Seideberg says:

        Err, Preston, Nick Swisher was a first-round pick (#18 by the A’s), and Pedroia was a second-round pick (#65 by the Red Sox).

        You were probably thinking of Frank “Nitty Gritty” Menechino, who was drafted in the 45th round, two rounds after the Chicago White Sox GM drafted his own daughter.


      • Preston says:

        My point is that if you just watched one of these guys play in a SSS you might say: too short, not athletic enough, holes in their swings, etc. But because both guys are very disciplined, cerebral players who give max effort, they produce. Teams do use stats; they realized how productive they were and drafted them accordingly. Watching Yankees spring training games this year, Andrew Brackman and Melky Mesa look like elite players because of their measurables. But neither has ever had production to match, and predictably they are not producing in the minor leagues this year either. Conversely, a pitcher like David Robertson doesn't have overly impressive velocity on his fastball, yet is posting elite K/9 numbers in the majors. Similarly, Brett Gardner, who is not physically impressive and has a very choppy, ugly swing, is one of the best in terms of WAR in the majors. Scouts often give assessments based upon watching one or two games of a player. My point is that without the context of statistics those observations are baseless.


      • Jim says:

        Seideberg, it’s very doubtful that Pedroia would have been a second round pick even five years earlier. Between 2000 and 2004 statistical analysis gained great traction because of those unathletic A’s teams that built their offense around avoiding outs. Even still, Pedroia’s college stats dwarfed just about all of those drafted ahead of him, and yet he lasted until the #65 pick, which was derided by several in the scouting community. He has followed that up by having (by a significant margin) the best WAR of every hitter taken in 2004.

        Swisher, though, was probably a bad example – he had been highly regarded for quite awhile, and is also the son of a major leaguer. We all know how scouts love their legacies.


  4. Seattle Matt says:

    Hypothesis: “Jonah Lehrer” is actually a pseudonym for an author who is a big time stats lover. This writer ginned up this article so that all of the stats-loving writers can easily hammer home their regular arguments.


  5. Ingy says:

    Agreed. Sabermetrics are on the way out. This movie with Brad Pitt is like the death knell in the coffin of sabermetrics. It makes everybody realize that Billy Beane has all these fancy statistics, but he’s still yet to win a World Series, which is the ultimate goal. It DOES NOT MATTER if you can spend money wisely or you signed somebody to a minor-league contract and they had an All-Star season. If you don’t win the ultimate prize, it doesn’t matter. I’d like to see some of these Harvard and Yale educated kids play baseball for at least four or five years in the minor leagues before they work in the front office of an MLB team. This would give them the required perspective to learn how to win, instead of making decisions based solely on calculations. It was fun while it lasted sabermetrics, but you can’t win the big one.


    • J.P. says:

      Yeah, that crazy, gritty Red Sox front office burned every computer in the building before they won two titles.


      • Ingy says:

        It doesn’t matter since the movie and all the press is about Billy Beane and the Athletics. The Red Sox have something called MONEY, which is what it takes to win – MOST of the time. Notice I didn’t say ALL of the time, but you can have your calculators and Harvard-educated kids who’ve never played the game. I’ll take the biggest payroll in baseball, and if it’s ONLY one or the other – money versus sabermetrics, guess what Skippy? I’ll win. Not every single game, but MOST games, and in the end, the big prize… the WORLD SERIES, which again is the ONLY reason to play baseball. Game, set, match.


    • Eriz says:

      Yes. Sabermetrics are on the way out. Which is why nearly every team (including the Yankees and Red Sox) has some sort of sabermetric staff member.


      • FFFFan says:

        The Red Sox have had Bill James on staff for some time.

        You shouldn't expect Beane to have an advantage anymore since most teams use advanced statistical analysis for player evaluation in addition to scouting.

        If you’re in a competition, you use every tool available (e.g. advanced statistics, advanced medical evaluation, etc.).


    • Eric says:

      Hahaha


    • jim says:

      ridiculously obvious troll is ridiculously obvious


    • Matt says:

      It’s funny that you bring up the Moneyball movie (and therefore book as well) because when I was reading

      “This all takes us back to the age-old stats vs. scouting argument, which is actually one big misstatement. As many before me have noted, stats are not in conflict with scouting. ”

      I couldn't help but think that Moneyball is largely responsible for this perception. Even in the short trailer, how do you see scouts portrayed? As 90-year-old idiots, talking about players with ugly girlfriends and not knowing who Fabio is, until Billy Beane and Jonah Hill come to the rescue and put them in their place.


    • Duke Silver says:

      Not sure if trolling…


    • Chris says:

      Everyone who thumbs downed this: sarcasm doesn’t get much clearer than this


    • Eric M. Van says:

      There’s no question that sabermetrics let down Beane in the playoffs. The A’s inability to accurately estimate Jeremy Giambi’s HTFDYNS/RS (How the F*** Do You Not Slide / Runs Scored) cost them a sweep of the Yankees in the 2001 ALDS. And given that lesson, that they failed to sweep the Red Sox in 2003 because they couldn’t get an accurate handle on either Eric Byrnes’ or Miguel Tejada’s SANHPANTTTIAGTO/RS (Standing Around Near Home Plate and Not Trying to Touch It and Getting Tagged Out / RS) is inexcusable.


  6. AK says:

    While I largely agree with the article, this statement is blatantly untrue:

    “Yes, we argue mostly from a statistical standpoint, and oftentimes we make no mention of intangibles or even scouting aspects of the game. That does not mean that we devalue or neglect them.”

    It would be kind to say that the Fangraphs community – both its writers and readers – devalues and neglects intangible measurements. It's on each individual to decide whether that's appropriate, but let's not pretend that there's any bit of diplomacy from this community towards these skills that cannot be quantified. Neglected and devalued? More like mocked and ridiculed, if we're being honest.

    Similarly, the Fangraphs community engages in many, many empty qualifiers in order to shield itself from criticisms of certain metrics. Anytime someone questions the validity of WAR as a catch-all measurement of a player's value, a writer or commenter will insist that no one should look at WAR as a statistic that gives a full measurement of a player's value… before proceeding to discuss the statistic as if it provides a full measurement of a player's value. The qualification exists only to disarm critics, and is conveniently ignored once the statistic itself is no longer being debated. Similarly, whenever the validity of UZR as a defensive measurement is questioned, a writer or commenter will build a protective layer around the stat based on the fact that it requires three years of data to become viable. Then, once the assault has passed, those same writers and commenters will begin comparing players defensively based on a half-season sample size. And of course UZR is the defensive component of WAR, so this argument circles back to the first, whenever half-season WAR values are suddenly discussed. There's universal acknowledgment that UZR is unstable and unreliable with less than three years of data, but that acknowledgement is ignored whenever it's convenient to do so.

    In chats, the writers often defend the misrepresentation of certain statistics by saying, “______ isn’t meant to be used as _______, and no one here suggests otherwise.” They then proceed to suggest otherwise.

    But, for the record, let me state that I’m someone who doesn’t believe in intangibles, and does believe in both WAR and UZR. But as a matter of observation, many “true believers” on this site like to ignore qualifications when it’s convenient to do so, and fall back on flatly untrue maxims like “no one thinks WAR should be used as the sole measurement of a player’s value.”

    Generally, there’s a lack of honest introspection into the role that SABR-friendly folks have played in the broader sports world’s misunderstanding of the SABR community.


    • don says:

      The best example of that is Milton Bradley. The team that picked him up was always applauded by the stats community because of his hitting ability but invariably he’d be kicked to the curb again in a year or two.


      • Travis says:

        The Bradley example is demonstrably false. After 60 seconds of googling, I found the following articles discussing non-statistical aspects of Milton:
        His use of the “race card”: http://www.fangraphs.com/blogs/index.php/milton-bradley-and-the-race-card/
        An entire article about Bradley being unable to escape his intangibles, and the relationship to the SABR community: http://www.fangraphs.com/blogs/index.php/milton-bradley-and-statheads/
        Article on when he went to Chicago, with discussion of his attitude: http://www.fangraphs.com/fantasy/index.php/milton-bradley-takes-his-game-to-chicago/

        Parent comment may have some valid points, but it’s certainly not the case with respect to Bradley.


      • Aaron W. says:

        The larger problem here is that there is still so much residual resentment from both sides that it interferes with our ability to honestly assess our own prejudices.
        Billy Beane said that we weren't selling blue jeans. It took us stat-heads ten years to realize that, yeah, we kind of are.
        Isn't Cameron Maybin the kind of toolsy/low-baseball-skills kind of guy that Moneyball was making fun of? And weren't most of us THRILLED to see the Padres pick him up?
        That said, it seems that the combative attitude of the post-Moneyball era has largely subsided in the sabre-inclined community. But I don't think the Lehrers of the world have mellowed toward us at all.


      • r says:

        please dont mention articles by alex remington in a discussion of anything sabremetric, he’s an op-ed writer.


      • don says:

        Travis, the first two of those articles are from the last few months when he’s no longer playing and aren’t about acquisitions anyway so they’re irrelevant. The third considers it a good pickup.

        “Demonstrably false” indeed. Next time consider content instead of linking the first three things you turn up in a google search.


    • DD says:

      I agree with this. Because these measurement tools have limitations and aren't perfect, although they may seem perfectly logical in their construction, they lose credibility when we try to advance their use with the non-saber crowd. As tools that are made up of objective, measurable information, they are often thrown around as the gospel, the truth. However, because of their imperfections, they lose traction as factual representations of on-field performance. Therefore, as each blemish is uncovered, or glossed over, or when a valuation is qualified, the conclusion reached topples to the ground, at least in the minds of non-saber folks. Since scouting reports are largely subjective, there is less of a need or call to defend a particular conclusion (that a particular player has a "plus fastball," for example), because the understanding is that not everyone comes to the same truth. This is a large obstruction to wide acceptance that still remains to be overcome by saberists. Progress is being made every day to find better, more definitive ways to measure things, such as defense. My worry is that the saber crowd is pushing an incomplete idea on others a bit too soon. Maybe we should work on getting better tools before we build a house.


    • Rudegar says:

      Well said. Very well said, and I agree with this completely.


    • John Willumsen says:

      This was a well-crafted, intelligent comment and I thank you for the quality content.

      I have a question. You wrote (in re: FG writers and commenters positing that stats like WAR are not the end all be all and promptly treating them as such): “The qualification exists only to disarm critics, and is conveniently ignored once the statistic itself is no longer being debated.”

      My question: Is this in fact such a sneaky tactic? I would argue that it is appropriate to only note that these stats are not wholly definitive and discussion-ending when someone claims that they are. At other times, when they are not coming under attack, I would argue, it is OK to act as though those reading understand that the writer is not claiming that these stats answer all the questions.

      I don’t necessarily disagree with your overall premise; writing an article that implies that Brett Gardner is the 12th best player in baseball based essentially only on his WAR and then turning around and saying that WAR lists aren’t definitive who’s-whos of the best players in baseball is a bit disingenuous, but I would suggest that such instances are far and away the exception rather than the rule.


      • AK says:

        There’s a lot of truth in what you say. And logistically speaking, who wants to read an article that’s so stuffed with qualifiers that any conclusion it appears to be making becomes so watered down as to be nearly nonexistent? At times it can appear to be nothing more than spineless equivocation, and at others it’s just an indecipherable information dump. Stories should be accessible and direct, so I understand the impulse to make some basic statements without inserting warnings in every other sentence.

        But the fatal error, I think, is the assumption that everyone is aware of the unstated-but-implicit limitations of each statistic. These are still radically new measures, and are pretty complicated, no matter how familiar with the underlying math one person may be. There’s a lot of room for misunderstanding and misinterpretation. And from what I experience when I talk to people about advanced stats, both online and in person, it’s clear to me that the misunderstanding and misinterpretation is more than theoretical.


    • Random Guy says:

      That’s a good and thoughtful commentary, although I’m not sure if it is as true as it used to be. I was a semi-regular on rec.sport.baseball and then the Rob Neyer message board on ESPN, but more of an observer with a passing interest than an actual purveyor of advanced metrics. Back on those forums, even a throwaway comment that a player seemed like a good guy would get you mocked and ridiculed by five different responders, since you just can’t quantify good-guy-ness. Part of this might have been hypersensitivity to a few active trolls on those boards, but it was still the prevailing sentiment. I don’t think that happens so much on Fangraphs, as far as I have seen; maybe that’s just because trolls aren’t as well-fed as they used to be.


    • Eric says:

      Although I think your point does have some pretty valid criticisms, I wanted to bring up one specific point. In your article you reference writers building “protective layers” around the stat, only to then discuss and compare players using those stats. It’s important to keep in mind that most of the time these conversations do not happen when trying to predict what the player will do in the future, but rather what actually happened on the field. Discussing the half season WAR of two players in terms of who has performed better on the field is fine. Discussing who is the best player or who will be the best moving forward using half season WAR is not, but I don’t see those conversations occurring often here.


      • todmod says:

        Well, I think the issue of whether WAR is an accurate measurement of who is the better player in-season is highly debatable as well. Whether it's the accuracy of UZR (especially in small samples) or the question of whether FIP captures enough information, using WAR as the only factor to determine player value in-season isn't exactly the best method.

        It’s a good tool and very useful in comparisons, but I will agree with the complaint that this site does fall into traps of using it as an absolute end-all capture of a player’s value.


    • Mr. Know-it-All says:

      “Generally, there’s a lack of honest introspection into the role that SABR-friendly folks have played in the broader sports world’s misunderstanding of the SABR community.”

      Yes! Yes! Thank you, AK!

      As a huge baseball fan and a saber-skeptic – but one who has tried to educate himself on at least the basic sabermetric concepts – I have run headfirst into this phenomenon repeatedly. And, indeed, your WAR example is spot-on 100% correct.

      I appreciate the OP’s concession that statistical calculations and traditional scouting each have their place in baseball analysis – but as you so incisively and accurately observe, it’s just not what we see in actual practice among the sabermetricians.


  7. Telo says:

    I was enjoying most of what Grantland had to offer until that drivel was posted. Completely useless. The true pursuit of sabermetrics is to know what you know, and know what you don’t know. No one thinks they can look ONLY at numbers and know everything about a player.

    Lehrer is the definition of a hack.


    • AK says:

      Plenty of people do think you can look ONLY at numbers and know everything about a player. Maybe not the people in front offices, and maybe not the Dave Camerons and Jonah Keri’s of the world, but many, many of the SABR-inclined fans certainly do. And very often, that’s the impression the general audience has of the SABR community: that one annoying guy they know who insists on telling them that with the right collection of statistics, you can know everything about a player that there is to know.

      Sure, these folks are misinformed, and they’re poorly representing the community they support. But they do exist. It’s disingenuous to pretend that the only advocates of a thing are those that represent the thing graciously and accurately.

      I say this independent of the Grantland piece and Joe’s reaction to it.


      • Rudegar says:

        Some don't even use the plural "numbers". They think one advanced stat, with an algebraic equation that may or may not have accurate weighting, can tell you everything you need to know. Talk pitching, and while not everyone, some go FIP, FIP, FIP. And anything else is scoffed at. Talk defense, and it's all UZR. Hitting? wOBA, with all the rest ridiculed and cast aside as near worthless in comparison.

        Not all of the stat junkies feel this way, but it seems enough do to make any discussion with them pointless.


  8. B.E. Earl says:

    Don’t intangibles lead to performance (good or bad) which can then be measured by…gasp…stats? Just wondering.

    Great article, Joe. And nice that we folks can leave comments here as opposed to that Grantland site.


    • AK says:

      Careful… it says above that no one thinks what you’re suggesting. You apparently exist on an island.


      • John Willumsen says:

        I think one thing that’s getting lost here is that when a fan, or for the most part even a FG writer, comments on intangibles like work-ethic or likeability, they are inescapably speaking with limited to no first-hand knowledge. Most everything we know about intangibles is distilled to us through beat writers, announcers, and snippets we see from the dugout and so on. I would argue that it’s almost worse to discuss such things with our very limited actual knowledge of them than it would be to try to guess at their impact. While it’s important to accept that many of the professional talent evaluators in the game (from scout to manager to GM) MUST and DO take into account that which we label “intangibles,” we should also remember that we, outside of the profession, really know next to nothing about the truth of specific players’ intangibles. The above-cited examples of Milton Bradley and Javy Vazquez. We know nothing about how these men truly comport themselves and even less about what goes on within their own minds. We do them, and perhaps ourselves, a disservice if we place too much weight on our assumptions and best guesses about their character.


      • John Willumsen says:

        I can’t reply to your reply above since that would be stretching the reply-to-reply-to-reply chain too far. On a related note, reply is a fun word.

        But yeah, thanks for your response to my query, what you said makes sense and, in my opinion, is correct.


  9. David says:

    On a related note, John Kruk is as big a fan of stats as he is of salads.

    Which, as good a statement as it is on its own, leads me into something I found particularly interesting about his comments on teams hiring new managers these days. He was saying something along the lines of managers who make decisions solely based on statistics are taking the easy way out compared to managers who make "gut" or "baseball" decisions, because they can say, "I made the right choice, "x"% of the time "y" happens and unfortunately despite the favorable odds it didn't work out in our favor". And he implied he would prefer a manager that has the ball(s) to make "gut" decisions and then defend them to the press.

    First, are there any managers who make the majority of their decisions based on statistics? I’m assuming the list is pretty thin as it seems most managers probably hurt their teams with their decisions. And second, from all the post game press conferences I’ve seen it seems most of the time managers are explaining why their gut feelings didn’t work out not why their use of statistics and expected outcomes failed. Why would someone prefer a manager take the 1 in 5 chance and be able to defend failing often when they could take the 4 in 5 chance and say “we got unlucky” the expected 20% of the time?


  10. Paulie L. says:

    Two moves that came to mind after reading this piece were the Yankees signing of Javier Vazquez and the Mariners trading for Milton Bradley. Both the Yankees and Mariners were praised on this site for those moves.

    Vazquez lost starts at the end of the ’08 season in a tight race and when he pitched against Tampa in the playoffs he was serving up meatballs. I thought he was mentally deficient then and even mentioned so in the article praising the Yankees for signing him.

    Nothing really needs to be said about Bradley. Hendry was a moron for signing him and Jack Z a bigger moron for sending cash and Silva to the Cubs for the honor of hosting the Bradley circus.


    • todmod says:

      It’s a good thing that Javier Vazquez went back to being dominant when he switched to the Marlins this season. With his mental deficiency playing in New York, it was a surefire fix.

      There definitely couldn't be any arm/velocity issues examined on this very site that were the cause of that. MENTALLY WEAK!


      • AK says:

        Remember, no one devalues or neglects intangibles.


      • Paulie L. says:

        J.V. has been labeled an underachiever for a reason.

        Vazquez was outstanding on a 72 win Sox team in ’07. He went back to being mediocre on a division winning team in ’08, to the point he actually lost starts at the end of the season and only pitched against Tampa in the playoffs because the other starters were exhausted from pitching on short rest. Javier then went to the NL and was terrific once again for a 3rd place Braves team which convinced the Yankees to bring him back a second time.

        Below are Vazquez's Baseball-Reference.com WAR, Fangraphs.com WAR and the team wins/finish since '06

        '06: 2.6 bWAR, 4.8 fWAR; 90 wins, 3rd place
        '07: 5.9 bWAR, 5.1 fWAR; 72 wins, 4th place
        '08: 3.1 bWAR, 4.9 fWAR; 89 wins, 1st place
        '09: 5.2 bWAR, 6.5 fWAR; 86 wins, 3rd place
        '10: -0.3 bWAR, -0.2 fWAR; 95 wins, 2nd place


  11. Nick says:

    While I agree that the article was fairly self-serving and not grounded in much evidence, you should be equally careful in your counter. You cite, "No one — no one worth reading, at least — discounts the immeasurable side of the game. Since it is immeasurable, though, it is difficult to find evidence to back claims." But your counter is largely based around the fact that he presents no evidence. Wouldn't that be expected?

    Clearly a writer with an affinity for immeasurables would also have an affinity for the unprovables. They have an engrained notion that the "magical" aspects of baseball are what we should hold onto. They are offended when the saber community asserts "Roy Hobbs is not walking through that door." Just my $.02.


  12. Steve says:

    I read FanGraphs' predictions every year, and despite the high-powered statistical metrics (NERD, really?), the site does no better or worse in predicting final outcomes than the sportswriter who has trouble with simple multiplication. So what's the point?

    The real valuable work was done by Bill James and applied by Billy Beane. The importance of on base percentage, slugging ability, and most of DIPS were the big steps.

    It’s gone overboard, and I’d argue, too far, especially on this site. Just as an example, WAR is over-used and over-rated because it tries to equate defensive ability to wins. The problem is, UZR is notoriously volatile from year to year and offers almost zero predictive ability, yet is about half of the WAR value.


    • Sean O'Neill says:

      One problem with your conclusion: UZR's lack of predictive ability is irrelevant to the discussion of events that have already happened, and it's those events that have already happened that matter to us when trying to figure out things like…how many wins a player has been worth.


      • vivalajeter says:

        Sean, it looks like Steve is referring to people who use WAR as a predictive measure rather than a measure of what actually happened. Like saying a free agent is overpaid because he was worth 5 WAR last year and he’ll decline by 0.5/year, etc. Or like saying “Don’t trade for Beltran because it’ll only add ~1 win in August/September”. There are many instances where people cite WAR as a predictive measure.


    • Nik says:

      WAR isn't used to predict. It's used to show what has happened; it is up to the individual to create his own predictions using stats of any kind, and frankly I'll take UZR, wOBA and xFIP over ERA, AVG and RBI any day of the week.


    • Jason B says:

      “the site does no better or worse in predicting final outcomes than the sportswriter who has trouble with simple multiplication.”

      So you’ve studied this? Can I see your results? (Just because you think something may be true, or wish it to be, doesn’t necessarily imply that it is. If you have valid results, I stand to be corrected.)

      “The real valuable work was done by Bill James and applied by Billy Beane. The importance of on base percentage, slugging ability, and most of DIPS were the big steps.”

      Billy Beane’s work wasn’t about the importance of on-base percentage, slugging percentage, and/or DIPS, but about finding and/or properly valuing undervalued assets.


    • MetsKnicksRutgers says:

      Well… How hard is baseball to predict really?

      AL Playoffs:

      Boston, New York, Minnesota, Texas. Switch Texas with Anaheim and Tampa with one of Boston or New York and you have the past several years.

      NL PLayoffs:

      Philadelphia, St. Louis, Colorado, Atlanta. The West has been a toss-up recently, but when making predictions one can't say it's a toss-up. It's an opinion.

      Murray Chass doesn't need wOBA, wRC+, or xFIP to know that Boston, Tampa, NYY, Philly, ATL are gonna be really good. Generally, really good players put up good traditional stats: AVG, RBI, homers. It's in the details, where advanced stats come in, such as building a team. RBIs aren't very indicative of future success (e.g. Jeff Francoeur), and when building a team one shouldn't place Pujols next to Ryan Howard because of Howard's superior RBI totals. It leads someone to an incorrect conclusion, where advanced stats give a much better picture of how valuable a player is.

      Also, wasn't FanGraphs the first to jump on the Tampa bandwagon in 2007, predicting a playoff appearance by 2008? That was a pretty damn good prediction.


      • Steve says:

        It's funny how a lot of replies set up their own straw-man, whether it be RBIs, or AVG, or ERA. Never said they were better, only said that WAR adds nothing to James's and others' work that culminated in OPS and DIPS (specifically K/9, BB/9, HR/9), and is misleading to use as shorthand for a player's value.

        (Billy Beane’s work wasn’t about the importance of on-base percentage, slugging percentage, and/or DIPS, but about finding and/or properly valuing undervalued assets.)

        So, what did he use to find these undervalued assets?


      • MetsKnicksRutgers says:

        I don’t really understand how I set up my own straw man, but think what you like. I just disagree with your assertion that
        A: FG doesn’t have better tools to accurately predict individual seasons
        B: nobody has expanded on DIPS, OPS, etc.

        For a 7-year-old, or a foreigner with no knowledge of the sport, WAR is actually a great stat. It filters through the mud that AVG and RBI don't at times, whereas OPS, FIP, and OBP may be somewhat confusing. What's the difference between OBP and BA? Why is SLG on a different scale from BA and OBP? These are valid questions for a new fan of the sport, and if somebody sees that Pujols consistently has a WAR of ~8, they will know he is one of the best of his generation. Home runs won't do that (see Hundley, '96), nor will RBIs or BA. OBP is pretty close, but has exceptions like Luis Castillo or Eckstein. SLG is even less descriptive. I'm not even a fan of WAR, but for all its complexity I find it best utilized when it's introduced to novices of the sport.

        Lastly, I read Moneyball 3-4 times and I'm well aware of what Beane, JP, Forst, and DePo were trying to do. But IIRC they DID use McCracken's DIPS; it was the tool they utilized to find undervalued assets. Most notably, pitchers with high GB rates rather than K rates, because they weren't valued. After writing this, it reminds me of how sad I was when Mensa member Omar Minaya let Bradford go, only to sign The Schoew(enweis) to a freaking 3-year deal.


  13. DrBGiantsfan says:

    You need to be careful to not put up your own straw man. Just because someone may have their doubts about the utility of xFIP, just as one example, doesn’t mean that they are anti-statistics. Not saying you did that here, but it’s happened.


  14. DCatcher says:

    You know, the problem with sabermetrics is that we don’t have enough of them. We really need a way of measuring contribution in the dressing room. I hear Cesar Izturis is great in the clubhouse. And Kevin Millar outperformed everyone while walking back and forth in the dugout. Neither the Blue Jays nor the Orioles have won since letting Millar go.


    • Sultan of Schwinngg says:

      Your mocking what are real values on a sports team is simply evidence of your never having played the game in a competitive environment. I know that may come across rudely, but it's true. Confidence is a great talent of any player, and yes, there are people who instill that in others, even if they're statistical scrubs, although they're usually not.


  15. Thompson says:

    I’m fully on board with marrying both approaches – stats and scouting – when evaluating a player and trying to guesstimate future performance. What strikes me as disingenuous, though, is when folks like Lehrer and others (ok, full disclosure: I haven’t read the Lehrer piece, just the excerpt above) suggest that looking at just statistics completely disregards elements like if the player is an asshole or listens to his coaches. The implication there is that these are independent variables, when clearly they are not. If a player is open to being coached, his quantifiable performance – the statistics that we embrace here – should reflect that. If not, he’s not going to be in MLB for much longer. Moreover, the salary you’re giving to a player is contingent upon his measurable performance to a much greater extent than it is upon whether he’s a good clubhouse presence. The latter is icing on the cake, while the former is (in most cases) having a larger effect on the team’s performance as a whole. The end game is trying to increase the club’s wins, and projecting that a player can only get on base one in every four tries is a bigger indicator of that than projecting whether he’ll sign autographs or not.

    I think the evidence is pretty convincing that advanced metrics do a good job, in most cases, of predicting who will help a team succeed. The burden of proof is on those who say, “Well, if the NERDS are only going to look at stats, than WE are only going to look at scouting!” to produce evidence that scouting is as equally important a piece as they believe. Show me that, when controlling for quantifiable ability, being a good teammate predicts team success the same way that something like FIP or OPS+ does. Show me that an Elijah Dukes or Albert Belle or Nyjer Morgan (sorry, you three are the first examples that came to mind) has a measurable impact on his teammates’ quantifiable skills when he changes teams – do the players around them perform worse when those “clubhouse cancers” are there, and better after they leave? Do their teams lose more games than when they are replaced with nicer, but comparably skilled, players? I have no doubt that a player’s attitude, motivation, and so forth can impact their play and the play of others – but until I see data suggesting that it makes a measurable impact, I’m unconvinced that money spent on quantitative analysis is better spent on scouting.

    I like this analogy: if you are looking to invest with a brokerage firm, do you want to put your chips in with the firm that invests your money based on past performance, or on their gut instinct and experience? There are very obvious pros and cons for each, but I want the one that takes a data-driven approach – even if it’s just because I’ll sleep better at night knowing I made a statistically defensible decision. Maybe that’s why I like to read Joe Pawlikowski, Dave Cameron, and RLYW, and not so much Grantland and ESPN.


    • Erik says:

      “Show me that, when controlling for quantifiable ability, being a good teammate predicts team success the same way that something like FIP or OPS+ does.”

      It’s a Catch-22: once you have quantifiable evidence of its effect, it ceases to be an “intangible” and becomes a stat!


  16. Infinite Jest says:

    Grantland was created by Bill Simmons and I’m not surprised to see that article on there since I feel like I read something once where Simmons was basically decrying the stats/SABR blitz in baseball and how it’s reduced the game to a set of numbers where you don’t even need to watch it, whereas his precious sport of basketball is the exact opposite.


    • MetsKnicksRutgers says:

      Simmons jumped ship. Ever since the Cameron and Beltre signings he's all about UZR and WAR, etc.


  17. juan pierre's mustache says:

    can we come up with a stat that quantifies how embarrassing it is to be scored on by a short person? the thrust of his argument about the mavericks seems to be that it was so embarrassing for the heat every time barea hit a short shot (and his missed layups were not embarrassing for the mavs as he is short and can’t be expected to make them) that the heat basically broke down and cried (too soon?). i would be interested to see if pitchers, similarly, fare poorly after they give up a hit to an embarrassingly small hitter due to a high degree of shame at their failures


  18. Mcneildon says:

    I’m not so sure about that. I read an article of his on ESPN a few years ago where he encouraged readers to learn about advanced metrics and even gave about ten dumbed down definitions of various concepts. I also know that he is a big proponent of advanced metrics in basketball like the ones devised by John Hollinger.


  19. dickey simpkins says:

    The funniest part of this whole Grantland piece is that the widely criticized Jason Kidd trade a few years ago was defended by Cuban and his front office on the basis of stats. They had their own +/- formula that basically stated Kidd was one of the best players in the league when he was in a period of decline. Rick Carlisle is routinely cited as one of the few NBA coaches who openly embraces advanced stats; they have a stat guy on the bench.

    This goes on a lot more in basketball because I’m convinced a lot of mainstream writers don’t actually watch games aside from the few TNT/ESPN televised ones. And unlike baseball it’s not a sequence of isolated events. JJ Barea had a fantastic series because a) he was guarded by the corpse of Mike Bibby and b) Erik Spoelstra was too stubborn to move Dwyane Wade or Mario Chalmers on Barea when Kidd was nothing but a spot-up shooter on offense.

    When someone in pro sports starts using numbers as the only basis for personnel movement, these articles will have validity. And also the robots will have won the war.


  20. Nathan says:

    Bill Simmons is a clown.


    • Sultan of Schwinngg says:

      I have no problem with statements such as that. I do believe that with such a definitive statement you should be required to back it up though. May I see your spreadsheets proving that conclusion?


  21. MikeS says:

    He doesn't need to give an iota of evidence. His whole argument is that some things are important but can't be measured or proven, so there can be no evidence to support their importance. It's a beautiful argument in its way. He can't be proven wrong.


  22. Pierre says:

    A couple of points:
    – I’d suggest that not being an asshole and having playoff experience are, in fact, not very important. Listening to the coach, i.e. doing what you’re told, is pretty important, but mostly because pissing off the manager is a bad idea.
    – who’s an asshole and who doesn’t listen to the coach is just not what a site like this is or should be about. That’s what I expect out of ESPN, etc. Of course, ESPN et al do not deliver, but that’s another story.
    – There's nothing wrong with the advanced stats that have been cited in these comments. The problem arises when folks take them too literally or apply them too broadly. Same as any other stat. As in "A-Rod has 6.3 WAR and King Felix only has 6.1."


  23. Carlo says:

    “And so we end up with teams that are like the worst kind of car. They look good on paper — so much horsepower! — but they fail to satisfy…
    This is largely the fault of sabermetrics”

    Why would anyone need to read beyond this line?


  24. Random Guy says:

    I’m surprised the article didn’t bring up VORP. That’s the stat that all the know-nothings like Murray Chass always mention in their regular “why my head is in the sand” articles, presumably because it sounds all dorky and Star-Trekky.

    Anyway, it’s not particularly fair to compare statistical analyses of baseball and basketball. In basketball there are a ton of things that happen on each and every play that don’t get recorded in the box score — successful picks, good passes that don’t lead directly to a basket, defenders redirecting shots, etc. Even the more interesting statistical analyses seem flawed to my eyes (the seminal one on “streakiness” was based on one team in one season, and a defending champion at that, which is hardly the kind of random sample I would want to use). I’m a lot more receptive to arguments for intangibles and the like in basketball than I would be in baseball.


  25. johng says:

    The claim that those of us not entirely convinced about the value of some or all SABR stats are picking on a strawman is, in itself, a strawman argument.

    Before SABR, there was always a guy who played 2 years of high school ball, and knew everything about the game. He also knew that you were wrong…because…he could hit a curveball. Or that he played on the same field as Buster Posey once. Or something. I find a lot of SABR arguments to be the same hot air.

    That fact that I can’t prove that a guy hitting .083 is worse than a guy hitting .400 using SABR stats, or that I might rely on BA, OBP and SLG – doesn’t invalidate my argument. Many pretend it does.

    I would imagine that is the main problem most people have embracing SABR stats. The know-it-all jerks who pretend they have infinitely better insight because they use WAR and xBABIP.


    • Mcneildon says:

      It kind of seems like you threw your own straw man argument in your comment.

      “That fact that I can’t prove that a guy hitting .083 is worse than a guy hitting .400 using SABR stats, or that I might rely on BA, OBP and SLG – doesn’t invalidate my argument. Many pretend it does.”

      Maybe I misinterpreted what you were saying with that quote.


      • johng says:

        You’re right. Nobody’s ever been mocked and then disregarded on a baseball blog for using BA or ERA to argue a player’s value.


    • Mcneildon says:

      Although I do agree with the idea that many people are turned off by people who use sabermetrics to imply that those with differing opinions feel like idiots. But, I don’t think those people are as numerous as you might think. They just might be a little louder.


      • Mcneildon says:

        correction: not “feel like idiots”

        “are idiots.”


      • johng says:

        Most of the baseball blogosphere speaks in SABR, now. I don’t generally discuss players on my favorite team on blogs that focus on my team, because there’s already a groupthink built on SABR arguments (not stealing bases, not bunting) that becomes the proof in itself. (I believe it, so therefore, it’s better.)

        Unless I want a daylong argument and endless ridicule about using batting average, I just let it drop.


    • waynetolleson says:

      “I would imagine that is the main problem most people have embracing SABR stats. The know-it-all jerks who pretend they have infinitely better insight because they use WAR and xBABIP.”

      That’s exactly it. I mean, I learned in Little League that when I was pitching, it was better to get ahead 0-2 rather than fall behind 3-0. I knew it was better to be up 2-0 in the count at the plate than behind 1-2.

      Somehow, without SABR, I figured out that if I kept the ball down in the strike zone and tried to get batters to hit it on the ground, that was better than pitching up in the zone, and having batters hit fly balls and line drives.

      Most of what SABR does is to quantify common sense. I’m all for learning as much about the game as possible. I could do without the condescending, holier-than-thou attitude. After all, stats explain WHAT HAPPENED. They don’t always tell you HOW THEY HAPPENED.


      • johng says:

        I have no problem with SABR stats and arguments based on them. That’s why I come here. Sometimes, though, it’s pretended that a guy’s collection of stats and a probability based on them (a Strat-o-matic card, if you will) are what’s being sent out to the plate, not the player.


      • CircleChange11 says:

        …. or that the pitchers that K the most, walk the least, and don’t give up homers are often the best pitchers.

        I think sabermetrics did a lot to show Batted Ball Data (BABIP, essentially) and the run value of a walk. A walk isn’t as “good as a hit” in most cases, but it damn near is. I think this is incredibly valuable in evaluating speedsters that “can’t steal first base”.

        I don't think defense is as reliable as we'd like in order to chalk it up as a big sabermetric victory.

        I think batted ball velocity/trajectory/positioning, etc is going to show who can “go get em” and who cannot. Actually, I think it’s going to quantify the differences. We already know pretty much who can and can’t.


    • Mcneildon says:

      I didn’t say that nobody has ever been mocked for using ERA and BA, or that I agree with people who do so. I don’t like people mocking others for any reason. I said the example you used was a straw man in that it presented an exaggerated hypothetical counter argument to your stance. Nobody would tell you that you are invalid for saying somebody with a .083 BA is worse than someone with a .400 BA. It’s almost a certainty that under any sort of examination, a player with a .083 BA is terrible. But by using that as your example of how a statistically inclined person would tell you that you are wrong in your thinking, you are creating a straw man.


    • Eric says:

      Maybe I'm missing the point, but using stats like WAR and xBABIP is a lot better than using, I don't know, BA and ERA. Acting like an asshole and acting holier-than-thou because you use those stats is stupid, but at the same time I try to enlighten the people I'm around that there are much more complete stats to use. If you're going to tell me pitcher A is better than pitcher B because he has a lower ERA, that is an invalid argument, in my eyes, because it leaves so much out.


  26. jesse says:

    Thanks for writing my thoughts as I read that article… I found his examples utterly pointless and amazingly incorrect.


  27. johng says:

    My interpretation of the article isn't that the recent focus on SABR stats makes the game less enjoyable; it's that if you don't really want to speak in SABR stats, there's a shrinking pool of places to discuss baseball online.


  28. Antonio Bananas says:

    I think maybe instead of inventing new stats for the sake of inventing new stats that supposedly work better, we need to evaluate whether they actually do. Like that OPS vs. wOBA study that was posted. That's good stuff.

    I also think a better way of integrating newer stats to old minds would be to use the old stats and make them better. I’d guess that saying “yea he hit .280 but his monthly variance was high due to a very very good June” would be a better way of explaining a guy getting lucky than like BABIP or whatever other stats you’d wanna use.

    I really feel like normal statistical measures like variance and least significant different are under-utilized. throwing weightings into equations based on what you think they represent and plugging them into an equation doesn’t actually make that much sense. I mean it does, but like Einstein said “genius isn’t making simple ideas complex, but making complex ideas simple”. I feel like sabers tend to do the former.
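    A rough sketch of the framing the commenter describes, using invented monthly splits for a hypothetical .280 hitter (all numbers below are made up purely for illustration):

```python
import pandas as pd

# Invented monthly hit/at-bat totals for a hypothetical .280 hitter whose
# season line leans heavily on one hot June.
log = pd.DataFrame({
    "month": ["Apr", "May", "Jun", "Jul", "Aug", "Sep"],
    "hits":  [24, 25, 38, 25, 26, 27],
    "ab":    [98, 97, 100, 96, 99, 100],
})

log["ba"] = log["hits"] / log["ab"]
season_ba = log["hits"].sum() / log["ab"].sum()

print(log[["month", "ba"]].round(3).to_string(index=False))
print(f"season BA: {season_ba:.3f}  monthly std dev: {log['ba'].std():.3f}")
# A .280 season built on a .380 June reads differently from a steady .280
# every month, even though the headline number is identical.
```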

  29. fang2415 says:

    That book’s co-author would be Sheldon Hirsch, not Seymour. (Seymour Hersh is the Pulitzer Prize-winning investigative journalist who broke the My Lai and Abu Ghraib stories.)

  30. Jason says:

    As a quantitative scientist I use statistics all the time in my work. I love the use of statistics and think it holds great promise for evaluating sports. …that said, I am often dismayed by the fact that statistics are only ever used descriptively on this site (and for sports in general as far as I can tell). I have yet to see a hypothesis actually tested.

    The recent article on blue eyes is an example. The hypothesis was that blue-eyed players hit worse during the day. The unstated null-hypothesis is that blue eyes have no effect on hitting. To support the blue-eyed hypothesis you have to be able to reject the null hypothesis that nothing is going on. This involves observing a difference that cannot be explained by chance alone (i.e. a difference with a very low probability of being due to randomness. In science we use arbitrary thresholds of 5%, 1% or 0.1% probability of being due to chance as “significant”.)

    This is a really simple hypothesis to test, yet the author never bothered to do it. Instead he gathered a bunch of descriptive statistics that showed no obvious effect. Some blue-eyed players appeared to hit better during the day and some worse. The author then, without justification, concluded that there was no blue-eyed effect and that the data is better explained by chance. But to actually evaluate the hypothesis we need to calculate the PROBABILITY that the observed variation is due to chance. This is the P-value that you learn about in any introductory statistics class.
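    To make that concrete, a minimal sketch of the kind of significance test being described might look like this, a simple permutation test on made-up day-game batting lines for blue-eyed and non-blue-eyed hitters (the samples and numbers are invented for illustration only):

```python
import numpy as np

# Hypothetical day-game batting averages for blue-eyed and non-blue-eyed
# hitters. These numbers are made up purely to illustrate the test.
blue_eyed  = np.array([0.262, 0.248, 0.275, 0.230, 0.290, 0.255])
brown_eyed = np.array([0.270, 0.281, 0.260, 0.295, 0.274, 0.268])

observed_diff = brown_eyed.mean() - blue_eyed.mean()

# Permutation test: shuffle the eye-color labels many times and count how
# often a gap at least as large as the observed one appears by chance.
rng = np.random.default_rng(0)
pooled = np.concatenate([blue_eyed, brown_eyed])
n_blue = len(blue_eyed)
n_iter = 10_000
count = 0
for _ in range(n_iter):
    shuffled = rng.permutation(pooled)
    diff = shuffled[n_blue:].mean() - shuffled[:n_blue].mean()
    if abs(diff) >= abs(observed_diff):
        count += 1

p_value = count / n_iter
print(f"observed diff: {observed_diff:.3f}  p-value: {p_value:.3f}")
# If the p-value sits above the usual 0.05 cutoff, we fail to reject the
# null hypothesis that eye color has no effect on day-game hitting.
```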

    The great irony to me is that the SABR community is basically the statistics equivalent of the old school baseball scouts that are so derided on this site. You guys basically collect all the data, line it up and declare that X is greater than Y. You just eyeball the data and say, “it looks a lot bigger” or “there doesn’t seem to be any difference”. You’ve got all the data! Just take the next step and calculate if the differences that you observe are actually significant! Slap a probability on it so that we know whether to believe you or not!

    I think the next trend in sports evaluation is proper hypothesis testing, not improved statistics. It will make your statistics meaningful. But until then it’s nothing more than the numerical equivalent of “the ball seems to jump off his bat”.

    • CircleChange11 says:

      I agree with what Jason is saying.

      I also think a lot of this attitude and commentary is blowback from how the saber community uses statistics to attempt to show that just about everyone, from players to coaches to GMs to fans, is stupid.

      Seriously, the other day, American League regular season data regarding bunting was used to show that college coaches were bunting too much in an elimination-style short series playoff setting in a drastically decreased run environment. No further commentary is required.

      Much of the analysis done with sabermetrics is simply the same application method the rest of the baseball world uses, only traditional metrics have been replaced by advanced metrics. We restate the problem/situation with different metrics. A lot of the time, that’s where it ends.

      I think the next big thing, as far as sabermetric fans go, is using advanced metrics for specific situations. We all understand “regression” and things of that nature, and there’s no challenge in that. I think fans, in the near future, will be looking at situations like IBBs and being able to calculate whether the move was a good idea based on the individuals in the specific situation, rather than just taking average league-wide data and applying it to every situation. Same deal with relievers, etc.

    • Paul says:

      I think this is such a great comment, but I’m skeptical that a lot of people in this community actually want to take the next step, because, as you know, in real science it can be difficult to see a significant effect, and even then, in many cases like field experiments in agriculture, one must acknowledge confounders; you can’t just make blanket declarations with which to bludgeon your opposition (alluding somewhat to Circle’s first paragraph).

      For me an even bigger problem is the seeming unwillingness to use observation on this site to reinforce the stats. The Joakim Soria article from the other day was great, but it was two weeks late on the cause of his woes. Should we just believe everything that a local beat writer says about a player whose performance has changed in response to a specific adjustment? No. But shouldn’t it at least be referenced? What about before-and-after video? I have urged writers here to link to MLB video, and to their credit, some of them do it routinely. Too many times I read articles on this site that use graphs and stats to come to a conclusion that is so obvious from mere observation, even to a casual fan, that it undermines the purpose. It makes it appear that this is just playland for people with nothing better to do, while simultaneously claiming that their methods are rigorous and beyond reproach. It’s unfortunate, and as usual when you have a high degree of conformity I think it’s cultural. For that I appreciate the effort by Jonah Lehrer, even though it was terribly, terribly wrong.

  31. Matt says:

    When I was first getting interested in sabermetrics, my friend summed this article up very succinctly: “Don’t think for a second that these sabermetricians aren’t watching the game!!”

    • CircleChange11 says:

      I look at it from the opposite point of view. For all these guys that are “watching the games”: [1] just how many games do they watch? and [2] do they really know what they’re watching? (from a mechanical, cause/effect, etc. point of view)

      You can watch a lot of games (or look at a lot of numbers) and still not know poop from chocolate pudding.

      The tough part is finding guys that “know baseball” but also “know math/stats”. In other words, knowing what numbers they need to have, and how to apply them to baseball situations.

      When we watch the games, we all watch them because we love the game, and we enjoy it. We’re not watching the games as if we’re an advanced scout or if we’re collecting observational data to be analyzed at a later point in time.

  32. Shaun says:

    It seems that when people rail against stats, it’s really about railing against the past as a reliable indicator of what is likely to happen in the future, probably because the past isn’t indicating the likely future that they themselves want.

    These people seem to be uncomfortable with randomness, but they are also uncomfortable with the past predicting the future. They would rather the future depend on willpower, or something besides measurable aspects of life, or something besides randomness.

  33. I think the game of baseball can’t be fully appreciated without statistical analysis. Having said that, many a statistician has said about a given analysis: if it looks like shit, it probably is. It’s so often said that many call it the first rule of statistics. The more-than-occasional 10-paragraph tome on the meaning of a regression analysis with an r-squared of 0.15 does provoke a smile.

    What would be interesting is to see some of the tools of time series analysis used in an interpretive manner to look at events in real time. Predicting the ups and downs of a hitter’s BA during situational episodes could be really interesting. The same could be done with pitching, I would suppose. It’s always interesting to see a play-by-play guy report on somebody’s ‘hot bat’ or ‘arm’, using results in the recent past as opposed to the whole season or history.

    There is a lot of really interesting real-time data about time- or interval-related parameters that would be fascinating to investigate.
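    As a small sketch of what that might look like, here is a hypothetical 20-game log with a 7-game rolling rate standing in for fancier time-series tools (all numbers invented for illustration):

```python
import pandas as pd

# Hypothetical hits and at-bats over a 20-game stretch (made-up numbers).
games = pd.DataFrame({
    "hits": [1, 0, 2, 1, 0, 0, 1, 3, 2, 1, 0, 1, 2, 2, 0, 1, 3, 1, 2, 2],
    "ab":   [4, 4, 5, 4, 3, 4, 4, 5, 4, 4, 3, 4, 5, 4, 4, 4, 5, 4, 4, 5],
})

# "Hot bat" as a 7-game rolling average versus the season-to-date rate.
rolling_ba = games["hits"].rolling(7).sum() / games["ab"].rolling(7).sum()
season_ba  = games["hits"].cumsum() / games["ab"].cumsum()

recent = pd.DataFrame({"rolling_7": rolling_ba, "season_to_date": season_ba})
print(recent.round(3).tail())
# When the rolling line pulls well away from the season-to-date line, that
# is the "hot bat" a broadcaster is describing; a proper time-series model
# would then ask whether that gap is signal or just noise.
```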

  34. Linus says:

    Ignoring Jonah Lehrer’s point and decrying it as criticism of a strawman is silly. The issue with sabermetrics is NOT that it is less accurate, or that sabermetricians are misguided. The issue isn’t even that numbers are unreliable and that our guts should be trusted on non-quantifiable factors.

    As stated, the issue is that when it comes to complex interactions (and hitting a round ball traveling with heavy spin at 90+ mph with a round bat involves a ton of complex interactions), the temptation to start looking only at EASILY quantifiable numbers is very high. Lehrer’s article is not a call to ignore sabermetrics; it is simply a reasonable warning not to overestimate its power.

    And, seriously, if sabermetricians can’t see that trying to reduce a pitcher’s value to 3 numbers epitomizes this point, then they are the ones with a stunted world view. This is not to say there is no value in reducing complex interactions to as few variables as possible. FIP is a better number than ERA, so is xFIP, and WARP is a better tool than BA. However, every GM should be fired if the analysis stopped there. (And no, I don’t think sabermetricians think it should, but there is a definite “even if we are wrong, everyone else is still wronger” streak.)

    The point that Jonah Lehrer is making is that overemphasis on reductive analysis will cause poor results. Complex interactions are one reason that computer simulations for pharmacology tests still cannot replace animal testing. They are a good reason why weather analysis, while light years ahead of 50 years ago, will still get caught flat-footed at times. The notion that baseball or sports is somehow immune to this complexity is insane, and quite frankly arrogant.

    And while you can quibble with his use of the Mavs as an example, the reality is that there are LOTS of cases where sabermetric analysis has proven no less reliable than ole ScoutyMcScout testing the wind with his thumb.

    If you really want to consider sabermetrics as a rigorous analytic tool, then seriously don’t be so damned sensitive to critiques. The proof as they say is in the pudding.

    • Erik says:

      Unfortunately, much of the article comes off as “look what intangible ‘virtues’ CAN do” as opposed to “look what quantitative analysis CAN’T do.” Had he focused on the latter, as you do in your comment, the article might have been more compelling to a statistically-inclined reader (like me).

      And of course there aren’t any examples of either, so not only do readers take issue with the apparent thesis, they take issue with the writing.

    • I’m not sure confidence is the word I’d use; smug, from time to time, might be. But in their favor, it is the intellectual youth of their efforts that is on exhibit. Just as in that first physics course, where instructors everywhere assume no friction and a round horse from time to time, it isn’t meant as a serious exploration of a horse race, but a first step. Today, sabermetrics is at least at the four-legs stage of development, capable of evaluating aggregated parameters at the risk of sometimes making the obvious abundantly clear to a knowledgeable audience. However, to many folks it’s a shortcut to baseball insight and interest, so I’m all for that. In time, as the proponents of statistical analysis start to look forward at real-time prediction, it will necessarily get more sophisticated in what its products mean, beyond its current baseball-card-like applications. Look at how scouts are using some statistical methods to look at swing decomposition for pitchers to better deal with MLB hitters. It’s going to be an evolution I, for one, will follow with great interest.

      When sabermetrics can convincingly figure out how the Giants are leading the NL West with a negative run differential, I’ll be impressed.

  35. Linus says:

    To put it more succinctly: while every sabermetrician will acknowledge that complex variables play a role in the results (i.e., Joe’s point about pitching a fat ball up the middle), sabermetrics’ requirement of quantifiable analysis will necessarily eliminate those variables from the analysis. The danger is that while we can all “appreciate” intangibles in the theoretical sense, when it comes to decision making, if we have a “numbers”-based approach vs. a non-numbers approach, the bias is definitely there to take the “concrete” approach.

    We may think we are being fair; we may even say, “there are variables we can’t measure” as we make the decision. But that doesn’t change the fact that the decision-making process is skewed toward variables that are easily measurable, but perhaps not as relevant to the outcome as we hoped.

  36. Linus says:

    Okay, sorry, this is my last attempt to make a coherent argument.

    Many people seem to be reacting to Lehrer’s argument as if he is somehow criticizing statistics as an accurate measuring tool. That is not the point. The point is that when we use numbers to measure something, whatever it is, we tend to overestimate their predictive power when we make a decision.

    It has been said over and over that sabermetricians realize they can’t measure everything that is relevant to whether a hitter will hit a ball, or a player will contribute to a basketball game, or whether a particular play will be successful in football.

    The point of these articles is not to say, “trust your gut, not the numbers; they are wrong.” Rather, if it is acknowledged that metric analysis leaves out important and relevant variables (a point conceded by every sabermetrician), then there is a problem if decision making becomes overly reliant on that type of analysis.

    Put it this way: these articles could have been written about baseball when BA, RBIs and ERAs were being used, and everyone here would have agreed that reliance on those numbers was shortsighted. So imagine a world 25 years from now where we realize that FIP, or xFIP, or WAR, or RC+, etc. were similarly flawed. The problem with the evaluation today vs. 25 years from now isn’t just that our metrics are poorer, but rather the cognitive bias of relying on metrics WHILE ACKNOWLEDGING that there was an entire bucket of missing variables we didn’t plug in.

    While sabermetricians can say ad nauseam that we shouldn’t fire scouts, it is a bit disingenuous to say that they truly acknowledge the limitations of their analysis. The fact of the matter is, there is a cognitive bias to go to the numbers rather than not. As humans we always want to hear percentages, and those percentages can cause a whole big mess. The best example I can think of is the complex and rigorous tools used by analytics departments at big banks. Does any sabermetrician think they are more rigorous than the quant guys at Lehman Brothers or Merrill Lynch?

    This does not mean the quest to find better metrics is wrong, or even fruitless; it is still hugely important. But just realize that until you get to an r^2 value on predictability over .75, trusting only the current state of metric analysis is going to lead to skewed results. That isn’t to say go ahead and use astrology or palm reading, but there is a lesson in Gladwell’s Blink.

    • Interesting, and I agree, mostly. I would add to your ‘variables’ point that most of the quantifiable variables currently used are an approximation of the typical ‘baseball card’ mentality of classical baseball gurus. It’s when these guys finally move into continuously measurable components, or more dynamic categorical variables, that real performance description and, more importantly, improvement can be obtained. Unfortunately, those numbers aren’t accessible to the everyday fan, but the results will be interesting. As a perhaps poor example: how do you relate the hand position and body rotation of Tim Lincecum to his control and fastball velocity? Add to that the differences between windup and stretch values of those parameters, and the ‘models’ that come out would be both interesting and useful.

    • Pierre says:

      The point about cognitive bias is reasonable, if not exactly earth-shattering. And the author says that this was his sole point (although if you read the article, this is hard to believe). But:
      1- sports isn’t a good example of this phenomenon. I really don’t think anybody needs to worry about their team becoming too fixated on statistics.
      2- to blame sabermetrics is dumb. If a company is too focused on accounting metrics, is this the fault of the accountants?

      I think what pissed people off about this article is the quasi-intellectual patina the guy puts on the same old lazy, ignorant arguments. It has the feel of somebody whipping off an article about something he doesn’t know or care much about. Personally, I’d rather read Dan Shaughnessy making snide comments about the “sports as math homework crowd”. At least that’s honest.

  37. Kevin says:

    I think I can give an example of an instance where an intangible helped very much to determine the outcome of the World Series. In 2006, my beloved Tigers went to the WS and were very much favored to win against the Cardinals, who had barely made the playoffs. There were several occasions when the Tigers lost leads they would never recover from because of pitchers committing throwing errors with two outs. I specifically recall that, after game five, Justin Verlander said that, when he made the error that would lead to them losing the game, he was basically telling himself, “Okay, just throw it to third base nice and easy, and whatever you do, don’t make another humiliating error on a routine play and cost the team the game,” which he then proceeded to do.

    I wouldn’t say Sabermetrics isn’t useful, but that play cost the Tigers the championship, and it was caused by something statistics couldn’t account for.
