The Sabermetric Library

Over the weekend, in a thread over at Tango’s blog, the idea of a “Sabermetric Library” was raised. As noted over there, one of the positives of the academic journal process is to catalog the work that has been done, making it easily searchable for future readers who are not following the discussion in real time. The statistical analysis crowd doesn’t have that kind of formal structure, which makes it difficult for those who come later to catch up on what has already been done.

Rather than employing a full-time “librarian” to keep up with the most recent work, I thought perhaps we could just attempt to crowd-source this idea. So, that’s what we’ll attempt to do in this thread.

In the comments below, I’d like to encourage you to think back to influential articles that you’ve read about the game, and if you can, link to them. If they were written in a book, link to it at a bookseller of your choice that carries it. If you can quickly summarize the conclusion, even better.

It doesn’t have to be an epic research piece that changed the face of analysis (such as Voros’ piece on DIPS), though those obviously fit in here, too. But if there is a blog post somewhere that explained something in a way that allowed you to understand it for the first time, link to that. If there was an interesting discussion on a popular topic (Blyleven for the HOF, maybe), then link to that.

The goal would be to populate the comments with enough resources to allow someone to go through and read a Best Of The Sabermetric Community collection of writings. There are a lot of good writers out there doing good work, but given the size of the internet, some of it can get lost in the shuffle. Let’s preserve the pieces that deserve to be kept alive, and at the same time, create a resource for those who come along in the future to find out about the work that has already been done.

In order to keep the layout easy to read, I would ask that you refrain from commentary about this post. Please limit comments to the format of linking to important pieces, with necessary comment about that piece as an abstract of sorts. If this takes off as I hope it does, we’ll do a discussion thread on another day about potentially culling the list, giving space for people to argue for or against any of the linked pieces below.



Dave is a co-founder of USSMariner.com and contributes to the Wall Street Journal.

47 Responses to “The Sabermetric Library”

  1. Big Oil says:

    Happy to contribute what little I really know:

    THT’s excellent xBABIP calculator, which gives you an expected batting average on balls in play value. Using this, you can calculate the projected line of an individual player (xAvg/xOBP/xSLG/xOPS) given their expected batting average on balls in play. It is a strong predictor of future performance of a player.

    http://www.hardballtimes.com/main/fantasy/article/simple-xbabip-calculator/

  2. Joe R says:

    Even though it’s not the thought out version, I think Boswell’s “Total Average” You can always buy the book if you want to read some early “common sense” thought: http://www.amazon.com/Imitates-World-Penguin-sports-library/dp/0140064699

    Sad that Boswell ended up writing stuff like this, but it shouldn’t diminish his contributions in the ’70s and ’80s.

  3. CA says:

    Not to be a sycophant, but I think Dave’s USS Mariner post about evaluating pitcher talent is a solid resource, especially for folks looking for something at a more introductory level.
    http://ussmariner.com/2006/08/29/evaluating-pitcher-talent/

  4. The A Team says:

    Nate Silver’s “Is Alex Rodriguez Overpaid?” from Baseball Between the Numbers is an excellent introduction to revenue curves and the whole “teams pay for wins” concept. Without this kind of work, the WAR framework loses a bit of its oomph.

    http://books.google.com/books?id=uxdvwQdXbboC&pg=PA174&lpg=PA174&dq=Nate+Silver+A-Rod+overpaid&source=bl&ots=JAx3Ze6J9b&sig=IYKdB4cEn8gV3CvtU9xtp2LFsl8&hl=en&ei=S85dS9z5J43INc3t0foO&sa=X&oi=book_result&ct=result&resnum=2&ved=0CAoQ6AEwAQ#v=onepage&q=&f=false

  5. Ryan D says:

    The whole Baseball Between the Numbers book should be a starting point IMO.

  6. Dan says:

    While not in itself a sophisticated piece of sabermetric research, I’ve often forwarded this Joe Posnanski post to my friends who are stuck in AVG/HR/RBI lockdown. It’s a great explanation of why those big 3 numbers are flawed and makes people more receptive to more advanced metrics.

    http://joeposnanski.com/JoeBlog/2008/11/20/batting-average-home-runs-rbis/

  7. snapper says:

    My first intro to sabermetric thought, and I’m sure that’s true for thousands of others.

    The Hidden Game of Baseball, by Thorn and Palmer

    http://www.amazon.com/Hidden-Game-Baseball-John-Thorn/dp/0385182848

  8. Tree says:

    I’m always looking back at these two articles when I’m thinking about UZR. They go through most details on how basic UZR is calculated and then how the various corrections are applied.

    http://www.baseballthinkfactory.org/files/primate_studies/discussion/lichtman_2003-03-14_0/
    http://www.baseballthinkfactory.org/files/primate_studies/discussion/lichtman_2003-03-21_0/

  9. yossarian says:

    Not a monumental piece, but I was always impressed with Josh Kalk’s work with Pitch F/X on Hardball Times. Absolutely clear analysis, with plots that can teach you a ton about baseball in a statistics-based way, which is the whole point of sabermetrics.

    I couldn’t find a great overview of everything, but I liked his pieces on Greinke and Hughes from ’08.

    http://www.hardballtimes.com/main/article/anatomy-of-a-player-zach-greinke/

    http://www.hardballtimes.com/main/article/anatomy-of-a-player-phil-hughes/

  10. philosofool says:

    Pretty much everything that Tango has linked under “Research” on his main page is great. I especially like the baseruns stuff. It’s mathematically weighty, but it’s also very good.

  11. Luke Appling says:

    Beyond the Boxscore’s Sabermetric Writing Awards are a good compilation of recent articles and a jump-start to the library.

    http://www.beyondtheboxscore.com/2010/1/18/1253835/btb-sabermetric-writing-awards

  12. scstrato says:

    Out of curiosity, why not incorporate a FanGraphs wiki into the website? Should be easy enough to accomplish and would give you, as well as us fans, the ability to update content as needed. MediaWiki is a fairly good one and there are others depending upon your needs.

    Just a thought.

  13. Jamal G. says:

    Written by “Kincaid” of 3-D Baseball, this two-part piece takes a look at how to evaluate pitchers using FIP, and how the stat actually regresses balls in play:

    Part I: http://bit.ly/1hVsPB
    Part II: http://bit.ly/56d6Hd

  14. The Hit Dog says:

    A somewhat obscure piece that, though not widely publicized, still might have had an impact on a modest slice of the sabermetric community:

    http://en.wikipedia.org/wiki/Moneyball

    • Tom Jakubowski says:

      What’s funny is that Moneyball is often thrust up as the “Sabermetrician’s Bible” by detractors, but it doesn’t even explore advanced statistics all that much. It’s much more about the behind-the-scenes of front office and their exploitation of market inefficiencies (which, at the time, was as simple as “patient hitters are undervalued” — need no stat more advanced than BB%) than anything else.

  15. Mo Wang says:

    Here’s an in-depth attempt at disproving the “pitching to the score” argument that Jon Heyman types use to push Jack Morris for HOF:

  16. AKMA says:

    No one should utter a syllable about the historical archives of sabermetrics without naming Earnshaw Cook’s Percentage Baseball, then pausing for a moment of silence (if only to sigh quietly about the awkward first step taken by this baby of which we’re so proud now).

    I had been poking uninformedly at ideas about Markov chains when I first read Mark Pankin’s article in the Great American Stat book.

  17. chris says:

    I have one for the people who are relatively new to sabermetrics. Alex Remington is an author for Yahoo Sports and over the past few months has been writing articles explaining the workings of certain stats such as BABIP, OPS+, FIP, wOBA, WPA, WAR, UZR, J-HOFFA, and Win Shares, with more on the way. He explains what each stat means, how it’s calculated, what it’s good for, what it’s bad for, and why we should care about it. Check it out, spread it around.

    http://sports.yahoo.com/mlb/blog/big_league_stew?author=Alex+Remington

    • Bryz says:

      After we were just told, “It doesn’t have to be an epic research piece that changed the face of analysis (such as Voros’ piece on DIPS)…”

  18. Bradsbeard says:

    I learned everything I know about stats from FanGraphs, but this is a nice intro to linear weights in an ongoing series by Shawn Goldman over at Bleed Cubbie Blue:

    http://www.bleedcubbieblue.com/2010/1/17/1255925/uzr-error-fail-or-win-a-lesson-on

  19. Rusty says:

    As a longtime fan of the game, I am only now beginning to understand the importance of this stuff. This particular post by Rory Paap of Paapfly.com helped me understand how luck and BABIP affect ERA and thus can create a huge difference between ERA and FIP.

    http://www.paapfly.com/2009/12/affeldt-stars-aligned-in-2009.html

    http://www.PaapFly.com is a great blog for everyone, statheads to average baseball fans.

  20. craigtyle says:

    Great, straightforward article on batting stats:

    http://www.hardballtimes.com/main/article/how-to-evaluate-hitters/

  21. Josh Fisher says:

    I’ve always thought that FJM’s glossary is an interesting place to start someone on key principles (like worthlessness of pitching wins and the like). The humor makes the concepts very accessible.

    http://www.firejoemorgan.com/2005/04/glossary-of-terms.html

  22. Rockfish says:

    My brother has been teaching me a bunch about the stat analysis trend in baseball. He uses some of the new math to break down the hometown SF Giants. Here are a couple of examples:

    http://www.paapfly.com/2009/12/can-buster-fill-bengies-shoes.html

    http://www.paapfly.com/2010/01/moneyball-and-beane-are-evolving.html

    http://www.paapfly.com/2009/12/affeldt-stars-aligned-in-2009.html

  23. TJ says:

    I felt a bit out of the loop when I first discovered the wonderful world of intelligent baseball analysis last year, and one of the first places I went to was “The Book”. In my opinion “The Book” has been the best help to me. However, a sabermetrics wiki may prove to be a very useful educational tool.

  24. William says:

    Okay, so believe it or not this just came out today, but I am sure these will become fundamental metrics. Their descriptions are short and clear, and the justification for their development helps further define the metrics they derive from.

    Basically, they nuance our understanding of pitching and luck by creating x-versions (e.g. xBABIP) to account for things really in pitchers’ control.

    http://www.hardballtimes.com/main/fantasy/article/introducing-xw-xbabip-xlob-xhr-fb-and-more/

  25. Breadbaker says:

    Alan Schwarz’s The Numbers Game is a good history of baseball statistical analysis for those who think it began with Bill James musing in a boiler room in Lawrence, Kansas.

    http://www.amazon.com/Numbers-Game-Baseballs-Fascination-Statistics/dp/0312322232/ref=sr_1_14?ie=UTF8&s=books&qid=1264480911&sr=8-14

  26. TheSinators says:

    I’m still new to the sabermetric community (this is my virgin post). I still have a lot to learn, but I really enjoyed learning about tRA and wOBA, which, to my naive mind, sure seem like the best pitching and hitting stats around (of course, I don’t know about tRA* and tRA# and tRA~ and all the offspring of tRA). I hope it’s okay to post a couple.

    Graham MacAree’s post explaining tRA:
    http://www.lookoutlanding.com/2008/6/23/557089/the-big-tra-post

    wOBA, explained briefly:
    http://www.insidethebook.com/woba.shtml

    wOBA, explained more thoroughly:
    http://www.insidethebook.com/ee/index.php/site/comments/the_history_of_the_woba_part_1/

    Infield defense:
    http://www.hardballtimes.com/main/article/infield-defense-mdash-back-to-basics/

    Great idea for a post. This gives me lots of homework to read up on!

  27. Jake in Columbus says:

    http://www.askrotoman.com/fbguide/samediff.pdf
    My first exposure to a more sabermetric approach which I felt gave me an edge in ranking pitchers for my first ever fantasy baseball season.

    http://www.askrotoman.com/wordpress/?p=1033
    The following year’s post per my request.

  28. Jonah Keri says:

    The old “Baseball Prospectus Basics” series, which ran in 2004, still holds up pretty well today. Great discussion of a wide range of topics, by several noted analysts, including Woolner, Silver, Click and others:

    http://www.baseballprospectus.com/news/index.php?column=31

    That series was, to some extent, a take-off on Woolner’s “Baseball’s Hilbert Problems” work, which first came out a full decade ago in BP2000, then was updated for the site in 2004:

    http://www.baseballprospectus.com/article.php?articleid=2551

    Also, two great (older) pieces on how to fix revenue sharing:

    The Zumsteg Plan:
    http://www.baseballprospectus.com/article.php?articleid=1599

    Keith Woolner’s take:
    http://www.baseballprospectus.com/news/20020418woolner.shtml

  29. Swarley says:

    I highly recommend these two pieces by FanGraphs author Mitchel Lichtman, as he tries to apply game theory to decision making in baseball. Obviously this is not straight on sabermetric work, but I think he’s on to something.

    http://www.fangraphs.com/blogs/index.php/were-the-yankee-sac-bunts-in-the-8th-inning-correct

    http://www.fangraphs.com/blogs/index.php/should-lidge-have-thrown-more-sliders

  30. jinaz says:

    This is, in a sense, something of what I was trying to compile here:
    http://www.beyondtheboxscore.com/2009/12/17/1200459/want-to-help-me-plan-my-baseball
    Focus might be slightly different, but there are a lot of good links there as well as good stuff submitted in the comments.
    -j
