The Sabermetric Library

Over the weekend, in a thread over at Tango’s blog, the idea of a “Sabermetric Library” was raised. As noted over there, one of the positives of the academic journal process is to catalog the work that has been done, making it easily searchable for future readers who are not following the discussion in real time. The statistical analysis crowd doesn’t have that kind of formal structure, which makes it difficult for those who come later to catch up on what has already been done.

Rather than employing a full-time “librarian” to keep up with the most recent work, I thought perhaps we could just attempt to crowd-source this idea. So, that’s what we’ll attempt to do in this thread.

In the comments below, I’d like to encourage you to think back to influential articles that you’ve read about the game, and if you can, link to them. If they were written in a book, link to it at a bookseller of your choice that carries it. If you can quickly summarize the conclusion, even better.

It doesn’t have to be an epic research piece that changed the face of analysis (such as Voros’ piece on DIPS), though those obviously fit in here, too. But if there is a blog post somewhere that explained something in a way that allowed you to understand it for the first time, link to that. If there was an interesting discussion on a popular topic (Blyleven for the HOF, maybe), then link to that.

The goal would be to populate the comments with enough resources to allow someone to go through and read a Best Of The Sabermetric Community collection of writings. There are a lot of good writers out there doing good work, but given the size of the internet, some of it can get lost in the shuffle. Let’s preserve the pieces that deserve to be kept alive, and at the same time, create a resource for those who come along in the future to find out about the work that has already been done.

In order to keep the layout easy to read, I would ask that you refrain from commentary about this post. Please limit comments to the format of linking to important pieces, with necessary comment about that piece as an abstract of sorts. If this takes off as I hope it does, we’ll do a discussion thread on another day about potentially culling the list, giving space for people to argue for or against any of the linked pieces below.






Dave is the Managing Editor of FanGraphs.


Big Oil
Guest
Big Oil
6 years 5 months ago

Happy to contribute what little I really know:

THT’s excellent xBABIP calculator, which gives you an expected batting average on balls in play value. Using this, you can calculate the projected line of an individual player (xAvg/xOBP/xSLG/xOPS) given their expected batting average on balls in play. It is a strong predictor of future performance of a player.

http://www.hardballtimes.com/main/fantasy/article/simple-xbabip-calculator/

Bradley Woodrum
Member
Member
6 years 5 months ago

For those without Excel (such as myself), the blog Cubs Stats made Chris Dutton’s xBABIP Quick Calculator available on Google Docs:

http://cubsstats.blogspot.com/2010/01/chris-duttons-xbabip-quick-calculator.html

Joe R
Guest
Joe R
6 years 5 months ago

Even though it’s not the thought-out version, I think Boswell’s “Total Average” belongs here. You can always buy the book if you want to read some early “common sense” thought: http://www.amazon.com/Imitates-World-Penguin-sports-library/dp/0140064699

Sad that Boswell ended up writing stuff like this, but it shouldn’t diminish his contributions in the 70’s and 80’s.

CA
Guest
CA
6 years 5 months ago

Not to be a sycophant, but I think Dave’s USS Mariner post about evaluating pitcher talent is a solid resource, especially for folks looking for something at a more introductory level.
http://ussmariner.com/2006/08/29/evaluating-pitcher-talent/

Brad Johnson
Member
Member
6 years 5 months ago

Nate Silver’s “Is Alex Rodriguez Overpaid” from Baseball Between the Numbers is an excellent introduction to revenue curves and the whole “teams pay for wins” concept. Without this kind of work, the WAR framework loses a bit of its oomph.

http://books.google.com/books?id=uxdvwQdXbboC&pg=PA174&lpg=PA174&dq=Nate+Silver+A-Rod+overpaid&source=bl&ots=JAx3Ze6J9b&sig=IYKdB4cEn8gV3CvtU9xtp2LFsl8&hl=en&ei=S85dS9z5J43INc3t0foO&sa=X&oi=book_result&ct=result&resnum=2&ved=0CAoQ6AEwAQ#v=onepage&q=&f=false

Brad Johnson
Member
Member
6 years 5 months ago

I forgot to mention that the framework has been updated considerably; this is just a nice, well-presented jumping-off point.

Bryz
Guest
6 years 5 months ago

Hell, the entire book is pretty good.

Ryan D
Guest
Ryan D
6 years 5 months ago

The whole Baseball Between the Numbers book should be a starting point IMO.

Dan
Guest
Dan
6 years 5 months ago

While not in itself a sophisticated piece of sabermetric research, I’ve often forwarded this Joe Posnanski post to my friends who are stuck in AVG/HR/RBI lockdown. It’s a great explanation of why those big 3 numbers are flawed, and it makes people more receptive to more advanced metrics.

http://joeposnanski.com/JoeBlog/2008/11/20/batting-average-home-runs-rbis/

snapper
Guest
snapper
6 years 5 months ago

My first intro to sabermetric thought, and I’m sure that’s true for thousands of others.

The Hidden Game of Baseball, by Thorn and Palmer

http://www.amazon.com/Hidden-Game-Baseball-John-Thorn/dp/0385182848

Tree
Guest
Tree
6 years 5 months ago

I’m always looking back at these two articles when I’m thinking about UZR. They go through most details on how basic UZR is calculated and then how the various corrections are applied.

http://www.baseballthinkfactory.org/files/primate_studies/discussion/lichtman_2003-03-14_0/
http://www.baseballthinkfactory.org/files/primate_studies/discussion/lichtman_2003-03-21_0/

yossarian
Guest
yossarian
6 years 5 months ago

Not a monumental piece, but I was always impressed with Josh Kalk’s work with Pitch F/X on Hardball Times. Absolutely clear analysis, with plots that can teach you a ton about baseball in a statistics-based way, which is the whole point of sabermetrics.

I couldn’t find a great overview of everything, but I liked his pieces on Greinke and Hughes from ’08.

http://www.hardballtimes.com/main/article/anatomy-of-a-player-zach-greinke/

http://www.hardballtimes.com/main/article/anatomy-of-a-player-phil-hughes/

philosofool
Member
Member
philosofool
6 years 5 months ago

Pretty much everything that Tango has linked under “Research” on his main page is great. I especially like the baseruns stuff. It’s mathematically weighty, but it’s also very good.

Luke I am your Father
Member
Luke I am your Father
6 years 5 months ago

Beyond the Boxscore’s Sabermetric Writing Awards are a good compilation of recent articles and a jump-start to the library.

http://www.beyondtheboxscore.com/2010/1/18/1253835/btb-sabermetric-writing-awards

scstrato
Guest
scstrato
6 years 5 months ago

Out of curiosity, why not incorporate a FanGraphs wiki into the website? Should be easy enough to accomplish and would give you, as well as us fans, the ability to update content as needed. MediaWiki is a fairly good one and there are others depending upon your needs.

Just a thought.

Jamal G.
Guest
6 years 5 months ago

Written by “Kincaid” of 3-D Baseball, this two-part piece takes a look at how to evaluate pitchers using FIP, and how the stat actually regresses balls in play:

Part I: http://bit.ly/1hVsPB
Part II: http://bit.ly/56d6Hd

The Hit Dog
Guest
The Hit Dog
6 years 5 months ago

A somewhat obscure piece that, though not widely publicized, still might have had an impact on a modest slice of the sabermetric community:

http://en.wikipedia.org/wiki/Moneyball

Tom Jakubowski
Guest
Tom Jakubowski
6 years 5 months ago

What’s funny is that Moneyball is often held up as the “Sabermetrician’s Bible” by detractors, but it doesn’t even explore advanced statistics all that much. It’s much more about the behind-the-scenes workings of a front office and its exploitation of market inefficiencies (which, at the time, was as simple as “patient hitters are undervalued”; you need no stat more advanced than BB%) than anything else.

Mo Wang
Guest
Mo Wang
6 years 5 months ago

Here’s an in-depth attempt at disproving the “pitching to the score” argument that Jon Heyman types use to push Jack Morris for HOF:

AKMA
Guest
6 years 5 months ago

No one should utter a syllable about the historical archives of sabermetrics without naming Earnshaw Cook’s Percentage Baseball, then pausing for a moment of silence (if only to sigh quietly about the awkward first step taken by this baby of which we’re so proud now).

I had been poking uninformedly at ideas about Markov chains when I first read Mark Pankin’s article in the Great American Stat book ().

AKMA
Guest
6 years 5 months ago

Hmm, the link didn’t show up:

http://www.pankin.com/markov/intro.htm

chris
Guest
chris
6 years 5 months ago

I have one for the people who are relatively new to sabermetrics. Alex Remington is an author for Yahoo Sports and over the past few months has been writing articles explaining the workings of certain stats such as BABIP, OPS+, FIP, wOBA, WPA, WAR, UZR, J-HOFFA, and Win Shares, with more on the way. He explains what each stat means, how it’s calculated, what it’s good for, what it’s bad for, and why we should care about it. Check it out, spread it around.

http://sports.yahoo.com/mlb/blog/big_league_stew?author=Alex+Remington

Bryz
Guest
6 years 5 months ago

After we were just told, “It doesn’t have to be an epic research piece that changed the face of analysis (such as Voros’ piece on DIPS)…”

Bradsbeard
Guest
Bradsbeard
6 years 5 months ago

I learned everything I know about stats from Fangraphs, but this is a nice intro to linear weights in an ongoing series by Shawn Goldman over at Bleed Cubbie Blue:

http://www.bleedcubbieblue.com/2010/1/17/1255925/uzr-error-fail-or-win-a-lesson-on

Rusty
Guest
Rusty
6 years 5 months ago

As a long-time fan of the game, I am only now beginning to understand the importance of this stuff. This particular post by Rory Paap of Paapfly.com helped me understand how luck and BABIP affect ERA and thus can create a huge difference between ERA and FIP.

http://www.paapfly.com/2009/12/affeldt-stars-aligned-in-2009.html

http://www.PaapFly.com is a great blog for everyone, statheads to average baseball fans.

craigtyle
Member
Member
craigtyle
6 years 5 months ago

Great, straightforward article on batting stats:

http://www.hardballtimes.com/main/article/how-to-evaluate-hitters/

Josh Fisher
Guest
6 years 5 months ago

I’ve always thought that FJM’s glossary is an interesting place to start someone on key principles (like worthlessness of pitching wins and the like). The humor makes the concepts very accessible.

http://www.firejoemorgan.com/2005/04/glossary-of-terms.html

Rockfish
Guest
Rockfish
6 years 5 months ago

My brother has been teaching me a bunch about the stat analysis trend in baseball. He uses some of the new math to break down the hometown SF Giants. Here are a couple of examples:

http://www.paapfly.com/2009/12/can-buster-fill-bengies-shoes.html

http://www.paapfly.com/2010/01/moneyball-and-beane-are-evolving.html

http://www.paapfly.com/2009/12/affeldt-stars-aligned-in-2009.html

TJ
Guest
TJ
6 years 5 months ago

I felt a bit out of the loop when I first discovered the wonderful world of intelligent baseball analysis last year, and one of the first places I went was “The Book”. In my opinion, “The Book” has been the best help to me. However, a sabermetrics wiki may prove to be a very useful educational tool.

William
Member
William
6 years 5 months ago

Okay, so believe it or not, this just came out today, but I am sure these will become fundamental metrics. Their descriptions are short and clear, and the justification for their development helps further define the metrics they build on.

Basically, they nuance our understanding of pitching and luck by creating x-versions (e.g. xBABIP) to account for things really in pitchers’ control.

http://www.hardballtimes.com/main/fantasy/article/introducing-xw-xbabip-xlob-xhr-fb-and-more/

Breadbaker
Guest
Breadbaker
6 years 5 months ago

Alan Schwarz’s The Numbers Game is a good history of baseball statistical analysis for those who think it began with Bill James musing in a boiler room in Lawrence, Kansas.

http://www.amazon.com/Numbers-Game-Baseballs-Fascination-Statistics/dp/0312322232/ref=sr_1_14?ie=UTF8&s=books&qid=1264480911&sr=8-14

TheSinators
Member
TheSinators
6 years 5 months ago

I’m still new to the sabermetric community (this is my virgin post). I still have a lot to learn, but I really enjoyed learning about tRA and wOBA, which, to my naive mind, sure seem like the best pitching and hitting stats around (of course, I don’t know about tRA* and tRA# and tRA~ and all the offspring of tRA). I hope it’s okay to post a couple.

Graham MacAree’s post explaining tRA:
http://www.lookoutlanding.com/2008/6/23/557089/the-big-tra-post

wOBA, explained briefly:
http://www.insidethebook.com/woba.shtml

wOBA, explained more thoroughly:
http://www.insidethebook.com/ee/index.php/site/comments/the_history_of_the_woba_part_1/

Infield defense:
http://www.hardballtimes.com/main/article/infield-defense-mdash-back-to-basics/

Great idea for a post. This gives me lots of homework to read up on!

Jake in Columbus
Guest
Jake in Columbus
6 years 5 months ago

http://www.askrotoman.com/fbguide/samediff.pdf
My first exposure to a more sabermetric approach which I felt gave me an edge in ranking pitchers for my first ever fantasy baseball season.

http://www.askrotoman.com/wordpress/?p=1033
The following year’s post per my request.

Jonah Keri
Guest
6 years 5 months ago

The old “Baseball Prospectus Basics” series, which ran in 2004, still holds up pretty well today. Great discussion of a wide range of topics, by several noted analysts, including Woolner, Silver, Click and others:

http://www.baseballprospectus.com/news/index.php?column=31

That series was, to some extent, a take-off on Woolner’s “Baseball’s Hilbert Problems” work, which first came out a full decade ago in BP2000, then was updated for the site in 2004:

http://www.baseballprospectus.com/article.php?articleid=2551

Also, two great (older) pieces on how to fix revenue sharing:

The Zumsteg Plan:
http://www.baseballprospectus.com/article.php?articleid=1599

Keith Woolner’s take:
http://www.baseballprospectus.com/news/20020418woolner.shtml

Swarley
Guest
Swarley
6 years 5 months ago

I highly recommend these two pieces by FanGraphs author Mitchel Lichtman, as he tries to apply game theory to decision making in baseball. Obviously this is not straight on sabermetric work, but I think he’s on to something.

http://www.fangraphs.com/blogs/index.php/were-the-yankee-sac-bunts-in-the-8th-inning-correct

http://www.fangraphs.com/blogs/index.php/should-lidge-have-thrown-more-sliders

jinaz
Guest
6 years 5 months ago

This is, in a sense, something of what I was trying to compile here:
http://www.beyondtheboxscore.com/2009/12/17/1200459/want-to-help-me-plan-my-baseball
Focus might be slightly different, but there are a lot of good links there as well as good stuff submitted in the comments.
-j
