A (Re)Introduction to the FanGraphs Library

Entering play on Thursday night, Kyle Seager owned a .274 batting average. Chris Johnson’s average was a nearly identical .273. The two third basemen have played in a similar number of games and have come to the plate close to the same number of times. If you use batting average to evaluate these players’ seasons, you’d come to the conclusion that Seager and Johnson are essentially equivalent players this year.

They’re not. In fact, it’s very clear Seager is substantially better than Johnson. Let me rephrase that: It’s very clear Seager is better than Johnson — but only if you’re well-versed in the language of baseball statistics. If you know how to properly value walks, extra base power, baserunning and defense, the difference between Seager and Johnson is impossible to miss.

At FanGraphs, our writers use statistics and metrics like wOBA, wRC+, FIP and WAR to evaluate baseball players and teams. We provide those tools, and more, so others might conduct evaluations on their own. Want to know Miguel Cabrera’s wOBA against lefties? You can find that on FanGraphs. But what if you don’t know what wOBA means, how it’s calculated or why you should care about it more than batting average?

You can find some of that information on FanGraphs. A well-motivated self-starter could show up at the site, notice something called wOBA on the leaderboards, go to the glossary and figure out what it means and why it’s important. But it can be intimidating and challenging for people who are just starting out to make sense of everything we offer.

In an effort to make advanced statistics easier to understand, and to help readers make better use of the data and features available at FanGraphs, we’re relaunching and promoting the FanGraphs Library. There’s a lot of great information there already, but this revamped library is even better. There’s a steep learning curve, though, so I’ve been tasked with making things a bit simpler.

This is going to be a comprehensive and ongoing project that will feature updates to the glossary entries, blog posts about how to use various stats and the site’s many features, and weekly chats — each Wednesday at 3 pm eastern, starting next week — to answer reader questions. You probably know FanGraphs is a sabermetrically-themed blog, but FanGraphs is also about the dissemination of quality information. The information is already here, but not everyone is up to speed on how to use it.

I’ll be doing everything I can to make learning and using sabermetrics easy. You can comment on posts in the library, ask questions in chats or find me on Twitter (@NeilWeinberg44). If there are things that don’t make sense, or you don’t know how to get your hands on the stats you want, I’d like to help.

If you want to kick back on your sofa and simply enjoy world-class athletes competing against each other, that’s perfectly fine too. No one’s pressuring you to become a stat-person. But if you want to evaluate players, engage in debates with friends, play armchair general manager or squash your fantasy baseball buddies, learning to speak saber is going to help. It doesn’t mean spending your life looking at spreadsheets instead of watching games, but it does mean knowing how much a walk is worth compared to a double and why using runs allowed alone to judge a pitcher can be misleading.

There’s a lot of great information available to the public for free. If you want to get the most out of that information, we’re going to be here to help you do that. You probably knew Kyle Seager was having a better year than Chris Johnson without sabermetrics. That doesn’t require a lot of extra information. But not every comparison or analysis is so simple or so clear. Sometimes you need to park-adjust, know exactly how much a triple is worth or whether a defensive play was routine or unlikely.

It’s my hope this project will accomplish two primary goals: First, I want to streamline the process of learning about advanced stats so you can pick up the basic skills in an afternoon and be fluent in a couple of weeks. Second, I want people who are well-versed in sabermetrics to be able to make the most out of FanGraphs’ features.

Did you know you can create and save a custom leaderboard with any players you want? Did you know that you can look up Alex Gordon’s on-base percentage from June 7 to June 28? If there are specific things you want to learn, let me know.

If you want to learn more about the stats we use or the features we offer, stick around. If you have friends who might be interested, send them our way. There’s a lot to learn and plenty of questions to ask, regardless of how much time you spend on the site. All of you — and all of us — are here because we enjoy baseball and we want to uncover more about the game we love. I hope our library is just one more step toward reaching that goal.




Neil Weinberg is the Site Educator at FanGraphs. He is also the Associate Managing Editor at Beyond The Box Score and can be found writing enthusiastically about the Detroit Tigers at New English D. Follow and interact with him on Twitter @NeilWeinberg44.


46 Responses to “A (Re)Introduction to the FanGraphs Library”

  1. Rian says:

    Great post, though I’m guessing Fangraphs is more interested in the “dissemination” of quality information than the “decimation” of it ;) Looking forward to these updates!

  2. Blake says:

    This is exciting stuff. I’ve tried to educate myself when I don’t understand certain statistics and methods for evaluation; excited to learn more!

  3. Pale Hose says:

    This is going to be awesome!

  4. Gabes says:

    I’m not sure how deep this is planning to go, but I’d like to see a better discussion on how to calculate FIP-based WAR for pitchers. I’ve tried to follow the articles in the glossary to calculate pitcher WAR from scratch and it never seems to turn out right. I’m not sure how much of that curtain can be pulled back, but some more information would be welcomed. Thanks in advance for what already sounds like a good series.

    • Pale Hose says:

      Second

    • Yeah, this is definitely coming. Some of those articles are older and some changes have been made, so you might find yourself getting near the right answer but never totally there. Stay tuned!

    • Brian says:

      I think it was in 2012 or 2013 when Dave started to use pop-outs in the WAR calculation, because an IFFB is pretty much as sure to be an out as a strikeout. So the FIP on the leaderboards is not the same as the FIP used for WAR. The adjusted FIP for WAR purposes is:

      FIP = ((13*HR)+3*(BB+HBP)-2*(IFFB+K))/IP + constant

      This should close most of (if not all of) the gap between the WAR you calculated and the WAR from the leaderboards.

      I’m glad that the library is being updated because it was very confusing as it was.
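
As a rough illustration of the adjusted formula quoted in the comment above, here is a minimal Python sketch. The function name `fip`, the season line and the constant of 3.10 are all hypothetical placeholders, not FanGraphs' actual code or data. The sketch also holds the constant fixed between the two versions, which is a simplification: a real implementation would presumably re-derive it.

```python
def fip(hr, bb, hbp, k, ip, constant, iffb=0):
    """Fielding Independent Pitching from raw season totals.

    With iffb=0 this is the standard formula. Passing infield fly balls
    counts each one like a strikeout, as in the adjusted version quoted
    in the comment thread.
    """
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    return (13 * hr + 3 * (bb + hbp) - 2 * (k + iffb)) / ip + constant


# Hypothetical season line and constant, for illustration only.
line = dict(hr=10, bb=30, hbp=4, k=80, ip=100.0, constant=3.10)

standard = fip(**line)
adjusted = fip(**line, iffb=10)
print(f"standard FIP:      {standard:.2f}")
print(f"IFFB-adjusted FIP: {adjusted:.2f}")
```

With these made-up inputs, ten infield flies lower the result by 2 * 10 / 100 = 0.20, which is the kind of gap that could explain a hand calculation landing near, but not on, the published figure.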

  5. Matt Perez says:

    I’d like to see an update to the documentation about how WAR is determined.

    The pitching documentation hasn’t been updated since 2009 (just looking at the glossary) and I’m pretty sure it’s missing things like how pitching leverage is used to determine reliever WAR (implemented I believe in 2010).

    I’m sure there have been more changes since 2009 that I’m not aware of.

  6. frivoflava29 says:

    Is there a way to add a pitcher’s RA9 wins to your custom leaderboards/dashboard? Can there be? I know it’s relatively easy to calculate, but we don’t have access to FDP or BIP wins either, which also would be cool. Looking through the library makes me want to be able to make use of all these stats.

  7. Tim L says:

    Just wanted to mention too that there is a baseball analytics course (Sabermetrics 101) offered through edX, by Andy Andres at BU. The realtime course is almost over, but you can still access the course materials and it will be archived as well for continual access. The course offers five segments each week over six weeks covering topics in sabermetrics, statistics, tech (basic SQL and R), history, and interviews with contemporaries in the field.

    They also anticipate offering a Sabermetrics 201 course at some point soon, so keep your eyes open for that if interested.

  8. joser says:

    About four years ago Graham MacAree at LookoutLanding did an excellent Sabermetrics 101 series. Fangraphs would do well to try to match that (with updated info for things that have changed and improved since then, particularly with respect to how Fangraphs specifically calculates certain results).

    Since I’m linking LL anyway, I’ll also point out this post that compiles a variety of useful introductory material from sites all over the web (including FG, of course, but also THT, Tango’s blog, etc).

  9. urchman says:

    For stats like wOBA, FIP, etc., in addition to an explanation of what the stat is measuring and how it’s calculated, could FG also include the mean and z-score for each stat, preferably by year?

  10. lvmnz says:

    I think a good statistic for the library/for people’s custom boards would be some sort of BABIP vs. Career BABIP ratio–a good measure of possible outlier performance. Of course pitch FX data can always indicate the hitter has improved in some area, but I think this would be a good road sign.
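
The ratio suggested above could be sketched along these lines. The BABIP formula is the standard (H - HR) / (AB - K - HR + SF); the function names and the hitter's numbers are hypothetical, and nothing here reflects a stat FanGraphs actually publishes.

```python
def babip(h, hr, ab, k, sf):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = ab - k - hr + sf
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (h - hr) / balls_in_play


def babip_ratio(season, career):
    """Season BABIP divided by career BABIP; 1.0 means right at the career norm."""
    return babip(**season) / babip(**career)


# Hypothetical hitter, for illustration only.
season = dict(h=90, hr=10, ab=300, k=60, sf=2)
career = dict(h=900, hr=100, ab=3200, k=640, sf=20)

print(f"season BABIP: {babip(**season):.3f}")
print(f"career BABIP: {babip(**career):.3f}")
print(f"ratio:        {babip_ratio(season, career):.2f}")
```

A ratio well above or below 1.0 would flag a season worth a closer look, though it cannot say by itself whether the change is luck or a real change in skill.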

  11. Elan says:

    Do you guys have stadium-specific stats? I’m curious about HR/FB ratios across the league.

  12. Bob says:

    Might be a little overdue, but it’s definitely a great idea and I look forward to it

  13. Joe Durant says:

    What I’d really like is for, when I hover my mouse over the top of a column, to see what the abbreviations stand for, and a small description of the stat. It does it on the player pages, but not on the leader boards (for me, anyway)

  14. scooter262 says:

    Trying to find Basruns on the site. I have read about them in several articles, but have not been able to find them in any leaderboards or list pages.

    • scooter262 says:

      Baseruns.

    • This is the only place they are currently available on the site: http://www.fangraphs.com/depthcharts.aspx?position=BaseRuns

    • Brian says:

      Baseruns only make sense to use for teams. This is because Baseruns values depend on the rate at which a team’s runners score when they reach a specific base. So you shouldn’t use it to evaluate a player because he has nothing to do with which base his teammate reached or how often he will score once he gets there.

      You could use some league average numbers for the scoring rate part of BaseRuns, but at that point you’re pretty much creating your own linear weights and you might as well use the ones that Fangraphs already calculated for you.

      So in short, wRC (not wRC+) is a good enough measure of how many runs each player contributed to his team.

      The effect that all those wRCs have on each other, when added up, is the team BaseRuns result at the link that Neil posted.
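
To make the "scoring rate" idea in the comment above concrete, here is a sketch of one commonly cited simple form of BaseRuns, A * B / (B + C) + D. The coefficients below are an assumption taken from that simple public version and may well differ from whatever FanGraphs uses; the team totals are hypothetical.

```python
def base_runs(ab, h, tb, hr, bb):
    """Estimate team runs with a simple BaseRuns variant (assumed coefficients)."""
    a = h + bb - hr                                       # runners reaching base
    b = (1.4 * tb - 0.6 * h - 3 * hr + 0.1 * bb) * 1.02   # advancement factor
    c = ab - h                                            # outs
    d = hr                                                # runs that score automatically
    score_rate = b / (b + c)  # share of baserunners expected to come around
    return a * score_rate + d


# Hypothetical team season totals, for illustration only.
team = dict(ab=5500, h=1400, tb=2200, hr=150, bb=500)
print(f"estimated runs: {base_runs(**team):.1f}")
```

Because the scoring rate depends on the whole lineup's totals, one hitter's contribution changes with who bats around him, which is why the comment recommends the estimate for teams rather than individual players.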

  15. Bradsbeard says:

    One thing that is very difficult to wrap my head around, much less explain to other people, is the WAR positional adjustment. The current glossary entry does a decent job of explaining how it operates, but it’s hard to get a sense of how the adjustments are derived or what they say about a particular player. For instance, when we say Miguel Cabrera gets a credit of +2.5 runs for standing in at 3B for 162 games, are we saying anything in particular about his defensive skill? There is some form of accounting going on there, but it is unclear what is being accounted for. I’d really appreciate seeing a piece explaining how the precise adjustments were calculated and assigned. There is a link in the glossary to an old Tango blog post which sort of lays out in a stream of consciousness manner how they were derived from UZR, but it’s hard to follow and I have a hard time drawing conclusions from it. It would definitely be a project, but I really think it would be worth your while.

    Looking forward to what’s in store!

  16. scb says:

    Thanks for the heads up about the custom leaderboards. Those are awesome.

    Now is there a tool that makes you stop poking around on Fangraphs after a certain amount of time so you can actually finish the work you need to get done by the end of the day?

  17. mgoetze says:

    “FanGraphs is also about the dissemination of quality information.”

    It is? Explain the presence of Inside Edge fielding “data” on your site then.

  18. Okra says:

    why is it that two pitchers with very similar BB, SO, and HR rates can have two very different FIP numbers? aren’t those the only three things that FIP looks at? thanks for answering

    • Okra says:

      for example, here are colby lewis and yordano ventura’s stats for this year. very similar with lewis actually having a better SO rate and HR/FB rate, yet a much worse FIP. why is that?

      Pitcher           K/9    BB/9   HR/FB   FIP
      Colby Lewis       7.82   2.79    8.9    4.16
      Yordano Ventura   7.52   2.74   10.1    3.57

      • just one problem... says:

        HR/FB rate is not directly part of FIP. In your example we need to look at HR/9. Evidently Colby Lewis has a high FB%, which means his HR/9 is high. The coefficient for HR in the FIP core is 13, versus 3 for BB and 2 for K, so even a small variation in HR has a huge effect.
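
One way to see the size of that effect is to write FIP in per-nine-inning rates: since each total divided by IP equals its per-nine rate divided by 9, an extra home run per nine innings adds 13/9 (about 1.44) to FIP, while an extra walk adds only 3/9 (about 0.33) and an extra strikeout removes 2/9 (about 0.22). The sketch below uses the K/9 and BB/9 figures posted above; the HR/9 values, HBP/9 and the constant are hypothetical placeholders, not the pitchers' real numbers.

```python
def fip_from_rates(hr9, bb9, hbp9, k9, constant):
    """FIP expressed with per-nine-inning rates instead of raw totals."""
    return (13 * hr9 + 3 * (bb9 + hbp9) - 2 * k9) / 9 + constant


# K/9 and BB/9 are taken from the comment thread. HBP/9, both HR/9 values
# and the constant are hypothetical, chosen only to show the sensitivity.
base = dict(bb9=2.79, hbp9=0.3, k9=7.82, constant=3.10)

low_hr = fip_from_rates(hr9=0.7, **base)
high_hr = fip_from_rates(hr9=1.2, **base)
print(f"HR/9 = 0.7 -> FIP {low_hr:.2f}")
print(f"HR/9 = 1.2 -> FIP {high_hr:.2f}")
print(f"difference: {high_hr - low_hr:.2f}")  # equals 13 * 0.5 / 9
```

Half a home run per nine innings moves the result by roughly 0.72 here, more than the whole gap between the two pitchers' listed FIPs.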

    • This is the FIP equation:

      FIP = ((13*HR)+3*(BB+HBP)-(2*K))/IP + constant

      So for the pitchers in question, Lewis has hit 4 batters in 84 innings compared to 2 for YV in 102 innings. He also has two more HR in 20 fewer innings. That’s going to wash out the similarities in their K and BB%. Does that make sense?

      • Jordan says:

        Are the FIP coefficients determined from linear weights?

        • Essentially yes, although we don’t adjust the coefficients on a year to year basis like we do with wOBA. I think it is because their relationship to one another is static enough that it’s not super important to use 2.98 for the BB/HBP coef and 12.8 for HR, especially when we adjust the constant for run environment. I think I read somewhere that the 2-3-13 coefs aren’t exactly perfect, but the beauty of FIP is how clean it is.

  19. NotFoul says:

    I’ve struggled to find the time to dive into saber stuff, but I’m ready to learn. Guess I picked the right time to stop being lazy. Looking forward to the (Re)Introduction.

  20. Hunter Satterthwaite says:

    Hey Neil, you mind following me on twitter at @huntman234 so I can learn more about sabermetrics?

  21. peterevang says:

    Is there a way to look at just righties or lefties (for either pitchers or batters)? It seems like a natural option, but I haven’t found it. Thanks!
