Why It’s Always Better to Use Multiple Statistics

One of the most common questions I get when talking about advanced metrics with people who are new to the experience is “what’s the best stat for looking at X?” My standard response depends on the particular question, but I almost always drop the caveat that you should be looking at multiple pieces of information rather than one single stat, and I don’t think I’m alone in offering that advice.

As our metrics for evaluating baseball improve, there’s a desire among many for the new stats to push the old stats out of the conversation. Now that we have wOBA, why would you ever use OBP? And then once you have access to wRC+, is wOBA even necessary anymore? If we have K%, isn’t K/9 completely useless?

In some cases, that’s a fine idea, but in many you would rather have access to as much information as possible because stats that don’t do very well on their own can still be informative in the context of other statistics. Wins Above Replacement (WAR) is the best single metric we have to determine a player’s complete value, but WAR only conveys the answer to a very specific question. If you want to know about how good a player is overall, WAR is great. If you want to know if he’s a power hitter or a player with a good eye, WAR doesn’t do very much.

The same is true for wRC+. You know a 150 wRC+ means someone has had a very good season, but you don’t know if he’s doing it with a high average, good patience, excellent power or some combination of them. We’re striving for better measures of performance but you can’t only look at one or two numbers because baseball is full of questions that require a variety of tools to evaluate.
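To make the K% versus K/9 question concrete, here is a minimal sketch in Python. The two pitchers and their numbers are hypothetical, and innings are assumed to be plain decimals rather than the box-score ".1/.2" thirds notation. It shows two pitchers with identical K/9 whose K% differs because one of them faces more batters to get the same outs.

```python
def k_per_9(strikeouts, innings_pitched):
    """Strikeouts per nine innings pitched."""
    return 9 * strikeouts / innings_pitched


def k_pct(strikeouts, batters_faced):
    """Strikeouts as a share of all batters faced."""
    return strikeouts / batters_faced


# Hypothetical pitchers: same strikeouts and innings, different batters faced.
pitchers = {
    "efficient": {"K": 200, "IP": 200.0, "TBF": 780},
    "laborious": {"K": 200, "IP": 200.0, "TBF": 860},
}

for name, line in pitchers.items():
    per_nine = k_per_9(line["K"], line["IP"])
    pct = k_pct(line["K"], line["TBF"])
    print(f"{name}: K/9 = {per_nine:.2f}, K% = {pct:.1%}")
```

Neither number is wrong here. K/9 is tied to outs recorded, K% is tied to opportunities, and seeing both tells you the second pitcher is allowing more baserunners along the way.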



Calculating Position Player WAR, A Complete Example

One of the hallmark statistics available at FanGraphs is Wins Above Replacement (WAR) and we’ve just rolled out an updated Library entry that spells out the precise calculations in more detail than ever before. There’s always been a clear sense of the kinds of things that go into our WAR calculation, but we’ve never just dropped an equation in front of you and said, “Here!”

As of today, we’ve done that and I encourage you to go check out our basic primer on WAR and our detailed breakdown of how we calculate it for position players. If you’re a hands-on learner, grab a pen and paper or spreadsheet and follow along. I’m going to walk you through a complete example of how to calculate WAR for position players. Let’s use the 2013 version of Joey Votto as our exemplar.



The Beginner’s Guide to Using Statistics Properly

We’ve spilled a great deal of virtual ink and audible podcasting words on the nature of Wins Above Replacement (WAR) and defensive metrics recently. Jeff Passan of Yahoo! Sports and many who responded to his critique of the current WAR calculation dug into the relative merits of the metric itself and how well we’ve estimated it to date. That’s a great conversation to have and Dave has done the heavy lifting on behalf of FanGraphs in that regard. I’d like to pivot and discuss a very important point about the use of statistics in baseball: Everything has flaws.

Every single statistic is wrong. Your eyes are wrong. It is all wrong. Nothing we have will provide you with perfect information or even truly accurate information with respect to the underlying variables about which you care. You don’t get to choose between flawed and not flawed statistics, you get to choose between useful and not useful statistics. More importantly, statistics become useful based on your awareness of the proper way to wield them.



The Beginner’s Guide to Measuring Defense

There’s a decent chance you’ve arrived at this page without a serious desire to hear more about defensive statistics. Trust me, I understand your frustration and your fatigue. Defensive stats like Ultimate Zone Rating and Defensive Runs Saved are controversial in some circles because they are reasonably new and the underlying data is somewhat hidden from view. You hear words like “flawed,” “absurd,” and “subjective” surrounding them. You’re tired of it.

Yet I’d like to lay out why we have advanced defensive statistics and how they work in the abstract. You won’t get to the end of this post and decide that UZR has perfectly measured Alex Gordon‘s defense, but hopefully you will have a better appreciation for why we measure defense the way that we do.



Learning to Speak Saber: Runs and Wins

One of the things people love about baseball is that the game is both very simple and very complicated all at once. Baseball is simple in that all you’re trying to do is score more runs than the other team during 162 finite, nine-inning contests. You are trying to reach base and advance runners and you are trying to prevent the other team from doing the same. How you go about doing those things is where baseball gets complicated. Jeff Sullivan often refers to baseball as being “obnoxiously complicated,” which I find to be a fitting description.

Think of all of the different possible outcomes of every pitch and all of the different pitches and locations from which the pitcher can choose. The complicated part of baseball is what makes baseball interesting, but the simple part of baseball is where you need to start to get your head around sabermetrics and player evaluation. Baseball is about producing and preventing runs.

As a result of that simple reality, the heart of baseball analysis is determining what leads to run scoring and run prevention. Specifically, how many runs is each possible action worth? If a player hits a single, how much has that player just increased his team’s odds of scoring a run? If a fielder makes a nice running catch, how many runs has he prevented? We don’t actually care about hits and walks and double plays, we care about how those finite events contribute to the overall goal.



Defensive Metrics, Their Flaws, and the Language of Writers

If you spent time hanging around the comments section of Dave’s Alex Gordon piece, you lurked in the shadows of his conversation with Jeff Passan on Twitter, or you’re one of those people who Twitter searches the word “FanGraphs,” you probably saw a decent amount of skepticism about single-season defensive metrics this week. People tossed around words like “flawed” and “absurd.”

The interesting part of the debate, for me at least, was that there was skepticism from both sides. The sabermetric elite dove into an esoteric debate about how to best incorporate defense into WAR and less analytically minded fans used Gordon passing Mike Trout in WAR as kindling for their “WAR is silly” crusade.

Dave’s piece does a nice job covering exactly what it means to say Alex Gordon leads position players in WAR, but the fact that Dave had to write that piece in the first place speaks to a problem we often run into when using advanced metrics. It’s a communication problem. Dave addresses it, but I’d like to expand on it here because it’s vitally important.



ERA, FIP, and Answering the Right Question

One of the things baseball fans and analysts work very hard to do is isolate individual performance. At the end of a game, there is a final score that tells you how many runs each team scored. At a very basic level, that’s all that really matters. Baseball is a battle to score more runs than your opponent over the span of nine innings repeated 162 times. Yet analyzing the game requires more information than that because we want explanations. We want to know which players are good and which players aren’t so good. We care about how individual performance contributes to winning.

For pitchers, this is especially difficult because while pitchers have a huge impact on the number of runs they allow, they don’t have complete control. You can’t just look at the number of runs a pitcher allowed and say they were definitively responsible for those runs and call it a day. You aren’t isolating their performance and if you aren’t isolating individual performance you’re looking only at outcomes, and that’s not typically very interesting.

Every statistic, or really any analysis in general, should start with a question. On a basic level, the question we have is “How good is this pitcher?” which more specifically translates into “How effective is this pitcher at preventing runs?”



Why We Care About BABIP

Batting Average on Balls in Play (BABIP) is actually a pretty tried and true part of the baseball vernacular. Sabermetricians may have given it a long name with a fun-sounding acronym, but the principle goes back as far as presidential first pitches and wooden bats. Everyone knows that bloop hits and seeing-eye ground balls go for hits quite regularly and that screaming rockets get snatched out of the air by leaping defenders pretty often. You couldn’t find a baseball fan alive who would argue with you on that simple fact.

BABIP is really just the amalgamation of all of those screaming rockets and bouncing grounders. When a batter puts the ball in play, it either goes for a hit or it doesn’t. Sometimes it’s a clean single, sometimes the defender can’t quite reach it. It’s a game of inches and these things happen.
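As a rough illustration of that "hit or not a hit" bookkeeping, here is a small Python sketch using the commonly cited BABIP formula, (H - HR) / (AB - K - HR + SF). The function name and the sample stat line are hypothetical and only there to show the arithmetic.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """Batting average on balls in play.

    Home runs and strikeouts are removed because the defense never gets
    a chance on them; sacrifice flies are added back because the ball
    was put in play even though no at-bat was charged.
    """
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play to average over")
    return (hits - home_runs) / balls_in_play


# Hypothetical season line, for illustration only.
value = babip(hits=180, home_runs=25, at_bats=600, strikeouts=120, sac_flies=5)
print(f"BABIP = {value:.3f}")
```

The numerator counts only the hits a fielder could have turned into an out, which is why the stat is so sensitive to bloops, seeing-eye grounders, and leaping catches.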



How to Use FanGraphs: Leaderboards!

In addition to updated glossary entries and blog posts extolling the virtues of various sabermetric statistics and principles, the revitalized FanGraphs Library is also going to be a place where we highlight features available at the site that will allow you to get the most out of our data.

Below, you’ll find everything you ever wanted to know about the FanGraphs Leaderboards. If you’ve been a long-time reader who never misses a single post, a lot of this might be old news. If you’re anything short of that, there’s a good chance you’ll pick up a few tricks to get the most out of the site.



wRC+ and Lessons of Context

This introduction is a setup. Don’t fall for it. I’m going to present you with two stat lines and ask you to silently compare them. Your job is going to be to determine which player had the better season at the plate. Remember, it’s a trick.

  • Player A: 697 PA, .372/.463/.698, .476 wOBA, 42 HR, 59 2B, 103 BB, 61 K
  • Player B: 716 PA, .323/.432/.557, .423 wOBA, 27 HR, 39 2B, 110 BB, 136 K

If I hadn’t primed you, it would be hard to suggest anything other than that Player A had the better season. He’s leading everything, except for a slight disadvantage in walk rate. Player A had the better season, right? It’s obvious. Even though I told you it was a trick, you’re still struggling to find a way to argue the opposing side. I’m telling you that Player B actually had the better season, but that’s because I have more information. I know a couple of important pieces of information that you don’t have and it makes a world of difference.



wOBA As a Gateway Statistic

Despite all of the rhetoric and talk-radio bluster, sabermetric principles and statistics aren’t actually very complicated. It might take a sharp statistician or savvy programmer to derive perfect park factors, but it doesn’t take anything more than a curious mind to understand and apply the basics. In my time working to help spread these principles, one of the most common and useful questions I get is about which few statistics a person should learn when trying to get into the world of advanced stats.

On Wednesday during my chat I got such a question. Here’s how I responded:



FanGraphs Library Stat Glossary

To find a particular statistic, use Ctrl-F and type in the abbreviation or stat name that you are looking for.

Offense:

OBP – On-Base Percentage
OPS – On-base Plus Slugging
OPS+ – On-base Plus Slugging Plus
wOBA – Weighted On-Base Average
wRAA – Weighted Runs Above Average
UBR – Ultimate Base Running
wRC – Weighted Runs Created
wRC+ – Weighted Runs Created Plus
BABIP – Batting Average on Balls In Play
ISO – Isolated Power
HR/FB – Home Runs per Fly Ball rate
Spd – Speed Score
GB% – Ground ball percentage
FB% – Fly ball percentage
LD% – Line drive percentage
K% – Strikeout rate
BB% – Walk rate
O-Swing% – Outside-the-zone swing rate
Z-Swing% – Inside-the-zone swing rate
Swing% – Swing rate
O-Contact% – Outside-the-zone contact percentage
Z-Contact% – Inside-the-zone contact percentage
Contact% – Contact percentage
Zone% – Percentage of pitches within the zone
F-Strike% – First-pitch strike percentage
SwStr% – Swinging Strike percentage
wFB – Fastball runs above average
wSL – Slider runs above average
wCT – Cutter runs above average
wCB – Curveball runs above average
wCH – Change-up runs above average
wSF – Split-finger fastball runs above average
wKN – Knuckleball runs above average
wFB/C – Fastball runs above average per 100 pitches
wSL/C – Slider runs above average per 100 pitches
wCT/C – Cutter runs above average per 100 pitches
wCB/C – Curveball runs above average per 100 pitches
wCH/C – Change-up runs above average per 100 pitches
wSF/C – Split-finger fastball runs above average per 100 pitches
wKN/C – Knuckleball runs above average per 100 pitches

Defense:

rSB – Stolen Base Runs Saved runs above average
rGDP – Double Play Runs Saved runs above average
rARM – Outfield Arms Runs Saved runs above average
rGFP – Good Fielding Plays Runs Saved runs above average
rPM – Plus/Minus Runs Saved runs above average
DRS – Defensive Runs Saved runs above average
BIZ – Balls In Zone
OOZ – Balls Out Of Zone
RZR – Revised Zone Rating
CPP – Expected Catcher Passed Pitches
RPP – Catcher Blocked Pitches in runs above average
TZ – Total Zone
TZL – Total Zone with Location data
FSR – Fan Scouting Report
ARM – Outfield Arm runs above average
DPR – Double Play runs above average
RngR – Range runs above average
ErrR – Error runs above average
UZR – Ultimate Zone Rating
UZR/150 – Ultimate Zone Rating per 150 defensive games

Pitching:

ERA – Earned Run Average
WHIP – Walks and Hits per Innings Pitched
FIP – Fielding Independent Pitching
xFIP – Expected Fielding Independent Pitching
SIERA – Skill-Interactive ERA
tERA – True Runs Allowed
K/9 – Strikeout rate
BB/9 – Walk rate
K% – Strikeout percentage
BB% – Walk percentage
K/BB – Strikeout-to-Walk ratio
LD% – Line drive rate
GB% – Ground ball rate
FB% – Fly ball rate
HR/FB – Home runs per fly ball rate
BABIP – Batting Average on Balls In Play
LOB% – Left On Base percentage
ERA- – ERA Minus
FIP- – FIP Minus
xFIP- – xFIP Minus
SD – Shutdowns
MD – Meltdowns
O-Swing% – Outside-the-zone swing rate
Z-Swing% – Inside-the-zone swing rate
Swing% – Swing rate
O-Contact% – Outside-the-zone contact percentage
Z-Contact% – Inside-the-zone contact percentage
Contact% – Contact percentage
Zone% – Percentage of pitches within the zone
F-Strike% – First-pitch strike percentage
SwStr% – Swinging Strike percentage
wFB – Fastball runs above average
wSL – Slider runs above average
wCT – Cutter runs above average
wCB – Curveball runs above average
wCH – Change-up runs above average
wSF – Split-finger fastball runs above average
wKN – Knuckleball runs above average
wFB/C – Fastball runs above average per 100 pitches
wSL/C – Slider runs above average per 100 pitches
wCT/C – Cutter runs above average per 100 pitches
wCB/C – Curveball runs above average per 100 pitches
wCH/C – Change-up runs above average per 100 pitches
wSF/C – Split-finger fastball runs above average per 100 pitches
wKN/C – Knuckleball runs above average per 100 pitches

Win Probability:

WPA – Win Probability Added
-WPA – Loss Advancement
+WPA – Win Advancement
RE24 – Run Above Average based on the 24 Base/Out States
REW – Wins Above Average based on the 24 Base/Out States
pLI – A player’s average LI for all game events
phLI – A batter’s average LI in only pinch hit events
PH – Pinch Hit Opportunities
gmLI – A pitcher’s average LI when he enters the game
inLI – A pitcher’s average LI at the start of each inning
exLI – A pitcher’s average LI when exiting the game
WPA/LI – Situational Wins
Clutch – How much better or worse a player does in high leverage situations than he would have done in a context neutral environment

WAR

Offensive

Batting – Park Adjusted Runs Above Average based on wOBA
Base Running – Base running runs above average, includes SB or CS
Fielding – Fielding Runs Above Average based on UZR (TZ before 2002)
Replacement – Replacement Runs set at 20 runs per 600 plate appearances
Positional – Positional Adjustment set at +12.5 for C, +7.5 for SS, +2.5 for 2B/3B/CF, -7.5 for RF/LF, -12.5 for 1B, -17.5 for DH
Fld + Pos
RAR – Runs Above Replacement (Batting + Fielding + Base Running + Replacement + Positional)
WAR – Wins Above Replacement
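
The position player entries above can be strung together as a small Python sketch. The replacement rate (20 runs per 600 PA), the positional values, and the RAR sum come from the glossary. The function names, the `season_share` knob for prorating the positional adjustment, and the default of 10 runs per win are illustrative assumptions, not the exact FanGraphs implementation.

```python
# Positional adjustments in runs, as listed in the glossary entry above.
POSITIONAL_RUNS = {
    "C": 12.5, "SS": 7.5,
    "2B": 2.5, "3B": 2.5, "CF": 2.5,
    "RF": -7.5, "LF": -7.5,
    "1B": -12.5, "DH": -17.5,
}


def replacement_runs(plate_appearances):
    """Replacement runs at 20 runs per 600 plate appearances."""
    return 20.0 * plate_appearances / 600.0


def runs_above_replacement(batting, base_running, fielding,
                           plate_appearances, position, season_share=1.0):
    """RAR = Batting + Fielding + Base Running + Replacement + Positional.

    season_share is an assumed knob for scaling the positional value
    when a player did not play a full season at the position.
    """
    positional = POSITIONAL_RUNS[position] * season_share
    return (batting + fielding + base_running
            + replacement_runs(plate_appearances) + positional)


def wins_above_replacement(rar, runs_per_win=10.0):
    """Convert runs to wins; 10 runs per win is a rule-of-thumb assumption."""
    return rar / runs_per_win


# Hypothetical first baseman, for illustration only.
rar = runs_above_replacement(batting=40.0, base_running=-1.0, fielding=3.0,
                             plate_appearances=600, position="1B")
print(f"RAR = {rar:.1f}, WAR = {wins_above_replacement(rar):.2f}")
```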

Pitching

RA9-Wins – Wins Above Replacement calculated using Runs Allowed
BIP-Wins – BABIP wins above average
LOB-Wins – Sequencing in wins above average (calculated as the difference between RA9-Wins and WAR minus BIP-Wins)
FDP-Wins – BABIP and Sequencing wins above average, also the difference between RA9-Wins and WAR
RAR – Runs Above Replacement
WAR – Wins Above Replacement
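
The relationships among the pitching WAR components described above can be written out directly. This is a minimal Python sketch of the two differences stated in the glossary; the function names and the sample values are hypothetical.

```python
def fdp_wins(ra9_wins, war):
    """FDP-Wins: the difference between RA9-Wins and WAR."""
    return ra9_wins - war


def lob_wins(ra9_wins, war, bip_wins):
    """LOB-Wins: (RA9-Wins minus WAR) minus BIP-Wins."""
    return fdp_wins(ra9_wins, war) - bip_wins


# Hypothetical pitcher, for illustration only.
ra9, war, bip = 5.0, 4.0, 0.6
print(f"FDP-Wins = {fdp_wins(ra9, war):.1f}")
print(f"LOB-Wins = {lob_wins(ra9, war, bip):.1f}")
```

Put another way, BIP-Wins and LOB-Wins always add back up to FDP-Wins, so the gap between runs-allowed value and WAR is split into a balls-in-play piece and a sequencing piece.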


New Library Section: Contract Details

While contractual details may not be sabermetric statistics or concepts, they can still be really confusing. I consider myself a pretty knowledgeable baseball fan, yet I still get baffled with details about player options and service time. Baseball is one of the more complicated sports in terms of rules, and so it only makes sense that the many transaction rules surrounding the game are just as intricate and tedious.

As a result, I’ve started a new hub over at the Library for contract details. You can find the hub underneath the “Sabermetric Principles” drop down tab, and I’ll be adding pages to it throughout the next week. At the moment, the first page up there is on Player Options. I also have planned articles on waivers, service time, and a few miscellaneous topics like the Rule 5 draft. If there are any other topics that you would like to see covered, please contact me either on Twitter or using the “Contact” link provided in the sidebar at the Library.

After the jump, you’ll find the write-up on player options that can now be found at the Library.



The Curious Case of Ben Zobrist

This piece was originally written for a mainstream audience, yet I’ve never been able to find a good place for it. I think it’s a good example of how you can write sabermetric pieces without relying heavily on advanced statistics and without scaring away new readers. Enjoy.

There are some players in baseball who are chronically underappreciated by fans. These are the players who do not fit into any of our traditional molds: they are first basemen, but not power hitters; leadoff hitters, but not basestealers; bullpen aces, but not closers. Growing up following the game, we learn to expect certain things from specific players, and become baffled when a player does not fit in a specific mold. What to do with a clean-up hitter who only hits 20 home runs, or a leadoff hitter who hits .260 and steals 4 bases? Both these players may still be valuable – the clean-up hitter could have hit 50 doubles and the leadoff hitter could have reached base more often than a .300 hitter – but our expectations blind us, leading us to view these players as inherently less valuable than others.



Pitchers and Injuries: It Happens

When news broke on Wednesday of Adam Wainwright‘s season-ending injury, it obviously was quite distressing news for Cardinals fans. Not only was Wainwright the ace of the Cardinals’ pitching staff, but the Cardinals are projected to be thick in the race for the NL Central, making his contributions all the more valuable. While Wainwright isn’t costing the Cardinals much this season, the list of pitchers that will be competing to replace him isn’t anything to get excited about. If I were a Cardinals fan, I’d be watching this video over and over and over again, drowning my sorrows in fond memories and root beer.

But Wainwright’s injury isn’t traumatic only for Cardinals fans: no matter what team you root for, this news is frightening. Wainwright is a relatively young pitcher (entering his age 29 season) and he’s pitched 230 innings each of the previous two years. He’s been a perennial Cy Young contender, and never had significant arm issues before. If this sort of an injury can happen to him, well, who isn’t at risk?

This is probably old news for the majority of FanGraphs readers, but this point can’t be driven home often enough: pitchers are fickle creatures that are always at risk for an injury.



Understanding Projections, “True Talent Level”, and Variability

This is the second in a series of posts about projections. The first part was about the methodology behind each projection system. In this section, we look at what projections are actually telling us.

If you’re new to projections and want to use them to, say, help with your fantasy team, it’s easy to make a common mistake: underestimating the built-in variability in projections. Many people – and I used to be among this group myself – view projections as hard and fast guesses at a player’s production this next season. Most people get into projections as a result of fantasy baseball, so this makes sense; we all want to know which player is going to hit 30 home runs this next season and which will steal 40 bases. However, projections are actually measuring something different than a player’s expected production: they’re measuring a player’s true talent level.

This might seem like an arbitrary distinction, but trust me, it’s not. As we all know from our day-to-day lives, having a “true talent level” at a particular skill does not necessarily mean you’ll perform at that level every single time in the future. Our minds love to ignore variability and instead treat outcomes as solely talent-driven, but the world doesn’t work that way. Let’s consider a couple examples.
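One way to see the gap between true talent and observed results is a quick simulation. The Python sketch below is a deliberately simplified assumption: it treats every at-bat as an independent coin flip at a fixed true batting average, which no projection system actually does. The function name, the .300 talent level, and the 500 at-bats are all hypothetical.

```python
import random


def simulate_batting_averages(true_avg, at_bats, seasons, seed=0):
    """Simulate season batting averages for a hitter of fixed true talent.

    Each at-bat is an independent trial that succeeds with probability
    true_avg. The seed makes the run repeatable.
    """
    rng = random.Random(seed)
    averages = []
    for _ in range(seasons):
        hits = sum(1 for _ in range(at_bats) if rng.random() < true_avg)
        averages.append(hits / at_bats)
    return averages


# Hypothetical ".300 true talent" hitter over 1,000 simulated seasons.
averages = simulate_batting_averages(true_avg=0.300, at_bats=500, seasons=1000)
mean_avg = sum(averages) / len(averages)
print(f"mean: {mean_avg:.3f}")
print(f"worst season: {min(averages):.3f}, best season: {max(averages):.3f}")
```

The talent level never changes between seasons in this toy model, yet the best and worst seasons land well apart. That spread is the built-in variability a single projected number hides.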



Food Metaphors, Replacement Level Style

When writing my irreverent NotGraphs post on Casey Fossum, an interesting question popped into my head: how could I best explain the concept of a replacement level player using a food metaphor? In other words, is there a “replacement level” food? Not every baseball fan is a math nerd, but ALL sports fans love food. This is an indisputable truth, and means that food metaphors have the potential to be one of the most potent teaching instruments since these amazingly quirky mathematics videos.*

*Also, before you ask, this post is a direct reference to Fire Joe Morgan and their historic “Food Metaphors” tag, possibly the best thing that Ken Tremendous has ever created, ever. And yes, I’m a huge fan of “The Office”.

Before we get into the nitty gritty of finding the perfect food metaphor for replacement level, we need to know what replacement level is. In case you have forgotten (or don’t know), here’s Graham MacAree’s description of replacement level, as taken from our page in the Library:

We can define a replacement level player as one who costs no marginal resources to acquire. This is the type of player who would fill in for the starter in case of injuries, slumps, alien abductions, etc.

These are essentially the Triple-A filler players that can be found in every organization (and in copious amounts on the free agent list) every year. They cost next to nothing to acquire, can be found in massive quantities, and should only be used in case of emergency – at best, they make adequate bench players. They are, in short, the very base of major league baseball’s (triangular) talent distribution.

So with this in mind, what’s the ideal food to capture the essence of a replacement level player? Let’s take to the Twitter!



The Projection Rundown: The Basics on Marcels, ZiPS, CAIRO, Oliver, and the Rest

Now that football season is over and baseball is once again close at hand, Projection Season is well underway. Fantasy players, analysts, bloggers, and plain ol’ fans – everyone turns to projections to help them this time of year. The Hot Stove has cooled down and Spring Training has just started, so really…what else is there to do?

With that in mind, I’ve got a handful of posts on projections in the works for the next week. This is the first one, and in it I deal with a basic question: what are the different projection systems available, and how is each of them calculated? In order to know how to properly use each projection, it’s always a good idea to understand what data is taken into account and how it is used. Remember: there is no one “gold standard” for projection systems. Each system will tell you something slightly different, so whenever trying to draw conclusions from projections, it’s best to use as many sources as possible.



“Sabermetrics for Dummies”: Mainstream Media Style

Jason Collette and Tommy Rancel talking with J.B. Long from the Bright House Sports Network.

Rarely do you ever see a mainstream media outlet take the time to discuss sabermetric stats. Every now and then you’ll see a passing reference to WAR or FIP on ESPN, but the announcers have a maximum of 30 seconds to introduce the statistic, explain what it means, and make their point. These mentions are great for general awareness of sabermetric statistics, but do they actually educate anyone? They can be a good introduction to a statistic and make someone curious to learn more – and don’t get me wrong, I love when mainstream news sources mention saber stats – but to truly educate someone about sabermetrics takes more than that.

Enter the Bright House Sports Network. While Bright House is a major sports network in the Tampa Bay area, covering topics ranging from national sports stories to local high school teams, they’ve begun augmenting their baseball coverage with some sabermetric analysis. Jason Collette, Tommy Rancel, and R.J. Anderson – three premier Rays bloggers – contributed articles on the BHSN website during the latter half of the 2010 season, using their analyses as a springboard for readers to become familiarized with advanced statistics.

And now, Bright House is taking it a step further: filming “Sabermetrics for Dummies” videos with Jason, Tommy, and reporter J.B. Long. This first video is a mere introduction to the series, but more videos will be released this week and the topics will include wOBA, BABIP, LOB%, WAR, IsoP, and FIP. These are extended videos, with the idea of explaining to viewers how the sabermetric stats are calculated and why they are useful.

Is it just me or is this rather unique? Has any other mainstream sports station done something similar? I’d love to hear examples of other media outlets doing similar projects (please share!), but at least to my knowledge, the Bright House Sports Network is ahead of the curve.


Left On Base Percentage (LOB%): A Video Explanation

Analyzing pitchers is one of the most difficult things to do in baseball (at least, in the “non-playing” category). Pitchers are notoriously fickle, and their performances can vary widely from start to start and year to year. They don’t follow a set aging curve like position players (who peak at ages 27-30), but improve and decline with no overarching pattern. Some pitchers are late-bloomers and don’t peak until their 30s (e.g. Randy Johnson), while others peak in their early 20s and never reach the same level again (e.g. Scott Kazmir).

Not to mention, when you try analyzing a pitcher’s results, there are so many variables in play. How much of a pitcher’s performance is his talent shining through, and how much is the defense, opposing team, umpire, catcher, and ballpark? With no discernible difference in his pitch movement, sequencing, or velocity, a pitcher may give up 8 runs in four innings during one start yet turn around and throw an 8-inning shutout his next time out. How much of that variance should we pin on the pitcher and how much is outside his control?

These are all difficult questions without any exact answer, which is why there are a large number of pitching statistics available here at FanGraphs. In order to see past those confounding variables and get a grasp on a pitcher’s true talent level, it’s best to look at a wide range of statistics instead of relying upon one as the be-all-end-all. ERA, FIP, tERA, xFIP, BABIP, LOB%, HR/FB – all these stats tell you something different and paint a more complete picture when used together.

And so, here’s a chance to learn a bit more about one of those statistics: Left On Base Percentage (LOB%). This video is courtesy of Bradley Woodrum from DRaysBay and Tom Tango from The Book Blog: