FIP May Look Like ERA, But It’s Designed Like wOBA

If you have at least a passing familiarity with sabermetrics, you’ve probably heard something like this: Fielding Independent Pitching (FIP) is what a pitcher’s ERA should have been based on his walks, hit batters, strikeouts, and home runs. In other words, FIP is described as a predictive tool to tell you what should have happened rather than as a retrospective assessment of actual pitcher performance.

But this is wrong. This is a shorthand way of describing FIP that well-meaning analysts (myself included) have used, but I’ve come to realize that by aiming to put FIP in terms of ERA, we’ve actually made it more difficult for people to grasp and embrace what FIP is really telling us. It’s time to change the way we talk about FIP, because while the concept of defense-independent stats has gained popularity, there is often pushback against FIP as a measure of value, in part because of less-than-ideal presentation.

Read the rest of this entry »

The Beginner’s Guide To Deriving wOBA

We feature many statistics on FanGraphs, but one of the most fundamental is Weighted On-Base Average (wOBA). If you’re not familiar with the merits of wOBA in general, I invite you to head over to our full library page on it or to learn about why it’s a gateway sabermetric statistic. For our purposes, I’ll simply include the summary:

[Image: wOBA flash card showing the wOBA formula]

wOBA is designed to weigh the different offensive results by their actual average contribution to run scoring. Batting average treats all hits equally and ignores walks. OBP treats all times on base equally. Slugging percentage weighs hits based on the number of bases achieved but ignores walks. Adding OBP and SLG is better than any one of AVG/OBP/SLG, but it still isn’t quite right.

Hitting a single and drawing a walk are both positive outcomes, but they have a different impact on the inning. A walk puts the batter on first and moves up only the runners who are forced to advance, while a single could have a variety of outcomes depending on who is on base and where the ball is hit. We want a statistic that captures that nuance.

Granted, wOBA doesn’t adjust for park, league, quality of competition, or a number of other factors, but it’s a good starting point on which to build. So how do we take the beautiful chaos of baseball and create the formula listed above?

The exact numbers are going to change each year based on the run environment (how many runs are being scored league wide), but they are consistent enough that we won’t have any problems understanding each other. I’m going to use the 2015 data, but you can view every year here. Allow me to bring a chunk of that table into this post for convenience:

Seasonal Constants
Season  wOBA  wOBAScale  wBB   wHBP  w1B   w2B    w3B    wHR    runSB  runCS  R/PA  R/W    cFIP
2015    .313  1.251      .687  .718  .881  1.256  1.594  2.065  .200   -.392  .113  9.421  3.134
2014    .310  1.304      .689  .722  .892  1.283  1.635  2.135  .200   -.377  .108  9.117  3.132
2013    .314  1.277      .690  .722  .888  1.271  1.616  2.101  .200   -.384  .110  9.264  3.048
2012    .315  1.245      .691  .722  .884  1.257  1.593  2.058  .200   -.398  .114  9.544  3.095
2011    .316  1.264      .694  .726  .890  1.270  1.611  2.086  .200   -.394  .112  9.454  3.025
2010    .321  1.251      .701  .732  .895  1.270  1.608  2.072  .200   -.403  .115  9.643  3.079

You can ignore the last five columns (runSB through cFIP) for the purposes of wOBA. This is a truncated version of our Guts! page, and it shows each year’s league wOBA, the wOBA scale, and the weights for each of our six offensive events of interest. Hopefully those numbers will look similar to the wOBA equation you saw earlier.
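To make the weights concrete, here is a minimal Python sketch that applies the 2015 row of the table to a batting line. The batting line is made up for illustration and the function name `woba` is an assumption of this sketch. The denominator follows the AB + BB – IBB + SF + HBP definition used later in this post, and the walks in the numerator are unintentional walks.

```python
# 2015 weights copied from the Seasonal Constants table.
WEIGHTS_2015 = {"uBB": 0.687, "HBP": 0.718, "1B": 0.881,
                "2B": 1.256, "3B": 1.594, "HR": 2.065}


def woba(line, weights=WEIGHTS_2015):
    """Weighted On-Base Average for one batting line (a dict of counting stats)."""
    ubb = line["BB"] - line["IBB"]  # only unintentional walks get credit
    numerator = (weights["uBB"] * ubb
                 + weights["HBP"] * line["HBP"]
                 + weights["1B"] * line["1B"]
                 + weights["2B"] * line["2B"]
                 + weights["3B"] * line["3B"]
                 + weights["HR"] * line["HR"])
    denominator = (line["AB"] + line["BB"] - line["IBB"]
                   + line["SF"] + line["HBP"])
    return numerator / denominator


# Hypothetical batting line, invented purely for illustration.
hypothetical_line = {"AB": 500, "BB": 60, "IBB": 5, "HBP": 5, "SF": 5,
                     "1B": 90, "2B": 30, "3B": 3, "HR": 25}
player_woba = woba(hypothetical_line)
print(round(player_woba, 3))
```

For this invented line the result works out to about .380, comfortably above the 2015 league mark of .313.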

Our ultimate goal is to create a statistic that measures each offensive action’s context-neutral contribution to run scoring, because scoring runs is the currency of baseball. We have decided that we want to measure walks, HBP, singles, doubles, triples, and home runs. If you wanted to, you could build wOBA with more nuanced stats like fly ball outs, ground outs, strikeouts, etc.; it would just get more complicated without much added value.

We have a specific goal and the set of offensive actions we want to measure, but now we need a method of putting them together.

Run Expectancy

The first thing we need is a run expectancy matrix. If you need a complete introduction to the concept, head over to this page. In general, run expectancy measures the average number of runs scored (through the end of the current inning) given the current base-out state.

Base-out states are a record of the number of outs (0, 1, or 2) and how many runners are on base and where (no one on, man on 1B, men on 1B and 3B, etc). There are three out-states and eight base-states, meaning that there are 24 base-out states. Each plate appearance has a base-out state.

Let’s use one out, man on first as our example. In order to calculate the run expectancy for that base-out state, we need to find all instances of that base-out state from the entire season (or set of seasons) and find the total number of runs scored from the time that base-out state occurred until the end of the innings in which they occurred. Then we divide by the total number of instances to get the average. If you do the math using 2010-2015, you get 0.509 runs. In other words, if all you knew about the situation was that there was one out and a man on first, you would expect 0.509 runs to score, on average, between that moment and the end of the inning.
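The averaging step described above can be sketched in a few lines of Python. This is only an illustration: the function name `run_expectancy`, the way a base-out state is encoded, and the toy observations are all assumptions of this sketch, not real play-by-play data.

```python
from collections import defaultdict


def run_expectancy(observations):
    """Average runs scored through the end of the inning for each base-out state.

    observations: iterable of (state, runs_rest_of_inning) pairs, one per
    plate appearance, where state is any hashable label such as ("1B", 1).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for state, runs in observations:
        totals[state] += runs
        counts[state] += 1
    return {state: totals[state] / counts[state] for state in totals}


# Made-up observations: (runners, outs) and runs scored from then to inning's end.
toy_observations = [
    (("1B", 1), 0), (("1B", 1), 2), (("1B", 1), 0), (("1B", 1), 0),
    (("empty", 0), 1), (("empty", 0), 0),
]
toy_matrix = run_expectancy(toy_observations)
print(toy_matrix[("1B", 1)])
```

With a full season of real data in place of the toy list, the same loop would fill in all 24 base-out states.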

You repeat the process for the other 23 base-out states and wind up with a table like this:

Run Expectancy Matrix 2010-2015
Runners     0 outs  1 out   2 outs
__ __ __    0.481   0.254   0.098
1B __ __    0.859   0.509   0.224
__ 2B __    1.100   0.664   0.319
1B 2B __    1.437   0.884   0.429
__ __ 3B    1.350   0.950   0.353
1B __ 3B    1.784   1.130   0.478
__ 2B 3B    1.964   1.376   0.580
1B 2B 3B    2.292   1.541   0.752
SOURCE: Tom Tango

The table listed here was calculated by Tom Tango using 2010-2015 data for the entire league and serves as a good baseline. At FanGraphs, we park adjust the matrix for each game, so the exact numbers might be a touch different if you’re trying to play along at home in excruciating detail.

Linear Weights

Now that you have a run expectancy matrix, you need to learn how to use it. Each plate appearance moves you from one base-out state to another. So if you walk with a man on first base and one out, you move to the “men on first and second and one out” box. That box has an RE value of 0.884. Because your plate appearance moved you from 0.509 to 0.884, that PA was worth +0.375 in terms of run expectancy.

Every plate appearance has one of these values, either positive or negative. You can learn more about this by following the earlier link.
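Here is a small Python sketch of that bookkeeping, using the 2010-2015 matrix from the table above. The state encoding and function names are assumptions of this sketch. It also adds any runs that score on the play itself, which is the usual convention for run expectancy changes even though the walk example doesn’t need it, and it treats the third out as a state worth zero.

```python
# Run expectancy by runners on base; each tuple is (0 outs, 1 out, 2 outs).
RUN_EXPECTANCY = {
    "___": (0.481, 0.254, 0.098),
    "1__": (0.859, 0.509, 0.224),
    "_2_": (1.100, 0.664, 0.319),
    "12_": (1.437, 0.884, 0.429),
    "__3": (1.350, 0.950, 0.353),
    "1_3": (1.784, 1.130, 0.478),
    "_23": (1.964, 1.376, 0.580),
    "123": (2.292, 1.541, 0.752),
}


def expected_runs(runners, outs):
    """Look up run expectancy; once there are three outs, nothing more can score."""
    if outs >= 3:
        return 0.0
    return RUN_EXPECTANCY[runners][outs]


def run_value(before, after, runs_scored=0):
    """Change in run expectancy for one plate appearance, plus runs on the play."""
    return expected_runs(*after) - expected_runs(*before) + runs_scored


# The walk from the text: man on first, one out -> first and second, one out.
walk_value = run_value(("1__", 1), ("12_", 1))
print(round(walk_value, 3))
```

The same function handles negative results: a strikeout for the third out with the bases empty goes from 0.098 to zero, so it is worth -0.098.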

What we want to determine is the average run value of a walk, HBP, single, double, triple, and home run. To do this, we take the total RE value of all walks (unintentional in this case), for example, and divide that number by the number of walks in that season. You’re going to wind up getting something around 0.3. You repeat this for the other five actions. This gives you the runs above average produced by each of these kinds of events.
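As a rough sketch of that step in Python: group every plate appearance by its event type and average the run expectancy changes. The function name and the toy values below are invented for illustration and are not real 2015 numbers.

```python
from collections import defaultdict


def average_run_values(plays):
    """Average run expectancy change per event type.

    plays: iterable of (event, run_value) pairs, one per plate appearance.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for event, value in plays:
        totals[event] += value
        counts[event] += 1
    return {event: totals[event] / counts[event] for event in totals}


# Toy data only; a real calculation would use every PA in the season.
toy_plays = [("BB", 0.375), ("BB", 0.225),
             ("1B", 0.50), ("1B", 0.40),
             ("OUT", -0.25)]
toy_weights = average_run_values(toy_plays)
print(toy_weights)
```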

In theory, we could essentially be done right now because we have everything we need to build a statistic that will weigh the offensive actions properly. However, the inventors of wOBA decided that it would probably be best to scale it to something familiar to make it easier to understand. And they picked OBP.


We have the runs above average for walks (0.29), HBP (0.31), singles (0.44), doubles (0.74), triples (1.01), and home runs (1.39), but what we want to do now is put wOBA on a scale that will look like OBP. In OBP, an out is worth zero, so the first thing we want to do is adjust the run value scale so that an out is equal to zero.

There is an easy way to do this. First, we need to find the linear weight for all outs using the same method we used to find the value for the other events. We’ll call it -0.26 for 2015. This means that an out is worth 0.26 runs less than the average PA when it comes to run expectancy. What we want to do now is add 0.26 to each of our run values so that outs are equal to zero. So for walks, which we said are worth 0.29 runs above average, we bump those up to 0.55 runs relative to an out. Using linear weights, walks are worth 0.55 runs more than outs. We repeat this for each of the five other positive offensive outcomes.

2015 Linear Weights (Relative To Outs)
Event Run Value
BB 0.55
HBP 0.57
1B 0.70
2B 1.00
3B 1.27
HR 1.65
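The shift from “runs above average” to “runs relative to an out” is just one subtraction per event. A minimal Python sketch, using the rounded 2015 values quoted in the text (the variable names are assumptions of this sketch):

```python
# Runs above average per event, as quoted in the text (rounded).
RUNS_ABOVE_AVERAGE = {"BB": 0.29, "HBP": 0.31, "1B": 0.44,
                      "2B": 0.74, "3B": 1.01, "HR": 1.39}
OUT_VALUE = -0.26  # run value of an out relative to the average PA

# Subtracting the out value (i.e. adding 0.26) makes an out worth exactly zero.
relative_to_outs = {event: round(value - OUT_VALUE, 2)
                    for event, value in RUNS_ABOVE_AVERAGE.items()}
print(relative_to_outs)
```

This reproduces the table above: 0.55 for a walk up through 1.65 for a home run.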

As you’ll notice, these are not the weights you saw in the wOBA equation. We’re not done scaling them yet. We know that we want BB, HBP, 1B, 2B, 3B, and HR in the numerator of the wOBA formula and plate appearances (minus weird stuff like sac bunts) in the denominator, so what we’re going to do is calculate “wOBA” for the entire league using the linear weights in this table and the total number of events of each type.

In other words, we’re going to multiply 0.55 times the number of walks in MLB in 2015 and add that to 0.57 times the number of HBP and so on, and then divide the entire sum by the number of plate appearances (really AB + BB – IBB + SF + HBP). If we do that, we wind up with 0.250.

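But remember that we want wOBA to look like OBP. So we need to scale the entire thing so that the league’s wOBA is .313 (to match OBP with IBB removed). To do that, we divide .313 by .250 and get the wOBA Scale, which is listed as 1.251 for 2015. (The rounded numbers shown here give 1.252; the published figure presumably comes from unrounded inputs.)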

We take the wOBA Scale and multiply it against the linear weights from the table above and voilà, we have ourselves the weights listed in the wOBA equation. And we’re done!
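The two scaling steps can be sketched in Python as follows. The function names are assumptions of this sketch, and because it starts from the rounded weights and rounded league figures quoted in this post, its output lands within a few thousandths of the published 2015 weights rather than matching them exactly.

```python
# Linear weights relative to outs (rounded, from the table in this post).
RELATIVE_TO_OUTS = {"BB": 0.55, "HBP": 0.57, "1B": 0.70,
                    "2B": 1.00, "3B": 1.27, "HR": 1.65}
# Published 2015 weights from the Seasonal Constants table, for comparison.
PUBLISHED_2015 = {"BB": 0.687, "HBP": 0.718, "1B": 0.881,
                  "2B": 1.256, "3B": 1.594, "HR": 2.065}


def woba_scale(target_league_woba, unscaled_league_woba):
    """Multiplier that stretches the unscaled league figure to the OBP-like target."""
    return target_league_woba / unscaled_league_woba


def scale_weights(weights, scale):
    """Apply the wOBA Scale to every out-relative linear weight."""
    return {event: value * scale for event, value in weights.items()}


scale_from_rounded = woba_scale(0.313, 0.250)      # 1.252 with rounded inputs
scaled = scale_weights(RELATIVE_TO_OUTS, 1.251)    # using the published scale

for event in RELATIVE_TO_OUTS:
    print(event, round(scaled[event], 3), PUBLISHED_2015[event])
```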

Concluding Thoughts

It’s important to remember that wOBA is one implementation of a linear-weights-based offensive metric. Baseball Prospectus has their own version, True Average, which is based on the same pillars and implemented differently. Choosing to scale it to OBP is an aesthetic choice. We could scale it to batting average or to nothing. The important part is just that we understand the scale we’re using.

The main idea is that we’re giving each type of outcome a value based on the average change in run expectancy that particular outcome yields, so that each kind of event gets the right amount of credit. Doing so does not make wOBA a perfect statistic; it simply makes it a better one than the traditional AVG/OBP/SLG.

There are lots of little nuances you can add to something like wOBA to get it closer and closer to the truth. All we’re doing here is creating the foundation for all of that work.

How To Use FanGraphs: Spray Charts

Once upon a time, all we had were box scores. We might know a player went 1-for-3 with a double and a walk, but we wouldn’t know how exactly all of the game’s events unfolded. We’ve come a long way since then, getting play-by-play data, pitch-by-pitch data, video tracking, PITCHf/x, and Statcast. We have results data stretching back more than a century, but the way those results came about gets easier to understand with new information.

What direction was the double hit? How far did it go? Who fielded it? Hearing a player hit a double seems like specific information, but there’s plenty more you might want to know about that event. One of the ways we communicate that information is through Spray Charts.

There are certainly other ways to communicate information of this nature, but one approach is to display it visually on a diamond graphic, and you can find our implementation of spray charts on the player pages here at FanGraphs.

Read the rest of this entry »

A Place To Learn About Sabermetrics

Pitchers and catchers are reporting for Spring Training this week and, before you know it, there will be real, live baseball happening in Arizona and Florida. As the season approaches, I’d like to take a little time to welcome any statistical newcomers to the FanGraphs Library.

If you’ve made it this far, you’ve almost certainly read articles on the main site, visited some of our player pages or leaderboards, or played our fantasy baseball game, Ottoneu. But what you might not know is that we have an entire section of the site devoted to helping you get the most out of the information housed at FanGraphs.

The most well-known components of the Library are detailed descriptions of the statistics available at the site. These pages are written presuming no previous knowledge of sabermetrics, statistical theory, or mathematics. If you understand the rules of the game, you’ll have no trouble following along. For example, if you come across “wOBA” in an article or on one of the stat pages and have no idea what it stands for, our Library entry is here to help. Not only can you find a basic description of that stat, but there is also a detailed breakdown of how to calculate it, how to use it, why it is important, and all sorts of other information that will help you get more out of the site.

Read the rest of this entry »

Basic Principles of Free Agent Contract Evaluation

While it’s January and many free agents have decided where they will be playing in 2016 and beyond, there are still some notable players without new teams. One thing I’m struck by each offseason is how frequently some people comment on new contracts without a good grasp of how teams and players settle on a term of years and dollars. In particular, it’s common to hear these comments from pundits and fans who aren’t quite as plugged into the game as regular readers of sites like FanGraphs.

So for the new reader, or the old one looking to explain the finer points to their friends, here are some basic principles about free agent contracts to remember when thinking about their prudence.

Read the rest of this entry »

The Beginner’s Guide To Aging Curves

This time of year is about roster decisions. Teams are working to build their 2016 rosters with an eye on how 2016 fits into their overall plan. Some teams are looking at their current roster and payroll and deciding to go for it, while others are setting themselves up for a bright future. Clubs are making trades and signing free agents, and from the outside, we’re trying to figure out which moves are good and which aren’t.

There are a lot of factors that go into evaluating a particular transaction or set of transactions. Far too many to talk about all at once. But we can generally agree that our attempt to forecast future player performance is central to any effort. In order to know if the Cubs made a smart move in signing Ben Zobrist, we need to develop some prediction about how good Zobrist will be over the life of his four-year deal. Obviously, this is a tricky business.

We are trying to project Zobrist’s future. We’ve talked about projections in this space before. They are estimates of true talent, adjusted for aging. You can read more about the basics here, but this article will focus on the aging component. In order to make decisions about players, we need to know how good they are presently and how those skills will improve or decline in the future.

Read the rest of this entry »

Understanding The Qualifying Offer

The World Series ended just over a week ago, but the offseason is already in full swing. Free agents are free to sign with any club they wish and we’ve even had our first significant trade. The MLB offseason is a little slower to develop than some of the other major sports, but there is plenty to follow from the start. One of the first steps in the offseason journey is the extension and acceptance or decline of the Qualifying Offer (QO). The qualifying offer is a pretty simple concept that comes along with some relatively important consequences.

It works like this. Teams who are losing free agents are able to offer those free agents a one-year contract which the players can choose to accept or reject. If the player rejects the contract and signs with another team, the team who lost the player gets an extra (between the first and second rounds) draft pick the following June and the team signing the player loses their first round pick the following year. Because this is baseball, there are a number of nuances to that description.

Read the rest of this entry »

Two Different Ways To Be Wrong: Sequencing and Bad Projections

Baseball analysts are frequently wrong. Everyone who writes for this website picked the Nationals to win the NL East, for example. We also split between Detroit and Cleveland for the AL Central, with no votes for the Royals. Predicting baseball is difficult because it’s a game with many variables and lots of randomness. It is probably very unlikely that a world class chess player would lose to a novice in any one game, but it’s especially unlikely they’d lose more often than not over 100 games. In baseball, there are so many things impacting single games, and there are 2,430 games, so predicting a full season is especially challenging.

And as has been noted in a lot of places, we didn’t do a great job predicting the 2015 season. The Rangers, Astros, Blue Jays, Royals, and Mets weren’t exactly consensus playoff picks. Whoops!

This has led to plenty of pushback against sites like ours and serves as a criticism of the work we do when it comes to predicting the game. Presumably, if we can’t accurately predict which teams will be good and bad, you might not want to put a ton of stock in what we’re saying. Surely, no reasonable person would hold anyone to a standard of perfection, but whiffing often can be a sign of a flawed process.

I’m not going to litigate exactly where our projections may or may not be flawed in this post, but rather, I want to separate out two very different components of overall wrongness. In fact, there are essentially two ways in which our overall estimates of the league can be incorrect, and you should understand the forces at play when determining how much stock to put into the work done here and at other sites like Baseball Prospectus.

Read the rest of this entry »

The Beginner’s Guide To Pulling A Starting Pitcher

Unfortunately, if you are a major league front office employee, this is not a presentation of ground-breaking new research regarding the prediction of pitcher meltdowns that will save you innumerable frustrations. Rather, this post provides a summary of some of the basic factors that go into the decision to pull a starting pitcher. If you’re new to the game or are just starting to pay attention to sabermetrics, it’s likely that you haven’t really ever had a rundown of the different decisions a manager needs to make when plotting out their mid- to late-inning choices.

The conventional wisdom is generally about two things, fatigue (usually in terms of pitch count) and effectiveness (usually in terms of a stat line or recent hitter performance). A pitcher will get yanked after 100-115 pitches unless they are absolutely dealing or a pitcher will get yanked if they’re getting hit around a lot. Over the first seven or eight innings, that’s typically the mindset of many. Of course, there’s the obnoxious “save situation” problem that arises in the ninth inning, but we’ll leave that for another day.

But in general, while fatigue and effectiveness are good variables, the decision to pull a starting pitcher is multi-dimensional. Let’s consider some of the factors in more depth.

Read the rest of this entry »

Context: Neutral or Dependent?

Every statistic is an answer to a question. “How often does a batter reach base?” is answered by On-Base Percentage. “How many extra bases does a hitter average per at bat?” leads us to Isolated Power. A statistic is only as good as its generating question, and if you’re asking a silly question, the statistic may give you a silly answer. Stats like pitcher wins, saves, and RBI all answer questions, but not the questions we really want answered.

RBI, for example, tells you how many times a batter has had their hit, walk, or sacrifice fly lead directly to a runner crossing the plate. On the surface, this may seem like a useful statistic as a measure of run production. But you soon realize that RBI is reliant on the number of opportunities each player has to drive in runs. Coming to the plate with a man on first and coming to the plate with a man on third are not the same type of RBI opportunity, even if the batter hits a single in both situations.

In other words, RBI is a very crude context-dependent statistic. Generally, RBI isn’t very useful because it doesn’t provide you with a lot of information about an individual player’s role in the production of a run. If they have a lot of RBI, did they have a ton of opportunities? Did they cash in on a large percentage of their opportunities? You don’t really know. But the fact that RBI doesn’t provide much insight does not mean that context-dependent stats aren’t valuable when designed properly. Essentially, context-neutral and context-dependent stats are both useful, but they are simply answering different questions.

Read the rest of this entry »

The Beginner’s Guide To Single-Season BABIP

Batting Average on Balls in Play (BABIP) is one of the most commonly cited statistics in sabermetric analysis, and its role in mainstream coverage of the sport is growing as well. BABIP is a measure of how often “balls in play,” or non-home run batted balls, fall for hits. It’s an easy statistic to understand, but it’s not always the easiest statistic to use properly.
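For readers who like to see the arithmetic, here is a short Python sketch of the standard BABIP formula, (H – HR) / (AB – K – HR + SF). The formula itself isn’t spelled out in this excerpt, and the stat line below is hypothetical.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """Batting Average on Balls in Play: non-HR hits per non-HR ball in play."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    return (hits - home_runs) / balls_in_play


# Hypothetical season line, invented for illustration.
example_babip = babip(hits=150, home_runs=20, at_bats=550,
                      strikeouts=100, sac_flies=5)
print(round(example_babip, 3))
```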

The problem occurs when people focus too heavily on one of the three main drivers of BABIP, which are player quality, defense, and luck. Most of the discussion surrounding BABIP is on the amount of luck that is involved. For some people, BABIP is simply a measure of how lucky or unlucky a player is getting over a period of time. But in reality, that is only part of the equation. Certain hitters consistently produce higher BABIP than others, and the presence of a good defense behind a pitcher can absolutely suppress their BABIP even before we consider the role of luck in the process.

Read the rest of this entry »

How To Use FanGraphs: Live Scoreboard

You’ve probably had a chance to peruse our leaderboards and player pages, and hopefully you’ve had a chance to check out our posts about getting the most out of the leaderboards and player pages. Another thing you might have seen on the site, or being shared on the internet, is our live win probability graph. It looks like this:

[Image: live win probability graph]

Read the rest of this entry »

Using FanGraphs to Find Bryce Harper Facts

There are a lot of reasons you might have arrived at FanGraphs. Perhaps you’re here for the articles or you’re just trying to find a detailed fantasy baseball game, but there’s a good chance that our various statistics are a big part of the draw for you. We host a lot of numbers and there’s a lot you can do with them if you know where to look. Last year, I put together a primer on how to use the FanGraphs Leaderboards to aid readers in their efforts to manage the information we provide.

If you’re new to the site, that’s a great place to start, but if you’re somewhere between newbie and expert, this post might help you get the most out of what we have to offer. When you’re thinking about baseball, there are a lot of questions you might want to answer. How do these two players compare? How does this player measure up historically? How rare is this particular thing?

Today, we’re going to use Bryce Harper’s exciting 2015 season to explore some of the features available at FanGraphs. This isn’t an exhaustive rundown of the tools, simply an explanation of some of the more useful ones that don’t get enough recognition. If you’re reading this in the future, the screen grabs for 2015 are current through July 18, 2015, but the links will update automatically with new data.

Read the rest of this entry »

The Beginner’s Guide to Service Time

While there’s rightfully plenty of focus on the events on the field, teams and fans are also interested in getting the right players onto the roster in the first place. This is why there’s so much focus on free agency, the trade deadline, and the draft. Games are won and lost on the field, but it’s a whole lot easier to win if you’ve assembled a good roster. As a result, we spend a lot of time evaluating roster moves. We care about how well teams are using their resources to assemble a team. One of the important concepts to understand when evaluating these moves is service time.

Service time is exactly what it sounds like: the number of years and days of major league service a player has in their career. Typically, it’s written as Year.Days, so we would express a player with four years and one hundred and fifteen days of service time as 4.115. You earn a day of service time for every day you are on the 25-man roster or the major league disabled list during the regular season. If you’re called up on June 22 and you’re sent down after June 28, you’ve earned seven days of MLB service. Your team doesn’t have to play a game for you to accrue a service day.

There are usually about 183 days in an MLB season, but a player can only earn a maximum of 172 days per year. That means if you’re on the roster for 178 days, you earn 172 days. If you’re on the roster for 183 days, you also earn 172 days. Not surprisingly, 172 days of service is equal to one year of service.
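The counting rules above are simple enough to sketch in Python. The function names are assumptions of this sketch, and it covers only what is described here: inclusive day counting, the 172-day cap per season, and the Year.Days notation.

```python
from datetime import date

MAX_DAYS_PER_SEASON = 172  # also the number of days that equals one year


def days_on_roster(first_day, last_day):
    """Count roster days inclusively, whether or not the team played."""
    return (last_day - first_day).days + 1


def credited_days(roster_days):
    """A player can earn at most 172 service days in a single season."""
    return min(roster_days, MAX_DAYS_PER_SEASON)


def format_service_time(years, days):
    """Write service time in Year.Days notation, e.g. 4 years, 115 days -> 4.115."""
    return f"{years}.{days:03d}"


stint = days_on_roster(date(2015, 6, 22), date(2015, 6, 28))
print(stint, credited_days(178), credited_days(183), format_service_time(4, 115))
```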

Read the rest of this entry »

Team Record, Pythagorean Record, and Base Runs

The currency of baseball is wins. The ultimate goal is to win enough games to make the postseason and then win enough games in the postseason to win a World Series. For that reason, we care a lot about what leads to wins and losses, and outscoring your opponent is the only path to victory. This is all pretty obvious, but if we unpack it we stumble on to some pretty important realizations.

Before we go any further, note that this post stays at 30,000 feet and serves as an introduction to Pythagorean Record and Base Runs. I won’t be going into the details of the exact formulas, but rather why these statistics are useful when looking at the team level. If you’re already well-versed in the various expected records, there probably isn’t a lot of new information below.

Read the rest of this entry »

Measuring Pitching Value is Complicated

You’re likely aware that there are different versions of Wins Above Replacement (WAR) housed here, at Baseball-Reference, and at Baseball Prospectus (called WARP). For a lot of people, this makes the statistic confusing because it seems like there shouldn’t be multiple ways to calculate something with the same name. To the credit of the critics, somewhere along the way we should have agreed on a way to make it easier to communicate which statistic is which that’s a little more clear than fWAR, rWAR, and WARP, but that’s not the focus of the discussion today.

When it comes to WAR for position players, the differences among the models are less philosophical and more technical. The sites use different defensive components, different base running stats, and a few other differences in the same vein, but the overall approach is pretty much equivalent. The inputs are different, but the different WARs agree on what should be measured. When it comes to pitching, it gets more complicated because what should be measured becomes the debate itself. This article doesn’t intend to tell you which WAR is best, but rather to walk through the decisions that one needs to make when evaluating a pitcher’s value.

Read the rest of this entry »

The Beginner’s Guide To Plate Discipline

At its heart, baseball is a battle to control the strike zone. There are plenty of other things going on, but the origin of the action is over the plate. Good hitters make good decisions about when to swing and when to take and good pitchers attempt to negatively impact that decision-making process. As the importance of walks and working counts became clear over the last generation, hitters who knew the zone and pitchers who could generate swinging strikes became very popular.

Throughout history, batters have been judged by their results. Things like batting average and RBI have given way to wOBA and WAR, but in general the average fan cares about the outcomes rather than the process. Plate discipline numbers are inherently process based. You don’t get credit in the box score for taking a pitch just off the plate, but taking a pitch just off the plate is probably going to help you do things that lead to runs, like walking and getting good pitches to hit.

Read the rest of this entry »

The Difference Between Range and Positioning

Perhaps one of the biggest objections people have with the current state of defensive metrics is that the stats don’t account for the starting position of the defender. Shift plays are excluded from the calculations, but when a center fielder plays in 20 feet, the system doesn’t know that he’s starting from a different spot than the average center fielder, which could obviously lead to some imprecise accounting.

This is true for every position except pitchers and catchers, as the starting location of the fielder influences the probability they will make a play, independent of anything they do from the moment the ball is pitched. If you start out of position, even if you run at top speed and take a perfect route, you might not be able to offset the initial disadvantage of not being in the right spot to begin with. This creates problems, but there’s a lot of nuance to these problems that is worth discussing, even as we get closer to having Statcast and rendering the discussion irrelevant (we hope!).

Read the rest of this entry »

How To Use FanGraphs: Depth Charts

In addition to the daily analysis and normal statistical offerings, FanGraphs has added some pretty useful and powerful features over the last couple of years. Anchoring a lot of those features are the Depth Charts, which in addition to providing information on their own, power the playoff odds and projected standings we host on the site.

The Depth Charts are pretty simple in theory. They blend together two of the leading projection systems (Steamer and ZiPS) and then scale those projections to our expectations about playing time. The Depth Charts are updated constantly to provide the most up-to-date snapshot possible for the current state of a team, league, or position. You can think of the Depth Charts as the baseline projections for the entire site, as they are the input for the projected standings, playoff odds, and game odds.

As far as the basic Depth Charts are concerned, there are essentially three different views. You can look at a team’s Depth Chart, you can look at Depth Charts by position, and you can look at the summary data of both of those at once. To generate each of the charts, we take a 50/50 mix of Steamer and ZiPS for the rate stats and then our staff manually allocates playing time based on what we expect teams to do with their lineups and injury histories.
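As a rough illustration of that recipe (not FanGraphs’ actual code), the Python sketch below averages two per-plate-appearance rates and scales the blend to a manually assigned playing time. The function names and every number in it are hypothetical.

```python
def blend_rate(steamer_rate, zips_rate):
    """50/50 mix of the two systems' projected rate for one stat."""
    return 0.5 * steamer_rate + 0.5 * zips_rate


def project_total(steamer_rate, zips_rate, projected_pa):
    """Scale the blended per-PA rate to the playing time assigned by staff."""
    return blend_rate(steamer_rate, zips_rate) * projected_pa


# Hypothetical inputs: home runs per PA from each system, and staff-assigned PA.
blended_hr_rate = blend_rate(0.040, 0.050)
projected_hr = project_total(0.040, 0.050, projected_pa=600)
print(round(blended_hr_rate, 3), round(projected_hr, 1))
```

Keeping the rate blend separate from the playing time means a change in expected plate appearances rescales the totals without touching the underlying projection.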

Steamer and ZiPS update nightly throughout the season and our playing time estimates change every 15 minutes (if necessary). If a player gets hurt, we update their playing time. If a player gets moved to the pen or changes positions, we update the Depth Charts. Also, the Depth Charts are showing what we expect to happen for the rest of the season, not the stat line we expect them to end the season with.

As always, when you’re dealing with constantly updating information, there are occasionally bugs. If you see something that looks obviously wrong, it’s likely just a database error that will resolve itself once the system updates in a few minutes.

As far as viewing options, you can look at the Depth Charts in team view, in position view, or in summary view. In team view, you get a breakdown of a single team by position, meaning on the Blue Jays page there’s a box for catchers, first basemen, etc., with the expectation that each position for each team will receive 700 PA per season. Obviously that will vary a bit, but it’s a good rule in general. Each team also has a box for all positional players and all pitchers, as well as a box on the right that shows you where they stand overall.

In position view, you can look at every team’s Depth Chart at any one position. For example, here is the page for catchers. This allows you to compare positions around the league and see which group of backstops is most valuable. Obviously these rankings are based on the projection systems and our playing time estimates, so if you believe playing time will shake out differently than we do, you might expect to see a different overall ranking.

Finally, this handy grid collapses those two views into one. You can’t see all of the players in that view, but it puts together each team’s expected WAR at each position so that you can quickly compare how teams and positions stack up against each other.

The Depth Charts are very useful for a couple of reasons. First, they blend two projection systems together without you having to do any of the work, and that’s helpful because aggregate projections are better than any one system. Second, playing time is controlled by humans. While projection systems are much better at forecasting performance than people, projection systems aren’t very good at figuring out how much playing time a player is actually going to get. Finally, the Depth Charts gather a lot of information in one place. We’ve had projections on the site for years, but having them built into the system like this allows you to make a lot of comparisons and see where teams are strong or weak.

So as you get back into the swing of things this season, the Depth Chart pages will be a valuable resource if you want to look into the future. Obviously, the charts are only as good as their inputs, but if you care at all about the inputs, the way the data is presented is really helpful.

The Beginner’s Guide to Sample Size

A baseball season is the amalgamation of a lot of little events. Each pitch fits into a plate appearance which fits into an inning which fits into a game which fits into a series which fits into a season. That’s a lot of little data points flowing into an overall end result. We care a lot about which players will have good seasons and careers. It matters to us that we can distinguish between good players and bad players, but doing so requires that we understand which chunks of data are meaningful and which aren’t.

Enter sample size. You’ve heard this phrase plenty over the last few years when talking about baseball statistics and it’s usually a conversation ended rather than a conversation started. Someone cites a stat and then another person says it doesn’t matter because the sample size is too small. What does that mean and how should we properly think about sample size in baseball?

Read the rest of this entry »