## wOBA As a Gateway Statistic

Despite all of the rhetoric and talk-radio bluster, sabermetric principles and statistics aren’t actually very complicated. It might take a sharp statistician or savvy programmer to derive perfect park factors, but it doesn’t take anything more than a curious mind to understand and apply the basics. In my time working to help spread these principles, one of the most common and useful questions I get is about which few statistics a person should learn when trying to get into the world of advanced stats.

On Wednesday during my chat I got such a question. Here’s how I responded:

I’ve handed out this advice before and I think it holds up pretty well. If you can master those three or four concepts, everything else falls into place pretty easily. Baseball is a constant struggle to score more runs than the other team, and unpacking the events to determine their specific contribution to that act is really all you have to do. Not only will Weighted On-Base Average (wOBA) help you do that, but it will unlock many more sabermetric doors.

Oftentimes you might hear a sabermetrician tell you why batting average isn’t as useful as wOBA, but instead let’s pretend you’ve been watching baseball your entire life, you’ve somehow never seen an offensive statistic, and you’ve been tasked with building one. I suspect, if you had the proper tools, you would create wOBA *before* you even thought to create batting average or OPS.

If you were to build a rate statistic you would want something that included all offensive actions (walk, hit by pitch, single, double, etc.) and you would want to weight those actions based on their relative contributions to run scoring. You would want to assign some value for a walk and then some greater value for a single and so on. That’s intuitive. That’s simple. That’s wOBA.

Essentially, the only other steps Tom Tango used when he invented wOBA were to scale it to OBP (so that we could more easily transition to the new number) and adjust the values so that the value of an out is equal to precisely zero.

As a result of this rather straightforward thought process, you now have a statistic that properly weighs all offensive actions and is already scaled in a way that is quite familiar. Why would you only want to know how many hits a batter has in the plate appearances in which he doesn’t walk, get hit, or sacrifice? Why would you want a statistic that treats a home run and a single equally? If you can do better, you wouldn’t.

*wOBA = (0.690×uBB + 0.722×HBP + 0.888×1B + 1.271×2B + 1.616×3B + 2.101×HR) / (AB + BB – IBB + SF + HBP)*

The wOBA formula (shown above for 2013) might look a little intense, but it only requires basic multiplication and division, and it’s not like you ever calculate a player’s stats by hand anymore.
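As a rough illustration of the formula above, here is a minimal Python sketch that plugs a stat line into the 2013 weights quoted in this article. The function name, the argument names, and the sample stat line are invented for illustration and do not describe a real player.

```python
# 2013 wOBA weights, copied from the formula quoted in the article.
WEIGHTS_2013 = {
    "uBB": 0.690,
    "HBP": 0.722,
    "1B": 0.888,
    "2B": 1.271,
    "3B": 1.616,
    "HR": 2.101,
}


def woba(ab, bb, ibb, hbp, sf, singles, doubles, triples, hr, weights=WEIGHTS_2013):
    """Weighted On-Base Average for one stat line (illustrative helper)."""
    ubb = bb - ibb  # unintentional walks
    numerator = (
        weights["uBB"] * ubb
        + weights["HBP"] * hbp
        + weights["1B"] * singles
        + weights["2B"] * doubles
        + weights["3B"] * triples
        + weights["HR"] * hr
    )
    # Denominator as written in the article: AB + BB - IBB + SF + HBP.
    denominator = ab + bb - ibb + sf + hbp
    if denominator == 0:
        raise ValueError("stat line has no qualifying plate appearances")
    return numerator / denominator


# Hypothetical stat line, made up for this example.
sample = woba(ab=500, bb=60, ibb=5, hbp=5, sf=5,
              singles=100, doubles=30, triples=5, hr=20)
print(f"{sample:.3f}")
```

Outs never appear in the numerator. They only enlarge the denominator, which is what "the value of an out is equal to precisely zero" means in practice.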

What’s so great about wOBA, beyond its tangible analytic value, is that it can set you up for success across the sabermetric kingdom. You can turn wOBA into runs above average by subtracting out the league average wOBA and dividing by the wOBA scale (both found here) and then multiplying by plate appearances. wOBA doesn’t include park factors, but with a few adjustments wOBA turns into wRC+, which is the most comprehensive offensive rate statistic we have.
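The conversion to runs above average described here can be sketched in a few lines of Python. The function name is my own, and the league wOBA and wOBA scale in the example call are placeholder values chosen for easy arithmetic, not official constants; look up the real figures for the season you care about.

```python
def runs_above_average(player_woba, league_woba, woba_scale, pa):
    """Convert wOBA to batting runs above average (no park adjustment).

    Follows the steps in the text: subtract league wOBA, divide by the
    wOBA scale, multiply by plate appearances.
    """
    return (player_woba - league_woba) / woba_scale * pa


# Placeholder inputs for illustration only, not real league constants.
example = runs_above_average(player_woba=0.350, league_woba=0.320,
                             woba_scale=1.25, pa=600)
print(round(example, 1))
```

A hitter exactly at the league wOBA comes out at zero runs above average, regardless of playing time.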

Not only is wOBA the foundation of quality offensive measurement, the process by which we found wOBA teaches us to think like sabermetricians. You have to think about the value of every individual action. How much does Event A change the odds that a run will be scored and how does that relate to Event B?

If that last paragraph seemed painfully obvious, it’s because the fundamentals of sabermetrics are extremely simple and wOBA is a great illustration of that point. It might be a little tricky to calculate a player’s Wins Above Replacement (WAR), but the actual building blocks are simple. How many runs do his actions collectively add to his team compared to a freely available player? wOBA teaches you to think in terms of *relative value* and in the language of *runs and wins*. If you can master those two concepts, you have essentially passed Sabermetric Theory 101.

It takes more skill to become a quality analyst with the ability to unpack performance and project into the future, but if you’re only looking to measure past performance, learning wOBA more or less does the trick on offense and provides you with the thought process necessary to interpret the rest of a player’s game.

So while more information and more data are always better, wOBA is really one of the statistics at the heart of sabermetrics. It’s simple, intuitive, useful, and it sets you up to learn other statistics quite easily.

*Have a question about wOBA? Ask it in the comments section!*


## Leave a Reply

122 Comments on "wOBA As a Gateway Statistic"


Why is hit by pitch slightly more valuable than an unintentional walk?

They’re very close, with the difference being that HBP happen more randomly than walks. On occasion, pitchers will pitch around a batter with a base open to get to a weaker hitter (but not an official IBB), so some subset of walks happen when they will have slightly smaller implications for run scoring. HBP doesn’t have that effect. Like Matt says below, you can also use a standard formula that makes them equal. Not a huge difference!

Would this by extension imply that hitters who are “better” at getting hit by pitches – which some certainly are – are actually providing slightly more value than hitters who draw the same number of walks instead? So, in a situation where a pitcher really needs to throw strikes, like with the bases loaded, he is more likely to accidentally hit Shin-Soo Choo than he is to accidentally walk Joey Votto, relative to the respective frequencies of each of those events in isolation, meaning that Choo has a slightly more valuable skill in this instance?

I believe there is some evidence of this, but also that for wOBA it’s mostly about the situation rather than the skill.

It’s certainly worse for the pitcher and better for a hitter (if we don’t think about his bruises… ouch) to reach base via HBP than walk. The HBP, unlike the walk, does not require 4 balls (insert testicle joke here) and therefore requires fewer negative outcomes (1 pitch in the wrong spot rather than 4) on the part of the pitcher to result in the batter taking first base. I think that makes it a slightly better result for the batter than the walk. You can’t take a walk in an 0-2 count, but you can still get hit by a pitch and reach base. That’s technically a better job done by the batter even though they most likely played no part in getting that free base.

so wouldn’t that have the opposite effect and make a walk more valuable? Let’s say a pinch hitter comes in with an 0-2 count. Taking a walk is more impressive than getting HBP. It takes the discipline to not swing at 4 bad pitches vs. the discipline to not swing at one bad pitch which is going to hit you.

Statistically, it takes more consecutive mistakes to take a walk than to take a HBP. Because it requires more consecutive negative events, my theory is the HBP is statistically more valuable.

mistakes on the part of the pitcher, that is

The difficulty of the task has no bearing on how valuable it is.

Something extremely difficult or rare (like a triple) is not more valuable than a home run just because it is more difficult to perform.

The result’s value is what is being measured, not the degree of difficulty.

This is fantastic, Neil, thanks.

I’m also very fond of Tango’s “standard wOBA” formula, which is basically just a simpler version of the above that doesn’t change year-to-year. In other words, wOBA for the lazy:

0.7: UBB+HB

0.9: 1B+ROE

1.25: 2B

1.60: 3B

2.0: HR

Denominator: PA – IBB – SH

Isn’t that nice? For people that don’t feel like looking up the weights every time but want to calculate it themselves, that’s a good place to start, and still has 99% of the practical value of “real” wOBA.
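For anyone who wants to try the "standard wOBA" weights listed in this comment, here is a small Python sketch. The weights and the denominator come straight from the comment; the function name, argument names, and sample stat line are hypothetical.

```python
def standard_woba(pa, ubb, hbp, singles, roe, doubles, triples, hr, ibb=0, sh=0):
    """'Standard' wOBA using the fixed weights quoted in the comment above."""
    numerator = (
        0.7 * (ubb + hbp)         # unintentional walks and hit by pitch
        + 0.9 * (singles + roe)   # singles and reached on error
        + 1.25 * doubles
        + 1.60 * triples
        + 2.0 * hr
    )
    denominator = pa - ibb - sh   # PA minus intentional walks and sacrifice hits
    if denominator <= 0:
        raise ValueError("stat line has no qualifying plate appearances")
    return numerator / denominator


# Hypothetical stat line, made up for this example.
sample = standard_woba(pa=600, ubb=55, hbp=5, singles=100, roe=5,
                       doubles=30, triples=5, hr=20, ibb=5, sh=2)
print(f"{sample:.3f}")
```

Because these weights never change, the result will differ a little from the season-specific figure, which is the trade-off the comment describes.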

Yeah, this works very well if you need it to.

What does uBB stand for?

Just kidding. I am guessing it is just walks.

Unintentional walks

Why are intentional walks not counted towards a player’s production?

That is a good question. It certainly correlates to how much damage a player is perceived to be able to do. Barry Bonds was intentionally walked 100 times because of his offensive skill, so it makes sense to credit him for that.

Because, on average, they are generally issued in situations that provide more desirable base/out states to the defending team. The “run values” that are assigned are calculated based on the increased likeliness of runs scoring from the state before the event and then after the event.

So the run value of a bases-empty single is the difference between how often a team scores from a bases-empty situation and how often they score with a runner on first. Then you just average those odds for every possible combination of outs and runners on base, weight each of those values by how often they occur, and, POOF, you have an answer to how valuable a single is in a context-neutral environment.

I don’t really see how the first part answers the question. An IBB still changes the run value, and certain players are issued them more, so I think there is a case for including them in wOBA.

I read a comment that Tango made once about this subject that for the purposes of wOBA laid out in The Book, it didn’t make sense to include IBB, but on a site like FanGraphs it arguably would.

Because IBB’s come when they are advantageous to the defending team. UBB’s typically come when it’s advantageous to the offensive team. This makes a case for the run value of the IBB to be lower but still included in the calculation, so I’m guessing that the average run value is close enough to scratch that it isn’t bothered with.

But they’re also excluded from the denominator, so it gives the impression that sluggers who get IBB’d a lot are slightly more valuable than they are. It probably doesn’t matter for anyone not named Barry Bonds but excluding them from wOBA always seemed lazy to me.

Anyone who says that an intentional walk comes in situations in which it is advantageous to the defending team has just never had the misfortune of watching Fredi Gonzalez manage on a regular basis.

I’d like to see a modified version of wOBA where the denominator is PA and more possible outcomes are included in the numerator. For instance, maybe hitting into a double play should have a negative multiplier since it’s actually worse than a strike out. IBB should have a positive multiplier but smaller than the one for uBB. Etc.

Dividing by PA really appeals to me since it makes the stat a measure of how much “damage” a player does each time he steps up to the plate.

RE24 handles what you want.

IBB come at an advantageous time precisely because the hitter is so good.

SD walked Barry Bonds with the bases loaded, guaranteeing one run. Yes, if Bonds hit, maybe four scored. But if Eddie Gaedel was up instead of Bonds, they wouldn’t walk him and they wouldn’t score that run. Seems like Bonds should be rewarded there.

Sure, it’s granted when it’s advantageous (on average, in theory) for the defending team to do so, but it’s still beneficial. Maybe its value is something like a third of a run? Less? It’s still something.

I think the main reason it isn’t calculated is because every other part of the calculation is a measure of said player’s actual performance, meaning, the IBB is out of the control of the hitter and therefore he cannot affect the outcome. The goal of wOBA (and many advanced stats, really) is to isolate a player’s contribution on his own merits. The decision of a manager in a particular situation does not equate to a player’s value. This would also become a skewed statistic if one factors in the league a player is in. Without a doubt, more IBB occur in the NL than in the AL because of the pitcher’s spot. No one “feared” Brandon Crawford two years ago, but teams gladly put him on first base and took their chances with Barry Zito’s flailing cluelessness at the dish instead. Bottom line, it doesn’t account for a player’s “real” contribution nor does it reflect a given skill or even luck; it’s more often than not based on situational chance (with Barry Bonds being the lone exception), and we know that simply doesn’t work for accurately evaluating a player (read: RBI).

I was thinking the same thing. While the batter is not actually doing anything in the batters box when an IBB is given, there is always a reason why that batter is receiving an intentional free pass.

An intentional pass is almost always given in order to get to the next batter with the thinking that the next batter has a higher chance of making an out (unless I’m missing a particular situation off the top of my head). It could also be to set up a double play. Either way, the point generally stands.

Past performance dictates the team’s decision to intentionally walk one batter to get to another, unless it’s an NL team looking to get to the pitcher. In that case the past performance of the 8 hitter really doesn’t matter since the pitcher is pretty much guaranteed to be a lesser hitter.

I guess my overall point is that I don’t think it should be completely eliminated from the equation but it should not be given much value either. With most batters the added value of IBB likely wouldn’t make much of a difference in the wOBA in the end.

I just looked it up too and there have been 25 occurrences since 1950 where a player has had more than 30 IBB in a single season. That’s actually a larger number than I expected. Those 25 occurrences come from only 12 different players as well.

Side note: Wow did Bonds have a lot of intentional walks. He pretty much dominates this category. 6 of the top 10 seasons for intentional walks since 1950 are from Bonds with him nearly doubling his (at the time) MLB record of 68 in 2002 with 120 in 2004. There’s truly never a dull moment looking at Bonds’ numbers.

Seconded. I don’t understand why IBBs are, in effect, omitted from this computation entirely, when they do have a real effect on the run-scoring potential and win probability of a game. I don’t doubt that there is a reason for it, I just want to know the rationale.

Ha…I sound like kind of an idiot now – I swear that there weren’t all those comments preceding mine when I started writing this!

For the record though, I would like to hear the author’s take on this.

If I understand correctly, the weights for wOBA are based upon the average of all game situations/ballpark adjustments etc. that occur, and are determined by regression.

Now you would expect that teams wouldn’t issue an intentional walk if they thought it would increase the expected number of runs. So intuitively, you’d expect the weighted-average increase in runs from intentional walks to be zero.

I don’t know what the actual data was, but I’d be willing to bet that any regression coefficient determined for IBBs would probably have a low significance. If there was a statistically significant benefit from intentional walks, perhaps a judgement call was made not to reward batters for opposing managers’ ignorance ;)

Alright, hopefully everyone sees this!

Here’s the deal with IBB.

First, they are worth considerably less than uBB. Something like .15. In addition to that, IBB are determined largely based on the situation, not the batter. Obviously you would walk Miguel Cabrera more often in the same situation than Andrew Romine, but the IBB situation is generated independent of the batter and the batter played no role in the actual event. Yes, the fact that the batter is good led to the walk, but 8th place hitters in the NL also get IBB’d even though they are terrible.

Also, since we use PA as the multiplier when going to batting runs (or wRAA), guys who get IBB’d a lot do get a bit of a bump when it’s time for value metrics.

If you want the technical debate, the comments here are good. http://tangotiger.com/index.php/site/comments/fip-now-on-baseball-reference

Also, there is some debate about the merits of IBB in FIP. We include them, but some people think you shouldn’t. This is a topic for another day, but the important point is that you can create a wOBA with IBB, but they would be worth less by a lot, and may actually mislead you about a hitter’s production.

I think not including IBB is totally justifiable for the reasons you state here. The issue is basically just with Barry Bonds. He pretty much broke this reasoning entirely, considering he would get intentionally walked in situations where for any other player they wouldn’t even consider it. (I think he has been intentionally walked with the bases loaded, and certainly many times with the bases empty.) For him, and similar cases, the run value of that IBB is tied completely to the player’s skill and the run values of those IBBs are somewhat context independent and at times very valuable. So, a typical player may have his offensive skill better represented by excluding IBBs, but an extreme case like Bonds would be completely shortchanged by excluding them.

Good work Neil. My wOBA questions:

1. Is wOBAScale on the guts page this?

“scale it to OBP (so that we could more easily transition to the new number) and adjust the values so that the value of an out is equal to precisely zero.”

2. Are the wOBAs on the player pages calculated using the weights as they appear, or are there more decimal places?

3. Is it appropriate to use the wOBA weights to calculate a break even percentage of taking an extra base? For example, on a single towards the gap or line the value of not attempting to reach second is .888. The value of attempting to reach second is 1.271 x Pr(Safe) + 0 x Pr(Out). This implies that you would be indifferent on trying for second if the success rate is 69.9%.

#3: not exactly. Part of the run value of a double comes from how it is more likely to score runners: e.g. you can’t score a guy from first on a single, but sometimes you can on a double. The decision to take another base doesn’t really affect the other runners though, it really only affects the next base-out state for that runner, e.g. –1 and 1 out vs. -2- and 1 out. I think it would be more accurate to just assume the batter is on first and treat the calculation as if the runner is trying to steal second base.

If I understand your first question, yes. I don’t actually know the answer to #2, but I have never gotten the wrong answer when calculating by hand so if there are more decimals, they aren’t hugely important.

To #3, Bip’s response is pretty good and your idea is right. You could use the wOBA weights like this, but you need to make the adjustments based on the fact that part of the hit is advancing runners and part is advancing the batter. So it’s probably just safer to think of it like a SB.
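The arithmetic behind the 69.9% figure in the question can be sketched as follows. This reproduces only the naive version of the calculation, using the 2013 single and double weights from the article. As the replies point out, those weights also include the value of advancing other runners, so treat the result as a rough upper-level illustration rather than a real break-even rate. The function name is hypothetical.

```python
def naive_break_even(value_stay, value_advance, value_out=0.0):
    """Success rate p at which p*value_advance + (1-p)*value_out equals value_stay."""
    return (value_stay - value_out) / (value_advance - value_out)


# 2013 weights for a single (0.888) and a double (1.271), as quoted in the article.
p = naive_break_even(value_stay=0.888, value_advance=1.271)
print(f"{p:.1%}")
```

A steal-style calculation based on base-out run expectancy, as suggested in the replies, would replace the three inputs with the run expectancy of each resulting state.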

I’ve seen some versions of wOBA that include a positive value for RBOE (reached base on error). The formula you provided doesn’t include an RBOE value, but the “standard” formula that Matt Hunter suggested equates RBOE with a single. Can you explain the difference and explain why, in some situations, reaching base on an error is considered a positive event? I recognize that it’s literally positive (i.e., the hitter is now on base), but I feel like most of our offensive stats — unlike pitching stats such as FIP — account for actual defensive performance. (That is, while FIP strips out defense, most sabermetric offensive stats that I’ve seen don’t reward the hitter for defensive error.) Thanks.

I think it makes sense to include it because some players, probably faster players who hit a lot of ground balls, will likely have a higher “true talent” ROE rate than a slow, fly ball hitter. So it’s not just random variation.

I think bbref had that broken out in their WAR calculation based upon somebody’s study showing that there is such a thing as ROE “talent”.

So there’s some debate about the value of RBOE and if it’s a skill or not. The best answer I can give is that there’s an argument for doing it both ways and it only really matters a lot for a couple of players. The subjectivity of the error distinction is a problem and we’d probably all be better off if there was no such thing, which would solve this problem and others.

How are the weights calculated/derived from year to year?

Also, thanks for the article; it’s incredibly informative!

Using linear weights. We have some information in the Glossary on that if you’re looking for a primer.

Hey Neil great piece. So the problem I have as a stats noob is understanding meaningful ranges for stats like wOBA. Everybody knows a traditional stat like batting average is good around .300, slugging .450+. What’s the line that separates bad, good, and excellent in wOBA? Thanks.

Here are some percentiles in 2014, among qualified hitters:

Max: .448

90th: .384

75th: .355

Median: .329

25th: .307

10th: .289

Min: .245

These will generally tend to be lower for non-qualified players, of course, but Tom Tango’s adjustment that Neil mentioned in the article sets the average wOBA at right around .330. If you’re a starter, I’d say .345-.350 is pretty good, .370 is very good, and .395-.400 makes you a top-10 hitter.

Isn’t wOBA scaled the same as on base percentage? If you have a sense of what a good OBP is, then apply that to wOBA. I think if you just look at all hitting events, league-wide average wOBA is like .315 or something.

To get a sense of how good a wOBA is at a given time, you can look at a player’s wRC+. For that stat, a player has a 100 if his wOBA is about what you would expect an average hitter to have in that ballpark that year. How much wRC+ is greater or smaller than 100 is how much better or worse the hitter is compared to an average hitter. >125 is very good, >150 is excellent, and >175 is MVP-caliber numbers.

Even more specifically, 125 is 25% better than league average, 150 is 50% better, 197 is Babe Ruth… Etc.

Great question! We have a table and graph to answer this very question in the actual glossary entry about wOBA. Newly updated, in fact! http://www.fangraphs.com/library/offense/woba/

I am certainly a wOBA believer. Isn’t the problem with explaining the statistic to the non-sabermetrically inclined, however, that linear weights are not intuitive? I think your explanation to just say “weight” makes a lot of sense, but the response I have gotten from people is “well, where does that number come from?” And even if you explain it, some continue to object because (1) it’s complicated to produce and (2) it varies based on data. I know the standard weights are listed above, but I think the biggest objection that many have is not liking the idea that the relative weight *could* change. I suspect that most non-sabermetrically inclined fans believe there is a true, universal weight (that should also happen to take the value of a whole number or a common fraction). It seems to me that getting over that hurdle is really the big one with wOBA and lots of other stats (like WAR).

I think I would leave out the part about the weights changing unless someone asked. I think for me, the thing I would anticipate being difficult is the idea that wOBA is context-neutral, especially if a person is fond of citing RBI and BA with RISP. It might be hard to explain to a person like that why it makes sense to count a two-out bases empty single the same as a bases-loaded single. Perhaps in response I would point out that home runs and batting average are also context-neutral.

Some people are definitely beyond hope. I’ve seen people object to WAR on the basis alone that it is a complicated formula. “I can see a RBI, but I can’t see a WAR”. If someone is completely statistically illiterate and not that interested in learning about what statistics actually are, then there’s nothing anyone can do.

I’ve had some success recently promoting RE24, which is not context neutral, it’s just (run expectancy after PA)-(run expectancy after PA), and the 24 base/out situations make easy intuitive sense.

*(after)-(before)

I understand your point, which is partly why Matt posted the standard wOBA. I start with the idea of weights and then if they protest, talk about the run environment. The numbers change because the level of offense is always slightly changing. If you go through history, the ranges are actually pretty small, too.

I have to add to what Jeff said… Every time I have brought up WAR (and wOBA) to a friend, the objections revolve strongly around these “made-up” numbers. This has been the challenge area and is potentially a very real problem for the adoption of sabermetrics going forward.

I can show a die-hard baseball fan WAR, explain it is based off UZR, wSB, UBR, and RAA which is based off wOBA, which is calculated by…. By the time we’re here only the true enthusiast is listening.

Yet even once I have gotten to this point with some friends, the linear weights, which could certainly appear very made up (and in a sense, they are), throw them off completely. Because they are complicated to produce and we are already this deep in explanation, the question marks remain, and people are still skeptical.

When you run into someone like this, my advice is to ask them why all hits should be valued equally and why walks shouldn’t count, if they like batting average. If they can’t give you an answer, their only choices are OPS or wOBA and wOBA is just better.

I often find asking questions of the skeptic can be helpful rather than trying to start by convincing them.

That’ll never work.

Not sure if you’ll see this, but using extreme run environments could help explain the changing weights.

If the team OBP is .750 (or anything really high), extra base hits aren’t as valuable compared to a walk or a single, because most players will score anyways.

Is there any stat that incorporates stolen bases into wOBA? I think something that credits a stolen base and punishes for a CS could really help for total offensive production and will show ability to create runs better.

We used to do this, but decided wOBA is better as a hitting only stat. We have baserunning stats that do exactly this; wSB, UBR and BsR (which is the first two combined).

I think it makes more sense to talk about the two skills separately and then add them together into stats like OFF or WAR. Ask if you have specific questions about any of these!

Alright, I am buying what you are selling. What is the OFF stat?

OFF is Batting Runs + Baserunning Runs. So what you have is essentially, wOBA converted into runs above average based on the number of PA, adjusted for park factors and then added to baserunning value.

Hey Neil, I actually have a question about the wSB calculations. I am very confident in calculating wOBA, I even calculate it for the German Baseball League. However, I keep SB and CS in my calculations because somehow I am not able to reconstruct the wSB values.

You have the run values and the woba values, which are just the run values divided by the woba scale. Even with all the information available here at FG it is impossible for me to recalculate the exact wSB values.

Could you please walk me through the wSB calculations of let us say Mike Trout?

very, very much appreciated and I am looking forward to further installments.

No problem. Let’s use Mike Trout 2013.

Trout stole 33 and was caught 7 times. runSB was .2 and runCS was -.382 per the Guts! page.

wSB = SB*runSB + CS*runCS – lgwSB*(1B+BB+HBP-IBB)

The only number you don’t have there is lgwSB, which is equal to (SB*runSB + CS*runCS)/(1B+BB+HBP-IBB) for the whole league.

So first you find lgwSB based on the entire league’s 2013 numbers, which equals (2693*.2 + 1007*-.382)/(43596) = .0035

Then you take Trout’s numbers and plug them into the first formula.

wSB = (33*.2)+(7*-.382) – .0035*(224) and you get 3.142, which we round to 3.1.
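The walkthrough above translates into a short Python sketch. All inputs are the figures quoted in this comment (league totals, run values, and Trout's 2013 line); the function and variable names are my own. Note that the league rate needs the numerator grouped before dividing by opportunities.

```python
def league_wsb_rate(lg_sb, lg_cs, lg_opportunities, run_sb, run_cs):
    """League stolen-base runs per opportunity (1B + BB + HBP - IBB)."""
    return (lg_sb * run_sb + lg_cs * run_cs) / lg_opportunities


def wsb(sb, cs, opportunities, run_sb, run_cs, lg_rate):
    """Stolen-base runs above what an average runner adds in the same opportunities."""
    return sb * run_sb + cs * run_cs - lg_rate * opportunities


RUN_SB, RUN_CS = 0.2, -0.382  # 2013 run values as quoted in the comment

lg_rate = league_wsb_rate(2693, 1007, 43596, RUN_SB, RUN_CS)
trout = wsb(sb=33, cs=7, opportunities=224,
            run_sb=RUN_SB, run_cs=RUN_CS, lg_rate=lg_rate)
print(round(lg_rate, 4), round(trout, 1))
```

Carrying the unrounded league rate through the calculation shifts the third decimal slightly relative to the hand calculation, but the result still rounds to the same 3.1.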

Does OFF have a positional adjustment?

No, the positional adjustment is added in separately.

Is there a way to calculate runs added from wRC+? It just makes sense that park and league adjusted data would be more accurate when talking in terms of runs and/or wins.

All of the run values we use in WAR include park factors. So wOBA gets you to wRAA (batting runs above average, essentially), but neither have park factors. The Batting Runs you’ll find under the value tab are basically wRAA with a park adjustment.

I wouldn’t call it an issue, but one aspect I wish wOBA could include is the productive out.

Ex: A batter steps to the plate in the 4th inning with a runner on 3rd and 1 out.

Obviously a HR > 3B > 2B > 1B. But then here comes the tricky part. What is the next best desirable outcome?

If the batter in the on deck circle is Giancarlo Stanton then a walk is certainly more beneficial than a sacrifice. But what if Ed Lucas is on deck with the pitcher to follow? Is a walk still better than the productive out?

And even if we were to accept that a walk is always preferable to a productive out, what about productive outs vs. non-productive outs?

Ex: Runner on 2nd with nobody out.

Obviously a HR > 3B > 2B > 1B > HBP > BB. But isn’t advancing the runner a more beneficial result than an out with no advancement? Why should those two outcomes be treated the same, when one outcome is clearly more desirable?

So wOBA is context-neutral. We don’t have a context-dependent rate stat, but RE24 is a stat that measures your effect on run expectancy in terms of runs, so it does give you more credit for advancing a runner with an out than if you don’t.

I think your point is a valid one, but if we take it to the extreme, you should get more credit for a home run with three men on than with no one on, except that you didn’t do anything to put those runners on base. Presumably you did something to advance the runner, but maybe not. wOBA doesn’t measure context, it’s just not what it’s designed to do. There are other stats that do.

wRC+ is my favourite stat when looking at batting performance and I would really like to know the calculation to convert from wOBA/wRAA. Basically, how are the figures adjusted for league and park? Thanks

This is coming in a future post. Very easy to understand, but takes a bit of work.

I thought of (basically) wOBA on my own as a young teenager years ago…not to credit myself, but to point to how intuitive it is, even for a kid to understand. It’s not like anyone’s asking you to ignore BIP, clutchness, or some other earth shattering saber concept.

First, you get used to wOBA. Once you’ve acquired a taste for it, you might find yourself willing to try some xFIP, or maybe a sample of UZRs. Before you know it, you’re on Fangraphs partaking of their smorgasbord of pitch win values, RE24s, LIs and BABIPs and SIERAs (oh my!), getting caught up in splits by leverage and heat maps and grabbing this all so fast that pretty soon you’re cranking out speed scores on your own and you can’t sleep because you are scouring the web for BatFX and why oh why can’t I find BatFX and WHY THE W#%$@!! DON’T THEY MAKE FREAKING BatFX AVAILABLE!!!………

It all started with a little wOBA.

This is a great response.

Why doesn’t the numerator in the wOBA formula represent runs created? wOBA is supposed to be analogous to BA, which is just H/AB, or OBP = times on base/PA, etc., so one would think wOBA would be total runs created/PA. But it isn’t. Neglecting the scaling factor, which doesn’t change the basic form of the relationship, wOBA = (wRC/PA) + k, where k = lgwOBA – (lgR/PA). Both of those terms are fixed at any one time, though they do vary slightly over the season and somewhat more from year to year, and lgwOBA is always considerably higher than lgR/PA.

IOW, why doesn’t lgwOBA = lgR/PA? Wouldn’t it be useful to have a formula that represented runs created per PA, just as BA represents hits per AB, and so on? I understand that wRC+ is the best final offensive (hitting) stat, but it doesn’t actually tell you how many runs anyone actually created per PA. Total runs created is like a better RBI or runs scored stat, and making it an average stat would correct for differences in games played or PA.

I think the problem here, or part of it, is that the wOBA formula, as you note, only takes into account positive events. While the positive value of not making an out is included in the coefficients, the negative value of making an out is not included. Every time a batter makes an out, he decreases the probability of his team scoring a run, but this is not included in the wOBA formula. Making an out doesn’t change his wOBA, except for the slight increase in PA. Thus you get the strange situation where a batter can produce a net zero offensive runs, even negative offensive runs, yet still have a positive wOBA. Obviously the runs he creates according to the numerator of the wOBA formula are not the same as the actual runs as a result of all his PA.

The wOBA equation is correct. Just because you have a wOBA of .100 doesn’t mean you did anything good, any more than having a .100 OBP does something good for you.

“Making an out doesn’t change his wOBA, except for the slight increase in PA.”

EXCEPT? That’s a huge exception! Take for example a hitter with a .320 wOBA on 600 PA. That would mean he has “192” in the numerator and 600 in the denominator.

Now, give this player 20 walks. That adds 14 to the numerator (now at 206) and 20 to the denominator (now at 620). His wOBA is now .332.

Now, give him 24 outs. Numerator stays at 206 and denominator at 644. wOBA is .320.

So, adding 20 walks has been cancelled out by adding 24 outs.

And that’s perfectly fine.
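The walks-and-outs arithmetic above can be checked with a few lines of Python. This is only a sketch: the function and variable names are made up for illustration, and the .70 walk weight is the rounded value used in the example.

```python
def woba_from_totals(numerator, pa):
    """wOBA as weighted times on base divided by plate appearances."""
    return numerator / pa


WALK_WEIGHT = 0.70  # rounded walk weight from the example above

numerator, pa = 192.0, 600  # a .320 wOBA over 600 PA
start = woba_from_totals(numerator, pa)

numerator += 20 * WALK_WEIGHT  # 20 walks add 14 to the numerator
pa += 20                       # and 20 to the denominator
after_walks = woba_from_totals(numerator, pa)

pa += 24  # 24 outs add nothing to the numerator, only to the denominator
after_outs = woba_from_totals(numerator, pa)

print(f"{start:.3f} -> {after_walks:.3f} -> {after_outs:.3f}")
```

The outs never touch the numerator, yet they pull the average back down to where it started.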

I didn’t say it wasn’t correct, I said it isn’t analogous to BA, OBP, etc. There is a disconnect between lgR and lgwOBA. lgBA = lgH/AB, but lgwOBA does not = lgR/PA. Why not? How can a league average hitter not create a league average number of runs per PA? A league average BA hitter creates a league average number of hits per AB.

Wrt your example of a batter’s wOBA being reduced by making outs, yes, but only through an increase in the denominator, PA. Why doesn’t an out reduce the numerator? If there is a positive value of not making an out, how can there not be a negative value of making an out? Doesn’t that positive value derive from not being a negative? When a hitter makes an out, he reduces his team’s chances of scoring.

Also, the formula for determining wRC from wOBA implies the existence of negative runs. As I noted earlier, it can be re-written as wOBA = wRC/PA + k. Using the current values of the relevant parameters, k = 0.133. Suppose wOBA = 0, as would be the case if a batter made an out every time he hit. Then wRC/PA is negative, – 0.133. So e.g. if the batter makes an out in every one of 100 PA, he creates 13.3 negative runs.

What I don’t understand, and this is where I hoped you could help me, is that the negative value of an out seems to change. In the extreme example of a zero wOBA, each out has a value of -.133 runs. But if there is a positive wOBA, the effective value of an out is less than this (in a negative sense, more than this in an absolute sense).

“Why doesn’t an out reduce the numerator?”

That’s the beauty of the wOBA. I get around it by the way I constructed the equation.

And wOBA is NOT runs. Its equivalent is times on base. It’s weighted times on base. Hence, the reason for the name: Weighted On Base Average.

What I would suggest to you is to accept that wOBA is properly constructed. Then, ask questions. Don’t ask questions as if you have an opinion that’s been formed and I have to refute it.

wOBA is correct. Ask questions on that basis.

I’m not sure how many times I have to say that I’m not disputing the correctness of wOBA. You say it’s not runs, but is not weighted times on base equivalent to runs? Isn’t the logic for weighing different offensive events that they differ in the probability that they will result in runs scored? wRAA = (wOBA – lgwOBA) x PA (neglecting the sf), so there is a clear correlation between changes in wOBA and changes in runs. I’m just asking why this correlation is not extended from lgwOBA down to lgR.

Neil, I understand your point about the negative value of a low but positive BA. But in this case the correlation is extended. As I noted before, lgBA = lgH/AB. And .000 wOBA would not be league average, any more than it would be league average BA. There is a league average wOBA, I believe it’s .312 currently, and any difference between that and an individual hitter’s wOBA reflects his run creation average, i.e., wRC/PA. But unlike BA, which goes to zero when H goes to zero, wOBA does not go to zero when RC goes to zero.

I freely admit to being confused here, I’m not trying to upstage or correct anyone, I’m just trying to understand why this correlation doesn’t hold.

Andy, I apologize if I’m not communicating this very well. Let me try again.

Your concern is that when wOBA = .000 that wRC =/= 0? Is that right?

Let’s do the math.

(((.000-.312)/1.291)+.109)*PA, which is -0.13 R/PA.

So the issue is that not every out is worth -0.13 runs, say if a hitter has a .290 wOBA? Is this accurate?

I don’t want to keep talking past each other, so let me know if I have your question right and then I (and maybe Tango) will respond.
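For anyone following along, the calculation above can be sketched in Python. The parameter names (`lg_woba`, `woba_scale`, `lg_r_per_pa`) are labels I am assuming for the .312, 1.291 and .109 in the expression; they are not official names.

```python
def wrc_per_pa(woba, lg_woba=0.312, woba_scale=1.291, lg_r_per_pa=0.109):
    """Weighted runs created per PA, using the figures quoted in this thread."""
    return (woba - lg_woba) / woba_scale + lg_r_per_pa


def zero_run_woba(lg_woba=0.312, woba_scale=1.291, lg_r_per_pa=0.109):
    """The wOBA at which wRC per PA is exactly zero (solve the line for 0)."""
    return lg_woba - woba_scale * lg_r_per_pa


for woba in (0.000, 0.290, 0.312):
    print(f"wOBA {woba:.3f}: {wrc_per_pa(woba):+.3f} wRC per PA")

print(f"wRC/PA reaches zero at a wOBA of about {zero_run_woba():.3f}")
```

Under these numbers the relationship is a straight line that crosses zero runs at a positive wOBA, which is the gap being asked about.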

Neil, thanks. Yes, what you have written is a sticking point for me, though I would describe it more as a “symptom” than the “cause”. The cause, for want of a better term, is that lgwOBA does not = lgR/PA. This inequality, which doesn’t make sense to me, is why when wOBA = 0 there are negative runs.

Again, if wRAA = (wOBA – lgwOBA) x PA (neglecting sf), this indicates to me that “wOBA above average” x PA = wRuns above average, and this is just what I would expect. But if it does, why doesn’t lgR = lgwOBA x PA? In one range, changes in wOBA are directly proportional to changes in weighted runs, in another range, they aren’t.

We can make an analogy with BA. We could determine hits by the formula [(BA – lgBA) + (lgH/AB)] x AB, which is exactly analogous to the equation used to determine wRC from wOBA. But it’s needlessly cumbersome to do it this way, because lgBA = lgH/AB. The two terms cancel out, and we can just determine hits by the formula BA x AB. You can’t get an individual player’s weighted runs in this manner from the wOBA formula.

Let me emphasize for Tom’s sake that I do understand the wOBA formula works. I’m not saying it doesn’t. I tried determining weighted runs for a player using his baseruns formula, and it came out close to the result from the wOBA formula. Not quite the same, but I wouldn’t expect it to be exactly the same.

Again, I appreciate your efforts to help me here.

Could the discrepancy have something to do with synergistic effects, addressed by the baseruns approach? My understanding of this approach is that it’s the method to use when discussing teams, because a good hitter surrounded by other good hitters will create more runs than an equally good hitter in a weaker lineup. In effect, the weighted coefficients in the wOBA formula increase. This approach is not used for individual hitters, because it would penalize those on weaker teams, but is necessary when determining the relationship between team wOBA and runs scored.

I remember a few weeks ago there was an article here that pointed out the relationship was not exactly linear. Expected runs were a little higher at higher values of team wOBA than would be predicted from runs at lower values, because of this effect. Going in the other direction, expected runs would be lower. If you have a team comprised of hitters with very low wOBA, they will produce fewer runs than would be predicted based on the wOBA/run relationship at higher values of wOBA, because each hitter is operating in a very poor run environment—few runners on base when he bats, few batters advancing him when he’s on base.

Could this be why lgwOBA does not = lgR/PA? If you had a league where everyone had very low levels of wOBA, well below the current average, there would be fewer runs than expected based on teams with higher wOBAs. So in effect the zero run level is reached before wOBA is zero. But as you raise the lgwOBA above zero, the increase in runs is more than proportional to the increase in wOBA, because you are getting the synergistic effect—every batter is hitting in a more favorable run environment. Eventually, when you get to the current lgwOBA level, you are in a range where the relationship is roughly linear, though as noted in that article, not exactly.

I hope Tango can weigh in on this, because he’s going to have the better mathematical answer. I think you’re onto the answer, but I’m not positive.

For example, it’s technically possible for a pitcher to have a negative FIP, even though it’s not possible for them to allow a negative run. Our models don’t always work perfectly in the extremes because no model is going to fit perfectly.

But again, I’m a little outside the range at which I’m certain of the proper answer, so hopefully Tom can chime in.

wOBA is scaled to OBP, and it is not scaled to R/PA.

This thread is not really useful to get into the technicals that Andy is looking for.

I would prefer that people register on my site here:

http://tangotiger.com/index.php/boards/viewforum/2/

Create a thread, and I’ll respond in detail there. To hurry the approval process, email me at tom~tangotiger~net

Fair enough, thanks.

Tango covers this, but let me take a swing as well. Batting average tells you hits per at bat, but a .100 batting average offers negative value compared to replacement level and league average. If you have a .200 wOBA, you are costing your team runs overall compared to your potential replacements, but it’s not like you are literally not doing anything. You could put wOBA on a different scale, I suppose, if you wanted to so that .000 was league average, but I’m not sure what good that would do.

This is already done with Linear Weights, which Pete Palmer introduced in the 1980s. But people HATE seeing negative numbers because they think it means they have negative value.

They simply cannot align “0” to average. To most people “0” means “no value”.

This is why wRC+ exists, where average is 100. It keeps everyone happy.

IBB issue:

Suppose you calculate wOBA as has been described, and your player ends up with a wOBA of .320.

Now, suppose that you insist on including IBB. We agree it shouldn’t have the full weight of a walk (.70), but we also agree it’s better than an out (.00). Assume that the weight for the IBB is .32.

What happens to this player’s new wOBA if we include IBB in both the numerator (with the .32 weight) and the denominator (as a full PA)? Nothing! His new wOBA stays exactly at .320.

And if you think about it, what wOBA implicitly does with the IBB is to give it a DYNAMIC weight that is exactly equal to the hitter’s wOBA. So, Bonds’ IBB is worth .50 while the #8 hitter’s IBB is worth .25.

And it makes perfect sense if the IBB is win-neutral with respect to the hitter at the plate.
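A quick way to convince yourself of the IBB point is to run the numbers. This is a minimal sketch with a hypothetical hitter (192 weighted times on base in 600 PA, and 10 IBB); the function name is made up for illustration.

```python
def woba_with_ibb(numerator, pa, ibb, ibb_weight):
    """Recompute wOBA after counting IBB in both numerator and denominator."""
    return (numerator + ibb_weight * ibb) / (pa + ibb)


numerator, pa, ibb = 192.0, 600, 10  # hypothetical .320 hitter with 10 IBB
base = numerator / pa

# Weighting the IBB at the hitter's own wOBA leaves the result unchanged.
same = woba_with_ibb(numerator, pa, ibb, ibb_weight=base)

# Weighting it like a full walk (.70) would push the number up instead.
higher = woba_with_ibb(numerator, pa, ibb, ibb_weight=0.70)

print(f"{base:.3f} {same:.3f} {higher:.3f}")
```

Leaving IBB out of the formula entirely is therefore the same as crediting each one at the hitter's own wOBA.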

HBP v UBB:

As has been noted, HBP occurs more often when it can do more damage than the UBB, namely with a runner on 1B. UBB are issued disproportionately with 1B open.

ROE issue:

When the defense gets an error, this is a BAD thing for the defense. You can argue it’s far more bad for the fielder than the pitcher, but nonetheless, taking the defense together as a whole, it’s bad for the defense to allow a runner to reach base on error.

And for the defense, whether they allow a runner to reach base on error or by a single, it’s a similar kind of damage.

Defense is pitching+fielding.

Since offense is the exact mirror of defense, whatever is bad for the defense is good for the offense. Regardless of the amount of talent it takes for the batter to reach base via error or via hit batter, the fact is the offense BENEFITS.

Great article. I don’t understand why wOBA isn’t used as much for pitchers as hitters? I can find it in the splits (thanks to Dave Cameron) but I don’t understand why it isn’t used on the pitching leaderboard page.

It’s part of the plan to get it onto the leaderboards at some point. I think it’s just a matter of language for most people. We talk about pitchers in terms of runs allowed rather than slash lines more than for hitters. I often like to look at a pitcher’s wOBA allowed, but also with the understanding that the difference between a 2B and a 3B allowed is almost never within the pitcher’s control.

Good article, but I still won’t bring up wOBA to fans who are afraid of stats beyond batting average, home runs and RBIs. But if they really do want to learn more, then I agree it’s an excellent gateway stat.

For those who are reflexively against advanced stats, I just tell them that every stathead appreciates a player who gets on base a lot, belts some home runs, runs well and plays good defense. If you think those are good things, then you think like a stathead. Some of the formulas may look like rocket science, but it’s really pretty simple.

Maybe try this:

“- Moneyball was all about on-base percentage, which gives the hitter 0 points for each out they make and 1 point for each time on-base. Your points as a percentage of your trips to the plate is the percentage.

– But we all know that all these non-outs aren’t the same. Like home runs – if you’re getting on base AND driving yourself in, that should be worth 2 points instead of just 1.

– And singles should be worth a bit less than average, say .9. But, as much as those statheads love their walks, we know they’re worth less than singles because any runners on base don’t move unless they’re forced, and just one base at most anyways. So give walks and hit by pitches just .7 points.

– Since a double is a third of the way between a single and a HR, give doubles 1.25 points, and give each triple 1.6 points.

– And you know what, why don’t we just stop there. Because now that we’ve made our rating system fairer than on-base percentage, there’s no need for any fancy algebra or other stuff to cause headaches. Just write these numbers on your scorecard for the Fair Rating System:

0 for an out

.7 for a walk or HBP

.9 for a single

1.25 for a double

1.6 for a triple

2 for a homer.

And that’s all. You can add up the Fair Rating System points for a game, or you can figure it out from the basic stats on the back of a baseball card. And if you want to compare percentages, just divide the Fair Rating System points by the number of trips to the plate, and that’s your Fair Rating percentage.

Take THAT Billy Beane!”

(Don’t bother telling them that Fair Rating Percentage is really wOBA, they’ll never know the difference!)
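If you would rather let a computer keep the scorecard, the Fair Rating System above might look something like this in Python. The event names and the sample game line are hypothetical; only the point values come from the list above.

```python
# Point values from the Fair Rating System list above.
FAIR_RATING_POINTS = {
    "out": 0.0,
    "walk_or_hbp": 0.7,
    "single": 0.9,
    "double": 1.25,
    "triple": 1.6,
    "homer": 2.0,
}


def fair_rating_percentage(events):
    """Fair Rating points divided by trips to the plate.

    `events` maps an event name to how many times it happened.
    """
    trips = sum(events.values())
    if trips == 0:
        raise ValueError("no trips to the plate")
    points = sum(FAIR_RATING_POINTS[name] * count for name, count in events.items())
    return points / trips


# Hypothetical game: a single, a walk and two outs in four trips.
game = {"single": 1, "walk_or_hbp": 1, "out": 2}
print(f"{fair_rating_percentage(game):.3f}")
```

Dividing by trips to the plate is a simplification that matches the description above; the full wOBA formula uses a slightly different denominator.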

Thanks to Matt Hunter for posting Tango’s “standard wOBA” formula earlier in the comment section:

http://www.fangraphs.com/library/woba-as-a-gateway-statistic/#comment-336595

And special thanks to Tango, who spelled out everything I put up above in one of his blogs. My only partial contribution was my stab at how to explain wOBA to someone who would never dare to read Tango’s stuff, like the type of fan jardinero describes in his post.

I never bring up wOBA to anyone other than on blogs.

Better to let people stumble their way onto it, than for me to push it on them.

Question on how to project wOBA…

wOBA = (0.690×uBB + 0.722×HBP + 0.888×1B + 1.271×2B + 1.616×3B + 2.101×HR) / (AB + BB – IBB + SF + HBP)

How do ZiPS and Steamer ROS project wOBA? I cannot find the following projections:

SF

IBB

uBB
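For reference, here is how the formula quoted above could be computed once you have the component stats or projections. This is a sketch: the weights are the ones in the formula, uBB is taken as BB minus IBB, and the stat line is invented purely to show the call.

```python
# Weights copied from the formula quoted above.
WEIGHTS = {"uBB": 0.690, "HBP": 0.722, "1B": 0.888, "2B": 1.271, "3B": 1.616, "HR": 2.101}


def woba(ab, bb, ibb, hbp, sf, singles, doubles, triples, hr):
    """wOBA from counting stats, with uBB derived as BB - IBB."""
    ubb = bb - ibb
    numerator = (
        WEIGHTS["uBB"] * ubb
        + WEIGHTS["HBP"] * hbp
        + WEIGHTS["1B"] * singles
        + WEIGHTS["2B"] * doubles
        + WEIGHTS["3B"] * triples
        + WEIGHTS["HR"] * hr
    )
    denominator = ab + bb - ibb + sf + hbp
    return numerator / denominator


# Invented stat line, for illustration only.
example = woba(ab=500, bb=60, ibb=5, hbp=5, sf=5,
               singles=90, doubles=30, triples=3, hr=25)
print(f"{example:.3f}")
```

The same function works on projected totals as long as the projection supplies every input, which is why the missing SF, IBB and uBB columns matter.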

Think I answered this on Twitter by telling you to ask Dan, but I actually have your answer. We do have those projections. Go to any player’s page and scroll down to the standard section.

Is it typical to use wOBA (or any other comprehensive stat for that matter) as an all-inclusive “hey look at this number, that’s why player x does this” stat? I’ve only recently dove into the deeper end of baseball stats and I’m trying to get a grasp of what the standard is.

Also, on a slightly unrelated note, what kind of information is there on regression analysis? I’ve been reading up a bit on BABIP (specifically as it relates to pitchers) and I’m curious as to its effectiveness as it relates to predictive “power”.

Great post

To answer your first question, typically we would say that these comprehensive stats tell you a great deal about “what a player did.” So Trout had a .423 wOBA last year, which is a really good representation of his (non-park adjusted) performance. We wouldn’t necessarily say that .423 is his true talent, because we would want to incorporate more data into that projection. wOBA also doesn’t automatically tell us what kind of player Trout is. A really high OBP guy with less power and a really high power guy with less OBP could have the same wOBA, but are getting there in different ways.

To your second question, this link to our section on regression to the mean might be helpful. http://www.fangraphs.com/library/principles/regression/

If you have more specific questions let me know here or on Twitter, but it takes a really long time for pitcher BABIP to “stabilize,” so it’s unlikely that any one season of BABIP is going to be very predictive, relative to other statistics. It’s more involved than that, but that’s the comment length answer!

While I agree that wOBA is a great statistical tool for measuring overall offensive production, I do have some issues with sabermetric stats.

First of all, FIP and BABIP treat balls in play as random chance when that’s not true at all. Generating ground balls and limiting line drives are both pitching skills that definitely help prevent runs (although ground balls do also correspond to a rise in opponent’s batting average in addition to greatly reducing their slugging percentage). If you want a luck-neutral ERA stat, then it should include not just K%, BB%, and HR% but also GB% and LD%.

Second, sabermetrics treats clutch hitting as simply random chance, which would be true if hitters were machines, but they’re human and as such can be psychologically affected by the game situation (as can pitchers). I’ve seen too many heavy outliers over the years to doubt that there is at least some skill in clutch hitting.

Third, defensive and baserunning metrics are very unreliable at the present time and should always be taken with a grain of salt. The same is also true of WAR, which not only heavily incorporates these unreliable metrics but also likely overrates them in comparison to Offensive WAR. I just can’t take it seriously whenever I see a speedy defensive whiz with an unremarkable bat among the league leaders in WAR.

Also, I’m not sure if they do this or not, but there should be no positional adjustments in Defensive WAR. Yes, it should be included in any base defensive metrics, but any replacement player is going to be playing the same position and therefore earn the same adjustment in his defensive metrics. It’s possible that the replacement could be someone moving over from a different position, thereby altering the Defensive WAR at both positions, but I don’t know how you’d measure it outside of a case-by-case basis.

FIP does what it does exactly the same way OBP does what it does. No one is suggesting that a walk = HR, but for the purposes of OBP, it does. No one has a problem with that, because OBP is a SUBSET of results.

Same thing with FIP. It’s a subset of results. It doesn’t make ANY CLAIM with regard to batted balls.

***

Clutch is incorporated in RE24 and WPA. Use those if you want. We’ve got a stat for everything.

***

For unreliability: that’s why you regress.

***

You are wrong about the positional adjustment in defense. The positional adjustment is the only way to put on the same scale the below-average SS and the above-average fielding LF and not get a ridiculous result.

1) FIP doesn’t assume randomness, FIP provides you with an estimate of a pitcher’s run prevention assuming average BABIP (defense, luck) on the number of balls that were put in play. So if you allow more balls in play, your FIP is higher. If you allow more fly balls (the opposite of GB), you’ll allow more HR too. I don’t think anyone ever argues the pitcher has NO control over batted balls, it’s that they have so little control of what happens to them that FIP is a really good place to start when evaluating a pitcher. This is especially true in smaller samples. Over three or five years, your batted ball luck will even out and it will be more representative of your true talent, but you can’t grab 90 innings and think their BABIP is mostly about them. Just look at Strasburg this year.

But hey, if you want a stat like that, SIERA might be for you.

2) There are “clutch” stats if you want them, like Tango says, but here’s my question for you on clutch. If David Ortiz (or whoever) is “clutch,” and can rise to the occasion and deliver in the biggest moments, why are they not able to leverage that special ability in the 3rd inning? Are they not trying? I don’t doubt that some people will wilt in high pressure chances, but people who wilt under (this kind of) pressure probably weren’t able to make it to the majors. I’m open to the idea, but show me your evidence.

Additionally, the idea of “clutch” confuses me, because it seems to suggest that scoring your runs in a tie game in the 8th inning is better than scoring them in the 3rd inning. This doesn’t have anything to do with its existence, I just find it weird that we venerate people who perform with the game on the line more so than the people who made sure the game was already over.

3) Tell me why you say that. Defensive/BR stats are by no means perfect, but they are our best current estimate. Would your solution be to not measure these things simply because we can’t do it perfectly? And when you say WAR overrates them, what makes you say that? Why don’t you think Billy Hamilton can be as valuable as a slugger?

4) the positional adjustment is meant to equate defense at different positions, like Tango says. An average defender at SS is better than an average defender at LF, so we adjust. I think your concern here is simply misinformation. If Miguel Cabrera got injured and was replaced by a true replacement level player, that replacement player would get the EXACT same positional adjustment for every game or inning he played too. Everything else about them would be different, but that would be the same.

I presume this has been answered elsewhere, so please be gentle. Reading this site has allowed me to have justification for my anger at St Louis Cardinals colour guy Al Hrabosky’s meaningless blather every game, as he claims things to be true that stats show are not. So I’m definitely on board.

Anyway, onto the question, which involves sac flies. As has been stated, wOBA is context-independent (hitting a homer with the bases loaded is no better for your wOBA than hitting a solo HR). So far, so good. But deducting sac flies from your AB total seems to me, to be a context-dependent action. All a sac fly is, is getting out while someone happens to be on base. I don’t know of many / any batters who step up saying “not going to hit the ball over the fence today, I’ll just plonk it in centre field so my boy on second can advance”.

Now, I’m sure other people have discussed this, or I’m missing or have misread part of the article. Or I might even be wrong about what SF means. Anyway, it’s an excellent article.

RE: Famous Mortimer.

You make a good point on SF but my initial reaction was this. When Ben Revere comes to the plate to attempt a SF, he clearly isn’t going to be capable of hitting it out of the park. In this case he is clearly losing an AB while the SF attempt does not impinge on Jose Abreu’s BA.

I promise I’m saying this in a gentle tone, but you don’t subtract sac flies, you ADD them for exactly the reasons you’re saying.

Will there be an article coming which goes into the league adjustments FG makes in their “Value” section?

Also, are the run values for the positional adjustment dynamic y-t-y based on the run environment? For example, in a run environment of 10 R/W a SS gets a pos. adjustment of +7.5 per defensive season. Does he get credited with more runs in an environment of 13 R/W, for example?

In extreme scenarios that could make a big difference. During a deadball-era (say 7 R/W) the defensive value gap between a SS and a 1B would be huge with 20 runs or almost 3 wins.

Whereas in a league or era where offense thrives (say 13 R/W) the gap in defensive value would still be 20 runs but in this case just about 1.5 wins.

So does FG take the run environment into account when dealing with the positional adjustments?
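The runs-to-wins conversion behind this question is a single division. A small sketch, where the function name is made up and the 7, 10 and 13 R/W environments are the hypothetical ones from the question:

```python
def runs_to_wins(runs, runs_per_win):
    """Convert a run value to wins for a given runs-per-win environment."""
    return runs / runs_per_win


GAP_RUNS = 20  # the SS-to-1B gap in defensive value used in the question

for rpw in (7, 10, 13):
    print(f"{rpw} R/W: {runs_to_wins(GAP_RUNS, rpw):.2f} wins")
```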

We’ll cover this when we update the WAR section, but it’s basically just about the difference in run scoring between the leagues.

As for positional adjustments, we currently treat them as constant y-t-y. There’s probably an argument in favor of examining how they might change over time for a variety of reasons, but I’m not aware of anything ongoing about this.

You are exaggerating the differences. There are no 7 RPW or 13 RPW eras.

Fair enough – I must have misread the equation a bit then. It did seem like something someone else would have noticed before me.

Why is there no wOBA for PITCHERS ???

Asked and answered. Read through everything.

If you mean why don’t we have it on the leaderboards, it’s on the list of things to do. But it does exist!

Can wOBA be calculated over a span of seasons or is it most useful when looked at for single seasons only?

Yes, it absolutely can be calculated over multiple seasons!

So can I ask the obvious follow-up… how?? Is it as simple as taking a weighted average of each year’s wOBA (based on plate appearances)?

For example, if a player had 500 PAs in 2013 with a .350 wOBA, and 400 PAs in 2014 with a .375 wOBA… Is his 2-year wOBA .361 (((500*.350)+(400*.375))/900)? Or is that too simple, and I need to somehow get single coefficients that are based on both years combined, and then feed those coefficients into the normal wOBA formula?

Yes, weighted average will do the trick. But it will also work if you do it the more complicated way for whatever reason.
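The PA-weighted average from the question can be sketched like this. The function name is made up, and it assumes PA is close enough to the wOBA denominator for the weighting to hold.

```python
def combined_woba(seasons):
    """PA-weighted average of per-season wOBA.

    `seasons` is a list of (plate_appearances, woba) pairs.
    """
    total_pa = sum(pa for pa, _ in seasons)
    return sum(pa * woba for pa, woba in seasons) / total_pa


two_year = combined_woba([(500, 0.350), (400, 0.375)])
print(f"{two_year:.3f}")
```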

Hello,

I have been wondering what the best stat is to determine run scoring, or in other words, which stat has a stronger correlation with runs scored. I’m a Dodger fan and I see that a lot of people are worried that because of the departures of Kemp and Hanley, and the loss in power that comes with their departure, the Dodgers will score less.

So, I did an exercise. I am no stat genius, but I think I’m above average when it comes to math stuff, so bear with me. In Excel I took team data from the last 10 years (that’s 300 team seasons post steroid era), and used the correl function to determine the correlation between runs scored with the following stats: HR,OBP,ISO,wOBA,wRC,SLG and AVG. These were the results:

HR: 0.674

OBP: 0.888

ISO: 0.758

wOBA: 0.951

wRC: 0.738

SLG: 0.996

AVG: 0.911

1) AVG having a stronger correlation than OBP REALLY stood out to me.

2) SLG being so high surprised me, and it’s a lot higher than ISO.

3) The darn Yankees can hit, I hate the Yankees (sorry Yankee fans, not really sorry).

I just want to hear your thoughts on this. Is this enough of a sample size to proclaim wOBA as the best stat for this? Does this even mean anything?

Thanks in advance, have a good one!
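For readers without Excel, the same kind of correlation can be computed in plain Python. This is a sketch of the Pearson correlation, which is what Excel's CORREL reports. The five team seasons below are made-up toy numbers, not real data, and are there only to show the call.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Toy, invented team seasons (NOT real data): team wOBA and runs scored.
team_woba = [0.300, 0.310, 0.320, 0.330, 0.340]
team_runs = [620, 650, 700, 720, 780]

print(f"{pearson(team_woba, team_runs):.3f}")
```

With real data you would replace the two lists with one entry per team season and repeat the call for each stat.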

A column had been labeled wrong. This is the actual result:

HR: 0.674

OBP: 0.888

ISO: 0.758

wOBA: 0.951

wRC: 0.738

SLG: 0.911

AVG: 0.802

OPS: 0.953

I’m quite confused by the wOBA formula:

0.690×uBB + 0.722×HBP + 0.888×1B + 1.271×2B + 1.616×3B + 2.101×HR

How exactly did you get those numbers? What about wOBAScale? How do you calculate them?

Those numbers are based on linear weights. To calculate them, you basically run every PA through the run expectancy matrix and then average out the singles, doubles, etc. The scale is used to convert those averages so that league wOBA = league OBP.

Is there a “constant” scale that can be used for the wOBA that will provide a reasonably accurate result without having to recalculate the scale every day?

http://www.fangraphs.com/blogs/instagraphs/basic-woba-equation/