Expected Run Differentials 2.0

Over the first couple of months of the season, I’ve done a couple of end-of-the-month posts on Expected Run Differentials. While pythagorean expected record — the number of wins and losses a team would be expected to have based on their runs scored and allowed — has become nearly a mainstream concept, I’ve never been a huge fan of using runs to determine how well a team has played thus far.

After all, the entire point of looking at run differential instead of actual wins and losses is that we’re acknowledging that wins and losses are affected by the timing of when runs are scored or allowed, and history has shown that run sequencing is mostly just randomness. So, developing an expected win-loss metric that removes the effects of sequencing is a good idea, but pythagorean record only goes halfway to that goal. It removes the timing aspects of converting runs into wins, but ignores the timing aspects of converting baserunners into runs. Evaluating a team by its run differential removes some of the sequencing effects of wins and losses, but leaves plenty of others in, with no real reason why we should arbitrarily include some sequencing while taking other parts out.
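
For readers who want the mechanics, the pythagorean idea can be sketched in a few lines of Python. This is a minimal illustration, not the exact formula behind any particular standings page: the function name `pythag_win_pct`, the sample run totals, and the default exponent of 2 (Bill James’s original choice; other exponents such as 1.83 are also commonly used) are all assumptions made for the example.

```python
def pythag_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Pythagorean expected winning percentage from runs scored and allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)

# Hypothetical team (not from the article): 400 runs scored, 300 allowed in 78 games.
games = 78
win_pct = pythag_win_pct(400, 300)
expected_wins = win_pct * games

print(f"expected win%: {win_pct:.3f}")
print(f"expected record: {expected_wins:.0f}-{games - expected_wins:.0f}")
```

The only inputs are run totals, so however those runs happened to be bunched together is baked into the estimate, which is exactly the limitation described above.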

That’s why I’ve always preferred to look at a team’s performance based on expected runs scored and allowed, rather than actual runs scored and allowed; this gives us the most context-neutral evaluation of team performance to date. In the two preceding posts, I walked through the creation of expected runs scored and allowed totals based on each team’s wOBA and wOBA allowed, adjusted for baserunning and fielding values. As a linear weights based metric, wOBA is a very good context-neutral evaluator of individual events.

However, as Jesse Wolfersberger eloquently illustrated at The Hardball Times last week, run scoring at the team level isn’t really linear.

The exponential nature of offense means a good hitter in a good lineup is worth more than that same hitter in a bad lineup. On a good offense, that hitter is more likely to come to the plate with more runners on, more likely to get driven in once he’s on base. And, the lineup turns over more often, meaning he gets more plate appearances. Not only is he more valuable to a good lineup, but he’s even more valuable to a better one – the effect builds on itself.

While the wOBA-to-runs conversion works well in most cases, it does begin to break down a bit at the extremes, where the non-linear effects of team strength can come into play. And while these extreme examples still don’t change the conversion much, they can add up over a full season.

A team with a .365 wOBA would be predicted to score about 5.85 runs per game, but will actually score about 5.93 runs per game. On the low end, a team with a .285 wOBA would be predicted to score about 3.24 runs per game, but would instead score about 3.33 runs per game. Those are small differences, but remember that baseball has the longest regular season of all major sports. A difference of .09 runs per game equals about 14.6 runs per season, or about one-and-a-half wins.
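
The arithmetic in that last step is easy to check. The sketch below assumes a 162-game season and the common rule of thumb of roughly 10 runs per win; that rule of thumb is an assumption supplied here, not a figure from the passage.

```python
GAMES_PER_SEASON = 162   # MLB regular-season length
RUNS_PER_WIN = 10.0      # assumed rule of thumb, not taken from the article

gap_per_game = 0.09      # actual minus predicted runs per game, from the text
runs_per_season = gap_per_game * GAMES_PER_SEASON
wins_per_season = runs_per_season / RUNS_PER_WIN

print(f"{runs_per_season:.1f} runs per season")
print(f"{wins_per_season:.1f} wins per season")
```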

Given that our sister site just published an article explaining how wOBA can break down a little bit at the team level, I figured that continuing to use wOBA to create the Expected Run Differentials for the monthly posts was probably not the best idea, especially if there was a better alternative. And there is.

The model is called BaseRuns. It’s significantly more complex than a linear weights model like wOBA, but that complexity leads to estimates that fit each team’s own run environment; if a team has a very good offense, the extra value will be captured in BaseRuns when it is not in wOBA. If you’re particularly interested in how BaseRuns works, this is a good primer. If you don’t care about the how and just want to know that it does work, however, research supports the idea that BaseRuns is probably the most accurate run estimator in the public domain.
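
To make the non-linearity concrete, here is a minimal sketch using one widely circulated simple version of the BaseRuns formula (the model is generally credited to David Smyth). The coefficients below are an assumption for illustration; the version stored in the FanGraphs database may well use more inputs and different weights. Both team stat lines are invented.

```python
def base_runs(ab, h, tb, hr, bb):
    """Simple BaseRuns estimate: A * B / (B + C) + D."""
    a = h + bb - hr                                      # baserunners, excluding home runs
    b = (1.4 * tb - 0.6 * h - 3 * hr + 0.1 * bb) * 1.02  # advancement factor
    c = ab - h                                           # outs
    d = hr                                               # home runs always score
    return a * b / (b + c) + d

# Two hypothetical season lines, made up for illustration only.
weak = dict(ab=5500, h=1300, tb=2000, hr=120, bb=450)
strong = dict(ab=5500, h=1500, tb=2450, hr=200, bb=600)

def marginal_walk_value(team):
    """Extra estimated runs from adding one walk to a team's season line."""
    with_walk = dict(team, bb=team["bb"] + 1)
    return base_runs(**with_walk) - base_runs(**team)

for name, team in (("weak", weak), ("strong", strong)):
    print(f"{name}: {base_runs(**team):.0f} runs, "
          f"one extra walk adds {marginal_walk_value(team):.3f}")
```

In a linear weights model that extra walk would be worth the same number of runs in either lineup. Here it is worth more in the stronger lineup, because the baserunner is more likely to be driven in, and that is the team-level effect a linear model leaves out.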

So, rather than continue to create good-but-imperfect run estimations based on a linear weights model, I prodded our Dark Overlord and said “hey, we have BaseRuns in the database; can I have them please?” And being the benevolent overlord that he is, he did me one better; not only did he give them to me, he’s given them to us all.

On our updated 2014 BaseRuns Standings page, you will now find three sets of columns of year-to-date data: actual win/loss and runs scored/allowed totals, the pythagorean expected record based on those runs scored and allowed totals, and finally, the expected runs scored and allowed (and the corresponding win/loss totals) calculation based on each team’s BaseRuns estimate. Essentially, this standings page can be read from left to right in descending order of context.

The left-hand side includes all events and the sequencing of those events, giving us the totals that actually count in the Major League standings. The middle columns give you a reduced-context win estimate based on actual runs scored and allowed, retaining the sequencing that turns baserunners into runs. The right-most part of the table is the metric that is as context-neutral as you want to get at the team level, accounting for the non-linear nature of run scoring without giving teams extra credit for bunching their hits together above a reasonably normal expectation.

From here, you can check in every day and see the best estimate of how many games your team should have won based on their context-neutral performance without having to wait for my end-of-the-month post to update the leaders. And because all the columns are sortable, you can easily see which team has had the best offense or defense, or the combination of both. Here are the BaseRuns numbers, as of this morning.

Team G xWins xLosses xWin% +/- xRunDiff xR/G xRA/G
Athletics 78 50 28 0.638 -2 100 4.8 3.6
Angels 76 46 30 0.602 -3 72 4.8 3.8
Cardinals 79 45 34 0.565 -2 41 3.8 3.2
Nationals 77 43 34 0.561 -2 40 4.0 3.5
Giants 79 44 35 0.560 2 40 4.1 3.6
Dodgers 80 45 35 0.557 -1 42 4.4 3.9
Tigers 74 41 33 0.551 1 37 4.9 4.4
Brewers 80 42 38 0.528 6 20 4.4 4.1
Mariners 79 42 37 0.525   17 3.8 3.6
Blue Jays 80 42 38 0.521 2 17 4.9 4.7
Pirates 78 40 38 0.519 -1 13 4.3 4.2
Braves 77 39 38 0.510 1 6 3.9 3.8
Orioles 77 39 38 0.501 2 1 4.5 4.5
Rockies 79 40 39 0.501 -4 1 5.0 5.0
Reds 77 38 39 0.497 1 -2 3.9 4.0
Indians 78 38 40 0.489   -8 4.5 4.6
Cubs 76 37 39 0.489 -5 -7 3.7 3.8
Rays 80 39 41 0.484 -7 -11 3.9 4.0
White Sox 79 38 41 0.484 -2 -12 4.4 4.5
Mets 78 37 41 0.475 -1 -17 3.9 4.1
Twins 76 36 40 0.473   -19 4.2 4.4
Marlins 78 37 41 0.470 2 -22 4.1 4.4
Astros 79 37 42 0.468 -4 -22 4.0 4.3
Royals 78 36 42 0.464 4 -25 3.8 4.1
Yankees 77 36 41 0.462 4 -27 4.1 4.5
Phillies 77 35 42 0.451   -34 3.9 4.4
Red Sox 79 36 43 0.449   -36 3.9 4.4
Diamondbacks 81 34 47 0.421 -1 -61 4.1 4.8
Padres 79 32 47 0.403 2 -60 3.0 3.8
Rangers 77 30 47 0.389 5 -84 4.0 5.1

(Note: +/- is the number of wins a team has accrued relative to their expected wins by BaseRuns. A team with a +5 in that column has won five more games than expected, for example.)
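
Since the +/- column is just a subtraction, a team’s actual record can be recovered from the table. The sketch below copies games, expected wins, and +/- for three teams from the table above; the helper name `actual_record` is made up for the example, and because the table’s expected wins are shown as whole numbers, treat the output as a sanity check rather than an official record.

```python
def actual_record(games, x_wins, plus_minus):
    """Recover an actual win-loss record from expected wins and the +/- column."""
    wins = x_wins + plus_minus   # +/- is actual wins minus expected wins
    return wins, games - wins

# (games, xWins, +/-) copied from the table above
teams = {
    "Athletics": (78, 50, -2),
    "Brewers": (80, 42, 6),
    "Rangers": (77, 30, 5),
}

for name, row in teams.items():
    wins, losses = actual_record(*row)
    print(f"{name}: {wins}-{losses}")
```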

Not surprisingly, the best team in baseball this year has been the Oakland A’s; it was that way in both of the previous Expected Run Differential posts as well. The A’s are just trouncing their opponents, and no team in baseball has played better. But the A’s are also a good reminder of why BaseRuns is more useful to look at than pythagorean record, because their run differential suggests that they’ve actually underperformed this year. By pythag, they’ve played like one of the best teams in baseball, and have gotten “unlucky” to only be 48-30.

But that only tells half the story; they haven’t been great at converting their runs into wins, but they’ve been amazing at converting their baserunners into runs. They’ve been “lucky” one way and “unlucky” the other, and only looking at their run differential overstates how well they’ve played by including the “unlucky” part of sequencing while ignoring the “lucky” part.

Perhaps a graph will be more helpful than a big table at illustrating these differences. Below, I’ve plotted every team’s actual winning percentage, pythag winning percentage, and BaseRuns expected winning percentage on a marked line, so you can see where the variations are for each team. When the blue point is above the green line, you have a team that has won more games than expected; when the blue point is below the green line, that team has won fewer games than expected.

[Figure: WinEstimates. Each team’s actual, pythag, and BaseRuns expected winning percentage.]

The big overachievers? The Brewers, Royals, and Yankees, who have each clutched their way into better records than they have earned based on the underlying hits, walks, and other ways of reaching base or advancing runners. On the other end of the spectrum, the Rockies, Cubs, and Rays have all played better than their records would suggest.

But perhaps the most interesting data point to me? Look at the Rangers. They’re a well-known disappointment, but by BaseRuns, they’ve been the worst team in baseball this year. Worse than the Padres, who just fired their GM, and worse than the D’Backs, who hired an overseer to transition the organization down another path. The Rangers, who spent $140 million to sign Shin-Soo Choo and $136 million to trade for Prince Fielder, have been worse than their win-loss record and even their uninspiring pythagorean record. The 2014 Rangers have been atrocious.

Of course, a lot of that can be chalked up to injuries, and we’re still just dealing with a half-season of performance data. While BaseRuns is a very good run estimator, single-season inputs still shouldn’t be taken as measures of true talent, and if the Rangers played the Astros in a seven-game series next week, I’d probably still take the Rangers. But it’s not the slam-dunk choice you might think, and an argument could be constructed that the Astros are not the worst team in the AL West right now.

The good news is that you don’t have to wait a month for the next Expected Run Differential update. With the data now right on our Standings page, you can check it each morning, and see where your team’s to-date performance stacks up against other contenders. When projecting future performance, you still want to account for more than just season-to-date numbers, including past track record for each player, roster changes, and future schedules, which is where the projections on our Playoff Odds page come in handy. But if you’re just wondering how well your team has played this year, and what kind of record they should have, you’re not going to do better than the BaseRuns expected record.






Dave is the Managing Editor of FanGraphs.


Catoblepas
Guest
1 year 11 months ago

a) This is awesome and b) Oakland: 48-30 actual record, 50-28 by BaseRuns. Geeeeeeeez.

Catoblepas
Guest
1 year 11 months ago

Oops, Dave mentions this in the body of the post. But still! Geeeeeeeeeeeeez.

AC
Guest
1 year 11 months ago

A 104-win pace is good. Great, even. But not exactly historic.

Toffer Peak
Member
1 year 11 months ago

Not a historic Win-Loss record but possibly a historic Base-Runs record? We would expect Base-Runs records to be less extreme than actual Win-Loss records since they have less random variation, i.e. luck, included in them.

Matthew
Guest
1 year 11 months ago

Oakland pythag. 53-25!

Rubén Amaro, Jr.
Guest
1 year 11 months ago

Phillies rank 24th in RBI.

We really need to work on that.

lorecore
Guest
1 year 11 months ago

Do you have individual player baseruns?

mike wants wins
Guest
1 year 11 months ago

Can we learn anything about managers from this data? Is there any year to year consistency to expected wins vs actual wins? Or is that all luck?

jim S.
Guest
1 year 11 months ago

I saw a chart for managers since 1903, based on Pythagorean win expectancy, and Bruce Bochy grades out as the best of all time.

tz
Guest
1 year 11 months ago

May be true. However, the same was said about Mike Scioscia after a long run of the Angels exceeding their Pythagorean expectations. Then, of course, the coin began to flip the other way…

TC
Guest
1 year 11 months ago

Going just off what is in this article (I didn’t look up any historical data), there appears to be a common trait among the overachievers’ actual playing rosters that can probably explain it better than attributing it to the managers. Anecdotally speaking, the gap here appears to be the result of dominant late-inning bullpens, or lack thereof.
Now, I’m not very smart; as much as I like this site, I don’t always fully understand the processes and meanings behind the valuations I’m reading about. With that being said, even though this BaseRuns is a team statistic, is there a way to incorporate its components with this theory of relievers being responsible for “extra” wins, into finding a better way to value an individual reliever?
With the concept of the exponential value of offense to overall team success, is it possible that supplementing a mediocre lineup with a dominant, shut-down, high-leverage reliever, instead of one great hitter, is a better way to add actual wins to a team’s record?

Jose
Guest
1 year 11 months ago

Is it just clutch that makes the “overachievers”? I’m thinking blowouts could really mess with the results, especially when linked with a really bad SP who keeps his job because there’s no one else to take it.

Jeremiah
Guest
1 year 11 months ago

Are there distributions for Base Runs? A while back I looked into improving on Pythagorean predictions. It turns out that you can get a slight improvement by using actual distributions of runs scored and allowed. In fact, the Pythagorean formula can be derived by fitting a Weibull distribution to scoring data, but that ignores variance in the distributions. I don’t know if it really matters much, but it might be a similar order of magnitude to the difference between using wOBA and Base Runs.

hstrohm
Member
1 year 11 months ago

Is this BaseRuns data factored into the 2014 Projected Full Season Standings on the standings page?

Darren
Guest
1 year 11 months ago

Perhaps indirectly through ZiPS and Steamer ROS projections, but the chart provided is still a small sample size of what has happened over 80 games this season. Any projection system going forward will take in multi-year data.

David
Guest
1 year 11 months ago

The Dodgers can’t catch a break. Even when their record is better than the Giants the Dodgers winning % is somehow lower.

tz
Guest
1 year 11 months ago

How much 2013 WAR do the Rangers currently have on the disabled list(s)?

KJOK
Guest
1 year 11 months ago

Excellent addition. However, what is the trick for looking at team Baseruns in past seasons? I can’t seem to figure it out…

Anonity
Guest
1 year 11 months ago

If good hitters are worth more in a good lineup than a bad lineup, then doesn’t that support the argument that in a really close race, a player on a contending team should win an MVP over a player on a lousy team?

Innocuous Observation
Guest
1 year 11 months ago

Nope!
Cause that’s crediting the player with something they have little power over: the rest of their offense.

Anonity
Guest
1 year 11 months ago

But if the award is simply based on who is the most valuable, and good hitters are worth more in a good lineup than a bad lineup, the fact that an individual has little power over the rest of their roster has little significance.

Andy
Guest
1 year 11 months ago

See Dave C.’s reply to a responder above. wOBA is fine for individual players; BaseRuns is only needed for teams. Meaning the extra “value” a good player gets from being on a better offense is not really individual value. The extra value is reflected in more runs and RBIs, which I think we can all agree are not good measures of individual value.

I think another way of putting it is that baseruns in effect alters the linear weights (they are no longer linear, but change with the team environment). But to use weights that differ from team to team to compare the value of players on different teams is not really fair.

Andy
Guest
1 year 11 months ago

That said, comparing players by baseruns is somewhat akin to arguing that players on borderline teams are more valuable, because the wins they generate mean the difference between postseason and staying home, whereas the wins generated by a player on a good team may not actually be critical to the team’s making the postseason.

bookbook
Guest
1 year 11 months ago

Interesting. It appears the M’s have caught up to their luck a bit, and no longer look like the luckiest team in the majors?

GreenMountainBoy
Guest
1 year 11 months ago

I’ve done the analysis… I’ve done the math. It’s true!!! Every base gained is worth 1/4 run. Better teams slightly above that, lesser teams slightly below. It’s simple, and what it boils down to is this. What’s your team’s BA w/RISP? I know.. it’s a regress to the mean stat like W/L in 1-run games… but for 1 year? It’s the best predictor of W/L above Pythagorean. It’s been a commonly available stat for all these years, staring us right in the face, but it’s the most important one, all sabermetrics aside.

Sports Enthusiast
Guest
1 year 11 months ago

You’re confused between predictive and descriptive stats. A high BA with RISP describes a team that has been scoring a lot of runs, but, unless matched to a team getting on base a lot and hitting for power all the time, it doesn’t predict that the team will do much with these opportunities in the future, nor get them in the first place.

Daniel
Guest
1 year 11 months ago

:( rangers

jjgilham
Guest
1 year 11 months ago

This is very interesting. Which form of Pythag are you using? I tried to duplicate the data off the standings Baseruns page using
X = ( (RS + RA)/G ) ^ .285 but I could not reproduce your numbers. Do I have an out-of-date formula or are you doing something more exotic?

Nick D.
Guest
1 year 11 months ago

Really cool stuff. Dave, in your article you mention the Rangers as being the worst team by BaseRuns, and on the standings page they have also outplayed their Pythag Win%, yet on the playoff odds page, with all of their injuries and in the same division as the top two BaseRuns teams (Athletics and Angels), they are still projected to play at a .485 clip the rest of the year. How is this the case given all of their injuries, the strength of schedule within their division, and their underperformance to date?

Gregg
Guest
1 year 11 months ago

How much would the Indians improve by if you removed all their unearned runs?

Pwn Shop
Member
1 year 11 months ago

Spectacular post. I have been looking for somewhere to find BaseRuns Against, or wOBA against, without having to add up all the pitching stats. Excellent!

Garrett
Guest
1 year 11 months ago

Hire an editor.

Mike
Guest
1 year 11 months ago

Why don’t you volunteer? My guess is you have loads of time to spare for it.

Stephen
Guest
1 year 10 months ago

Cubs should almost be a .500 team….hmmm
