Bill James’ discovery of the Pythagorean Win Expectation is one of the cooler findings of sabermetric research. You can read up on the details by following the given link. In short, James found that one can get a pretty good approximation of a team’s winning percentage from only its runs scored and runs allowed, using the following formula:

Winning Percentage = (Runs Scored)^2 / [(Runs Scored)^2 + (Runs Allowed)^2]
It works remarkably well, and more recent versions like PythagenPat are even more accurate. I won’t repeat the basics, which can be looked up elsewhere. Instead, I want to address the occasional misuse of the formula for building narratives of teams being better or worse.
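For readers who want to play with the numbers, here is a minimal sketch of both estimators. The function names are my own, the example team is hypothetical, and the PythagenPat constant of 0.287 is a commonly quoted value that I am assuming here; other values close to it are also in use.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and runs allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def pythagenpat_win_pct(runs_scored, runs_allowed, games, k=0.287):
    """PythagenPat: the exponent floats with the run environment.

    k = 0.287 is an assumed, commonly quoted constant, not a value
    taken from this article.
    """
    exponent = ((runs_scored + runs_allowed) / games) ** k
    return pythagorean_win_pct(runs_scored, runs_allowed, exponent)


# Hypothetical team: 700 runs scored, 750 allowed over 162 games.
print(f"Pythagorean: {pythagorean_win_pct(700, 750):.3f}")
print(f"PythagenPat: {pythagenpat_win_pct(700, 750, 162):.3f}")
```

Multiplying either result by the number of games turns the percentage into an expected win total.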
Graham MacAree actually addresses this problematic use of the Pythagorean Win-Loss formula in the previously-linked entry. I want to elaborate on Graham’s point from a slightly different angle, given that this is the time of the year when fans of teams, especially those that are rebuilding, are looking for hope and faith for the following season. A run differential that indicates a better-than-actual winning percentage is sometimes seen as a reason for hope (or, if the differential is worse than the team’s record, despair) for the team going forward.
This is problematic for several reasons. Leaving aside (controversial) ideas held by some that certain teams and managers possess skills that allow them to outplay their Pythagorean expectation, run differential is itself subject to a great deal of random variation. Even if deviation from the expected win percentage is simply the product of “luck” (the usual sabermetric shorthand for random variation), runs scored and allowed are themselves products of events that vary around the true talent of the players involved.
While this may seem obvious, deviation from Pythagorean expectation is cited often enough as evidence of how talented a team really is that it is worth showing what sort of silly narrative can be constructed by relying solely on a team’s run differential as an indication of its quality and likely future performance. With that in mind, let’s look at five consecutive seasons of an actual team’s expected winning percentage according to PythagenPat and see what sort of “narrative” it implies.
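To get a feel for how much run differential can bounce around on its own, here is a rough simulation sketch. Everything in it is an assumption for illustration: the team's "true talent" of 4.3 runs scored and 4.6 allowed per game is made up, and drawing each game's runs from a Poisson distribution is a crude stand-in for real run scoring (which is more spread out, so this likely understates the variation).

```python
import math
import random


def poisson(rng, mean):
    """Draw one Poisson variate using Knuth's multiplication method."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_season(rng, rs_per_game, ra_per_game, games=162):
    """Return the Pythagorean (exponent 2) win pct of one simulated season."""
    scored = sum(poisson(rng, rs_per_game) for _ in range(games))
    allowed = sum(poisson(rng, ra_per_game) for _ in range(games))
    return scored ** 2 / (scored ** 2 + allowed ** 2)


rng = random.Random(0)  # fixed seed so the run is repeatable
# Hypothetical, unchanging true talent: 4.3 scored, 4.6 allowed per game.
seasons = [simulate_season(rng, 4.3, 4.6) for _ in range(1000)]
true_pct = 4.3 ** 2 / (4.3 ** 2 + 4.6 ** 2)

print(f"talent-level expectation: {true_pct:.3f}")
print(f"simulated range: {min(seasons):.3f} to {max(seasons):.3f}")
```

The talent level never changes between simulated seasons, so any spread in the output comes from random variation in runs scored and allowed alone.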
Year 1: .454
Year 2: .442
Year 3: .403
Year 4: .395
Year 5: .460
A (fictional) person who thinks run differential is the golden road to a team’s true talent would think that this team was mediocre for a couple of years, then absolutely dreadful for a couple more, then returned to about its original level in the fifth season. While that might be a good description of this team’s observed performance in terms of actual won-loss records, I doubt many would consider it an adequate, or even accurate, description of the state of this “randomly selected” team, the 2007-2011 Kansas City Royals. The simplistic “run differential narrative” would tell you (if you compare it to the actual records) that, e.g., the 2011 team is better than its record shows (they’ve underperformed their PythagenPat win expectation by about six games so far this year), but that the Royals haven’t made any real progress over the last five seasons.
Of course, no informed observer would say that. Indeed, those who use run differential to say the Royals have improved this season know as well as anyone that there is a huge difference between the situation of the 2007 and 2008 teams, which were cobbled together from previously available “talent” and some free agent signings of varying wisdom, and the 2011 team, which is mostly cost-controlled, very young, and has more talent waiting for it in the minors. Indeed, the composition of teams changes from one season to the next (which relates to the mistake of taking the current season to be a constant for the future, or even the playoffs), so why would one think the Pythagorean expectation from one season should apply to a different group of players?
In fact, I do think the Royals have made progress this season and that their future looks brighter than it has in a long time. But the Royals are just an example for the purposes of making the larger point: run differential doesn’t add much, if anything, to our measuring of that progress. We know that observed performance of players varies from their true talent. We know that not only the composition of teams but also the true talent of the players themselves (particularly those at the extremes of the aging curve) generally varies a great deal from season to season. All of those factors are important in projecting how teams will perform in the future.
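To put that narrative in terms of wins, here is a small sketch that converts the five expected winning percentages listed above into expected wins. It assumes a full 162-game schedule for every year (the 2011 season is still in progress as I write), and it maps Year 1 through Year 5 onto 2007 through 2011.

```python
# Expected winning percentages quoted above (Year 1 = 2007 ... Year 5 = 2011).
expected_pct = {2007: 0.454, 2008: 0.442, 2009: 0.403, 2010: 0.395, 2011: 0.460}
GAMES = 162  # assumed full-season length

# Convert each percentage into an expected win total.
expected_wins = {year: pct * GAMES for year, pct in expected_pct.items()}

for year, wins in expected_wins.items():
    print(f"{year}: {wins:.1f} expected wins")

# The "run differential narrative" in a single number:
# net change in expected wins from the first season to the fifth.
net_change = expected_wins[2011] - expected_wins[2007]
print(f"2007 to 2011 change: {net_change:+.1f} expected wins")
```

On this reading, five years of work come out to a net change of about one expected win, which is exactly the sort of conclusion the rest of this post argues against.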
But what does examining a team’s past run differential contribute to the job of projecting a team’s future performance beyond those factors? Nothing, really, by itself. It does have its uses: for example, when one has projected runs allowed and scored, one can use the Pythagorean formula to estimate how many wins can be expected for the team. But let’s stop using the formula in combination with past run differential as a proxy for projection.