Can We Calculate MVP with a CPA?

No, this isn’t a piece for accountants, so please don’t give up on it or go to sleep! It is an MVP discussion, and there is always a lot to talk about with the MVP, the very definition of which is vague, entitling anyone to interpret it as they wish. There are perpetual questions: Is it for an outstanding player or for one who meets some criterion of clutch? Can a pitcher be more valuable than an everyday player? Must a candidate play on a contender?

This article lays out a framework for quantifying these issues. A definitive answer requires a little more data than we now have, but that data is obtainable, and with it we would have an interesting quantitative measure of the MVP.

Let’s start with principles:

  • The objective is to win a championship. I don’t expect this to be controversial. Is it? As we’ll get into, this doesn’t mean a player on a non-contender can’t win, but it will be more difficult for him to do so.
  • Context and chances matter. We aren’t trying to pick the best player; we’re trying to pick the most valuable. We’re not trying to forecast the future; we’re looking back at the past. Whether a player benefits from the luck of situations or of opportunities, the player who capitalizes on his luck seems to this author to have been more valuable than an unlucky player who doesn’t get as many such opportunities. Dave Studeman and Dave Cameron have written well on this topic. If you don’t agree, take it up with them. (For future research: must an author’s first name begin with the letter “D” to believe this?) Further, the context of a player’s team matters: clinching a pennant on the last day is more valuable than an April rout or a meaningless September game between call-ups.

Introducing CPA

Accordingly, we’ll take something old, Win Probability Added (WPA), and dust off and tweak something else old, Championship Leverage Index (CLI), to make a new statistic that measures value: Championship Probability Added (CPA). Our formula is CPA = the sum, over all of a player’s games, of daily WPA x CLI.
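To make the formula concrete, here is a minimal Python sketch of the sum. The function name, the (WPA, CLI) pairing and the sample numbers are assumptions made up for illustration; they are not real 2015 data, and nothing here is an official implementation.

```python
from typing import Iterable, Tuple


def championship_probability_added(games: Iterable[Tuple[float, float]]) -> float:
    """Sum WPA x CLI over a player's games.

    Each item is a (wpa, cli) pair for one game day:
    wpa is the player's Win Probability Added that day,
    cli is the weight given to that game's importance.
    """
    return sum(wpa * cli for wpa, cli in games)


# Hypothetical three-game stretch (illustrative values only).
games = [
    (0.25, 0.5),   # good game, low-importance day
    (-0.10, 1.0),  # poor game, average-importance day
    (0.40, 2.0),   # big game, high-importance day
]
cpa = championship_probability_added(games)
print(f"CPA over the stretch: {cpa:.3f}")
```

The point of the weighting shows up in the third game: the same WPA counts four times as much there as it would on the low-importance first day.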

The facets of WPA are discussed thoroughly in another Studeman article. Suffice it to say, it captures a hitter’s or pitcher’s contribution to the probability of his team winning a game, which we can take as a player’s value to his team in that particular game.

As for the importance of the game to the team, Studeman and Sky Andrecheck have developed a measure, the Championship Leverage Index, which captures how the outcome of a game affects a team’s championship probability. As Studeman pointed out in his WPA article, however, the new wild-card format makes calculating CLI difficult.

Fortunately, FanGraphs has a big part of the answer in its playoff probability table, which measures each team’s playoff and championship probabilities daily. The day-to-day changes in these probabilities are indicative of each game’s importance, although a full measure of a game’s importance would require running the simulations 15 more times to determine the change in probability for each game’s alternative outcome.

There are different measures of championship probability in these tables, based either on projections or on random (coin-toss) probabilities for the balance of the season. The projection-based probabilities may be more accurate, but for our purpose, measuring the value of each game, the coin-toss probabilities are more useful, for two reasons. 1) The projection-based probabilities are more volatile early in the season, because they vary not only with each game’s outcome but also with players’ individual performances, which in turn affect their teams’ projections. Early-season games would thus be weighted more heavily than late games. 2) A player’s individual impact can be diminished because it has already been factored into his team’s projections.

The 2015 MVP Race by CPA

For now, without the complete probabilistic simulations, we’ll approximate the value of a game by taking the absolute value of the daily change in a team’s championship probability. We use the absolute value because it measures the game’s importance whether or not the team wins. Without it, a player would be penalized when his team loses a game, even if he has a big (valuable) game (high WPA).
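The approximation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the sample probabilities and WPA values are hypothetical, and the input is assumed to be a hand-recorded list of a team’s daily championship probabilities with one more entry than there are game days.

```python
from typing import List, Sequence


def daily_leverage_proxy(champ_prob: Sequence[float]) -> List[float]:
    """Absolute day-over-day change in championship probability.

    champ_prob[i] is the probability entering game day i; the final
    entry is the probability after the last game day.
    """
    return [abs(after - before) for before, after in zip(champ_prob, champ_prob[1:])]


def cpa_from_probabilities(champ_prob: Sequence[float], daily_wpa: Sequence[float]) -> float:
    """Weight each day's WPA by that day's leverage proxy and sum."""
    leverage = daily_leverage_proxy(champ_prob)
    if len(leverage) != len(daily_wpa):
        raise ValueError("need exactly one more probability than WPA values")
    return sum(wpa * lev for wpa, lev in zip(daily_wpa, leverage))


# Hypothetical data: probabilities as fractions (0.050 means 5.0%).
champ_prob = [0.050, 0.054, 0.051, 0.058]
daily_wpa = [0.20, 0.30, -0.10]

leverage = daily_leverage_proxy(champ_prob)
cpa = cpa_from_probabilities(champ_prob, daily_wpa)
print([round(x, 4) for x in leverage], round(cpa, 4))
```

On the second day in this made-up example the team’s probability falls, yet the player’s positive WPA still earns positive credit, which is the reason for taking the absolute value.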

For now, the daily changes must be recorded from FanGraphs by hand, so we’ll run with an illustrative example rather than a definitive analysis.  Let’s start with the top two players by WAR in each league:

American               National
Player      WAR        Player      WAR
Trout       9.0        Harper      9.5
Donaldson   8.7        Kershaw     8.6

In the AL, both the Angels and the Jays were in contention, although the Jays’ chances became markedly better later in the season.

Championship Probability
            All Star Break   September 1   September 30
Angels      6.4%             1.0%          2.4%
Jays        2.1%             10.0%         12.6%

Although the Angels’ championship probability was low, late swings in their chances still gave Mike Trout plenty of opportunity to add CPA. Even so, he couldn’t make up all the ground on Josh Donaldson, who posted a high WPA during the Jays’ high-CLI second-half run.

Cumulative Championship Probability Added
            All Star Break   September 1   September 30
Trout       0.9%             1.1%          1.3%
Donaldson   1.1%             2.1%          2.5%

On the NL side, WAR leader Bryce Harper saw his CPA held back as the Nats dropped out of playoff contention.

Championship Probability
            All Star Break   September 1   September 30
Dodgers     8.9%             11.7%         12.8%
Nationals   8.1%             0.8%          0.0%

Harper’s early-season lead fell by the wayside as Kershaw’s performance improved from its negative start and the Dodgers remained in the championship hunt.

Cumulative Championship Probability Added
            All Star Break   September 1   September 30
Kershaw     0.5%             1.2%          1.5%
Harper      1.1%             1.1%          1.4%

So, definitive MVP stat?  Not yet, but hopefully a step in that direction.  Calculating a probabilistic CLI would be a big help.  Improvements to WPA to incorporate base running and fielding would help too.

Thoughts?



Eli Ben-Porat
Member

Like the overall concept. Here’s my main issue: Say a player has a HUGE WPA game and his team wins, but the team they are tied with also wins, thus leaving the championship probability the same. The next game he has a negative WPA, but his team wins and the other team loses, driving a large change in the championship probability. This would distort your end numbers by introducing random noise.

Also, you are using the absolute value, so if a team is in decline, it will inflate the player’s CPA despite the fact that games are becoming less valuable.

Here’s my recommendation: Weight each game as a % of Day 0 Championship probability. So if Washington started at a 10% chance and on day 52 were up to a 12% chance, the games would then be counted at 1.2 and on day 152 when they were at 1%, they would be counted at 0.1. This may be extreme, but will smooth out the randomness a little.

tz
Guest

Good stuff. I agree with using the coin-flip mode because of its consistency with how WPA is calculated – both use league-average for future expected probabilities, which makes the CPA metric also a comparison to average.

(Also, props for copying all the daily playoff odds data from here!)

My only concern would be the distortion caused by using the total change in championship probability, which includes the impacts of wins and losses from other teams in the race. For most of the season the error in this method should roughly be a wash, but as Eli’s example demonstrates, any correlation between that error and a player’s game-by-game WPA could bring material noise into the calculation. This could be especially problematic late in the season, when both the game “leverage” and the impact of other teams’ games on the championship probability are the greatest.

So the probabilistic CLI would be the most important enhancement to make. Enhancing WPA to properly reflect baserunning and fielding could be a huge challenge – Russell Carleton has a great article for Baseball Prospectus showing the challenges in partitioning WPA like this for just one play. The best you could probably do without a major headache is to add back the full season wins above/below average for fielding, including the positional adjustment.

Still, for those who look to the narrative approach for determining the MVP, looking for the intersection of player performance and probability of the postseason, this is a great framework.

tander28
Member

Similar to an idea put out on Grantland earlier this month.

http://grantland.com/the-triangle/2015-mlb-mvp-debate-bryce-harper-yoenis-cespedes-anthony-rizzo/

I love the idea of a CPA stat. To me, that’s what the MVP is designed to reward, but I certainly understand and appreciate other ideas of the award.

tz
Guest

Did you happen to save the daily W/L records along with the coin-flip championship odds?

If so, I might have a way of calculating the CLI for each game without needing simulations. I’ll post that to the community page if it works the way I think it’s going to.

Simon
Guest

Baseball Gauge has daily coin-flip playoff odds in their game logs:

http://seamheads.com/baseballgauge/team.php?yearID=2015&teamID=TOR&tab=schedule