MVP and Cy Young Voting Revisited

The discussion following my last post about MVP and Cy Young voting produced a few good suggestions for improving the analysis, so I decided to go back and revisit the data. I’m going to give the baseball writers the benefit of the doubt, try some different methodology, and see if I can find any evidence that they are doing a better job of filling out their award ballots today than they were ten years ago.

One piece of clarification from last post: the correlations are based on the percent of points a player received out of the maximum possible. This adjusts for ballot differences between the leagues and years.
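To make that adjustment concrete, here is a minimal sketch of the calculation. The function name, the voter counts, and the point totals are all made up for illustration; the only thing taken from the analysis is the idea of dividing points received by the maximum possible.

```python
def vote_share(points, voters, first_place_points):
    """Return points received as a fraction of the maximum possible.

    The maximum is what a player would get if every voter
    placed him first on the ballot.
    """
    max_points = voters * first_place_points
    return points / max_points


# Hypothetical example: two leagues with different numbers of voters.
# Raw point totals differ, but the shares are directly comparable.
share_a = vote_share(points=280, voters=28, first_place_points=14)
share_b = vote_share(points=320, voters=32, first_place_points=14)
print(share_a, share_b)
```

Working with shares instead of raw points is what lets a season with more voters, or a different ballot format, sit on the same scale as any other.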

I also think it is important to restate the prism through which these graphs should be viewed. The level of correlation is not the target of the analysis, but rather the trends of the series. If the writers truly embraced the statistics revolution that took place this decade, we should see some positive trend in the correlations between award votes and advanced statistics.

Looking Only At The Top Five

It is possible that voters are making statistics-driven decisions at the top of their ballots, but then reverting to qualitative or geographic factors at the bottom. The players receiving a vote or two could be throwing the correlations off. By looking at only the top 5, we may see different results.

With the smaller sample, the graph is much more erratic and volatile than it was previously. Even through the volatility, there is still no evidence of a positive trend for either award.

It was very surprising to see a strong negative correlation in 2000. I double-checked the numbers, and it’s no mistake; it is just an artifact of looking at only five players. That led me to the next iteration of this analysis:

Looking At All Qualified Players

Instead of looking at just the top five players, the correlations may be improved by including all qualified players from each season. Adding players who put up mediocre statistics and, in turn, received no award votes brings more variance to the data set and might help home in on the relationship between votes and WAR that I’m looking for.

This change was a definite improvement in methodology. The correlations were much more stable, not fluctuating wildly year to year. Also, there is a slight positive trend for the Cy Young over the decade, particularly over the past five years. This may be the only evidence that I have found on any level that the voters are improving, but it is hardly significant or conclusive.
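The difference between the two approaches can be sketched with a small example. The season below is entirely invented (it is not the data behind the graphs), and the helper names are mine; the point is only to show how the same Pearson correlation is computed on the top five vote-getters versus every qualified player, with zero-vote players included at a share of 0.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_for(players):
    """Correlate WAR with vote share for a list of (war, share) pairs."""
    war = [p[0] for p in players]
    share = [p[1] for p in players]
    return pearson(war, share)


# Hypothetical season: (WAR, vote share) for each qualified player.
# Players who received no votes are kept with a share of 0.0.
season = [
    (6.1, 0.95), (7.4, 0.80), (5.2, 0.62), (6.8, 0.41), (4.9, 0.33),
    (3.0, 0.02), (2.5, 0.0), (2.1, 0.0), (1.4, 0.0), (0.8, 0.0), (0.2, 0.0),
]

# Top five by vote share, then the full qualified set.
top_five = sorted(season, key=lambda p: p[1], reverse=True)[:5]
r_top_five = correlation_for(top_five)
r_all = correlation_for(season)
print(r_top_five, r_all)
```

With these made-up numbers the top-five correlation comes out noticeably weaker than the full-set one, because five similar players leave very little spread in WAR for the correlation to work with. Real seasons will of course behave differently.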

Using Different Statistics

One assumption that has been made up until now is that WAR is the best statistic to judge the voters’ assimilation of advanced statistics into their balloting. I again want to make it clear that this analysis is not making the argument that WAR is the only statistic that writers should be using. It is one of many advanced statistics that could be utilized, but it is particularly convenient for this purpose as it is both a hitting and pitching stat.

By breaking hitters and pitchers up, we can examine some position-specific statistics that may shed better light on the situation. I again used all qualified players in the correlations, but there is one major change to the pitcher data set: I only included starting pitchers. This was done in an effort to look at the pitchers on an apples-to-apples basis. For example, a starter with a 3.00 ERA is a Cy Young candidate, but a reliever with the same ERA probably isn’t.

For pitchers, I examined ERA and FIP (xFIP is not available for the whole decade). Granted, these are highly collinear statistics, but they are no more similar in 2000 than they are in 2010. And the slight differences between these statistics are exactly what separate conventional statistics from sabermetrics.

For this graph, remember that the further negative the value, the stronger the correlation, because smaller ERAs and FIPs signify better pitchers.

The results are truly counterintuitive. Linear trend lines fitted to these series show that the correlation with FIP grows slightly weaker over the decade while the correlation with ERA grows slightly stronger. The writers were better at pairing Cy Young votes and FIP in the pre-Moneyball era than they are today.
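For anyone who wants to reproduce the trend-line step, here is a minimal sketch: fit an ordinary least-squares line to each series of yearly correlations and read the sign of the slope. The yearly values below are invented placeholders, not the numbers behind the graph, and the function name is my own.

```python
def ols_slope(xs, ys):
    """Slope of the ordinary least-squares line of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


years = list(range(2000, 2011))

# Hypothetical yearly correlations between Cy Young vote share and each stat.
# Values are negative because lower ERA/FIP goes with more votes.
r_fip = [-0.60, -0.58, -0.61, -0.57, -0.59, -0.55,
         -0.56, -0.54, -0.57, -0.53, -0.55]
r_era = [-0.62, -0.63, -0.61, -0.65, -0.64, -0.66,
         -0.63, -0.67, -0.66, -0.68, -0.67]

slope_fip = ols_slope(years, r_fip)
slope_era = ols_slope(years, r_era)

# A positive slope moves a negative correlation toward zero (weaker);
# a negative slope moves it further from zero (stronger).
print(slope_fip, slope_era)
```

Because these correlations are negative, the sign convention is easy to misread: a rising line means the relationship is getting weaker, not stronger.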

This conclusion is the same when looking at hitter data.

When comparing the correlations between MVP votes and wRC+ and RBI, there is simply little or no change between the turn of the century and today. In fact, there is again a slight negative trend for wRC+ over the time span.


In revisiting this analysis, I concur with my previous conclusion: writers are no better at picking MVPs and Cy Youngs today than they were ten years ago. The statistics revolution has, no doubt, changed the landscape of baseball. However, when it comes to filling out award ballots the baseball writers have yet to truly embrace advanced statistics.

Jesse has been writing for FanGraphs since 2010. He is the director of Consumer Insights at GroupM Next, the innovation unit of GroupM, the world’s largest global media investment management operation. Follow him on Twitter @jesseberger.

Couple of suggestions to possibly remove some noise:

1) Seems to me like you should be controlling somehow for team performance, even if it’s just a dummy variable.

2) Also, future analysis might want to consider improvement relative to recent performance (e.g., weight three-year average or something), as anecdotally it seems that voters are more likely to reward a good player having a great year than a great player having an average year.