Protecting Players Against Big Data

This is Mike Hattery’s first piece as part of his September residency at FanGraphs. Hattery writes for the Cleveland-based site Waiting for Next Year. He can also be found on Twitter. Read the work of all our residents here.

While there are certainly examples to the contrary, it’s generally the case that outlets such as Baseball Prospectus and this esteemed institution approach their analytical work on an individual player in the context of that player’s value to his team. Since most writers begin as fans, and because fandom in baseball — and, ultimately, most sports — tends to begin with an allegiance to a city or team, this isn’t surprising.

For the actual major-league clubs, this imbalance is naturally even more pronounced. All 30 organizations feature an analytics department of some sort, and all 30 of those departments are constructed ultimately to benefit the team. Even in those instances, for example, where an observation about spin rate aids an individual pitcher, the added value is ultimately passed on to the club.

And here we arrive at a point of concern: while analytical work in baseball offers tremendous insight and can even benefit individual players, it appears that organizations have a significant advantage over players in their access to data and their capacity to use that data in decision making.

Now, if this analytical work were limited merely to assessing player value or estimating the possible range of a player’s outcomes, the informational asymmetry would represent less of a concern. However, as teams and public analysts push further into health-risk modeling, the impact on players grows increasingly serious.

In a recent piece illuminating the information gap between teams and agents, R.J. Anderson wrote the following:

The ongoing data revolution has obscured a simple fact. With teams receiving improved information, their greatest competitive advantage is perhaps no longer over one another. Rather, the information gulf now resides between the teams and the players — or, precisely, the players’ agents. With the league investing in new data sources, like Statcast, the gap could continue to grow.

Of most concern, perhaps, is that the terms of the 2017-2021 MLB Collective Bargaining Agreement guarantee player agents no greater access to data than the common fan. Teams, meanwhile, are afforded extensive access to additional data gathered by Statcast technology.

One might argue that this asymmetry isn’t all that problematic, that because teams and players have aligned interests, the informational advantage possessed by clubs is ultimately moot. However, this argument ignores the reality that players and organizations often have different short- and long-term interests.

Consider, for instance, Cleveland right-hander Bryan Shaw. Shaw has an interest in ensuring that his right arm remains connected to the rest of his body. With Shaw’s contract due to expire at the end of 2017, however, the Indians have an interest in extracting every drop of value they can from Shaw before he moves on to another club.

A free agent this winter, Bryan Shaw’s interests aren’t entirely aligned with his club’s. (Photo: Erik Drost)

Consider, as well, the examples of Drew Pomeranz and Colin Rea. For those who may not immediately remember, Padres general manager A.J. Preller was suspended for 30 days for his failure to disclose medical information in the trade of Drew Pomeranz from the San Diego Padres to the Boston Red Sox. At the time of the transaction, the most pressing concern was whether San Diego had acted in good faith while dealing with Boston. Possibly the greatest risk, though, was the one absorbed by Drew Pomeranz. Colin Rea’s case is similar, except that he was ultimately returned by the Marlins to the Padres. He hasn’t pitched in the major leagues since July 30, 2016, a few days before the deal was reversed.

In the cases both of Rea and Pomeranz, the San Diego Padres appeared to be intent on moving what they regarded as high-risk assets as quickly as possible, before their injury clocks expired and their value cratered.

The Padres decided to move their assets in order to protect organizational goals; however, the sort of remedial or progressive injury-risk action that may have aided the player wasn’t a priority. While protecting long-term arm health was a priority for Drew Pomeranz himself, it mattered less to the Padres than protecting his value as an asset. In this sense, Pomeranz and the Padres had split incentives. Pomeranz had yet to sign an extension and remained in the affordable team-control window. During such a window, a pitcher is both most attractive to clubs and most incentivized to protect his long-term arm health. The Padres, on the other hand, were staring into the start of a long-term rebuild. Their goal? To acquire prospect talent. With these split incentives, it’s difficult to expect that a player will be capable of making efficient long-term healthcare decisions.

It would be rash to assert that the trades of Pomeranz and Rea were driven solely by analytically crafted health modeling. Indeed, faulty medical disclosures were the issue in the Red Sox trade. However, part of the “arms race” for the brightest PITCHf/x analysts and data scientists has been to isolate injury risk. This is, in many ways, positive for players, because organizations with better injury-risk information can improve the quality of information in the treatment decision-making process.

Yet, there’s significant danger for players when teams possess both different short- or long-term incentives and a substantial advantage in the amount of data and analysts they can utilize. Whereas teams have access to deeper wells of data and five times the analysts, agencies are more concerned with adding to and managing their client base. This leaves players at a severe deficit when making decisions.

While the issue is easy to identify, the solutions aren’t as simple. The 2017-2021 MLB Collective Bargaining Agreement doesn’t provide any increased access for agents or groups to Statcast data. Even if it did, additional data would be of little use without resources or analysts capable of using the information. With this in mind, the MLB Players Association would be wise to pursue two additions in the next CBA: (1) increased sharing of Statcast-generated data; and (2) funding for a player consortium specifically tasked with tracking and counseling players regarding injury risks.

This solution is certainly imperfect; requiring Major League Baseball teams to concede an informational advantage and their intellectual property would be no small task. Yet, as teams continue their pursuit of health modeling through advanced statistics, the players’ union must fight for some protections against the negative externalities that may be a byproduct of the teams’ informational advantage.

Mike is a student at Case Western Reserve University School of Law. He has served as a Resident at FanGraphs, and writes at Waiting For Next Year. Follow him on Twitter @snarkyhatman.

