Is Proprietary Information Disappearing?

Carl Crawford, Adam Dunn, and Jayson Werth signed large free agent contracts with new teams last offseason, and each was an unequivocal disappointment in 2011 with his new club. This phenomenon is not limited to free agents. In recent memory, several highly touted prospects have been traded and not lived up to expectations with their new teams: Justin Smoak, Brett Wallace, and Kyle Drabek, to name a few.

Whenever a player changes teams and fails to live up to expectations, I find myself wondering, “Did his old team see this coming?” In these specific examples, we may never know, but we do know that teams have internal information which creates an advantage in personnel decisions. While this advantage may never completely go away, there is evidence to suggest that it’s starting to disappear.

For any given player, all teams have access to public performance statistics, and any team is free to send a scout to watch a major- or minor-leaguer play. That data isn’t proprietary. What is proprietary are the years of internal data, scouting reports, and notes from coaches, doctors, and teammates.

In his press conference after having his pending steroid suspension lifted, Ryan Braun gave some insight into the type of internal data the Brewers keep on him:

“When we’re in Milwaukee we weigh in at least once or twice a week. I was able to prove that I literally didn’t gain a single pound. Our times are recorded every time we run down the line, first to third, first to home. I literally didn’t get one-tenth of a second faster. My workouts have been virtually the exact same for six years. I didn’t get one percent stronger. I didn’t work out any more often. I didn’t have any additional power or any additional arm strength. All of those things are documented contemporaneously, and if anything had changed, I wouldn’t be able to go back and pretend like it didn’t change.”

In one paragraph, Braun revealed that the Brewers keep internal numbers on his weight, three running statistics, workout performance and frequency, and some kind of data on bat speed and arm strength. When the Milwaukee front office met about Braun’s contract and had to decide whether to offer an extension or let him go to free agency, that data certainly helped inform the decision.

In some cases, Crawford for example, there is hardly a decision to be made — the team simply cannot afford the player anymore. In other cases, a team might have a cheaper prospect ready to take the player’s place. However, one has to wonder how many times a team has internal information which leads to trading the player or not offering a competitive contract in free agency.

This topic has been brilliantly covered by FanGraphs contributor Matt Swartz at The Hardball Times and in their 2012 Annual. Also, this post at The Book Blog discusses Matt’s findings. The gap definitely exists, and its existence should not come as a surprise. In the long run, whoever has more information will make better decisions, and teams have more information about their own players than the rest of the market. Not every player who changes teams is doomed. However, this result suggests that Albert Pujols, Prince Fielder, Jose Reyes, Mat Latos, Heath Bell, Carlos Beltran, Michael Pineda, Jesus Montero, and all the other players who changed teams this offseason are more likely to disappoint their new teams than impress them.

Taking Matt’s analysis a step further, this post argues that the proprietary information gap is closing. There is more public information available today, and teams are doing more with it, than there was ten or even five years ago. Consider PITCHf/x. Today, anyone can quickly find the speed and movement of every pitch in last night’s game. Ten years ago, if one team had this info and another team didn’t, it would have made for a huge competitive advantage.

Matt looked at the effect on multi-year contracts, but for trending purposes, let’s just look at things on a year-by-year basis. Players who were on the same team as last year are put into one pool, and players who changed teams during the offseason are put in the other. Low sample size players and those who changed teams mid-season are thrown out.
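
The pooling rules above can be sketched in Python. This is only an illustration under assumptions: the `PlayerSeason` fields, the `MIN_SAMPLE` cutoff of 200, and the unweighted average are all hypothetical, since the article does not spell out the exact thresholds or weighting used.

```python
from dataclasses import dataclass


@dataclass
class PlayerSeason:
    # Hypothetical record layout, not the article's actual data set.
    player: str
    year: int
    team: str                # team for this season
    prev_team: str           # team at the end of the previous season
    changed_midseason: bool  # True if the player switched teams during this season
    sample: int              # playing-time measure, e.g. plate appearances
    metric: float            # wOBA for hitters, SIERA for pitchers


MIN_SAMPLE = 200  # assumed cutoff; the article does not state one


def split_pools(seasons, min_sample=MIN_SAMPLE):
    """Split player-seasons into (stayers, leavers), dropping excluded rows."""
    stayers, leavers = [], []
    for s in seasons:
        # Throw out low-sample players and mid-season movers.
        if s.sample < min_sample or s.changed_midseason:
            continue
        if s.team == s.prev_team:
            stayers.append(s)
        else:
            leavers.append(s)
    return stayers, leavers


def pool_average(pool):
    """Unweighted mean of the metric (an assumption; weighting is not specified)."""
    if not pool:
        raise ValueError("pool is empty")
    return sum(s.metric for s in pool) / len(pool)
```

Running `split_pools` once per season and averaging each pool gives one stayer figure and one leaver figure per year, which is the form the year-by-year comparison needs.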

Overall, hitters who changed teams did 3.6% worse than those who did not, and pitchers did 3.7% worse. There is a problem with this data, though. The average age of the “leavers” is about a year older than the “stayers.” This is because teams are more likely to hang onto young players due to the arbitration rules. Team-controlled players are cheaper than open-market players, so teams are more likely to keep them around. To fix this issue, we need to restrict the data set to players who are not arbitration eligible.

Non-Arbitration Eligible Players

                    Stayers   Leavers
Hitters (wOBA)      0.338     0.325
Pitchers (SIERA)    4.04      4.20
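
To put the two rows on a common scale, the gap can be expressed as a percentage of the stayers' figure. The sketch below is a minimal illustration, assuming the stayers' average is the baseline (the article does not say which denominator was used); `relative_gap` is a hypothetical helper name. The table values are rounded, so the results land close to, but not exactly on, the percentages quoted in the text.

```python
def relative_gap(stayers, leavers, lower_is_better=False):
    """Percent by which leavers were worse than stayers, relative to stayers."""
    # For wOBA a lower number is worse; for SIERA a higher number is worse.
    diff = (leavers - stayers) if lower_is_better else (stayers - leavers)
    return 100.0 * diff / stayers


hitter_gap = relative_gap(0.338, 0.325)                        # wOBA: higher is better
pitcher_gap = relative_gap(4.04, 4.20, lower_is_better=True)   # SIERA: lower is better

print(f"Hitters:  {hitter_gap:.1f}% worse")
print(f"Pitchers: {pitcher_gap:.1f}% worse")
```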

The average age for both groups is now almost exactly the same, and the advantage still exists, and is actually slightly larger. Hitters who changed teams did 3.9% worse and pitchers did 4.2% worse. This is not a perfect analysis, and if I were just trying to measure the size of the gap, I would defer to Matt’s work. What this method allows us to do is trend the data. For pitchers, the trend is as follows:

In less than 10 years, the proprietary information gap has gone from about 7% to about 2%. While players who left their teams did better in 2004 and 2010 than players who stayed, that is unsustainable. There will always be a gap; it is simply impossible for opposing teams to have the same amount of information as a player’s current team. However, it appears as if the amount of information, or the usefulness of that information, is decreasing when it comes to pitchers. Now for the hitters:

At first glance, there does not appear to be a trend of any kind. Perhaps this is true. However, note that this graph starts in 1991. Because wOBA is not dependent on batted ball data, whereas SIERA is, it allows for a longer date range to study hitters. If you just look at recent data, a different story emerges:

Definitive? Of course not. Suggestive? I think so, especially considering it mirrors the trend found in the pitcher data. And, just like the pitcher data, there is a natural asymptote at zero — this trend will approach zero, but never completely disappear.
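
One simple way to put a number on a trend like this is an ordinary least-squares line through the yearly gaps. The sketch below is an assumption about method, not a description of how the charts in this post were drawn; `linear_trend` is a hypothetical helper, and its input is a dict mapping each season to that season's percent gap.

```python
def linear_trend(gaps_by_year):
    """Ordinary least-squares fit of gap (percent) against year.

    Returns (slope, intercept); a negative slope means the gap is shrinking.
    """
    years = sorted(gaps_by_year)
    n = len(years)
    if n < 2:
        raise ValueError("need at least two seasons to fit a trend")
    mean_x = sum(years) / n
    mean_y = sum(gaps_by_year[y] for y in years) / n
    # Sum of squares and cross-products around the means.
    sxx = sum((y - mean_x) ** 2 for y in years)
    sxy = sum((y - mean_x) * (gaps_by_year[y] - mean_y) for y in years)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

A straight line will eventually cross zero, so it is only a local description. If the gap really does level off above zero, a decaying curve would be the more natural shape to fit over a longer window.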

In a way, these trend lines represent the sophistication of the market. As the data revolution in baseball began, teams that were ahead of the curve could make advantageous trades and smarter free agent decisions than the teams that were behind. However, as more teams embraced sabermetrics, it got harder and harder to make one-sided trades, and the free agent market became more and more efficient. The proprietary information gap will never completely disappear, but it is probably half the size it was five years ago, and a quarter the size it was 10 years ago.

Jesse has been writing for FanGraphs since 2010. He is the director of Consumer Insights at GroupM Next, the innovation unit of GroupM, the world’s largest global media investment management operation. Follow him on Twitter @jesseberger.

32 Responses to “Is Proprietary Information Disappearing?”

  1. Mark says:

    Articles like this make me glad I visit this site. Very interesting.

  2. sc2gg says:

    This article is about step 1 in my 150 step plan to solve baseball. Step 10 is to sit down and have a nice discussion with Alex Anthopoulos.

  3. DC Nats says:

    Great article. Really interesting – it’s an intuitive hypothesis that’s worthy of more data to back it up. Thanks for starting to pull some together.

  4. BDF says:

    Really good. What does the increasing unimportance of proprietary information imply for competitive balance? Proprietary information about GMs … ?

  5. Klatz says:

    Intriguing, but the evidence is much weaker for the hitters than the pitchers. While you can just look at recent data, that ignores half the data. Why would there be a sudden change in the availability of proprietary data in 2001? Moneyball and the rise of sabermetrics? Perhaps.

    An alternative theory would be that teams are more likely to let players who have performed badly go to FA or trade them. Teams would more likely retain those who performed well (recently at least). Therefore those who leave are more likely to perform worse based on publicly available data. What if you thresholded the data to only look at players above league average?

    Why would the acquiring teams pick up these “bad” players? To fulfill a need that the other team did not have or to do something during the offseason. Relatively speaking the leaving free agents have more value to the acquiring team, value that is not based on performance.

  6. Seth Samuels says:


    I like the idea a lot, but it seems to me that there’s probably some selection bias here, no? This method seems to implicitly assume that home teams and competing teams have an equal opportunity to sign a player. But at least some players are going to have a preference for staying put, rather than uprooting their lives and moving somewhere else. If we assume that the previous-year teams often have something of a right of first refusal, then we’d naturally expect this result even if there’s no information imbalance. My guess would be that if you look at the year *before* the team-change, you’d find that players who change teams perform worse than players who don’t. Or, put differently, better players are more likely to stay with the same team, even if we’re only using their publicly available stats to evaluate them.

    One easy way to mitigate this bias would be to compare the year-after-change performance to the year-before-change performance (i.e. Yr1wOBA-Yr0wOBA). It’s not perfect, but it’s a lot quicker and easier than the alternatives. I think there’s a decent chance you’ll see the same trend you’re predicting, but I’d expect the margin to be a bit smaller.

  7. Antonio Bananas says:

    Pretty interesting.

    Maybe one reason is that teams are more willing to let productive vets leave? I don’t know the age statistics, but it seems like we’re in an era of youth. So maybe teams are letting good players leave more often because they are confident in their younger, cheaper players.

    Again, I don’t have any age statistics, but it’s a thought.

  8. mcbrown says:

    This is very interesting, but I wonder if there might be a couple of sources of bias in the “stayer/leaver” data. For example, a given “leaver” might be expected by both sides to experience significant decline, and yet be leaving a team that has no need for a low-cost below-average player to go to a team that fields a whole team of such players (Manny Ramirez going to Oakland comes to mind, though obviously he’s not in your sample). Such a player would drag down the “leaver” group performance and yet not imply any kind of information advantage for his prior team. Or a low-cost role player could be leaving a bench or platoon role on a good team for a starting role on a weak team, thereby increasing his PAs and dragging down the “leaver” wOBA even if the transaction reflected no information asymmetry.

    On the pitcher side, is the “leaver” data by any chance skewed towards pitchers who left the NL for the AL? Similar to batters there could also be bias in the form of pitchers coming off injury post-contract, etc., where there is an expected decline but no obvious information asymmetry.

    I think this is a really useful and interesting line of inquiry, but I believe the data set needs to be further refined. I’m not sure how one would do so, however. Maybe the “leaver” data could be limited to players who had either received a raise in free agency (the implication being that they probably weren’t expected to decline significantly, though even that is tough to say in all cases) or were traded in the offseason only for one or more major-league players (no minor leaguers or cash, because inclusion of such consideration could offset an apparent information asymmetry but not show up in the MLB-only “leaver” data).

    • This is a good question. I agree that there is some self-selection in these groups, and that’s why I would definitely defer to Matt Swartz’s research when measuring the gap outright. On the link to The Book Blog, MGL looks at the players who left and compares them to their Marcel projections, which I think will answer your question.

      For my purposes, I’m circumventing the self-selection because I’m looking at a time series. As long as the level of self-selection is constant over time, which I think it is, we can analyze the trend.

  9. How many of the players in the changing group are there as a direct result of things like DFAs? That would naturally suggest a weaker player and thus create a sampling error. But interesting nonetheless.

    • mike says:


      Also, I think you could confine your study to players who earned a similar or higher salary/contract value when leaving. This will remove the aging role players who leave and whose declines are obvious.

  10. night_manimal says:

    It looks like the compensation factor has also been left out entirely as a reason for players being allowed to leave. Teams like the Rays and Blue Jays seemed quite happy to let players walk in exchange for draft picks. In fact it seems many players were signed or traded for mainly for the additional compensation they would offer regardless of performance.

  11. RC says:

    “The average age for both groups is now almost exactly the same, and the advantage still exists, and is actually slightly larger”

    How much of this, though, is just selective park effect? If Joe Replacement is a right-handed pull hitter playing in Fenway Park, he may get a second contract. And his stats will decline in that second contract. If he comes up in the Mariners system, he gets bounced around a couple of years, and then is out of baseball because he never could hit with that big outfield.

    I.e., the guys who get to second teams have an inherent bias: they were in a favorable enough situation on the first deal to get to that point.

  12. gonfalon says:

    Interesting article. I’d bet the Pittsburgh Pirates have had some kind of internal data that has been used since Neal Huntington took over as GM. With the exception of Jose Bautista (who flew under everyone’s radar), and perhaps Jason Bay and Matt Capps (who each had one more good season left in them), the performance of many players that Huntington has traded or let go fell off a cliff the next season, due to declining skills or injury or both. His decisions not to retain veterans like Ronny Cedeno, Paul Maholm, and Ryan Doumit on their club options last year were also stated to reflect some kind of internal valuation, which proved to be accurate given the difference between the options and the amount each player actually signed for in 2012.

    This cuts both ways, though. Several of the players that Huntington has traded for or signed went on to grossly underproduce for the Pirates: Andy LaRoche, Aki Iwamura, Matt Diaz, and Lyle Overbay, to name a few.

  13. Curtiss says:

    I know this would take a ton more work, but the next logical step would be to break down the stayers/leavers into groups based on contract size and evaluate those groups. It seems like that would more definitively isolate trends, as this trend is probably not universal across all salary groups.

    You said in your article that the best players make the most money, and that precludes some teams from signing them. They are superstars and everybody knows that they are superstars. The real question is in what salary range does the proprietary advantage really take effect in the form of significantly better decisions?

    I don’t think I am doing a good job of explaining myself, but I do think if anyone understands the muddle that I just said that it would be truly interesting to examine.

  14. rea says:

    You’ve got to wonder about proprietary information on the day when the hot young pitcher the Yanks got in return for their best prospect turns out to have a torn labrum.

  15. Wade8813 says:

    You can’t make a blanket comparison with this. The Cardinals were willing to give up an awful lot to keep Pujols, so it doesn’t seem likely that they had some hidden knowledge about him. Same with any other player that received huge offers by their former team.

    That said, if you removed those players from your data pool, you would quite possibly find the data is even more significant.

    • J. Inman says:

      It’s true that the Cardinals were willing to give up an awful lot. But, they reduced their offer after the 2011 season and you have to factor in the value that Pujols brought to the organization outside of his production on the field. He was an icon in St. Louis.
