Over the last couple of weeks, I’ve been looking into how a player’s stats, age, and prospect status can be used to predict whether he’ll ever play in the majors. I used a methodology that I named KATOH (after Yankees prospect Gosuke Katoh), which consists of running a probit regression analysis. In a nutshell, a probit regression tells us how a variety of inputs can predict the probability of an event that has two possible outcomes, such as whether or not a player will make it to the majors. While KATOH technically predicts the likelihood that a player will reach the majors, I’d argue it can also serve as a decent proxy for major league success. If something makes a player more likely to make the majors, there’s a good chance it also makes him more likely to succeed there. This hypothesis may be less true for players at the Triple-A level, since such a high proportion of these players make it to the majors, but I still think it provides some insight. To address this issue, I plan to engineer an alternative methodology that takes into account how a player performs in the majors, rather than just whether he gets there.
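For readers who haven’t run into a probit before, here is a minimal Python sketch of the idea: the model forms a weighted sum of the inputs and passes it through the standard normal CDF to turn it into a probability. The function names, the coefficients, and the two example players are all made up for illustration; none of them are KATOH’s fitted values.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_probability(intercept, coefs, inputs):
    """P(event) = Phi(intercept + sum of coefficient * input)."""
    z = intercept + sum(coefs[name] * inputs[name] for name in coefs)
    return normal_cdf(z)


# Hypothetical coefficients for illustration only, NOT KATOH's fitted values.
intercept = 6.5
coefs = {"age": -0.30, "k_rate": -2.0, "iso": 3.0, "top100": 0.8}

# Two invented hitters with identical stats but different age and prospect status.
young = {"age": 21, "k_rate": 0.18, "iso": 0.180, "top100": 1}
old = {"age": 26, "k_rate": 0.18, "iso": 0.180, "top100": 0}

p_young = probit_probability(intercept, coefs, young)
p_old = probit_probability(intercept, coefs, old)

print(f"younger top-100 prospect: {p_young:.3f}")
print(f"older non-prospect:       {p_old:.3f}")
```

Whatever the inputs, the output always lands between 0 and 1, which is what makes this kind of model a natural fit for a yes-or-no outcome like reaching the majors.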
For hitters in Low-A and High-A, age, strikeout rate, ISO, BABIP, and whether or not a player was deemed a top 100 prospect by Baseball America all played a role in forecasting future success. Walk rate, while not predictive for players in A-ball, added a little bit to the model for Double-A hitters. Today, I’ll look into what KATOH has to say about players in Triple-A leagues. Due to varying offensive environments in different years and leagues, each player’s stats were adjusted relative to his league’s average for that year. I also only considered what happened during or after the sample season. So if a former big leaguer spends the full season in Triple-A, he’s only considered to have “made it to the majors” if he resurfaces there later. For those interested, here’s the R output based on all players with at least 400 plate appearances in a season in Triple-A from 1995-2011.
This output looks pretty similar to what we saw for Double-A hitters, including the “I(Age^2)” coefficient, which adds a bit of nuance to how a player’s age can predict his future success. But in this version, there’s also an interaction between ISO and age. Basically, this says that the ability to hit for power is much more important for older players than for younger players at the Triple-A level.
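To make the squared-age term and the ISO-by-age interaction a little more concrete, here is a small Python sketch. With an interaction in the model, the effect of ISO on the probit index is no longer one number: it works out to the ISO coefficient plus the interaction coefficient times age. Every coefficient below is invented for illustration and is not taken from the fitted Triple-A model; only the shape of the formula follows the terms described above.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical coefficients for illustration only, NOT the fitted model.
B0 = 19.5         # intercept
B_AGE = -1.2      # linear age term
B_AGE2 = 0.015    # squared age term, the "I(Age^2)" piece
B_ISO = -20.0     # ISO main effect
B_ISO_AGE = 1.0   # ISO-by-age interaction


def index(age, iso):
    """Linear index that gets passed through the normal CDF."""
    return (B0 + B_AGE * age + B_AGE2 * age ** 2
            + B_ISO * iso + B_ISO_AGE * iso * age)


def iso_slope(age):
    """Change in the index per unit of ISO at a given age."""
    return B_ISO + B_ISO_AGE * age


def mlb_probability(age, iso):
    return normal_cdf(index(age, iso))


# Probability gained by moving from a .150 ISO to a .250 ISO at each age.
gain_young = mlb_probability(22, 0.250) - mlb_probability(22, 0.150)
gain_old = mlb_probability(28, 0.250) - mlb_probability(28, 0.150)

print(f"ISO slope at 22: {iso_slope(22):.1f}, at 28: {iso_slope(28):.1f}")
print(f"gain from extra power at 22: {gain_young:.3f}, at 28: {gain_old:.3f}")
```

With these made-up numbers, the older hitter starts from a lower probability but gains more from the same bump in power, which is the pattern a positive ISO-by-age interaction describes.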
By clicking here, you can see what KATOH spits out for all players who logged at least 250 PA in Triple-A as of July 7th. I also included a few interesting players who missed the 250 PA cutoff, including Mookie Betts, Rob Refsnyder, Ramon Flores, and Kris Bryant. Here’s an excerpt of the top players from Triple-A this year. Joc Pederson tops the charts with an impressive 99.91% probability. Many of these players have already played in the majors, so these values can be interpreted as the odds that said player will play in the majors in the future.
Now that I’ve gone through all levels of full-season ball, I’ll start at the bottom and cycle through the short-season leagues. These samples will be pretty small, but perhaps not completely useless now that those players have a few weeks’ worth of games under their belts. At the very least, it will be interesting to see what KATOH’s able to tell us about batters so far away from the big leagues, even if it’s a little premature to ask KATOH about 2014’s players.