## Using Double-A Stats to Predict Future Performance

Over the last couple of weeks, I’ve been looking into how a player’s stats, age, and prospect status can be used to predict whether he’ll ever play in the majors. I used a methodology that I named KATOH (after Yankees prospect Gosuke Katoh), which consists of running a probit regression analysis. In a nutshell, a probit regression tells us how a variety of inputs can predict the probability of an event that has two possible outcomes, such as whether or not a player will make it to the majors. While KATOH technically predicts the likelihood that a player will reach the majors, I’d argue it can also serve as a decent proxy for major league success. If something makes a player more likely to make the majors, there’s a good chance it also makes him more likely to succeed there.
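
For readers who want to see the mechanics, here is a minimal sketch of how a probit model turns a set of inputs into a probability: it adds up the weighted inputs and passes the sum through the standard normal CDF. The coefficient values and feature names below are invented for illustration and are not KATOH's fitted values.

```python
from statistics import NormalDist

# Hypothetical coefficients, for illustration only (not KATOH's fitted values).
COEFS = {
    "intercept": 0.5,
    "age": -0.2,     # years relative to some reference age
    "k_rate": -3.0,  # strikeout rate as a fraction of plate appearances
    "iso": 4.0,      # isolated power
}

def probit_probability(features, coefs=COEFS):
    """Return Phi(intercept + sum of coefficient * feature value)."""
    z = coefs["intercept"] + sum(
        coefs[name] * value for name, value in features.items()
    )
    # The standard normal CDF maps any real number to a value between 0 and 1.
    return NormalDist().cdf(z)

example = {"age": 1.0, "k_rate": 0.20, "iso": 0.150}
print(round(probit_probability(example), 3))
```

Because the normal CDF is S-shaped, the same change in an input moves the probability a lot when a player sits near 50% and very little when he is already close to 0% or 100%.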

Things that were predictive for players in low-A and high-A included age, strikeout rate, ISO, BABIP, and whether or not the player was deemed a top 100 prospect by Baseball America in the pre-season. However, a player’s walk rate was not significant in predicting his ascension to the majors. Today, I’ll look into what KATOH has to say about players in double-A leagues. For those interested, here’s the R output based on all players with at least 400 plate appearances in a season in double-A from 1995 to 2010. Due to varying offensive environments in different years and leagues, each player’s stats were adjusted to reflect his league’s average for that year.
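
The article does not spell out how the league adjustment was done. One common approach, assumed here purely for illustration, is to express each stat as a ratio to the league average for that year. The league averages in this sketch are made-up numbers.

```python
def league_adjust(stat, league_avg):
    """Express a raw stat relative to its league-year average (1.0 = average)."""
    if league_avg <= 0:
        raise ValueError("league average must be positive")
    return stat / league_avg

# Hypothetical league-average ISO values keyed by (league, year); made up.
LEAGUE_AVG_ISO = {
    ("Eastern", 2005): 0.130,
    ("Texas", 2005): 0.150,
}

# A .180 ISO in a league that averaged .150 is 20% better than average.
adjusted = league_adjust(0.180, LEAGUE_AVG_ISO[("Texas", 2005)])
print(round(adjusted, 2))
```

Under this assumption, the same raw ISO counts for more in a pitcher-friendly league than in a hitter-friendly one.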

Unlike in the A-ball iterations of KATOH, a player’s double-A walk rate is predictive — albeit only slightly — of whether or not he’ll make it to the show. While walk rate is statistically significant, it still matters much less than the other stats: it takes 3 or 4 percentage points on a player’s walk rate to match what 1 percentage point of strikeout rate does to a player’s MLB probability.
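
The walk-versus-strikeout comparison comes down to the ratio of the two coefficients. The sketch below uses hypothetical coefficients, chosen only so that the ratio lands in the "3 or 4" range described above; they are not the values from the regression output.

```python
B_K = -7.0   # hypothetical strikeout-rate coefficient (per percentage point)
B_BB = 2.0   # hypothetical walk-rate coefficient (per percentage point)

def equivalent_walk_points(k_points, b_k=B_K, b_bb=B_BB):
    """Walk-rate points needed to offset a rise of k_points in strikeout rate."""
    return abs(b_k / b_bb) * k_points

print(equivalent_walk_points(1.0))
```

Two changes that shift the weighted sum inside the probit by the same amount have the same effect on the predicted probability, which is why the coefficient ratio is the right yardstick.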

This version is also different in that there are a couple of significant higher-order terms, signified by the last two coefficients in the above output. The “I(Age^2)” term, a squared age term, adds a little bit of nuance to how a player’s age can predict his future success. The “ISO:BA.Top.100.Prospect” term is an interaction, and it basically says that if you’re a top 100 prospect, hitting for power is slightly less important than it would be otherwise. Hitting for power and making Baseball America’s top 100 list both make a player much more likely to make it to the majors, but if he does both, he’s a tad less likely to make it than his power output and prospect status would suggest independently. Put another way, a few top 100 prospects hit for power in double-A but never cracked the majors, such as Jason Stokes (.241 ISO), Nick Weglarz (.204 ISO), and Eric Duncan (.173 ISO). But virtually all of the low-power top 100 guys made it, including Elvis Andrus (.073 ISO), Luis Castillo (.076 ISO), and Carl Crawford (.078 ISO). For non-top 100 guys, many more punchless hitters topped out in double-A and triple-A.
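
Here is a rough sketch of how those two extra terms enter the model. Every coefficient is hypothetical and picked only to show the mechanics, with a negative interaction coefficient standing in for the "slightly less important" effect described above.

```python
from statistics import NormalDist

# Hypothetical coefficients (not KATOH's fitted values).
B = {
    "intercept": 20.0,
    "age": -1.6,
    "age_sq": 0.03,        # plays the role of the I(Age^2) term
    "iso": 5.0,
    "top100": 2.0,         # 1 if on the pre-season top 100 list, else 0
    "iso_x_top100": -3.0,  # plays the role of ISO:BA.Top.100.Prospect
}

def linear_predictor(age, iso, top100, b=B):
    """Weighted sum that gets passed through the normal CDF."""
    return (
        b["intercept"]
        + b["age"] * age
        + b["age_sq"] * age ** 2
        + b["iso"] * iso
        + b["top100"] * top100
        + b["iso_x_top100"] * iso * top100
    )

def iso_slope(top100, b=B):
    """Change in the linear predictor per unit of ISO."""
    return b["iso"] + b["iso_x_top100"] * top100

def mlb_probability(age, iso, top100, b=B):
    return NormalDist().cdf(linear_predictor(age, iso, top100, b))

print(iso_slope(0), iso_slope(1))
```

With a negative interaction coefficient, a point of ISO moves the needle less for a top 100 prospect than for everyone else, even though being a top 100 prospect still raises the overall probability.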

By clicking here, you can see what KATOH spits out for all current prospects who logged at least 250 PAs in double-A as of July 7th, as well as a few who fell short of the cutoff, most notably Joey Gallo, Kevin Plawecki, and Robert Refsnyder. Topping the list is Mookie Betts with a probability of 99.95%, and of course the prophecy was fulfilled when the Red Sox called up the 21-year-old last month. Here’s an excerpt of the top players from double-A this year:

| Player | Organization | Age | MLB Probability |
|---|---|---|---|
| Mookie Betts | BOS | 21 | 100% |
| Francisco Lindor | CLE | 20 | 100% |
| Gary Sanchez | NYY | 21 | 99% |
| Austin Hedges | SDP | 21 | 99% |
| Alen Hanson | PIT | 21 | 99% |
| Jorge Bonifacio | KCR | 21 | 98% |
| Blake Swihart | BOS | 22 | 98% |
| Kris Bryant | CHC | 22 | 93% |
| Ketel Marte | SEA | 20 | 91% |
| Rangel Ravelo | CHW | 22 | 90% |
| Robert Refsnyder | NYY | 23 | 86% |
| Jake Lamb | ARI | 23 | 85% |
| Jake Hager | TBR | 21 | 84% |
| Joey Gallo | TEX | 20 | 82% |
| Preston Tucker | HOU | 23 | 81% |
| Kevin Plawecki | NYM | 23 | 79% |
| Cheslor Cuthbert | KCR | 21 | 78% |
| Kyle Kubitza | ATL | 23 | 77% |
| Michael Taylor | WSN | 23 | 76% |
| Christian Walker | BAL | 23 | 76% |
| Ryan Brett | TBR | 22 | 75% |

Keep an eye out for the next installment, which will dive into what KATOH says about hitters at the triple-A level.

Statistics courtesy of FanGraphs, Baseball-Reference, and The Baseball Cube; Pre-season prospect lists courtesy of Baseball America.

Chris works in economic development by day, but spends most of his nights thinking about baseball. He writes for Pinstripe Pundits, FanGraphs and The Hardball Times. He's also on the Twitter machine: @_chris_mitchell. None of the views expressed in his articles reflect those of his daytime employer.

Member
Aaron (UK)
1 year 10 months ago

I very much like the approach in general but to live up to the title of the post I think you ought to consider stripping out the BA.Top.100.Prospect variables entirely.

Guest
Spencer00
1 year 10 months ago

Seconded

Member
1 year 10 months ago

This is incredibly intriguing research. I’m very interested in this article and the upcoming one for AAA. This has huge advantages for fantasy purposes as well. Keep it up!

Member
evo34
1 year 10 months ago

I think once you get to this level, the test of “making it” or not becomes a bit less meaningful, in that most players will at least make it. Is it possible to look at those who make it and do well vs. those who do not?

Guest
Fish
1 year 10 months ago

It’d be interesting to see what this method says when applied to last year’s AA crop, or the year before.

Member
Josh
1 year 10 months ago

I like this series, good work. It will take some more time, but what about accounting for where the prospect was ranked and giving some credit to the prospects who just missed?