Scouting the Minors Pitch by Pitch: Projecting Infield Defense

Nolan Arenado was a pretty great fielder before he graduated to the majors. (via Joey S)

With the advent of Statcast, baseball continues to push the envelope with respect to publicly available statistical data. However, despite the relentless push to provide richer data at the major league level, we still have limited access to data below the big leagues. To this end, we began to explore what data are available and discovered a few interesting things, most notably that the play-by-play data aggregated at the minor league level are surprisingly robust, so much so that they lend some hints as to future major league productivity.

None of the prior pieces is required reading for today’s article; however, if you are curious about the technical details of the model behind what is presented here, please read Part 3: Modelling Infield Defense to get a sense of its strengths and weaknesses. It is not quite UZR, but it at least points in the same direction. One thing that has always intrigued me, and that to my knowledge is not yet publicly available, is an objective, play-by-play-based measure of defense in the minor leagues. Today, we will explore this and then predict who will be good major league defenders.

3B Defense | R-Square 0.15

We begin by diving into the hot corner, where we see a lukewarm relationship of roughly 0.15 from minors to majors. Here’s a scatter plot showing the relationship:


I always find it encouraging when a model can accurately capture the ends of the spectrum, the top and bottom performers. Noise will likely obscure the vast majority of the population that sits within two standard deviations of the mean, whereas the tails of the distribution should carry more signal. Kevin Frandsen (-11 UZR/150 at 3B) and Pedro Ciriaco (-9.1 UZR/150) were horrible defensively at third base in the minors before they ever reached the majors. However, guys like Matt Dominguez and Brent Morel (according to this model) were poor minor league defenders but are measured positively here. Interestingly, UZR has a dimmer view of Dominguez’s major league performance.

On the positive side, the tongue-in-cheek talk of Miguel Sano as a defensive superstar would not seem that outlandish had we known that he was a pretty good minor league third baseman. (He has a career +3.9 UZR/150.) Guys like Nolan Arenado and Matt Duffy were also good minor league defenders before they got to the majors, as was Jake Lamb, whose major league success is mixed per UZR.

There is obviously a lot of noise mixed in with the signal. However, it can be interesting to look at guys like Sano, Kris Bryant and Maikel Franco, who were mostly brought up for their bats but showed signs, statistically, that they could hold their own in the majors. Franco, by no means a great third baseman, has improved dramatically and was close to average in 2016.
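For readers who want to reproduce a figure like the R-Square quoted in the section headings, here is a minimal sketch. It assumes R-Square means the squared Pearson correlation between a player's minor league rating and his major league rating, which is what a simple one-variable linear fit would report; the article does not spell out the exact calculation. The function name and the sample numbers are made up for illustration and are not real player data.

```python
from statistics import mean


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mx, my = mean(xs), mean(ys)
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("R-square is undefined when either series is constant")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical ratings, one pair per player (minors, majors). Not real data.
minor_rating = [-0.08, -0.03, 0.00, 0.02, 0.05, 0.09]
major_rating = [-0.05, 0.01, -0.02, 0.03, 0.00, 0.06]
print(round(r_squared(minor_rating, major_rating), 2))
```

A value near 0.15 or 0.21 means the minor league rating explains only a modest share of the variation in major league ratings, which is why the discussion leans on the extremes rather than on point predictions.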

SS Defense | R-Square 0.21


We have a lot more data when it comes to shortstops than we do for any other infield position, and we see some very encouraging results. Addison Russell was the best minor league shortstop in this sample and is very near the top in the major leagues, as well. Joining him are other defensive stalwarts such as Andrelton Simmons and Carlos Correa.

This model appears to be far more optimistic about Jordy Mercer than UZR, and I will defer to UZR as the more robust model. However, it is at least internally consistent in its view of Mercer. Ketel Marte’s dismal defensive performance in 2016 is very consistent with his poor performance in the minors; I wouldn’t be optimistic about his prospects of sticking at shortstop for the long term, especially considering he lacks the offensive firepower to offset his sub-par defense.

Jorge Polanco falls into the same bucket as Marte, with abysmal minor league performance followed up by a pretty terrifying -32.3 UZR/150 in the majors. I wouldn’t hold out hope that Polanco figures it out in the majors. In other words, despite the small sample size of Polanco’s major league defense, if we include his minor league data, it paints a fairly stark picture.

One outlier is Luis Sardinas, who has been pretty awful in the majors. He wasn’t particularly good in the minors, but not nearly as bad as his major league performance would indicate. Eduardo Nunez, another outlier, has turned the corner at shortstop over the past two seasons (performing a bit above average), and, if we are to believe his minor league performance, the improvement might actually stick.

Other than Everth Cabrera, no players who performed poorly in the minors were anything better than average major league shortstops. Similarly, all the top performers save Nunez were at least average major league shortstops.

The takeaway here is that while we’d have a hard time projecting precisely how good or bad a player will be, we can get a fairly good understanding about what the range of outcomes is if they lie on either tail of the curve.
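One way to make "range of outcomes on either tail" concrete is to bucket players by how far their minor league rating sits from the mean and then look at the spread of major league results in each bucket. The sketch below is only an illustration of that idea, not the method used for the plots in this article. The function name, the `z_cutoff` parameter and the sample pairs are all assumptions; the default cutoff of 2.0 simply mirrors the two-standard-deviation band mentioned earlier.

```python
from statistics import mean, pstdev


def outcome_range_by_tail(pairs, z_cutoff=2.0):
    """Group (minor_rating, major_rating) pairs by the z-score of the minor
    league rating and return the (min, max) major league rating per bucket.

    Buckets with no players map to None.
    """
    minors = [minor for minor, _ in pairs]
    mu, sigma = mean(minors), pstdev(minors)
    if sigma == 0:
        raise ValueError("minor league ratings have no spread")
    buckets = {"bottom": [], "middle": [], "top": []}
    for minor, major in pairs:
        z = (minor - mu) / sigma
        if z <= -z_cutoff:
            buckets["bottom"].append(major)
        elif z >= z_cutoff:
            buckets["top"].append(major)
        else:
            buckets["middle"].append(major)
    return {
        name: (min(values), max(values)) if values else None
        for name, values in buckets.items()
    }


# Hypothetical (minors, majors) pairs. Not real player data.
sample = [(-2.0, -10.0), (-0.1, 0.0), (0.0, 1.0), (0.1, -1.0), (2.0, 8.0)]
# A looser cutoff is used here because the toy sample is tiny.
print(outcome_range_by_tail(sample, z_cutoff=1.0))
```

With a real sample, the interesting comparison is whether the bottom bucket's best outcome stays below the top bucket's worst outcome, which is roughly the pattern described for shortstops above.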

2B Defense | R-Square 0.01 2B to 2B and 0.23 for SS to 2B

When I first tested second base minors-to-majors performance, I was a bit flummoxed when I got a measly correlation of 0.01 and a scatter plot that looked like this:


… which doesn’t really have any coherence to it, except for the fact that it really likes Carlos Sanchez. (UZR is a fan, as well.) There are two possibilities here. First, it is quite possible there are some measurement issues with respect to second basemen (please refer to Part 3 for more details). Alternatively, it is quite possible that being a good defensive second baseman in the minors is not predictive of success in the majors, though I’m not even close to ready to conclude this.


However, when we switch the analysis to looking at minor league shortstops who ended up playing some second base, we suddenly get a much stronger result, both visually and statistically. The same trends we saw earlier hold for guys on the extreme ends, such as Grant Green being an awful minor league shortstop (translating into a -11.9 UZR/150 as a second baseman in the majors) and Russell just being an awesome defender wherever he plays.

Josh Rutledge has improved recently, but he never had the minor league pedigree that would suggest major league dominance at second base, though he may be better than his initial performance. Tyler Greene is another massive outlier, though his major league shortstop record is better than his second base record, suggesting there may be a lot of small-sample-size noise playing here.

Preliminary Conclusion Before Projections

The common theme that runs through much of this analysis is a focus on the extremes, meaning if someone was really good in the minors, he’ll probably fall into the very wide range of average-to-very good in the majors. Conversely, if a player was just awful in the minors, he probably will fall into the similarly wide range of very awful to passable. What we’re unlikely to find are the situations in which a player goes from exceptionally bad to exceptionally good.

With that in mind, let’s take a look at some prospects who haven’t played much in the majors and have had at least 200 fielding opportunities in the minors. You may see some names pop into these lists where a player has played a different position in the majors than he played in the minors.
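The selection rule just described can be sketched as a simple filter. The 200-opportunity minimum comes from the text; the cap on major league opportunities is not stated anywhere in this article, so `max_major_opps=100` below is purely an assumed placeholder. The `Prospect` fields, the function name and the sample records are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Prospect:
    name: str
    minor_opportunities: int   # fielding opportunities in the minors
    major_opportunities: int   # fielding opportunities in the majors
    minor_rating: float        # model rating from minor league play


def eligible_prospects(prospects, min_minor_opps=200, max_major_opps=100):
    """Keep players with enough minor league data and little MLB time.

    max_major_opps is an assumed threshold, not one given in the article.
    Results are sorted from worst to best minor league rating.
    """
    keep = [
        p for p in prospects
        if p.minor_opportunities >= min_minor_opps
        and p.major_opportunities <= max_major_opps
    ]
    return sorted(keep, key=lambda p: p.minor_rating)


# Hypothetical records. Not real player data.
pool = [
    Prospect("Player A", 250, 0, 0.05),
    Prospect("Player B", 150, 0, 0.10),     # too few minor league chances
    Prospect("Player C", 400, 900, -0.02),  # already an MLB regular
    Prospect("Player D", 200, 40, -0.07),
]
for p in eligible_prospects(pool):
    print(p.name, p.minor_rating)
```

Sorting by rating makes it easy to read off the two ends of the list, which is how the position-by-position lists below are presented.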

Projecting SS MLB defensive ability


Consistent with the theme outlined above, I’ve narrowed the field to just the really bad defenders. The model here would concur with the thought that Alen Hanson is not likely to play shortstop in the majors, but it also would suggest he’s not going to be a very good defender at second base, either. Taylor Motter, who has decent offensive potential, does not project very well at the major league level. All in all, there are not a lot of interesting names here.


Now we have some intriguing names. Tzu-Wei Lin would be an otherwise nondescript prospect; however, these data suggest he may make the major leagues based on his defense alone. Although quite old for a prospect, JT Riddle, despite his less than optimistic scouting report, may just be better than scouts think. Wilmer Difo is intriguing, as his strong showing in the minors might indicate he can stick at shortstop and would naturally be a plus at second as well. Although not on this list, Amed Rosario is at best average and does not project to be a particularly good major league shortstop.

Projecting 3B MLB defensive ability

There were no notable players on the low end of the defensive spectrum, so we’ll skip right to the best defensive third basemen in the minors.


We see a couple of interesting names here, starting with Matt Chapman. Chapman is a classic three-true-outcomes type of hitter, so it is quite encouraging that he rates as a good defensive third baseman, as it should give him more wiggle room to figure things out at the major league level. From a fantasy perspective, this would suggest he has a higher probability of sticking at third rather than shifting to first base as his hitting profile would suggest. Eric Wood is definitely not a sexy prospect, but he may have a shot at being close to an average major leaguer. All in all, a pretty boring list.

Projecting 2B MLB defensive ability


The first name that jumps out is the very underrated Tony Kemp. Although a horrible outfielder, he is being blocked at his natural position of second base by the great Jose Altuve. The data suggest (albeit loosely, given the limits of these data with respect to second basemen) that were he to move to a different team with an opening at second, Kemp may be quite a useful asset. One guy to keep an eye on is Max Moroff, who made the 2017 All-KATOH second team and who may just have the defensive chops to succeed in the majors.

Closing Thoughts

The data, as well as the methodology, could do with some improvement. However, despite the shortcomings, we still can get a little more information than we had before. It appears the best data available to us are with respect to shortstops, where we can have more confidence in a guy like Lin becoming a major leaguer. As time passes and we have more minor league play-by-play data to use, along with a more refined methodology, we can dig deeper into minor league defense.

This research was largely influenced and inspired by Chris Mitchell’s KATOH and certain concepts in UZR, which served as a benchmark/litmus test for the model used herein.

Eli Ben-Porat is a Senior Manager of Reporting & Analytics for Rogers Communications. The views and opinions expressed herein are his own. He builds data visualizations in Tableau, and preps data in Alteryx. Follow him on Twitter @EliBenPorat.

In the Projecting SS MLB Defensive Ability plots, the first name plotted is Alen Hanson, with a value of -0.08. What exactly is the value? Is that his projected UZR/150?


I mean, it’s likely his +/-, not UZR, but is it scaled to 150 games?


This is some interesting stuff, but I am struggling with an underlying assumption for possibly replicating or building on your work: where is your source data from?

I couldn’t find any reference to it in your previous model article, beyond a citation to MLBAM. Are they the same Gameday files as in the Power article? I was under the impression that those didn’t provide, for example, spray angle.