The 20-80 Scouting Scale, Translated to Wins

Recently, for the electronic pages of Baseball America, Therron Brockish provided a scouting report from the Arizona Fall League on Boston third-base prospect Garin Cecchini. As part of that report, Brockish graded each of Cecchini’s tools on the 20-80 scale (or 2-8 scale, depending) commonly utilized by scouts, where 50 represents major-league average and every 10 points is equal roughly to a standard deviation from same.

The tools are useful as a concise but meaningful instrument by which to summarize a player’s strengths and weaknesses. Because most players in the baseball universe (including minor leaguers and amateurs, as well) are below major-league average, even a 50 grade is a pretty big deal. Cecchini, for his part, received a future grade of 70 on his hit tool from Brockish and about a 50, give or take five points, on the other four traditional tools (power, speed, fielding, arm).

Owing mostly to how Therron Brockish has 20 years of experience as a scout and how I, Carson Cistulli, have 20 years experience mostly in just feeling ashamed of myself, I have no interest, really, in commenting upon Brockish’s assessment of Cecchini. An exercise in futility, is what that would be. I’m well-enough acquainted with futility already that I don’t need to go seeking it out on purpose.

Nor are the particulars of his report really my concern for the moment. What is my concern is attempting to answer a question that has probably occurred to me somewhat vaguely in the past, but, for reasons that are mysterious, presented itself more clearly because of Brockish’s report. The question is this: “If one takes for granted that Cecchini is likely to be a 70 hitter and a basically average everything-else, what does that mean in terms of his likely value so far as WAR is concerned?” Or, phrased more broadly: “Is it possible to ‘translate,’ as it were, all the grades for the five tools into a WAR value?”

What follows represents a mostly responsible attempt, I think, at answering those questions. Answers have been facilitated greatly by a rough WAR calculator of the author’s own invention, but constructed also with no little help from Bradley Woodrum’s De-Lucker. Note, finally, that I’ve used 550 plate appearances as the baseline for an offensive player, for this reason: a batter with league-average rates in everything and 550 plate appearances produces a 2.0 WAR — i.e. the sort of WAR generally associated with a league average player.

This very crowded table (where HRC% represents home-run rate on contact and everything else is pretty self-evident) provides an illustration of the average player such as he appears in the year 2013:

PA BB% K% HRC% BABIP BsR Fld+Pos HR H AVG OBP SLG ISO wOBA Off Def Rep WAR
550 8.0% 19.5% 3.5% .300 0.0 0.0 14 129 .257 .321 .403 .146 .319 0 0 18 2.0
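
The arithmetic behind the last four columns can be sketched in a few lines of Python. This is a rough reconstruction and not the author's actual calculator: the function name is hypothetical, and the two constants (about 18.5 replacement-level runs per 550 PA and about 9.25 runs per win) are assumptions, chosen because they reproduce the tables in this post to within roughly a tenth of a win.

```python
# Rough sketch of how run values roll up into WAR.
# Both constants are assumptions inferred from the tables in this post,
# not published figures; the Rep column shows 18, which looks rounded.
REPLACEMENT_RUNS_PER_550_PA = 18.5  # assumed
RUNS_PER_WIN = 9.25                 # assumed


def war_from_runs(off_runs, def_runs, pa=550):
    """Convert offensive and defensive runs above average into WAR."""
    replacement = REPLACEMENT_RUNS_PER_550_PA * pa / 550
    return (off_runs + def_runs + replacement) / RUNS_PER_WIN


if __name__ == "__main__":
    # League-average player: zero runs above average on both sides of the ball.
    print(round(war_from_runs(0, 0), 1))  # 2.0
```

Under these assumptions, every 9 or so runs added on offense or defense is worth about one additional win, which is the conversion used implicitly in every table that follows.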

And with that established, below is an attempt at answering the above-mentioned questions.

Hit Tool
Both the Major League Scouting Bureau (by way of Andrew Ball at Fake Teams) and real-live pro scouting coordinator Kevin Goldstein describe the hit tool as a player’s ability to hit for average. Both a multivariate regression (which produces an r-squared of .86) and also just common sense reveal that a combination of strikeout rate and BABIP accounts for the greatest part of a player’s batting average. Home-run rate is another factor, but less significant and also accounted for by the next tool, power.

The league-average batting average in 2013 was .257; a standard deviation, about 25 points. Using those figures as benchmarks, below is an estimate of the value for each grade along the scouting scale, assuming a league-average walk rate (ca. 8%), home-run rate on contact (ca. 3.5%), and also league-average baserunning and defense. Note that I’ve also attempted to distribute changes in strikeout rate (19.5% average, 6.0% standard deviation) and BABIP (.300 average, .025 standard deviation) evenly.

Grade PA K% BABIP AVG OBP SLG ISO wOBA Off WAR
20 550 31.5% .250 .182 .253 .313 .131 .246 -31 -1.4
30 550 27.5% .268 .207 .276 .343 .136 .271 -20 -0.2
40 550 23.5% .285 .232 .298 .373 .141 .296 -10 0.9
50 550 19.5% .300 .257 .321 .403 .146 .319 0 2.0
60 550 16.0% .316 .282 .344 .432 .150 .341 10 3.0
70 550 12.5% .330 .307 .367 .461 .154 .362 19 4.0
80 550 9.5% .345 .332 .389 .490 .158 .381 27 4.9
Avg 1.0

Value of Every 10 Scouting Points (From 50): About 1.0 WAR.
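
For readers who want to see how strikeout rate and BABIP combine into the AVG column, here is a simplified sketch. It is not the calculator used for the table: it assumes no hit-by-pitches or sacrifices (so at-bats are just plate appearances minus walks), and the function and variable names are hypothetical. Even so, it lands within two or three points of each AVG figure above.

```python
# Simplified batting-average model: AVG from K%, BABIP and HR rate on contact.
# Assumptions (mine, not the author's): no hit-by-pitches or sacrifices,
# so at-bats are plate appearances minus walks, and every ball in play
# counts toward BABIP.

def batting_average(k_pct, babip, hrc_pct=0.035, bb_pct=0.08, pa=550):
    at_bats = pa * (1 - bb_pct)
    contact = at_bats - pa * k_pct       # at-bats not ending in a strikeout
    home_runs = contact * hrc_pct        # home runs as a share of contact
    balls_in_play = contact - home_runs
    hits = home_runs + balls_in_play * babip
    return hits / at_bats


# (K%, BABIP) pairs copied from the hit-tool table, keyed by grade.
HIT_GRADES = {
    20: (0.315, 0.250),
    30: (0.275, 0.268),
    40: (0.235, 0.285),
    50: (0.195, 0.300),
    60: (0.160, 0.316),
    70: (0.125, 0.330),
    80: (0.095, 0.345),
}

if __name__ == "__main__":
    for grade, (k_pct, babip) in HIT_GRADES.items():
        print(grade, round(batting_average(k_pct, babip), 3))
```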

Power Tool
The sources named above (i.e. the Scouting Bureau and Goldstein) both regard the power tool as roughly descriptive of a player’s ability to hit home runs at the major-league level.

As Mark Smith found in February, the league-average home-run rate is approximately 2.5% of plate appearances (or 14 home runs per 550 PA). As Smith also found, a standard deviation is of such a size (about 1.5%) that it prevents any player from recording a home-run rate any lower than about 1.5 standard deviations below the league average (or, about a grade of 35 on power). To address that, I’ve halved the standard deviation for the three grades below 50, so that a 20 and 30 power grade actually mean something. The figures recorded here seem to match, more or less, the thresholds I’ve seen elsewhere for power.

As suggested in the discussion of the hit tool above, home runs can have a substantial influence on batting average at the extremes. Below, for example, is an estimate of all the grades on the power tool produced by changing merely the player’s home-run rate on contact (3.5% league average, 2.0% standard deviation) — and, again, assuming league-average everything else.

Grade PA HRC% HR AVG OBP SLG ISO wOBA Off WAR
20 550 0.5% 2 .240 .306 .310 .070 .281 -16 0.2
30 550 1.5% 6 .246 .311 .341 .095 .293 -11 0.8
40 550 2.5% 10 .251 .316 .372 .120 .306 -5 1.4
50 550 3.5% 14 .257 .321 .403 .146 .319 0 2.0
60 550 5.5% 22 .268 .331 .465 .197 .344 11 3.2
70 550 7.5% 30 .279 .342 .527 .247 .369 22 4.4
80 550 9.5% 38 .290 .352 .588 .298 .394 33 5.5
Avg 0.9

What one finds is that batting average increases a full 50 points between a player with 20 power and one with 80 power without altering strikeout rate or BABIP, at all. This is a reality of the hit and power tools, if one defines them in a particular way: they interact.
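
The interaction is easy to reproduce. The sketch below maps a power grade to a home-run rate on contact (with the standard deviation halved below 50, as described above) and then recomputes batting average with strikeout rate and BABIP left at league average. It rests on my own simplifying assumptions (no hit-by-pitches or sacrifices) and hypothetical function names, so the averages differ from the table by a point or two, but the swing of roughly 50 points from 20 power to 80 power comes through.

```python
# Power grade -> home-run rate on contact -> batting average,
# holding K% (19.5%) and BABIP (.300) at league average.
# Assumptions (mine): no hit-by-pitches or sacrifices; grades below 50
# use half the standard deviation, as described in the text.

LEAGUE_HRC = 0.035
HRC_SD_ABOVE = 0.020
HRC_SD_BELOW = 0.010  # halved for grades under 50


def hrc_for_grade(grade):
    steps = (grade - 50) / 10
    sd = HRC_SD_ABOVE if grade >= 50 else HRC_SD_BELOW
    return LEAGUE_HRC + steps * sd


def batting_average(hrc_pct, k_pct=0.195, babip=0.300, bb_pct=0.08, pa=550):
    at_bats = pa * (1 - bb_pct)
    contact = at_bats - pa * k_pct
    home_runs = contact * hrc_pct
    hits = home_runs + (contact - home_runs) * babip
    return hits / at_bats


if __name__ == "__main__":
    low = batting_average(hrc_for_grade(20))
    high = batting_average(hrc_for_grade(80))
    print(round(low, 3), round(high, 3), round(high - low, 3))
```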

I will not speak to which way is right or wrong, necessarily; however, it seems as though the objective in using the tools as an instrument is to isolate different aspects of a player’s overall physical abilities. As such, perhaps a different way to assess power is to assume a lower BABIP for each jump in power, which is what is occurring in the table below.

Grade PA HRC% BABIP HR AVG OBP SLG ISO wOBA Off WAR
20 550 0.5% .321 2 .257 .321 .326 .070 .295 -10 0.9
30 550 1.5% .314 6 .257 .321 .352 .095 .303 -6 1.3
40 550 2.5% .307 10 .257 .321 .377 .120 .311 -3 1.6
50 550 3.5% .300 14 .257 .321 .403 .146 .319 0 2.0
60 550 5.5% .285 22 .257 .321 .453 .197 .334 7 2.7
70 550 7.5% .269 30 .257 .321 .504 .247 .348 13 3.4
80 550 9.5% .253 38 .257 .321 .555 .298 .362 19 4.0
Avg 0.5

Though not intentional, a byproduct of this second method is actually representative of how power works. Frequently, batters will produce high home-run rates because a greater percentage of their batted-balls are fly balls. While producing more home runs, fly balls also produce lower BABIPs than line drives or ground balls.
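
The BABIP column in that second table can be recovered by asking what BABIP leaves the hit total unchanged once the home runs are accounted for. The sketch below does that under my own simplifying assumptions (no hit-by-pitches or sacrifices, hypothetical function names); it is an illustration of the idea and not the author's calculator, though it comes within about a point of the BABIP figures shown.

```python
# Solve for the BABIP that holds batting average constant as the
# home-run rate on contact changes. With walks and strikeouts fixed,
# at-bats are fixed, so equal hits means equal batting average.
# Assumptions (mine): no hit-by-pitches or sacrifices.

def contact_at_bats(k_pct=0.195, bb_pct=0.08, pa=550):
    return pa * (1 - bb_pct - k_pct)


def babip_for_flat_average(hrc_pct, base_hrc=0.035, base_babip=0.300):
    """BABIP that keeps total hits equal to the league-average batter's."""
    contact = contact_at_bats()
    base_home_runs = contact * base_hrc
    target_hits = base_home_runs + (contact - base_home_runs) * base_babip
    home_runs = contact * hrc_pct
    return (target_hits - home_runs) / (contact - home_runs)


if __name__ == "__main__":
    for grade, hrc in [(20, 0.005), (50, 0.035), (60, 0.055), (80, 0.095)]:
        print(grade, round(babip_for_flat_average(hrc), 3))
```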

Value of Every 10 Scouting Points (From 50): Either about 0.9 WAR or 0.5 WAR.

Run Tool
Speed or running or whatever one calls it precisely is relatively easy to measure: one uses a stopwatch. Our concern for the present moment, however, is to estimate what that raw speed might be worth in terms of runs and wins. At this site is housed a metric (BsR) that combines runs produced from stolen bases (wSB) and runs produced by other baserunning acts (UBR). Is there a perfect correlation between those figures and raw speed? Likely not. Baserunning production is likely a combination of raw speed and “instincts” and, finally, opportunities.

For the moment, however, we’ll use BsR as a proxy for speed. League average by that metric is zero by definition; a standard deviation, about 3 runs. Using those figures, below is an estimate of the various speed grades (as they relate to baserunning exclusively and not, for example, defensive range or BABIP).

Grade PA BsR Off WAR
20 550 -9.0 -9 1.0
30 550 -6.0 -6 1.4
40 550 -3.0 -3 1.7
50 550 0.0 0 2.0
60 550 3.0 3 2.3
70 550 6.0 6 2.7
80 550 9.0 9 3.0
Avg 0.3

Value of Every 10 Scouting Points (From 50): About 0.3 WAR.

Fielding and Arm Tools
Just as with baserunning, there’s a disconnect between the actual physical skills of fielding and the precise value, in runs or wins, which those skills are likely to produce. For that reason, I’ve made no attempt to separate the differences between the fielding and arm tools. Also for that reason, I must concede that what we’re considering here are proxies for the tools, and not any sort of objective expression of the tools themselves.

The move on this site recently to package positional adjustment and UZR into one position-agnostic metric (called Def and described more thoroughly by Dave Cameron here) has been helpful in terms of comparing defensive production. It’s been clear for a while that Brett Gardner, for example, has been as valuable in left field as many defenders are in center. In fact, Gardner has recorded a Def figure of about +7.5 for every 150 games he’s played, or roughly what a +5 center fielder would post over that same interval. Meanwhile, even a slightly below-average shortstop like Mike Aviles or Jamey Carroll or Jed Lowrie has still produced positive defensive figures cumulatively over the last three seasons.

By definition, the league average fielder records a defensive value (again, fielding plus positional adjustment) of zero. The standard deviation is about 10 runs per annum. Below is an estimate of how those figures equate to the scouting grades.

Grade PA Fld+Pos Def WAR
20 550 -30.0 -30 -1.2
30 550 -20.0 -20 -0.1
40 550 -10.0 -10 0.9
50 550 0.0 0 2.0
60 550 10.0 10 3.1
70 550 20.0 20 4.2
80 550 30.0 30 5.2
Avg 1.1

Value of Every 10 Scouting Points (From 50): About 1.1 WAR.
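
Both this table and the baserunning one follow the same recipe: count standard deviations from 50, multiply by the run value of one standard deviation (about 3 runs for BsR, about 10 for Def), and convert runs to wins. A minimal sketch follows, with hypothetical names and with the replacement-level and runs-per-win constants assumed (about 18.5 runs and 9.25 runs per win, inferred from the tables and not taken from any published source).

```python
# Grade -> runs above average -> WAR, for tools measured directly in runs.
# The constants are assumptions inferred from the tables in this post.
REPLACEMENT_RUNS = 18.5  # assumed, per 550 PA
RUNS_PER_WIN = 9.25      # assumed

BSR_SD_RUNS = 3.0    # one standard deviation of baserunning runs
DEF_SD_RUNS = 10.0   # one standard deviation of fielding plus position


def runs_for_grade(grade, sd_runs):
    """Runs above average implied by a 20-80 grade."""
    return (grade - 50) / 10 * sd_runs


def war_for_grade(grade, sd_runs):
    """WAR for a player who is league average in everything else."""
    return (runs_for_grade(grade, sd_runs) + REPLACEMENT_RUNS) / RUNS_PER_WIN


if __name__ == "__main__":
    for grade in range(20, 90, 10):
        print(grade,
              round(war_for_grade(grade, BSR_SD_RUNS), 1),
              round(war_for_grade(grade, DEF_SD_RUNS), 1))
```

The output agrees with the two tables to within about a tenth of a win; the small differences presumably come from rounding in the original calculator.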

Summary with Many Caveats
If I haven’t stressed it enough above, let me say it here: this is not intended to be a definitive work, but rather a rough attempt to provide some clarity on the actual significance, in terms of wins, of the various scouting grades for each tool. It’s come to my attention that, even if scouts do not always use grades precisely as they relate to the league average and standard deviations as those figures exist in major-league baseball in 2013, it’s generally the case that scouts understand each other so far as the significance of those grades is concerned.

Ultimately, that’s the goal of any sort of language — to communicate meaning. Part of knowing French, I’ve learned, is knowing not how to speak what might be considered “proper” French, but how to speak it like an actual French person does every day. (Ingeniously, I’ve solved the problem by learning neither kind of French.) There are idioms and expressions and plays on words that are opaque to someone who has only learned the language in a classroom.

The scouting grades are very possibly not unlike any other language. My limited exposure to the scouting world suggests to me that individual scouts are not anxious about the exact league-average BABIP or the standard deviation for 2013 of home-run rate on contact. Still, it appears as though an understanding exists of the difference between a 40 and 60 hitter, or 20 and 50 runner. This post likely isn’t for that crowd, but rather for those of us (the author very much included) who appreciate seeing that sort of thing rendered objectively. If nothing else, it allows for a starting place for translating the grades on a player’s tools into something concrete — like for Garin Cecchini, who, as a 70 hitter and average everything-else, would likely produce about a 4.0 WAR.

Finally, by way of reference, here’s a summary of the findings above:

Tool WAR Every 10 Pts*
Hit 1.0
Power 0.9 or 0.5
Run 0.3
Fld/Arm 1.1

*Above or below major-league average, that is, meaning the ceiling for a player with 550 PAs is ca. 10-12 wins.
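
As a rough illustration of how the summary table might be applied, the sketch below adds up the per-tool values for a full set of grades. It is a purely additive, linear approximation with hypothetical names: it ignores any interaction between tools, treats grades below 50 the same as grades above it, and leaves the choice between the two power values (0.9 or 0.5) to the caller.

```python
# Additive translation of five-tool grades into WAR per 550 PA,
# using the per-10-point values from the summary table.
# Assumption (mine): tools contribute independently and linearly.

WAR_PER_10_POINTS = {"hit": 1.0, "power": 0.9, "run": 0.3, "field_arm": 1.1}
AVERAGE_WAR = 2.0  # league-average player over 550 PA


def project_war(grades, war_per_10=WAR_PER_10_POINTS):
    """grades maps tool name to a 20-80 grade; missing tools default to 50."""
    war = AVERAGE_WAR
    for tool, value in war_per_10.items():
        war += (grades.get(tool, 50) - 50) / 10 * value
    return war


if __name__ == "__main__":
    # A 70 hitter who is average at everything else, like Cecchini.
    print(round(project_war({"hit": 70}), 1))  # 4.0

    all_80s = {tool: 80 for tool in WAR_PER_10_POINTS}
    low_power = dict(WAR_PER_10_POINTS, power=0.5)
    print(round(project_war(all_80s, low_power), 1),
          round(project_war(all_80s), 1))
```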






Carson Cistulli has just published a book of aphorisms called Spirited Ejaculations of a New Enthusiast.


BDF
Guest
2 years 7 months ago

Even by non-Cistulli standards, this is superior thinking and execution. Thanks.

Nathan Nathan
Guest
2 years 7 months ago

This is excellent!

I think that your second interpretation of power is the one scouts should prefer, because it works to be more orthogonal to the hit tool.

I like to take things to extremes to see how they might be improved. Am I right in interpreting that a player that’s 80s across the board would be projected for about 8.6 – 9.9 WAR/550 PAs, depending on the power interpretation?

(Obviously, this would be only true 80s, not 80s that are actually off the top of the scale.)

That works quite well. Very impressive stuff.

Red Pencil
Guest
2 years 7 months ago

I believe the range would be 10.4 – 11.9 WAR/550 PA, no? 80’s across the board give you 3X(2.8 to 3.3) additional WAR, or 8.4 – 9.9 on top of the 2.0 WAR for the average player. Seems like a reasonable proxy.

Spencer D
Guest
2 years 7 months ago

Isn’t there some degree of compounding? For example, the effect of baserunning is magnified with an on-base machine (a la Mike Trout, Barry Bonds, etc.).

Flyingbiker
Guest
2 years 7 months ago

It is worth noting that this whole analysis does not take into account variations in walk-rate, something I am told is pretty important.

Overall, I think it is really impressive that boiling down the 20-80 scale to WAR terms passes the high level/sanity checks. Nice work.

Nathan Nathan
Guest
2 years 7 months ago

Yes, that’s correct. I forgot to add the 2.0 WAR back in for the average player.

And as Kinanik notes, that’s ignoring base-running/OBP type interactions.

Spit Ball
Guest
2 years 7 months ago

People are all fucked up, but your arithmetic is downright fucking neurotic……whoops Freudian slip, I mean erotic.

Optimistic Twins Fan
Guest
2 years 7 months ago

So that means Buxton’s a certain 9 WAR player annually, right?

Kinanik
Member
2 years 7 months ago

Doesn’t this assume no interaction effects? The difference in value between a 50 power and 70 power is different for a player with a 50 hit tool vs a 70 hit tool, I would guess.

Eminor3rd
Guest
2 years 7 months ago

Great work, French Carson.

James K.
Guest
2 years 7 months ago

It’s Carçon now, actually.

Vlad the Impaler
Guest
2 years 7 months ago

“Carcon means boy”

gnomez
Guest
2 years 7 months ago

garçon

Stapler
Guest
2 years 7 months ago

So by this math, a 20/20/20/20/20 player is worth -0.7 WAR, or a better player than Adeiny Hechavarria, while a guy like Mike Trout is worth ~17 wins.

Cool system.

Spencer D
Guest
2 years 7 months ago

Mike Trout isn’t all 80s. His arm is ~50 and Speed ~70 by my judgement.

Bip
Member
2 years 7 months ago

I don’t think he has 80 power either. As some have pointed out, the combination of power and hitting will probably create more home runs than just 80 power. Chris Davis hit 53 homers despite a lousy K rate. That is 80 power.

Johnhavok
Guest
2 years 7 months ago

Have you thrown some real players into the mix with their scouting #’s and see how they add up in MLB right now to a player’s WAR totals?

nada
Guest
2 years 7 months ago

this is what I want to see. Grades are nice and all, but how informative are they as to how a player will actually perform?

Another question I had about this: WAR at what age? Player WAR changes over the course of a career. Does this project to WAR in a player’s prime years? Their career average?

Roger
Guest
2 years 7 months ago

There are usually 2 sets of grades given: present and future (typically meaning prime years – unless of course the player is past his prime). For prospects such as Cecchini, grades typically imply future, and present is explicitly stated when used.

Johnhavok
Guest
2 years 7 months ago

Yes I understand that. My question was to take current MLB players who have already established what their tools are and use the model to see how accurate it is.

For example Robinson Cano.

Roger
Guest
2 years 7 months ago

I was replying to nada’s 2nd paragraph.

Comparing these results to scouting reports such as the ones at http://scouts.baseballhall.org/ would be interesting.

Oh, Beepy
Guest
2 years 7 months ago

This is the best Content:Comma ratio Carson Cistulli may have ever achieved.

This is a career post, an outlier, and absolutely great.

Now please go back to abusing commas.

AC of DC
Guest
2 years 7 months ago

I mostly read the whole thing as “Blah, blah, Cistulli excited over high marks given to a guy whose name ends with an i,” and ignored the rest. It was probably good, though.

Bryan
Guest
2 years 7 months ago

This also seems like a handy way to come up with “ideal comps” based on scouting reports. If I project from this system that prospect X is expected to produce a certain WAR made up from the various components, I can go out and find a MLB player whose most recent season (or if you prefer, career) looks just the same — and voila, an ideal comp!

MDL
Member
2 years 7 months ago

Some of the top hitters for each of the last couple years, according to the MLB.com rankings and using the scouting grades listed at MLB.com and BaseballProspectNation.com. These are values compared to average, so a 50-50-50-50/50 player gets a 0 and an 80-80-80-80/80 player gets an 8.4-9.9.

2013 (MLB.com)
Byron Buxton 3.7
Oscar Taveras 3.4-3.9
Miguel Sano 1.5-2.5
Francisco Lindor 6.2
Xander Bogaerts 3.1
Carlos Correa 3.8-4.3
Javier Baez 3-4
Nick Castellanos 2-2.5

2012 (BaseballProspectNation.com)
Jurickson Profar 5.1
Wil Myers 5.3-5.8
Nick Castellanos 3.1-3.6
Travis d’Arnaud 1-1.5
Francisco Lindor 3.2-3.7
Billy Hamilton (-0.6)-0.6
Mike Olt 2.9-3.9
Christian Yelich 2.1-2.6

2011 (BaseballProspectNation.com)
Mike Trout 8.2-8.7
Bryce Harper 5.6-7.1
Jesus Montero 2-3
Manny Machado 2.2-3.2
Jurickson Profar 5.1
Wil Myers 5.3-5.8
Anthony Rizzo ?
Devin Mesoraco (-1)-(-0.5)

dose17
Member
2 years 7 months ago

This article was well above replacement-level. I give it a high-70 grade, Cistulli, if you will.

d_i
Member
2 years 7 months ago

I’ve often thought about this and contemplated the relative importance of each tool and the interplay between them, but never attempted to quantify it like this – good work.

This makes me miss the Up and In podcast. RIP Kevin.

jim S.
Guest
2 years 7 months ago

Merveilleux, monsieur.

tz
Guest
2 years 7 months ago

I love this kind of stuff. Thanks Carson!

Brandon
Guest
2 years 7 months ago

I learned a long time ago that the way to tell the difference between a scout and a sabermetrician is by their shame.

RaysFan
Member
2 years 7 months ago

So basically, Mike Trout is a 160 player.

Brandon
Guest
2 years 7 months ago

From what I’ve heard he’s pretty good.

nil satis nisi optimum
Guest
2 years 7 months ago

I think you are underrating him substantially.
His greatness is beyond measure, it is numerically undefined.

Luke
Guest
2 years 7 months ago

Great work

Jon
Guest
2 years 7 months ago

Question: Is a 20 the worst player in the majors, or is it the worst player in the minors, or is it your Uncle Earl who sits on the couch eating Cheetos?

Owen
Guest
2 years 7 months ago

It’s funny that you used Garin Cecchini as your intro, since his best skills (walks and doubles power) are arguably ignored by the 5-dimensional analysis, and thus by your translation into WAR. Aren’t these also the skills most ignored by traditionalists (BA ignores walks and treats doubles like singles; Triple Crown stats ignore doubles)?

Plucky
Guest
2 years 7 months ago

What would happen if rather than adjusting flyball rate, you made the power tool operate at the level of HR/fly and 2B/H rather than adjust fly ball% or BABIP directly? Would have a modest effect on avg, and isolate ISO pretty well
