Daily Notes, Featuring an Experiment

Table of Contents
Here’s the table of contents for today’s edition of Daily Notes.

1. Experiment: SCOUT+ Batting Leaderboards
2. SCOUT+ Leaderboard: Spring Training Batters
3. SCOUT+ Leaderboard: Arizona Fall League Batters, Revisited

Experiment: SCOUT+ Batting Leaderboards
For the past year-plus, I’ve frequently published in these pages what I’ve called the “SCOUT leaderboards” for winter leagues and (recently) spring training. I’m quoting myself when I write that “SCOUT represents an attempt to derive something meaningful from small samples” and is the average of a player’s standard deviations from the league mean (or, z-score) either in regressed strikeout and walk rate (for pitchers) or regressed home-run rate, walk rate, and strikeout rate (for hitters). SCOUT builds off of work done by Pizza Cutter on when samples for different stats become reliable. By taking a batter’s strikeout rate, for example, after X plate appearances and figuring in the league-average strikeout rate for the remaining plate appearances — up to the reliable sample size for strikeout rate — we’re able to reach a conservative estimate of what that batter’s “true talent” strikeout rate is. (Click here for more on SCOUT.)
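The padding step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the league-average strikeout rate of 18%, and the 150-PA reliability threshold are placeholders chosen for the example, not figures given in this post.

```python
def regress_rate(events, pa, league_rate, reliable_pa):
    """Pad an observed sample with league-average plate appearances.

    If the batter has fewer PAs than the (assumed) reliable sample size,
    the remaining PAs are filled in at the league-average rate.
    """
    if pa >= reliable_pa:
        return events / pa
    padding = reliable_pa - pa
    return (events + league_rate * padding) / reliable_pa


# Hypothetical batter: 6 strikeouts in 32 plate appearances.
# league_rate and reliable_pa are assumed values, for illustration only.
x_k = regress_rate(events=6, pa=32, league_rate=0.18, reliable_pa=150)
print(f"raw K%: {6 / 32:.3f}, regressed xK%: {x_k:.3f}")
```

With so few plate appearances, the regressed rate sits very close to the league rate, which is the conservative behavior the method is after.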

Yesterday, I wondered aloud whether it made sense to continue including walk rate in SCOUT for hitters. “There seems to be,” I suggested, “a significant-enough population of hitters who’re able to post high-ish walk rates against minor-league (and, presumably, spring-training) pitching based largely on selectivity, but whose walk rates decline considerably when they face more talented major-league pitchers.” If walk rates dry up upon reaching the major leagues, it follows that they shouldn’t be included in a metric designed to make some kind of comment on a player’s future.

A couple of readers suggested that this might not be the case, at all, however — and, in fact, a recent (and excellent) study by Chris St. John at the Platoon Advantage reveals that minor-league walk rate is actually a useful tool in attempting to analyze a prospect’s chances for major-league success. (Note that I say prospect. There are still likely older minor leaguers who, by virtue of experience, are able to post comparatively high walk rates in the high minors.) St. John’s work has convinced me that SCOUT should include walk rate.

Simultaneous to this, I’ve wondered less aloud whether it might make sense to weight the various elements of SCOUT. Walk rate might be important, but home-run rate is surely more important; and yet, per SCOUT, a batter at 0.5 standard deviations above the league mean in expected walk rate would be valued as highly as a player at 0.5 standard deviations above the league mean in expected home-run rate.
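To make the equal-weighting issue concrete, here is a small Python sketch of plain SCOUT as an unweighted average of z-scores. The function name is illustrative, and flipping the sign of the strikeout z-score (so that fewer strikeouts count as better) is an assumption; the post doesn’t spell out how the sign is handled.

```python
from statistics import mean


def scout(z_hr, z_bb, z_k):
    """Unweighted average of a hitter's z-scores.

    Assumption: the strikeout z-score is negated so that striking out
    less often than the league raises the result.
    """
    return mean([z_hr, z_bb, -z_k])


# Two hypothetical batters, each league average except for one skill.
walker = scout(z_hr=0.0, z_bb=0.5, z_k=0.0)
slugger = scout(z_hr=0.5, z_bb=0.0, z_k=0.0)
print(walker, slugger)
```

Both batters come out identical here, even though an extra half standard deviation of home-run rate is presumably worth more than the same amount of walk rate.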

In response to that, I submit this experiment: a version of SCOUT called “SCOUT+.” SCOUT+ builds off of work by Bradley Woodrum from last August. In a piece called “Defensive Independent Hitting, Or ShH” (the last bit standing for “Should Hit”), Woodrum found that, using the three variables found in FIP plus expected BABIP, one could predict a batter’s “true talent” wRC+ with some accuracy.

True-talent BABIP is, of course, not something that we can reasonably predict for winter leagues or spring training; the other elements of ShH, however, have already served as the basis for SCOUT.

Accordingly, I’ve used my limited spreadsheeting skills to calculate what we’ll call SCOUT+ for the moment. SCOUT+ is essentially a heavily regressed estimate of what a player’s wRC+ should be (again, minus BABIP). By definition, this will undervalue players who are capable of sustaining higher BABIPs and overvalue players whose “true talent” BABIP is lower than league average. Put another way, SCOUT+ will likely undervalue players who hit the ball hard and/or are fast, while overvaluing players who are either slow or possess an extreme fly-ball approach.

To calculate SCOUT+, I’ve used a simplified form of Woodrum’s equation: specifically, as proffered by Tom Tango, (12.3*xHR% + 3*xBB% - 2*xK%) * 92, where xHR%, xBB%, and xK% stand for expected home-run, walk, and strikeout rate. To that result, I’ve added a constant that sets the average for all players in the sample at 100. The results seem reasonable, and SCOUT+ appears to account for the different values between home runs and walks and strikeouts in a way that plain SCOUT did not.
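A rough Python sketch of that calculation follows. The function names are illustrative, and entering the rates as fractions (3.4% as 0.034) is an assumption about how the formula is applied. The three-batter sample is only for demonstration: because the constant is set by whatever sample is passed in, these outputs will not match the published leaderboard figures, which are centered on the full sample.

```python
from statistics import mean


def raw_scout_plus(x_hr, x_bb, x_k):
    """Tango's simplified form, before the centering constant is added."""
    return (12.3 * x_hr + 3 * x_bb - 2 * x_k) * 92


def scout_plus(sample):
    """Return {name: SCOUT+}, shifted so the sample average is 100."""
    raw = {name: raw_scout_plus(*rates) for name, rates in sample.items()}
    constant = 100 - mean(list(raw.values()))
    return {name: value + constant for name, value in raw.items()}


# Regressed rates (xHR%, xBB%, xK%) for three spring-training batters,
# expressed as fractions (an assumption; see the note above).
sample = {
    "Dan Uggla": (0.034, 0.083, 0.182),
    "Alex Gordon": (0.031, 0.074, 0.173),
    "Chris Gimenez": (0.024, 0.055, 0.217),
}
scores = scout_plus(sample)
for name, score in scores.items():
    print(f"{name}: {score:.0f}")
```

Centering on the sample mean means the same batter would get a different SCOUT+ in a different league or sample, so the numbers are best read as relative to the other players listed.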

SCOUT+ Leaderboard: Spring Training Batters
The idea for SCOUT+ is introduced in belabored fashion above. HR%, BB%, and K% are the raw rate stats so far from spring training. xHR%, xBB%, and xK% (i.e. expected home-run, walk, and strikeout rate) are the regressed versions.

Below is the SCOUT+ batting leaderboard for spring training, for the 149 batters who’d recorded at least 22 spring-training at-bats as of Thursday afternoon. Note, of course, that the samples in question are very small and that the results should be regarded with due restraint.

Name | Org | PA* | HR% | BB% | K% | xHR% | xBB% | xK% | SCOUT+
Dan Uggla | ATL | 32 | 9.4% | 18.8% | 18.8% | 3.4% | 8.3% | 18.2% | 114
Mat Gamel | MIL | 25 | 12.0% | 12.0% | 12.0% | 3.4% | 7.0% | 17.1% | 113
Melky Cabrera | SF | 30 | 10.0% | 0.0% | 3.3% | 3.4% | 5.4% | 15.1% | 111
Eric Sogard | OAK | 35 | 5.7% | 11.4% | 8.6% | 3.0% | 7.2% | 15.9% | 111
Alex Gordon | KC | 29 | 6.9% | 13.8% | 13.8% | 3.1% | 7.4% | 17.3% | 109
Jaff Decker | SD | 29 | 6.9% | 10.3% | 10.3% | 3.1% | 6.9% | 16.6% | 109
Jemile Weeks | OAK | 32 | 9.4% | 3.1% | 12.5% | 3.4% | 5.8% | 16.9% | 109
Lorenzo Cain | KC | 25 | 8.0% | 12.0% | 12.0% | 3.1% | 7.0% | 17.1% | 109
Aubrey Huff | SF | 27 | 7.4% | 7.4% | 7.4% | 3.1% | 6.5% | 16.2% | 109
Hector Luna | PHI | 26 | 7.7% | 7.7% | 7.7% | 3.1% | 6.5% | 16.3% | 109

*Estimated.

And here’s the laggardboard:

Name | Org | PA* | HR% | BB% | K% | xHR% | xBB% | xK% | SCOUT+
Chris Gimenez | TB | 25 | 0.0% | 0.0% | 40.0% | 2.4% | 5.5% | 21.7% | 89
Bryan LaHair | CHC | 25 | 0.0% | 0.0% | 36.0% | 2.4% | 5.5% | 21.1% | 90
Ian Desmond | WSH | 26 | 0.0% | 0.0% | 34.6% | 2.4% | 5.5% | 21.0% | 90
Dayan Viciedo | CWS | 24 | 0.0% | 4.2% | 41.7% | 2.5% | 6.1% | 21.9% | 90
Chris Heisey | CIN | 29 | 0.0% | 10.3% | 41.4% | 2.4% | 6.9% | 22.6% | 91
Danny Espinosa | WSH | 25 | 0.0% | 0.0% | 32.0% | 2.4% | 5.5% | 20.4% | 91
Jai Miller | BAL | 26 | 3.8% | 3.8% | 46.2% | 2.8% | 6.0% | 23.0% | 92
Brandon Wood | COL | 24 | 0.0% | 0.0% | 29.2% | 2.5% | 5.6% | 19.9% | 93
Clete Thomas | DET | 26 | 0.0% | 0.0% | 26.9% | 2.4% | 5.5% | 19.6% | 93
Tim Beckham | TB | 23 | 0.0% | 4.3% | 34.8% | 2.5% | 6.1% | 20.6% | 93

*Estimated.

SCOUT+ Leaderboard: Arizona Fall League Batters, Revisited
Towards the end of November, I published the final SCOUT batting leaderboard for the Arizona Fall League using the original calculation of SCOUT. Here’s that same leaderboard, except using SCOUT+. (I’ve included some notes on the differences below.)

Name | Org | PA* | HR% | BB% | K% | xHR% | xBB% | xK% | SCOUT+
Robbie Grossman | PIT | 124 | 5.6% | 16.1% | 14.5% | 3.9% | 13.8% | 15.4% | 133
Mike Olt | TEX | 121 | 10.7% | 12.4% | 29.8% | 6.0% | 11.5% | 27.8% | 128
Michael Choice | OAK | 75 | 8.0% | 12.0% | 16.0% | 4.0% | 10.8% | 17.8% | 121
Wil Myers | KC | 106 | 3.8% | 18.9% | 17.0% | 3.1% | 14.7% | 17.7% | 121
Jefry Marte | NYN | 90 | 4.4% | 13.3% | 13.3% | 3.2% | 11.5% | 15.8% | 117
Nolan Arenado | COL | 129 | 4.7% | 6.2% | 10.9% | 3.5% | 7.6% | 12.1% | 117
Jedd Gyorko | SD | 81 | 6.2% | 12.3% | 18.5% | 3.7% | 11.0% | 19.0% | 115
Brodie Greene | CIN | 90 | 4.4% | 11.1% | 13.3% | 3.2% | 10.5% | 15.8% | 115
Bryce Harper | WAS | 104 | 5.8% | 10.6% | 21.2% | 3.8% | 10.3% | 20.6% | 112
Aaron Hicks | MIN | 120 | 2.5% | 15.0% | 17.5% | 2.6% | 13.0% | 17.9% | 110

*Estimated.

Notes
• Pittsburgh outfield prospect Robbie Grossman, who finished atop the original SCOUT leaderboard, finishes atop the SCOUT+ leaderboard, too.
• Unlike Grossman, Texas third-base prospect Mike Olt (No. 2 here) didn’t even finish in the top 10 on the SCOUT leaderboard: he was 18th overall, actually, out of 64 qualified batters.
• Gone from the original leaderboard are Milwaukee’s Logan Schafer (who moves down to 17th overall), Boston’s Alex Hassan (14th), and San Francisco’s Joe Panik (16th). Replacing them are Olt, San Diego third-base prospect Jedd Gyorko (who was 12th), and Bryce Harper (who was 19th originally).







Carson Cistulli occasionally publishes spirited ejaculations at The New Enthusiast.


11 Responses to “Daily Notes, Featuring an Experiment”

  1. Baltar says:

    I guess I missed the part that proves that any of this means anything.


    • Ben says:

      Do you mean SCOUT+ in particular? Baseball in general? Life in the grand scheme of the universe?


    • William Blake says:

      Have you read my poetry?

      He who shall hurt the little Wren
      Shall never be belov’d by Men.
      He who the Ox to wrath has mov’d
      Shall never be by Woman lov’d.

      The wanton Boy that kills the Fly
      Shall feel the Spider’s enmity.
      The poison of the Snake & Newt
      Is the sweat of Envy’s Foot.
      The poison of the Honey Bee
      Is the Artist’s Jealousy.

      A truth that’s told with bad intent
      Beats all the Lies you can invent.


    • bill says:

      It’s all vanity.


    • guesswork says:

      It doesn’t show much, but it’s a fun exercise. Here’s the leader and laggard boards for 2011, min 50 AB:

      128 Mike Morse WSH
      122 Willie Harris NYM
      122 Alex Gordon KC
      120 Ben Francisco PHI
      120 Jake Fox BAL
      120 Kila Ka’aihue KC
      119 Alcides Escobar KC
      118 Lastings Milledge CWS
      118 Aubrey Huff SF
      118 Ian Kinsler TEX

      78 Ryan Langerhans SEA
      78 Scott Cousins FLA
      80 Pedro Alvarez PIT
      80 Drew Stubbs CIN
      82 Willie Taveras COL
      82 Mark Reynolds BAL
      83 Tyler Greene STL
      84 Jhonny Peralta DET
      85 Brandon Inge DET
      85 Matt Wieters BAL


  2. BDF says:

    Carson,

    “an attempt to derive something meaningful”

    What is that meaningful thing, performance or projection? Since these games don’t count it seems like it must be the latter. Is it too soon to have gone back and see whether SCOUT excellence correlates with MLB excellence?


    • Very fair question(s).

      The project started as means of providing an alternative to a standard slash-line leaderboard. With regard to the AFL, in particular, it’s pretty frequent that a beat writer or blogger will refer to a prospect’s slash-line as an indication that he’s doing well. Of course, because the AFL is generally hitter-friendly, it’s the case that almost every player’s slash-line is excellent. SCOUT is a means of comparing the players against each other, to get a sense of how well each is actually performing.

      That’s all I intend by “meaningful,” really — that it’s an improvement upon other leaderboards you’ll see for leagues (like the winter leagues, like spring training) that are rife with small sample sizes.

      Insofar as we have years of data telling us that Hector Luna is only slightly better than a replacement-level hitter, it’s wise not to conclude much from his 26 spring PAs. On the other hand, despite his .833 spring OPS and the attendant fanfare, Brandon Wood appears to be hitting exactly like Brandon Wood so far. And for the prospects, SCOUT helps to highlight a guy like Jefry Marte, maybe, who’s been around long enough to have already fallen off some prospect lists, but is still just entering his age-21 season and showed good underlying skills this past fall.


      • BDF says:

        Thanks, Carson. That’s a measured, intelligent response. My opinion is that you’re claiming SCOUT(+) does exactly what it actually does. This is a fruitful direction for future sabermetric research, in my opinion: small-bore stats that provide incremental improvement on what already exists rather than attempts to find global narratives in single numbers.


  3. Snowblind says:

    I really appreciate the attempt at this. This kind of problem seems to make more sabermetricians wave their hand dismissively, say “small sample size” and move on. I think the way you’re defining this, and how it should be used, is very well explained.

    I do wonder if this controls enough for the variance in pitchers that the hitters will see, though. Between raw new guys, veteran retreads trying to prove something, guys working on specific pitches and approaches without caring about the outcomes, etc. the quality of opponent seems to be all over the map.

    If one’s frequent spring training partner is the B-squad AA/AAA guys from the Padres, then total spring training results might be a bit more skewed than if one is playing most games against Phillies starters (ok, bad example, Halladay is off to a rough spring) or something.

    Also, some guys might be looking better than they should because the in-game strategy isn’t the same. Inasmuch as a team uses scouting reports in the regular season, they may not be using them as much in spring training, in favor of working on specific pitches. Say Felix wants to reintroduce his slider into his repertoire (because in a recent interview, he says he does). He’s probably throwing it more often, in more counts, to different guys, than he normally would. He’s trying to spot the slider and repeat mechanics, not pitch with intent to attack that specific hitter’s weakness on the “right” count. So it doesn’t mean as much if the hitter guesses right or hits one that straightens out a bit.

    In other words: Doesn’t the small sample size and wide, wide variance in talent and game approach, skew the level of pitching competition that hitters see?


  4. Steve Balboni says:

    Having listened to my first Podcast last night, I think I understand Scout+, or at least its place here at Fangraphs.

    Horticulture is hard and the most beguiling species are the hardest. The best gardeners nurture patiently and flexibly; aerating deep tap roots that produce, say, fruit of calculus, or patiently stringing drip lines along lateral roots and then benignly accepting crypto-occult mystic brambles, or lovingly misting aeroponic roots and then bemusedly chuckling over the ugly bloom of alchemy. Flexibility, because the best gardeners know they aren’t in charge, the species’ innate process dictates whether you get delicious fruit or teosinte. And, like all of us, the best gardeners know that waste fertigates.

    Fangraphs is the gardener and Carson is the species. Scout+ is something.

    (Incidentally, the only species as broadly “spacey-but-together” as Carson on the podcast are, in my experience, (1) St Johns (Maryland, New Mexico) graduates, (2) Classicists, (3) advanced degree holders who took lots of acid, and (4) professionals multi-tasking through conference calls but earning their exorbitant rates nonetheless.)


    • Cookierojas16 says:

      Steve, you never struck me as such a deep thinker, or expressive writer. Guess I am guilty of stereotyping sluggers.

