Archive for August, 2013

Free Agent Case Study: Jarrod Saltalamacchia

Dennis Eckersley filled in for Jerry Remy during the Red Sox road trip to play the Giants and Dodgers and has remained on board for the Orioles series. Eckersley’s analysis, cluttered with lingo like “cut the moss,” “throwing cheese,” and “Hello?!”, is also often insightful and informative. Such was the case when he praised Jarrod Saltalamacchia for his consistent season behind the plate for the Sox in 2013.

[Photo: Jarrod Saltalamacchia. Caption: It’s hard to imagine another Sox catcher at this point in time.]

Saltalamacchia has seemingly overcome the developmental issues that persisted during the early part of his career on the defensive side of the plate. His swing has always provided power and, perhaps most importantly, he has become a trusted game-caller by Boston’s pitching staff. Salty is playing in a contract year in 2013. In this post, I’ll take a look at the market for catchers, analyze Salty’s true value to the Sox, and give a prediction for whether I see him re-upping with Boston this offseason.

The Numbers

Saltalamacchia was once a top prospect in the Braves system and was the centerpiece in a 2007 trade for Mark Teixeira. Much of the promise scouts saw in Salty arose from the power he generated with his uppercut swing from the left side of the plate. Like most young players with a long, powerful stroke, Salty struggled with strikeouts and inconsistencies in his approach. Salty’s status as a star prospect diminished during his time in Texas due to his inability to put the ball in play, and the Sox took a flier on him at the end of the 2010 campaign. The numbers show the type of hitter that he has been from 2011-2013 with the Sox, but also point to a fundamental change in his approach in his most recent campaign.

From 2011-2013, Salty hit the 6th-most homers among catchers in Major League Baseball with 51 (Mike Napoli leads all catchers with 69, despite playing his entire 2013 campaign at first base). Salty is also last among all Major League catchers with a 69.8% contact rate and leads the group with a 30.6% strikeout rate. These are numbers that reflect a hitter who tries to hit the ball out of the park on each and every swing.

Despite the strikeouts (which have been prevalent throughout Salty’s career), it is clear that he has gone through a fundamental change in hitting philosophy during the 2013 campaign. The graph below helps us visualize Salty’s trend during his time with the Sox:

[Graph: Saltalamacchia’s batted balls by type (fly balls, ground balls, line drives), by season]

The graph breaks down Salty’s batted balls by fly balls, ground balls, or line drives. When he arrived with the Sox in 2010, Salty was in the worst spot, batted ball-wise, of his entire career. His line drive percentage hovered around 5%, whereas his fly ball (and pop up) percentage was at the highest of his career. Since he has been with the Sox, Salty has reversed this trend, culminating in the highest line-drive (and lowest fly-ball) percentages of his career in 2013. This is one reason for Salty’s apparent decrease in power, as ZiPS projects him to hit 14 dingers this year following seasons of 16 homers in 2011 and a career-high 25 in 2012. Despite the slight dip in power, the change in approach has made Salty a more productive overall hitter: his greater propensity to hit line drives has caused his BABIP to rise dramatically from .265 in 2012 to a whopping .379 in 2013. Moreover, it has caused his overall average and OBP to rise to .270 and .341, respectively (up from .222 and .288 in 2012). The only surprising stat after noticing Salty’s decrease in FB% and increase in LD% is that his slugging percentage has not changed at all from 2012 to 2013, even despite the fact that his homer rate is down. But a quick review of Salty’s counting stats reveals that this is due to the fact that he ranks eighth in the Major Leagues with 34 doubles. We can again attribute this to his greater propensity to hit line drives, as many of the long fly balls that stayed up for just too long may be dropping for Salty in 2013.

As his batted ball trends and overall stats suggest, Salty has been on an upward slope as a hitter during his time with the Red Sox.

The Intangibles

While we have just examined the ways in which Salty helps his club with the bat, he holds arguably even more value to the Sox as their primary backstop. This is where the intangibles come into play, which might be the single biggest factor as to why Salty gets a major payday (or why he doesn’t) on the open market. Simply put, there is much more to calculating player value for a catcher than offensive and defensive stats alone.

Catchers can improve a pitching staff with their daily preparation and ability to call a game. In an effort to quantify Salty’s game-calling ability, I’ll reference an article called “Salty’s Defense/Game-Calling Impact” on the Pro Sports Daily forums. As of August 5th, the chart below gives pitchers’ ERA during Salty’s starts as compared to his backups’ starts over the past three years:

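[Chart: pitchers’ ERA in Saltalamacchia’s starts compared with his backups’ starts over the past three years]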

While there is much more to calling a game than simply “pitcher ERA”, the trend is a bit alarming when estimating Salty’s value. Numbers don’t tell the whole picture, of course, but they certainly wouldn’t support a claim that Salty improves a pitching staff through his game-calling. One thing is clear: pitchers are doing better in 2013 while Salty is behind the plate, but it remains a mystery whether this is because he’s calling a better game or simply because the pitchers he’s catching are better.

Josh Beckett was one pitcher who spoke out about Salty’s inability to be on the same page as the starter, but many have spoken in defense of the backstop’s ability during the 2013 campaign. Jake Peavy, for one, has commended Salty’s approach to game day: “I can’t say enough about his willingness. Salty has got some time here, some time in the big leagues. For him to be so humble in his approach, to not say, ‘This is how we do things here’; it was him saying, ‘Hey, man, what do you need to win tonight? What do you need me to do?’” In any case, his familiarity with the Sox pitching staff likely makes Salty more immediately valuable to the Red Sox than to any other team.

On another note, Salty has been very durable during his time in Boston. He has missed just 4 games due to injury from 2011-2013 and has not been placed on the disabled list once. While durability is always valuable, it is especially valuable at catcher, a position where the day-to-day bumps and bruises are far more prevalent. This should make teams more comfortable offering him a long-term deal.

The Market

Of the 18 catchers on the open market following the season, Salty is the youngest at age 29. He will likely be the second-most coveted free agent catcher (behind Atlanta backstop Brian McCann), though that could change if Salty gets hot or McCann gets hurt again (he missed time in 2011 due to an oblique injury and missed time in 2013 due to offseason shoulder surgery). There have been very few free agent catchers over the past 3 years, likely due to the fact that familiarity with a team’s pitchers is very important to front offices. Thus, we notice that there have been many contract extensions for catchers, but very few catchers who actually hit the open market. Salty is, in fact, in a unique position as a productive free agent catcher who will likely fetch a deal for more than 3 years. In any case, here is how the free agent/extension market has looked over the past two years:

[Table: free agent signings and contract extensions for catchers over the past two years]

Miguel Montero seems to be the most similar comparable by age (29) and overall production (using ZiPS projections, Salty will have a 2.1 average fWAR over the past 3 seasons when he hits the open market). Montero’s signing was actually an extension, so even though his overall production was a bit higher when he signed, it’s certainly not far-fetched to believe that Salty will get a 5-year, $60 million contract when teams are bidding for his services.

The Suitors

I expect the White Sox, Angels, Athletics, Yankees, Braves, Rangers, Rays, and the Red Sox to be in the running for Salty’s services based on their need behind the plate for next season. I doubt the Rays or the White Sox would spend the money on the current Sox backstop, and a signing by the Angels and Athletics seems equally unlikely due to the Angels’ payroll and Oakland’s frugality. This leaves the Red Sox, Yankees, Braves, and Rangers as Salty’s primary suitors, and their free-spending tendencies should make Salty’s eyes light up in free agency.

Prediction

This is a really tough call. I think that if Atlanta does sign a catcher to a long-term deal, they will simply retain Brian McCann (familiarity, as discussed previously, is likely very important to teams when evaluating catchers). A look at the remaining teams’ organizational depth charts could provide insight into Salty’s destination. According to mlb.com, catcher Gary Sanchez is the Yankees’ top prospect with an ETA of 2015. He’s also the second-highest ranked catching prospect in baseball behind Travis d’Arnaud. I expect the Yankees to hold off on signing Salty. While the Rangers’ top prospect is also a catcher, Jorge Alfaro is not expected to arrive until at least 2016 and their 40-man catching depth is weak. The Red Sox have multiple catching prospects in their system (#10 Blake Swihart, #13 Jon Denney, and #15 Christian Vazquez) and have depth on their 40-man roster with Dan Butler and Ryan Lavarnway. If the price does indeed rise to my predicted contract of 5 years and $60 million, it’s hard to see Salty re-upping with Sox GM Ben Cherington. Instead, I see him jumping town for Texas, an organization where his prospects once faded, but one that now might make him a very rich man.

Vince D’Andrea is a rising senior at the Massachusetts Institute of Technology. His blog, Dave Roberts’ Dive, can be found here.


Concerning Jim Johnson and Groundball Relievers in General

Despite leading the AL in saves, Orioles closer Jim Johnson is having a rough year compared to 2012, when he posted a 2.51 ERA and saved 51 games in 54 opportunities. Early in 2013, an enthusiastic Orioles sportswriter named Johnson the best closer in baseball, a statement that doesn’t look quite so good a few months later. As a closer who relies on the groundball, Johnson is something of an odd bird (pun intended). In 2012 his 15.2 K% ranked 130th out of 136 qualified relievers and his Zone-Contact% was 2nd highest. This year his 18.0 K% ranks 111th out of 140 qualified relievers and his Zone-Contact% is 9th highest. While Johnson has struck out a few more hitters, he has also walked slightly more, with his walk rate rising from 5.6% to 7.1%, and his groundball rate is down. Overall, his fielding-independent numbers are basically the same as last year. Various explanations have been offered for Johnson’s lack of success in 2013 compared to 2012. Bill Castro, the Orioles’ interim pitching coach (check out his 1979 season), attributes Johnson’s struggles to overthrowing and a failure to locate down in the zone, which has resulted in fewer early-contact outs. I prepared the following chart to check up on these explanations.

          Bottom Third%   Middle Third%   Upper Third%   2-Seam velo   Z-Contact%   GB%
2012      14.0            14.5            8.3            94.2          92.9         62.3
2013      14.5            14.4            7.7            93.7          90.6         56.2
Career    14.7            14.6            8.3            94.2          90.2         57.5

So Johnson is throwing slightly more pitches in the lower third of the zone, and actually getting more swings and misses on pitches in the zone. The overthrowing statement seems faulty, as Johnson’s velocity on his sinker is actually down. A look at the Pitch f/x data shows that his sinker has flattened out slightly from last year, though the difference is slight overall. The following chart shows what kind of contact batters are making off Johnson compared to last year.

          BABIP   HR/FB   HR%   HR Per Contact
2012      0.251   6.8     1.1   1.4
2013      0.323   12.5    2.1   2.9
Career    0.286   8.0     1.5   2.0

To go into even more detail, the following two charts show BABIP by zone and slugging by zone for Johnson.

BABIP
          Lower Third   Middle Third   Upper Third
2012      0.289         0.296          0.259
2013      0.292         0.486          0.423
Career    0.286         0.298          0.343

SLG
          Lower Third   Middle Third   Upper Third
2012      0.294         0.283          0.516
2013      0.321         0.561          0.515
Career    0.331         0.382          0.503

So balls put in play against Johnson have been falling for hits more frequently this year in each third of the strike zone, and those hits have been more damaging in the lower and middle thirds. In particular, the pitches Johnson has thrown over the middle have been getting hammered. Last year, the results on those pitches were quite tame. Granted, this is a pretty small sample size of balls in play, and nowhere near the point where BABIP is expected to stabilize, but it goes to show that Johnson has not fared nearly as well when hitters are making contact in 2013. But this is not an uncommon issue for high-contact, groundball pitchers. David Robertson can suffer through a .335 BABIP in 2012 and still post a 2.67 ERA on the strength of a 32.7 K%. Pitchers like Johnson who cannot strike out hitters regularly are subject to variance on batted balls. Take a look at most groundball, contact-type pitchers, and you’ll find years where BABIP and ERA go through the roof. With the 60-70 inning seasons relievers work, the results can get skewed very badly. To get a sense for where Johnson stands relative to other groundball relievers, I did an analysis of all qualified relievers since 2002 and separated the 30 highest and 30 lowest groundball rates (Johnson was 24th).

           GB%    K%     BB%   BABIP   LOB%   FBv    FB%    HR%   HR Per Contact   WAR/60 IP   SD/MD
League     44.1   19.5   9.5   0.292   73.3   91.5   62.8   2.4   3.5              0.3         1.7
GB-Heavy   60.5   16.9   9.1   0.294   73.0   90.3   72.7   1.7   2.3              0.4         1.7
GB-Light   31.2   24.4   8.7   0.264   77.0   91.2   64.7   3.0   4.6              0.6         2.3

So not a whole lot of good things to say about the groundball heavy group. Jonny Venters was the only member of the group with a strikeout rate above 20%. They limit home runs pretty well, which is to be expected with so few fly balls. However, many of those groundballs are going for hits, while fly balls that aren’t leaving the yard are twice as likely to be outs.  That 30 point difference in BABIP is pretty huge, and that’s over a sample of more than 20,000 balls in play for each group. Overall, the decrease in home runs isn’t worth the extra hits and walks. With guys like Kenley Jansen and Rafael Soriano, it’s not surprising that the fly ball group features a much better ratio of shutdowns to meltdowns. For the most part, the groundball group is filled with situational guys that have bounced around with sporadic success. While relievers of all types tend to be unreliable, groundball and contact types are subject to the additional randomness of batted ball variance.

Seasons with inflated BABIP and ERA should be an expected consequence for a contact pitcher like Johnson. Of course, it would have been very difficult for the Orioles to demote Johnson to a lower-leverage bullpen role after the success he had in 2012. However, all signs indicate that Johnson is an average bullpen arm whose performance last season far outweighed his ability. He is better suited for the role he played in 2010 as a mid-leverage arm who was not limited to one inning. The Orioles should look for a strikeout arm for high-leverage situations. While Buck Showalter has consistently defended Johnson, not too many managers will bring back a closer after a season leading the league in blown saves.


Paul Goldschmidt and his Five Tools

Typically, when people think of five-tool players, they think of guys like Mike Trout, Andrew McCutchen or Carlos Gonzalez: basically, up-the-middle players who do everything well. Paul Goldschmidt, however, is not an up-the-middle player, but I believe he does have the five tools.

For those who don’t know, the five tools are what scouts use (among other things) to evaluate a player. The five tools are hitting for power, hitting for average (or contact ability), defense, arm and, finally, speed.

When looking at Goldschmidt, the one tool that stands out is his power. He put up at least a .600 slugging percentage (SLG), at least a .290 isolated power (ISO), and 2 seasons of 30 home runs in his 3 minor league seasons. His power has shown in the majors as well. In 2012, his first full season in the majors, he hit 20 home runs with a .490 SLG and a .204 ISO. This season his power has taken another step forward. He currently has 31 home runs, a .548 SLG and a .251 ISO, all of which currently lead National League first basemen. I specify National League here because that Chris Davis guy has been pretty darn good this season.
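As a quick check on the power numbers above: isolated power is conventionally computed as slugging percentage minus batting average. The short Python sketch below applies that definition to the 2012 SLG quoted here and the .286 batting average this post cites for the same season. The function name is illustrative only, and rounding to three decimals is an assumption made to match how the stat is usually displayed.

```python
def isolated_power(slg: float, avg: float) -> float:
    """Isolated power (ISO): slugging percentage minus batting average."""
    # Round to three decimals, the usual display precision for rate stats.
    return round(slg - avg, 3)


# 2012 figures quoted in this post: .490 SLG and a .286 batting average.
iso_2012 = isolated_power(0.490, 0.286)
print(f"2012 ISO: {iso_2012:.3f}")
```

The same subtraction on the in-progress 2013 figures can differ from the published ISO by a point, since the inputs are themselves rounded.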

Goldschmidt’s hit tool is solid but not nearly as good as his power tool. That said, Goldschmidt is still coming into his own in terms of contact rate. Per Pitch f/x, his contact rate has improved each season he has been in the big leagues, rising from 70.7% in 2011 to 77.1% in 2012 and to 78.7% this season. As his contact rate has increased, his strikeout rate has, as expected, decreased at roughly the same pace: from 29.9% in 2011 to 22.1% in 2012 and to 20.6% this season. Batting average is never the best way to evaluate a player, but it does reflect a player’s hit tool. His BA has risen from .250 in 2011 to .286 in 2012 to .298 this season. His BABIP is high this season at .333, but it is actually down from last season’s .340. He had very high BABIPs in the minors, and his batted-ball profile suggests he may be a guy who consistently posts BABIPs above .300.

His defense is again a work in progress. Defensive numbers take about three seasons to become relevant and we don’t quite have that yet but we do have 2695.2 innings for Goldschmidt at 1B. In those innings he has shown to be an above average defender. This season Goldschmidt has an ultimate zone rating of 4.9 which is fifth among qualified first-basemen. Over the last three seasons Goldy’s UZR is 2.9 which among first basemen with a minimum of 2500 innings ranks 7th out of 13. Essentially an average defender. DRS however tells a different story, this season anyway. Per DRS Goldy has been among the best fielding first basemen. He has saved 11 runs, which is tied for the lead with Adrian Gonzalez and Anthony Rizzo.

A first baseman’s arm is very difficult to judge as it is hardly ever needed. To my knowledge  there are not yet stats that judge a player’s arm. So the only way to evaluate a player’s arm is by scouting the player. I did a quick Google search trying to find a scouting report on Goldschmidt’s arm and I found nothing. Thankfully FanGraphs has a feature where fans can submit their reports on players. Of course this isn’t the most accurate analysis, but it will do. The fans gave Goldschmidt a 48 (0-100 scale) in arm strength in 2011 and a 44 arm strength in 2012. His accuracy was given a 53 and 41 in 2011 and 2012 respectively. We can conclude from this that Goldschmidt has about an average arm.

Finally, the last tool to look at is arguably Goldy’s second-best tool: his speed. Goldy stole 18 bases last season, which was tops among qualified first basemen. This season he has 13, which again is tops among qualified first basemen. There is more to speed than just stolen bases, such as the ability to go first to third or score from second on a single. There is a stat that measures this, called base-running runs above average (BsR). It takes all base-running into account, including steals and caught stealing. Goldschmidt has again been elite in this category. This season he has been worth 1.2 runs above average, which is 4th among first basemen. Last season he was worth 3.2 runs above average, which was tops in the National League and second only to Eric Hosmer.

To conclude, perhaps Goldschmidt is not the five-tool player I had anticipated. He does, however, have 2 elite tools in his power and speed. He has 2 average tools in his contact rate and defense. His arm is average to below average, but for a first baseman that’s not too important. He isn’t quite a five-tool first baseman, but 4 average to elite tools with only 1 below average tool make him about as close to a complete-package first baseman as you’re going to find in the game today.


It’s Time for the Rockies to Change Their Pitching Strategy

Since 2002 (the batted ball era), Colorado Rockies pitchers have the 4th highest groundball percentage in MLB. On its face, this seems like a good strategy, as Coors Field has an effect on batted and thrown baseballs that is not pitcher-friendly. Rockies pitching coach Jim Wright emphasizes pitching down as key to the success of the staff. However, the emphasis on groundballs has caused the Rockies to get into relationships with pitchers such as Shawn Estes, Aaron Cook and Jeff Francis, hurlers that lack strikeout stuff. And lest we forget, this same club famously gave Mike Hampton, a groundball pitcher who never averaged more than 6.9 K/9, what was then the richest contract in sports history. As a consequence, Rockies pitchers have the 5th lowest strikeout rate in MLB since 2002. Compounding this problem is the fact that the Rockies have the 6th highest walk rate. Of course, there is the counter-argument that it is harder for pitchers to get strikeouts at Coors Field due to the effect that the altitude has on offspeed pitches. Additionally, it would seem that a pitcher could be forgiven for nibbling a little at the high altitude. I did a little research to determine how much of the poor strikeout and walk rates are due to Coors Field and how much could be attributed to the pitching style the Rockies advocate. I found that only three teams have higher walk rates in road games, and only four teams have lower strikeout rates. So Coors Field is not entirely at fault for the lack of strikeouts and proliferation of walks.

Since 2002,  Coors Field has the second-highest HR/FB ratio behind only the Reds’ Great American Ballpark. I took a look at the Home/Away splits of the five parks with the highest HR/FB ratios since 2002 to see if they tried to combat the longball in a style similar to the Rockies.

HOME
Team        HR%   HR Per Contact   HR/FB   GB%    BABIP   K/9   BB/9   xFIP   ERA
League      2.6   3.6              10.2    44.3   0.291   7.0   3.1    4.14   4.06
Reds        3.4   4.6              12.8    42.5   0.291   7.0   3.0    4.21   4.42
Rockies     3.0   4.0              12.5    46.3   0.311   6.4   3.3    4.27   4.96
Blue Jays   2.9   4.0              12.1    46.2   0.292   7.1   3.1    3.93   4.27
Phillies    2.9   4.1              12.1    44.8   0.288   7.4   2.9    4.08   3.99
Orioles     3.1   4.2              12.1    44.0   0.294   6.4   3.5    4.51   4.61

AWAY
Team        HR%   HR Per Contact   HR/FB   GB%    BABIP   K/9   BB/9   xFIP   ERA
League      2.7   3.0              10.7    43.8   0.296   6.7   3.4    4.36   4.45
Reds        2.7   3.7              10.5    43.4   0.299   6.5   3.4    4.41   4.43
Rockies     2.5   3.4              10.0    44.9   0.297   6.5   3.7    4.48   4.56
Blue Jays   2.7   3.7              10.8    43.4   0.300   6.5   3.7    4.56   4.48
Phillies    2.6   3.6              10.3    43.9   0.295   6.8   3.2    4.24   4.21
Orioles     2.9   4.0              11.2    45.1   0.294   6.5   3.5    4.39   4.91

OVERALL
Team        HR%   HR Per Contact   HR/FB   GB%    BABIP   K/9   BB/9   xFIP   ERA
League      2.7   3.7              10.4    44.0   0.294   6.9   3.3    4.24   4.24
Reds        3.0   4.1              11.6    42.9   0.295   6.8   3.2    4.31   4.42
Rockies     2.7   3.7              11.2    45.6   0.305   6.5   3.5    4.38   4.76
Blue Jays   2.8   3.9              11.4    45.6   0.293   6.8   3.3    4.08   4.26
Phillies    2.8   3.8              11.2    44.3   0.292   7.1   3.0    4.09   4.09
Orioles     3.0   4.1              11.5    43.7   0.297   6.4   3.6    4.54   4.76

So the Rockies’ strategy of pitching to groundballs has led to some success in limiting longballs. Among these teams, only the Blue Jays can match the Rockies’ HR Per Contact rate for home games. The Rockies’ overall HR rate and HR Per Contact rate are league average, thanks to a road HR rate that only two teams can best. Unfortunately for the Rockies, home runs are not the whole story, and their team xFIP and ERA are 7th and 11th worst on the road. Overall, their team xFIP and ERA are 8th and 2nd worst. The Phillies, Blue Jays, and Reds all have strikeout rates at or above league average. Since 2008, the Blue Jays and Reds have stepped up their strikeout efforts. Meanwhile, the Rockies are 29th in strikeout rate in 2013.

Don’t be fooled by the recent success of Jhoulys Chacin and his HR/FB ratio of 4.9%: the Rockies need to focus more on strikeouts than groundballs. While groundballs have helped limit home runs, the Rockies are still giving up plenty of hits, walks, and runs. Strikeouts need to enter the equation for the Rockies staff to be successful. For a couple of years they had the perfect marriage of both with Ubaldo Jimenez, but none of the three pitchers obtained in the trade with the Indians (a well-timed one) has panned out. Perhaps if the Rockies acquired strikeout pitchers, they could configure their rotation so that those pitchers threw more innings on the road. It’s not as if they haven’t utilized a non-traditional approach with their pitching staff before. The Rockies probably shouldn’t spend $250 million to acquire strikeout pitchers like the Yankees did with C.C. Sabathia and A.J. Burnett when they moved into the cozy confines of the new Yankee Stadium. More than twelve years later, the Mike Hampton signing still leaves a bad taste in their mouths. However, there is precedent for developing and acquiring strikeout arms at an affordable cost.

Look at the Reds. Since Walt Jocketty took over as GM in 2008, the Reds have managed to develop and acquire strikeout arms as a means to limit runs in homer-happy Great American Ballpark. The 2013 Reds are 4th in K/9 and 4th in xFIP and ERA. Aaron Harang’s contract was bought out once his strikeout stuff diminished. Homegrown product Tony Cingrani has been striking out hitters at an incredible rate in his first 100 innings with the big league club. The organization’s patience with Homer Bailey has been rewarded. Edinson Volquez posted excellent strikeout rates before being used to acquire Mat Latos and his devastating slider. And of course, they signed the flamethrowing international free agent Aroldis Chapman. While rotation mainstays Mike Leake and Bronson Arroyo are never going to blow anybody away with their stuff, their ability to limit walks has allowed the Reds to rely on them as back-end innings eaters. Chapman’s 6-year, $30 million deal is the biggest commitment to any of the above pitchers.

Currently, the Rockies farm system is not loaded with strikeout pitchers. Tyler Matzek is intriguing, but his strikeouts are way down this year as he has moved up a level and attempted to improve his control, and he is far from a sure thing. Tyler Chatwood has shown some promise at the big-league level, but his secondary pitches will have to be refined for him to have long-term success. Chad Bettis was blowing away Double-A hitters before a recent callup, but his innings against MLB competition have been predictably average. Most likely, this is not a quick fix, but more of a long-term strategy which will have to be implemented across several drafts.


Baseball’s Most Extreme Pitches from Starters, So Far

Introduction

After reading Jeff Sullivan’s piece entitled “Identifying Baseball’s Most Unhittable Pitches, So Far” on August 21, I found his methodology to be quite interesting. It was suggested in the comments that, rather than looking at whiff rate, we should consider who has allowed the weakest contact. Now, there are a couple of different ways to look at weakest contact. First, you could look at batted ball velocity. You could also look at batted ball distance. Both of these techniques would provide some measure of the severity of contact allowed by a pitcher. At the end of the day, though, a warning track fly ball is still as effective for a pitcher as a pop up. I thought it would be better to look at who got hurt the least with their pitches.

In saying that, I mean to look at what pitchers are theoretically giving up nothing but singles on a pitch versus what pitchers are theoretically giving up nothing but home runs.  A quick calculation to quantify this value is total bases per hit allowed (TB/H).  This is the same as the ratio between slugging percentage and batting average (SLG/AVG).  Values have to be between one and four.  A value of 1.00 corresponds to only singles.  A value of 4.00 corresponds to only home runs.  Any value in between could represent a combination of all hit types.
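To make the TB/H definition concrete, here is a minimal Python sketch. The function name and the hit totals in the example are hypothetical and are not taken from the leaderboards used in this post; the at-bat total is likewise made up, purely to show that TB/H equals SLG/AVG because the at-bats cancel.

```python
import math


def total_bases_per_hit(singles: int, doubles: int, triples: int, home_runs: int) -> float:
    """Total bases per hit allowed (TB/H), always between 1.0 and 4.0."""
    hits = singles + doubles + triples + home_runs
    if hits == 0:
        raise ValueError("TB/H is undefined when no hits have been allowed")
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / hits


# Boundary cases described in the text.
only_singles = total_bases_per_hit(10, 0, 0, 0)  # every hit is a single
only_homers = total_bases_per_hit(0, 0, 0, 10)   # every hit is a home run

# Hypothetical pitch: 6 singles, 3 doubles, 0 triples, 1 home run.
mixed = total_bases_per_hit(6, 3, 0, 1)

# Same value via SLG / AVG, using a made-up 40 at-bats ending on the pitch.
at_bats = 40
slg = (6 + 2 * 3 + 3 * 0 + 4 * 1) / at_bats
avg = (6 + 3 + 0 + 1) / at_bats
assert math.isclose(mixed, slg / avg)

print(only_singles, only_homers, round(mixed, 2))
```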

Baseball Prospectus provides PitchF/X leaderboards for eight different pitch types: four-seam fastball, sinker, cutter, splitter, changeup, curveball, slider, and knuckleball.  I chose to look at only starting pitchers in this study.  Also, to be considered, a pitcher had to have thrown at least 200 of the pitch of interest.  The league leaders in games started are just above 25.  If we are conservative and estimate 80 pitches per start, that allows for 2000 pitches thrown, so 200 would represent roughly 10% of the pitcher’s arsenal.  With that background information now covered, let’s look at the best and worst pitchers in each pitch type.  All data is accurate through August 22.
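The arithmetic behind the 200-pitch cutoff can be written out as a short Python sketch. The constants come straight from the paragraph above; the `qualifies` helper and its name are assumptions added for illustration, not part of the leaderboard tooling.

```python
STARTS = 25               # roughly the league lead in games started at the time
PITCHES_PER_START = 80    # conservative per-start estimate used in the text
MIN_PITCHES = 200         # minimum thrown of a given pitch type to be considered

estimated_total = STARTS * PITCHES_PER_START   # estimated pitches thrown so far
arsenal_share = MIN_PITCHES / estimated_total  # share of arsenal the cutoff represents


def qualifies(pitch_count: int, minimum: int = MIN_PITCHES) -> bool:
    """Return True if a pitcher has thrown enough of one pitch type to be included."""
    return pitch_count >= minimum


print(estimated_total, f"{arsenal_share:.0%}")
```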

Data

Four-Seam Fastball

Best (lowest TB/H)                  Worst (highest TB/H)
Pitcher             Team   TB/H     Pitcher             Team   TB/H
Jarred Cosart       HOU    1.20     Lucas Harrell       HOU    2.33
Tyler Chatwood      COL    1.20     Todd Redmond        TOR    2.20
Stephen Fife        LAD    1.22     Allen Webster       BOS    2.20
Bartolo Colon       OAK    1.26     Tyler Skaggs        ARI    2.15
Joe Kelly           STL    1.26     Erik Bedard         HOU    2.10

Sinker

Best (lowest TB/H)                  Worst (highest TB/H)
Pitcher             Team   TB/H     Pitcher             Team   TB/H
Brandon Cumpton     PIT    1.10     Yu Darvish          TEX    2.27
Taylor Jordan       WSH    1.10     Bud Norris          BAL    2.12
John Lackey         BOS    1.21     Aaron Harang        SEA    1.96
Gerrit Cole         PIT    1.22     Scott Kazmir        CLE    1.93
Jonathan Pettibone  PHI    1.22     Jon Lester          BOS    1.92
Wade Davis          KCR    1.22

Cutter

Best (lowest TB/H)                  Worst (highest TB/H)
Pitcher             Team   TB/H     Pitcher             Team   TB/H
Clay Buchholz       BOS    1.11     Jeff Samardzija     CHC    2.00
Jenrry Mejia        NYM    1.17     Jerome Williams     LAA    1.95
Lucas Harrell       HOU    1.20     Cole Hamels         PHI    1.90
Jonathon Niese      NYM    1.21     A.J. Griffin        OAK    1.86
Mike Pelfrey        MIN    1.31     Yu Darvish          TEX    1.85

Splitter

Best (lowest TB/H)                  Worst (highest TB/H)
Pitcher             Team   TB/H     Pitcher             Team   TB/H
Hiroki Kuroda       NYY    1.22     Ubaldo Jimenez      CLE    1.72
Jake Westbrook      STL    1.31     Tim Hudson          ATL    1.70
Jorge de la Rosa    COL    1.32     Dan Haren           WSH    1.69
Doug Fister         DET    1.33     Tim Lincecum        SFG    1.61
Hisashi Iwakuma     SEA    1.33     Jason Marquis       SDP    1.58

Changeup

Best (lowest TB/H)                  Worst (highest TB/H)
Pitcher             Team   TB/H     Pitcher             Team   TB/H
Stephen Strasburg   WSH    1.00     John Danks          CHW    2.21
Matt Harvey         NYM    1.06     Jeremy Hefner       NYM    1.96
Gio Gonzalez        WSH    1.10     Dan Straily         OAK    1.91
Francisco Liriano   PIT    1.14     Randall Delgado     ARI    1.89
Bud Norris          BAL    1.22     Edinson Volquez     SDP    1.87

Curveball

Best (lowest TB/H)                  Worst (highest TB/H)
Pitcher             Team   TB/H     Pitcher             Team   TB/H
Clayton Kershaw     LAD    1.00     Homer Bailey        CIN    2.33
Jason Hammel        BAL    1.00     Zack Greinke        LAD    2.09
C.J. Wilson         LAA    1.07     Wandy Rodriguez     PIT    2.00
Dillon Gee          NYM    1.14     Tim Hudson          ATL    2.00
Max Scherzer        DET    1.17     John Lackey         BOS    2.00

Slider

Best (lowest TB/H)                  Worst (highest TB/H)
Pitcher             Team   TB/H     Pitcher             Team   TB/H
Tyson Ross          SDP    1.00     Jordan Zimmermann   WSH    2.24
Jorge de la Rosa    COL    1.17     Wade Miley          ARI    2.07
Bartolo Colon       OAK    1.18     Dallas Keuchel      HOU    2.06
Jeremy Hefner       NYM    1.24     Carlos Villanueva   CHC    2.06
C.J. Wilson         LAA    1.24     Hisashi Iwakuma     SEA    1.96

And for completeness,

Knuckleball

Pitcher             Team   TB/H
R.A. Dickey         TOR    1.68

Combining all that data together, we get the following five pitches as the best in baseball so far.

Pitcher             Team   Pitch       TB/H
Stephen Strasburg   WSH    Changeup    1.00
Clayton Kershaw     LAD    Curveball   1.00
Jason Hammel        BAL    Curveball   1.00
Tyson Ross          SDP    Slider      1.00
Matt Harvey         NYM    Changeup    1.06

Also, to complete the picture, here are the worst five pitches in baseball so far.

Pitcher             Team   Pitch       TB/H
Lucas Harrell       HOU    Four-Seam   2.33
Homer Bailey        CIN    Curveball   2.33
Yu Darvish          TEX    Sinker      2.27
Jordan Zimmermann   WSH    Slider      2.24
John Danks          CHW    Changeup    2.21

Analysis

As you can see, there are a lot of “good” pitchers that throw “lousy” pitches. This metric is far from perfect. For example, Yu Darvish appears in the bottom five in two different categories. Does that mean Darvish should stop throwing his sinker and cutter? No, it most certainly does not. It just shows that when Darvish makes an (albeit rare) mistake with either pitch, hitters are mashing it. I found this a fun exercise that yielded results that may not be the most meaningful but that are interesting for discussion nonetheless.


Pitcher STUFF Ratings or, It’s Too Bad Rich Harden Couldn’t Stay Healthy

Of course, the concept of “stuff” is very subjective, and my formula is not so much of an attempt to quantify a subjective concept as it is an attempt to measure how well pitchers do things we associate with great stuff. Because I used Pitch f/x data exclusively, the ratings were limited to pitchers from 2007 to the present.

My formula is ((4*O-Zone Swing% *O-Zone Whiff%)+(3*Whiff%)+(5*Zone-Whiff%)+(2*IFFB%)*(FBv/100)*(4))
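As printed, the formula can be read more than one way. Here is a minimal sketch under two assumptions of mine: the velocity term (FBv/100)*4 scales the whole weighted sum rather than only the IFFB% term, and every rate is entered as a fraction (0.25 for 25%). That reading lands a strong starter inside the 3.4-to-9.7 range reported for starters. The function name and the sample inputs are made up for illustration and are not any real pitcher's numbers.

```python
def stuff_rating(o_swing, o_whiff, whiff, zone_whiff, iffb, fb_velo):
    """STUFF rating under the assumed grouping: (weighted sum) * (FBv/100) * 4.

    Rate inputs are fractions (0.25 means 25%); fb_velo is in mph.
    """
    weighted_sum = (
        4 * o_swing * o_whiff  # chases outside the zone that also miss
        + 3 * whiff            # overall whiffs per swing
        + 5 * zone_whiff       # whiffs on pitches in the zone
        + 2 * iffb             # infield fly ball rate
    )
    return weighted_sum * (fb_velo / 100) * 4


# Hypothetical inputs, chosen only to show the scale of the output.
example = stuff_rating(o_swing=0.30, o_whiff=0.50, whiff=0.25,
                       zone_whiff=0.15, iffb=0.10, fb_velo=92.0)
print(round(example, 2))  # 8.46
```

If the velocity term were meant to apply only to IFFB%, the same inputs would score far lower, so the grouping matters a great deal for anyone trying to reproduce the list.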

I will probably tinker with the formula, and will welcome any suggestions with regard to improving it. I have only applied it to starting pitchers. Of course it can be applied to relievers, but their scores run much higher unless some kind of a “relief penalty” is applied. The STUFF ratings for all starting pitchers who threw at least 160 innings since 2007 run between 3.4 and 9.7. The following list presents the top 15 career STUFF pitchers since 2007.

1. Rich Harden 9.7. If you’re having trouble remembering just how filthy Harden could be, visit his player page. Harden got swings and misses like no other starter. In 2008 he had an unearthly 48 ERA- and 68 xFIP- despite the fact that injuries had already started to take their toll on his fastball velocity, as it dropped to 91.7, compared to 94.1 the year before. In 141 innings in 2009, he got whiffs on 22.6% of swings on pitches in the zone. Max Scherzer, the 2013 leader in that category, gets whiffs in the zone at an 18.4% clip. When Aroldis Chapman averaged 100 mph on his fastball in 2010, he sat at 21.9%. Unfortunately, a litany of injuries would decimate Harden’s career, and he was recently released by the Twins, an organization known for its disdain for swing-and-miss stuff.

2. Matt Harvey 9.4. The young right-hander with the dynamic fastball places near the top in all five of the STUFF factors, with only Scherzer, Harden, and Escobar topping his 17.6 Zone-Whiff%. Besides the fastball, Harvey also features a slider, curveball, and changeup. Harvey’s plethora of filthy offerings produces whiffs on over a quarter of his pitches overall. Furthermore, Harvey is one of the rare pitchers who has actually experienced an increase in fastball velocity since his debut season.

3. Yu Darvish 9.2. Darvish uses his assortment of pitches to produce whiffs on over half of swings at pitches he throws outside of the zone, easily the best in the sample. Combine that with a whiff rate of 15.9%  for swings on pitches in the zone and you get an overall whiff rate of 28.6%, also the best in the sample. Pitch f/x credits Darvish with six different pitches, four of which he throws at least 12 percent of the time. Though Darvish averages 92.9 mph on his fastball, he has thrown his slider nearly as often as his four-seamer and two-seamer combined. The unconventional approach has produced five games of 14+ strikeouts in 2013.

4. Kelvim Escobar 8.9. Escobar only had one year of data, but what a year it was. At the age of 31, Escobar’s fastball velocity surged to 94.1, higher than any of the pre-pitch f/x years, and he utilized an excellent changeup to get whiffs on over a third of swings at pitches he threw outside of the zone and a quarter of swings overall. However, in spring training of 2008, Escobar was diagnosed with a shoulder injury that required surgery, and except for a 5-inning stint in 2009, he never returned to the majors.

5. Michael Pineda 8.7. Like Escobar, Pineda only has one year of data in the sample due to shoulder surgery. Elite fastball velocity combined with a slider that helped generate swings on a third of the pitches he throws out of the zone and contact on less than sixty percent of those swings earns him this ranking. The big righty also used his height to get one of the highest infield fly rates in the sample. Pineda was placed on the DL shortly after an August 2 rehab start resulted in stiffness in his shoulder, and it appears unlikely that the righthander will pitch again in 2013.

6. Matt Moore 8.6. While Moore’s fastball velocity has dipped steadily since he came into the league in 2011, its overall average is still 93.6. Moore’s ranking is based heavily on his 2012 STUFF rating of 9.3; his 2013 rating has fallen to 7.4. Moore has battled elbow soreness this year, and hopefully this will not be a long-term issue and he can return to the form that generated a dominant 19.0 Zone-Whiff% in 2012.

7. Francisco Liriano 8.6. Liriano’s slider has long been one of the best pitches in the game, and only Darvish can top his whiff rate on pitches outside the zone. Since joining the Pirates, Liriano has been using the slider even more, throwing it on 37.1% of his pitches. Liriano is also throwing his changeup more than he ever has before. While his 13.1 Zone-Whiff% in 2013 is one of the lowest numbers of his career, the offspeed pitches have resulted in a 36.1% chase rate, the highest of his career. It’s anyone’s guess as to how long Liriano’s oft-troubled elbow holds up, but Pirates fans should enjoy the ride while it does.

8. Cole Hamels 8.5. A master of deception, Hamels’ changeup has helped him produce a career whiff-rate of 24.5%. Among pitchers on this list, Hamels’ 90.9 mph fastball is faster than only that of fellow changeup artist Johan Santana. However, the 8-9 mph difference between his fastball and changeup produces a 33.8 chase rate, the 5th highest in the sample, and his 37.0 rate in 2013 leads the majors. Hamels has also been very durable; among the top 15 STUFF pitchers, only Justin Verlander has thrown more innings.

9. Stephen Strasburg 8.5. While Strasburg’s fastball velocity has fallen from its pre-Tommy John high of 97.6, his 95.9 average still tops Felipe Paulino, the next closest in the sample, by 0.7 mph. While we will probably not see the pure electricity of the pre-injury Strasburg which produced a 9.5 STUFF rating in 2010, Strasburg still gets whiffs on over 15% of swings on pitches in the zone and 25% overall. If the Nationals’ controversial innings-management plan pays dividends and the 25-year-old can stay healthy, he should be getting whiffs for years to come.

10. Max Scherzer 8.3.  It seems fitting that a noted sabermetrician would obtain a high ranking on a list based on Pitch f/x and batted-ball data. To the misfortune of AL hitters, Scherzer has vastly improved his secondary pitches while maintaining his fastball velocity. Before his trade to the Tigers, Scherzer threw his fastball over two-thirds of the time. With the Tigers, Scherzer’s fastball usage has decreased each year, and his use of secondary pitches, particularly his changeup, has increased. Not surprisingly, this has resulted in higher chase and whiff rates, and his Zone-Whiff%  of 19.9 since 2012 leads the majors.

11. Clayton Kershaw 8.1. Kershaw burst onto the scene in 2008 as a 20 year-old rookie with a 94 mph fastball and 73 mph 12-6 curveball. Since then he has added a slider to make life even more miserable for hitters. Kershaw ranks near the top in all five of the STUFF factors. Kershaw appears to be the odd bird that can use his pitch arsenal as much to suppress BABIP as to generate swings and misses, and this factor probably keeps him from being ranked even higher.

12. Tim Lincecum 8.0. You would be hard-pressed to find a smaller starting pitcher than Lincecum. While that height limits his ability to get infield flies, the dynamic changeup more than compensates for his lack of size. Of the top 15 pitchers, only Darvish and Liriano have higher whiff rates on swings at pitches out of the zone. Lincecum’s fastball velocity has steadily dropped from its high of 94.0 in 2008 to 90.2 in 2013. Since 2011, Lincecum has been throwing a slider more often, and while he has been prone to the longball, he still gets whiffs on a quarter of swings. While Lincecum is no longer the pitcher that won Cy Young awards in 2008 and 2009, he is a very intriguing free agent, and at the least, it seems that he could be a dominant reliever.

13. Chris Sale 8.0. The lanky, or perhaps paper-thin, lefthander has made a successful transition from the bullpen to the rotation. After experiencing a predictable velocity drop from the move, Sale has actually regained some of that velocity this year, as his fastball has jumped from 91.3 to 92.4. Since moving to the rotation, Sale has added a changeup to go along with his excellent slider. Sale’s herky-jerky sidearm delivery and late movement have helped him generate a 32% chase rate, 5th best among pitchers on this list. While concerns about Sale’s elbow and durability are certain to persist, Sale is on pace for over 200 innings this year after throwing 192 last year.

14. Johan Santana 7.9. Shoulder troubles robbed Santana of some of his fastball velocity, and his average of 90.3 is the slowest among pitchers in the top 15. However, his changeup was devastating. In its heyday in 2007, Santana had a Zone-Whiff rate of 23.2%. While some of Santana’s best years were in the pre-Pitch f/x era, the Mets still got highlights such as a 36.0 chase rate in 2009, and the no-hitter in 2012. Santana’s changeup also had the effect of suppressing BABIP,  as noted by a .276 career mark. Of the top 15, only youngsters Harvey and Moore can top Santana’s 12.9 IFFB%.

15. Justin Verlander 7.9. It took Verlander a couple of years to fine-tune the curveball, but when he did, he started churning out elite swing-and-miss rates. Since 2012, Verlander has been utilizing the changeup more than the curveball, and it too has produced excellent whiff rates. The secondary offerings go along with an average fastball velocity of 94.8 that only the less battle-tested Stephen Strasburg, Matt Harvey, and Felipe Paulino can top. Since 2007, Verlander has thrown over 100 more innings than Cole Hamels, the next closest person on this list.

Clearly, the list favors younger, less tested pitchers. But I don’t think there’s anything wrong with that. As pitchers age, their velocity declines, and while Felix Hernandez is a better pitcher throwing 92 than when he was a young flamethrower, he probably doesn’t create the same kind of excitement in fans or fear in hitters as when he averaged 96 with his fastball.

I also made a list of the worst 15 starting pitchers by STUFF since 2007. I didn’t think it would be worth anyone’s while to go through the list, but suffice it to say that the worst three were Steve Trachsel, Sidney Ponson, and Livan Hernandez. Yeah, I’d say that sounds about right. Aaron Cook of the 1.9 K/9 in 2012 also made the list. The following table is a comparison of the best and worst 15 starting pitchers since 2007 by STUFF rating.

| | BABIP | LOB% | xFIP- | ERA- |
|---|---|---|---|---|
| Best 15 | 0.284 | 75 | 85 | 82 |
| Worst 15 | 0.304 | 74 | 107 | 112 |

So the best STUFF pitchers seem to have an ability to limit hits on balls in play and overachieve their peripheral stats, while the worst STUFF pitchers allow hits at slightly above the league average and underachieve their peripherals. Some of this is due to infield flies, which was a factor in the STUFF formula. The best 15 had an IFFB% of 11.0, while the worst 15 had an IFFB% of 7.4. But there are other factors involved. Tim Lincecum has a 7.4 IFFB% and a .296 BABIP, while Nick Blackburn has an 8.6 IFFB% and a .309 BABIP; the BABIPs of their respective teams since 2007 are .297 and .300. Both of these pitchers are well past the stabilization point for BABIP. So it seems that pitchers with dominant STUFF have some control over hits on balls in play outside of IFFB. Of course I cherrypicked an example, and I’m sure there are counterexamples, but the general idea seems good. Great STUFF can have an effect beyond generating swings and misses.


Concerning Chris Archer’s Future; A Disappointing Comparison

Young players are exciting. They’re fun to watch, fun to talk about, and especially fun to project, and young players that succeed early in their careers are even more exciting. If, over the next few weeks, you find yourself sitting in Progressive Field holding a $4 beer (yes, they’re that cheap) while watching the Indians play a meaningful late-season game for the first time since 2007, mention Danny Salazar to the fans in your section. About the worst thing you’ll hear someone say about him is, “Salazar? Potential front-line arm, but I dunno, maybe he throws too hard?”

I’m just as fascinated with young talent as those title-starved Indians fans drinking their reasonably priced beverages, and one player who’s caught my eye this year is Chris Archer, the 24-year-old, flame-throwing pitching prospect currently shutting down MLB lineups to the tune of a 2.95 ERA over 15 starts this season. When a pitcher with Archer’s level of raw talent shows flashes of that potential brilliance right out of the gate, it’s easy to get carried away and envision him turning into the next Max Scherzer (who Harold Reynolds thinks is the AL Cy Young, hands down), but is that a fair comparison? Are we putting too much emphasis on Archer’s string of early successes?

Unfortunately, we can’t really know the answer to that question until Archer himself is 29 years old and either anchoring the front end of an MLB rotation, filling in at the back end, contributing out of the bullpen, or worse. Fortunately, we can speculate. Even more fortunately, there’s a wealth of data and numbers from which we can speculate.

Using pitch data compiled by FanGraphs and readily available on player pages and custom leaderboards, I looked at every player-season from 2002-2012 for Archer’s closest pitching comparison. I considered factors such as pitcher age and experience, pitch usage rates, velocities, and effectiveness, batted ball distribution, strikeout and walk rates, and even non-pitching factors like height and handedness, which matter for release point and pitch trajectory.
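For anyone who wants to try a search like this, one common approach is a nearest-neighbor comparison: scale each factor so that mph, percentages, and years are on comparable terms, then pick the player-season with the smallest distance to the target. The sketch below is only an assumption about how such a search could be set up, not a description of the exact weighting used here, and the feature names and numbers are invented placeholders rather than real player data.

```python
from math import sqrt
from statistics import pstdev


def closest_comp(target, candidates):
    """Return the name of the candidate season nearest to the target.

    Each feature difference is divided by that feature's standard deviation
    across the pool, so no single unit dominates the distance.
    """
    features = list(target)
    pool = [target] + list(candidates.values())
    spread = {}
    for f in features:
        sd = pstdev([row[f] for row in pool])
        spread[f] = sd if sd > 0 else 1.0  # guard against a constant feature

    def distance(row):
        return sqrt(sum(((row[f] - target[f]) / spread[f]) ** 2 for f in features))

    return min(candidates, key=lambda name: distance(candidates[name]))


# Invented placeholder seasons, for illustration only.
target = {"age": 24, "fb_velo": 95.0, "slider_pct": 30.0, "k_pct": 20.0, "bb_pct": 9.0}
candidates = {
    "Season A": {"age": 23, "fb_velo": 94.5, "slider_pct": 28.0, "k_pct": 18.0, "bb_pct": 10.0},
    "Season B": {"age": 31, "fb_velo": 89.0, "slider_pct": 10.0, "k_pct": 15.0, "bb_pct": 6.0},
    "Season C": {"age": 26, "fb_velo": 92.0, "slider_pct": 20.0, "k_pct": 24.0, "bb_pct": 7.0},
}
print(closest_comp(target, candidates))  # Season A
```

Categorical factors such as handedness would need to be handled separately, for example by filtering the candidate pool before measuring distance.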

After crunching the numbers, I am officially proclaiming Edwin Jackson the winner.

On the surface, this comparison makes some sense. Back in 2007, Jackson was a 23-year-old pitcher of the same height and weight as Archer is currently listed, and both were getting their first extended major league looks. Jackson was drafted out of high school in the sixth round of the 2001 amateur draft. Archer was drafted out of high school in the fifth round of the 2006 amateur draft.

Both paid their dues in the minors as Jackson compiled a 4.39 ERA over 556 innings in parts of six minor league seasons, whereas Archer was slightly better with a 3.77 ERA in 769.2 innings over parts of eight seasons. Both showed big-time velocity but struggled with control. Jackson’s strikeout rate in the minors was lower than Archer’s, but so was his walk rate. All told, Jackson posted a strikeout-to-walk ratio of 1.91. Archer’s was nearly identical at 1.80.

Pretty similar, huh? Well, it gets a little eerier.

The table above shows Archer’s pitch usage, velocity, and effectiveness over his first 15 starts of 2013 versus what Jackson did during his first full season back in 2007. Right off the bat, we see three-pitch pitchers who featured a fastball and slider prominently while occasionally mixing in a change-up. Both could dial up the heat, and both used the slider as their out pitch.

By now I think we’ve done a pretty good job of establishing just how similar these two pitchers are (though if you’d like, you can check out the deliveries of Jackson here and Archer here), but what does that mean for Archer’s future? Let’s pretend for a moment that he does in fact follow in Jackson’s footsteps. How good has Jackson been?

Well, Jackson’s been about as average as they come. His career ERA is 4.45, and he’s never finished with a single-season ERA better than 3.62. In six-plus seasons since Jackson became a full-time starter in 2007, there have been 50 pitchers that have logged over 1,000 innings (Jackson has tossed 1,295). Of those 50, Jackson’s 4.36 ERA ranks 44th, ahead of only Kevin Correia, Jason Marquis, Barry Zito, Roberto Hernandez/Fausto Carmona, Joe Blanton, and Livan Hernandez. Jackson’s 17.6 WAR over that span is good for 26th, but his WAR/IP drops him down to 35th. His most notable accomplishments are ranking 17th among that group in innings pitched and 11th in games started.

Jackson has been very durable, and there’s something to be said for durability, but if all Archer turns into is a league-average starter best known for taking the mound every fifth day, then Rays fans will long for the days when Archer unexpectedly bolstered Tampa’s rotation and when he showed filthy stuff, a fiery demeanor and, most importantly, promise.


Major League Baseball Should be All Over the Quantified Self Movement

This post originally appeared in slightly different form on my blog: Biotech, Baseball, Big Data, Business, Biology…

Baseball players break down.  Their performances fluctuate.  As a group there are some interesting generalities with respect to how pitching, hitting and fielding change with age.  But the error bars are huge.  There are many things we still don’t know about baseball players, about why one prospect hits the ground running and another flames out.  And we also don’t know if there is any way to know, since the task of putting together the skills needed to play major league baseball may be one of the most complex of the major sports, and understanding complexity is hard.

But it seems worthwhile to give it a try.

The Mystery of the Missing Ligament

Let’s talk about R.A. Dickey for a minute.  Not because he’s a highly interesting human being, although he is.  And not because he’s a knuckleballer, which is fun and interesting due to rarity and the entertaining sight of six foot athletes flailing at baseballs traveling with the flight path of a drunken small-nosed bat.  But rather because he was drafted in 1996 in the 1st round by the Texas Rangers, and only during his physical workup was it discovered that he was missing a key ligament in his arm.  The Ulnar Collateral Ligament (UCL) to be exact.  Without which, it is assumed, a pitcher cannot pitch.

Well, except that he did. This shouldn’t be under-emphasized. Pitching without a UCL is thought to be akin to trying to play tailback for the Seahawks without an Anterior Cruciate Ligament (ACL) in your knee. And yet he pitched and pitched well for years without a UCL. R.A. Dickey got his UCL replaced and then knocked around the major and minor leagues for several years, eventually learned how to throw a knuckleball, and now has pitched successfully in the majors for several years more.

A story like this illustrates two points.  One, we may be making assumptions that aren’t always supported by the data—for example, that the UCL is required for pitching.  And two, you can learn a lot just by looking and measuring.

Measure by Measure

What should be measured and how? I think an area to look into might be the tools being developed now to support self-measurement. The quantified-self movement has gained enough prominence that magazines like Newsweek are running profiles. For people in the movement, the motivation for participation stems from a desire to better understand themselves; to have data that will give them a clearer view of what is going on in their bodies and minds. The goals are often better health, losing weight, tracking mood, athletic prowess, increasing the levels of good indicators and decreasing the levels of the bad.

One of the distinguishing elements of how this is being done is granularity.  Apps on a smartphone, portable electronic devices, and logging tools can capture data in intervals ranging from several times a day up to a more or less continuous stream.  Even tests and procedures that might normally be performed once a year at an annual physical become fair game for more frequent monitoring, as long as you have the money to pay for the testing.  The open question is whether collecting all of this data will reveal new insights.  Or, to put it graphically, if you tested a metric infrequently, and got this graph:

graph1

Would the result of more frequent testing look like this?

graph2

Or like this?

graph3

This example is borrowed from the site of Ginger IO, a company that is developing tools for continual measurements of health related metrics, among other things.

Where baseball comes into this is that, I believe, MLB teams are continually searching for new ways to gain an advantage in building a quality team. You know, that extra 2%. A baseball team has vast resources, and those resources are focused on getting the most out of the several hundred baseball players that comprise the major and minor league talent of the team. There are trainers, and doctors, and team dieticians, and masseuses, and coaches. What would it take to add an additional technological and analytical group dedicated to gathering data on the players and seeing whether any of this information provides additional retrospective or prospective insight into individual performance?

Here is where an enterprising team could probably reach out to a couple of different groups for help in setting this up.  One would be device and software manufacturers who are building tools in this space.  I’ve written before about EmotionSense and have also learned recently about GingerIO (HT to @Dshaywitz).  Another highly interested party would be the nearest medical school and those researchers looking into patient reported outcome (PRO) techniques and patient monitoring efforts.  If an MLB team doesn’t already have its own high-powered statistical analysis group (or even if it does), it could reach out to suppliers of software tools for analyzing large scale datasets and finding patterns, like Ayasdi or Google.

I could also see a viable group for a partnership being other professional sports teams. Many MLB teams are in the same city as NFL, NBA, NHL, and/or MLS franchises. To spread the investment costs as well as provide control groups for each other, it would be useful to collaborate with these other franchises to learn more about the effect of sports training in general.

A speculative area for data collection and analysis could be in genomics, transcriptomics and proteomics.  Michael Snyder of Stanford University has been demonstrating for some years now how a program of monitoring personal molecular information about one’s health, along with other more conventional measures, provides new insights into health and disease.

The metrics should also include the conventional.  Going back to the example of R.A. Dickey, wouldn’t it be useful to perform elbow and shoulder scans for every player on major and minor league rosters on at least a yearly basis?  So often in sports you hear the term “typical wear and tear” when describing an elbow or shoulder or knee.  My question is, how do you know it’s typical?  Until you have a large, well-defined baseline that you follow for years under the rigorous conditions that baseball players are subjected to, how can you know what real wear and tear is?  And if you did know, wouldn’t that help you in making decisions about training and protecting your own players, to say nothing of evaluating free agents?  One of the truisms of baseball is that every team knows more about their own players than anyone else, leading to information asymmetry in trading and signing.  It seems an imperative for each team to reduce or reverse that asymmetry if at all possible.

An additional area that personal monitoring could help in is understanding on-field performance.  I’ve already touched on how MLB could use various kinds of GPS and positioning sensors to more accurately measure defense, for example, so I won’t elaborate further except to point out Chip Kelly is bringing this approach to the Philadelphia Eagles, and it will be interesting to see if we get reports on the effectiveness of using GPS to monitor his NFL players’ movements.

Biological passports

Another benefit of building a baseline for different kinds of metrics in your team would be helping to detect the possibility of doping.  This seems to be in the news right now for some reason, so let me just say that if a team began collecting, analyzing and storing biological samples on a regular basis, this would help in detecting those who are taking performance-enhancing substances.  This isn’t a new idea; the World Anti-Doping Agency is advocating this approach already.  However, I think MLB could take it to a high level of rigor and quality.  Would this have to be negotiated?  Sure.  But there is probably no better time than now to see if such an agreement can be forged between the union and the MLB owners.

Essentially, by taking samples from enough players over time, as well as healthy, age and ethnicity-matched volunteers as a control group, an MLB team could build up a comprehensive profile of what normal is with respect to the known indicators of performance enhancement such as hematocrit levels, not just as an average, but on an individual basis. With this kind of data, a rapid, unusual change in specific metabolites could provide grounds for more intensive investigation. When an athlete comes up with a positive test, a standard argument has been that he or she has always had an unusually high level of the tested substance. Well, you know, the only way to know that for sure is to have a record dating back years that demonstrates outlier status or not for that athlete and that test. Continual sampling is almost certain to deter many would-be attempts to use performance enhancing substances.
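To make the individual-baseline idea concrete, here is a minimal sketch that compares a new reading against a player's own history instead of a league average. The z-score rule, the threshold of 3, and the sample readings are all my own assumptions for illustration; this is not how WADA or any league actually evaluates samples.

```python
from statistics import mean, stdev


def flag_unusual(history, new_value, z_threshold=3.0):
    """Flag a reading that sits far outside this player's own past readings."""
    baseline = mean(history)
    spread = stdev(history)  # needs at least two past readings
    if spread == 0:
        return new_value != baseline
    z_score = (new_value - baseline) / spread
    return abs(z_score) > z_threshold


# Hypothetical hematocrit readings (%) for one player, invented for illustration.
past_readings = [44.1, 43.8, 44.5, 44.0, 43.9, 44.3]

print(flag_unusual(past_readings, 44.4))  # False: within the player's normal range
print(flag_unusual(past_readings, 49.0))  # True: a sharp departure from baseline
```

The same history also cuts the other way: a player whose readings have always run high would have a high baseline, so a high value alone would not be flagged.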

This would be invasive.  No doubt about it.  Which is why there should also be stringent controls on data and better maintenance of privacy than we’ve seen so far in the Biogenesis saga.  However, there is also probably no better time to negotiate these kinds of tests as baseball strives to clean its image again.

Too much data?

Of course, collecting all this data provides no guarantee of actually finding out something specifically useful and actionable for any given MLB team. As Nate Silver has pointed out many times in his columns and book, given enough data you can find a correlation for almost anything. However, one thing is certain: you can’t find new things when you don’t look, and trying to apply concepts of the quantified self to MLB teams will lead to a whole lot of cross-discipline interactions and innovative thinking, which a forward-looking team might be able to parlay into the next big market inefficiency in baseball.


Does Your Team Have a Winning Core? Profiling Sustainable Roster Construction

Thanks to an atrocious month of May, the 2013 Milwaukee Brewers were abruptly transformed from a fringe contender into a rebuilding baseball club.

Most people agree that the Brewers need to build a new core, but what does that mean? Many teams have young players in the midst of an above-average season, but that doesn’t necessarily translate to sustainable success for the roster as a whole. And the opinions expressed about so-called core players are usually subjective and not expressed in a way that allows direct comparisons between teams.

We could really use a metric to compare the rosters of teams who are developing potentially sustainable talent with those who aren’t. My effort to do this is called Core Wins, which summarizes the extent to which a team’s success is being driven from players most likely to constitute core talent, as opposed to players on their way out the door, probably in decline, or both.

To do this, we need to define what it means to be a core player, and specifically the factors by which we evaluate a core player’s respective contributions to the team.

The Core Player

In my view, core players do three things: (1) contribute significantly to their team’s success, (2) do so while under extended team control, and (3) do so at or before they reach their peak ages of likely productivity. Each of those attributes needs to be mathematically summarized to reduce these contributions to a measurable value.

The first factor is the easiest: a core player is expected to contribute, and to do so above what could be found in an entry-level minor-league call-up. A major league player’s ability to do so over the course of a season is commonly summarized in some version of the wins above replacement (WAR) metric, which attempts to combine the player’s batting, fielding, and if applicable, pitching contributions. A counting statistic also fits our needs best, since we are looking for aggregate contributions over the course of a single season. So, we’ll use WAR, as calculated by Fangraphs (fWAR).

The second factor, team control, is more complicated. Player control comes in two primary forms: (1) players under club control due to the terms of baseball’s collective bargaining agreement, and (2) players who have signed freely-negotiated contracts. The collective bargaining agreement keeps players under club control for at least six major league years. Free agent contracts range from one-year stop-gaps to those lasting a decade or longer. Most ballclubs are a collection of young players under sustained club control, long-term (and typically expensive) free agents, and stopgap players on value contracts. But teams with a sustainable core should be drawing significant production from players who will actually be around in future years. If too much production is coming from departing or declining players, the club is asking for trouble.

The third factor — player age — is less significant, but still important. Younger players are cheaper than older players, and thus easier to afford and keep around. Younger players are less frequently injured, meaning they will be in the lineup more often. Younger players who have not yet reached their peak production age will also probably continue to improve, whereas players beyond their peak age will probably decline.

However, age can be overemphasized. The primary advantage of youth— extended club control — is already being considered. Moreover, mature players signed to long-term contracts tend to be some of the most valuable players in the game — Joey Votto, Felix Hernandez, and their peers. And while prospects are important, most ballclubs would strongly prefer Joey Votto over a 22-year old prospect who may, but probably won’t, someday turn into Joey Votto. So while age matters, it is not as important as control.

So to summarize: we need to weigh player value, but do it in a way that primarily emphasizes team control while still placing some value on a player’s age.

Method

Player Contributions

All WAR figures were drawn from Fangraphs. The figures for batting fWAR (which incorporates fielding) and pitching fWAR were combined into one spreadsheet for each team year. When a player generated values for both batting (plus fielding) and pitching WAR, those values were summed, including the effect of any negative values. Once a net value was obtained for all players on a team roster for the year, all zero or net negative WAR values were disregarded.

Player Control Index

Player control numbers were drawn primarily from Cot’s Contracts, and cross-checked with Baseball Reference, other sources, and common sense as needed. Cot’s provides individual player contract data from 2009 onward, so only data from 2009 through 2012 was used. Control years were weighted identically, regardless of whether they arose from the CBA or a free agent contract. A player subject to a club option was considered to be under club control for that year. The author’s best estimate of remaining club control was necessary in a few cases when contract details were unclear, but not surprisingly, most of those players were fringe contributors that would not constitute core talent anyway.

A player was assigned one control year if his contract expired after the current season, two control years if his contract expired after the following season, and so on. For practical reasons — including the frequent shuffling from the minors experienced by young players, and the oft-diminishing returns of the longest contracts — the maximum number of control years considered for a player was 5. A Control Index was then calculated for each player in each roster year, with the number of control years as numerator, and an assigned denominator of 2 — for the minimum years that would constitute extended organizational control. So, for example, a player with an expiring contract would have a Control Index of 0.5 (1 season left divided by 2), and a typical player in their final pre-arbitration year would have a Control Index of 2.0 (4 seasons of control divided by 2). The maximum Control Index is 2.5.
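As a minimal sketch of the Control Index arithmetic described above (the function and constant names are illustrative assumptions, not taken from the author's spreadsheet):

```python
MAX_CONTROL_YEARS = 5    # cap on control years, as described in the text
CONTROL_DENOMINATOR = 2  # minimum years treated as extended organizational control


def control_index(control_years: int) -> float:
    """Return the Control Index for a player with the given control years."""
    if control_years < 1:
        raise ValueError("a rostered player has at least one control year")
    # Contracts longer than five years are capped at five.
    capped_years = min(control_years, MAX_CONTROL_YEARS)
    return capped_years / CONTROL_DENOMINATOR


print(control_index(1))  # expiring contract
print(control_index(4))  # typical final pre-arbitration year
print(control_index(8))  # long contract, capped at five control years
```

The cap is what keeps the index between 0.5 and 2.5, no matter how long the contract runs.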

Age Index

A player’s “baseball age” — their age on July 1 of a given season — was drawn from Fangraphs. An Age Index was then calculated for each player using an assigned value for a typical peak performance age as the numerator and the player’s baseball age for each season as the denominator. There has been some debate on the overall peak performance age for ball players, but, taking a strong hint from one of my reviewers, I used 27. To give some sense of the value range, the Age Index in 2012 for Mike Trout would have been 1.35 (27/20) and for Livan Hernandez would have been 0.73 (27/37).
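The Age Index can be sketched the same way. The names below are hypothetical, and the peak age is left as a parameter in case a different peak is preferred:

```python
PEAK_AGE = 27  # peak performance age chosen in the article


def age_index(baseball_age: int, peak_age: int = PEAK_AGE) -> float:
    """Return peak age divided by the player's baseball age (age on July 1)."""
    if baseball_age <= 0:
        raise ValueError("baseball age must be positive")
    return peak_age / baseball_age


# The two examples from the text, rounded to two decimals as in the tables.
print(round(age_index(20), 2))  # Mike Trout, 2012
print(round(age_index(37), 2))  # Livan Hernandez, 2012
```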

Determining Core Win Value

In my formula, Core Win Value is a weighting exercise. To calculate a player’s Core Win Value to a roster, I multiplied the player’s net fWAR for each season by the Control Index and the Age Index. The Control Index has a greater range (0.5 to 2.5) and thus a greater potential weight than the Age Index, which seems appropriate for the reasons stated above. The combined effect of these indices means young prospects who produce at a level of 2 fWAR or higher are weighted the most heavily. This makes sense: players who promptly adjust to the difficulty of the major leagues, yet still have years of probable improvement ahead of them, all while under extended team control, are those most likely to constitute a sustainable core of talent for the ballclub.
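Putting the pieces together, the weighting might be sketched as follows. This is an illustration under stated assumptions rather than the author's actual spreadsheet: the function name is hypothetical, and non-positive net fWAR is mapped to zero to mirror the rule that such seasons were disregarded.

```python
PEAK_AGE = 27            # peak performance age used for the Age Index
MAX_CONTROL_YEARS = 5    # cap on control years
CONTROL_DENOMINATOR = 2  # minimum years of extended control


def core_win_value(net_fwar: float, baseball_age: int, control_years: int) -> float:
    """Weight a season's net fWAR by the Control Index and the Age Index."""
    if net_fwar <= 0:
        # Assumption: zero or negative seasons contribute nothing to the core.
        return 0.0
    control_index = min(control_years, MAX_CONTROL_YEARS) / CONTROL_DENOMINATOR
    age_index = PEAK_AGE / baseball_age
    return net_fwar * control_index * age_index


# Inputs copied from the Rays tables later in the article.
print(round(core_win_value(7.5, 23, 5)))  # Evan Longoria, 2009
print(round(core_win_value(7.4, 28, 1)))  # Carl Crawford, 2010
```

The two calls show both ends of the weighting: a young star with maximum control is multiplied up, while an excellent season from a player in his final control year is cut roughly in half.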

Discussion

Now that we have a formula for Core Win Value, we need to decide what it means to have a winning core. That cut-off is ultimately in the eye of the beholder, but I looked to the gold standard: the Tampa Bay Rays. The Rays are widely acclaimed for their ability to acquire and maintain control of young talent, often through early buy-outs of free agent years, combined with club options that retain team flexibility. This has been particularly true over the years covered by this study: 2009 through 2012.

To provide some contrast with the Rays, we will also consider the roster construction during that same time period of the New York Mets and the Oakland Athletics.

The Gold Standard: The Rays

Not surprisingly, the Core Wins formula likes the Rays very much. Indeed, three characteristics of the Rays between 2009 and 2012 suggest a working definition of a team with a strong, sustainable core: (1) the Rays consistently feature five or more players producing a Core Win Value of 5 or higher per season, which is my working definition of a “Core Player”; (2) they have accomplished this feat in multiple consecutive years (all four years I studied, in fact); and (3) at least two of these Core Players were usually pitchers.

Let’s start with 2009. For ease of viewing, in each of these tables, I’ve bolded wins figures for potential Core Players (five or more Core Wins). I’ve also italicized the names of pitchers who cross the Core Wins threshold, to distinguish them from position players.

2009 Tampa Bay Rays

Name fWAR Age Control Years Control Index Age Index Core Wins
Evan Longoria 7.5 23 5 2.50 1.17 22
Ben Zobrist 8.5 28 5 2.50 0.96 20
James Shields 3.5 27 5 2.50 1.00 9
Matt Garza 2.9 25 5 2.50 1.08 8
Jason Bartlett 5.3 29 3 1.50 0.93 7
Carl Crawford 5.6 27 2 1.00 1.00 6
B.J. Upton 2.1 24 4 2.00 1.13 5
David Price 1.3 23 5 2.50 1.17 4

In 2009, the Rays won 84 games, featuring seven players that delivered 5 Core Wins or more. This depth, plus MVP-level performances from Evan Longoria and Ben Zobrist, prepared the Rays for the eventual departure of Carl Crawford, whose dwindling team control was removing him from the team’s core. Note that the team’s two best pitchers in 2009, James Shields and Matt Garza, were both under team control for 5 more years. David Price generated only 1.3 fWAR in 2009, and thus barely missed the Core Wins cut, but he was on the upswing.
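To make the Core Player count concrete, here is a short sketch that recomputes the 2009 Rays table from its fWAR, age, and control-year columns. The data structure and names are illustrative. It also assumes the threshold of 5 is applied to the rounded Core Wins figure, which appears to be how B.J. Upton (about 4.7 before rounding) makes the cut.

```python
PEAK_AGE = 27
CORE_THRESHOLD = 5  # working definition of a Core Player

# (name, fWAR, baseball age, control years), copied from the 2009 Rays table
RAYS_2009 = [
    ("Evan Longoria", 7.5, 23, 5),
    ("Ben Zobrist", 8.5, 28, 5),
    ("James Shields", 3.5, 27, 5),
    ("Matt Garza", 2.9, 25, 5),
    ("Jason Bartlett", 5.3, 29, 3),
    ("Carl Crawford", 5.6, 27, 2),
    ("B.J. Upton", 2.1, 24, 4),
    ("David Price", 1.3, 23, 5),
]


def core_wins(fwar: float, age: int, control_years: int) -> int:
    """Return Core Wins rounded to a whole number, as shown in the tables."""
    control_index = min(control_years, 5) / 2
    age_index = PEAK_AGE / age
    return round(fwar * control_index * age_index)


core_players = [
    name
    for name, fwar, age, years in RAYS_2009
    if core_wins(fwar, age, years) >= CORE_THRESHOLD
]
print(len(core_players), core_players)
```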

2010 Tampa Bay Rays

Name fWAR Age Control Years Control Index Age Index Core Wins
Evan Longoria 7.6 24 5 2.50 1.13 21
David Price 3.9 24 5 2.50 1.13 11
Ben Zobrist 3.7 29 5 2.50 0.93 9
B.J. Upton 3.8 25 3 1.50 1.08 6
John Jaso 2.3 26 5 2.50 1.04 6
Sean Rodriguez 2.1 25 5 2.50 1.08 6
Matt Joyce 1.7 25 5 2.50 1.08 5
James Shields 1.7 28 5 2.50 0.96 4
Carl Crawford 7.4 28 1 0.50 0.96 4
Matt Garza 1.5 26 4 2.00 1.04 3

In 2010, the Rays maintained 7 players at a Core Win level of 5 or more, culminating in 96 team wins and a first-place finish in the AL East. Only one pitcher (David Price) made the Core Win cut-off of 5 this time, but James Shields just missed it. Matt Garza regressed a bit (and was promptly traded to the Cubs for more prospects, without any negative effect). Carl Crawford, despite an MVP-level year of 7.4 fWAR, is discounted out of the team core by the Core Wins formula, due to his team control ending that year.

2011 Tampa Bay Rays

Name fWAR Age Control Years Control Index Age Index Core Wins
Ben Zobrist 6.2 30 5 2.50 0.90 14
Evan Longoria 6.2 25 4 2.00 1.08 13
David Price 4.3 25 5 2.50 1.08 12
Matt Joyce 3.5 26 5 2.50 1.04 9
James Shields 4.4 29 4 2.00 0.93 8
Desmond Jennings 2.3 24 5 2.50 1.13 6

2011 featured more of the same. Carl Crawford was gone, but the Rays did not miss him, as the formula anticipated. Six Rays met the Core Win threshold, two of them pitchers (Price, Shields). Superstar contributions by Zobrist and Longoria, combined with ascending contributions from four others, including Price and Shields, resulted in a highly successful season from Tampa Bay’s controlled talent, among others. The Rays won 91 games and made a wild-card playoff appearance.

2012 Tampa Bay Rays

Name fWAR Age Control Years Control Index Age Index Core Wins
Ben Zobrist 5.8 31 4 2.00 0.87 10
David Price 4.8 26 4 2.00 1.04 10
Desmond Jennings 3.3 25 5 2.50 1.08 9
Matt Moore 2.4 23 5 2.50 1.17 7
Evan Longoria 2.5 26 5 2.50 1.04 6
Alex Cobb 2.0 24 5 2.50 1.13 6
Jake McGee 2.0 25 5 2.50 1.08 5
James Shields 3.9 30 3 1.50 0.90 5

By 2012, the Rays had developed an astonishing eight players that crossed our Core Win threshold. An incredible five of these players (over half the team’s core, under our formula) were pitchers with at least three years of team control remaining. This means that the heart of the Rays’ pitching staff was under long-term control. Despite a hamstring injury that kept him out for over three months, Evan Longoria still contributed 2.5 fWAR to the effort, and his new contract provided the team with the long-term control to keep him in the team’s core. The 2012 Rays won 90 games: not enough for even a wild card in the American League that year, but a terrific season nonetheless.

Before the 2013 season, the Rays dealt James Shields to Kansas City for the bat of Wil Myers and other prospects. As of the publication of this article, Fangraphs projects them to win 93 games in 2013, on a payroll of only $62 million. In sum, the Rays have been, and continue to be, the prototypical team that demonstrates what it means to have a sustainable core of controlled talent.

By Stark Contrast, the New York Mets

The Mets have been bad for years, and the Core Wins formula identifies major flaws in roster construction as a possible culprit.

2009 New York Mets

Name WAR Age Control Years Control Index Age Index Core Wins
David Wright 3.4 26 5 2.50 1.04 9
Johan Santana 3.2 30 5 2.50 0.90 7
Angel Pagan 2.8 27 4 2.00 1.00 6

Dreadful: there is no other way to describe the 2009 Mets. That year, the Mets spent $140 million for 70 team wins, generating only three Core Players under our formula. Even those players gave only OK performances. From a Core Wins perspective, this roster was terrible. One of the three players to meet the Core Wins threshold, and the only starting pitcher, Johan Santana, was heading past his probable prime.

2010 New York Mets

Name WAR Age Control Years Control Index Age Index Core Wins
Ike Davis 3.1 23 5 2.50 1.17 9
Johan Santana 3.6 31 5 2.50 0.87 8
Angel Pagan 5.1 28 3 1.50 0.96 7
David Wright 3.5 27 4 2.00 1.00 7
Jon Niese 2.1 23 5 2.50 1.17 6
Mike Pelfrey 2.2 26 4 2.00 1.04 5

The results for the Mets weren’t much better in 2010 (79 wins), but their roster at least improved. Six players made Core Player-type contributions, and three of those players were starting pitchers. If these performances proved to be sustainable over multiple years, or at least into 2011, the Mets had some reason for optimism.

2011 New York Mets

Name WAR Age Control Years Control Index Age Index Core Wins
Daniel Murphy 2.8 26 5 2.50 1.04 7
Jon Niese 2.1 24 5 2.50 1.13 6
Ruben Tejada 1.6 21 5 2.50 1.29 5
Ike Davis 1.3 24 5 2.50 1.13 4
Jose Reyes 5.8 28 1 0.50 0.96 3
David Wright 1.7 28 3 1.50 0.96 3

But it didn’t work out. In 2011, the Mets were right back to a pathetic three Core Player performances, with only one starting pitcher among them. In fact, the Mets’ strongest core performance in 2011 came from 2.8-win Daniel Murphy. Not good. Ike Davis promptly regressed out of the core, David Wright fought injuries, and Johan Santana didn’t play all year, which is why Core Wins discounts the value of aging players. Although Jose Reyes provided a superstar WAR of 5.8 and a batting title, as a departing free agent, that performance provided no ongoing value to the team, and the Core Wins formula discounts it accordingly. It all amounted to 77 wins, and low expectations for the following season.

2012 New York Mets

Name WAR Age Control Years Control Index Age Index Core Wins
Jon Niese 2.7 25 5 2.50 1.08 7
David Wright 7.4 29 2 1.00 0.93 7
Ruben Tejada 1.7 22 5 2.50 1.23 5
Matt Harvey 1.5 23 5 2.50 1.17 4
R.A. Dickey 4.4 37 2 1.00 0.73 3

Validating this expectation, the 2012 Mets did even worse, winning only 74 games. Only three players could pass the Core Wins threshold, and one of their best players, R.A. Dickey, could not even qualify as a Core Player, despite 4.4 fWAR. The Core Wins formula discounts the going-forward value of 37-year-old performances, and Dickey’s 2013 performance with the Blue Jays has validated that skepticism.

But the Mets get enough bad news, so let’s focus on some positive aspects. In 2012, David Wright performed at an MVP level. And while only three Mets crossed the Core Wins threshold in 2012, two of the team’s top four Core Wins contributors, Jon Niese and Matt Harvey, are starting pitchers, which is an important positive from our study of the Rays. In fact, Niese was signed to an early long-term contract, a very Rays thing to do, putting a competent starter under extended team control. Harvey also looks to be a championship-caliber ace, and remains under maximum team control.

So far, 2013 is not being kind to the Mets either — Fangraphs currently projects them to finish with 76 wins — but there are hints that things may soon be looking up, particularly if their farm system can continue to develop strong rotation talent, as many project that it will.

Trending in the Right Direction: The Oakland Athletics

Finally, let’s conclude with what turns out to be a Goldilocks example, the Oakland Athletics: a team that, like the Mets, tried and failed to improve its core, but stuck with it and seems to have gotten the hang of it lately.

2009 Oakland Athletics

Name WAR Age Control Years Control Index Age Index Core Wins
Brett Anderson 3.6 21 5 2.50 1.29 12
Ryan Sweeney 3.9 24 5 2.50 1.13 11
Rajai Davis 3.7 28 5 2.50 0.96 9
Kurt Suzuki 3.1 25 5 2.50 1.08 8
Dallas Braden 2.7 25 5 2.50 1.08 7
Andrew Bailey 2.3 25 5 2.50 1.08 6

In terms of roster-building, the 2009 Athletics took a fairly solid approach: they ended up with six potential Core Players, and three of them are pitchers. All these players offered at least five years of team control. However, the 2009 Athletics also underscore that just because your wins are coming from the right place does not mean you are getting enough of them. The best performance in this group is still only 3.9 fWAR: good, not great. The 2009 Athletics won only 74 games, although at least they didn’t have to pay Mets prices to get there.

2010 Oakland Athletics

Name WAR Age Control Years Control Index Age Index Core Wins
Daric Barton 4.8 24 5 2.50 1.13 14
Cliff Pennington 3.4 26 5 2.50 1.04 9
Gio Gonzalez 2.9 24 5 2.50 1.13 8
Brett Anderson 2.4 22 5 2.50 1.23 7
Dallas Braden 3.3 26 4 2.00 1.04 7
Trevor Cahill 1.6 22 5 2.50 1.23 5

In 2010, the Athletics were better. Leveraging some of the previous year’s young talent, they ended up 81-81. There were six core-type player performances, and four of them pitchers: ordinarily, a good thing. But notably, there was not a significant amount of improvement from 2009’s core contributors. In fact, the strongest core contributors in 2010, Daric Barton and Cliff Pennington, were marginal contributors the year before, raising the possibility of fluke performances. And, only two core performances came from position players, which didn’t leave much room for error going forward in the scoring department. So, the 2010 Athletics showed hints of a developing core, but a fragile one.

2011 Oakland Athletics

Name WAR Age Control Years Control Index Age Index Core Wins
Gio Gonzalez 3.2 25 5 2.50 1.08 9
Jemile Weeks 1.7 24 5 2.50 1.13 5
Trevor Cahill 2 23 4 2.00 1.17 5

And indeed it was. The Athletics rotation was devastated by injuries in 2011: Dallas Braden needed shoulder surgery, and Brett Anderson needed Tommy John surgery. That would be a tough blow for any team, but particularly for Oakland, which did not have much behind them. What was left of the rotation (and roster) collapsed to three core-type players. The two core bats of consequence in 2010, Daric Barton and Cliff Pennington, immediately regressed and revealed themselves to be one-year wonders. The only developing bat remaining belonged to Jemile Weeks, whose average but unspectacular debut later proved unsustainable.

Although two out of the three core players were starting pitchers, there was little to support them. Brandon McCarthy actually had a very good year (4.5 fWAR), but since he was completing a one-year deal at the time, he offered the A’s no core value.

Things looked bleak. Fortunately, the A’s stuck to their guns and kept developing young talent. Then, 2012 happened.

2012 Oakland Athletics

Name WAR Age Control Years Control Index Age Index Core Wins
Josh Reddick 4.5 25 5 2.50 1.08 12
Jarrod Parker 3.4 23 5 2.50 1.17 10
Tommy Milone 2.8 25 5 2.50 1.08 8
Yoenis Cespedes 2.9 26 4 2.00 1.04 6
Brandon Moss 2.3 28 5 2.50 0.96 6
Sean Doolittle 1.6 25 5 2.50 1.08 4

2012 found the Athletics again having restocked their core, this time with a balance of bats and pitching talent. Five core players are represented, and their values are not all projection, either: Josh Reddick produced 4.5 fWAR, Jarrod Parker generated 3.4 fWAR, and two other controlled players produced close to 3 fWAR. Two core players are starting pitchers. Furthermore, in 2012, the A’s finally enjoyed a little luck. They outplayed their Pythagorean expectation by a few wins, got 2+ win performances from non-core starters on short-term deals (Brandon McCarthy and Bartolo Colon), and ended up with 94 wins and an AL West title, on top of what appeared to be a developing core.

If you thought that the Athletics were finally getting the hang of this roster-building thing, you may be right. The Athletics have spent much of 2013 on top of the AL West, and Fangraphs currently projects them to finish with 91 wins — on a budget of $62 million. A very Rays-like experience all around, which corresponds with quality roster construction.

Conclusion

The Core Wins metric profiles the extent to which team performances are being delivered by so-called Core Players, and also tracks the progression of players in and out of the club’s core over time. Even herculean performances by impending free agents (see Carl Crawford, 2010) tend to wash out of the metric, while young players who initially impress, but fail to sustain (see Ike Davis, 2011) also fall out of the measured core, despite their built-in advantages of youth and team control. As such, Core Wins strikes me as useful and, if nothing else, an improvement over the prevailing practice of eyeballing the roster and cherry-picking performances by younger players.

Because it is based on WAR (a counting statistic), Core Wins is primarily backward-looking. But the general method can also be used prospectively. For example, if you input projections from your preferred player projection system, you could forecast the extent to which your team is likely to get future contributions from sustainable sources, a useful thing to know when deciding between trades, farm system call-ups, or free agent signings. Similarly, if you want to focus on particular positions of concern (third base, starting rotation) or skill sets (batter OBP, pitcher FIP), you can adjust the Age Index to account for the peak performance ages corresponding with those particular positions or skills. Those analyses can be retrospective or prospective.

Of course, superior roster construction does not guarantee superior performance, as the Oakland A’s can attest. Previously healthy players can be felled by injury, and promising talents too often fail to sustain early achievements. But in general, developing Core Players makes good sense, and certainly seems to be delivering results for the league’s most efficient ballclubs. So if your favorite team seems incapable of stacking success, you might check to see how good of a job the front office has been doing in generating Core Wins.

Special thanks to Paul Noonan and Tom Tango, who both offered helpful comments on the general direction of this article. All errors are entirely my own, including some table pasting errors in the original version. Thanks to Andrew Yuskaitis for pointing those out. They have now been corrected.


The Basic Fortune Index (Or bFI, If You Are So Inclined)¹

Note: I have no idea if I’m the first to do this, but quite frankly I don’t care.

Last Friday against the Rockies, Matt Wieters had a plate appearance that perfectly epitomized his 2013 season. Coming to the plate in the bottom of the 3rd, with the Orioles up 2-0, two outs in the inning, and the bases loaded, Wieters worked Juan Nicasio for an eight-pitch full count; on the ninth pitch of the at-bat, Wieters hit a perfect, textbook line drive…right to DJ LeMahieu at second, for the third out of the inning.

While watching this game with my father, I was forced to restrain him from destroying the flatscreen upon which this atrocity had been viewed. My own outrage was nowhere near my progenitor’s, however, for I–being more statistically inclined than him–knew that Wieters had been rather unlucky on batted balls this season; after another lineout in Saturday’s game, and two more on Tuesday against the Diamondbacks², Wieters now has a .596 BABIP on line drives, “good” for 170th out of 183 qualified players. This late in the season, a player’s numbers start to level off to what they’ll be at season’s end, and despite the reassurances of experts, Wieters has not ceased to be unlucky.

Which got me thinking…

Would there be a way to measure how lucky or unlucky a player has been as a whole? Not just for one individual stat, but for an entire stat line, over the course of a whole season? After exhaustive Google searches returned nothing, I decided to take matters into my own hands. Using my rudimentary statistical knowledge, and the findings of Mike Podhorzer–who created equations for xK% and xBB%–and Jeff Zimmerman–who devised an xBABIP equation–I created a basic equation to determine how lucky a player has been overall³. Because I have absolutely no idea how linear weights and all that shit works, I kept it simple:

bFI = 100*((xK%–K%) + (BB%–xBB%) + (BABIP–xBABIP))

I call it the Basic Fortune Index; I would’ve called it the Luck Index, but I didn’t want to confuse it with Leverage Index. Basically, I took the difference between each player’s xK% and K%, BB% and xBB%, and BABIP and xBABIP, added them together, and then multiplied it by 100 for shits and giggles. Since a lucky hitter would have a lower K% than expected (as opposed to a higher BB% and BABIP than expected), I took the difference from xK% to K%, instead of the other way around. A positive bFI would indicate a lucky player, and a negative value would indicate an unlucky player. Also, due to time constraints, I was only able to compile stats for the AL.
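For anyone who wants to play along at home, the equation above can be sketched in a few lines. The function and argument names are my own shorthand (assumptions, not anything official), and all rates are expressed as decimals, the way they appear in the leaderboard.

```python
def bfi(k: float, xk: float, bb: float, xbb: float,
        babip: float, xbabip: float) -> float:
    """Basic Fortune Index: positive means lucky, negative means unlucky."""
    k_diff = xk - k              # fewer strikeouts than expected counts as luck
    bb_diff = bb - xbb           # more walks than expected counts as luck
    babip_diff = babip - xbabip  # more hits on balls in play counts as luck
    return 100 * (k_diff + bb_diff + babip_diff)


# Two rows from the leaderboard, rounded to one decimal as in the table.
print(round(bfi(0.175, 0.218, 0.12, 0.119, 0.383, 0.343), 1))  # Joe Mauer
print(round(bfi(0.172, 0.175, 0.081, 0.088, 0.244, 0.28), 1))  # Matt Wieters
```

Note that the strikeout term is flipped relative to the other two, so that all three terms point the same direction: up for lucky, down for unlucky.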

On to the leaderboards⁴!

Player K% xK% kdiff BB% xBB% bbdiff BABIP xBABIP bdiff bFI
Joe Mauer 0.175 0.218 0.043 0.12 0.119 0.001 0.383 0.343 0.04 8.4
Miguel Cabrera 0.144 0.147 0.003 0.138 0.097 0.041 0.363 0.335 0.027 7.1
Billy Butler 0.145 0.18 0.035 0.129 0.116 0.013 0.323 0.304 0.019 6.7
David Ortiz 0.138 0.156 0.018 0.123 0.109 0.014 0.333 0.301 0.032 6.4
Josh Donaldson 0.168 0.2 0.032 0.109 0.109 0 0.33 0.319 0.012 4.4
Mike Trout 0.17 0.194 0.024 0.138 0.13 0.008 0.376 0.366 0.01 4.2
Jhonny Peralta 0.225 0.221 0.004 0.08 0.087 -0.007 0.379 0.339 0.04 3.7
Mike Napoli 0.337 0.336 0.001 0.109 0.119 -0.01 0.36 0.314 0.046 3.7
Evan Longoria 0.238 0.235 -0.003 0.108 0.111 -0.003 0.318 0.279 0.04 3.4
Torii Hunter 0.164 0.166 0.002 0.042 0.042 0 0.343 0.316 0.027 2.9
Dustin Pedroia 0.113 0.165 0.052 0.108 0.102 0.006 0.317 0.347 -0.03 2.8
Adrian Beltre 0.097 0.113 0.016 0.07 0.069 0.001 0.324 0.317 0.007 2.4
Carlos Santana 0.178 0.218 0.04 0.135 0.133 0.002 0.299 0.317 -0.018 2.4
Jose Bautista 0.16 0.2 0.04 0.129 0.131 -0.002 0.259 0.274 -0.015 2.3
Jacoby Ellsbury 0.145 0.149 0.004 0.077 0.088 -0.011 0.34 0.311 0.029 2.2
Jason Kipnis 0.215 0.23 0.015 0.115 0.13 -0.015 0.35 0.329 0.021 2.1
Victor Martinez 0.107 0.148 0.041 0.08 0.092 -0.012 0.298 0.306 -0.008 2.1
Daniel Nava 0.178 0.195 0.017 0.1 0.111 -0.011 0.342 0.327 0.015 2.1
Kendrys Morales 0.17 0.176 0.006 0.067 0.077 -0.01 0.325 0.3 0.025 2.1
Adam Lind 0.202 0.216 0.014 0.1 0.099 0.001 0.319 0.314 0.004 1.9
Desmond Jennings 0.202 0.218 0.016 0.091 0.099 -0.008 0.306 0.299 0.007 1.5
Chris Davis 0.292 0.277 -0.015 0.103 0.103 0 0.354 0.327 0.027 1.2
Lorenzo Cain 0.197 0.216 0.019 0.08 0.079 0.001 0.317 0.326 -0.008 1.2
Colby Rasmus 0.301 0.271 -0.03 0.08 0.099 -0.019 0.363 0.306 0.057 0.8
Prince Fielder 0.175 0.19 0.015 0.11 0.103 0.007 0.288 0.303 -0.015 0.7
Ben Zobrist 0.143 0.136 -0.007 0.103 0.094 0.009 0.302 0.298 0.003 0.5
Kyle Seager 0.165 0.19 0.025 0.088 0.104 -0.016 0.309 0.313 -0.004 0.5
Mitch Moreland 0.206 0.234 0.028 0.08 0.091 -0.011 0.265 0.279 -0.014 0.3
Robinson Cano 0.13 0.133 0.003 0.115 0.095 0.02 0.311 0.333 -0.022 0.1
Nick Markakis 0.099 0.124 0.025 0.079 0.075 -0.004 0.295 0.318 -0.022 -0.1
Alejandro De Aza 0.217 0.224 0.007 0.073 0.094 -0.021 0.33 0.317 0.013 -0.1
Jason Castro 0.261 0.258 0.003 0.098 0.103 -0.005 0.345 0.343 0.001 -0.1
Eric Hosmer 0.138 0.153 0.015 0.068 0.071 -0.003 0.32 0.333 -0.013 -0.1
Nelson Cruz 0.239 0.234 -0.005 0.077 0.089 -0.012 0.299 0.284 0.014 -0.3
Alex Gordon 0.207 0.217 0.01 0.08 0.098 -0.018 0.311 0.306 0.005 -0.3
Justin Morneau 0.179 0.193 0.014 0.066 0.07 -0.004 0.294 0.308 -0.013 -0.3
Brandon Moss 0.275 0.267 -0.008 0.09 0.087 0.003 0.29 0.289 0.001 -0.4
Adam Jones 0.185 0.177 -0.008 0.03 0.029 0.001 0.33 0.328 0.002 -0.5
Albert Pujols 0.124 0.159 0.035 0.09 0.089 0.001 0.258 0.288 -0.031 -0.5
Shane Victorino 0.114 0.141 0.027 0.052 0.068 -0.016 0.309 0.327 -0.018 -0.7
Chris Carter 0.368 0.355 -0.013 0.118 0.112 0.006 0.296 0.296 0 -0.7
Manny Machado 0.156 0.136 -0.02 0.039 0.056 -0.017 0.338 0.31 0.028 -0.9
James Loney 0.128 0.13 0.002 0.074 0.066 0.008 0.337 0.357 -0.019 -0.9
Ian Kinsler 0.093 0.132 0.039 0.088 0.109 -0.021 0.271 0.301 -0.03 -1.2
Mark Reynolds 0.317 0.32 0.003 0.11 0.107 0.003 0.288 0.306 -0.018 -1.2
Vernon Wells 0.163 0.148 -0.015 0.062 0.047 0.015 0.266 0.28 -0.013 -1.3
Howie Kendrick 0.171 0.171 0 0.051 0.051 0 0.344 0.357 -0.013 -1.3
Edwin Encarnacion 0.098 0.142 0.044 0.122 0.117 0.005 0.255 0.317 -0.063 -1.4
Erick Aybar 0.088 0.103 0.015 0.043 0.044 -0.001 0.299 0.328 -0.029 -1.5
Brett Gardner 0.201 0.202 0.001 0.083 0.097 -0.014 0.333 0.336 -0.002 -1.5
Nick Swisher 0.218 0.23 0.012 0.121 0.118 0.003 0.292 0.322 -0.03 -1.5
Michael Bourn 0.228 0.216 -0.012 0.063 0.073 -0.01 0.344 0.338 0.006 -1.6
Mark Trumbo 0.26 0.254 -0.006 0.083 0.07 0.013 0.274 0.298 -0.024 -1.7
Austin Jackson 0.21 0.208 -0.002 0.095 0.083 0.012 0.32 0.35 -0.03 -2
Salvador Perez 0.12 0.093 -0.027 0.042 0.038 -0.004 0.299 0.29 0.01 -2.1
Alexei Ramirez 0.1 0.071 -0.029 0.03 0.008 0.022 0.314 0.328 -0.014 -2.1
Jed Lowrie 0.136 0.106 -0.03 0.083 0.081 0.002 0.315 0.308 0.007 -2.1
Nate McLouth 0.14 0.147 0.007 0.088 0.084 0.004 0.305 0.338 -0.033 -2.2
Coco Crisp 0.114 0.148 0.034 0.109 0.11 -0.001 0.256 0.312 -0.056 -2.3
Alex Rios 0.167 0.151 -0.016 0.066 0.07 -0.004 0.315 0.318 -0.003 -2.3
Ryan Doumit 0.168 0.19 0.022 0.084 0.094 -0.01 0.272 0.308 -0.036 -2.4
Yunel Escobar 0.124 0.125 0.001 0.086 0.092 -0.006 0.286 0.308 -0.022 -2.7
Drew Stubbs 0.29 0.257 -0.033 0.072 0.068 0.004 0.333 0.333 0 -2.9
Yoenis Cespedes 0.233 0.23 -0.003 0.076 0.079 -0.003 0.256 0.283 -0.027 -3.3
Mike Moustakas 0.137 0.14 0.003 0.066 0.086 -0.02 0.251 0.268 -0.018 -3.5
Jose Altuve 0.133 0.107 -0.026 0.055 0.041 0.014 0.311 0.335 -0.024 -3.6
Brian Dozier 0.188 0.212 0.024 0.081 0.094 -0.013 0.278 0.327 -0.049 -3.8
Lyle Overbay 0.222 0.207 -0.015 0.068 0.076 -0.008 0.303 0.318 -0.015 -3.8
Adam Dunn 0.285 0.286 0.001 0.132 0.145 -0.013 0.283 0.31 -0.027 -3.9
Matt Wieters 0.172 0.175 0.003 0.081 0.088 -0.007 0.244 0.28 -0.036 -4
Michael Brantley 0.108 0.094 -0.014 0.073 0.076 -0.003 0.3 0.323 -0.023 -4
Elvis Andrus 0.143 0.155 0.012 0.081 0.098 -0.017 0.301 0.343 -0.041 -4.6
Paul Konerko 0.146 0.158 0.012 0.078 0.071 0.007 0.26 0.326 -0.066 -4.7
J.J. Hardy 0.118 0.124 0.006 0.057 0.07 -0.013 0.253 0.296 -0.043 -5
Matt Dominguez 0.164 0.162 -0.002 0.038 0.056 -0.018 0.248 0.283 -0.035 -5.5
Josh Hamilton 0.246 0.24 -0.004 0.067 0.067 0 0.264 0.317 -0.052 -5.6
Alcides Escobar 0.126 0.118 -0.008 0.032 0.025 0.007 0.271 0.325 -0.055 -5.6
Alberto Callaspo 0.106 0.159 0.053 0.072 0.119 -0.047 0.256 0.319 -0.064 -5.8
Asdrubal Cabrera 0.22 0.211 -0.009 0.06 0.075 -0.015 0.288 0.323 -0.035 -5.9
Ichiro Suzuki 0.097 0.108 0.011 0.045 0.047 -0.002 0.292 0.364 -0.072 -6.3
Maicer Izturis 0.094 0.097 0.003 0.069 0.066 0.003 0.248 0.326 -0.078 -7.2
Raul Ibanez 0.256 0.249 -0.007 0.069 0.084 -0.015 0.278 0.33 -0.052 -7.4
David Murphy 0.117 0.128 -0.011 0.076 0.083 -0.007 0.228 0.288 -0.061 -7.9
Jeff Keppinger 0.088 0.07 -0.018 0.039 0.049 -0.01 0.263 0.33 -0.067 -9.5
J.P. Arencibia 0.295 0.255 -0.04 0.04 0.06 -0.02 0.253 0.324 -0.071 -13.1

Wieters ended up 70th out of the 85 players, as his xBABIP wasn’t as high as I thought it would’ve been.

After compiling this table, I noticed a trend (one that has been noticed by others before me): the “lucky” players were mainly good players, whereas the “unlucky” players were mainly bad offensive players. I then matched each player’s wRC+ up with their bFI, and made a table of the result⁵:

Player bFI wRC+ Player bFI wRC+ Player bFI wRC+
Joe Mauer 8.4 143 Nick Markakis -0.1 91 Coco Crisp -2.3 96
Miguel Cabrera 7.1 207 Alejandro De Aza -0.1 104 Alex Rios -2.3 99
Billy Butler 6.7 124 Jason Castro -0.1 120 Ryan Doumit -2.4 91
David Ortiz 6.4 160 Eric Hosmer -0.1 114 Yunel Escobar -2.7 101
Josh Donaldson 4.4 139 Nelson Cruz -0.3 123 Drew Stubbs -2.9 87
Mike Trout 4.2 179 Alex Gordon -0.3 99 Yoenis Cespedes -3.3 98
Jhonny Peralta 3.7 125 Justin Morneau -0.3 101 Mike Moustakas -3.5 80
Mike Napoli 3.7 109 Brandon Moss -0.4 115 Jose Altuve -3.6 83
Evan Longoria 3.4 138 Adam Jones -0.5 125 Brian Dozier -3.8 100
Torii Hunter 2.9 118 Albert Pujols -0.5 111 Lyle Overbay -3.8 98
Dustin Pedroia 2.8 110 Shane Victorino -0.7 102 Adam Dunn -3.9 121
Adrian Beltre 2.4 142 Chris Carter -0.7 108 Matt Wieters -4 91
Carlos Santana 2.4 127 Manny Machado -0.9 110 Michael Brantley -4 106
Jose Bautista 2.3 133 James Loney -0.9 124 Elvis Andrus -4.6 69
Jacoby Ellsbury 2.2 110 Ian Kinsler -1.2 101 Paul Konerko -4.7 77
Jason Kipnis 2.1 137 Mark Reynolds -1.2 96 J.J. Hardy -5 99
Victor Martinez 2.1 101 Vernon Wells -1.3 79 Matt Dominguez -5.5 80
Daniel Nava 2.1 123 Howie Kendrick -1.3 116 Josh Hamilton -5.6 93
Kendrys Morales 2.1 124 Edwin Encarnacion -1.4 145 Alcides Escobar -5.6 54
Adam Lind 1.9 124 Erick Aybar -1.5 94 Alberto Callaspo -5.8 94
Desmond Jennings 1.5 110 Brett Gardner -1.5 104 Asdrubal Cabrera -5.9 91
Chris Davis 1.2 183 Nick Swisher -1.5 111 Ichiro Suzuki -6.3 78
Lorenzo Cain 1.2 88 Michael Bourn -1.6 90 Maicer Izturis -7.2 63
Colby Rasmus 0.8 122 Mark Trumbo -1.7 114 Raul Ibanez -7.4 122
Prince Fielder 0.7 115 Austin Jackson -2 103 David Murphy -7.9 75
Ben Zobrist 0.5 113 Salvador Perez -2.1 85 Jeff Keppinger -9.5 51
Kyle Seager 0.5 128 Alexei Ramirez -2.1 84 J.P. Arencibia -13.1 70
Mitch Moreland 0.3 99 Jed Lowrie -2.1 112
Robinson Cano 0.1 136 Nate McLouth -2.2 105

Apparently, the correlation was not as strong as I had initially hoped (thanks, Dunn and Ibanez!), as the .53746 R Squared implies.
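If you want to check an R Squared like this yourself, here is a rough sketch. It assumes the figure is the squared Pearson correlation between bFI and wRC+, which is my guess at the calculation rather than a confirmed method. It uses only a handful of rows from the table above for brevity, so it will not reproduce the .53746 figure, which came from all 85 players.

```python
from math import fsum


def r_squared(xs: list[float], ys: list[float]) -> float:
    """Return the squared Pearson correlation between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = fsum(xs) / n
    mean_y = fsum(ys) / n
    sxy = fsum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = fsum((x - mean_x) ** 2 for x in xs)
    syy = fsum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant sample has no defined correlation")
    return sxy ** 2 / (sxx * syy)


# (bFI, wRC+) pairs for Mauer, Cabrera, Trout, Cano, Dunn, Wieters,
# Ibanez, and Arencibia, copied from the table above.
sample = [(8.4, 143), (7.1, 207), (4.2, 179), (0.1, 136),
          (-3.9, 121), (-4.0, 91), (-7.4, 122), (-13.1, 70)]
r2 = r_squared([b for b, _ in sample], [w for _, w in sample])
print(round(r2, 3))
```

Feeding in all 85 (bFI, wRC+) pairs instead of this sample would be the way to test the full-table number.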

In the end, it’s probably not a very good statistic–more of a Pseudometric–which, to be fair, is why I named it the Basic Fortune Index. Like most everything I post here, there really wasn’t a point to this whole thing. In addition, it’s fairly likely that, if this is actually published, someone will be so kind as to inform me that there is already a better stat out there for determining the luck of a hitter, and that–despite the disclaimer–I should care about this. If, however, this is an original idea, I invite those more statistically knowledgeable than myself to expound upon it (assuming, of course, I receive all the credit).

———————————————————————————————————————————-

¹How should that be capitalized?

²I refuse to use their nickname, and usage of it by anyone else should be considered cause for legal euthanasia.

³I wanted to use HR/FB%, but since Parts 6 and 7 of this series were never released, I was forced to go without.

⁴All stats are as of Tuesday, August 20th.

⁵I tried to put in the graph, but couldn’t figure out how.