Archive for June, 2014

The Effects of Tommy John Surgery on Batters

The new prevailing trend in major league baseball is a disturbing one: Tommy John surgeries are becoming far more frequent. During the surgery, the ulnar collateral ligament (UCL) is replaced with a tendon taken from elsewhere in the body. As would be predicted, pitchers suffer the injury much more often than batters because they are constantly stretching their arms to full extension and throwing at high velocities. However, there are times when batters must have their UCL repaired. The unfortunate truth is that there is little data on what may happen to batters when they return. Most analysts report that the surgery has little to no effect on batters’ performance. This isn’t true.

My search for answers began when I heard the news that Matt Wieters, the Baltimore Orioles catcher, would need to undergo Tommy John surgery. Suddenly, I realized that nobody really knows how he will fare when he returns next season. The same applies to the Minnesota Twins’ top prospect, Miguel Sano. Sano, the Twins’ powerful third baseman of the future, had to have his UCL replaced before the season began, to the disappointment of prospect and Twins fans alike. It was the same kind of disappointment felt when Jose Fernandez needed Tommy John surgery. The injury is affecting more and more players, and there is little data (particularly with regard to batters) that suggests how it will affect them when they return.

I scoured the internet for the complete list of players who have undergone the procedure and came across a massive list of 737 confirmed players (major and minor leagues), then crossed out everyone who was not a position player. I was left with a meager list of 29 names from the major leagues (minor league players were excluded because of the distinct differences between minor league levels). After removing players who appeared only briefly in the major leagues or who had the surgery and never returned to playing, I was left with just 15 confirmed names. Stars of their time like Paul Molitor, one of the very first recipients of the surgery, and lesser-known players like Kyle Blanks both stood out on the list.

The next step in unraveling the mystery behind the surgery was to figure out how it affects batters. In other words, I wanted to test whether different tools were affected and in what ways. Did batters hit for the same amount of power as they did before? To begin, I collected data on three different measures of arm strength. Batting Average on Balls in Play (BABIP) measures the rate at which balls put into play are turned into hits. While this is not entirely based on arm strength, arm strength is a large factor in the placement of the ball coming off the bat. A more powerful swing will lead to more balls in play being turned into hits. More on that here. Slugging percentage (SLG) was the next measure I tested. If a batter can hit the ball farther, he should have more extra-base hits. Similarly, I tested Home Run to Fly Ball percentage (HR/FB), which measures the rate at which fly balls go over the outfield walls and become home runs. Another obstacle, as can be seen in the image below, was that no fly ball data was recorded prior to 2002, so the HR/FB figures rest on a smaller sample than the other measures.
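For reference, the three measures can be written out as simple ratios. This is a minimal sketch using the standard public definitions of BABIP, SLG, and HR/FB. The function names and the sample stat line are hypothetical and are not taken from the study's data.

```python
def babip(h, hr, ab, k, sf):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    return (h - hr) / (ab - k - hr + sf)


def slg(singles, doubles, triples, hr, ab):
    """Slugging percentage: total bases per at-bat."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * hr
    return total_bases / ab


def hr_per_fb(hr, fly_balls):
    """Share of fly balls that leave the park."""
    return hr / fly_balls


# Hypothetical season line, for illustration only.
h, doubles, triples, hr = 150, 30, 2, 20
ab, k, sf, fly_balls = 500, 100, 5, 160
singles = h - doubles - triples - hr

season_babip = babip(h, hr, ab, k, sf)
season_slg = slg(singles, doubles, triples, hr, ab)
season_hr_fb = hr_per_fb(hr, fly_balls)
print(round(season_babip, 3), round(season_slg, 3), round(season_hr_fb, 3))
```

Comparing each measure before and after a player's surgery date then reduces to computing these ratios on the two halves of his career.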

[Image: TJ Batter Data]

Honestly, the results were surprising. I expected most analysts to be right in saying that the surgery has little to no effect on batter strength. That turned out to be wrong: on average, batters experienced a non-negligible decrease in BABIP, SLG, and HR/FB.

[Image: TJ Batter]

Of the 15 tested batters, 12 experienced a decrease in BABIP, amounting to an average decrease of 0.015. In the sabermetrics world, statistics dictate all research and this is no exception. A 0.015 decrease is another way of saying that 1.5 percentage points fewer balls in play became hits. Whether this can be attributed to luck, fielding, or less power is another question. But with over 65,000 at-bats’ worth of data, a sizable share of the change should be batter-driven rather than the product of worse luck or better fielding. For perspective, a 15-point drop in batting average takes a .300 hitter down to .285.

Slugging percentage was the most striking finding, though: of the 15 batters, 11 experienced a decrease. Keep in mind that the surgeries occurred at different points in the batters’ careers, which should help filter out natural weakening with age. Overall, the individual changes summed to a 0.419 drop in slugging percentage, an average decrease of 0.028 post-Tommy John surgery. Because slugging percentage is total bases per at-bat, that works out to roughly 28 fewer total bases per 1,000 at-bats for the remainder of these batters’ careers, a significant amount considering that some of these batters had careers lasting fifteen years or more. Home Run to Fly Ball rate had to be adjusted to account for the emergence of fly ball data in 2002 (I removed the home runs hit before 2002 before calculating). Of the 9 batters who could be tested, 7 experienced a decrease in their HR/FB rates. This comes out to a 0.018 decrease, meaning 1.8 percentage points fewer fly balls zoomed out of the park and into the stands. The major league average usually stands at about 10%, but these batters saw their rate drop from 10.1% to 8.3% after the surgery.
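The arithmetic behind those averages is easy to check. The sketch below only reuses the figures quoted in the text (0.419 total SLG change across 15 batters, HR/FB of 10.1% before and 8.3% after). The 500 at-bat season is an illustrative assumption.

```python
def average_change(total_change, n_batters):
    """Mean per-batter change, given the summed change across the group."""
    return total_change / n_batters


slg_avg_drop = average_change(0.419, 15)

# SLG is total bases per at-bat, so the drop can be read as total bases lost.
# 500 at-bats is a hypothetical full season, used only for scale.
total_bases_lost_per_500_ab = slg_avg_drop * 500

hr_fb_drop = 0.101 - 0.083

print(round(slg_avg_drop, 3))              # average SLG decrease
print(round(total_bases_lost_per_500_ab))  # total bases lost per 500 AB
print(round(hr_fb_drop, 3))                # HR/FB decrease
```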

The only thing left to say is that analysts and fans alike need to recognize that Tommy John surgery does have a negative effect on a batter’s power. Mostly though, I’m disappointed Miguel Sano’s power will never be what it could have been.

Thanks to FanGraphs for all batting and advanced fielding data and to BaseballHeatMaps.com for the complete Tommy John surgery encyclopedia.

NotGraphs: Only Congress Can Declare WAR, But What About FIP?

Let’s face it: we’re all nerds here at FanGraphs. But it takes a special kind of nerd to bring FanGraphs’ brand of sabermetric analysis to that other realm of the dull and dweeby: the United States Congress.

Every summer, a handful of the 535 senators and congressmen who represent you in Washington divide into teams to play the Congressional Baseball Game, a charity event at Nationals Park. Despite its informal nature and the, ah, senescent quality of play, the game is a serious affair (something its participants often have experience with). This is no friendly softball game; the teams practice for months before the big day, and the players take the results very seriously.

So seriously, in fact, that players keep track of (even send press releases about) their hits and RBI. A small group of baseball-obsessed politicos scores and generates a box score for the game every year. With their help, I was able to take their record-keeping to the next level. This is where this becomes the dorkiest FanGraphs article ever—for the first time, we now have advanced metrics on the performance and value of U.S. congressmen’s baseball skills.

Using recent Congressional Baseball Game scoresheets, I made a Google spreadsheet that should look familiar to any FanGraphs user—complete with the full Standard, Advanced, and Value sections you see on every player page. (Though this spreadsheet is more akin to the leaderboards—since the game is only played once a year, I treated the entire, decades-long series as one “season,” and each line is a player’s career stats in the CBG.) From Rand Paul’s wOBA to Joe Baca’s FIP-, all stats are defined as they are in the Library and calculated as FanGraphs does for real MLBers—making this the definitive source for the small but vocal SABR-cum-CBG community.

That said, unfortunately the metrics can never be complete—there’s just too much data we don’t have. Most notably, although the CBG has a long history (dating back to 1909), I capped myself at stats from the past four years only—so standard small-sample-size caveats apply. (This is mostly for fun, anyway.) Batted-ball data is also incomplete, so I opted to leave it out entirely—and we don’t have enough information about the context of each at-bat to calculate win probabilities. For obvious reasons, there’s also no PITCHf/x data, and fielding stats are a rabbit hole I’m not even going to try to go down.

It’s still a good deal of info, though, and there’s plenty to pick through that goes beyond what you might have noticed with the naked eye at the past four Congressional Baseball Games. But why should I care to pick through them, you might ask; what good are sabermetrics for a friendly game between middle-aged men? Well, apart from the always-fun Hall of Fame arguments, they serve the same purpose they do in the majors: they help us understand the game, and they can help us predict who will win when the Democrats next meet the Republicans (how else would the teams be divided?) on the battle diamond—this Wednesday, June 25.

You probably don’t need advanced metrics to guess that the Democrats are favored. They’ve won the past five games in a row, including the four in our spreadsheet by a combined score of 61 to 12. That’s going to skew our data, but by the same token, Democratic players have clearly been better in recent years. Going by WAR, a full five Democrats are better than the best Republican player, John Shimkus of Illinois.

But the reason we expect Democrats to win on Wednesday is the player who tops that list: Congressman Cedric Richmond of Louisiana. Richmond’s 1.1 WAR (in only three games!) is 0.9 higher than the next-best player (Colorado’s Jared Polis), putting him in a league of his own. In each of the past three CBGs, the former Morehouse College varsity ballplayer has pitched complete-game gems that have stifled the Republican offense. He carries a 40.0% K% and 28 ERA- into this year’s game. (Note: Congressional Baseball Games last only seven innings, so the appropriate pitching stats use 7 as their innings/game constant in place of MLB’s 9.)
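To make the seven-inning adjustment concrete, here is a small sketch of ERA with a configurable innings-per-game constant. The standard formula is 9 × ER / IP, and the only change is swapping 9 for 7. The stat line is hypothetical, and how the spreadsheet itself implements the adjustment is not shown here.

```python
MLB_INNINGS_PER_GAME = 9
CBG_INNINGS_PER_GAME = 7


def era(earned_runs, innings_pitched, innings_per_game=MLB_INNINGS_PER_GAME):
    """Earned runs allowed per regulation-length game."""
    return innings_per_game * earned_runs / innings_pitched


# Hypothetical line: 4 earned runs over three complete 7-inning games.
cbg_era = era(4, 21, CBG_INNINGS_PER_GAME)
mlb_scale_era = era(4, 21)

print(round(cbg_era, 2), round(mlb_scale_era, 2))
```

The same line looks better on the seven-inning scale simply because a "game" is shorter, which is why the constant has to match the league.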

The GOP has a few options to oppose Richmond on the mound—it’s just that none of them are good. The four Republicans on the roster with pitching experience have past ERAs ranging from 8.08 to 15.75. If there’s any silver lining, it’s that Republican pitchers have been somewhat unlucky. Marlin Stutzman has a .500 BABIP, and Shimkus has an improbably low 20.8% LOB percentage. Thanks to a solid 15.0% K-BB%, Stutzman has just a 5.98 FIP—high by major-league standards, but actually exactly average (a FIP- of 100) in the high-scoring environment of the CBG. (Another note: xFIP is useless in the congressional baseball world, as no one has hit an outside-the-park home run since 1997.) A piece of advice to GOP manager Joe Barton of Texas: Stutzman is your best option for limiting the damage on Wednesday.

On offense, it’s again the Cedric Richmond show. His 8 wRC and 4.6 wRAA dwarf all other players. In a league where power is almost nonexistent, he carries a .364 ISO (his full batting line is a fun .818/.833/1.182); only eight other active players even have an ISO higher than .000. Other offensive standouts for the Democrats include Florida’s Patrick Murphy, he of the 214 wRC+ and .708 wOBA (using 2012 coefficients), and Missouri’s Lacy Clay, who excels on the basepaths to the tune of a league-high 0.5 wSB. With a 1.4 RAR (fourth-best in the league) despite only two career plate appearances, Clay has proven to be the best of the CBG’s many designated pinch-runners who proliferate in the later innings. (Caveat: UBR is another of those statistics we just don’t have enough information to calculate.) Democrats might want to consider starting him over Connecticut Senator Chris Murphy, however; Murphy is a fixture at catcher for the blue team despite a career .080 wOBA and -2.5 wRAA.
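Isolated power is just slugging percentage minus batting average, so Richmond's ISO can be checked against his quoted batting line. In the sketch below, the counting stats (9 hits and 13 total bases in 11 at-bats) are one line consistent with the quoted .818 average and 1.182 slugging. They are an assumption, not numbers taken from the scoresheets.

```python
def iso(slg, avg):
    """Isolated power: slugging percentage minus batting average."""
    return slg - avg


def iso_from_counts(total_bases, hits, at_bats):
    """Equivalent form: extra bases per at-bat."""
    return (total_bases - hits) / at_bats


richmond_iso = iso(1.182, 0.818)

# Hypothetical counting line that reproduces the quoted rates.
richmond_iso_counts = iso_from_counts(total_bases=13, hits=9, at_bats=11)

print(round(richmond_iso, 3), round(richmond_iso_counts, 3))
```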

As on the mound, Republicans don’t have a lot of talent at the plate. Their best hitter is probably new Majority Whip Steve Scalise, who has a 168 wRC+, albeit in just four plate appearances. (Scouting reports indicate that Florida Rep. Ron DeSantis is actually their best player, but injury problems have kept him from making an in-game impact so far in his career, and he’s missing this game entirely due to a shoulder injury.) Meanwhile, uninspired performers like Jeff Flake (.268 wOBA) and Kevin Brady (.263 wOBA) continue to anchor the GOP lineup, potentially (rightfully?) putting their manager on the hot seat. Some free advice for the Republicans: try to work the walk better. Low OBPs are an issue up and down the lineup, and they have a .279 OBP as a team. Their team walk rate of 8.2% is also too low for what is essentially a glorified beer league. If someone is telling them that the way to succeed against a pitcher of Richmond’s caliber is to be aggressive, they should look at the numbers and rethink.


Breaking Down the Aging Curve: Early 20s

If you missed the first part and want a little more explanation about what I am doing, click here. I am going to start getting into the meat today with larger sample sizes and more typical groups of players.

Age 21 cohort:

There were 102 players in this group; three played only one full season and were removed. This is not as necessary with this group, but it becomes pretty important in the later cohorts, as you will see. The main issue is that the max % is automatically 100% in the first year for any player with only one full season. The 99 players left averaged 10.3 full seasons in the majors, fewer than the previous cohorts as expected, but still long careers on average. Ten players posted their max wRC+ in that first full season, and 9 posted their max WAR. Said another way, about 90% of the players went on to have their best season later in their careers, making it unlikely that a 21-year-old reaching the 300 PA minimum is showing you a career year. Again, part of this is that they have 9+ seasons to go on average, so they have a lot of opportunities to have better years, which the older cohorts will not have.
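The "max %" idea can be sketched in a few lines: each season is expressed as a percentage of that player's best season, then averaged across the cohort at each point in the career. This is one plausible reading of the method, not the exact procedure behind the charts. The function names and sample values are hypothetical, and it assumes a positive career max (fine for wRC+, while a WAR max at or below zero would need separate handling).

```python
def percent_of_max(season_values):
    """Each season's value as a percentage of the player's best season."""
    best = max(season_values)
    return [100.0 * value / best for value in season_values]


def average_by_season(players):
    """Average percent-of-max at each season index across a cohort.

    Players with shorter careers simply stop contributing once they run out
    of seasons.
    """
    longest = max(len(seasons) for seasons in players)
    averages = []
    for i in range(longest):
        values = [percent_of_max(s)[i] for s in players if len(s) > i]
        averages.append(sum(values) / len(values))
    return averages


# Hypothetical wRC+ lines: a three-season player and a one-season player.
cohort = [[100, 125, 50], [90]]
one_season = percent_of_max([90])
curve = average_by_season(cohort)
print(one_season, curve)
```

The one-season player scores 100% by construction, which is why such players inflate the first-year average and are removed before charting.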

We also start to see something else I was expecting. The players who max out in their first year tend to have shorter careers because they are not as good on average and that first-year max was not very high. Those who maxed wRC+ averaged only slightly over 4 years of 300+ PAs, and those who maxed WAR averaged only 3.25 years (with one active player in the group). There is some overlap, but the two groups are different and will be for every cohort. It is likely the trend here continues as well. If you max WAR in your first season, it means you are not showing overall improvement later, and you leave the league quickly. Those who max wRC+ but not WAR are likely getting more playing time later due to defense or other peripheral skills that make them better players overall. On to the max % chart:
[Chart: percent of max wRC+ and WAR by age, age 21 cohort]

It looks like there is some slight improvement in hitting in the first couple of years. The increase is more drastic in WAR, partly because those who stick in the majors get more playing time and thus accumulate more WAR. The increase might be more than that, especially if the slight uptick in hitting is real; I will spend more time trying to tease that out after I have run this baseline through all the cohorts. You will notice that these players peak younger than our traditional understanding of peaks would suggest. The group peak is around 24, and hitting stays around that level until the early 30s, but WAR starts dropping the next season.

Age 22 cohort:

This group started with 200 players, of which 41 played only one full season and were removed. The one-season group in this case held a lot of current young players such as Wil Myers and Yasiel Puig, so this might be an interesting group to follow over the coming years. The average tenure of the remaining 159 players was 8.6 full seasons. Of those 159, 27 had their best wRC+ in their first season and 26 had their best WAR. Now instead of 90% having better seasons later in their careers, we are down to 83 or 84%. About one out of every six 22-year-olds never improves on his first full season. Those who did max in year 1 averaged about 4 full seasons in both the wRC+ and WAR groups, with the WAR group only a few hundredths of a year below the wRC+ group.
[Chart: percent of max wRC+ and WAR by age, age 22 cohort]

The chart shows a less distinct increase in the first few seasons, but is upward sloping for both wRC+ and WAR until the age 26 season.  There is a similar decline pattern to the 21 year-old group.  The 21 cohort just had a steeper early incline and younger peak.

Age 23 cohort:

Now we start getting into the largest cohorts. The most likely time for a player to get his first full season is from ages 23 through 25, and if you haven’t made it by then, your odds of ever getting a full season in the majors start to drop off. This age group started with 320 players, and 43 were removed as one-year players like before, 7 of whom are active. Of the 277 left, the average number of full seasons played was 7.6, and now 56 had their max wRC+ in year 1 and 52 their max WAR. That is nearing the mark where a full quarter of the players are never better than their first full season. Of those who maxed in year 1, the wRC+ group averaged 4.3 full seasons and the WAR group 3.9. Frank Thomas was in the max WAR group: despite playing 14 more seasons above the 300 PA level after 1991 (he had only 240 PAs in 1990), he never posted a higher WAR. He had 2 seasons in which his wRC+ was equal to or greater than that first one but didn’t amass enough PAs to accumulate more WAR, though in 1997 he tied the WAR and wRC+ of that first full season. Anyway, chart time:

[Chart: percent of max wRC+ and WAR by age, age 23 cohort]

It’s harder to see much of any improvement in hitting with this group. There might be a slight improvement peaking in the age 26 season again. WAR shows a fairly steady increase until age 27 and then another similar decline phase. Another thing to note: at its peak, the average percent of max for hitting is consistently in the low 80s. For WAR, the peak is declining so far. If you look at the WAR line on the three charts, the first peaks at 60.3%, the second at 56.4%, and the third at 55.8%, which might be worth keeping an eye on as we go on to the next set of cohorts. For now, though, I will wrap it up rather than going on for the 3 or 4 thousand words all of the cohorts and summaries might take.
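The share of players who peaked in their first full season can be lined up across the three cohorts using only the counts quoted above. The dictionary layout below is just an illustrative way to organize them.

```python
# Counts quoted in the text: players remaining after removing one-season
# players, and how many posted their career max in their first full season.
cohorts = {
    21: {"players": 99, "max_wrc": 10, "max_war": 9},
    22: {"players": 159, "max_wrc": 27, "max_war": 26},
    23: {"players": 277, "max_wrc": 56, "max_war": 52},
}


def share(count, total):
    """Percentage of the cohort, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


first_year_peak = {
    age: (share(c["max_wrc"], c["players"]), share(c["max_war"], c["players"]))
    for age, c in cohorts.items()
}

for age, (wrc_share, war_share) in first_year_peak.items():
    print(f"Age {age}: {wrc_share}% maxed wRC+ in year 1, {war_share}% maxed WAR")
```

Seen side by side, the first-year-peak share roughly doubles from the age 21 cohort to the age 23 cohort.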


The Unique Path to Success in Oakland

 Two roads diverged in a wood, and I–

I took the one less traveled by,

And that has made all the difference.

— Robert Frost

There are many things that stand out about this year’s Oakland A’s. Their incredible run differential has reached a near historic level, their breakout star from last year has proven that last season was no fluke, and the top three starters are pitching at incredible levels. They’ve been marauding through the American League like Heisenberg’s nemesis through Janjira. However, there’s one aspect of this team that flies under the radar: of their current 25-man roster, only two players were acquired through the amateur draft – Sonny Gray and Sean Doolittle. The rest were acquired through a mix of trades, free agency, waiver claims, purchases, and even one conditional deal.

Billy Beane made his name a while ago by not being afraid to stray from the pack, and in fact looking for those market inefficiencies that could save him a buck or two with the low payroll A’s. By trading for players who may have disappointed at other spots across Major League Baseball, or claiming players put on waivers, Beane is once again finding talent in the most frugal way possible. So is this a new phenomenon in Oakland? Let’s see what the numbers say. Here’s the acquisitional (who says you can’t invent words?!) breakdown of the Oakland A’s roster the last thirteen years.* This includes any hitters who made at least 100 plate appearances and any pitchers who pitched in at least ten games in addition to this year’s current 25-man roster.

* Why thirteen years? Because, Moneyball, of course!

A’s Roster Construction Since 2002
Year AD* FA** T*** AFA^ WC^^ P^^^ CD’ R5” MD”’
2014 2 4 13 1 2 2 1 0 0
2013 4 4 16 1 4 2 1 0 0
2012 7 9 16 2 2 1 0 0 0
2011 6 7 17 0 1 1 0 0 0
2010 9 7 12 1 2 1 0 0 0
2009 11 6 14 1 3 2 0 0 0
2008 8 5 16 1 2 3 0 0 0
2007 10 5 10 1 4 2 0 0 0
2006 8 5 15 0 1 0 0 0 0
2005 10 4 15 0 1 0 0 0 0
2004 8 7 11 0 1 0 0 0 0
2003 8 6 9 2 0 0 1 1 0
2002 6 8 16 2 0 0 0 0 1

AD*= Players acquired through amateur draft;  FA**= Players acquired through free agency;  T***= Players acquired through trades;  AFA^= Players acquired through amateur free agency;  WC^^= Players acquired through waiver claims;  P^^^= Players acquired through purchases;  CD’= Players acquired through conditional deals;  R5”= Players acquired through the rule 5 draft;  MD”’= Players acquired through minor league draft

 

While the A’s have always built their roster through trades more than through the draft (the only years those numbers were even tied were 2007 and 2003; every other year there were more players acquired via trade than draft), the trend has become more and more evident of late. On the A’s current 25-man roster, there are a measly two players whom the A’s acquired through the amateur draft versus thirteen acquired through trades. Granted, the number acquired through the draft was bound to be a bit smaller so far this season than in previous years, since the 25-man roster was used this season instead of qualified players (again, players who had either 100 plate appearances or ten games pitched in that given season), who totaled between 27 and 37 in the previous twelve seasons. However, given that the season with the second-lowest number of players acquired via the draft was last season, there definitely appears to be a trend here.
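Because the roster sizes differ from year to year, one way to check the trend is to convert the draft column of the A's table into a share of each year's roster. This sketch copies the rows from the table above; the variable names are illustrative.

```python
# Rows from the A's table: (AD, FA, T, AFA, WC, P, CD, R5, MD)
rows = {
    2014: (2, 4, 13, 1, 2, 2, 1, 0, 0),
    2013: (4, 4, 16, 1, 4, 2, 1, 0, 0),
    2012: (7, 9, 16, 2, 2, 1, 0, 0, 0),
    2011: (6, 7, 17, 0, 1, 1, 0, 0, 0),
    2010: (9, 7, 12, 1, 2, 1, 0, 0, 0),
    2009: (11, 6, 14, 1, 3, 2, 0, 0, 0),
    2008: (8, 5, 16, 1, 2, 3, 0, 0, 0),
    2007: (10, 5, 10, 1, 4, 2, 0, 0, 0),
    2006: (8, 5, 15, 0, 1, 0, 0, 0, 0),
    2005: (10, 4, 15, 0, 1, 0, 0, 0, 0),
    2004: (8, 7, 11, 0, 1, 0, 0, 0, 0),
    2003: (8, 6, 9, 2, 0, 0, 1, 1, 0),
    2002: (6, 8, 16, 2, 0, 0, 0, 0, 1),
}


def draft_share(row):
    """Amateur-draft players as a percentage of the listed roster."""
    return 100.0 * row[0] / sum(row)


ad_share = {year: draft_share(row) for year, row in rows.items()}
lowest_two = sorted(ad_share, key=ad_share.get)[:2]

print(lowest_two, ad_share[2014], ad_share[2013])
```

Even as a share of the roster, the two most recent seasons come out lowest, so the smaller 25-man sample in 2014 does not by itself explain the drop.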

Now the question becomes, “how does this compare to the league as a whole?”

Usually Beane is at the forefront of certain trends, so if the A’s roster composition varies greatly from the rest of the league, could it be the start of a league wide trend, especially given the A’s incredible success so far? To answer that question, data on all 30 teams’ roster composition was collected for the 2013 season. Given the same requirements as the previous A’s seasons (100 plate appearances or ten games pitched), how did other rosters across Major League Baseball look last year?

League Wide Roster Construction in 2013
Team AD* FA** T*** AFA^ WC^^ P^^^ CD’ R5”
BOS 26.47 35.29 26.47 5.88 0.00 5.88 0.00 0.00
STL 65.63 12.50 18.75 0.00 0.00 3.13 0.00 0.00
OAK 12.50 12.50 50.00 3.13 12.50 6.25 3.13 0.00
ATL 33.33 10.00 33.33 6.67 16.67 0.00 0.00 0.00
PIT 28.57 21.43 42.86 3.57 0.00 3.57 0.00 0.00
DET 18.75 40.63 31.25 6.25 3.13 0.00 0.00 0.00
LAD 21.88 34.38 34.38 6.25 0.00 3.13 0.00 0.00
CLE 13.79 24.14 58.62 3.45 0.00 0.00 0.00 0.00
TBR 22.58 29.03 41.94 0.00 3.23 3.23 0.00 0.00
TEX 29.03 32.26 19.35 12.90 0.00 3.23 0.00 3.23
CIN 40.00 23.33 23.33 10.00 3.33 0.00 0.00 0.00
WSN 37.50 25.00 31.25 3.13 3.13 0.00 0.00 0.00
KCR 33.33 13.33 36.67 6.67 3.33 6.67 0.00 0.00
BAL 25.81 12.90 35.48 3.23 9.68 6.45 0.00 6.45
NYY 25.81 35.48 22.58 9.68 3.23 0.00 0.00 3.23
ARI 16.13 25.81 45.16 9.68 3.23 0.00 0.00 0.00
LAA 37.84 29.73 21.62 5.41 5.41 0.00 0.00 0.00
SFG 33.33 36.67 10.00 6.67 10.00 3.33 0.00 0.00
SDP 31.43 20.00 40.00 0.00 2.86 0.00 2.86 2.86
NYM 31.58 31.58 13.16 13.16 10.53 0.00 0.00 0.00
MIL 39.39 36.36 12.12 3.03 6.06 3.03 0.00 0.00
COL 36.36 27.27 21.21 12.12 3.03 0.00 0.00 0.00
TOR 24.32 21.62 45.95 2.70 2.70 2.70 0.00 0.00
PHI 35.00 37.50 20.00 7.50 0.00 0.00 0.00 0.00
SEA 27.27 30.30 30.30 9.09 0.00 0.00 0.00 3.03
MIN 33.33 33.33 12.12 6.06 9.09 0.00 0.00 6.06
CHC 11.43 42.86 22.86 11.43 8.57 0.00 0.00 2.86
CHW 30.00 33.33 20.00 10.00 6.67 0.00 0.00 0.00
MIA 30.30 24.24 39.39 6.06 0.00 0.00 0.00 0.00
HOU 15.00 22.50 40.00 5.00 10.00 0.00 0.00 7.50

That’s a lot of numbers, so let’s take a step back and look at some that stick out. First of all, percentages have been used instead of raw totals to even out the variance in how many players each team had qualify for this roster construction study. It’s also important to note the highest and lowest percentage in each column for the three primary ways of acquiring players: the amateur draft (Cardinals 65.63, Cubs 11.43), free agency (Cubs 42.86, Braves 10.00), and trades (Indians 58.62, Giants 10.00). One may think of the old adage, “there’s more than one way to skin a cat,” when looking at the top of the league. Apparently this adage holds true for baseball roster construction as well as cat mutilation: the St. Louis Cardinals (you know, that franchise that has won four of the last ten NL pennants with a pair of titles, and has the self-proclaimed best fanbase in baseball) have gone in the complete opposite direction from the A’s to build their squad, relying more on the amateur draft than any other team in baseball, and doing so with great success. Then there are last year’s World Series champions, the Boston Red Sox, who were among the league leaders in players brought in through free agency.
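The conversion from raw totals to percentages can be sketched with the A's 2013 row, whose counts appear in the earlier A's table. Half-up rounding via `decimal` is an assumption made so that 3.125 prints as 3.13, matching the league table. Python's built-in `round` would give 3.12.

```python
from decimal import Decimal, ROUND_HALF_UP


def to_percentages(counts):
    """Convert acquisition counts to percentages of the qualified roster."""
    total = sum(counts.values())
    return {
        method: float(
            (Decimal(100 * n) / Decimal(total)).quantize(
                Decimal("0.01"), rounding=ROUND_HALF_UP
            )
        )
        for method, n in counts.items()
    }


# Oakland's 2013 counts, taken from the A's roster table.
oak_2013 = {"AD": 4, "FA": 4, "T": 16, "AFA": 1, "WC": 4, "P": 2, "CD": 1, "R5": 0}
oak_2013_pct = to_percentages(oak_2013)
print(oak_2013_pct)
```

The result reproduces the OAK row of the league-wide table, which confirms that the percentages are shares of each team's qualified roster.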

One consistent, league-wide trend was that teams at the bottom of the league standings had far more players qualify for the 100 plate appearance/ten games pitched minimums. This is a bit of a “chicken or the egg” type observation, where the cause can sometimes be confused with the effect. There are several teams among the league’s cellar dwellers that went through numerous players throughout the season in an attempt to find effective players (the “throw the spaghetti at the wall and see what sticks” approach Jonah Keri has referenced on multiple occasions). This would be your Marlins, Astros, and Cubs. However, there are also teams among the lower tier of the standings that were forced into more personnel choices due to injuries; your Phillies, Blue Jays, and Angels. Whatever the reason, it is noticeable that nearly all the teams at the top of the standings at the end of the year have fewer players qualified for the 100 plate appearance/ten games pitched minimums thanks to good health and a clear vision – two staples of successful franchises (interestingly enough the one team that was an exception to this rule in 2013 was the Boston Red Sox; however, given their disaster of a 2012 season, it’s not as surprising to see that they tinkered a bit with their roster throughout the season).

The data supports what many baseball fans would already think, which is that the teams with higher payrolls usually are among the most reliant on free agents, and, in order to compete, the smaller market teams need to find other ways to build their rosters. For example, the top eight teams who built through free agency were: the Cubs, the Tigers, the Phillies, the Giants, the Brewers, the Yankees, the Red Sox, and the Dodgers. Of those eight, the Tigers, Phillies, Giants, Yankees, Red Sox, and Dodgers make up the top six teams by payroll in 2014. The Cubs are in the middle of a complete roster overhaul, and Theo Epstein seems to be constructing a team built for flipping at the deadline for future prospects, so cheap free agents are a prime commodity. The Brewers are the odd team out, and would make for an interesting case study.

On the flip side, the top nine teams built by trading for players were: the Indians, the A’s, the Blue Jays, the Diamondbacks, the Pirates, the Rays, the Astros, the Padres, and the Marlins. Of those nine, the A’s, Pirates, Rays, Astros, Padres, and Marlins made up the six lowest teams by payroll in 2013; the Indians were not far off, with only the 21st biggest payroll of 2013; and the Blue Jays and Diamondbacks both have super aggressive front offices that prefer to bring in players via (usually poor) trades.

There is, of course, the caveat that while this study looks at general roster construction, it does not have the nuance to differentiate between a team loaded with big-money free agents (like the Yankees and Red Sox) and a team loaded with replacement-level free agents (like the Cubs). If each player’s salary were totaled by how he was acquired, and then turned into percentages of roster construction again, this would show us how much each team is truly investing in each method of roster construction from a financial point of view. This could be used to complement Jonah Keri and Neil Paine’s recent study that looked at roster construction. In their piece, Keri and Paine look at roster construction through the lens of a stars and scrubs roster versus a balanced roster. Although there might be some discrepancy based on the arbitrary 100 plate appearance and ten games pitched cut-offs, the data likely wouldn’t be vastly skewed from the current results.

Todd Boss of Nationals Arm Race did an interesting study somewhat similar to this one, looking at the core players (the 5-man starting rotation, the setup and closer, the 8 out-field players, and the DH for AL teams) for the playoff teams in 2013, and put the teams into four different categories of roster construction: draft/development, trade major leaguers, trade prospects, and free agency. The results were similar to what was found here, and help to support the idea that the arbitrary cut-offs of 100 plate appearances and 10 games pitched didn’t have a negative impact on the study. The only slightly different result was that Boss found the Rays to be relying more on the draft than on trades.

Having looked at the league-wide breakdown for roster construction last season, let’s take a look at roster construction from an historical perspective. To make a long story short, when Curt Flood took on Major League Baseball, and eventually the Supreme Court, in his fight to turn down a trade to Philadelphia (who can blame him?), he opened up the Floodgates (couldn’t help myself) for the eventual implementation of free agency in baseball. So, has successful (being judged by the extremely arbitrary “ringz” perspective) roster construction changed since then? Let’s take a look with yet another chart (Marshall Eriksen would be proud), this time looking at the past 40 World Series winners, and how each team was constructed.

Roster Construction of World Series Winners Since 1974
Year Team AD* FA** T*** AFA^ WC^^ P^^^ CD’ R5” MD”’ DC+ XD++
2013 BOS 26.47 35.29 26.47 5.88 0.00 5.88 0.00 0.00 0.00 0.00 0.00
2012 SFG 37.50 37.50 15.63 6.25 3.13 0.00 0.00 0.00 0.00 0.00 0.00
2011 STL 39.39 33.33 21.21 3.03 0.00 3.03 0.00 0.00 0.00 0.00 0.00
2010 SFG 31.25 50.00 15.63 3.13 0.00 0.00 0.00 0.00 0.00 0.00 0.00
2009 NYY 21.88 43.75 12.50 15.63 0.00 6.25 0.00 0.00 0.00 0.00 0.00
2008 PHI 29.63 44.44 14.81 3.70 3.70 0.00 0.00 3.70 0.00 0.00 0.00
2007 BOS 20.00 46.67 23.33 0.00 3.33 6.67 0.00 0.00 0.00 0.00 0.00
2006 STL 16.13 41.94 32.26 0.00 0.00 3.23 0.00 6.45 0.00 0.00 0.00
2005 CHW 14.81 40.74 40.74 0.00 3.70 0.00 0.00 0.00 0.00 0.00 0.00
2004 BOS 9.09 39.39 30.30 0.00 12.12 6.06 3.03 0.00 0.00 0.00 0.00
2003 FLA 10.00 30.00 50.00 10.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
2002 LAA 35.71 28.57 14.29 7.14 14.29 0.00 0.00 0.00 0.00 0.00 0.00
2001 ARI 10.00 50.00 20.00 6.67 0.00 3.33 0.00 0.00 0.00 0.00 10.00
2000 NYY 25.00 31.25 31.25 9.38 3.13 0.00 0.00 0.00 0.00 0.00 0.00
1999 NYY 20.00 36.00 32.00 12.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
1998 NYY 16.00 44.00 28.00 12.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
1997 FLA 6.45 29.03 38.71 16.13 0.00 0.00 0.00 0.00 3.23 0.00 6.45
1996 NYY 12.12 27.27 39.39 18.18 0.00 3.03 0.00 0.00 0.00 0.00 0.00
1995 ATL 40.00 40.00 16.00 4.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
1994 BOO XX XX XX XX XX XX XX XX XX XX XX
1993 TOR 29.63 37.04 33.33 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
1992 TOR 44.00 20.00 24.00 0.00 0.00 0.00 0.00 8.00 0.00 4.00 0.00
1991 MIN 33.33 29.63 33.33 0.00 0.00 0.00 0.00 3.70 0.00 0.00 0.00
1990 CIN 32.00 12.00 52.00 0.00 0.00 0.00 0.00 4.00 0.00 0.00 0.00
1989 OAK 32.14 32.14 32.14 3.57 0.00 0.00 0.00 0.00 0.00 0.00 0.00
1988 LAD 32.14 35.71 28.57 0.00 0.00 3.57 0.00 0.00 0.00 0.00 0.00
1987 MIN 33.33 11.11 51.85 3.70 0.00 0.00 0.00 0.00 0.00 0.00 0.00
1986 NYM 30.77 11.54 50.00 7.69 0.00 0.00 0.00 0.00 0.00 0.00 0.00
1985 KCR 38.46 19.23 26.92 11.54 0.00 3.85 0.00 0.00 0.00 0.00 0.00
1984 DET 42.86 17.86 28.57 3.57 0.00 7.14 0.00 0.00 0.00 0.00 0.00
1983 BAL 32.14 21.43 32.14 10.71 0.00 3.57 0.00 0.00 0.00 0.00 0.00
1982 STL 19.23 3.85 65.38 7.69 0.00 3.85 0.00 0.00 0.00 0.00 0.00
1981 LAD 43.48 17.39 21.74 8.70 0.00 8.70 0.00 0.00 0.00 0.00 0.00
1980 PHI 39.29 14.29 39.29 3.57 0.00 3.57 0.00 0.00 0.00 0.00 0.00
1979 PIT 32.00 12.00 40.00 16.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
1978 NYY 18.18 13.64 63.64 4.55 0.00 0.00 0.00 0.00 0.00 0.00 0.00
1977 NYY 18.18 13.64 63.64 4.55 0.00 0.00 0.00 0.00 0.00 0.00 0.00
1976 CIN 28.00 N/A 44.00 28.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
1975 CIN 33.33 N/A 45.83 20.83 0.00 0.00 0.00 0.00 0.00 0.00 0.00
1974 OAK 25.00 N/A 37.50 29.17 0.00 8.33 0.00 0.00 0.00 0.00 0.00

#DC+= Players acquired through free agent draft compensation;  #XD++= Players acquired through the expansion draft

The first note that needs to be made is regarding the 1997 Marlins and 2001 Diamondbacks. Both rosters had skewed roster construction due to how soon after the team’s inception they were able to win a championship. The Marlins have by far the lowest reliance on the amateur draft, and the Diamondbacks have tied for the highest reliance on free agents, but both of these numbers were driven up (or down) by the limited time for drafting and moving along prospects before their championships.

After accounting for the 2001 Diamondbacks, the steady rise in reliance on free agents since the mid-seventies stands out, at least up until three years ago. It’s hard to tell whether baseball is undergoing an actual “grass roots” movement, with teams relying less and less on big-market free agents to succeed, or whether this is simply a three-year blip on the radar, but the last three World Series winners have relied noticeably less on free agents than the winners of the previous seven years. The 2011 Cardinals, 2012 Giants, and 2013 Red Sox did not make up the difference through trades; instead, they leaned on their farm systems more than other winners of this millennium (not including the 2002 Angels).
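For anyone who wants to check that comparison against the table, here is a minimal Python sketch. It simply averages the FA and AD columns for the 2011-2013 winners and for the 2004-2010 winners; the numbers are copied from the table above, while the dictionary layout and the function name are only illustrative choices.

```python
# Free agent (FA) and amateur draft (AD) shares, in percent, copied from
# the World Series winners table above.
winners = {
    2013: {"team": "BOS", "AD": 26.47, "FA": 35.29},
    2012: {"team": "SFG", "AD": 37.50, "FA": 37.50},
    2011: {"team": "STL", "AD": 39.39, "FA": 33.33},
    2010: {"team": "SFG", "AD": 31.25, "FA": 50.00},
    2009: {"team": "NYY", "AD": 21.88, "FA": 43.75},
    2008: {"team": "PHI", "AD": 29.63, "FA": 44.44},
    2007: {"team": "BOS", "AD": 20.00, "FA": 46.67},
    2006: {"team": "STL", "AD": 16.13, "FA": 41.94},
    2005: {"team": "CHW", "AD": 14.81, "FA": 40.74},
    2004: {"team": "BOS", "AD": 9.09, "FA": 39.39},
}


def average_share(years, column):
    """Mean roster share (in percent) of one acquisition type over the given years."""
    values = [winners[year][column] for year in years]
    return sum(values) / len(values)


last_three = range(2011, 2014)      # 2011, 2012, 2013
previous_seven = range(2004, 2011)  # 2004 through 2010

for column in ("FA", "AD"):
    recent = average_share(last_three, column)
    earlier = average_share(previous_seven, column)
    print(f"{column}: last three {recent:.2f}%, previous seven {earlier:.2f}%")
```

With only three champions in the recent group, this is a description of the table rather than evidence of a lasting trend.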

In fact, excluding the fluky 2003 Marlins, there has not been a World Series winner as reliant on trades as the 2013 A’s (50 percent) since the mid-eighties Twins and Mets. What’s even more troubling for the A’s is that there hasn’t been a team to use the draft and free agency combined as little as the 2013 A’s since the 1982 Cardinals, a team built during the dawn of free agency.

When judging by championships, in fact, the picture of baseball as a sport in which you need to be in a big market, with the ability to sign big name free agents becomes unfortunately evident. The roster composition of nearly all of the World Series winners this century is quite similar to that first group of teams mentioned above as big market teams built through free agency. This is no surprise to any real baseball fans, however. Look at the cities that have hosted World Series parades since the Yankees’ dynasty of the nineties began. Sure, there are the success stories in Florida and Arizona, but other than that it’s a who’s who of big market teams. While the Cardinals play themselves off as plucky little underdogs, their payroll was the eleventh largest in baseball last year, almost exactly twice that of the A’s.

That’s why this year’s A’s team could be so special. If they are able to continue their regular season success, and finally make the breakthrough they have been struggling so much to make in recent years, they could continue the recent trend of teams moving away from a strictly free agent diet to fulfill their championship dreams. Of course, this has been the case for a couple of years in Oakland now, and it hasn’t happened yet. However, with the top three in the A’s rotation looking as good as any in baseball right now, baseball’s secret superstar at third, and the fact that it is the 25th anniversary of the last A’s World Series title, suddenly it doesn’t seem that unlikely that the A’s could make ole Bobby Frost proud this October.


What’s Changed for J.D. Martinez?

Before the 2012 season, some folks drafted J.D. Martinez as a deep sleeper, coming off a decent debut with the Astros in 2011 and a solid minor league profile. He went on to slug only 11 HR in 439 PA and hit a disappointing .241/.311/.375. What went wrong? Well, he pounded the ball into the ground at a 51.8% clip. His line drive rate dropped to 16.6% and he hit only 31.6% flyballs. It’s hard to hit HRs and hit for average with that kind of batted ball profile.

He got demoted to AA after failing to impress in 2013 and got injured. This year, for the Tigers, he mashed in AAA, was called up in late April, and has already hit 7 HRs in only 117 PA with a .312/.342/.596 batting line. So what has changed?


The Resurgence of Starlin Castro and Anthony Rizzo

The struggles of Starlin Castro and Anthony Rizzo during the 2013 season were well documented. Chicago Cubs fans’ hopes and dreams rested on these two young players to be the cornerstones of the long and painful “rebuild” on the North Side, and it appeared that maybe they were not cut out for such lofty expectations. The lineup around them offered little in the way of quality. Pitchers focused most of their attention on these two, and they struggled terribly. Starlin Castro owned a triple slash of .245/.284/.347, which led to questions about his focus and ability. Anthony Rizzo did not exactly turn any heads either, batting .233/.323/.419. At least Rizzo’s peripherals offered some hope that some positive regression was in store for the 2014 season. To say the least, 2013 was a down year for both young players.

When the 2014 season arrived, the script was quite different. Castro and Rizzo set out to silence the critics. With the disappointing 2013 season in the rearview mirror, both are producing at all-star levels so far this season. Castro’s mainstream statistics look spectacular, with a triple slash of .287/.331/.484 including 11 home runs and 43 RBIs (already matching his 2012 counting stats). That production at the premium position of shortstop makes it all the better. Here’s a look at Castro’s underlying statistics from 2013 and 2014:

Year O-Swing% BB% K% ISO wOBA wRC+ WAR
2013 32.6 4.3 18.3 .102 .280 70 -0.1
2014 29.8 5.5 17.2 .197 .356 122 1.7

Castro has improved greatly across the board. He is swinging at fewer pitches out of the zone, which is paying dividends in his BB% and K%. He ranks 3rd in both wOBA and wRC+ among all shortstops, behind Troy Tulowitzki and Hanley Ramirez. It is amazing to think that he is still pre-peak in the power category even though he has been in MLB for almost five full seasons. With 11 home runs so far, he is on pace for a career high this year. I think that it is safe to say that last year’s Castro was an illusion. He appears to be on his way to stardom just as the Cubs’ rebuild comes to a close.

Over at first base, Anthony Rizzo looks like the player Theo Epstein and Jed Hoyer thought he was going to be when they traded for him. This year, his production is nothing short of spectacular with a .278/.400/.506 triple slash including 15 home runs and 48 RBIs. That production is drawing comparisons to Joey Votto. The growth in his game can also be seen in his sabermetric stat line from 2013 and so far in 2014:

Year O-Swing% BB% K% ISO wOBA wRC+ WAR
2013 31.1 11.0 18.4 .186 .325 102 1.6
2014 26.9 15.5 19.4 .227 .393 148 2.6

Just like Castro, Anthony Rizzo drastically improved across the board (minus K%). Rizzo ranks 4th in wOBA and 5th in wRC+ among all first basemen. He has improved his defense and looks very comfortable at the plate. He too is on pace for a career high in home runs, racking up 15 already. Rizzo is showing that he can be a huge threat at the plate for years to come.
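As a quick sanity check on the ISO columns, isolated power is simply slugging percentage minus batting average, so it can be recovered from the triple slash lines quoted in this post. The Python sketch below does that; the function and variable names are illustrative only.

```python
def isolated_power(avg, slg):
    """ISO = SLG - AVG, rounded to three decimals like a published stat line."""
    return round(slg - avg, 3)


# Triple slash lines (AVG, OBP, SLG) quoted in the text above.
slash_lines = {
    ("Castro", 2013): (0.245, 0.284, 0.347),
    ("Castro", 2014): (0.287, 0.331, 0.484),
    ("Rizzo", 2013): (0.233, 0.323, 0.419),
    ("Rizzo", 2014): (0.278, 0.400, 0.506),
}

iso = {
    key: isolated_power(avg, slg)
    for key, (avg, _obp, slg) in slash_lines.items()
}

for (player, year), value in sorted(iso.items()):
    print(f"{player} {year}: ISO {value:.3f}")
```

Three of the four results match the tables exactly. Rizzo's 2014 figure comes out one point higher than the .227 listed, presumably because published ISO is computed from unrounded averages rather than from the rounded slash line.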

This was a crucial season for both Castro and Rizzo. The Cubs organization, having given out long term contracts to both, depended on them becoming mainstays in the lineup when they finally become threats in the NL Central. With Rizzo on pace for 4+ WAR this season and Starlin on pace for 3+ WAR, it looks like they really are the budding stars that Epstein and Hoyer believed they would be. With these two all-stars, Javier Baez, Kris Bryant, and the other top prospect talent the Cubs possess, the future looks very bright on the North Side of Chicago.


Taking a Closer Look at Hitting with Runners in Scoring Position

In baseball, one commonly debated question is how important it is to hit with runners in scoring position. Fans will often let out a sad sigh when their team strands runners in scoring position, look up how the team does in those situations, and say, “this is why we don’t score runs” or “this is why we don’t win games.” They will also look at a team with a good offense and immediately assume that it must be better than most at hitting with runners in scoring position. But just how much of a team’s success is based on hitting with runners in scoring position, and how much of hitting with runners in scoring position is based on team success?

I. Impact of Hitting with Runners in Scoring Position

One of the old clichés in baseball is, “you can’t win without hitting with runners in scoring position.” Many people link that to why the Cardinals had done so well in the past and why they haven’t really been able to get going this year. In years past, they have consistently been not only one of the best teams in baseball, but also the best at hitting with runners in scoring position.

Many people in the game also consider it to be one of the most important stats when it comes to judging a player’s hitting ability. In a press conference at the beginning of the season, Matt Williams had sabermetricians finally thinking that someone with their ideology was becoming the manager of the Washington Nationals when he said, “If you don’t get with the times, bro, you better step aside.” When I heard that, I immediately thought that he would be talking about more advanced hitting metrics than batting average, home runs, and RBIs. He followed that comment up with, “My favorite stat right now and always has been the stat of hitting with runners in scoring position. Because batting average and on-base percentage and all of those things are great, but who is doing damage and how can they hit with guys in scoring position.” When I heard that, I immediately slunk back in my chair and placed him in the category of old-school.

And while listening to one of the Reds games (as I always do), with Marty Brennaman on the call (I think he is a good broadcaster for his catchy phrases and also because he’s from where I’m from), I heard him talk about Votto. He said, “Votto will take a 3-0 pitch an inch off the outside corner, when he could do with it what he did Wednesday. I believe in expanding your strike zone when you’ve got guys on base.” For those who don’t know, what he did on that Wednesday (a while ago) was drive a 3-0 pitch from Matt Harvey (that shows how long ago it was) for a home run to left field in New York. Unfortunately, for a while now Marty Brennaman has seemingly been leading an old-school war against his own team’s star first baseman, Joey Votto, over hitting, namely hitting with runners in scoring position or men on base. Again, while listening, I slid back in my chair, disappointed in Marty for being so misguided and for broadcasting his wrong opinion to the many people who listen to him on the radio.

Williams and Brennaman aren’t the only people that have this mindset though. The thing that they and many other people think is that if you can’t hit with runners in scoring position, you can’t win games and you can’t score runs. For these people, it is for the most part a blind hypothesis, just assuming it is true because it seems that it should be true.

To examine this, I am going to look at the coefficient of determination, or R² (the square of the correlation coefficient, R). For those who don’t know, when you fit a line of best fit to the data, R² is the share of the variation in the y-values that the line accounts for: an R² of 1 means every point sits exactly on the line, and an R² of 0 means the line explains nothing. The dependent variables, or y-values, are wins and runs, and the independent variables, or x-values, are the various offensive statistics that I will use to test my hypothesis (hitting with runners in scoring position does not have much to do with determining how many wins a team gets in a season or how many runs a team scores). Basically, it measures how closely team wins and runs track hitting with runners in scoring position. Before I look at hitting with runners in scoring position, it is important to establish which three offensive statistics are the best at determining wins and runs.
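For readers who want to see the arithmetic, here is a minimal Python sketch of the correlation coefficient R, its square, and the least-squares line of best fit. The five team seasons in it are made up purely for illustration; they are not the 2002-2013 data behind the numbers in this post, and the function names are my own.

```python
from math import sqrt


def _sums(xs, ys):
    """Return the centered sums that both R and the best-fit line need."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return mean_x, mean_y, sxy, sxx, syy


def pearson_r(xs, ys):
    """Correlation coefficient R between two equal-length lists."""
    _, _, sxy, sxx, syy = _sums(xs, ys)
    return sxy / sqrt(sxx * syy)


def best_fit_line(xs, ys):
    """Slope and intercept of the least-squares line y = slope * x + intercept."""
    mean_x, mean_y, sxy, sxx, _ = _sums(xs, ys)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical team seasons: OPS (x) and runs scored (y).
team_ops = [0.700, 0.720, 0.740, 0.760, 0.780]
team_runs = [650, 690, 735, 760, 810]

r = pearson_r(team_ops, team_runs)
slope, intercept = best_fit_line(team_ops, team_runs)
print(f"R = {r:.4f}, R squared = {r * r:.4f}")
print(f"line of best fit: y = {slope:.1f}x + {intercept:.1f}")
```

For a simple one-variable fit like this, squaring R gives the same number as the share of variation explained by the line, which is why the two descriptions are used interchangeably.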

In terms of influencing the scoring of runs from 2002 to 2013, the three best offensive statistics are:

1. OPS with an R² of .9132 (OPS explains about 91% of the variation in runs; line of best fit: y = 2059.2x - 791.27)
2. ISO with an R² of .5801 (ISO explains about 58% of the variation in runs; line of best fit: y = 3279.75x + 238.02)
3. wOBA with an R² of .3999 (wOBA explains about 40% of the variation in runs; line of best fit: y = 3482.9x - 389.93)

When it comes to which statistics determine wins the most, the three best statistics are:

1. WAR with an R² of .5329 (WAR explains about 53% of the variation in wins; line of best fit: y = 1.1243x + 59.614)
2. wRC+ with an R² of .4302 (wRC+ explains about 43% of the variation in wins; line of best fit: y = 0.8977x - 5.4636)
3. wRAA with an R² of .3632 (wRAA explains about 36% of the variation in wins; line of best fit: y = 0.1033x + 81.239)

There are a couple of things to notice when looking at this data. One is that most offensive statistics have a much weaker coefficient of determination when looking at wins, in large part because pitching is kept completely out of the equation. Another is that with a bigger sample the R² values would be different, but with this sample (which I will also use for RISP), these are the R² values that show up.

The reason for collecting those statistics for offense in general, as opposed to just RISP, is to have a baseline to use when looking at how much RISP influences offense. Here is how the same statistics, computed only with RISP, fare at determining runs scored over a full season:

1. OPS has an R² of .3099 (RISP OPS explains about 31% of the variation in runs; line of best fit: y = 948.7x + 19.173)
2. ISO has an R² of .2395 (RISP ISO explains about 24% of the variation in runs; line of best fit: y = 1812.2x + 470.92)
3. wOBA has an R² of .2898 (RISP wOBA explains about 29% of the variation in runs; line of best fit: y = 2391.5x - 35.754)

It is quite a dramatic change, especially for OPS, which clearly had a big hand in determining runs scored in a season. While some of these still have a modest effect in determining runs scored, they are not at the same level as the versions that cover a full season rather than a single scenario. Now, looking at how those other statistics determine wins with runners in scoring position:

1. WAR has an R² of .29 (RISP WAR explains about 29% of the variation in wins; line of best fit: y = 2.5609x + 68.94)
2. wRC+ has an R² of .2739 (RISP wRC+ explains about 27% of the variation in wins; line of best fit: y = 0.5518x + 27.727)
3. wRAA has an R² of .2366 (RISP wRAA explains about 24% of the variation in wins; line of best fit: y = 0.2366x + 80.996)

As I mentioned before, these numbers should be expected to be low because much more goes into a win than offensive ability. Great pitching matters too, and it is not taken into account here. With that said, these numbers are quite far from being great at determining wins, as is evidenced by their still being far from even the 50% mark that they should be close to.

For Matt Williams’ sake, I also looked at how much batting average with runners in scoring position determines wins and runs:

1. For scoring runs, AVG has an R² of .181 (RISP AVG explains about 18% of the variation in runs; line of best fit: y = 2005.8x + 213.05)
2. For wins, AVG has an R² of .1427 (RISP AVG explains about 14% of the variation in wins; line of best fit: y = 257.76x + 13.255)

So Matt, not to rain on your parade, but batting average with runners in scoring position has very little to do with determining runs or wins. And Marty, it’s just limiting Votto’s overall production to a small sample size that doesn’t have a whole lot to do with winning games. No one will dispute that hitting with runners in scoring position can help to win games, because it does often result in scoring a run, but it should not be looked at as one of the key stats in a player’s production.

II. Is it dependent on overall strength of offense?

Now back to those St. Louis Cardinals. Last year, with runners in scoring position, they put up numbers that were not just unreal but plain stupid. I mean, they batted .330 with runners in scoring position, had a .370 wOBA and a 138 wRC+, and won 97 games, 32 games over .500. As I have previously established, those numbers are intrinsically worthless considering that it is such a small sample size, but they are still gaudy. This year, for lack of a better word, they’re awful with runners in scoring position: a .244 batting average, .293 wOBA, and 86 wRC+ with runners on second or third. They have won 39 games, only 4 over .500.

Many people look at that and think that, clearly, their inability to hit with runners in scoring position this year has caused the drop-off in production. Of course, the fact that this year’s overall .303 wOBA, 92 wRC+, .681 OPS, and .250 AVG are a bit of a drop-off from last year’s .322 wOBA, 106 wRC+, .733 OPS, and .269 AVG might have something to do with that drop-off in offense too. The Cardinals’ offense is also scoring about a run less this year than it did last year (4.83 runs per nine innings in 2013 and 3.67 in 2014), while their pitching has been practically identical to last year’s, with a FIP of 3.31, xFIP of 3.66, and SIERA of 3.60 this season compared to last year’s 3.39 FIP, 3.63 xFIP, and 3.57 SIERA. But is hitting with runners in scoring position dependent on how good the offense is overall? I’m sure you can already see what coefficient we’re going back to.
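To put the overall drop-off in rough run terms, one option is to plug the two Cardinals OPS figures above into the runs-from-OPS line of best fit quoted in Part I (y = 2059.2x - 791.27). The Python sketch below does only that. It assumes the line can be read as full-season runs for a team with that OPS, and the 162-game divisor and the function name are my own additions, so treat the output as a ballpark rather than a projection.

```python
GAMES_PER_SEASON = 162  # assumption: a standard full regular season


def predicted_runs_from_ops(ops):
    """Season runs implied by the Part I line of best fit for team OPS."""
    return 2059.2 * ops - 791.27


ops_2013 = 0.733  # Cardinals overall OPS last year, from the text above
ops_2014 = 0.681  # Cardinals overall OPS so far this year

runs_2013 = predicted_runs_from_ops(ops_2013)
runs_2014 = predicted_runs_from_ops(ops_2014)
gap = runs_2013 - runs_2014

print(f"2013 OPS {ops_2013:.3f} -> about {runs_2013:.0f} runs")
print(f"2014 OPS {ops_2014:.3f} -> about {runs_2014:.0f} runs")
print(f"gap: about {gap:.0f} runs, or {gap / GAMES_PER_SEASON:.2f} per game")
```

The per-game gap from the line can then be set next to the actual difference between 4.83 and 3.67 runs per nine innings to see how much of the decline overall OPS alone would account for.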

The process was similar to last time, with the dependent variable, or y-value, being a statistic with runners in scoring position, and the independent variable, or x-value, being the same statistic over the course of a full season. I found that full-season wRC has by far the strongest relationship with how a team hits with RISP, with an R² of .7527 (it explains about 75% of the variation; line of best fit: y = 0.3364x - 51.232). OPS is next, with an R² of .6487 (65%; y = 1.0184x + 0.0025), followed by wOBA, with an R² of .6258 (63%; y = 0.9807x + 0.0062). Some other values are:

• wRAA, with an R² of .5811 (it explains about 58% of the variation; line of best fit: y = 0.2586x + 0.5721)
• wRC+, with an R² of .5558 (it explains about 56% of the variation; line of best fit: y = 0.9678x + 3.3038)
• WAR, with an R² of .3831 (it explains about 38% of the variation; line of best fit: y = 0.2005x + 0.8901)

So a case could be made that the strength of a team’s offense overall does dictate how that same team hits with runners in scoring position. While the coefficient of determination is by no means overwhelmingly strong in any of the cases, in most cases the overall strength of an offense accounts for at least 50% of the variation in hitting with runners in scoring position, which is good enough to at the very least say that better offensive teams are more likely to hit better with runners in scoring position than weak offensive teams.


How Jose Abreu’s Career in Cuba Reflects His Future MLB Success

Before coming to MLB and smashing 20 home runs in just his first 58 games, Jose Abreu had a prolific career in the Cuban Baseball National Series (Cuba’s top championship), starting at the very early age of 16, when he would play at first, second, third or in the outfield. While doing so, he hit .271 with five homers and 21 RBIs in 71 games. He seemed like a very hot prospect, taking into account how old he was (or how young, for that matter), even if for that very short stretch (say, the 2003-04 and 2004-05 seasons) he seemed overpowered by pitchers, some of whom were old enough to be his father. From then on, he owned them.

For his career in Cuba, Abreu fell 16 homers shy of 200 in ten seasons. Yet it was his youth that kept him from hitting more of them in his early seasons. Through his 21-year-old season, his career high in dingers was 13 (that very year), and he had collected more than ten only once before that (11 in 2005-06), when he had what could be called his breakthrough year, hitting .337 with 105 hits and 64 RBIs in 84 games.


Breaking Down The Aging Curve

Ever since I read Jeff Zimmerman’s aging curve article in December, I have been thinking more about aging curves in general. That has led me to take a step back and start digging through players in a different way. Jeff gave a couple of plausible reasons for the difference in the aging curve: teams are developing players better prior to their appearing in the majors, and they are doing a better job of identifying when players are ready. I’ll throw another out there before I start. MLB has gotten younger recently, and to do that you need to be pulling in more young players. In general, you would expect the players first pulled up at each age to be in the farthest region of the right tail of the talent distribution, and then you move left as you add more players from that group. Maybe a larger percentage of the younger players being brought up just are not as good and won’t ever thrive at the big league level. Anyway, let’s get to what I have started working on to see if breaking things apart can shed any light on the subject.

To start, I pulled every position player season for rookies in the expansion era (after 1960) and ended up with 2,054 players and 11,585 player seasons, including active players and not just completed careers. Then I broke players into age cohorts by the age at which they played their first season with at least 300 plate appearances, which I will refer to as a full season the rest of the way. I will be working through these to see if players age differently based on the age at which they reach the majors and get regular playing time. To do this I will mostly be looking at percent of peak wRC+ and WAR. For this post I am only doing the first couple of cohorts, and then I will work through more in the coming weeks.
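To make the cohort and percent-of-peak bookkeeping concrete, here is a minimal Python sketch of one way it could be done. The three players and their numbers are invented for illustration, the function names are hypothetical, and this is not necessarily how the charts in this post were built. It uses wRC+ only; WAR would need extra care because a peak of zero or below breaks the percentage.

```python
from collections import defaultdict

MIN_PA = 300  # threshold for a "full season", as defined above

# Hypothetical player seasons: (player, age, plate appearances, wRC+).
seasons = [
    ("Player A", 19, 520, 95),
    ("Player A", 20, 610, 120),
    ("Player A", 21, 640, 150),
    ("Player B", 19, 150, 70),  # under 300 PA, so not a full season
    ("Player B", 20, 480, 90),
    ("Player B", 21, 590, 100),
    ("Player C", 19, 410, 80),
    ("Player C", 20, 450, 100),
]


def full_seasons_by_player(rows, min_pa=MIN_PA):
    """Group (age, wRC+) pairs by player, keeping only full seasons."""
    by_player = defaultdict(list)
    for player, age, pa, wrc_plus in rows:
        if pa >= min_pa:
            by_player[player].append((age, wrc_plus))
    return by_player


def cohort_percent_of_peak(rows, min_pa=MIN_PA):
    """Return {cohort age: {age: average percent of each player's peak}}."""
    buckets = defaultdict(lambda: defaultdict(list))
    for full in full_seasons_by_player(rows, min_pa).values():
        cohort = min(age for age, _ in full)     # age of first full season
        peak = max(value for _, value in full)   # best full-season wRC+
        for age, value in full:
            buckets[cohort][age].append(100.0 * value / peak)
    return {
        cohort: {age: sum(v) / len(v) for age, v in sorted(ages.items())}
        for cohort, ages in buckets.items()
    }


curves = cohort_percent_of_peak(seasons)
for cohort, curve in sorted(curves.items()):
    print(cohort, {age: round(pct, 1) for age, pct in curve.items()})
```

Note that Player B lands in the age 20 cohort even though he appeared at 19, because his age 19 season falls short of 300 plate appearances.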

The first cohort I broke down was the age 19 group. Only one player amassed the 300 plate appearances necessary at age 18, Robin Yount, so there is not much to learn there except that if you can hack it in the big leagues when you are 18 you are probably really, really good. That will be true for the 19 and 20 year-olds as well, but there are more of them. The age 19 cohort is also small, with only 8 players: Ken Griffey Jr., Edgar Renteria, Bryce Harper, Cesar Cedeno, Tony Conigliaro, Ed Kranepool, Jose Oquendo, and Rusty Staub. This will be the only cohort small enough that I will list everybody. Interestingly, the age 20 cohort has a lot more star power, as Griffey is the only Hall of Famer here (I know he isn’t in yet, but he will be on the first ballot).

Of the seven 19 year-olds that have retired, the average number of full seasons played is almost 13, so they did have long careers as you would expect. None of the players peaked in wRC+ or WAR in their first full season, which is not surprising. The more seasons you are in the majors, the lower the probability that the first season will be the best one, just because you have more opportunities to best it. Harper actually put up a better wRC+ in year 2, though his rookie WAR was better and this year isn’t looking like a new high for him so far. If you take their average percent of peak at each age and chart it, this is what you get:

[Chart: average percent of peak wRC+ and WAR by age, age 19 cohort]

The sample size here is so small I wouldn’t want to believe it too much, but we might see some improvement for this cohort early in their careers. The peak, if there is one, looks like 25 to about 27, especially in WAR. Then it is all decline. Again, these are players from the era that showed this before, not players from the last 10 years, who are not showing improvement in Jeff’s article.

Let’s move on to a bigger group and see what happens. The age 20 cohort includes 37 players, 10 of them current players. There are Hall of Fame or near-HoF players all over. Rickey Henderson, Roberto Alomar, Ivan Rodriguez, and Johnny Bench are in, along with Alex Rodriguez, Joe Torre, Andruw Jones, Gary Sheffield, Alan Trammell, Adrian Beltre, and Miguel Cabrera. Mike Trout is the only young guy I would assume has to eventually make it, but there are a couple of others there that might eventually be that good too. In my opinion, about a third of this group are HoF caliber or will be after their careers are done. That means roughly 1 out of every 3 players who stick in the bigs at age 20 will be good enough to make it to Cooperstown, way better than the 19 year-olds. The average career length for those that are not active was over 11 years, so again most should not max out in their first year.

Only three players had their best hitting season as a rookie, but that was because all three of them had their only 300+ plate appearance season at age 20, so it was the only season in the sample. Danny Ainge was one of the three, though, so maybe we could go see when his basketball career peaked instead. All three therefore also had their best WAR season at 20, but there was a fourth player who had his max WAR in that first full season, Claudell Washington. Washington had 14 full seasons as a major leaguer and his best by WAR was year 1, and he had only one wRC+ better than that first year. If we look at the chart for the age 20 cohort, it looks way different than the one for the age 19 cohort.

[Chart: average percent of peak wRC+ and WAR by age, age 20 cohort]

Again, this is not a large sample, and it is overwhelmed by extremely good players.  There seems to be an increase in the first couple of seasons followed by a long, flat peak that for wRC+ goes all the way into their early 30s.  WAR is more volatile and might start declining a couple of years sooner.

I expect that this will get more informative as we get into more normal players and larger samples, but it is fun to look at elite players. I’ll break down a couple more age groups in the near future, and eventually try to build a regressed model for the bigger cohorts to control for the era and some of the other effects that aren’t rolled into wRC+ or WAR.


Is David Price Actually Improving?

Casual fans who look at David Price’s stat-line this year definitely come away unimpressed. On the surface, his 4-6 record and 3.97 ERA are sub-par for a pitcher of his caliber, especially one who has been pegged as an ace for his entire major league career. Along with the underwhelming initial stat-line, his average fastball velocity is still down, from its apex of about 95-97 MPH to around 92-94 MPH. All of this looks like it spells disaster for both the Rays, who want to ship him out at the deadline for future cornerstone players, and for Price, who is a free agent after the 2015 season.

This table can show you the slight but meaningful decline in Price’s velocity since his Cy Young Award winning season in 2012:

Velocity (MPH)
Year Fastball Sinker Change Curve Cutter
2012 96.49 96.17 84.93 79.55 89.88
2013 94.51 94.47 84.72 80.32 89.15
2014 94.38 93.96 85.63 79.88 87.26

But if you delve deep into the world of statistics, it appears that David Price is arguably improving as a pitcher.

His K/9 is sitting at a career-best 10.02, along with a career-best BB/9 of 0.90. If you look a little deeper at the sabermetric stat-line, Price is also performing at a career-best FIP and xFIP, which are 2.97 and 2.66, respectively. These two stats show that Price’s ERA is not indicative of his actual performance. Continuing this trend, his LOB% sits at a below-average 67.5%. High-strikeout pitchers like Price usually have more control over their LOB%, so it’s very likely that Price will positively regress toward his career average of about 75%. It could even end up better, given his increase in strikeouts and decrease in walks. He is also sporting a career-high 12.3% HR/FB that is contributing to his inflated ERA.

And if you look even deeper into the statistical world, Price is changing how he pitches, and it’s actually improving his performance from its already lofty level. The only problem is the surface stats are not catching up with his actual performance... just yet. Here is a table that shows Price’s pitch usage over the past three years:

Pitch Usage
Year Fastball Sinker Change Curve Cutter
2012 12.56% 48.39% 12.15% 10.85% 16.06%
2013 15.07% 39.43% 16.61% 11.02% 17.87%
2014 15.97% 40.45% 17.02% 10.93% 15.64%

With the velocity decrease in mind, the data suggest that Price has had to adapt as a pitcher in order to continue having success. His fastball and changeup usage has increased because he can no longer simply blow pitches by hitters with ease. Along with this:

Whiff Percentage
Year Fastball Sinker Change Curve Cutter
2012 9.24 6.15 12.37 20.25 9.74
2013 9.83 4.49 17.38 6.73 6.29
2014 9.28 9.20 19.09 12.88 12.02

In 2014, Price is rocking better whiff rates than in his amazing Cy Young Award-winning 2012 season. His whiff rates have increased across the board, other than on his curveball. This suggests that David Price has adjusted his game around his diminishing velocity and has adapted from a power pitcher into a smarter, craftier pitcher who changes speeds and does not rely solely on velocity to put away hitters. These increased whiff rates are the reason that Price is sporting a career-best K/9. He is also throwing a career-best 72.1% of first pitches for strikes, which contributes to his career-best BB/9.
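The 2012-to-2014 comparison can be read straight off the usage and whiff tables above. The short Python sketch below just subtracts the 2012 row from the 2014 row for each pitch; the numbers are copied from those tables, and the variable and function names are illustrative.

```python
PITCHES = ["Fastball", "Sinker", "Change", "Curve", "Cutter"]

# Rows copied from the Pitch Usage and Whiff Percentage tables (percent).
usage = {
    2012: [12.56, 48.39, 12.15, 10.85, 16.06],
    2014: [15.97, 40.45, 17.02, 10.93, 15.64],
}
whiff = {
    2012: [9.24, 6.15, 12.37, 20.25, 9.74],
    2014: [9.28, 9.20, 19.09, 12.88, 12.02],
}


def change_by_pitch(table, start, end):
    """Percentage-point change per pitch type between two seasons."""
    return {
        pitch: round(after - before, 2)
        for pitch, before, after in zip(PITCHES, table[start], table[end])
    }


usage_change = change_by_pitch(usage, 2012, 2014)
whiff_change = change_by_pitch(whiff, 2012, 2014)
whiff_decliners = [pitch for pitch in PITCHES if whiff_change[pitch] < 0]

print("usage change:", usage_change)
print("whiff change:", whiff_change)
print("pitches with a lower whiff rate than 2012:", whiff_decliners)
```

The fastball's whiff gain is tiny, so "across the board" mostly reflects the sinker, changeup, and cutter.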

Overall, a simple glance at Price’s stat-line would give the impression that he is declining. But after looking deeper at his actual performance this season, the underlying facts show that he is changing the way he pitches and could quite possibly be getting better. There are rumblings that scouts no longer view Price as an ace that can lead a team deep into the playoffs. From a scouting perspective that may appear to be true, but with the knowledge of these underlying statistics, I believe that Price is still the pitcher he always has been, if not better.