Archive for February, 2015

Time from Draft Until Majors

Yesterday, Nate Andrews asked the following question on Carson’s Instagraphs post:

I’m just a little curious about the idea of length to reach the major leagues though. It’s definitely interesting to see the difference between high school and college draftees, but I’d be interested in looking at, say, the average length to reach the major leagues (especially in first-round draftees, considering they make it more often). On average, does it take a high school kid significantly longer compared to someone who went to a 4-year college? I’d assume yes, but just something I would find interesting to see.

To answer that question, I did a very quick analysis. I compiled all first-round draft picks, including supplemental first-rounders, from 2008-2011 (dates arbitrarily chosen). Of those 200 players, 87 have reached the majors. Ignoring the three JC players, we are left with the following average time to majors:

HS: 3.8 years
4 Year: 2.5 years

So from this it appears that it takes high-school players a little over a year longer than college players, at least for those taken in the first round.

Of course, this analysis has a huge flaw, namely that I’m ignoring those 113 players that haven’t reached the majors. Many of them never will, but several will get there, thus biasing my numbers too low.

To deal with this issue, I turned to a branch of statistics known as survival analysis. The name stems from biostatistics, where, as the name implies, researchers are interested in the time until an individual dies. However, many medical studies run for only a few years, and inevitably some individuals do not die within the study period. Thus the idea of ‘censoring’ was born: we know that someone survived until some time, but we do not know when they will actually die. These individuals still provide information for the researchers, which is modeled using a censoring mechanism. If anyone is interested in survival analysis, there are tons of references, but you can start with Wikipedia.
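To make the censoring idea concrete, here is a minimal Kaplan-Meier sketch in Python. The toy data below is invented for illustration and is not the article’s actual draft dataset; the point is just that censored players still contribute to the denominator while they remain “at risk.”

```python
from collections import Counter

def kaplan_meier(observations):
    """observations: list of (years, reached_majors) pairs;
    reached_majors=False means the player is censored at that time.
    Returns (time, survival probability) at each event time."""
    events = Counter(t for t, reached in observations if reached)
    censored = Counter(t for t, reached in observations if not reached)
    at_risk = len(observations)
    surv = 1.0
    curve = []
    for t in sorted(set(events) | set(censored)):
        d = events.get(t, 0)
        if d:
            surv *= 1 - d / at_risk           # classic product-limit step
            curve.append((t, surv))
        at_risk -= d + censored.get(t, 0)     # remove events and censored
    return curve

# (years to majors, reached?) -- made-up numbers for illustration
toy = [(2, True), (3, True), (3, False), (4, True), (5, False), (6, True)]
for t, s in kaplan_meier(toy):
    print(t, round(s, 3))
```

With the toy data, the estimated “still hasn’t reached the majors” probability drops to roughly .833, .667, .444, and 0.0 at years 2, 3, 4, and 6.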

In biostatistics, though, we typically know that the event will eventually occur, while in our context (time to reach the majors) an event may never occur. There are good ways to deal with this, but I am going to be lazy: I just dropped the four players from my group that have not played professionally since 2012. This will likely bias the numbers too high, but still provides a fun exercise.

Once we account for censoring, the average time to majors is now:

HS: 6.5 years
4 Year: 4.5 years

While biased high, the difference between high school and college should not be affected. Now we have evidence that it takes high-school players about two more years than college players, at least on average. For those interested, JC players come in at 4.7 years (extremely small sample size warning!).

Just for fun, I did a similar analysis for draft position and for fielding position. Again, these numbers will be biased a little too high, but are interesting nonetheless.

The first overall pick is expected to reach the majors in 3.6 years. For every pick after that, we expect an additional 0.07 years, on average, to reach the majors. Thus the 10th overall pick should reach the majors in ~4.3 years, the 30th overall in 5.7 years, and so forth.
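The pick-number relationship above can be written as a one-line model. The 3.6-year intercept and 0.07-years-per-pick slope come from the text; the article’s quoted ~4.3 and 5.7 presumably reflect unrounded regression coefficients, so this sketch lands a hair lower.

```python
def expected_years_to_majors(pick):
    """Rough linear model from the text: first overall pick ~3.6 years,
    plus ~0.07 additional years for each subsequent pick."""
    return 3.6 + 0.07 * (pick - 1)

print(expected_years_to_majors(1))             # 3.6
print(round(expected_years_to_majors(10), 2))  # ~4.2
print(round(expected_years_to_majors(30), 2))  # ~5.6
```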

For position, a player’s position is whichever one is assigned to him on Baseball Reference’s draft page. I have no idea if this represents his position on draft day, or something else. Left fielders and right fielders are lumped together here. The results are as follows:

C: 5.0 years
1B: 3.1 years
2B: 4.5 years
3B: 3.9 years
SS: 5.1 years
LF/RF: 6.3 years
CF: 5.6 years

RHP: 5.9 years
LHP: 5.5 years

It should be said again that these numbers are a bit high. Furthermore, I am well aware that the sample size is low, so expect rather high uncertainty on these numbers. If anyone wants to further this analysis, I highly encourage it!


Competitive Bidding for the All-Star Game?

It was announced recently that the Miami Marlins will host the 2017 All-Star Game, making this the first time the Marlins will host the Midsummer Classic. Fourteen years ago the Marlins were in line to host the 2000 All-Star Game, but after their fire sale following their 1997 World Series championship, MLB flipped the game to Atlanta.

Traditionally, the All-Star Game has alternated between leagues. The last time the same league hosted back-to-back All-Star Games was 2006-2007, when the game was played in Pittsburgh’s PNC Park and then San Francisco’s AT&T Park, both in the National League. Before that, you have to go all the way back to 1950-1951 to find All-Star Games hosted by the same league in back-to-back seasons (the White Sox’ Comiskey Park in 1950, Detroit’s Briggs Stadium in 1951). Awarding the 2017 game to the Marlins means that the National League will host the game three years in a row, following Cincinnati this year and San Diego in 2016.

In the case of the Marlins in 2017, it appears that outgoing commissioner Bud Selig had slated the Marlins to get an All-Star Game in the near future after the team opened its new ballpark in 2012. Other teams in contention were the Baltimore Orioles, who last hosted the game in 1993, and the Washington Nationals. The Nationals have not hosted a game since their move from Montreal in 2005.

Along with the 2017 All-Star Game announcement, incoming commissioner Rob Manfred had this to say in a recent interview with ESPN’s Jayson Stark: “One of the things that I am going to try to do with the All-Star Games is—and we’ll make some announcements in the relatively short term—I am looking to be in more of a competitive-bidding, Super Bowl-awarding-type mode, as opposed to [saying], ‘You know, I think Chicago is a good idea’”.

The Super Bowl bidding process is quite a thing to behold. In an article in the Minneapolis Star-Tribune from June of last year, the paper listed many of the concessions made by the city to get the Super Bowl. The paper also posted a copy of the NFL’s “Host City Bid Specifications and Requirements”, which is 153 pages long. Among the items in this document:

  • NFL controls 100 percent of the revenues from all ticket sales, including suites, and exclusive access to all club seats.
  • Exclusive, cost-free use of 35,000 parking spaces for gameday parking.
  • Full tax exemption from city, state, and local taxes for tickets sold to the Super Bowl, including the NFL Experience, the NFL Honors show and other NFL Official Events.
  • NFL has the option to install ATMs that accept NFL preferred credit/debit cards in exchange for cash and to cover up other ATMs.
  • Host city is asked to pay all travel and expenses for an optional “familiarization trip” for 180 people in advance of the Super Bowl to inspect the region.
  • NFL requires the usage of three top-quality, 18-hole golf courses in close proximity to one another and greens and cart fees at these three courses must be waived or otherwise provided at no cost to the NFL. (Golf courses in Minneapolis in February? Really?!)
  • NFL requires the reservation of up to two quality bowling venues at no rental cost.

And those are just a few of the requests made by the NFL. The “competitive-bidding, Super Bowl-awarding-type mode” is great for NFL bigwigs. It’s not surprising that MLB owners would want to get on board this gravy train.

Other sports have their own bidding processes for their showcase events. Notably, FIFA has had numerous scandals associated with the bidding process for the World Cup. Most recently, after awarding the 2018 World Cup to Russia and the 2022 World Cup to Qatar, FIFA had an investigator create a report looking into accusations of impropriety in the World Cup bidding process. FIFA then announced that the report “cleared its integrity and should constitute closure” despite the fact that the 42-page summary of the report identified numerous instances of corruption, collusion, and vote-buying. Even as the head of FIFA’s ethics committee patted himself on the back, the man who created the report, U.S. prosecutor Michael Garcia, claimed that the portrayal of his report was “erroneous and incomplete”.

Of course, the long-standing king of bidding process corruption would be the Olympics. Do a Google search on “corruption in the Olympics bidding process” and numerous articles as far back as 1999 turn up dealing with shenanigans when it comes to awarding the Olympics to cities competing for the honor.

In a 1999 article in the New York Times, the International Olympic Committee (IOC) acknowledged corruption in the Salt Lake City bidding for the 2002 Winter Olympics. Richard Pound, a lawyer who led the IOC investigation, said, “We have found evidence of very disappointing conduct by a number of IOC members. Their conduct has been completely contrary to everything the Olympic movement has worked so hard to represent.” HA! That’s quite funny, in hindsight. Rather than clean up the process, things have just grown worse in the last 15 years.

Sure, the Salt Lake City Olympics scandal triggered reform that was supposed to ban gifts and favors to Olympic committee members, but less than a decade later the Olympics were awarded to Sochi (won the bid in 2007, hosted the 2014 Winter Olympics). One article proclaimed the Sochi Olympics the “most corrupt Olympics ever.”

In the case of the NFL, the process for bidding to host the Super Bowl is not necessarily corrupt; it’s just pure greed. Fat cat NFL owners realize there’s no event bigger than the Super Bowl, so they can demand anything they want from prospective host cities. Is it at all surprising that the billionaire baseball owners want in on this action?

Personally, I like that the baseball All-Star Game gets passed around from city to city. In the past 25 years, 24 different MLB teams have hosted the All-Star game (Pittsburgh hosted the game twice, in two different ballparks). In that same time period, just 14 NFL cities have hosted the Super Bowl, with four cities hosting three or more Super Bowls during that stretch. In the 49-year history of the Super Bowl, the game has been played in Miami and New Orleans ten times each, and another seven times in Los Angeles, meaning more than half of the Super Bowl games have been played in just three cities. Of course, the NFL does prefer to have the game played in warm-weather cities, unless you have a dome (Minneapolis, Indianapolis, Detroit) or you are New York (???). Major League Baseball doesn’t have to worry about the weather for the Midsummer Classic.

We’ll have to see what MLB commissioner Rob Manfred has to say about the “competitive-bidding, Super Bowl-awarding-type mode” he’s considering for the All-Star Game. Will there be a mechanism in place so the same city doesn’t host the game every few years? Will cities without major league baseball get a chance to bid for the game? And just how many top-quality, 18-hole golf courses will MLB owners demand?

Do Pitchers Adjust to Their Receivers’ Strengths and Weaknesses?

Rob Arthur published a really interesting piece at Baseball Prospectus, where he presented evidence in favor of the idea that batters are aware of the relative framing ability of the receiver they’re facing. That’s really fascinating to me, because it suggests that this skill, which the baseball research community has only recently begun to quantify, has been understood by players for a long enough time to show up in the behavior of major leaguers.

If that’s true, batters are not the first component of an at-bat I’d expect to adjust to the receiver. Quotes from pitchers in the past have suggested that they’re aware of when their catcher is helping them out, and how; I want to know if that awareness is reflected in their pitch tendencies. Specifically, I want to know if pitchers are aware of the particular framing skills of their receivers. This article, by Community Blog Overlord Jeff Sullivan, is a little old, but it was one of the first framing articles I read, and the first I remember suggesting that some catchers were not just better at framing than their counterparts, but better at framing in specific parts of the zone. This more recent article, where Dave Cameron discusses the possibility of voting for Jonathan Lucroy as NL MVP, does talk about pitcher tendencies based on receiver skill, but it’s one pitcher and one catcher. Additionally, I’m just as interested to see how pitchers react to bad receivers, which, as far as I can tell, hasn’t been covered. Do pitchers throw to their receivers’ strengths, and do they avoid their weaknesses?

The first thing to do is to establish how catchers do in different sections of the strike zone. I’m using Pitch F/X data from the wonderful Baseball Savant, which splits the zone like so:

[Image: Baseball Savant’s numbered strike-zone chart]

For the purposes of this article, I’m concerned with the relative ability of receivers to preserve and gain strikes in different parts of the zone. As such, I’m going to categorize all pitches as “high in-zone” (in zones 1, 2, and 3), “high out-zone” (11 and 12), “low in-zone” (zones 7, 8, and 9), and “low out-zone” (13 and 14). It is a little unfortunate that this doesn’t pick up the relative horizontal skill of receivers, but these divides should still allow for some real differentiation between catchers while also keeping our sample sizes large-ish. If we pick too narrow a slice of the zone, the results might get a bit iffy.

Calculating relative framing ability took a few steps. To begin with, I looked at receivers with at least 30 pitches in each of the four zones, which picks up 87 catchers. Thirty might be way too small a sample size, but the fewest pitches caught by any of these receivers is 1,040, which is not terrible. For each receiver, I calculated their rate of strikes for each of the four zones, took the ratio between their strike rate and the average strike rate for the sample, and averaged that ratio for the two low zones and the two high zones. That left two ratios for each player, high and low, where a number greater than 1 indicated better-than-average framing ability and a number less than 1 indicated worse-than-average framing ability.
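A sketch of that ratio computation, with made-up strike rates (the `framing_ratios` helper and all the numbers below are hypothetical, for illustration only):

```python
HIGH = ("high in-zone", "high out-zone")
LOW = ("low in-zone", "low out-zone")

def framing_ratios(catcher_rates, avg_rates):
    """Each dict maps a zone group to a called-strike rate.
    Returns (high_ratio, low_ratio) relative to the sample average."""
    high = sum(catcher_rates[z] / avg_rates[z] for z in HIGH) / len(HIGH)
    low = sum(catcher_rates[z] / avg_rates[z] for z in LOW) / len(LOW)
    return high, low

# invented rates for one catcher and for the sample average
catcher = {"high in-zone": 0.88, "high out-zone": 0.10,
           "low in-zone": 0.80, "low out-zone": 0.05}
average = {"high in-zone": 0.85, "high out-zone": 0.08,
           "low in-zone": 0.84, "low out-zone": 0.07}
hi, lo = framing_ratios(catcher, average)
print(round(hi / lo, 2))  # final ratio; > 1 means better at the high strike
```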

Now, this is not a very good framing metric, but it does allow for a zone-oriented measure. I then divided the high-zone ratio by the low-zone ratio to get a final ratio, where greater than 1 indicated a receiver relatively better at getting the high strike, and less than 1 indicated a receiver relatively better at getting the low strike. Catchers notably better in the lower part of the zone: George Kottaras (.68), Jeff Mathis (.72), and Travis d’Arnaud (.73). Jonathan Lucroy, mentioned above as a good low-ball framer, had a score of .89, but as he was good in both parts of the zone, there was a limit to how extreme his ratio could be. Catchers notably better in the high part of the zone: A.J. Ellis (1.46), Adrian Nieto (1.33), and Brett Hayes (1.29), again, three catchers with pretty bad receiving reputations.

So we now have a rough indication of how much better catchers are in the bottom and top of the zone. What kind of relation does this have to how they were pitched to? To estimate that, I stayed simple – I ran a linear regression, with the high/low ratio as the independent variable and the percentage of low or high pitches the catcher was thrown as the dependent variable. This, again, is a very rough measurement, since different pitchers are throwing to these catchers, but looking on a battery-by-battery basis would make the sample sizes tiny. Additionally, sometimes a catcher is catching a given pitcher because he’s good at receiving in a certain part of the zone that pitcher throws to frequently. So while this might be picking up manager actions as well as pitcher actions, it should be picking up something.
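The regression itself needs nothing fancy; a minimal ordinary-least-squares fit looks like the following (the ratio and pitch-share values here are invented, not the article’s data):

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

# high/low framing ratio vs. share of high pitches seen (made-up data)
ratio = [0.7, 0.9, 1.0, 1.1, 1.3]
high_pct = [0.22, 0.25, 0.27, 0.28, 0.31]
slope, intercept = fit_line(ratio, high_pct)
print(round(slope, 3), round(intercept, 3))
```

A positive slope, as in this toy example, is the pattern the article reports: catchers relatively better up in the zone see relatively more high pitches.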

Results! Two graphs.

[Two graphs: share of low and high pitches thrown, plotted against the high/low framing ratio]

Both graphs show the expected relationship, with this blunt measurement of relative framing ability doing a fairly good job of predicting the distribution of low and high pitches thrown to a given catcher. Obviously there’s more at play here, but clearly pitch selection is impacted by the strengths of the receiver behind the plate.

There’s another question that can potentially be answered using this metric: do pitchers react differently to strengths and weaknesses? If one catcher is 30% better at framing low pitches than high pitches, and very good at framing low pitches, and another catcher is also 30% better at framing low pitches than high pitches, but very bad at it (and apparently even worse at framing high pitches, I guess (he is a very good hitter)), is one of them more likely than the other to get an increased rate of low pitches? In other words, are pitchers more inclined to avoid the bad, or seek the good?

To answer this question, I split the receivers into above-average and below-average low pitch receivers (46 and 41 in each group) and above-average and below-average high pitch receivers (51 and 36 in each group), using the scale described above. I then plotted the rate of pitches in the appropriate zone against each group separately. Following: more graphs!






What we see here is a higher R² value in both of the below-average samples, indicating that the high vs. low ability of bad framers appears to influence pitcher decisions more than the high vs. low ability of good framers. The gap for low pitches isn’t huge, but the gap for high pitches is fairly substantial. While this analysis is way too rough to conclusively show anything, this would seem to suggest that pitchers behave differently when throwing to good and bad framers, and may be more inclined to avoid weaknesses than to seek out strengths.

As I said (several times), this is a rough analysis that relies on a rough metric, but I think it provides some evidence for some very interesting pitcher behavior. I’d love to hear about other ways of identifying receivers’ strengths and weaknesses in different parts of the zone, so if anyone knows of articles doing so, or has some different ideas, say so in the comments!

Whiffs of Success? Theo Rolls the Dice

Jeff Sullivan recently sent up a warning flare regarding Kris Bryant’s potential swing and miss problems, and this post is essentially riffing off that one, so you’ll probably want to read that first if you haven’t already checked it out. I’m using strikeout rate rather than contact rate, but the message is similar.

Bryant isn’t alone among Cubs prospects with contact avoidance issues. Here’s what some of their bigger names did last year:

Player                 Level     K%    wRC+
Javier Baez            MLB     41.0      51
Arismendy Alcantara    MLB     31.0      70
Jorge Soler            MLB     24.7     146
Kris Bryant            AAA     28.6     164

Three of those four (i.e., the non-Alcantaras) are thought to be integral parts of The Future for the Cubs. But those are strikeout rates that have not generally led to long-term career success.

Here are the top ten career K rates for hitters with over 5000 plate appearances:

Player             K%    wRC+     WAR
Adam Dunn        28.6     123    22.7
Ryan Howard      28.1     126    19.9
Jose Hernandez   27.3      86    12.9
Carlos Pena      26.8     117    16.9
B.J. Upton       26.4      99    21.7
Jim Thome        24.7     145    67.7
Dave Kingman     24.4     113    20.4
Gorman Thomas    24.4     114    20.4
Dan Uggla        24.2     110    22.8
Dean Palmer      24.2     104    11.0

So it’s not impossible to have a long and relatively successful career striking out more than a quarter of the time, but it hasn’t happened much – just five times using my admittedly somewhat arbitrary 5000 PA cutoff. With strikeout rates continuing to rise, a few more players will edge ahead of Dean Palmer in the coming years, but in all likelihood, many more will fall by the wayside long before reaching that somewhat less than august plateau. As Sullivan points out, players who whiff this often need to max out their other skills in order to be useful to a team, putting enormous pressure on those other skills to develop.

And there does seem to be a correlation between hitters’ strikeout rates and overall team success, as Joe Sheehan has noted elsewhere. Here are the teams with the five highest hitter K rates from last year:

Cubs          24.2    73-89
Astros        23.8    70-92
Marlins       22.9    77-85
Braves        22.6    79-83
Reinsdorfs    22.4    73-89

None of these teams came close to making the playoffs. You have to go all the way down to tenth on the list to find a playoff team (the Nats, at 21.0%).

And here are the five least K-lacious teams:

Royals     16.3    89-73
A’s        17.7    88-74
Rays       18.1    77-85
Tigers     18.3    90-72
Cards      18.6    90-72
Yankees    18.6    84-78

Numerate readers will have grasped that there are actually six teams on this list, since the Cards and Yankees tied at 18.6%.  Only the Rays and the Evil Empire failed to make the playoffs, and only the Rays failed to break .500.

Not all of the young Cubs windmill at the plate: Addison Russell and Kyle Schwarber kept their K rates under 20% last year in the minors, as did Anthony Rizzo and Starlin Castro in the majors. And as Sullivan noted, players do develop – Bryant et al. are not necessarily trapped for eternity in the seventh level of Strikeout Hell. But for now, a significant part of Theo Epstein’s plan to bring glory to Wrigleyville depends on whether these players can either find a way to strike out less, or to succeed without doing so, something that few have managed thus far.

Automate the Strike Zone, Unleash the Offense

Hello World! As a software developer, automation is my way of life. It kills me to see the tedious yet important job of calling balls and strikes performed at less than 90% accuracy. Worse, catcher framing is now a thing, which is essentially baseball’s equivalent of selling the flop.

Today, I want to talk about how automating the strike zone would affect the MLB run-scoring environment. Don’t we all want to save the environment?

Let’s pretend that before the 2014 season, home plate umpires were fitted with earpieces giving them a simplified Pitch F/X feed of balls and strikes. They heard a high beep for a strike, a low beep for a ball. They then called balls/strikes exactly as they were told, resulting in a perfect zone.

Experiment 1: Walks/Strikeouts overturned

The most damaging ball/strike errors happen when ball 4 or strike 3 is thrown but not called. Sometimes the umpire is redeemed by luck, and a walk/strikeout happens eventually anyway, but not nearly every time. Think of how many times you’ve seen a 3–0 count where a ball was called a strike, only to have the hitter swing and ground out harmlessly on the 3–1 pitch.

For these experiments, let’s look at a short description of each situation, the number of instances of that situation in 2014, and the net runs that would have been added if a perfect zone had been called.

Data courtesy of Baseball Savant; click on a situation to see the query I used.

Situation                       Instances    Net Runs (Rough)
Strike 3 thrown, batter safe          146    -88
Ball 4 thrown, eventual out           691    415
Difference                            545    327 (.07 team runs per game)

Are you surprised? The umpires made 545 more extra outs than extra ‘safes’. Using a rough walk minus out run differential of 0.6 runs, we see that a perfect zone would have added 0.07 runs per game. Interesting, but not huge.
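The arithmetic behind that table, spelled out (the counts and the 0.6 run value are from the text; 2,430 games is the standard MLB schedule, so there are 4,860 team-games):

```python
strike3_thrown_batter_safe = 146   # would-be strikeout, batter survived
ball4_thrown_eventual_out = 691    # would-be walk, batter eventually out
walk_minus_out = 0.6               # rough run value per flipped outcome

net_runs = (ball4_thrown_eventual_out - strike3_thrown_batter_safe) * walk_minus_out
per_team_game = net_runs / (2430 * 2)   # 2,430 games, two team-games each
print(round(net_runs), round(per_team_game, 2))  # 327 0.07
```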

But think again—this effect isn’t limited to plate appearances where the bad call came on what should have been the final pitch. We all know that the count affects the expected run value all on its own. So let’s expand this to all ‘bad calls’ in 2014.

Experiment 2: All balls/strikes called correctly

Balls and strikes don’t obviously translate to runs. So I’ll use someone else’s much more careful research and use a ball minus strike run value of approximately 0.14 runs. Here’s what happens when we apply a perfect zone to all balls and strikes. Brace yourself!

Situation                     Instances    Net Runs (Rough)
Strike thrown, ball called         8724    -1212
Ball thrown, strike called        40557    5633
Difference                        31833    4422 (.91 runs per game per team)

Whoa. Are you kidding me? If we’d run last season with a perfect strike zone, the run environment would go from 4.07 runs/game to nearly 5! That’s the highest level since 2000. I know what you’re thinking: this is crazy, and probably wrong.
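Running the same arithmetic on the Experiment 2 counts with a flat 0.14 run value (the table’s 4,422 implies the author used a slightly smaller per-call value, so this sketch lands a touch higher):

```python
strike_thrown_ball_called = 8724
ball_thrown_strike_called = 40557
ball_minus_strike = 0.14   # approximate run value of a ball vs. a strike

# correcting bad 'balls' costs hitters runs; correcting bad 'strikes' adds them
net_runs = (ball_thrown_strike_called - strike_thrown_ball_called) * ball_minus_strike
per_team_game = net_runs / (2430 * 2)
print(round(net_runs), round(per_team_game, 2))  # 4457 0.92
```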

Sanity checking

I also found this result to be larger than expected, to say the least. So let’s back up, check the mirrors, and look at the frequency of called strikes vs. balls.

Called Ball 233421
Called Strike 123922
Difference 109499

There are a ton more called balls than called strikes. This makes sense because batters are more likely to swing at strikes. But the ratio of called balls to called strikes is only about 2:1, which doesn’t account for the nearly 5:1 ratio among ‘mistaken’ balls/strikes! How do we account for this?

A possible explanation

Here we dive into speculation, but stay with me for a minute. Maybe there’s a logical explanation.

What sequence of events must occur in order for a Pitch F/X strike to become a ball?

  1. Pitcher throws in strike zone: ~45% (Zone %)
  2. Hitter takes said pitch in the strike zone: ~35% (100% – Z-Swing %)
  3. Umpire makes bad ‘ball’ call: ~10%

By this ridiculously rough method, we would expect bad ‘ball’ calls about 1.5% of the time (0.10 * 0.35 * 0.45). Compare that with the observed value of 1.2%.

Conversely, the sequence for a Pitch F/X ball becoming a called strike is as follows:

  1. Pitcher throws out of zone: ~55% (100% – Zone %)
  2. Hitter takes said pitch outside the strike zone: 70% (100% – O-Swing %)
  3. Umpire makes bad ‘strike’ call ~15%

We therefore expect bad ‘strike’ calls about 5.7% of the time (0.15 * 0.7 * 0.55). Again, compare that to the observed value of, wait for it, 5.7%. Boom!
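Multiplying out both chains, using the percentages from the two lists above:

```python
# Chance a true strike is mis-called a ball: in zone, taken, then missed
p_bad_ball = 0.45 * 0.35 * 0.10
# Chance a true ball is mis-called a strike: out of zone, taken, then missed
p_bad_strike = 0.55 * 0.70 * 0.15
print(p_bad_ball, p_bad_strike)
```

These come out to roughly 1.5–1.6% and 5.7–5.8%, in line with the observed rates quoted above, and their ratio (not the raw ball/strike ratio) is what explains the lopsided mistake counts.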

More reasons to automate

  1. Automatic things happen faster. As a professional automator, I guarantee this will speed up play, by more than you think. I bet the umpire thinks for about 1 second on every pitch. That’s just the obvious part.
  2. Set the umpires free. Focusing on something as difficult as calling balls/strikes squeezes out the umpire’s attention on other important matters, such as enforcing pace of play.
  3. Crazy cool things will happen. For example, we will finally see what happens to an insane control pitcher’s K-BB%. V-Mart might never strike out!

I welcome your comments, criticisms, or even praise :)

Changes in WAR from 2000 to 2014 (Part 4)

If you haven’t read Part 1, Part 2, and Part 3, you may want to go back and check them out.

After looking in-depth at 2014 WAR, I thought it would be interesting to compare 2014 WAR with WAR totals from 2002. Baseball scoring has dropped considerably since 2002 and I wondered how this would be reflected in WAR, either at the positional level or the age level or both.

Here is a comparison of hitting statistics from 2002 and 2014:

Year     R/G     AVG     OBP     SLG    wOBA     ISO    BABIP     BB%      K%
2002    4.62    .261    .331    .417    .326    .155     .293    8.7%   16.8%
2014    4.07    .251    .314    .386    .310    .135     .299    7.6%   20.4%

Twelve years ago, hitters put up higher batting averages, on-base percentages, slugging percentages, and isolated slugging numbers. They walked more and struck out less.

But we pretty much knew this. Did this difference in the level of offense affect the WAR accumulated at each position?

Position Players

The following table shows WAR for each position with 2002 on top and 2014 below.

If we look at the comparison of WAR/600 PA for the premium hitting positions (DH, 1B, RF, LF, 3B), we see that all except third base accumulated more WAR in 2002 than in 2014. On the other end of the fielding spectrum, the key defensive positions (C, SS, 2B, CF) all had more WAR in 2014, when offense was down.

This table shows a comparison of the traditionally offense-oriented positions versus the positions historically known more for their glove work in the two different run-scoring environments of 2002 (4.62 R/G) and 2014 (4.07 R/G).

In 2002, the offense-oriented positions averaged 2.2 WAR/600 PA. In 2014, these positions averaged 1.8 WAR/600 PA. The more defense-oriented positions averaged 1.9 WAR/600 PA in the higher run-scoring environment and 2.4 WAR/600 PA when runs were more scarce.
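For reference, the rate stat used throughout this section is just a WAR total prorated to 600 plate appearances. The totals in this sketch are hypothetical, chosen only to show the arithmetic:

```python
def war_per_600(war, pa):
    """Prorate a WAR total to a 600-PA season."""
    return war * 600 / pa

# e.g., hypothetical positional group: 88 WAR over 24,000 PA
print(war_per_600(88, 24000))  # 2.2
```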

This shift of WAR from more hitter-heavy positions to the better fielding positions has been a general trend over the last thirteen years, particularly so in the last four years as run scoring has dropped significantly.

Consider the table below. The column to the far right shows the difference between WAR for the hitting positions and fielding positions each year:

The biggest change has been over the last four years, as run scoring has dropped down below 4.3 runs per game after being in the range of 4.6 to 4.8 runs/game in the 2000s. Teams are getting more WAR/600 PA from the defensive-oriented positions than the bat-first positions. The 2014 season saw the biggest gap in the last thirteen years, with glove-first positions averaging 0.6 more WAR/600 PA than the bat-first positions.

Changing distribution of playing time and WAR based on age

Along with the change in WAR for the hitting positions versus the defense-oriented positions, there has been a shift in WAR and playing time based on age. From 2000 to 2005, position players 33 and older had more plate appearances than players 25 and under. Beginning in 2006, position players 25 and under have had more plate appearances each year than players 33 and older. Since 2010, this difference has accelerated, as the graph below shows:

In 2000, players 33 and older had 40,626 plate appearances and players 25 and under had 38,919. Last year, players 33 and older had dropped to 29,191 plate appearances and players 25 and under were up to 45,439 plate appearances.

Plate Appearances by Age Group
Year 25 & under 33 & older
2000 38,919 40,626
2014 45,439 29,191
Difference 6,520 -11,435

With increasing playing time, players 25 and under have seen their total WAR go up, while WAR for players 33 and older has gone down:

The difference in WAR is not just a playing time difference, though. Older players have not only seen less playing time, they’ve also been less productive, as this graph of WAR/600 PA demonstrates:

In 2000, players 33 and older averaged 1.7 WAR/600 PA, while players 25 and under averaged 1.4 WAR/600 PA. The older group of players maintained their lead until 2003, when the two groups were essentially even. Since then, younger players have out-produced older players. Last year, the gap was 0.5 WAR/600 PA in favor of the younger group of players.

Starting Pitchers

For starting pitchers, there are some differences. Innings pitched by starting pitchers 25 and under have fluctuated quite a bit over the last 15 years. Since 2000, starting pitchers age 25 and under have thrown a high of 10,268 innings (2002) and a low of 6,663 innings (2005). Starting pitchers 33 and older have a narrower range of innings pitched per season, with a very slightly downward trend over the last thirteen years, as shown by this graph:

While their innings pitched has been fairly consistent since 2000, starting pitchers 33 and older have been less productive. The following graph shows the WAR/150 innings pitched for starting pitchers 25 and under compared to those 33 and older. The “33 and older” group has dropped from a high of 2.5 WAR/150 IP in 2000 to a low of 1.2 WAR/150 IP last season.

From 2000 to 2007, pitchers 33 and older were more productive per inning than pitchers 25 and under. Since then, young pitchers have been more productive, except for the 2012 season. The gulf between these two groups has widened over the last two years.

Relief Pitchers

Finally, let’s look at relief pitchers. Relief pitchers 33 and older saw their innings pitched per year rise from around 3,000 in 2000 to a high of 3,951 in 2005, but their workload has dropped steadily since then; in 2014, they pitched a 15-year low of 2,063 innings. Relief pitchers 25 and under saw a sharp increase in innings pitched from 2004 to 2006 and have bounced around a bit since, but have generally seen their innings decline as well.

When it comes to production, older relief pitchers have followed a different pattern than their counterparts. While position players and starting pitchers 33 and older have seen their production (WAR per playing time) drop, relief pitchers 33 and older have held steady. Older relievers are pitching fewer innings each year, but they are still just as productive, and have even shown a slight increase in WAR/50 IP over the last 15 years.

Final Thoughts

Baseball has evolved over the last 15 years from a high-offense, slugging game to a low-offense, pitching-and-defense game and WAR reflects those changes. The offense-oriented positions (1B, RF, LF, 3B) used to accumulate more WAR each season, but no longer do so. Older players were once more likely to sustain their production into their mid-30s, but no longer play as much or as well as they once did at an advanced age.

Looking to the future, we have to wonder what’s to come. Will offense continue to drop, or has it bottomed out and is now due for a rebound? Will MLB do something to raise the level of offense (adjust the strike zone, perhaps)? If offense makes a comeback, how will that be reflected in WAR?

The Grandyman (Still) Can

For every Dontrelle Willis–who continues to get looks from Major League teams despite over eight years of complete ineptitude–there exists a handful of other players who fade into relative obscurity only a year or two removed from a dominant season. All it generally takes is a down year resulting from–or paired with–an injury to send a guy spiraling below the radar. These are often the players that can return the most value during fantasy drafts if you can make the distinction between a year that’s an aberration, and one that is a bellwether for a significant, irreversible decline in skills.

While I can’t say with complete confidence that Curtis Granderson‘s 2014 doesn’t fall into the latter category, there were a couple of encouraging things going on below the subpar surface stats that make me think he can return some solid value this year, especially considering where he’s going in most drafts.

Granderson was 33 last year and coming off an injury-shortened season. He was also trading a left-handed pull hitter’s haven in Yankee Stadium for the cavernous confines of Citi Field. All things considered, it was natural to expect some significant regression. And when he hit .136 through his first 100 at-bats of the season, it seemed like the Mets might have had a disaster of Jason Bay-like proportions on their hands.

Fortunately for them, Granderson managed to right the ship to an extent, putting together a couple of excellent months. His final line of .227/.326/.388–dragged further down by a nightmarish .037 ISO, 16-for-109 August–wasn’t spectacular by any stretch. But there were some nice takeaways buried in there.

For one, his bat speed doesn’t seem to have slowed enough to justify the statistical hits he took across the board. Despite seeing 56.3% fastballs–the most he’s seen since 2010 by a wide margin–his Z-Contact % of 85% was in line with his 85.8% career average, and not far removed from the league average of 87%. I suspect the uptick in fastballs resulted from opposing teams banking on an age-slowed swing, but Granderson’s contact rates on high velocity pitches in the zone didn’t suffer for it.

Granderson also set a career high in O-Contact% at 62.7%. A career high here could indicate a lack of plate discipline just as easily as it could sustained bat speed, except that Granderson’s O-Swing% of 26.2% is roughly the average of what he did in the four years prior. He also managed to post the second-highest walk rate of his career (12.1%) and his lowest strikeout percentage since 2009 (21.6%). These are not particularly impressive rates in their own right, but in the context of Granderson’s career they do help to dispel the notion that last year was the beginning of the end for his hitting ability.

That is not to say, of course, that I foresee a return to the 40 home run, .260+ ISO form that he flashed in his early Yankee years–there’s no way he ever touches the absurd 22 HR/FB% that sustained that run. But with the right field fences at Citi Field moving in–a change that apparently would have resulted in 9 more home runs for Granderson had it been done last season–and some improvement on last year’s uncharacteristically bad .265 BABIP, I would not be at all surprised to see a home run total between 25 and 30 to go along with double-digit steals and a batting average that won’t kill you. And that has value when it is being drafted as low as Granderson currently is.

A Historical Study of the Strike Zone and the Offensive Environment

As offense continues to decline, a popular suggestion for reviving it has been to shrink the strike zone, primarily by discouraging the low strike. Since the implementation of QuesTec and later Zone Evaluation, the low strike has been called more and more often; in effect, this is simply the enforcement of the strike zone as the rule defines it. The solution many have proposed is to take away the low strike, which would require a change in the wording of the strike zone rule. In theory this would increase offense, which would increase the popularity of the game.

This may come as a surprise to some, but re-wording the strike zone is a common occurrence throughout the history of the game. Ok, maybe not common, but it does happen on occasion. The strike zone was first officially defined in 1887. Before then, batters would tell the pitcher where they wanted the ball delivered, and the pitcher had to throw it there; there was no official definition of the strike zone.

The main question I tried to answer was how did the re-wording of the strike zone affect the run environment, if at all? There is no guarantee that it has, or that there is a correlation between the change in strike zone rules and the run environment. I think it’s a good theory and I would tend to believe that it would affect the run environment; that being said there are many factors that go into the run environment, and the strike zone is merely one of them.

The first chart is a representation of the run environment leading up to 1887, when the strike zone was officially defined. The definitions of the strike zone were found on Baseball Almanac. The data for all the charts was provided by baseball-reference. The X-axis for all the upcoming charts is the year and the Y-axis is the average runs per game.


Take this data for what you will. I personally don’t think it truly reveals a ton about the strike zone’s effect but it is a data point.

“A (strike) is defined as a pitch that ‘passes over home plate not lower than the batsman’s knee, nor higher than his shoulders.’”


After 1887 there was a relatively steep drop in the run environment before it went back up. I’m not entirely sure the data reveals anything; the chart is rather noisy, and other factors were probably driving the fluctuation in the run environment here.

“A fairly delivered ball is a ball pitched or thrown to the bat by the pitcher while standing in his position and facing the batsman that passes over any portion of the home base, before touching the ground, not lower than the batsman’s knee, nor higher than his shoulder. For every such fairly delivered ball, the umpire shall call one strike.”


This chart again isn’t precisely indicative that the change in the strike zone had an impact on the run environment. The modern game was still in its infancy, and there was a lot of fluctuation before things stabilized in the mid-1900s.

“The Strike Zone is that space over home plate which is between the batter’s armpits and the top of his knees when he assumes his natural stance.”


This data point gives us more information. There was a fairly drastic drop in offense from 1950 to 1952; in fact, almost an entire run per game disappeared, and it makes sense. This was the first time the rules offered a concrete definition of the Strike Zone, so umpires finally had something to go on. Before, there was only a general idea of what a strike and a ball were; this was the first acknowledgment that there was a concrete zone pitchers had to throw into. The run environment did stabilize, though, until 1963, when there was a slight drop in offense, obviously unrelated to the strike zone.

“The Strike Zone is that space over home plate which is between the top of the batter’s shoulders and his knees when he assumes his natural stance. The umpire shall determine the Strike Zone according to the batter’s usual stance when he swings at a pitch.” This rule was implemented in 1963.


As you can see, there is no real change or effect from the re-working of the rule. As the upcoming charts also show, the re-wording of the strike zone doesn’t, by itself, have much of an effect on the offensive environment.

The strike zone was then again altered in 1969; “The Strike Zone is that space over home plate which is between the batter’s armpits and the top of his knees when he assumes a natural stance. The umpire shall determine the Strike Zone according to the batter’s usual stance when he swings at a pitch.”


“The Strike Zone is that area over home plate the upper limit of which is a horizontal line at the midpoint between the top of the shoulders and the top of the uniform pants, and the lower level is a line at the top of the knees. The Strike Zone shall be determined from the batter’s stance as the batter is prepared to swing at a pitched ball.”


“The Strike Zone is expanded on the lower end, moving from the top of the knees to the bottom of the knees (bottom has been identified as the hollow beneath the kneecap).”


Offense, as you can see, does take a rather significant and sustained dip after 1996. This, however, is probably not due to the re-working of the strike zone; at the very least, one cannot tell from this chart that it is.

There is, as we all know, another element to this strike zone saga: the implementation of QuesTec. QuesTec was introduced in 2002 and was not well received by umpires, who filed a grievance over its use in 2003; the dispute was resolved in 2004.


The evidence displayed by the data above doesn’t suggest that QuesTec had a direct link to offensive production. What it rather indicates is that there was a drastic shift in offensive production after 2006, the year Zone Evaluation was implemented in baseball. Zone Evaluation was deemed a more accurate way of judging the strike zone, and its implementation coincides with a steady decrease in offense that has not ended. The goal was to force umpires to be more accurate and to adhere to the definition of the strike zone, which was last altered in 1996, when the definition explicitly expanded the zone downward from the top of the knees to the bottom of the knees. This seems to be the biggest single impact against offense.

There are obviously other significant factors to consider, such as the aggressive testing for steroids and other performance-enhancing drugs. Most of us, myself included, like to believe that the game is much cleaner now, which has affected offense as a whole. Pitchers are throwing harder than ever, and, if that weren’t enough, most advanced metrics seem to favor pitching and defense. These are all elements that have affected offense.

That being said, there is an undeniable connection between the enforcement of the strike zone and the drastic drop in offense. In previous years, when the strike zone was re-worded, there were no real correlations with offense, apart from 1950, when the strike zone was first explicitly defined. The correlation is rather between technology and the strike zone. It’s highly probable that umpires in years past ignored or disregarded changes to the rule and just kept calling the strike zone the way they always had. The implementation of Zone Evaluation forced them to change, which had a direct effect on offense. Changing the strike zone now should have a rather drastic effect on offense, especially with Zone Evaluation keeping umpires accountable.

Hardball Retrospective – The “Originals” 1922 Browns

In “Hardball Retrospective: Evaluating Scouting and Development Outcomes for the Modern-Era Franchises”, I placed every ballplayer in the modern era (from 1901-present) on their original team. Consequently, Joe L. Morgan is listed on the Colt .45’s / Astros roster for the duration of his career while the Angels claim Wally Joyner and the Diamondbacks declare Carlos Gonzalez. I calculated revised standings for every season based entirely on the performance of each team’s “original” players. I discuss every team’s “original” players and seasons at length along with organizational performance with respect to the Amateur Draft (or First-Year Player Draft), amateur free agent signings and other methods of player acquisition.  Season standings, WAR and Win Shares totals for the “original” teams are compared against the “actual” team results to assess each franchise’s scouting, development and general management skills.

Expanding on my research for the book, the following series of articles will reveal the finest single-season rosters for every Major League organization based on overall rankings in OWAR and OWS along with the general managers and scouting directors that constructed the teams. “Hardball Retrospective” is available in digital format on Amazon, Barnes and Noble, GooglePlay, iTunes and KoboBooks. Additional information and a discussion forum are available at


OWAR – Wins Above Replacement for players on “original” teams

OWS – Win Shares for players on “original” teams

OPW% – Pythagorean Won-Loss record for the “original” teams


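OPW% above is a Pythagorean won-loss expectation computed from the “original” roster’s runs scored and allowed. A minimal sketch using the classic exponent of 2 (more refined formulations use a dynamic exponent; the run totals in the example are hypothetical, chosen only to illustrate the shape of the calculation):

```python
def pythag_pct(runs_scored, runs_allowed, exponent=2.0):
    """Classic Pythagorean winning percentage: RS^x / (RS^x + RA^x)."""
    rs, ra = runs_scored ** exponent, runs_allowed ** exponent
    return rs / (rs + ra)

# A team outscoring its opponents 867-804 (hypothetical totals)
# projects to roughly a .538 winning percentage.
print(round(pythag_pct(867, 804), 3))
```
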
The 1922 St. Louis Browns                        OWAR: 45.8     OWS: 247     OPW%: .532 

“Gorgeous” George Sisler carried a .351 lifetime batting average into the 1922 campaign along with the Major League record for hits in a single-season (257 in 1920). He ravaged rival hurlers and topped the leader boards with 246 base knocks, 134 runs, 18 triples and a career-high 51 swipes to complement a .420 BA. Sisler claimed the MVP award but later fell ill and missed the entire 1923 season due to acute sinusitis.

Marty McManus established personal-bests with 189 safeties and 109 RBI while batting .312 with 34 doubles, 11 triples and 11 round-trippers. Del Pratt pounded a career-high 44 two-baggers and knocked in 86 runs. Pat Collins (.307/8/23) split the catching chores with Verne Clemons and Muddy Ruel. 

Sisler ranked 24th among first sackers in “The New Bill James Historical Baseball Abstract.” Pratt (35th) and McManus (58th) placed in the top 100 at the keystone position while Ruel finished fifty-first among backstops.

Player           POS    OWAR    OWS
George Maisel    RF/CF  -0.89   0.49
Del Pratt        2B      1.74  17.78
George Sisler    1B      7.36  29.39
Marty McManus    DH/2B   1.74  20.29
Muddy Ruel       C       0.37   9.29
Cedric Durst     CF     -0.01   0.2
Burt Shotton     LF     -0.22   0.01
Gene Robertson   3B      0.08   0.83
Doc Lavan        SS     -0.5    2.97
Pat Collins      C       0.97   6.36
Verne Clemons    C       0.04   4.24
Ray Schmandt     1B     -1.55   5.8

Missouri native Elam Vangilder (19-13, 3.42) delivered career-bests in victories and WHIP (1.208). Jeff Pfeffer (19-12, 3.58) matched Vangilder’s win total and paced the mound crew with 261.1 innings pitched and 32 starts. Wayne “Rasty” Wright held the opposition at bay with a 2.92 ERA and a WHIP of 1.286. Ray “Jockey” Kolp compiled a record of 14-4 while left-hander Earl Hamilton contributed an 11-7 mark. In his rookie season Hub “Shucks” Pruett fashioned an ERA of 2.33, saved 7 contests and topped the League with 23 games finished.

Player           POS    OWAR    OWS
Elam Vangilder   SP      5.26  21.14
Jeff Pfeffer     SP      3.96  20.15
Rasty Wright     SP      2.72  12.53
Ray Kolp         SP      1.74  10.86
Earl Hamilton    SP      1.2   10.62
Hub Pruett       SW      2.07  11.42
Bill Bayne       SP      0.51   4.29
Dutch Henry      RP     -0.04   0.1
Heinie Meine     RP     -0.08   0.06
Bill Bailey      RP     -0.33   0.62
Allan Sothoron   SP     -0.43   0.44
Tom Phillips     SP     -0.58   1.52

The “Original” 1922 St. Louis Browns roster

Player           POS    OWAR    OWS
George Sisler    1B      7.36  29.39
Elam Vangilder   SP      5.26  21.14
Jeff Pfeffer     SP      3.96  20.15
Rasty Wright     SP      2.72  12.53
Hub Pruett       SW      2.07  11.42
Del Pratt        2B      1.74  17.78
Marty McManus    2B      1.74  20.29
Ray Kolp         SP      1.74  10.86
Earl Hamilton    SP      1.2   10.62
Pat Collins      C       0.97   6.36
Bill Bayne       SP      0.51   4.29
Muddy Ruel       C       0.37   9.29
Gene Robertson   3B      0.08   0.83
Verne Clemons    C       0.04   4.24
Cedric Durst     CF     -0.01   0.2
Dutch Henry      RP     -0.04   0.1
Heinie Meine     RP     -0.08   0.06
Burt Shotton     LF     -0.22   0.01
Bill Bailey      RP     -0.33   0.62
Allan Sothoron   SP     -0.43   0.44
Doc Lavan        SS     -0.5    2.97
Tom Phillips     SP     -0.58   1.52
George Maisel    CF     -0.89   0.49
Ray Schmandt     1B     -1.55   5.8

Honorable Mention

The “Original” 1916 Browns                         OWAR: 41.4     OWS: 266     OPW%: .550

Jeff Pfeffer (25-11, 1.92) logged 328.2 innings pitched while establishing personal-bests in virtually every major pitching category. Carl Weilman completed 19 of 31 starts and recorded an ERA of 2.15 along with a 1.134 WHIP. Burt Shotton coaxed 110 bases on balls, pilfered 41 bags and tallied 97 runs.

The “Original” 1983 Orioles                          OWAR: 42.6     OWS: 255     OPW%: .604

Cal Ripken (.318/27/102) led the Junior Circuit with 211 base hits, 121 runs scored and 47 doubles. He appeared in his first All-Star contest and achieved MVP honors along with the Silver Slugger Award. “Steady” Eddie Murray (.306/33/111) registered 115 tallies and placed runner-up to Ripken in the AL MVP balloting. Mike Boddicker accrued 16 victories with a 2.77 ERA in his inaugural campaign.

On Deck

The “Original” 1980 Royals

References and Resources

Baseball America – Executive Database


James, Bill. The New Bill James Historical Baseball Abstract. New York, NY.: The Free Press, 2001. Print.

James, Bill, with Jim Henzler. Win Shares. Morton Grove, Ill.: STATS, 2002. Print.

Retrosheet – Transactions Database

Seamheads – Baseball Gauge

Sean Lahman Baseball Archive

Shatzkin, Mike. The Ballplayers. New York, NY. William Morrow and Co., 1990. Print.

Rickie Weeks’ Value in Disguise

Rickie Weeks going to the Mariners moved a lot of eyebrows, raising some, furrowing others. Weeks’s deal will be worth $2 million for one year, according to Jim Bowden. To the casual fan, this move might seem a little unnecessary: Seattle already has a pretty good second baseman in Robinson Cano. If you take a closer look, however, there are some hidden metrics that would point to Weeks having a resurgence.

Let’s first look at this acquisition from the position of the casual fan. Weeks is coming off a 2014 season wherein he only had 286 plate appearances, and saw a substantial reduction in power. Long story short, Weeks was a singles hitter last year. In those 286 trips to the plate, Weeks had 41 singles. In 2013, Weeks had 42 singles in 113 more plate appearances. While this helped his overall batting average get back on track, from .209 in 2013 to .274 in 2014, it did nothing to increase his power numbers.

Weeks is also a below-average fielder. Scratch that: Weeks is the worst fielder at second base in all of baseball, and he has been for some time now. Going by FanGraphs’s UZR, Weeks has a career total of -56.5. That puts him at the very bottom among second basemen who have played more than 5,000 innings since 2005 (Weeks’s first full season). Below are the bottom five second basemen by UZR in that same time frame. Recognize anyone?

Notice that current Seattle second baseman Robinson Cano is fourth from the bottom. This really doesn’t tell us any more than that Seattle does not put a premium on defense, and we might have suspected this all along had we first taken a look at team UZR from the last two seasons.

There we have it. A match made in heaven. It is no coincidence that two of the bottom five defensive teams over the last two years contained two of the bottom five defensive second basemen, in Cano and Weeks. So what does this all have to do with Seattle and their recent free-agent acquisition of Mr. Weeks?

Ceteris paribus: all other things being equal. If we take defense out of the evaluation (because Seattle is not prioritizing defense at the moment), we can better understand what Seattle saw in this now-32-year-old utility man.

Our answers lie within the batted-ball statistics. Over his career, Weeks has consistently posted a fly-ball percentage of 35-36%; even in 2013 it was 32.7%. Last season that percentage sank dramatically to a career low of 25%. This may or may not be a bad sign. We will come back to the fly-ball percentage shortly. Now let us look at the HR/FB ratio.

Last season Weeks saw a spike in his HR/FB ratio. It reached an all-time high of 17.8%; his career average for that metric is 14%. Knowing that his fly-ball percentage was at an all-time low while his HR/FB ratio was at an all-time high, we can reasonably expect those two metrics to meet somewhere in the middle this upcoming season.
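The reasoning here amounts to: projected home runs are batted balls × FB% × HR/FB, with both rates regressed back toward Weeks’s career norms. A rough sketch (the batted-ball count and the regressed rates below are illustrative assumptions, not projections from the source):

```python
def projected_hr(batted_balls, fb_pct, hr_per_fb):
    """HR = fly balls (batted balls x FB%) times the HR/FB conversion rate."""
    return batted_balls * fb_pct * hr_per_fb

# Assume ~520 batted balls over a full season, FB% rebounding partway toward
# his ~35% career norm, and HR/FB regressing from 17.8% toward his career 14%.
print(round(projected_hr(520, 0.32, 0.15)))  # ~25 HR
```

Small changes in either rate move the projection by several home runs, which is why the fly-ball rebound matters as much as the HR/FB regression.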

There is one last measurement we should look at in order to fully understand Weeks’s value possibility. Jeff Zimmerman and Bill Petti, of FanGraphs fame, run their own website, where one can look at batted-ball distance for any player going back to 2007. When we look at Rickie Weeks, we see that he has a career average fly-ball distance of 292 feet. Last year, his average fly-ball distance was 285 feet. This slight decline is understandable given his age. Weight this how you wish, but it doesn’t seem like Weeks is going through any more of a power decline than other professionals have gone through at his age.

Putting it all together, if Weeks starts to hit more fly balls, and (if nothing else) maintains his career average HR/FB ratio, the Mariners will reap the full value of his services. His defense is subpar at best, but Seattle does not seem too concerned about that. Right-handed power seems to be scarce at the moment, especially at the second base position. Rickie will add depth to Seattle, but the real value might come during the season when teams start looking for power to boost their playoff lineups—that is, if Weeks can deliver.