Will the Real Tyler Goeddel Please Stand Up?

Like a large portion of the FanGraphs community, I am a Philadelphia Phillies fan.  I was born in South Jersey just 20 minutes away from the stadium and grew up watching every game.  I was there for the tough times in the late ’90s and early 2000s, and I was there for the glory days of 2007-2011.  After an abysmal last few seasons of baseball in Philadelphia, we have finally seen some promise this season, leading us to believe that better days are coming soon.  One of the bright spots on the team so far this year has been Rule 5 pick Tyler Goeddel.

After being selected in the first round of the 2011 MLB Draft, Tyler Goeddel began his professional career with the Tampa Bay Rays.  Goeddel was drafted out of high school as a third baseman, and for the first three years of his minor league career that was the only position he played.  In 2015, however, the Rays decided to move Goeddel to the outfield.  His athleticism allows him to play all three outfield positions, and that type of versatility is highly sought after by big league clubs.  While defense was never his problem, Goeddel’s bat didn’t develop as quickly as the Rays had hoped.  He was a career .262 hitter with 31 home runs across four full seasons in the minor leagues.  Ultimately the Rays made a tough decision and left him off their 40-man roster, knowing there was a great chance another team would select him in the Rule 5 Draft.  Shortly after, the Phillies did just that, selecting Goeddel with the first overall pick of the 2015 Rule 5 Draft.

The Philadelphia Phillies have historically been excellent at finding talent in the Rule 5 Draft (2004 – Shane Victorino, 2012 – Ender Inciarte, 2014 – Odubel Herrera).  In the early going, I (like most Phillies fans) was very skeptical as to whether Goeddel could follow in the footsteps of players like Victorino and Herrera and become a valuable contributor to our big league team.  Goeddel had a mediocre spring training, but with no other serious competition in the corner outfield spots, there was no harm in keeping him around for a rebuilding year and seeing what the kid could do.

The beginning of Tyler Goeddel’s major league career could not have gone much worse.  Take a look below at his stats through his first nine games:

4/6 - 4/19 Stats

In 16 at-bats, Goeddel recorded only one hit (a single) and struck out a whopping eight times!  Now obviously this is a VERY small sample size, and we should expect some struggles while adjusting to big league pitching.  Up until this point, Goeddel had never seen pitching above the Double-A level.  Now let’s take a look at his plate discipline stats over the same time frame:

4/6 - 4/19 Plate Discipline

O-Swing % – Percentage of time a batter swings on pitches outside the strike zone
Z-Swing % – Percentage of time a batter swings on pitches inside the strike zone
Swing % – Percentage of time a batter swings at a pitch, regardless of location
O-Contact % – Percentage of times a batter makes contact with a ball when swinging outside of the strike zone
Z-Contact % – Percentage of times a batter makes contact with a ball when swinging inside of the strike zone
Contact % – Percentage of times a batter makes contact with the ball when swinging
Zone % – Percentage of overall pitches thrown to batter that were in the strike zone
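
To make these definitions concrete, here is a minimal Python sketch that computes all seven rates from pitch-by-pitch records. The record layout (the `in_zone`, `swung`, and `contact` flags) and the ten sample pitches are made up for illustration; they are not Goeddel's actual pitches or the format FanGraphs uses.

```python
def plate_discipline(pitches):
    """Compute the seven plate-discipline rates as fractions (None if undefined).

    Each pitch is a dict with hypothetical boolean keys:
    in_zone, swung, contact.
    """
    def rate(numerator, denominator):
        return numerator / denominator if denominator else None

    def contacts(group):
        return sum(1 for p in group if p["contact"])

    in_zone = [p for p in pitches if p["in_zone"]]
    out_zone = [p for p in pitches if not p["in_zone"]]
    swings = [p for p in pitches if p["swung"]]
    z_swings = [p for p in in_zone if p["swung"]]
    o_swings = [p for p in out_zone if p["swung"]]

    return {
        "O-Swing%": rate(len(o_swings), len(out_zone)),
        "Z-Swing%": rate(len(z_swings), len(in_zone)),
        "Swing%": rate(len(swings), len(pitches)),
        "O-Contact%": rate(contacts(o_swings), len(o_swings)),
        "Z-Contact%": rate(contacts(z_swings), len(z_swings)),
        "Contact%": rate(contacts(swings), len(swings)),
        "Zone%": rate(len(in_zone), len(pitches)),
    }


def pitch(in_zone, swung, contact=False):
    """Build one made-up pitch record."""
    return {"in_zone": in_zone, "swung": swung, "contact": contact}


# Ten made-up pitches: five in the zone, five outside it.
sample = (
    [pitch(True, True, True)] * 2    # in zone, swing, contact
    + [pitch(True, True, False)]     # in zone, swing and miss
    + [pitch(True, False)] * 2       # in zone, taken
    + [pitch(False, True, True)]     # outside, swing, contact
    + [pitch(False, True, False)]    # outside, swing and miss
    + [pitch(False, False)] * 3      # outside, taken
)

rates = plate_discipline(sample)
for name, value in rates.items():
    print(f"{name}: {value:.1%}")
```

Note that the contact rates are divided by swings, not by pitches seen, which is why a hitter can have league-average swing rates and still post an alarming Contact %.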

There is nothing noteworthy about his swing percentages as they are all just about equal to the league averages, but the contact percentages are quite alarming.  Through his first nine games, Goeddel only made contact 53% of the time he swung his bat.  Rather than just writing this off as a rookie being over-matched by big league pitching, I decided to dig deeper into these stats and figure out exactly where Goeddel was struggling.  Check out the video below that I put together which basically sums up the beginning of Goeddel’s career in 30 seconds:

Whether or not you noticed while watching the video above, every one of these swings and misses came on a fastball.  They all also came in the upper portion of the strike zone.  Just by watching Goeddel’s at-bats through this point of the season, it was clear as day that opposing pitchers were attacking him with fastballs up in the zone.  The chart below shows every fastball that was thrown to Goeddel over his first nine games.  It is broken up by hot and cold zones and shows his contact percentage versus the fastball in every portion of the strike zone:

4/6 - 4/19 Contact % vs Fastball

This chart verifies what we saw in the video: Goeddel really struggled to hit fastballs up in the zone to begin the season.  At this point, everyone was frustrated.  Tyler Goeddel was frustrated because he knew he was much more talented than his results had shown.  The Phillies organization was frustrated because they had such high hopes for Goeddel entering the season.  And most importantly, Phillies fans were frustrated and began questioning what the Phillies could possibly see in this guy.  (Search for Tyler Goeddel’s name on Twitter and read old tweets from this time period if you don’t believe me!)

An important thing to remember while looking at these stats is that up until this point of his career, Goeddel had been an everyday player.  Not only was he adjusting to big league pitching, but he was also trying to adjust to not having consistent at-bats.  Since the Phillies unexpectedly got off to such a hot start, an important decision needed to be made.  On one hand, they had a promising young player who needed consistent at-bats in order to show his true potential.  On the other hand, the team was surprisingly in the hunt in the NL East and may not have wanted to let Goeddel go through his growing pains while they were competing for the division title.  Eventually, a decision was made and manager Pete Mackanin started to put Goeddel in the everyday lineup.  Below are some quotes from Goeddel at this time speaking about the decision:

“Getting regular playing time and the confidence [from that] is huge, but I try to get started a little earlier on my swing so I can be on time with the fastball. You need to hit the fastball if you want to play up here, obviously. I feel like I’ve made that adjustment and it’s been a huge help.” – Tyler Goeddel

“I didn’t play how I wanted to play in April.  And I’m glad he’s (Pete Mackanin) giving me a chance, because I really didn’t play my way into a chance; he just gave it to me. So I’m trying to make the most of it.” – Tyler Goeddel

The video below (from 4/23/16) summarizes Goeddel’s early season struggles and the decision to give him more playing time:

The Phillies coaching staff deserves a lot of credit.  They recognized early on that Goeddel was struggling with fastballs up in the zone and prior to this game really worked with him in that area and promised him more playing time moving forward.  Here is a video of his next at bat in the game, where the pitcher tries once again to attack Goeddel with some high heat:

Goeddel responds with another base hit and his first RBI of the season.  Take a look below at how his stats over his next seven games compare to his stats from his first nine games.

4/23 - 5/6 Stats

4/23 - 5/6 Contact %

You can very easily see that Goeddel drastically improved his contact percentage over this time frame, which resulted in a huge drop in his strikeout rate.  The video below is from 5/8/16, right after the stretch of stats we just evaluated.  Goeddel had a big hit late to tie the game for the Phillies and later came in to score the winning run.

As you could see, the hit came on a high fastball.  A few weeks ago, Goeddel could not touch this pitch…but all of a sudden he is beginning to prove that he can.  The next video is from after that game.  Tyler discusses the adjustments he has made and also how playing every day has contributed to his recent success:

This hit was the start of a new Tyler Goeddel.  Pitchers continued to attack him with fastballs up in the zone and Goeddel really started to make them pay.  This is what he did to a Brandon Finnegan fastball just a few days later:

Ever since that hit on May 8th against the Marlins, Goeddel has been the player the Phillies could only have hoped he would one day become.  He has flashed signs of brilliance in just about every game since, and Phillies fans are drooling over what the future outfield could look like.  Even though he has made adjustments and is seemingly now catching up to big league fastballs, opposing pitchers continue to test him.  Check out the video below that I put together showing what Goeddel has done to fastballs in the upper portion of the strike zone over the last few weeks.

As you can clearly see, this is a different player than we saw early on in the season.  Take a look at how his recent stats compare to those early on:

5/6 - 5/20 Stats

5/6 - 5/20 Contact %

Goeddel’s contact percentage over his first nine games was only 53%.  Over his last 10 games, it is 91%.  That is an incredible difference and clearly his adjustments are paying off.  In turn, his improved contact has led to a strikeout percentage of only 5.4% over his last 10 games.  The chart below shows how Goeddel has fared against the fastball since he noted his adjustments on April 23, 2016.

4/23 - 5/20 Contact % vs Fastball

Now go back up to the top of the article and compare this chart to what it looked like at the beginning of the season.  More consistent at-bats have clearly translated into him catching up to the fastball, and the results thus far have been phenomenal.  I have to admit that I was a doubter early on, but I am now completely on board the Tyler Goeddel bandwagon.  This kid is only 23 years old, and the fact that he was able to make an adjustment like this so quickly and immediately see results is remarkable.  Now that he is having some success, opposing pitchers will start to change their game plan against him.  While the pace he is on now may not be sustainable over the course of a full season, I am confident that Goeddel will continue to make the necessary adjustments and help this Phillies team continue to find ways to win ball games.  Although the video below doesn’t exactly relate to his success at the plate, I had to throw it in here, and it is a must-watch if you have not seen it already:

The last video I will show features Goeddel’s post game interview after this throw:

Recent Quotes:

“It’s exciting.  Coming to the field everyday I’m expecting to see myself in the lineup. That’s a feeling I didn’t have last month. It’s a lot more relaxing, less stressful.” – Tyler Goeddel

“It was definitely a big adjustment, going from playing everyday my whole career to having a specific role, and then not performing well in my role, it was a little tough.  But, you know, they’re giving me an opportunity now and I feel like I’m playing better, which is nice. I’m happy for myself. I always knew I could play up here, but I needed some results to prove it to myself. I’m glad, finally, there are some results to show.” – Tyler Goeddel

I love how confident Goeddel is when he speaks of his game, and I am so glad the numbers back him up.  I continue to be blown away watching him play every day, especially given that he has been playing the outfield for only one year.

Lastly, I want to show a few graphs.  The first one shows Goeddel’s rolling strikeout percentage so far this season.  The statistics earlier show that it has decreased, but this graph makes his progression much easier to see:

Rolling K%
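
For readers who want to build a graph like this themselves, here is a minimal sketch of a rolling strikeout rate in Python. The game log and the three-game window are made up for illustration; the window behind the graph above is not stated, so treat the window size as an assumption.

```python
def rolling_k_rate(games, window):
    """Rolling K% over a fixed number of games.

    games is a list of (plate_appearances, strikeouts) tuples in date order.
    Returns one rate per complete window; None where a window has no PAs.
    """
    rates = []
    for end in range(window, len(games) + 1):
        chunk = games[end - window:end]
        plate_appearances = sum(pa for pa, _ in chunk)
        strikeouts = sum(k for _, k in chunk)
        rates.append(strikeouts / plate_appearances if plate_appearances else None)
    return rates


# Hypothetical game log: (PA, K) per game, not Goeddel's actual numbers.
game_log = [(4, 2), (4, 2), (4, 1), (4, 0), (4, 1)]
rolling = rolling_k_rate(game_log, window=3)
print([f"{r:.1%}" for r in rolling])
```

Summing strikeouts and plate appearances inside each window, rather than averaging per-game rates, keeps a one-PA pinch-hit appearance from counting as much as a full game.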

The next graph is another rolling total showing how Goeddel’s wRC+ has progressed throughout the season.  For those of you who are unfamiliar with the stat, wRC+ stands for weighted runs created plus.  It attempts to quantify a player’s offensive value in terms of runs.  An average wRC+ is 100.  Check out how Goeddel’s wRC+ has improved throughout the season:

Rolling wRC+

What do you think, Phillies fans?  Can Tyler Goeddel keep this up?  Is the Tyler Goeddel that we have seen over the last few weeks the real Tyler Goeddel?  Are you ready to hop on the bandwagon yet or do you need to see more from him to believe?  Only time will tell, but I’m buying into the hype and am excited to see what the future holds for this promising young player.

Twitter – @mtamburri922

A New Hitter xISO, Now with Exit Velocity

Over the last few years, Alex Chamberlain has published a series of posts exploring the concept of xISO. Like the more commonly known xFIP, this metric is supposed to be an “expected” ISO, based on batted ball metrics. Nobly, Alex kept his model quite simple, using only statistics available on the FanGraphs player pages: Hard%, FB%, and Pull%.

I have very little formal training in statistics; most of what I know is self-taught to help me in my day job, so I’m also going to keep things simple. Inspired by Alex’s work, I began to experiment with improving the xISO model. I started building linear models including more predictors, and even introduced higher-order and interaction terms. While these all improved the model slightly, I didn’t feel that the added complexity was worth the slight improvement. Along the way, I noticed something: although Chamberlain mentions the correlation between first-half xISO and end-of-season ISO, when I calculated first-half xISO and compared it to second-half ISO, the initial xISO model turned out to be a worse predictor of second-half ISO than actual first-half ISO.

As I was running these calculations, I also became acquainted with the publicly available Statcast data through Daren Willman’s Baseball Savant site. Although gathering the input data becomes a bit more tedious, surely some combination of exit velocity and launch angle information would improve an xISO model, and perhaps yield a better correlation between first and second halves. Let us see!

First things first, since Statcast is so new, we only have one full season of data. Ideally, we could use multiple years of data to build the model, but for now, we’ll stick with 2015 full season to train the model. As it turns out, the Statcast parameter that correlates best with ISO is the average exit velocity for line drives and fly balls (LDFBEV). This makes sense, right? It also makes sense that we can exclude ground ball exit velocity in an ISO predictor. Launch angle seems to have some relationship with ISO, but it’s relatively weak.

So, we’ll hang our predictive hats on LDFBEV and see what else can help. After constructing various models, we can pretty quickly see that Pull%, Center%, and Oppo% don’t add much additional explained variance between model and data, nor do Soft%, Med%, and Hard%. This isn’t surprising, since we already have an objective hard contact measure. Ultimately, the one traditional batted ball statistic that helps is GB%. In fact, in the final regression, adding GB% nets us about 18% more explained variance between model and data. This also makes sense. It’s pretty hard to hit a ground-ball double or triple, and really hard to hit a ground-ball home run.

So we’re down to two predictors, GB% and LDFBEV. If we ran a regression with only these two predictors, we would undersell the players who hit the ball really hard. To solve this, we’ll include one more term in the regression: the square of the exit velocity. Throw in a constant term, and we’re ready to run the regression using all 2015 qualified hitters (141 of them). Here’s what comes out:

xISO Model Regression

First things first, we see an R-squared value of 0.75. This is pretty decent; it means our really simple model explains 75% of the variance of the ISO data. The regression coefficients are as follows.

xISO equation
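
For anyone who wants to try a fit of this form themselves, here is a minimal sketch in plain Python using a constant, GB%, exit velocity, and exit velocity squared. Everything numeric in it is made up for illustration, including the coefficients used to generate the fake hitters; none of it reproduces the regression output above. Exit velocity is centered on a hypothetical 92 mph purely to keep the arithmetic well-conditioned, which changes the coefficient values but not the shape of the model.

```python
EV_CENTER = 92.0  # hypothetical centering value (mph), for numerical stability only


def design_row(gb_pct, ldfbev):
    """Predictors: constant, GB%, centered LDFBEV, centered LDFBEV squared."""
    ev = ldfbev - EV_CENTER
    return [1.0, gb_pct, ev, ev * ev]


def fit_least_squares(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y."""
    k = len(rows[0])
    # Augmented normal-equation matrix [X^T X | X^T y].
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * y for r, y in zip(rows, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * p for x, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


def predict(coefs, gb_pct, ldfbev):
    """Apply fitted coefficients to one hitter's GB% and LDFBEV."""
    return sum(c * x for c, x in zip(coefs, design_row(gb_pct, ldfbev)))


# Made-up "true" coefficients, used only to generate fake hitters.
TRUE_COEFS = [0.30, -0.30, 0.015, 0.001]
hitters = [(gb, ev) for gb in (0.35, 0.40, 0.45, 0.50) for ev in (88, 90, 92, 94, 96)]
rows = [design_row(gb, ev) for gb, ev in hitters]
isos = [predict(TRUE_COEFS, gb, ev) for gb, ev in hitters]

coefs = fit_least_squares(rows, isos)
print([round(c, 4) for c in coefs])
print(round(predict(coefs, 0.40, 95.0), 3))
```

Because the fake ISO values are generated without noise, the fit simply recovers the made-up coefficients; with real FanGraphs and Baseball Savant inputs the fit would be noisy and the coefficients would differ.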

With this equation, one can look up the relevant data on FanGraphs and Baseball Savant, and calculate the current xISO for any given player. We’ll get to that, but first, I think it’s important to check whether the new xISO model can do a better job predicting future performance than a player’s current ISO. One could also check how quickly xISO stabilizes, compared to ISO, but I won’t attempt that here. What I will do is produce the necessary splits for GB%, LDFBEV, and ISO from FanGraphs and Baseball Savant, calculate 2015 first half xISO for all qualified hitters, and compare to second half ISO. Unfortunately, the number of qualifying players common to the first and second half in 2015 was only 109, but this is what we have:

First Half Second Half

It’s hard to see from the plot, but the R-squared values tell the story: first half xISO does a better job than actual first half ISO at predicting second half ISO. Interestingly, it seems that several players significantly increased second half ISO compared to first half xISO or ISO, and relatively fewer saw a large decrease. I don’t know why this is, but perhaps it is related to the phenomenon detailed by Rob Arthur and Ben Lindbergh on the sudden power spike in 2015.
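
The comparison itself is easy to reproduce. Below is a minimal sketch that computes R-squared as the squared Pearson correlation, which is what a one-predictor linear fit reports. The five hitters and their values are made up for illustration, so the printed numbers say nothing about the real 2015 result.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


# Made-up values for five hypothetical hitters.
first_half_iso = [0.120, 0.210, 0.150, 0.260, 0.180]
first_half_xiso = [0.140, 0.190, 0.170, 0.240, 0.200]
second_half_iso = [0.150, 0.200, 0.175, 0.235, 0.205]

r2_iso = r_squared(first_half_iso, second_half_iso)
r2_xiso = r_squared(first_half_xiso, second_half_iso)
print(f"first half ISO  -> second half ISO: R^2 = {r2_iso:.3f}")
print(f"first half xISO -> second half ISO: R^2 = {r2_xiso:.3f}")
```

Whichever predictor gives the larger value explains more of the variance in second half ISO.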

Having roughly demonstrated the predictive power of our new xISO, let’s show its utility by looking at a few interesting 2016 performers, as of May 22nd:

Trevor Story: ISO = .327,  xISO = .272

Domingo Santana: ISO = .142,  xISO = .238

Troy Tulowitzki: ISO = .190,  xISO = .182

Chris Carter: ISO = .349,  xISO = .355

Christian Yelich: ISO = .205,  xISO = .201

One of the first half’s great surprises, Trevor Story has a slightly inflated ISO, but he does hit the ball pretty hard, and does not hit many ground balls. While he probably won’t sustain an ISO north of .300, he’s a good bet to beat his Steamer ROS projected ISO of .191. Santana and Yelich are two guys who hit the ball hard, but are held back by their ground ball tendencies. Chris Carter currently leads the pack in LDFBEV, and is a deserved second in ISO. Troy Tulowitzki fans: sorry, but it appears his days of .250 ISOs are a thing of the past.

So that’s it! We’ve got a cool new tool to use. Perhaps not surprisingly, I’ll be mostly using it for fantasy. Dedicated FanGraphs readers will also note that Andrew Perpetua has been doing work with Statcast data on “these electronic pages” recently as well. His use of launch angles introduces more sophistication into the models, but also more complication. My intent here is to present something which can be evaluated by anyone with a few clicks and a calculator. Please reach out with any qualms, criticisms, or suggestions for improvement!

Hot Starts and Cool Finishes

When a Major League hitter gets off to a particularly good start it’s tempting to think he’s figured something out and has reached a new level of offensive performance. We want to believe. We had heard all of the “best shape of his life” stories in the spring or how a hitter is committed to using the whole field this year or how he may be working on his plate discipline. If it’s not a new level of performance we can believe in, perhaps this hitter is destined for a career year. We have hope that the good times will continue.

On the flip side, when a hitter gets off to a terrible start we worry that he’ll never figure it out. If he’s older, we may think age has finally caught up to him and this is the beginning of the end. Perhaps he can no longer hit the good fastball. Or he has to cheat to hit the fastball, so he’s susceptible to an offspeed pitch. If he’s a young player, we worry that the league has adjusted to him and he needs to adjust back. We’re pessimistic and wonder if he’ll ever come around.

The reality in most cases is that a particular player is just off to a hot or cold start and will revert back to the player he was expected to be. It’s usually best to trust the projections. FanGraphs has projections from Steamer and ZiPS that are updated on a daily basis based on new information. During the 2015 season, I used the Steamer projections to find out what we can learn about a hitter getting off to a particularly hot or cold start.

Let’s use Bryce Harper as an example. Before the 2015 season began, Harper was projected by Steamer to hit .279/.361/.487, for an OPS of .848. Through one-fourth of the season (I used May 25th, roughly the one-quarter point), Harper was hitting out of his mind: .333/.471/.727/1.198 OPS (in 191 plate appearances). The updated Steamer projection called for Harper to hit .283/.378/.515 (.893 OPS). Harper’s projected *OPS increased by .045 based on an incredible 191 plate appearances.

*I intended to use wOBA for this exercise, but I didn’t save all of the necessary stats to calculate wOBA for each time period I used last season, so OPS it is.

So what did Harper do from that point forward? He was even better than his updated projection. He hit .329/.455/.617 (1.073 OPS) after May 25th. Bryce Harper absolutely torched his updated projection. In his case, it all came together and he appears to have reached a new level of performance (he was projected for a .974 OPS heading into this season, an increase of .126 points of OPS from last season’s pre-season projection).

Of course, what is true for Bryce Harper is not necessarily true for other humans. Joc Pederson came into the 2015 season as a highly-regarded young prospect and got off to a very good start. Unlike Harper, Joc Pederson did not continue to rake. Whatever the opposite of rake is, that’s what Joc Pederson did. In fact, Pederson’s impressive first 179 plate appearances (he was hitting .250/.388/.556, with a .944 OPS through May 25th) increased his Steamer projection from a .702 OPS in the pre-season to a projected .729 OPS for the rest of the season. He actually had a .685 OPS after May 25th (hitting .193/.327/.357). In the case of Joc Pederson, you would have been better off looking at his pre-season projection than his rest-of-season projection, despite the additional very good early-season plate appearances.
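
The Harper and Pederson comparisons boil down to asking which projection missed the actual rest-of-season OPS by less. Here is a small sketch of that check using the OPS figures quoted above; the function and variable names are my own.

```python
def closer_projection(pre_season, updated, actual):
    """Return which projection had the smaller absolute OPS error."""
    pre_error = abs(actual - pre_season)
    updated_error = abs(actual - updated)
    if pre_error < updated_error:
        return "pre-season"
    if updated_error < pre_error:
        return "updated"
    return "tie"


# OPS figures quoted in the text: (pre-season, updated, actual after May 25th).
players = {
    "Bryce Harper": (0.848, 0.893, 1.073),
    "Joc Pederson": (0.702, 0.729, 0.685),
}

results = {name: closer_projection(*ops) for name, ops in players.items()}
for name, winner in results.items():
    print(f"{name}: {winner} projection was closer")
```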

Bryce Harper and Joc Pederson are just two of many MLB hitters. I wondered if there were any trends we could learn from this exercise, so I accumulated the necessary data during the 2015 season. I started with hitter projections from Steamer during the pre-season, then saved actual hitter statistics through May 25th, which was roughly the one-quarter point of the season for MLB teams. At the same time, I also saved the Steamer rest-of-season projections for May 26th and beyond. These projections would be compared to the actual rest-of-season statistics for each player from May 26th on.

I knew sample size would be an issue. This is just one season, after all. Also, I wanted a player to have a good number of plate appearances in the first one-fourth of the season for the Steamer projections to incorporate into the new projection. I also wanted a good number of plate appearances after the one-fourth point. I decided to use 100 plate appearances through May 25th and 100 plate appearances after May 25th as my admittedly arbitrary cutoff points in determining the sample of hitters. This left 238 hitters.

I divided these 238 hitters into three groups based on how well they hit through May 25th. There were 79 hitters who got off to very good starts, meaning their OPS through May 25th was at least .066 higher than their pre-season Steamer projection. These guys are the “Hot Starters.” The middle group of hitters consisted of 80 hitters who had an OPS through May 25th that was between -.047 and +.065 of their pre-season projection. These guys are the “Predictables.” The final group of 79 hitters had an OPS through May 25th that was -.048 or worse than their pre-season projection. These are the “Cold Starters.”
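
As a sketch, the three-way split can be written as a small function. The cutoffs are the ones given above; the function name and the rounding to three decimals (to match how OPS is reported) are my own choices.

```python
HOT_CUTOFF = 0.066    # OPS at least this far above the pre-season projection
COLD_CUTOFF = -0.048  # OPS at least this far below the pre-season projection


def classify_start(projected_ops, actual_ops):
    """Label a hitter by how his early OPS compares to his pre-season projection."""
    # Round to three decimals so float noise does not move boundary cases.
    diff = round(actual_ops - projected_ops, 3)
    if diff >= HOT_CUTOFF:
        return "Hot Starter"
    if diff <= COLD_CUTOFF:
        return "Cold Starter"
    return "Predictable"


# Harper and Pederson, using the pre-season and through-May-25th OPS from the text.
print(classify_start(0.848, 1.198))  # Hot Starter
print(classify_start(0.702, 0.944))  # Hot Starter
# A made-up hitter exactly .048 below projection lands in the cold group.
print(classify_start(0.748, 0.700))  # Cold Starter
```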

The chart below shows the pre-season OPS projection from Steamer, along with each group’s actual OPS through May 25th, and the difference between the two.

One thing to note is that all 238 players in this sample combined were projected for a .736 OPS, but had a .748 OPS through May 25th. For reference, all hitters in MLB had a .700 OPS in 2014 and .721 OPS in 2015, so the level of offense increased in baseball from 2014 to 2015 (and these hitters were selected based on playing time so are likely to be better hitters than the league as a whole). It appears that Steamer projected hitters for a lower level of offense than what actually occurred.

The Hot Starters group was very hot, with a combined OPS that was +.137 better than their pre-season projection. This group included the aforementioned Bryce Harper, along with other hot starters from last year such as Nelson Cruz, Stephen Vogt, Adrian Gonzalez, and Brandon Crawford.

As a group, the Predictables came in close to where they were projected, with a group OPS of .736 versus a predicted OPS of .730. Players who were almost spot on with their pre-season projection included Jean Segura, Kevin Kiermaier, Will Middlebrooks, and Steven Souza, Jr.

The Cold Starters group combined for an OPS that was nearly .113 worse than projected. Guys like Victor Martinez, Jayson Werth, Carlos Gonzalez, and Christian Yelich were among the biggest offenders in this group.

So how much did the rest-of-season projections change for each group? The chart below shows the same information as above in the first two columns, then adds the updated projection for each group, along with the difference between the pre-season projection and updated projection.

Notice that the Hot Starters were initially projected for the lowest OPS of the three groups and the Cold Starters were projected for the highest. After one-fourth of the season had been played, the Hot Starters group saw their projected OPS increase by .014, while the Cold Starters saw their projected OPS decrease by .011. Even after this update, the Cold Starters (with a combined .638 OPS at this point) were still projected to be nearly as good as the Hot Starters (with a combined .864 OPS through May 25th). The Cold Starters had started with a higher projection. Their lack of production through one-fourth of the season brought them down a notch, but the difference in rest-of-season projections for the hottest and coldest hitters was negligible.

As it turned out, the entire group of hitters in this sample outperformed their updated projections by .020 from May 25th on. This is not surprising when we realize that in 2015 MLB hitters did improve as the season went on, so it makes sense that the entire group of hitters would do better than Steamer projected because Steamer was likely projecting based on a lower offensive environment. MLB hitters hit even better in the second half than the first half of 2015. Here are the monthly OPS totals for MLB hitters in 2015:


April: .705 OPS

May: .712 OPS

June: .713 OPS

July: .719 OPS

August: .736 OPS

September: .737 OPS


As for the three groups of hitters in this study, the Hot Starters averaged an OPS of .767 as a group after May 25th, compared to their .741 rest-of-season projected OPS, an increase of .026, which was the largest increase of the three groups. The Predictables outperformed expectations by .017, and the Cold Starters were better by the least amount, at .014.

At this point, it looks like early-season hot hitters are more likely to beat their updated rest-of-season projection going forward. In this case, the Hot Starters and Cold Starters were given rest-of-season projections that were very similar (.741 and .740), but the Hot Starters out-produced the Cold Starters by .012 points of OPS. We might be on to something here. Before we dive further into this, let’s check up on our old friend “regression to the mean.”

The following table shows what percentage of players from each group improved after May 25th and what percentage performed worse after May 25th.

As you would expect, the majority of the Hot Starters (82%) couldn’t keep up their hot hitting. Notable exceptions included Joe Panik, A.J. Pollock, and Lorenzo Cain. Panik was projected to have a .641 OPS in the pre-season. Through May 25th, his OPS was .773, which made his updated projection .657. Instead of coming back down to earth, Panik took his offense to another level, producing an OPS of .875 after May 25th. A.J. Pollock was similar, but at an even higher level of production.

While the majority of Hot Starters couldn’t keep up their torrid pace, most of the Cold Starters (78%) turned things around after May 25th. Three who did not improve were some of the biggest hitting disappointments in 2015—Mike Zunino, Pablo Sandoval, and Wilson Ramos. All three started poorly and didn’t get any better over the last three-fourths of the season.

In the Predictables group, the king of consistency was Buster Posey. Posey was projected for an OPS of .840. Through May 25th, his OPS was .850. This increased his rest-of-season projection to .845. He produced an OPS of .849 after May 25th. Buster Posey was a human metronome in 2015.

Okay, let’s go back. I had arbitrarily divided these hitters into three groups and came up with these initial results that appear to show that hot hitters stay hotter than their updated projection would expect. What happens if I divide them into four groups?


The column on the far right of the top chart is key here. Based on my results when the hitters were divided into three groups, I expected the Scorching hot hitters in this sample to stay hotter than the other three groups, meaning I expected them to outperform their updated rest-of-season projection by the largest amount. They did not. The Hot and Chilly hitters both improved on their updated rest-of-season projections by a greater amount than the Scorching hitters. The Chilly group of hitters were the worst of the four groups through the one-fourth point of the season (a combined .629 OPS), but actually had the highest OPS over the final three-fourths of the season.

On the bright side, the second chart came out as expected. The hitters who started out the year the hottest were the least-likely to improve after the one-quarter mark. The second-hottest hitters were the second-least likely to improve. The pattern follows for the Cool and Chilly hitters.

I did one final check with just two groups—those who had a higher OPS through May 25th than they were projected for in the pre-season and those who had a lower OPS through May 25th than they were projected for in the pre-season. That chart is below:

The Above Projection group started the year with a projected OPS of .731. Through May 25th, these hitters combined for an .823 OPS. Their updated projection was .739. After May 25th, they had a .761 OPS, which was .022 higher than their updated rest-of-season projection. The Below Projection Hitters ended up .015 higher than their updated rest-of-season projection.

This shows a slight trend towards the early season hot hitters outperforming their projection, but the difference is just .007 points of OPS and if Bryce Harper is removed from the Above Projection group, the difference drops to .005. If there is a trend, the difference is small. The important takeaway, as the second chart shows, is to trust that most of those who start out hot will cool down and most of those who start out cold will heat up.

Hardball Retrospective – What Might Have Been – The “Original” 1908 Cardinals

In “Hardball Retrospective: Evaluating Scouting and Development Outcomes for the Modern-Era Franchises”, I placed every ballplayer in the modern era (from 1901-present) on their original team. I calculated revised standings for every season based entirely on the performance of each team’s “original” players. I discuss every team’s “original” players and seasons at length along with organizational performance with respect to the Amateur Draft (or First-Year Player Draft), amateur free agent signings and other methods of player acquisition.  Season standings, WAR and Win Shares totals for the “original” teams are compared against the “actual” team results to assess each franchise’s scouting, development and general management skills.

Expanding on my research for the book, the following series of articles will reveal the teams with the biggest single-season difference in the WAR and Win Shares for the “Original” vs. “Actual” rosters for every Major League organization. “Hardball Retrospective” is available in digital format on Amazon, Barnes and Noble, GooglePlay, iTunes and KoboBooks. The paperback edition is available on Amazon, Barnes and Noble and CreateSpace. Supplemental Statistics, Charts and Graphs along with a discussion forum are offered at TuataraSoftware.com.

Don Daglow (Intellivision World Series Major League Baseball, Earl Weaver Baseball, Tony LaRussa Baseball) contributed the foreword for Hardball Retrospective. The foreword and preview of my book are accessible here.


OWAR – Wins Above Replacement for players on “original” teams

OWS – Win Shares for players on “original” teams

OPW% – Pythagorean Won-Loss record for the “original” teams

AWAR – Wins Above Replacement for players on “actual” teams

AWS – Win Shares for players on “actual” teams

APW% – Pythagorean Won-Loss record for the “actual” teams


The 1908 St. Louis Cardinals 

OWAR: 29.2     OWS: 247     OPW%: .375     (58-96)

AWAR: 13.5       AWS: 146     APW%: .318   (49-105)

WARdiff: 15.7                        WSdiff: 101  
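As a quick arithmetic check on the summary lines above, the sketch below recomputes the two differentials and converts each Pythagorean percentage into a won-loss record. It assumes the parenthetical records are simply the percentage multiplied by a 154-game schedule and rounded to the nearest win; the helper name `pythagorean_record` is made up for this illustration.

```python
def pythagorean_record(pct, games):
    """Convert a winning percentage into a rounded (wins, losses) record.

    Assumption: the record is pct * games, rounded to the nearest win.
    """
    wins = round(pct * games)
    return wins, games - wins


# Figures copied from the 1908 summary lines. The 154-game schedule is
# inferred from the 58-96 and 49-105 records, not stated in the article.
GAMES_1908 = 154
original = {"war": 29.2, "ws": 247, "pct": 0.375}
actual = {"war": 13.5, "ws": 146, "pct": 0.318}

war_diff = round(original["war"] - actual["war"], 1)
ws_diff = original["ws"] - actual["ws"]

print("Original record:", pythagorean_record(original["pct"], GAMES_1908))
print("Actual record:  ", pythagorean_record(actual["pct"], GAMES_1908))
print("WARdiff:", war_diff, "WSdiff:", ws_diff)
```

Under those assumptions the records come out to 58-96 and 49-105, with differentials of 15.7 WAR and 101 Win Shares.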

Despite a dismal effort and last-place finish, the “Original” 1908 Cardinals bested the “Actual” Redbirds by a 9-game margin and a confounding Win Shares differential of 101. “Turkey” Mike Donlin (.334/6/106) tallied 198 base knocks, pilfered 30 bags and recorded a career high in ribbies. Fellow outfielder Charlie “Eagle Eye” Hemphill swiped 42 bases and batted .297 for the “Original” Cardinals. Red Murray supplied a .282 BA with 48 stolen bases for both the “Original” and “Actual” Redbirds.

Mordecai Brown ranks twentieth among pitchers according to Bill James in “The New Bill James Historical Baseball Abstract.” “Original” Cardinals teammates listed in the “NBJHBA” top 100 rankings include Ed Konetchy (48th-1B) and Mike Donlin (52nd-CF).

Original 1908 Cardinals                  Actual 1908 Cardinals

Player            POS    OWAR   OWS      Player            POS    AWAR   AWS
Charlie Hemphill  LF/CF  3.11   25.83    Joe Delahanty     LF     -0.89  13.78
Red Murray        CF     2.92   25.78    Red Murray        CF     2.92   25.78
Mike Donlin       RF     5.8    31.2     Al Shaw           RF/CF  -0.3   10.83
Ed Konetchy       1B     1.65   16.9     Ed Konetchy       1B     1.65   16.9
Chappy Charles    2B     -2.75  2.31     Billy Gilbert     2B     -1.13  3.61
Freddy Parent     SS     1.89   11.89    Patsy O’Rourke    SS     -1.02  0.64
Bobby Byrne       3B     -1.61  3.31     Bobby Byrne       3B     -1.61  3.31
Art Hoelskoetter  C      -0.24  2.21     Art Hoelskoetter  C      -0.24  2.21
Joe Delahanty     LF     -0.89  13.78    Shad Barry        RF     -0.53  4.25
Al Shaw           CF     -0.3   10.83    Chappy Charles    2B     -2.75  2.31
Al Burch          LF     0.18   10.72    Jack Bliss        C      0.12   2.18
Spike Shannon     LF     -0.85  7.58     Bill Ludwig       C      -0.02  1.4
Jack Bliss        C      0.12   2.18     Wilbur Murdoch    LF     -0.21  1.3
Bill Ludwig       C      -0.02  1.4      Champ Osteen      SS     -0.81  0.41
Wilbur Murdoch    LF     -0.21  1.3      Charlie Moran     C      -0.44  0.28
Patsy O’Rourke    SS     -1.02  0.64     Walter Morris     SS     -0.65  0.25
Art Weaver        C      -0.1   0.33     Doc Marshall      C      -0.07  0.18
Charlie Moran     C      -0.44  0.28     Tom Reilly        SS     -0.58  0.13
Walter Morris     SS     -0.65  0.25     Ralph McLaurin    LF     -0.14  0.09
Tom Reilly        SS     -0.58  0.13
Ralph McLaurin    LF     -0.14  0.09
Simmy Murch       1B     -0.06  0.06

Mordecai “Three-Finger” Brown, in the midst of six straight seasons with 20+ victories, furnished a 29-9 record with a 1.47 ERA and a career-best WHIP of 0.842. He completed 27 of 31 starts and saved 5 contests in 13 relief appearances for the “Original” Cardinals. Billy Campbell contributed 12 wins with a 2.60 ERA and a 1.116 WHIP in 221.1 innings. “Actuals” ace Bugs Raymond suffered through a 15-25 campaign despite a 2.03 ERA and 1.021 WHIP. Johnny Lush (11-18, 2.12) endured similar results as the Redbirds rotation was unable to overcome a lackluster offense.

Original 1908 Cardinals                  Actual 1908 Cardinals

Player            POS    OWAR   OWS      Player            POS    AWAR   AWS
Mordecai Brown    SP     6.62   31.34    Bugs Raymond      SP     1.97   21.04
Billy Campbell    SP     -0.96  10.38    Johnny Lush       SP     0.26   14.3
Art Fromme        SP     -1.45  3.61     Fred Beebe        SP     -2.13  5.63
Slim Sallee       SP     -1.61  3.19     Ed Karger         SP     -1.87  3.69
Jake Thielman     RP     -0.34  3.78     Art Fromme        SP     -1.45  3.61
Irv Higginbotham  SP     -0.9   3.1      Slim Sallee       SP     -1.61  3.19
Charlie Rhodes    RP     -0.05  1.67     Irv Higginbotham  SP     -0.9   3.1
Stoney McGlynn    SP     -1.16  1.23     Charlie Rhodes    SP     0      1.4
O.F. Baldwin      SP     -0.46  0        Stoney McGlynn    SP     -1.16  1.23
Buster Brown      RP     -0.39  0        O.F. Baldwin      SP     -0.46  0
Fred Gaiser       RP     -0.13  0        Fred Gaiser       RP     -0.13  0

Notable Transactions

Mordecai Brown

December 12, 1903: Traded by the St. Louis Cardinals with Jack O’Neill to the Chicago Cubs for Larry McLean and Jack Taylor.

Mike Donlin

Before 1901 Season: Jumped from the St. Louis Cardinals to the Baltimore Orioles.

Before 1902 Season: Released by the Baltimore Orioles.

August, 1902: Signed as a Free Agent with the Cincinnati Reds.

August 7, 1904: Traded as part of a 3-team trade by the Cincinnati Reds to the New York Giants. The New York Giants sent Moose McCormick to the Pittsburgh Pirates. The Pittsburgh Pirates sent Jimmy Sebring to the Cincinnati Reds.

Charlie Hemphill

March 2, 1901: Jumped from the St. Louis Cardinals to the Boston Americans.

Before 1902 Season: Signed as a Free Agent with the Cleveland Bronchos.

June, 1902: Released by the Cleveland Bronchos. (Date given is approximate. Exact date is uncertain.)

June 4, 1902: Signed as a Free Agent with the St. Louis Browns.

August 23, 1905: Purchased by the St. Louis Browns from St Paul (American Association). (Date given is approximate. Exact date is uncertain.)

November 5, 1907: Traded by the St. Louis Browns with Fred Glade and Harry Niles to the New York Highlanders for Hobe Ferris, Danny Hoffman and Jimmy Williams.

Honorable Mention

The 1983 St. Louis Cardinals 

OWAR: 54.8     OWS: 310     OPW%: .517     (84-78)

AWAR: 36.1     AWS: 237     APW%: .488   (79-83)

WARdiff: 18.7                        WSdiff: 73 

The “Original” 1983 Cardinals seized the National League Eastern Division flag by a single game over the Expos. The flock featured left fielder Jose Cruz (.318/14/92), the NL leader with 189 base hits. “Cheo” reached the 30-steal mark for the fifth time in his career. Terry Kennedy (.284/17/98) registered a personal-best in RBI. Keith Hernandez earned the sixth of eleven consecutive Gold Glove Awards. John Denny (19-6, 2.37) merited the NL Cy Young Award. Larry Herndon notched personal-highs in batting average (.302), hits (182), doubles (28) and RBI (92). Ted “Simba” Simmons delivered a .308 BA with 39 two-baggers and 108 ribbies. Steve “Lefty” Carlton whiffed 275 batsmen and fashioned a 3.11 ERA. George Hendrick (.318/18/97) received his fourth All-Star invitation and posted a career-high in batting average for the “Actual” Redbirds.

On Deck

What Might Have Been – The “Original” 1975 Astros

References and Resources

Baseball America – Executive Database


James, Bill. The New Bill James Historical Baseball Abstract. New York, NY.: The Free Press, 2001. Print.

James, Bill, with Jim Henzler. Win Shares. Morton Grove, Ill.: STATS, 2002. Print.

Retrosheet – Transactions Database

The information used here was obtained free of charge from and is copyrighted by Retrosheet. Interested parties may contact Retrosheet at “www.retrosheet.org”.

Seamheads – Baseball Gauge

Sean Lahman Baseball Archive

Where Art Thou, Yan?

It seems that more and more often, we as baseball fans try to “diagnose” the cause of a specific player’s struggles and give our two cents on whether everyone should, in the words of Aaron Rodgers, relax, or be concerned about the player’s deficiencies. I am not sure why that is; maybe it’s because talking about other people’s problems makes us forget about our own. Maybe it’s because we as humans simply like to tell other people how to do their jobs, because it makes us feel important. No one will ever truly know the answer to that question. With that being said, however, I am going to do exactly what I just described in the previous four sentences: I am going to try to explain what is going on with Yan Gomes. In his first two seasons with the Tribe (223 games total), he accumulated 7.8 WAR, won a Silver Slugger award in 2014, and drew positive reviews for his framing abilities according to Baseball Prospectus (ranked 17th out of 417 catchers in 2013 and 32nd out of 382 in 2014 in the Framing Runs statistic). Framing Runs essentially estimates how many runs a catcher saves over a given season based on how many extra strikes he is able to get for his pitchers through his framing. The Indians, seeing a young and talented player still required to go through the arbitration process for several more years, signed Gomes to a six-year, 23-million-dollar contract before the 2014 season. As the chart below shows, the Indians felt they were in for a huge bargain.

Year Age Salary (in millions) WAR est. $/WAR Value (in millions)
2014 27 0.6 3.5 7.6 26.6
2015 28 1 3.15 8.2 25.8
2016 29 2.5 2.84 8.8 24.9
2017 30 4.5 2.55 9.4 24.0
2018 31 6 2.17 10.0 21.7
2019 32 7 1.84 10.6 19.5
Total 23 (includes 0.5 million signing bonus) 142.6
Surplus Value 119.6 M


To briefly explain my methodology, I used the estimates for dollars per WAR (which adjust for inflation) from an article by Matt Swartz at The Hardball Times, and adjusted Gomes’ WAR each year by the generally accepted decline rates laid out by Dave Cameron of FanGraphs a few years back. Players on average perform at 90% of their previous year’s WAR output through age 30, 85% from 31-35, and 80% from 36 and up. When the Indians signed Gomes, he was coming off a 3.3 WAR season. Considering he was going into his age-27 season, he was probably nearing his peak year in terms of WAR. Therefore, right or wrong, I believe his true-talent level (and what the Tribe were expecting from him) in 2014 was right around 3.5 WAR. I adjusted his yearly totals accordingly until his contract expired; I did not incorporate the team options for 2020 and 2021. The Indians would receive roughly 120 million dollars in surplus value over the length of Gomes’ contract, which would be an incredible deal for a small-market team.
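For anyone who wants to retrace the arithmetic, here is a minimal sketch of the calculation described above. The $/WAR estimates, the 3.5 WAR starting point and the 23-million-dollar contract total are taken from the table; the function and variable names are my own and purely illustrative.

```python
def retention_rate(age):
    """Share of the previous year's WAR a player is expected to keep."""
    if age <= 30:
        return 0.90
    if age <= 35:
        return 0.85
    return 0.80


def project_war(start_war, start_age, seasons):
    """Project WAR season by season, applying the aging rule each year."""
    projections = [start_war]
    for offset in range(1, seasons):
        age = start_age + offset
        projections.append(projections[-1] * retention_rate(age))
    return projections


# Estimated $/WAR in millions for 2014-2019, copied from the table.
dollars_per_war = [7.6, 8.2, 8.8, 9.4, 10.0, 10.6]
CONTRACT_TOTAL = 23.0  # millions, including the signing bonus

war_by_year = project_war(start_war=3.5, start_age=27, seasons=6)
value_by_year = [war * rate for war, rate in zip(war_by_year, dollars_per_war)]

total_value = sum(value_by_year)
surplus = total_value - CONTRACT_TOTAL

print(f"Total value: {total_value:.1f}M, surplus: {surplus:.1f}M")
```

Carrying unrounded WAR through each season reproduces the 142.6 total and 119.6 surplus shown in the table, even though the six rounded yearly values add up to 142.5.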

Gomes went out in 2014 and produced a 4.5 WAR season, further increasing the bargain for the Tribe in the early going of the deal. Since 2014, however, Gomes hasn’t been the same player at the dish. His defense still grades out favorably according to many defensive metrics, but his bat appears to have taken a big step back. It isn’t fair to judge him on 2015, considering he was injured early in the season and never fully recovered. This year, there isn’t an injury excuse (that we know of, anyway). Gomes is slashing a dismal .167/.204/.353 at the plate and has a wRC+ of just 46, meaning his hitting has been 54% worse than league average. A few things of note before jumping into a more detailed analysis: he is running a .174 BABIP, which is far lower than his career average of .302 and which, as it regresses, will raise his batting average. His walk rate is about the same, and he is only striking out 3% more often than in his “peak” season of 2014. While a 3% rise in strikeout percentage isn’t minuscule, Gomes has always been known as a free swinger (over the last four years, he is in the 75th percentile in swinging strikes and 83rd percentile in swing percentage).

So, the big question here is, what specifically is causing Gomes’ struggles? I am going to try to be as systematic as possible here, so that everything kind of builds upon itself. To quickly summarize his plate discipline statistics — because I don’t think there are really any surprises here — his out of zone, zone, and overall swing percentages in comparison to his career have increased, and his out of zone and overall contact percentages have decreased. I am not sure why his Z-Contact% has increased, but I don’t think that is of much consequence. It is clear that Gomes is swinging more, and making contact less.

Turning to his batted-ball statistics, there are several important changes that start to paint a better picture of why Gomes is struggling. For ease of communication, I have split the information into two tables below.

Season Team GB/FB LD% GB% FB% IFFB%
2012 Blue Jays 1.28 14.9% 47.8% 37.3% 8.0%
2013 Indians 1.12 17.8% 43.5% 38.7% 11.2%
2014 Indians 0.93 24.0% 36.7% 39.4% 9.6%
2015 Indians 0.84 26.4% 33.6% 40.0% 11.3%
2016 Indians 0.76 18.9% 35.1% 45.9% 14.7%

Notice how in each of Gomes’ major league seasons, his groundball-to-fly-ball ratio has gone down. This could be considered a good thing, since he possesses a ton of raw power, and everyone knows you can’t hit home runs on the ground (okay, technically you can, but Gomes doesn’t have Dee Gordon speed). The next thing that jumps out is his 14.7% pop-up rate, which is the 25th highest out of 192 qualified hitters. His increased fly-ball rate, coupled with his bloated IFFB%, could explain why his BABIP is so low: balls in the air are caught more often than balls on the ground. More importantly, though, there could be a pitch-recognition problem, considering he isn’t squaring up balls as consistently as he has in the past. To explore this further, let’s take a look at the next chart.

Season Team Pull% Cent% Oppo% Soft% Med% Hard%
2012 Blue Jays 52.9% 31.4% 15.7% 7.1% 62.9% 30.0%
2013 Indians 42.2% 31.7% 26.1% 14.3% 53.5% 32.2%
2014 Indians 42.6% 30.2% 27.2% 16.4% 52.6% 31.0%
2015 Indians 37.4% 37.0% 25.7% 16.6% 55.5% 27.9%
2016 Indians 44.6% 40.5% 14.9% 20.3% 54.1% 25.7%

Gomes is pulling the ball more than he has at any point in his career, excluding the cup of coffee he had in the bigs in 2012. He has also basically abandoned taking the ball the other way. Looking at his quality-of-contact stats, he is hitting the ball “hard” less often than he typically has throughout his career, too.

Sure enough, Gomes has been below the league average in exit velocity for the majority of the season. So, to recap what I have already found, Gomes is hitting a ton of fly balls and pop-ups, is pulling the ball more and taking it the other way less, and is hitting the ball softer than usual. What does this all mean? I think it illustrates that Gomes is struggling with breaking balls.

Looking at Gomes’ spray angles against hard, breaking, and offspeed pitches, it appears that he is not recognizing breaking balls well this season.

For those who aren’t familiar with Brooks Baseball’s spray angle data, it essentially shows the average direction in which balls are hit on the field. So, a positive spray angle (as depicted on the graph) means that the hitter tends to pull that pitch, and a negative spray angle means he tends to take it the other way. A recent FanGraphs Community Blog post by an author named Brad McKay explained the significance of spray angle well, in my opinion. He surmised that similar spray angles for different pitch types suggest that a player “was able to recognize and wait back equally well for both pitch types,” something that I happen to agree with. Looking at Gomes’ Silver Slugger Award-winning year, it appears that Gomes hit fastballs and breaking balls at a similar spray angle, and hit offspeed pitches at an almost identical angle as well. This shows that Gomes was picking up the ball well in 2014. Fast-forward to 2016, and you can see that those angles have changed, and Gomes is now pulling breaking balls more than he pulls fastballs. This suggests that something isn’t right with Gomes’ pitch recognition. He has almost reverted to what he was in 2013. Interestingly enough, Gomes hit really well that season in 88 games played. The difference between then and now, however, is the pitch sequencing.

The approach against him has done a complete 180. The lefties, who used to pound him with fastballs when ahead in the count, now go to their breaking balls, while the righties, who used to pound him with breaking balls when ahead in the count, now attack him with fastballs. Essentially, the way pitchers (both lefties and righties) attacked Gomes in 2013 is consistent with how one would traditionally pitch to an aggressive, right-handed power bat. Here’s what I think has happened now. Pitchers have realized that Gomes is not picking up breaking balls the way he did in the past, which forces him to sit on the breaking ball on the majority of pitches. He does this in the hope of picking up the breaking ball early enough to decide whether or not to swing. With this in mind, right-handed pitchers know that because Gomes is sitting on the breaking ball, he will often have a harder time catching up to the fastball. Simultaneously, left-handers know that they can attack him early with their fastballs (generally a pitch righties see well from lefties) to get ahead in the count, and then try to put him away with the breaking ball. In a sense, Gomes is completely and utterly discombobulated at the plate. Here are his heat maps vs. righties, broken down into “hard stuff” and “breaking balls.” As expected, the “hard stuff” is up, while the breaking balls start over the middle of the plate and break down and away. Next, the lefties.

Lefties have attacked him with fastballs low and inside, and use this to set up the breaking ball on Gomes’ back foot, which is incredibly difficult to hit (especially for someone not picking up those types of pitches well). Gomes is hitting .177 against the 55 sliders he’s seen this year, and .000 against the 35 curveballs he’s seen. His averages against harder pitches are not much better.

Now that we have identified the problem, is there a way to fix it? I don’t know what Gomes is doing behind the scenes, but in my opinion there are three ways to go about this. First, I think Gomes should study the way pitchers are attacking him (which I would assume he is already doing). Second, using this knowledge, I think Gomes could benefit from being a little more patient at the plate. Instead of swinging out of his shoes all the time, he might be better served remembering how pitchers are attacking him and waiting on a pitch he not only can drive, but knows is most likely coming (helping to eliminate the guessing game he is playing right now). Lastly, I think he could simply practice recognizing pitches on the pitching machines teams have in the clubhouse. Gomes could spend time every day tracking a set number of pitches, working to improve his ability to discern spin on the baseball upon its release. Then, he could put that pitch recognition to the test by actually attempting to hit the pitches when they are thrown. These are pretty simplistic solutions, and I am sure Gomes is already working tirelessly to break out of his slump. These are just my best guesses on how to address this deficiency in Gomes’ game going forward in 2016.

I still believe in Yan Gomes, and so should you. He has proven he can be a successful big-leaguer and one of the top catchers in the league. Catchers are judged more on their defense than on their bat, and catchers who can do both are at a premium. In other words, Gomes could still be considered a solid MLB catcher even if he never regains his old form at the plate. It is my opinion, however, that we should not sell him short at the plate. The ability is there; it just needs a little refining right now. For the sake of Indians fans everywhere, let’s hope Gomes can unleash his inner “Yanimal” sooner rather than later; the fate of the Indians’ season depends on it.









The Future of Analytics In Baseball: How Will Small-Market Teams Fare?

This post originally appeared on the Pittsburgh Pirates blog Bucco’s Cove.

A recent episode of the Baseball Prospectus podcast Effectively Wild (and if you don’t listen to it, this is one of the best baseball podcasts out there) had two analysts from the LA Dodgers’ front office as guests. During the episode, one of them said, “Even though we have grown substantially in the last year…” and went on to talk about the size of their analytics department and how they work together. This is a scary prospect for small-market teams like the Pirates; embracing analytics before such things were en vogue allowed teams like the Moneyball A’s, the Royals, the Pirates, and many others to gain a competitive advantage over their comparatively retrograde competition still throwing money at their problems every offseason.

The window of opportunity for small-market teams to use advanced analytics to their advantage may be closing faster than we think. Most (and possibly all, I don’t have access to every team’s front office payroll) teams have some sort of analytics department (or “Baseball Operations Department,” as they’re often dubbed). According to this ESPN article from about 14 months ago, only two woeful teams are listed as “nonbelievers,” the Marlins and the Phillies, and the Phillies have since seen some significant shuffling in their front offices. Larger teams are beginning to emulate their smaller counterparts to varying extents, with results that will bear fruit over the coming seasons. As a fan of a small-market team, this is concerning; the limited dividends paid from the analytics advantage may mean a return to the old power structure in baseball in which larger-market teams with more money have the ability to acquire players at will. The difference, however, will be that stats will have informed the signings, so if two teams are targeting the same player for “sabermetric reasons,” the team with more money will obviously still have the upper hand.

Scarier still for fans of small-market teams is that the greater financial capital available to geographically favored franchises can be employed not only to sign the best players, but also to hire the most talented analysts, and more of them. The premise that teams all have access to effectively the same data and analysis is rendered moot if larger franchises can secure a stronger analytics department, both in terms of the number of analysts and the talent of the analysts (money could even be used to lure talented analysts to the richer franchises in the same way that players are). For example, the Cubs thus far this season seem to be a perfect confluence of young talent, effective free-agent signings based on a strong analytics department, and a hell of a lot of money, which is exactly where you want to be if you’re trying to create a dynasty and win multiple Commissioner’s Trophies.

Parity in the league is still greater than that of the NFL, but we could be witnessing the last generation of such parity. How is such a situation solved? The one obvious choice is a salary cap; the players’ association would be loath to support such an idea, although it’s perhaps beginning to be in their interest. As the league’s revenue increases, players haven’t been getting the same share of that revenue, according to Nathaniel Grow on FanGraphs. A quote from that article:

“The biggest difference between the NBA and MLB, then, isn’t the fact that the former has a salary cap while the latter does not. Instead, the primary difference between the two leagues’ economic models is that by agreeing to a “salary cap,” NBA players in turn receive a guaranteed percentage of the league’s revenues, while MLB players do not.”

According to the same article, the players’ share of revenue has fallen about 13% to 16% since 2002 or 2003. While this argument is unlikely to induce the MLBPA to support a salary cap, a downturn in league parity could force their hand at some point in the future. This would be a long-term effect, however; many years of a “lack of parity,” coupled with a downturn in the popularity of the sport as a whole, would be required to even have the MLBPA thinking about acquiescing to a salary cap.

Coming back to the proliferation of analytics departments among MLB teams and their effect on the advantages held by those willing to embrace statistics: I don’t know what’s going to happen. There are many facets to analytics, more than just comparing players based on BABIP or K% or arm slot, or determining which players to acquire and how much they’re worth. For example, one of the Effectively Wild guests from the episode I cited earlier was a biomedical engineering major during her undergraduate studies, implying that the front office is becoming interested in the medical side of analytics: preventing injuries, improving player health, and looking at the biomechanical aspect of baseball, which takes a significant toll on players’ bodies. This is not too dissimilar from what the Pirates have done in recent years and is just one of the many components of assembling and maintaining a competitive squad.

This line of thinking admittedly removes the human component from the equation, which is still incredibly important to this entire process. There will always be GMs who are more willing to try new strategies to win and those who are unwilling to change (*cough* Ruben Amaro, Jr.). Coaching and player development, especially in the minor leagues, will continue to be extremely important for MLB franchises and is largely outside the purview of the type of statistical analysis that is widely considered in evaluating players. Rather, this part of baseball can be thought of, to a certain extent, as producing the statistics that analysts ultimately study. As a result, there will always be opportunities for smaller-market teams to hire talented personnel, including trainers, coaches, scouts, and other employees outside the scope of the Major League analytics departments that will influence franchises’ success and failure.

However, analytics at the MLB level may start to be influenced by money. Ultimately, stories like the Pirates’ repeated acquisitions of undervalued Yankee catchers who are stellar pitch framers, the Royals’ World Series win relying on great defense and a crazy strong bullpen, and the general parity of the league beyond the traditionally great franchises may be fewer and further between. Those franchises with more money may regain the competitive advantage that the sabermetric revolution has wrested away from them for the past decade, and smaller-market teams will have to find yet another way to adapt to the ever-changing baseball landscape.

Got Projections?

Back in college, I remember being fascinated by a concept I learned in one of the first chemistry classes I took: the atomic orbitals. Contrary to what I thought at the time, electrons don’t orbit around the atom’s nucleus in a defined path, the way the planets orbit around the sun. Instead, they move randomly in the vicinity of the nucleus, making it really hard to pinpoint their location. In order to describe the electrons’ whereabouts within the atom, scientists came up with the concept of orbitals, which, simply put, are areas where there’s a high probability of finding an electron. That’s pretty much how I see baseball projections.

A term that is very often used by the sabermetric community is “true talent level,” and, just like an electron’s position, it is a very hard thing to pinpoint. Projections, however, do a very good job of defining the equivalent of an atomic orbital: a range of values where there’s a high probability of finding a certain stat. I know what you’re thinking; projections are not a range of values. But you can always convert them very quickly just by adding a ±20% error (or any other percentage you consider fitting). So, for example, if a certain player is projected to hit 20 home runs, you can reasonably expect to see him slug 16 to 24 homers.
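The conversion from a point projection to a range is simple enough to write down. This is only a sketch; `projection_range` is a name I made up, and the 20% default is the same rough figure used above rather than a measured error band.

```python
def projection_range(point_estimate, error=0.20):
    """Turn a single projected value into a (low, high) band.

    The 20% default mirrors the rule of thumb in the text; it is not a
    statistically derived confidence interval.
    """
    low = point_estimate * (1 - error)
    high = point_estimate * (1 + error)
    return low, high


low, high = projection_range(20)
print(f"{low:.0f} to {high:.0f} home runs")  # prints "16 to 24 home runs"
```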

As a 12-year veteran fantasy baseball manager (and not a very good one at that), I’ve never used projected stats as a player-evaluating tool when I’ve gone into a draft. For some reason (probably laziness), I’ve mainly focused on “last year’s” stats, and felt that players repeating their last season’s numbers was as good a bet as any. This year, after taking a lot of heat for picking Francisco Lindor and Joe Panik much higher than my buddies thought they should’ve been taken, I started wondering how much of a disadvantage it was to use simple prior-year data instead of a more elaborate method.

To satisfy my curiosity, I decided to evaluate how good a predictor “last year” numbers are, and compare them to other options such as using the last two or three years, or using some publicly available projections. In this particular piece, I’ll limit the study to offensive stats, but I’ll probably tackle pitching stats in a second article.

The first step for this little research project was to establish the criteria with which to compare the different projections. A simple way to evaluate projection performance is using the sum of the squared errors; the greater the sum, the worse the projection (in case you’re wondering, errors are squared in order to make negative errors positive so they can be added; squaring also penalizes bigger errors more than smaller ones). In this particular case, however, I wanted to evaluate projections for a number of different stats, so a simple sum of squared errors would have an obvious flaw in that stats with bigger values have bigger errors. For example, an error of 10 at-bats is a very small one, given that most players log 450+ of them per season. On the other hand, an error of 10 HR is huge. Additionally, not every stat has the same variation among players. Home runs, for example, have a standard deviation of around 70% of the mean, while batting average’s standard deviation is only about 11% of the mean. So, you could say that it’s harder to predict HR than it is to predict AVG.

Long story short, I divided each squared error by the squared standard deviation, and calculated the average of all those values for each stat. Finally, I converted those averages to a 0 to 1 scale, with 1 being a perfect prediction (in reality, these values could be less than zero when errors are greater than 1.5 standard deviations, but I scaled it so that none of the averages came out negative).
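Here is one way the scoring could be sketched. Two things are assumptions on my part and not spelled out above: that the standard deviation is taken over the actual values of the stat across players, and that the 0-to-1 conversion is 1 minus the average normalized squared error divided by 1.5 squared (which is what makes an average miss of 1.5 standard deviations score exactly zero). The function name `accuracy_score` is made up for this illustration.

```python
from statistics import pstdev


def accuracy_score(projected, actual, zero_point_sd=1.5):
    """Score a set of projections for one stat; 1.0 is a perfect prediction.

    Assumptions (not confirmed by the article): the standard deviation is
    computed from the actual values, and the score hits 0 when the average
    squared error equals (zero_point_sd standard deviations) squared.
    """
    sd = pstdev(actual)
    if sd == 0:
        raise ValueError("actual values have no spread to normalize by")
    normalized = [((p - a) / sd) ** 2 for p, a in zip(projected, actual)]
    mean_normalized = sum(normalized) / len(normalized)
    return 1 - mean_normalized / zero_point_sd ** 2


# Made-up home run totals for four hypothetical hitters.
actual_hr = [12, 25, 31, 8]
projected_hr = [15, 22, 28, 10]
print(round(accuracy_score(projected_hr, actual_hr), 3))
```

Because each error is divided by the spread of the stat itself, scores for at-bats and home runs end up on the same footing and can be averaged.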

For this study, only players with at least 250 AB on the season were considered. Also, players that were predicted to have fewer than 100 AB were not considered, even if they did amass more than 250 AB on the season. The analysis was done on five different sets of predicting data:

  1. Last season’s stats.
  2. A weighted average of the two preceding seasons, with a weight of 67% for year n-1, and 33% for year n-2.
  3. A weighted average of the last three seasons, with 57.5% for year n-1, 28.5% for year n-2, and 14% for year n-3.
  4. ZiPS projections (created by Dan Szymborski, available at FanGraphs).
  5. Steamer projections (created by Jared Cross, Dash Davidson, and Peter Rosenbloom; also available at FanGraphs).
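The first three “home-made” options in the list above are easy to express in code. The weights below are the ones given in the list; the function name and the sample home run totals are hypothetical.

```python
# Weights for the most recent season first (year n-1, n-2, n-3).
WEIGHTS = {
    1: (1.0,),
    2: (0.67, 0.33),
    3: (0.575, 0.285, 0.14),
}


def weighted_projection(recent_seasons):
    """Project next season from 1-3 prior seasons, most recent first."""
    weights = WEIGHTS[len(recent_seasons)]
    return sum(w * value for w, value in zip(weights, recent_seasons))


# Hypothetical player with 30, 20 and 10 HR over his last three seasons.
print(weighted_projection([30]))          # last season only
print(weighted_projection([30, 20]))      # two-year weighted average
print(weighted_projection([30, 20, 10]))  # three-year weighted average
```

A player with fewer than two or three seasons of data simply cannot be projected with options 2 or 3.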

The following graph shows the average score of each of the 5 projections for each individual stat considered in this study. The graph also shows the overall score for each stat, in order to have an idea of the “predictability” of each one of them. Remember, higher scores indicate better performance, with 1 being a perfect prediction.


Other than hinting that it is in fact a very poor decision to use only last year’s data, this graph doesn’t tell us much about which predicting data has a better overall performance. It does provide, however, a very good idea of the comparative reliability of each stat within the projections.

Aside from stolen bases (which honestly surprised me as being the most predictable stat of the bunch), the three most reliable stats are the ones you would’ve expected: HR, BB, and K. They’re called “true outcomes” for a reason: they depend a great deal on true talent level, and involve very few external factors such as luck or the opponent’s defensive ability.

On the other end of the spectrum, it’s really no surprise to find three-baggers as the least reliable stat. This may seem counterintuitive at first, given that players who lead the league in triples are usually speedy guys. Nonetheless, triples almost always involve an outfielder misplaying a ball and/or a weird feature of the park such as the Green Monster in Fenway or Tal’s Hill in Minute Maid’s center field, making them unusual and random events. Playing time (represented in this case by at-bats) also has an understandably low overall score. Most injuries, which are a major modifier of playing time, are random and hard to predict. Also, managerial or front-office decisions can affect a player’s playing time. It does surprise me, however, to see doubles so far down in this graph, and I really can’t find a logical explanation for it.

Let’s move on now to the real reason why we started doing all this in the first place. Here’s a graph that shows the average score for each predicting data, for years 2013, 2014, and 2015. It also shows the three-year average score.



The one fact that clearly stands out in this graph is that last-year numbers are a very poor predicting tool. Their performance is consistently and considerably worse than that of any other set of data used. So my initial question is answered in a pretty definite way: it is a huge mistake to rely on just last season’s numbers when trying to predict future performance.

Turning our attention to the other four projections, it becomes a bit harder to separate them from each other, especially using only three years’ worth of data. The average performance of the three-year period gives us a general idea of the accuracy of each option, but looking at the year-by-year numbers, it’s not really clear which one is better. Steamer seems to be the winner here, since it had the best score in all three years. ZiPS, on the other hand, despite having a better overall score than the three-year weighted average, has a worse score in two of the three years. They were really close in 2014 and 2015, but ZiPS was considerably better in 2013, which, interestingly, was a less predictable year than the other two.

The biggest point in favor of ZiPS when comparing against the three-year weighted average is that ZiPS doesn’t actually need players to have three years’ worth of MLB data in order to predict future performance, and that makes a huge difference. Another major point in favor of ZiPS is that it’s doing all the work for you! Believe me, you do not want to be matching data from three different years every time drafting season comes around (I just did it for this piece and it’s really dull work).

After all is said and done, projection systems such as Steamer or ZiPS do a fine job of giving us a good indication of what to expect from players. We’re much better off using them as guidelines when constructing our fantasy teams than any home-made projection we could manufacture (unless you’re John Nash or Bill freaking James). I know next March I’ll be taking advantage of these tools, hoping they translate into my very elusive first fantasy league title.

Tyler Wilson and His Five Plus Pitches

Let me preface this article by saying that I watch A LOT of baseball.  I also have an extensive analytical background and am always analyzing baseball stats looking for value in players.  Last week, I was watching an Orioles game and the starting pitcher was a player I had never heard of.  His name is Tyler Wilson.  While watching the game, I was very impressed with his overall make-up and the confidence he displayed in each one of his pitches.  Many times what separates a pitcher from being able to start at the big-league level versus being destined for the bullpen is the ability to throw multiple pitches.  The ability to throw each of those pitches effectively, however, can be what separates a good starting pitcher from a great starting pitcher.  The more I watched of Wilson, the more intrigued I became about his future outlook, and the more motivated I became to write this article.  (I went back and watched all of Wilson’s starts this year before writing this article.)

To give you a little background, Tyler Wilson has never been an elite prospect.  He attended college at the University of Virginia, where he was overshadowed by fellow staff-mate and future first-round pick Danny Hultzen.  Wilson was drafted by the Orioles in the 10th round of the 2011 MLB Draft.  Ever since being drafted, he has quietly excelled at every level.  He doesn’t have the dominant strikeout numbers that you look for in pitching prospects, which is a big reason he has gone overlooked for much of his career.

After climbing his way through the organizational ladder, Wilson made his major league debut with the Orioles last year and eventually made the team this year out of spring training.  Although he made the team in a bullpen role, early season injuries to the Orioles pitching staff opened up an opportunity and Wilson has really taken advantage of it.  Enough of the background though.  Let’s move on to what I saw while actually watching him pitch.

Tyler Wilson features a cutter and a two-seam fastball.  Each of these pitches sits in the 89-91 mph range and both show a great amount of movement.  The cutter is most effective against right-handed batters when thrown on the outside portion of the plate.  Check out the video below to watch him fool Kansas City Royals outfielder Lorenzo Cain with three straight cutters:

He essentially gave Cain, a very good hitter, three of the exact same pitches in a row…and Cain couldn’t touch them.  In every start this year, Wilson has pounded the outside corner with this cutter and has had fantastic results.  Don’t think by any means though that he is a one trick pony.  As soon as you start to expect that cutter on the outside corner, Wilson will come right back in on you with a two-seam fastball:

Look at the horizontal movement on that pitch!  Absolutely filthy!  Wilson has shown a ton of confidence in both of those pitches so far this season as he uses them to pound both sides of the strike zone, and his command of them has been exceptional.  He is not afraid to throw them in any count and they are equally effective vs both left-handed and right-handed batters.

While his fastballs both seemed to be plus pitches upon first glance, I started to have thoughts that this guy might be for real as soon as he started throwing his curveball.  Wilson’s breaking ball sits in the 77-79 mph range.  I was astonished by how well he was able to locate his curve and the amount of movement on each and every one he threw.  Watch him send White Sox slugger Jose Abreu down swinging in the video below:

Abreu had no chance.  In his most recent start against the Twins, Wilson’s curve looked even better.  Check out the one he threw to Byung-Ho Park:

Both of those pitches came in a 2-2 count.  Many pitchers are scared to throw a breaking ball in a 2-2 count, especially to players with plus power such as Abreu and Park.  If you miss your target, two things can happen.  One — you leave the ball up in the zone and it gets hit out of the stadium.  Two — you throw it in the dirt; the hitter lays off; and now you have to pitch to this slugger with a full count.  Wilson isn’t scared to throw his curveball in any count and that is what makes him so dangerous.  You never know when to expect it, but at the same time you have to expect that he can throw it at any moment.

The last pitch in Wilson’s arsenal is his changeup.  This pitch has a ton of downward movement and produces a lot of groundballs.  While there were many better examples that I could have shown you of his change-up in action, I wanted to show one of his bad ones.  Even when he missed his target, the batter was still fooled by the amount of movement on this pitch.  Check out the following pitch to Royals SS Alcides Escobar:

The catcher set up down in the zone and Wilson clearly misses his target.  Luckily it didn’t seem to matter as the pitch had an insane amount of horizontal movement, running in on Escobar and jamming him.

Take a look at the chart below, showing the vertical and horizontal movement on each of Wilson’s pitches:

Tyler Wilson Movement

The middle portion of this chart is empty.  All five of his pitches have a tremendous amount of movement, and none of them move in the same direction.  The fact that he is able to command each of these pitches so well and keep hitters guessing with which one will come next is the reason why he has had so much success.  A big reason why hitters are having trouble guessing his pitches is because of how well Wilson is able to repeat his delivery.  The chart below shows Wilson’s release point for each type of pitch:

Tyler Wilson Release Point

As you can see, his release point is almost identical with all five of his pitches.  At this point, I had watched all of his starts from this season and was very impressed.  I then decided to do some research and was immediately impressed with stats such as his career BB rate and low WHIP, but wanted to dig further.  I began to look through the PITCHf/x data because I was curious to see how effective each of his pitches actually was.  Based on the PITCHf/x value metric, all of his pitches so far this year have graded as above average.  If you are not familiar with the PITCHf/x value scale, a fastball value of zero means that the pitcher possesses an average fastball.  Any value above zero means that pitch is above average.  Obviously the higher the number, the better the pitch.  The same goes for negative numbers and pitches being below average.  See the table below for the breakdown of Wilson’s arsenal:

Tyler Wilson Pitch Values

Based on the above values, the change-up has been Wilson’s most valuable pitch this season with his curveball close behind.  Obviously it is very early in the season and we are working with a small sample size…but that doesn’t mean we can’t have fun!  While doing this research, I set out to find every starting pitcher who throws five or more above-average pitches.  Below is the list of players who fit that description:

Starting Pitchers With Five or More Above-Average Pitches
IP = Innings Pitched
FA = Fastball
FT = Two-Seam Fastball
FC = Cut Fastball
SI = Sinker
SL = Slider
CU = Curveball
CH = Change-up
KC = Knuckle Curveball
EP = Eephus

There are only five pitchers who have thrown five or more pitches above average so far this season!  Wilson is in great company, as the other four pitchers are all All-Star-caliber players and borderline household names.  Because this is such a small sample, I decided to look back at last year’s stats to see how many players fit this description over a full season.  Using the same parameters and setting the minimum IP to 100, the following table was produced:

Starting Pitchers With Five or More Above-Average Pitches, Last Season (Minimum 100 IP)

Once again, the names on this list are some of the top pitchers in baseball.  A few of these pitchers have a pitch that graded out as below average, but since they had five or more different pitches all individually grade as above average, they made the final cut.
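The selection rule described above (count the pitch types whose value is above zero, and keep pitchers with at least five of them and enough innings) can be sketched roughly as follows. The function names and the sample numbers are made up for illustration; they are not the actual PITCHf/x values or the tooling used for the tables.

```python
def count_above_average(pitch_values):
    """Count pitch types whose value is above zero (zero means average)."""
    return sum(1 for value in pitch_values.values() if value > 0)


def five_plus_pitchers(pitchers, min_ip=0.0, min_pitches=5):
    """Return names of pitchers with at least `min_ip` innings and at least
    `min_pitches` pitch types graded above average.

    `pitchers` maps a name to (innings_pitched, {pitch_code: pitch_value}).
    A below-average pitch does not disqualify anyone; only the count of
    above-average pitches matters.
    """
    return [
        name
        for name, (innings, pitch_values) in pitchers.items()
        if innings >= min_ip and count_above_average(pitch_values) >= min_pitches
    ]


# Made-up numbers for illustration only; not real PITCHf/x values.
sample = {
    "Pitcher A": (120.0, {"FA": 3.1, "FC": 1.2, "SL": 0.4, "CU": 2.0, "CH": 0.9, "SI": -1.5}),
    "Pitcher B": (150.0, {"FA": 5.0, "SL": 2.2, "CU": -0.3, "CH": 1.1}),
    "Pitcher C": (40.0, {"FA": 0.5, "FT": 0.7, "FC": 0.2, "CU": 1.4, "CH": 1.9}),
}

print(five_plus_pitchers(sample, min_ip=100))
```

With the 100-inning minimum, only the hypothetical "Pitcher A" survives, even though one of that pitcher's six pitches grades below average.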

As you can see, it is very rare to have a pitcher who has five legitimate plus pitches.  I am very interested to see if Tyler Wilson can maintain these results over the course of a full season, and I really hope he is given the opportunity to do so.  If he continues to pitch the way he has been, the Orioles will have no choice but to leave him in the rotation.  Although he has had success in a limited sample, Wilson has struggled in each of his starts when facing the lineup the third time around.  This could be due to the fact that he is still in the process of being stretched out from his bullpen role.  When in the bullpen, you don’t have to prepare to face the same hitter three times.  I am hopeful that once he is fully stretched out and back into his starter mentality, he will be able to make the necessary adjustments and continue to throw all of his pitches with confidence.  If he can continue to make quality pitches as he faces the lineup for a third time, I believe Tyler Wilson has the chance to become a very special pitcher.

Memorable quotes I heard during the TV broadcasts:

“Everyone thinks that I pitch with a chip on my shoulder but I really don’t.  I just go out and compete.  I don’t think of it that way.” – Tyler Wilson

“I think he understands himself.  He can maintain his game-plan throughout the game.  He’s going to keep us in the game and give us a chance to win.  What more can you ask for?” – Pitching Coach Dave Wallace

“I love that he can make the ball run in and then cut away.  He pitches to both sides of the plate.  Not a lot of young pitchers can do that.” – Manager Buck Showalter

…no Buck, not a lot of young pitchers can do that.

Twitter – @mtamburri922

Drew Pomeranz Is Here to Stay

After shutting out the Chicago Cubs offense over six innings of 10-strikeout ball, Drew Pomeranz lowered his season ERA to 1.80 and FIP to 2.61. He currently ranks 3rd among qualified starters in K% and is tied for 11th in WAR. Furthermore, Pomeranz has faced four of the top five offenses in the National League, as well as having had a season opener at Coors Field, hence we cannot claim stat padding against mediocre competition. While a .250 BABIP and 82.1 LOB% may not exhibit the greatest signs of stability, Pomeranz is finally reaching the potential that garnered him a top-30 prospect ranking from Baseball America. So what has Pomeranz done to unlock this potential?

Pomeranz has discovered his newfound success by neutralizing right-handed bats. Earlier in his career, Pomeranz’s relative struggles against righties led many to wonder whether his ultimate fate rested in the bullpen. In fact, heading into 2016 many doubted whether he could even earn a spot in the Padres rotation; he couldn’t even earn a mention in Jeff Sullivan’s positional preview post. This sentiment was understandable given his career .340 wOBA against and 7.1 K-BB% when facing right-handed hitters up to this point. In 2016, however, he has lowered the wOBA against to a measly .240 while striking out 34% of righties. By dropping 100 points of wOBA, he’s essentially transformed his average opposite-handed plate appearance from Kyle Seager to Omar Infante. As with any dramatic improvement in performance, a confluence of factors has led to Pomeranz’s success.

Since debuting in 2011, Pomeranz has gradually raised his vertical release point nearly half a foot. This more over-the-top delivery has undoubtedly provided him greater deception against righties. More noticeably, however, Pomeranz has brought his changeup back from the dead. Early in his career, Pomeranz threw his change roughly 9% of the time to righties. From 2013-2015, when 72% of his appearances came out of the bullpen, Pomeranz lowered that rate to 3%. This season, however, Pomeranz is utilizing his change-piece over 15% of the time against right-handers. Throwing it around 87 mph, Pomeranz’s change nearly perfectly mimics his sinker in both velocity and movement, but to differing results. Pomeranz generates an above-average 44% fly balls on balls in play with his change, while the sinker gets 67% groundballs. This deception, combined with Pomeranz’s pitcher-friendly home park, has led to a dearth of quality contact on the changeup, as illustrated by the .111 ISO against on the pitch.

Despite the resurgence of Pomeranz’s changeup, his improved curveball has been the true game-changer.  He trails only the enigmatic Rich Hill in percentage of pitches that are curveballs; likewise, he employs it over 43% of the time against righties, up from 23% over his career before joining San Diego. His 4.6 curveball pitch value trails only the Phillies duo of Aaron Nola and Jerad Eickhoff, and their club’s experimental pitching philosophy, so far in 2016. After leaving the breaking-ball-murdering confines of Coors Field in 2014, Pomeranz witnessed a significant increase in both vertical movement and velocity. This, however, does not explain his recently discovered success. Similarly, he has kept his Zone% on the curve right around his career average of 43%. The key lies in where out of the zone he locates the ball. This season, Pomeranz is hitting low-and-gloveside off the plate with almost 30% of his curves to righties and lefties alike. Prior to this campaign, Pomeranz only hit that spot about 10% of the time, as he more evenly distributed his curveballs across the zone horizontally. Whether it stems from a change in approach or simply improved mechanics and command, Pomeranz is finding tremendous success with his hook. Using the curve against righties, Pomeranz has raised his Whiff% to a career-high 16.4% in addition to generating a career-high 39.6 Swing%. Furthermore, nearly three-quarters of his balls in play off the curve are grounders and he has yet to permit a single fly ball on the pitch vs. right-handed hitters.

As Eno Sarris noted in his discussion with him last December, Pomeranz’s success hinges on three things: “his health, his changeup, and his curveball.” Seven starts into the season, Pomeranz’s progress on these three fronts has led him to success against righties and helped him unlock his prior potential. He’s gone from a guy the Athletics traded for spare parts to a solidly above-average starter for the Padres. Perhaps the most encouraging aspect of this emergence: Pomeranz is still only 27 years old. With almost three more years of service time left, and an inevitable sell-off of Tyson Ross, Andrew Cashner, and James Shields on the horizon, Pomeranz could potentially parlay his improvement into an ace role on the Padres staff. Of course, Pomeranz could find himself on the market in the near future, and he would certainly fetch more than Yonder Alonso and Mark Rzepczynski this time around.

xHR%: Questing for a Formula (Part 5)

This is the long-delayed fifth part in the xHR series. If you really want to read the first four parts, they can be located here, here, here, and here.

More than a month late, the highly anticipated follow-up to the first iteration of xHR has arrived. Once more, that increasingly trivial metric will grace the page of FanGraphs, wallowing in the mostly prestigious Community Research section (on the other hand, this section is most definitely the best section on the World Wide Web for experimental metrics and amateur analyses).

Unless the reader has an impeccable memory for breezily scanned, frivolous articles, he or she likely needs a reminder as to what xHR% is and aims to be. xHR% is a metric that describes at what rate a player should have hit home runs over a given season. From this, expected home runs, a more understandable counting statistic, can be found by multiplying plate appearances by xHR%. It cannot be emphasized enough that the metric is not predictive; it only aims to describe. Without further ado, the formula is here:

I know that’s a lot to look at, and it isn’t exactly self-evident what all of the variables mean. As such, an explication of each part is necessary and provided below. (For logical rather than chronological purposes, the Kn variable will be analyzed last.)

AeHRD – One of the biggest differences between this formula and the last one is that this one does not use home run distance. This iteration uses expected distance, rendering it a combination of simple math, sabermetric theory, and physics. As such, expected home run distance strips out one of the biggest factors in luck — the weather.

Expected home run distance is found by utilizing a method taken from Newtonian Mechanics to calculate how far objects go. By using ESPN’s HitTracker website, I was able to obtain launch angles and velocities for nearly every home run hit in 2015. From this, I was able to resolve velocity into its respective parts, velocity in the x-direction (Vx) and velocity in the y-direction (Vy). After that, I calculated the amount of time the ball would be in the air with the formula vf=vi+gt, where vf is final velocity (0 m/s), vi is initial velocity (Vy), and g is simply the gravitational acceleration constant. Finally, I multiplied Vx by time in order to get the total expected distance.

I repeated that process for every home run hit by a given player in order to find his average expected home run distance. By doing this, I was able to strip out all weather-related components.
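As a rough illustration of the calculation just described, here is a minimal Python sketch. It assumes drag-free motion, SI units (speed in m/s, angle in degrees above horizontal), and hypothetical function names; none of this is the actual spreadsheet or code behind the article. Solving vf = vi + gt with vf = 0 gives the time for the ball's vertical velocity to reach zero, which is the top of the arc. The text does not say whether that time was doubled to cover the descent, so the sketch exposes that choice as a `full_flight` flag rather than guessing.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (magnitude)


def expected_distance(speed, launch_angle_deg, full_flight=False):
    """Drag-free horizontal distance for one batted ball.

    speed: speed off the bat in m/s (assumed unit)
    launch_angle_deg: launch angle in degrees above horizontal
    full_flight: if True, double the time to account for the descent
                 back to launch height (symmetric arc)
    """
    angle = math.radians(launch_angle_deg)
    vx = speed * math.cos(angle)  # velocity in the x-direction
    vy = speed * math.sin(angle)  # velocity in the y-direction
    # vf = vi + g*t with vf = 0: time until vertical velocity reaches zero
    t = vy / G
    if full_flight:
        t *= 2
    return vx * t


def average_expected_distance(home_runs, full_flight=False):
    """Mean expected distance over (speed, launch_angle_deg) pairs."""
    if not home_runs:
        raise ValueError("need at least one home run")
    distances = [expected_distance(s, a, full_flight) for s, a in home_runs]
    return sum(distances) / len(distances)
```

The same `average_expected_distance` helper could be fed a player's home runs, a stadium's home runs, or the whole league's, depending on which average is needed.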

AeHRDH – Utilizing the same process as above, I found the average expected home run distance for every stadium. This is the player’s home stadium’s average home run distance, regardless of team.

AeHRDL – The same as above, but done for every home run hit in the majors last season.

When put together in the numerator and the denominator, the above variables serve as a “distance constant” of sorts that will at most adjust the resulting expected home runs by plus or minus two. Occasionally, the impact is negligible because the average expected distance is very close to that of the player’s home stadium and the league. Averaging the mean expected home run distance of the league and of the home stadium allows the metric to paint a more accurate picture of where the player hit his home runs and whether or not they should have left the park. Nevertheless, it’s important to note that this formula still fails to account for fly balls that fell just short of the wall due to the wind and other factors, meaning that there are still expected home runs unaccounted for.

FB% – If you remember correctly, or took the time to briefly review the previous posts, then you will recall that in the prior iteration of the formula there was a section very similar to this one. The only differences are the weights on each year of data (which are still somewhat arbitrary, though I am working on getting them to more precisely reflect holdover talent from past years) and the primary statistic used.

Previously, HR/PA was used, but it had to be abandoned because the results were too closely correlated with reality. This time, I looked at how similarly descriptive formulas were quantified. Oftentimes, those metrics did not use the target expected metric in their formulas. Rather, they utilized other metrics that correlated moderately well or strongly with their expected metric. In this case, I decided to use FB% because it’s a relatively stable metric (especially in comparison with HR/FB), and it has a strong correlation with HR% (about .6).

As a clarification, the subscript Y3, Y2, and Y1 indicate the years away from the season being examined, where Y1 is really Y0 because it’s zero years away. So just to be clear, Y1 is the in-season data from the year being examined. In the data to be examined, for example, Y1 is 2015, Y2 is 2014, and Y3 is 2013.
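To make the year weighting concrete, here is a small sketch of a three-year weighted FB%. The article does not give the actual weights, so the defaults below (0.5, 0.3, 0.2) are placeholders chosen only for illustration, and dividing by the sum of the weights is likewise an assumption about how the pieces are combined.

```python
def weighted_fb_pct(fb_y1, fb_y2, fb_y3, weights=(0.5, 0.3, 0.2)):
    """Weighted average of FB% across three seasons.

    fb_y1: FB% in the season being examined (Y1, e.g. 2015)
    fb_y2: FB% one season earlier (Y2, e.g. 2014)
    fb_y3: FB% two seasons earlier (Y3, e.g. 2013)
    weights: (w1, w2, w3) placeholder weights, NOT the article's values
    """
    w1, w2, w3 = weights
    total = w1 + w2 + w3
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return (w1 * fb_y1 + w2 * fb_y2 + w3 * fb_y3) / total


# A hitter with FB% of 50, 40, and 30 over Y1, Y2, and Y3
print(weighted_fb_pct(50, 40, 30))
```

Putting the largest weight on Y1 reflects the idea that the in-season data should matter most, with earlier seasons contributing whatever talent carries over.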

Kn – As you can well imagine, FB% numbers are always far greater than HR% numbers*, resulting in some truly ridiculous results if a constant isn’t applied that relates HR% to FB%. For instance, without a constant to modify the results, Jose Bautista would have been expected to hit 304 home runs last season. That’s a lot of home runs. Just two and a half seasons of playing at that level and he’d have the home run record in the bag. Luckily, I’m not stupid enough to think that that’s actually possible, and so I initially related FB% and xHR% with a constant, called KCon.

Unfortunately, KCon didn’t work as well as I’d hoped because it skewed expected home run results way up for terrible home run hitters and way down for the best home run hitters. By skewed, I mean off by more than six home runs. And so I, in my infinite (and infantile) amateur mathematical wisdom, made it into a seven-part piecewise** function. By this, I mean that there’s a different constant for each piece of the formula, defined by HR% at somewhat arbitrary, though round, points. For clarity, here they are:

K1 = HR%<1

K2 = 1≤HR%<2

K3 = 2≤HR%<3

K4 = 3≤HR%<4

K5 = 4≤HR%<5

K6 = 5≤HR%<6

K7 = 6≤HR%
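The bracket lookup above can be sketched in a few lines of Python. The function name is hypothetical, HR% is assumed to be expressed in percent (so 4.2 means 4.2%), and an HR% of exactly 6 is placed in the top bracket. The values of the constants themselves are not given in the article, so the sketch only returns which Kn applies.

```python
import math


def k_index(hr_pct):
    """Return n (1 through 7) for the constant Kn that applies to hr_pct.

    hr_pct: home run rate in percent, e.g. 4.2 for 4.2%
    Brackets are one percentage point wide; everything from 6 up uses K7.
    """
    if hr_pct < 0:
        raise ValueError("HR% cannot be negative")
    # floor(hr_pct) is 0 for K1, 1 for K2, and so on; cap the result at 7
    return min(math.floor(hr_pct) + 1, 7)


print(k_index(0.5), k_index(3.0), k_index(7.4))
```

Because each piece uses its own constant, the result jumps at every bracket boundary, which is the discontinuity the second footnote pokes fun at.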

It works quite well. I am very excited about the current iteration of xHR%, its implications, and all it has to offer. Of course, it is not finished, but I think I’m getting closer. Please comment if you have any questions, an error to point out, or anything of that nature. There will be a results piece published soon on the 2015 season, so keep an eye out.

*It wouldn’t be surprising if Ben Revere became the first player to have a HR% equal to FB% (both at 0%, naturally).

**It is neither continuous nor differentiable.