Author: Devin Jordan

Author Archive

Different Aging Curves For Different Strikeout Profiles

April 22, 2015

What follows looks at aging curves as they relate to players with specific strikeout profiles. Specifically, we will look at how wOBA ages for players who strike out more than the league-average rate and for players who strike out less.

Through the research that is presented in this post, two points will be proven:

Players of different strikeout profiles age—their wOBAs change—at different rates.
The aging curve for players of different strikeout profiles has changed over time.

Before I present the methodology, the research that was conducted, and their conclusions, I want to give a big thank you to Jeff Zimmerman, who has not only done a lot of research around aging curves, but has also helped me throughout this process and pushed me in the right direction several times when I was stuck. Thank you.

Population

To give a player’s wOBA a meaningful amount of time to stabilize, without setting the plate appearance threshold so high that it artificially limits the population even more than it naturally is at the ends of the age spectrum, I looked at all player seasons from 1950 to 2014 in which a player had a minimum of 600 plate appearances for the first aging curve in this post. The second aging curve in this post looks at all player seasons from 1990 to 2014 with a minimum of 600 plate appearances.

Now that we have our population, we need to split it into two groups: players who strike out more than league average and players who strike out less than league average.

Because the league average strikeout rate of today is very different than it was 65 years ago, we can’t look at a player’s strikeout rate from 1950 and compare it to the league average strikeout rate of today.

In order to divide the population into two groups, I created a stat that weighs a player’s strikeout rate against the league average strikeout rate for the years that they played. For example, if a player played from 1970 to 1975, their adjusted strikeout rate would reflect how their strikeout rate compares to the league average strikeout rate from 1970 to 1975.

Players were then placed into two buckets based on their adjusted strikeout rate: players that struck out more than league average and players that struck out less than league average.
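The post does not spell out the exact form of the adjusted strikeout rate, so the following is only a sketch under an assumption: the player's career strikeout rate is divided by the league strikeout rate for the same seasons, with each league season weighted by the player's plate appearances. The function names and the sample numbers are hypothetical.

```python
# Sketch of a league-adjusted strikeout rate.
# Assumption: ratio of player K% to a PA-weighted league K% over the same years.

def adjusted_k_rate(player_seasons, league_k_rate):
    """player_seasons: list of (year, strikeouts, plate_appearances).
    league_k_rate: dict mapping year -> league strikeout rate (SO / PA)."""
    total_so = sum(so for _, so, _ in player_seasons)
    total_pa = sum(pa for _, _, pa in player_seasons)
    player_rate = total_so / total_pa
    # Weight each league season by how much the player batted in it.
    league_rate = sum(league_k_rate[year] * pa
                      for year, _, pa in player_seasons) / total_pa
    return player_rate / league_rate


def bucket(adjusted_rate):
    """Place a player in one of the two groups used in the post."""
    if adjusted_rate > 1.0:
        return "above league average"
    return "below league average"


# Hypothetical numbers, for illustration only.
league = {1970: 0.150, 1971: 0.140}
seasons = [(1970, 120, 600), (1971, 90, 600)]
adj = adjusted_k_rate(seasons, league)
print(round(adj, 3), bucket(adj))
```

A value above 1.0 means the player struck out more often than the league did during the years he played, regardless of era.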

Methodology

There has been a lot of discussion over the years about the correct methodology to use for aging curves. The aim of that discussion has been to minimize the survivorship bias that is inherent in the process, and this study uses what is, to the author’s knowledge, the best technique to date. This article by Mitchell Lichtman summarizes a lot of the opinions.

While there is a survivorship bias inherent in any aging curve, the purpose of the different techniques used to create aging curves is to minimize the survivorship bias wherever possible.

What We Don’t Want In an Aging Curve

An aging curve is not the average of all performances by players of specific ages. For example, say you have a group of 30-year-old players with an average wOBA of .320 and a group of 29-year-old players with an average wOBA of .300. That does not mean players gain .020 of wOBA between 29 and 30.

The point of an aging curve is to see how a player aged, not how they played. The group of 30-year-old players has a high wOBA because they are a talented group of players; they lasted long enough to still be playing at 30. As they aged from their age-29 season to their age-30 season, they lost the bottom portion of players from their player pool. These are the players that couldn’t hang on any longer, whether because of a decline in defense, offense, or a combination of both. This bottom portion of players lowers the wOBA of the current 29-year-old population through their presence and raises the wOBA of the 30-year-old population through their absence.

At the same time, the current 30-year-olds aged from their age-29 season to their age-30 season. Sure, there may be players who had a better age-30 season than age-29 season, but the current group of 30-year-olds, as a whole, still played worse at 30 than they did at 29.

When you look at the average of a particular age group, in this case 30-year-olds, you only see the players that survived; the players who no longer play are hidden from your sample. The method that follows resolves this issue to an extent.
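A toy illustration of the problem described above, using made-up wOBA values chosen to reproduce the .300 and .320 group averages from the example. Every player who survives to 30 declines by .010, yet the group average rises because the two weakest hitters drop out.

```python
# Made-up wOBA values; not real players.
age29 = {"A": 0.350, "B": 0.330, "C": 0.310, "D": 0.270, "E": 0.240}
# Survivors each decline by .010; D and E are out of the league at 30.
age30 = {"A": 0.340, "B": 0.320, "C": 0.300}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Naive approach: compare the average of each age group.
naive_change = mean(age30.values()) - mean(age29.values())

# Paired approach: compare only players seen at both ages.
paired_change = mean(age30[p] - age29[p] for p in age30)

print(f"naive: {naive_change:+.3f}, paired: {paired_change:+.3f}")
```

The naive comparison shows a gain of about .020, while the paired comparison shows the .010 decline that every surviving player actually experienced.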

What We Do Want In an Aging Curve

This study uses the delta method, which looks at the differences between a player’s consecutive seasons (i.e. a player’s age-29 wOBA minus their age-28 wOBA) and weights those differences by the harmonic mean of the plate appearances for each pair of seasons in question.

I would explain this further, but Jeff Zimmerman does an excellent job of this in a post on hitter aging curves that he did several years ago. While Jeff Zimmerman looked at RAA, which is a counting stat, the methodology is basically the same for our purposes and wOBA, which is a rate stat:

In a nutshell, to do accurate work on this, I needed to go through all the hitters who ever played two consecutive seasons. If a player played back-to-back seasons, the RAA values were compared. The RAA values were adjusted to the harmonic mean of that player’s plate appearances.

Consider this fictional player:

Year1: RAA = 40 in 600 PA age 25
Year2: RAA = 30 in 300 PA age 26

Adjusting to harmonic mean: 2/((1/PA_y1)+(1/PA_y2)) = PA_hm
2/((1/600)+(1/300)) = 400

Adjust RAA to PA_hm: (PA_hm/PA_y1)*RAA_y1 = RAA_y1_hm
(400/600)*40 = 26.7 RAA for Year1
(400/300)*30 = 40 RAA for Year2

This player would have gained 13.3 RAR (40 RAA – 26.7 RAA) in 400 PA from ages 25 to 26. From then, I then would add all the changes in RAA and PA together and adjust the values to 600 PA to see how much a player improved as he aged.
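The arithmetic in the quoted example can be sketched as follows, along with one way the same weighting might carry over to a rate stat like wOBA. The `woba_delta` function and its sample pairs are assumptions for illustration; the post does not show its exact calculation.

```python
def harmonic_mean_pa(pa1, pa2):
    """Harmonic mean of two plate appearance totals."""
    return 2 / (1 / pa1 + 1 / pa2)


# The quoted counting-stat example: scale each season's RAA to the shared PA.
pa_hm = harmonic_mean_pa(600, 300)
raa_y1 = pa_hm / 600 * 40
raa_y2 = pa_hm / 300 * 30
print(round(pa_hm), round(raa_y1, 1), round(raa_y2, 1), round(raa_y2 - raa_y1, 1))


def woba_delta(pairs):
    """pairs: list of (woba_y1, pa_y1, woba_y2, pa_y2) for one age transition.
    Assumption: a rate stat needs no rescaling, so each player's change is
    simply weighted by the harmonic mean of his two PA totals."""
    weights = [harmonic_mean_pa(p1, p2) for _, p1, _, p2 in pairs]
    deltas = [w2 - w1 for w1, _, w2, _ in pairs]
    return sum(d * w for d, w in zip(deltas, weights)) / sum(weights)


# Hypothetical age-28 to age-29 season pairs.
pairs = [(0.340, 650, 0.335, 620), (0.310, 600, 0.322, 700)]
print(round(woba_delta(pairs), 4))
```

Chaining these weighted deltas from one age to the next is what produces the curve.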

Findings

Below is an aging curve by strikeout profile for all player seasons with at least 600 plate appearances from 1950 to 2014.

We can see several findings immediately:

Players do age differently based on their strikeout profile.
Players who strike out more than league average peak at 23.
Players who strike out less than league average take longer to hit their peak, which comes in their age-26 season.
Players who strike out more than league average age better than players who strike out less than league average.

From a historical perspective, this graph is fun to look at, but the way the game was played over half a century ago is eclipsed by societal evolutions that today’s players benefit from.

To give us a more realistic idea of how today’s players age relative to their strikeout rate, I made another graph that looks at player seasons from 1990 to 2014.

What we find in this graph, which is more current with today’s style of play, is that players still age differently dependent on their strikeout profile, but not in the same way that they did in the previous sample.

Players who strike out more than league average still peak earlier than players who strike out less than league average, but in this more current population they peak very early, in their age-21 season. This echoes recent work that suggests the aging curve has changed to the point that players peak almost as soon as they enter the league.

The peak age for players who strike out at below-league-average rates is still 26, but whereas this group aged more poorly than the strikeout-heavy group in our previous population, players who strike out at below-league-average rates now age better than their counterparts.

Conclusions

This information can make material differences for our overall expectations and outlooks on players.

Previous knowledge would suggest that players like George Springer and Kris Bryant, who have exorbitant strikeout rates, are still on the climb as far as their talent goes, but this information shows that these players may already be at or close to their peaks, or on the decline, as far as their wOBA is concerned.

This information also shows that we should be patient with prospects that have a penchant for putting balls in play; while they peak more quickly than they did in the previous population, they take longer to develop than players with more swing and miss in their game, and when they do start to decline, there isn’t much need to worry, because their descent from their peaks will be gradual.

Like many other studies that have looked at new aging curves, this study confirms that players/prospects peak earlier now than at any other point throughout history, but it also shows that a player’s trajectory upward and downward is dependent on characteristics specific to their approaches at the plate.

Devin Jordan is obsessed with statistical analysis, non-fiction literature, and electronic music. If you enjoyed reading him, follow him on Twitter @devonjjordan.

Should Players Try to Bunt for a Hit More?

by Devin Jordan

April 2, 2015

This post will look at bunting for a hit and try to identify if it is a skill that can efficiently and effectively increase offensive production, and answer the general question of, should players bunt more?

Is Bunting for a Hit a Skill?

Before we answer the ultimate question of whether or not players should bunt more, we need to first identify whether or not bunting for a hit is a skill to begin with.

This is where data becomes an issue, but we should be able to make do.

Before 2002 there are no records on FanGraphs of bunt hits, so I looked at all qualified hitter seasons from 2002 to 2014 in which a player bunted three or more times. Since most players go a whole season without a bunt, three or more bunts in a season puts a player in the top half of hitters by bunt attempts.

From there I looked at the year-to-year correlation of a player’s bunt hit percentage, which is bunt hits divided by bunts (i.e. a player’s batting average on bunts), for the entire population. Mind you, we only have a record of the number of times a player bunted, not the number of times a player attempted to bunt for a hit. So in reality, a player’s bunt hit percentage would be higher if we were able to tease out their sacrifice bunts from their total bunts. However, from the data we are still able to find a .33 year-to-year correlation on bunt hit percentage for our population of hitters.

Takeaway: bunting for a hit is a skill.
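A year-to-year correlation of this kind pairs each player’s bunt hit percentage in one season with his percentage in the next and computes a Pearson correlation over all such pairs. The sketch below assumes a simple dictionary layout and uses hypothetical values; it is not the author’s actual data or code.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def year_to_year_pairs(bh_pct):
    """bh_pct: dict mapping (player, year) -> bunt hit percentage.
    Returns two lists: year N values and the matching year N+1 values."""
    xs, ys = [], []
    for (player, year), value in bh_pct.items():
        following = bh_pct.get((player, year + 1))
        if following is not None:
            xs.append(value)
            ys.append(following)
    return xs, ys


# Hypothetical qualifying seasons (three or more bunts each).
bh_pct = {
    ("X", 2010): 0.30, ("X", 2011): 0.35, ("X", 2012): 0.25,
    ("Y", 2010): 0.50, ("Y", 2011): 0.45,
    ("Z", 2012): 0.40,  # no following season, so it forms no pair
}
xs, ys = year_to_year_pairs(bh_pct)
r = pearson_r(xs, ys)
print(len(xs), round(r, 3))
```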

Should Players Bunt More?

Now that we’ve answered the question of whether or not bunting for a hit is a skill, we can circle back to our original question of whether or not players should bunt more.

Because we want to have a large enough sample of attempted bunts for bunt hit percentage (BH%) to stabilize, we will look at all qualified hitter totals (i.e. multiple season totals), not individual seasons, from 2002 to 2012.

To answer our question we need to compare two expected values for a player: the expected value gained in an at bat where they don’t attempt to bunt (a regular at bat) and the expected value gained in an at bat where they attempt to bunt for a hit (a bunt hit attempt). We then subtract the second from the first.

To come up with the expected value of a regular at bat we have to look at the linear weight value added per plate appearance of a player’s at bats from 2002 to 2012, or their entire career value if their whole career falls within that period. We then multiply that linear weight value per plate appearance by the probability that they achieve one of those outcomes.

Here’s the formula for Expected Value of a regular at bat (xRA):

=((((1B-Bunt Hits)*0.474)+(2B*0.764)+(3B*1.063)+(HR*1.409)+(HBP*0.385)+(IBB*0.102)+(UBB*0.33))/(PA-Bunt Attempts))*((1B-Bunt Hits)+2B+3B+HR+BB+HBP)/(PA-Bunt Attempts)

This formula looks much more complicated than it actually is, but you’ll be able to click into the cells in the live excel document below and visually see how the values are computed. All of the decimals that are part of the formula are linear weight values, which you can find here.
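For readers who prefer code to spreadsheet cells, the xRA formula above can be transcribed term by term. The weights are copied from the formula; the stat line is hypothetical, and treating BB as IBB plus UBB is an assumption.

```python
# Linear weights copied from the spreadsheet formula above.
WEIGHTS = {"1B": 0.474, "2B": 0.764, "3B": 1.063, "HR": 1.409,
           "HBP": 0.385, "IBB": 0.102, "UBB": 0.33}


def expected_regular_value(s):
    """s: dict of counting stats. Returns (xRA, RA%)."""
    non_bunt_pa = s["PA"] - s["bunts"]
    non_bunt_1b = s["1B"] - s["bunt_hits"]
    # Linear weight runs from every positive event outside of bunt hits.
    runs = (non_bunt_1b * WEIGHTS["1B"] + s["2B"] * WEIGHTS["2B"]
            + s["3B"] * WEIGHTS["3B"] + s["HR"] * WEIGHTS["HR"]
            + s["HBP"] * WEIGHTS["HBP"] + s["IBB"] * WEIGHTS["IBB"]
            + s["UBB"] * WEIGHTS["UBB"])
    value_per_pa = runs / non_bunt_pa
    # Assumption: BB in the formula is IBB + UBB.
    positive_events = (non_bunt_1b + s["2B"] + s["3B"] + s["HR"]
                       + s["IBB"] + s["UBB"] + s["HBP"])
    ra_pct = positive_events / non_bunt_pa
    return value_per_pa * ra_pct, ra_pct


# Hypothetical multi-season totals, for illustration only.
stats = {"PA": 1000, "bunts": 20, "bunt_hits": 8, "1B": 160, "2B": 50,
         "3B": 5, "HR": 25, "HBP": 5, "IBB": 5, "UBB": 80}
xra, ra_pct = expected_regular_value(stats)
print(round(xra, 4), round(ra_pct, 3))
```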

We need to go through the same process to figure out what the expected value added is for a player on a bunt hit attempt—the average value added with a bunt times the probability of a successful bunt hit (BH%).

I was unable to find the linear weight value of a bunt hit, but we do have a sufficient substitute. A bunt hit essentially adds the same value as a base hit with no runners on base: .266 runs per inning. A single with no runners on base is a good proxy for a bunt hit. Like a base hit with no runners on base, a bunt hit offers no opportunity for a runner on base to score or advance past the next base in front of them. Short of looking at box score data to find the average number of runners that scored per inning after a successful bunt hit, which will need to be done for a more conclusive answer to our question, we will use the average linear weight value of a single with no runners on across the three out states as our linear weight value. That is, I averaged the linear weight value of a single with no runners on base and no outs, one out, and two outs. This is not the exact way to get the linear weight value of a single with no runners on base, because a different number of such singles undoubtedly occurred in each out state, but it should be close.

This is the formula for expected value gained on a bunt attempt (xBA):

xBA = (bunt hits / total bunts) * .266, where bunt hits / total bunts is the bunt hit average and .266 is our estimated linear weight value for a bunt hit

Now that we’re able to come up with the expected value added for a player in a regular at bat (xRA) and the expected value added for a player on a bunt hit attempt (xBA), we can subtract the two values from each other—xRA minus xBA—to see which players have lost the most value per plate appearance by not bunting.
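The bunt side of the comparison is a one-line calculation. The sketch below uses the 11 hits on 14 bunts that the post cites for Carlos Santana; the xRA value is a placeholder, not his actual figure.

```python
BUNT_HIT_VALUE = 0.266  # the post's proxy linear weight for a bunt hit


def expected_bunt_value(bunt_hits, bunts):
    """xBA: bunt hit average times the estimated value of a bunt hit."""
    return bunt_hits / bunts * BUNT_HIT_VALUE


def net_value(xra, xba):
    """xRA minus xBA. A negative result means a bunt attempt has been
    worth more than a regular at bat for this player."""
    return xra - xba


santana_xba = expected_bunt_value(11, 14)  # counts taken from the post
placeholder_xra = 0.06                     # hypothetical value
print(round(santana_xba, 3), round(net_value(placeholder_xra, santana_xba), 3))
```

With so few bunt attempts behind the bunt hit average, a result like this describes what has happened rather than what would happen if the player bunted far more often.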

This chart shows the players with a minimum of ten bunt attempts that have lost the most value by not bunting (i.e. which players have the biggest difference between their expected value gained from a regular at bat and a bunt hit attempt):

Click Here to See Chart with Results

Bunts: Bunt attempts

Bunt Hits: Hits on bunts

RA%: Chance that a positive offensive event occurs, outside of bunt hit

BA%: Chance that a player gets a hit on a bunt

xRA: Expected value added from a regular at bat

xBA: Expected value added from a bunt attempt

Net Value: xRA minus xBA

Implications

This research doesn’t mean to suggest that all players who have a higher expected value added on a bunt attempt than they do in a regular at bat should bunt every time. Carlos Santana gets a hit in 78% of the at bats where he bunts, but he has only attempted 14 bunts in his career, so we don’t have a large enough sample of bunt attempts to know what his actual average on bunt attempts would be; this goes for most if not all of the players on this list. There is most likely an inverse correlation between BA% and bunt attempts (i.e. the more you try and bunt for a hit, the less likely you will get a hit as the infield plays further up on the grass).

This research means to suggest that players have not reached the equilibrium for bunt attempts (i.e. they haven’t maximized their value). Players should increase the percentage of the time they bunt until their xRA and xBA are the same; at this point their value will be maximized. The more a player with a negative net value tries to bunt for a hit, the more expected value he will add. This will happen until his expected value added from a bunt falls beneath what he is able to achieve through a regular at bat; this happens when the defense starts to defend him more optimally, they align for the bunt hit, and his BH% falls. Once this occurs he will force the defense to play more honestly—the infielders will have to play farther in on the grass—and increase his expected value added in a regular at bat as more balls get past the infield from shallow play.

What’s interesting is that there are two different types of players on this list. The first is the type that you would traditionally think of as a player who would try to bunt for a hit: the speedster with very little power. The second is the player who, as a result of the recent, extensive use of defensive shifts, has a high BA% (batting average on bunts) because the defense is not in a position to cover a bunt efficiently: Carlos Santana, Carlos Pena, Colby Rasmus, etc.

The voice for the question about why players don’t try to beat the shift with bunts down the third base line has grown louder, but there still hasn’t been a good answer as to why it hasn’t been done more; the evidence seems to suggest that it is valuable and should be done more. I’m not able to confirm that the 11 hits that Carlos Santana had on bunt hits came when the defense was in a shift, but I think it would be somewhat unreasonable to believe that he was able to beat out a throw to first on a bunt hit attempt when the defense was in a traditional alignment more than a few times.

Carlos Santana Spray Chart: Carlos Santana’s spray chart, taken from Bill Petti’s Spray Chart Tool

The image above is a spray chart of Carlos Santana’s ground ball distribution as a left-handed hitter; the white dots are hits and the red dots are outs. This chart suggests that it would be advantageous for teams to shift against Santana when he bats left-handed. I would argue that because of Santana’s success—his high BH%—at bunting for a hit, he should do this more, which will generate more value by itself, and increase the value generated in regular at bats as he forces the defense to change their defensive shift against him from the increase in bunt attempts. However, once he reaches the equilibrium, any further changes may ultimately be a zero sum game.

There are no silver bullets to get more runners on base, but there will always be more efficient, undervalued ways to achieve that goal. This research has proven that bunting for a hit is underutilized, and once more work is done to tease out sac bunts from a player’s bunt hit attempts and calculate an accurate BH%, along with the generation of linear weight values for a bunt hit, we will have a more definitive answer for what a bunt hit is worth.

Devin Jordan is obsessed with statistical analysis, non-fiction literature, and electronic music. If you enjoyed reading about pitcher value in Fantasy Baseball, follow him on Twitter @devonjjordan.

Does Troy Tulowitzki Suffer Without Carlos Gonzalez?

by Devin Jordan

August 12, 2014

Does Troy Tulowitzki suffer without Carlos Gonzalez in the lineup?

Several weeks ago, in the same way my last article on rookie first and second half splits was inspired, my attention was caught when a podcast personality claimed that Troy Tulowitzki, before his most recent bout with the injury bug, had performed poorly because Carlos Gonzalez had been out of the lineup.

The pundit grabbed the lowest-hanging fruit he could find in an effort to create a narrative, and a dogmatic one at that, as to why the Colorado Rockies slugger had not lived up to his pre-All-Star break numbers.

******* *******’s (I’d prefer the article to be more about the subject of Tulowitzki and Gonzalez than the podcast member) argument was that without Carlos Gonzalez in the lineup, pitchers could approach Tulowitzki without fear and give him fewer strikes, and that this is why his hitting has declined.

While this pundit surmised that Troy Tulowitzki’s performance declines when Carlos Gonzalez is out of the lineup, the numbers tell a much different story.

While we will look at the more direct numbers in a moment, the idea that Tulowitzki plays worse without Gonzalez is essentially the idea of lineup protection at a micro level. There have been countless instances that have debunked the idea of lineup protection, and, to my knowledge, none that have proved its existence.

[Image: Troy Tulowitzki’s batting line with and without Carlos Gonzalez in the lineup, 2010 to 2014]

The research looked at all games from 2010—Carlos Gonzalez’ first complete season—to today.

The results paint a much lighter picture than the Guernica that ******* ******* painted.

In games where Tulo has played without Cargo, he has had a higher AVG, OBP, OPS, and BB%. One might expect Tulowitzki simply to continue his normal performance without Carlos Gonzalez in the lineup. It is hard to imagine that Tulo plays better because Gonzalez is not in the lineup, which leads me to the usual explanation for out-of-the-ordinary performance in a small number of at bats: it is most likely noise.

These results should be used for descriptive, and not predictive, purposes. Troy Tulowitzki has only had 479 plate appearances without Carlos Gonzalez, and that is far from a large enough sample size to be deemed reliable.

But because of the recent remarks made by Tulowitzki, it seems more likely than not that, sooner rather than later, we will see a large enough sample of Tulo in another uniform to see if this trend continues.

While Tulo has played worse and is hurt as of late, we might expect that it is because he was unlikely to live up to the performance he had in the first half, and not because of Cargo’s presence or lack thereof in the lineup. Over the course of the first half of the season, Tulowitzki posted the 15th-best OPS in a half of a season since 2010.

Tulo’s latest play suggests a regression to the mean, and while we are powerless to know exactly why regression happens, some pundits proclaim to know the reason (i.e. Tulo plays worse without Carlos Gonzalez), when really their specious statement is noise with a coat of eloquent words painted upon it.

When the next “expert” tells you that Tulo has performed poorly because “he wants out of Colorado” or “he wants to be traded”, you’ll know to be more skeptical and not passively agree.

If he gets healthy at some point this season, we should expect Tulowitzki to perform close to his projections in all areas for the rest of the year, and it will be with or without Carlos Gonzalez, not because of him.

Do Rookie Hitters Decline in the Second Half?

by Devin Jordan

July 23, 2014

Do rookies perform worse after the All-Star break?

I can’t claim this question as my own; it was brought to my attention by Adam Aizer on the CBS Fantasy Baseball Podcast.

Skeptical of the claim, I thought that it would be worth the effort to look into its validity.

Rookie hitters infrequently make enough of an impact in the league sizes (i.e. 10-team and 12-team leagues) that pedestrian Fantasy Baseball players occupy. In leagues of those sizes, a rookie hitter that is worth owning is either an elite prospect or a player who has performed beyond their true talent level. The former is rare; the latter is more common, and it makes sense for those players to regress to their true talent level. The idea that rookie hitters decline throughout the year is just a misevaluation of the player’s true talent level.

To put another way, it is the same logic that comes into play with a recent event: the Home Run Derby. Players that participate in the Home Run Derby are players that have exceptional first halves, which are often beyond their true talent level. These players often perform worse in the second half than they did in the first half, not because they participated in the monotonous and dated event that has become the Home Run Derby, but because, just like the rookies who perform worse in the second half of the season than the first, they have regressed toward their true talent level; when the rookies regress, they have just regressed to the point where they are not ownable.

The research looks at all player seasons between 1988 and 2013 where a batter was in their first season, had 250 plate appearances in the first half of the season, and had 250 plate appearances in the second half of the season.

[Image: first half and second half splits for rookie hitters, 1988 to 2013]

The rookie second half decline and the post Home Run Derby slump intuitively make sense, but intuition does not always bear truth. Through cognitive ease we rationalize that “Swinging that hard for that long throws off your timing”; “A rookie is too young to be able to make it through the long hot summer.”

Because most fantasy leagues are small, the only reason that the common rookie was on our teams to begin with is because they had to play beyond their ability in the first half of the season. The rookie who is on our team right now, unless he is a reputable prospect, is probably a safe bet to decline. But as a whole, we can see that there is no decline in rookie performance based on first half/second half splits.

Our desire to perceive a decline is just our desire to hold onto our ability as talent evaluators. We know that Yangervis Solarte is a great player, and the only reason he hasn’t been able to sustain his performance is because he is a rookie who can’t play out the season: common baseball logic. In actuality, Solarte was not as good as some originally thought, and his true talent was never good enough to be rostered in a 10- or 12-team league.

Summary:

Rookie hitters, as a generalization, are not good enough to play in 10 or 12 team leagues, and, as a generalization, those that do play in ten team leagues regress to their true talent level, which is not valuable enough to be ownable.

Devin Jordan is obsessed with statistical analysis, non-fiction literature, and electronic music. If you enjoyed reading him, follow him on Twitter @devinjjordan.

Expected RBI Totals: The Top 267 xRBI Totals for 2013

by Devin Jordan

March 28, 2014

While there is almost zero skill when it comes to the amount of RBI a player produces, through the creation of an expected RBI metric I have found a way to look at whether or not a player has gotten lucky or unfortunate when it comes to their actual RBI total.

I hope I don’t need to do this for most of our readers, because it’s 2014 and you’re reading about baseball on a far-off corner of the Internet, so you obviously are more informed than the average fan who consumes ESPN as their main source of baseball information, but let’s talk about why RBI, as a stat, is not valuable when you look at a player’s talent. The amount of RBI a player produces is almost (we’ll get into the almost a little later) entirely dependent on the lineup a player plays in. If a player doesn’t have teammates that can get on base in front of them in the lineup, there aren’t very many opportunities for RBI; that’s the long and short. Really, RBI tell more about the lineup a player plays in than the player himself.

Intuitively, this makes sense. The more runners there are on base, the more chances the batter will have for RBI, and the more RBI the batter will accumulate. When I said, “The amount of RBI a player produces is almost…entirely dependent on the lineup a player plays in”, let’s be a little more precise. My research took the last three years of data (2010 to 2013) and looked at all players that had at least 180 runners on base (ROB) during their at bats over the course of a season. Over that span, which should be enough data (it was a pain in the ass to obtain the data that I did find), ROB correlated with RBI with a correlation coefficient of .794 (r² = .63169), which is a very strong positive relationship.

But hey, that doesn’t mean that you can be a lousy hitter and get a lot of RBI. That would be like if you threw a hobo in the Playboy Mansion and expected him to get a lot of tail; all the opportunity in the world can’t mask the smell of Pall Malls, grain alcohol and a lifetime of deflected introspection; trust me, I worked at a liquor store for three years in college, and I know. In the same sample of players from 2010 to 2013 as used above, the correlation between wOBA (what we’ll use here to define a player’s ability at the plate) and RBI is .6555. So there is a relationship between a player’s ability and their RBI total, but it is nowhere near as strong as the relationship between their RBI total and their opportunity, ROB.

However, when we combine a player’s opportunity (ROB) with their talent (wOBA) we should get a good idea of what to expect for a hitter’s RBI total. Here is the formula for expected RBI totals based on the relationships between ROB, wOBA, and RBI: xRBI = -85.0997 + 262.7424 * wOBA + 0.1918 * ROB.

When you combine wOBA and ROB into this formula you end up with a correlation coefficient of .878 and an r² of .771. Wooooo (Ric Flair voice)!!!!! With the addition of wOBA to ROB we increase our r², from .63 with just ROB, by fourteen percentage points.
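The xRBI formula is easy to apply directly. In the sketch below the coefficients come from the post, the 396 ROB is the figure the post gives for Chris Davis in 2013, and the .400 wOBA is a placeholder rather than his actual mark.

```python
def expected_rbi(woba, rob):
    """xRBI from the post's formula: intercept, wOBA term, ROB term."""
    return -85.0997 + 262.7424 * woba + 0.1918 * rob


# 396 ROB is from the post; the .400 wOBA is a hypothetical input.
xrbi = expected_rbi(0.400, 396)
print(round(xrbi, 1))

# Checking the reported fit: r squared from r, and the gain over ROB alone.
r_combined = 0.878
r2_combined = r_combined ** 2
gain_in_points = (0.771 - 0.63) * 100
print(round(r2_combined, 3), round(gain_in_points))
```

Subtracting xRBI from a player’s actual RBI total gives the gap the post attributes to home runs and luck.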

2013 Expected RBI Leaders

Click Here to See xRBI Leaderboard

Miguel Cabrera — Photo by: Keith Allison

Let’s think about why Chris Davis’ xRBI is so much lower than his 2013 actual RBI total.

Davis had 396 runners on base while he batted in 2013, which is 140 fewer than Prince Fielder, who led the league with 536 ROB; Davis’ opportunity was limited.

Davis’ RBI total was considerably higher than what his opportunity would suggest his RBI total should be, and one of the reasons that he outperformed his xRBI total by so much was because of the amount of home runs he hit. Davis, or any batter, doesn’t need a runner on base to get an RBI when he hits a home run. But beyond home runs there is another reason why Davis and other batters outperform their xRBI totals: luck.

Hitting with runners on base is not a skill. A batter has the same probability of a hit regardless of the base/out state. Let’s forget pitcher handedness and Davis’ platoon splits for the moment. With a runner on second base and two outs, Chris Davis will get a hit .272 (27%) of the time (I averaged his Steamer and Oliver projections for 2014 together). Davis, and Alfonso Soriano for that matter, who was the only player to outperform his xRBI by more than Davis in 2013, was lucky and happened to have runners on base for the majority of the 28.6% of the time (Davis’ 2013 batting average) that he got a hit in 2013.

To put Davis’ 2013 136 RBI season into perspective, in the last five seasons there have been eight players to record 130 or more RBI in a season. Of those eight players, only two—Ryan Howard (2008-9) and Miguel Cabrera (2012-13)—were able to duplicate the performance the following year.

While the combination of ROB and wOBA has allowed us to come up with a reliable xRBI, the next step, to increase the reliability of xRBI and account for players who produce a large amount of their RBI from home runs (i.e. Davis), is to include a power component in xRBI: HR/FB ratio.

Devin Jordan is obsessed with statistical analysis, non-fiction literature, and electronic music. If you enjoyed reading him, follow him on Twitter @devinjjordan.

A Different Way to Look at Strikeout Ability

by Devin Jordan

October 5, 2013

Mike Podhorzer has looked into the relationship between a batter’s average fly ball distance and their HR/FB ratio, and has found results that will allow others to more accurately project a hitter’s home run totals from year to year.

This got me thinking, which can be good or bad, but in this case the author’s labor produced a fruitful return. While a hitter’s HR/FB ratio can fluctuate indiscriminately from year to year, Podhorzer has proven that a batter’s average fly ball distance is a better indication of a player’s true talent power production. In the same light, my study looks at how a pitcher’s swinging strike rate (SwStr%) is a better indication of his strikeout potential than K/9.

My assumption was that K/9 and SwStr% have a strong relationship. But how strong a relationship is it? To find out, I took all qualified starter seasons from 2003 to 2013, which gave me a sample of 933 pitcher seasons, and ran a correlation between their SwStr% and their K/9. The results showed that there is a strong positive correlation between SwStr% and K/9, to the tune of a .807 correlation coefficient and a .65 R².

[Image: scatter plot of SwStr% against K/9 for qualified starter seasons, 2003 to 2013]

What is important to note is that there are very few pitchers present in the sample with a SwStr% above 13%, which may be symptomatic of something larger. Getting batters to swing and miss is difficult. The more often you can get a batter to swing and miss, the more valuable you are as a pitcher. As a result, the higher the SwStr%, the smaller the sample size becomes. For example, Johan Santana (2004) and Kerry Wood (2003) are the two lone dots to the farthest right on the graph, with SwStr% of over 15%: wow.

After the relationship between SwStr% and K/9 became unmistakable, I calculated what a particular SwStr% translates into, as far as K/9, with the formula Y=68.473*x+0.8435, and got this chart:

[Image: chart translating SwStr% values into expected K/9]

The next step is to take what we have discovered and apply it to a sample. The chart below shows each qualified pitcher for 2013, their SwStr%, xK/9, K/9, and K/9-xK/9. xK/9 is what we would expect a pitcher’s K/9 to be based on their SwStr%, and K/9-xK/9 shows us how much a pitcher over-performed or under-performed their SwStr% and xK/9. The first set of ten names are the pitchers who outperformed their xK/9 the most, and the second set of ten names are the pitchers who underperformed their xK/9 the most.

[Image: 2013 qualified pitchers with SwStr%, xK/9, K/9, and K/9-xK/9; the ten largest over-performers and ten largest under-performers]
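The translation from SwStr% to xK/9, and the K/9 minus xK/9 gap used in the chart, can be sketched as below. The slope and intercept come from the formula in the post; reading x as a fraction (0.10 for 10%) is an assumption, made because a whole-number percentage would produce impossible K/9 values. The sample pitcher is hypothetical.

```python
def expected_k9(swstr):
    """xK/9 from swinging strike rate, with swstr given as a fraction."""
    return 68.473 * swstr + 0.8435


def k9_gap(actual_k9, swstr):
    """K/9 minus xK/9: positive means more strikeouts than SwStr% implies."""
    return actual_k9 - expected_k9(swstr)


# Hypothetical pitcher: 10% swinging strikes, 8.50 K/9.
xk9 = expected_k9(0.10)
gap = k9_gap(8.50, 0.10)
print(round(xk9, 2), round(gap, 2))
```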

The results show that Ubaldo Jimenez, Yu Darvish, and Jose Fernandez are the pitchers who have outperformed their xK/9 the most in 2013. These three pitchers also have a great amount of deception and/or command (deception in Jimenez’s case, because no one has ever called Ubaldo a control artist). And, while they may have outperformed their true talent in 2013 to an extent (they all had remarkable years), maybe that deception and control, which SwStr% does not take into account, leads to fewer swings by batters and more pitches taken for strikes, as opposed to swung at for strikes.

Perhaps xK/9 is more helpful when we look at pitchers who underperformed their SwStr%, like Jarrod Parker and Kris Medlen. Both of these pitchers had down years compared to what their projections suggested, but their xK/9s seem to be optimistic about their futures. Parker showed a .18 improvement in his K/9 from the first half to the second half of the season, while Medlen showed almost a full point improvement going from a 6.81 K/9 in the first half to a 7.67 K/9 in the second half.

While xK/9 may miss something—deception and command—when it comes to pitchers that outperform their SwStr%, xK/9 seems to find a reason to be optimistic when it comes to pitchers like Kris Medlen and Jarrod Parker who have underperformed their SwStr% and strikeout potential.

Devin Jordan is obsessed with statistical analysis, non-fiction literature, and electronic music. If you enjoyed reading him, follow him on Twitter @devonjjordan.
