Stephen Strasburg, Dazzy Vance and Context

Eric Seidman wrote an interesting article last Thursday about Atlanta reliever Craig Kimbrel’s historic strikeout pace. So far, Kimbrel is sporting a blistering 42.7% strikeout rate (K%). Even for a relief pitcher in this era, that’s incredibly impressive. But one person who commented on the story noted that there was a non-reliever approaching the same level of whiff greatness (i.e. > 30% strikeout rate).

Nationals phenom Stephen Strasburg has thrown 182 innings in the big leagues and has struck out 32.5% of the batters he’s faced. No starting pitcher who lasted any significant amount of time has ever finished his career with a strikeout rate higher than 30%. The closest is Randy Johnson, with a 28.5% strikeout rate. This season, Strasburg has a 33% strikeout rate. If he were to maintain that pace, he’d be the 10th starting pitcher in history to achieve the feat and would have the 23rd such season since 1916. But take a look at that list and you’ll note that the oldest instance came back in 1984.

The problem we run into with strikeouts — like many statistics in baseball — is that the playing environment has changed over time.

The average rate at which pitchers record strikeouts has jumped dramatically since 1916 (the earliest year we can calculate strikeout percentage). That year, the league-average starter had a 10.5 K%. Compare that to the current 18.6% rate this year, and we have a massive gap. Whether hitters are more prone to strikeouts, pitchers are simply nastier now, or some combination of environmental and structural changes to the game (e.g. how players are selected, technology, etc.), the fact is a strikeout in 1916 was rarer than one today.

Fortunately, integrating some context into the conversation is incredibly easy. We simply need to calculate a “plus” statistic for strikeout rate so that we can compare how much better than league-average an individual pitcher’s strikeout rate is in a given season. Here’s the formula for calculating what I will call K%+: [(Pitcher’s K% / League Average K%) * 100].

Similar to wRC+, a K%+ of 100 means that a pitcher’s strikeout rate was league-average*. A pitcher with a 125 K%+ would have a strikeout rate 25% better than league average, and a 75 would indicate a rate 25% worse than league-average.
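As a rough sketch, the formula above might be written like this in Python. The function name `k_pct_plus`, the rounding to a whole number, and the 16% league rate in the example are assumptions made for illustration; the sample inputs were simply chosen to land on 100, 125 and 75.

```python
def k_pct_plus(pitcher_k_pct: float, league_k_pct: float) -> int:
    """Strikeout rate relative to league average, scaled so 100 = average.

    Both rates are percentages (e.g. 21.5 for 21.5%). Rounding to a whole
    number is an assumption; the formula itself does not specify it.
    """
    if league_k_pct <= 0:
        raise ValueError("league_k_pct must be positive")
    return round(pitcher_k_pct / league_k_pct * 100)


# Illustrative league-average rate (hypothetical, not a real season).
league = 16.0

print(k_pct_plus(16.0, league))  # a league-average pitcher
print(k_pct_plus(20.0, league))  # 25% better than league average
print(k_pct_plus(12.0, league))  # 25% worse than league average
```

Because the result is a ratio, the same gap in percentage points produces a larger K%+ when the league-average rate is low.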

So if we adjust all individual qualified starter seasons since 1916 in this way, which pitcher had the greatest single season in terms of strikeout rate? The one and only Dazzy Vance.

Vance posted a 21.5 K% in 1924, which ranks as the 404th-best strikeout rate in a single season. But back then, the league-average K% was a mere 6.9%. That means Vance posted a whopping 312 K%+. That season, Walter Johnson had the second-highest K% at 13.8%, which translates to a 200 K%+**.

In fact, if we look at a K%+ leader board, Dazzy Vance appears more dominant in terms of strikeouts than any other pitcher.

| Season | Name | K% | League K% | K%+ | K% Rank | K%+ Rank |
|---|---|---|---|---|---|---|
| 1924 | Dazzy Vance | 21.5% | 6.9% | 312 | 404 | 1 |
| 1925 | Dazzy Vance | 20.3% | 6.9% | 294 | 587 | 2 |
| 1926 | Dazzy Vance | 19.6% | 7.2% | 272 | 725 | 3 |
| 1926 | Lefty Grove | 18.1% | 7.2% | 251 | 1082 | 4 |
| 1928 | Dazzy Vance | 17.8% | 7.4% | 241 | 1160 | 5 |
| 1999 | Pedro Martinez | 37.5% | 15.6% | 240 | 1 | 6 |
| 1941 | Johnny Vander Meer | 21.4% | 9.2% | 233 | 419 | 7 |
| 1984 | Dwight Gooden | 31.4% | 13.5% | 233 | 14 | 8 |
| 1928 | Lefty Grove | 17.0% | 7.4% | 230 | 1389 | 9 |
| 1946 | Hal Newhouser | 23.4% | 10.2% | 229 | 205 | 10 |
| 1927 | Dazzy Vance | 16.4% | 7.2% | 228 | 1627 | 11 |
| 1923 | Dazzy Vance | 16.6% | 7.3% | 227 | 1546 | 12 |
| 1946 | Bob Feller | 23.0% | 10.2% | 225 | 241 | 13 |
| 2001 | Randy Johnson | 36.7% | 16.4% | 224 | 2 | 14 |
| 1979 | J.R. Richard | 26.6% | 11.9% | 224 | 66 | 15 |
| 1939 | Bob Feller | 19.8% | 8.9% | 222 | 684 | 16 |
| 1976 | Nolan Ryan | 27.4% | 12.4% | 221 | 53 | 17 |
| 1995 | Randy Johnson | 34.0% | 15.4% | 221 | 6 | 18 |
| 2000 | Pedro Martinez | 34.8% | 15.8% | 220 | 3 | 19 |
| 2000 | Randy Johnson | 34.7% | 15.8% | 220 | 4 | 20 |
| 1989 | Nolan Ryan | 30.5% | 13.9% | 219 | 17 | 21 |
| 1955 | Herb Score | 25.1% | 11.5% | 218 | 120 | 22 |
| 1938 | Bob Feller | 19.2% | 8.8% | 218 | 799 | 23 |
| 1927 | Lefty Grove | 15.7% | 7.2% | 218 | 1908 | 24 |
| 1978 | J.R. Richard | 26.6% | 12.2% | 218 | 66 | 25 |
| 1928 | George Earnshaw | 16.1% | 7.4% | 218 | 1753 | 26 |
| 1999 | Randy Johnson | 33.7% | 15.6% | 216 | 7 | 27 |
| 1930 | Lefty Grove | 17.6% | 8.2% | 215 | 1219 | 28 |
| 1936 | Van Mungo | 18.1% | 8.5% | 213 | 1082 | 29 |
| 1978 | Nolan Ryan | 25.8% | 12.2% | 211 | 93 | 30 |

Vance has four of the top five spots on the list and six of the top 12. Of his 11 qualifying seasons, his 170 K%+ was his worst. The best non-Vance season came from Lefty Grove in 1926 (251 K%+). To add some additional context, Randy Johnson posted eight seasons with a K% of greater than or equal to 30% (most all time, double the number by Pedro Martinez). Here are Johnson’s top 10 K%+ seasons, compared to Vance’s:

| Season | Name | K% | League K% | K%+ | K% Rank | K%+ Rank |
|---|---|---|---|---|---|---|
| 1924 | Dazzy Vance | 21.5% | 6.9% | 312 | 404 | 1 |
| 1925 | Dazzy Vance | 20.3% | 6.9% | 294 | 587 | 2 |
| 1926 | Dazzy Vance | 19.6% | 7.2% | 272 | 725 | 3 |
| 1928 | Dazzy Vance | 17.8% | 7.4% | 241 | 1160 | 5 |
| 1927 | Dazzy Vance | 16.4% | 7.2% | 228 | 1627 | 11 |
| 1923 | Dazzy Vance | 16.6% | 7.3% | 227 | 1546 | 12 |
| 1930 | Dazzy Vance | 16.3% | 8.2% | 199 | 1670 | 58 |
| 1931 | Dazzy Vance | 16.3% | 8.2% | 199 | 1670 | 58 |
| 1929 | Dazzy Vance | 12.9% | 7.3% | 177 | 3291 | 136 |
| 1922 | Dazzy Vance | 12.5% | 7.2% | 174 | 3533 | 155 |
| 2001 | Randy Johnson | 36.7% | 16.4% | 224 | 2 | 14 |
| 1995 | Randy Johnson | 34.0% | 15.4% | 221 | 6 | 18 |
| 2000 | Randy Johnson | 34.7% | 15.8% | 220 | 4 | 20 |
| 1999 | Randy Johnson | 33.7% | 15.6% | 216 | 7 | 27 |
| 1997 | Randy Johnson | 34.2% | 16.3% | 210 | 5 | 36 |
| 1993 | Randy Johnson | 29.3% | 14.2% | 206 | 28 | 40 |
| 2002 | Randy Johnson | 32.3% | 16.0% | 202 | 11 | 50 |
| 1998 | Randy Johnson | 32.5% | 16.3% | 199 | 10 | 56 |
| 1994 | Randy Johnson | 29.4% | 15.3% | 192 | 27 | 81 |
| 2004 | Randy Johnson | 30.1% | 16.0% | 188 | 21 | 91 |

Johnson’s best season ranks 14th in terms of K%+; Vance had six seasons better than that. The point here isn’t to say that Johnson wasn’t that great (he was phenomenal), but to illustrate how much the changing environment affects the way today’s strikeout artists compare with those who played long ago.

So, getting back to Strasburg. Let’s assume he finishes this season with a 33% strikeout rate. As impressive as that would be, it would still only rank 129th historically in terms of K%+ — by far the lowest of the other 23 30%+ strikeout rate seasons we’ve seen from starting pitchers. Now, 129th out of 6,980 seasons is nothing to sneeze at. But it does illustrate the need to put the current strikeout successes that we are seeing into historical context. Strasburg is undoubtedly one of the most talented strikeout pitchers the game has seen. But he also is pitching in the most strikeout-friendly era of the past 96 years.
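To make that comparison concrete, here is a rough Python sketch that applies the K%+ formula to a hypothetical 33% Strasburg season alongside a handful of the 30%+ seasons from the tables above. It assumes this year's 18.6% league-average starter rate as Strasburg's baseline; the helper name `k_pct_plus` is made up for illustration, and because the inputs are rounded, results can differ by a point from the leaderboard values.

```python
def k_pct_plus(pitcher_k_pct, league_k_pct):
    """K%+ = (pitcher K% / league-average K%) * 100, rounded (an assumption)."""
    return round(pitcher_k_pct / league_k_pct * 100)


# (label, pitcher K%, league-average K%) taken from the tables in the article.
# The Strasburg row is hypothetical: a 33% season against the 18.6% rate.
seasons = [
    ("1999 Pedro Martinez", 37.5, 15.6),
    ("2001 Randy Johnson", 36.7, 16.4),
    ("1984 Dwight Gooden", 31.4, 13.5),
    ("1989 Nolan Ryan", 30.5, 13.9),
    ("2004 Randy Johnson", 30.1, 16.0),
    ("Strasburg (assumed 33%)", 33.0, 18.6),
]

# Rank the seasons from highest to lowest K%+.
ranked = sorted(
    ((name, k_pct_plus(k, lg)) for name, k, lg in seasons),
    key=lambda row: row[1],
    reverse=True,
)

for name, score in ranked:
    print(f"{name:<26} {score}")
```

Under these assumptions the Strasburg season comes out at about 177, below every other 30%+ season in this small sample.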

——————

**If we had K% data going back further, Walter Johnson would no doubt have been more of a force on the leader boards.

We hoped you liked reading Stephen Strasburg, Dazzy Vance and Context by Bill Petti!

Bill leads Predictive Modeling and Data Science consulting at Gallup. In his free time, he writes for The Hardball Times, speaks about baseball research and analytics, has consulted for a Major League Baseball team, and has appeared on MLB Network's Clubhouse Confidential as well as several MLB-produced documentaries. He is also the creator of the baseballr package for the R programming language. Along with Jeff Zimmerman, he won the 2013 SABR Analytics Research Award for Contemporary Analysis. Follow him on Twitter @BillPetti.

Guest
cass

It’s a lot easier to beat the average K% when the K% is so low, though. Not sure how to adjust for that, but perhaps a look at the variance of the K% over time would give more context?

Guest
DD

I agree, there should be an analysis of the bell curve/distribution of pitchers compared to league average. Perhaps there are significantly more pitchers right around league average now, so to have a % so much higher than the league is more impressive now?

Also, why would you park-adjust a park-neutral stat?

Guest
Steve the Pirate

I’d imagine that there was a greater variation in talent in the early 1900s due to the lack of specialized training programs and the relative lack of accessibility of high level baseball instruction. IMO, the strength of the field makes Stras’ achievement more notable. IOW, he stands out in an era where it is much harder to stand out.

As for park factors, has there been any work on parks and K rates? I would think shadows, batter eyes, rocks, guy in T-shirt could affect K rates. Not sure if difference would be notable.

Guest
Ryan

Another difference amongst parks that could affect K rate is foul territory; a lot of foul ground would lead to more foul popouts, whereas very little foul territory would generate more strikes. I would imagine its effect would be minor, though.

Guest
JimNYC

Strikeouts are not a park-neutral stat. Oakland, for example, suppresses strikeouts, since its vast foul territory means a lot of balls that would be foul pops into the stands end up being caught.

Guest
DD

JimNYC – prove it. I understand the concept, but has this really been fleshed out? What is the impact? 2-3 Ks over 200 innings? Does it really move the needle here?

Guest
Anon

I did a search for “strikeout park factors”.

Way to be lazy DD.

Member
Toffer Peak

DD – It doesn’t need to be proven by Jim. It’s already been done by others. The data is out there if you look for it.

Guest
DD

He made the claim – usually if someone claims something is true, they should have some support to back it up. Not a crazy request. Thanks for the link, that’s all I was looking for.

Member
davisnc

DD, do you have support for that claim you made about the support people who make claims ought to have for the claims they make?

Guest
jmarsh

I was thinking this as well. Talent is more toward the center now. What impresses me most is how high Pedro and Gooden are on the list.

Guest
CircleChange11

Just subtract the league average K% from the individual pitcher K% and … give it to Pedro.

Anytime you divide by a smaller number, it’s going to make the result larger. There’s gotta be a better way than just doing that.

I’d follow the same pattern that the guys use to estimate what “Barry Bonds would have hit in 1924” or similar process.

You wouldn’t just divide Bonds’ “batting” (whatever metric you want) by the league average and then multiply it by the league average in 1924, would ya?