## Stephen Strasburg, Dazzy Vance and Context

Eric Seidman wrote an interesting article last Thursday about Atlanta reliever Craig Kimbrel’s historic strikeout pace. So far, Kimbrel is sporting a blistering 42.7% strikeout rate (K%). Even for a relief pitcher in this era, that’s incredibly impressive. But one person who commented on the story noted that there was a non-reliever approaching the same level of whiff greatness (i.e. > 30% strikeout rate).

Nationals phenom Stephen Strasburg has thrown 182 innings in the big leagues and has struck out 32.5% of the batters he’s faced. No starting pitcher who lasted any significant amount of time ever finished his career with a strikeout rate higher than 30%. The closest is Randy Johnson and his 28.5% strikeout rate. This season, Strasburg has a 33% strikeout rate. If he were to maintain that pace, he’d be the 10th starting pitcher in history to achieve the feat and would have the 23rd such season since 1916. But take a look at that list and you’ll note that the oldest instance came back in 1984.

The problem we run into with strikeouts — like many statistics in baseball — is that the playing environment has changed over time.

The average rate at which pitchers record strikeouts has jumped dramatically since 1916 (the earliest year we can calculate strikeout percentage). That year, the league-average starter had a 10.5 K%. Compare that to the current 18.6% rate this year, and we have a massive gap. Whether hitters are more prone to strikeouts, pitchers are simply nastier now, or some combination of environmental and structural changes to the game (e.g. how players are selected, technology, etc.), the fact is a strikeout in 1916 was rarer than one today.

Fortunately, integrating some context into the conversation is incredibly easy. We simply need to calculate a “plus” statistic for strikeout rate so that we can compare how much better than league-average an individual pitcher’s strikeout rate is in a given season. Here’s the formula for calculating what I will call K%+: [(Pitcher's K% / League Average K%) * 100].

Similar to wRC+, a K%+ of 100 means that a pitcher’s strikeout rate was league-average*. A pitcher with a 125 K%+ would have a strikeout rate 25% better than league average, and a 75 would indicate a rate 25% worse than league-average.
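As a rough illustration, the K%+ calculation can be sketched in a few lines of Python. The function name `k_pct_plus` is made up for this example, and the inputs are the 33% strikeout rate and the 18.6% league-average starter rate quoted earlier in this piece.

```python
def k_pct_plus(pitcher_k_pct: float, league_k_pct: float) -> float:
    """Return K%+: a pitcher's strikeout rate relative to league average.

    Both inputs are percentages (e.g. 33.0 for 33%). A result of 100
    means the pitcher was exactly league-average.
    """
    if league_k_pct <= 0:
        raise ValueError("league_k_pct must be positive")
    return pitcher_k_pct / league_k_pct * 100


# Illustrative inputs from the article: a 33% strikeout rate against
# this year's 18.6% league-average rate for starters.
example = k_pct_plus(33.0, 18.6)
print(f"K%+ = {example:.0f}")  # roughly 177
```

The function only rescales a ratio, so any league baseline can be passed in, as long as it is measured the same way as the pitcher's own rate.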

So if we adjust all individual qualified starter seasons since 1916 in this way, which pitcher had the greatest single season in terms of strikeout rate? The one and only Dazzy Vance.

Vance posted a 21.5 K% in 1924, which ranks as the 404th-best strikeout rate in a single season. But back then, the league-average K% was a mere 6.9%. That means Vance posted a whopping 312 K%+. That season, Walter Johnson had the second-highest K% at 13.8%, which translates to a 200 K%+**.

In fact, if we look at a K%+ leader board, Dazzy Vance appears more dominant in terms of strikeouts than any other pitcher.

| Season | Name | K% | League K% | K%+ | K% Rank | K%+ Rank |
|---|---|---|---|---|---|---|
| 1924 | Dazzy Vance | 21.5% | 6.9% | 312 | 404 | 1 |
| 1925 | Dazzy Vance | 20.3% | 6.9% | 294 | 587 | 2 |
| 1926 | Dazzy Vance | 19.6% | 7.2% | 272 | 725 | 3 |
| 1926 | Lefty Grove | 18.1% | 7.2% | 251 | 1082 | 4 |
| 1928 | Dazzy Vance | 17.8% | 7.4% | 241 | 1160 | 5 |
| 1999 | Pedro Martinez | 37.5% | 15.6% | 240 | 1 | 6 |
| 1941 | Johnny Vander Meer | 21.4% | 9.2% | 233 | 419 | 7 |
| 1984 | Dwight Gooden | 31.4% | 13.5% | 233 | 14 | 8 |
| 1928 | Lefty Grove | 17.0% | 7.4% | 230 | 1389 | 9 |
| 1946 | Hal Newhouser | 23.4% | 10.2% | 229 | 205 | 10 |
| 1927 | Dazzy Vance | 16.4% | 7.2% | 228 | 1627 | 11 |
| 1923 | Dazzy Vance | 16.6% | 7.3% | 227 | 1546 | 12 |
| 1946 | Bob Feller | 23.0% | 10.2% | 225 | 241 | 13 |
| 2001 | Randy Johnson | 36.7% | 16.4% | 224 | 2 | 14 |
| 1979 | J.R. Richard | 26.6% | 11.9% | 224 | 66 | 15 |
| 1939 | Bob Feller | 19.8% | 8.9% | 222 | 684 | 16 |
| 1976 | Nolan Ryan | 27.4% | 12.4% | 221 | 53 | 17 |
| 1995 | Randy Johnson | 34.0% | 15.4% | 221 | 6 | 18 |
| 2000 | Pedro Martinez | 34.8% | 15.8% | 220 | 3 | 19 |
| 2000 | Randy Johnson | 34.7% | 15.8% | 220 | 4 | 20 |
| 1989 | Nolan Ryan | 30.5% | 13.9% | 219 | 17 | 21 |
| 1955 | Herb Score | 25.1% | 11.5% | 218 | 120 | 22 |
| 1938 | Bob Feller | 19.2% | 8.8% | 218 | 799 | 23 |
| 1927 | Lefty Grove | 15.7% | 7.2% | 218 | 1908 | 24 |
| 1978 | J.R. Richard | 26.6% | 12.2% | 218 | 66 | 25 |
| 1928 | George Earnshaw | 16.1% | 7.4% | 218 | 1753 | 26 |
| 1999 | Randy Johnson | 33.7% | 15.6% | 216 | 7 | 27 |
| 1930 | Lefty Grove | 17.6% | 8.2% | 215 | 1219 | 28 |
| 1936 | Van Mungo | 18.1% | 8.5% | 213 | 1082 | 29 |
| 1978 | Nolan Ryan | 25.8% | 12.2% | 211 | 93 | 30 |

Vance has four of the top five spots on the list and six of the top 12. Of his 11 qualifying seasons, his 170 K%+ was his worst. The best non-Vance season came from Lefty Grove in 1926 (251 K%+). To add some additional context, Randy Johnson posted eight seasons with a K% of greater than or equal to 30% (most all time, double the number by Pedro Martinez). Here are Johnson’s top 10 K%+ seasons, compared to Vance’s:

| Season | Name | K% | League K% | K%+ | K% Rank | K%+ Rank |
|---|---|---|---|---|---|---|
| 1924 | Dazzy Vance | 21.5% | 6.9% | 312 | 404 | 1 |
| 1925 | Dazzy Vance | 20.3% | 6.9% | 294 | 587 | 2 |
| 1926 | Dazzy Vance | 19.6% | 7.2% | 272 | 725 | 3 |
| 1928 | Dazzy Vance | 17.8% | 7.4% | 241 | 1160 | 5 |
| 1927 | Dazzy Vance | 16.4% | 7.2% | 228 | 1627 | 11 |
| 1923 | Dazzy Vance | 16.6% | 7.3% | 227 | 1546 | 12 |
| 1930 | Dazzy Vance | 16.3% | 8.2% | 199 | 1670 | 58 |
| 1931 | Dazzy Vance | 16.3% | 8.2% | 199 | 1670 | 58 |
| 1929 | Dazzy Vance | 12.9% | 7.3% | 177 | 3291 | 136 |
| 1922 | Dazzy Vance | 12.5% | 7.2% | 174 | 3533 | 155 |
| 2001 | Randy Johnson | 36.7% | 16.4% | 224 | 2 | 14 |
| 1995 | Randy Johnson | 34.0% | 15.4% | 221 | 6 | 18 |
| 2000 | Randy Johnson | 34.7% | 15.8% | 220 | 4 | 20 |
| 1999 | Randy Johnson | 33.7% | 15.6% | 216 | 7 | 27 |
| 1997 | Randy Johnson | 34.2% | 16.3% | 210 | 5 | 36 |
| 1993 | Randy Johnson | 29.3% | 14.2% | 206 | 28 | 40 |
| 2002 | Randy Johnson | 32.3% | 16.0% | 202 | 11 | 50 |
| 1998 | Randy Johnson | 32.5% | 16.3% | 199 | 10 | 56 |
| 1994 | Randy Johnson | 29.4% | 15.3% | 192 | 27 | 81 |
| 2004 | Randy Johnson | 30.1% | 16.0% | 188 | 21 | 91 |

Johnson’s best season ranks 14th in terms of K%+; Vance had six seasons better than that. The point here isn’t to say that Johnson wasn’t that great (he was phenomenal), but to illustrate how much the changing environment affects the way today’s strikeout artists compare to those who played long ago.

So, getting back to Strasburg. Let’s assume he finishes this season with a 33% strikeout rate. As impressive as that would be, it would still only rank 129th historically in terms of K%+, by far the lowest of the other 23 30%+ strikeout rate seasons we’ve seen from starting pitchers. Now, 129th out of 6,980 seasons is nothing to sneeze at. But it does illustrate the need to put the current strikeout successes that we are seeing into historical context. Strasburg is undoubtedly one of the most talented strikeout pitchers the game has seen. But he also is pitching in the most strikeout-friendly era of the past 96 years.

——————

**If we had K% data going back further, Johnson would no doubt have been more of a force on the leader boards.

Bill works as a consultant by day. In his free time, he writes for The Hardball Times, speaks about baseball research and analytics, consults for a Major League Baseball team and appears on MLB Network's Clubhouse Confidential. Along with Jeff Zimmerman, he won the 2013 SABR Analytics Research Award for Contemporary Analysis. Follow him on Tumblr or Twitter @BillPetti.

### 50 Responses to “Stephen Strasburg, Dazzy Vance and Context”

1. Oliver says:

Awesome article, thank you.

#### +13

• JimNYC says:

Any article that talks up Dazzy Vance — probably my second favorite pitcher ever, after Dizzy Dean, although Rube Waddell deserves some consideration — is ok by me.

2. cass says:

It’s a lot easier to beat the average K% when the K% is so low, though. Not sure how to adjust for that, but perhaps a look at the variance of the K% over time would give more context?

#### +8

• DD says:

I agree, there should be an analysis of the bell curve/distribution of pitchers compared to league average. Perhaps there are significantly more pitchers right around league average now, so to have a % so much higher than the league is more impressive now?

Also, why would you park-adjust a park-neutral stat?

• Steve the Pirate says:

I’d imagine that there was a greater variation in talent in the early 1900′s due to the lack of specialized training programs and the relative lack of accessibility of high level baseball instruction. IMO, the strength of the field makes Stras’ achievement more notable in my eyes. IOW, he stands out in an era where it is much harder to stand out.

As for park factors, has there been any work on parks and K rates? I would think shadows, batter eyes, rocks, guy in T-shirt could affect K rates. Not sure if difference would be notable.

• Ryan says:

Another difference amongst parks that could affect K rate is foul territory; a lot of foul ground would lead to more foul popouts, whereas very little foul territory would generate more strikes. I would imagine its effect would be minor, though.

• JimNYC says:

Strikeouts are not a park-neutral stat. Oakland, for example, suppresses strikeouts, since the vast foul territory means a lot of balls that would be foul pops into the stands end up being caught.

#### +12

• DD says:

Jim NYC – prove it. I understand the concept, but has this really been fleshed out? what is the impact? 2-3 Ks over 200 innings? does it really move the needle here?

• Anon says:

I did a search for “strikeout park factors”.

Way to be lazy DD.

• Toffer Peak says:

DD – It doesn’t need to be proven by Jim. It’s already been done by others. The data is out there if you look for it.

• DD says:

He made the claim – usually if someone claims something is true, they should have some support to back it up. Not a crazy request. Thanks for the link, that’s all I was looking for.

• davisnc says:

DD, do you have support for that claim you made about the support people who make claims ought to have for the claims they make?

• jmarsh says:

I was thinking this as well. Talent is more toward the center now. What impresses me most is how high Pedro and Gooden are on the list.

• Bill Petti says:

FWIW, standard deviation in K% among starters has increased over time. It was 2.6% in 1924, about 4.5% this year.

• CircleChange11 says:

Just subtract the league average K% from the individual pitcher K% and … give it to Pedro.

Anytime you divide by a smaller number, it’s going to make the numerator larger. There’s gotta be a better way than just doing that.

I’d follow the same pattern that the guys use to estimate what “Barry Bonds would have hit in 1924″ or similar process.

You wouldn’t just divide Bonds “batting” (whatever metric you want) by the league average and then multiply it by the league average in 1924, would ya?

3. Justin says:

We should also probably put in context that K’s as a whole are up more because SPs throw faster now than they did decades ago.

• JimNYC says:

Tell that to Bob Feller. If you said SP’s “as a whole,” then, yes, I’d agree with you.

• Justin says:

I thought it was implied, sorry

• Antonio Bananas says:

It was implied, I’ve noticed that a lot of people on here can’t infer anything. You have to spell everything out or some asshole is going to nitpick it.

#### +8

• Ben says:

Yes. Also, more hitters probably swing for the fences today, which leads to more strikeouts. It’s easy to imagine batters in 1924 mostly hit for singles.

Was it ever definitely documented that Feller threw 100 MPH?

4. Chummy Z says:

Valid points above, and this was a fantastic read.

5. Ixcila says:

I realize that it’s a quick-and-dirty tool for making the comparison, but X+ tools (where X is some generic stat) should really be scaled to account for the spread of the data. For stats with high spread, being 20% above average could be one standard deviation from the mean, whereas for one with low spread, that same 20% could be two or three deviations. The X+ stat should reflect that.

#### +17

• Pat says:

Yeah it’d be much more interesting to see this done in terms of deviation from the mean.
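A minimal sketch of that idea, in Python: express each season as a number of standard deviations above the league mean rather than as a ratio. The function name `k_z_score` is invented for this example. The standard deviations (2.6% in 1924, about 4.5% this year) are the figures Bill Petti gives in the comments above; pairing them with the 6.9% and 18.6% league averages from the article is an assumption.

```python
def k_z_score(pitcher_k_pct: float, league_k_pct: float, league_sd: float) -> float:
    """Standard deviations above (or below) the league-average K%.

    All three inputs are in percentage points.
    """
    if league_sd <= 0:
        raise ValueError("league_sd must be positive")
    return (pitcher_k_pct - league_k_pct) / league_sd


# Vance's 1924 season: 21.5% against a 6.9% league, SD of 2.6 points.
vance_1924 = k_z_score(21.5, 6.9, 2.6)

# A hypothetical 33% season this year: 18.6% league, SD of about 4.5 points.
strasburg_pace = k_z_score(33.0, 18.6, 4.5)

print(f"Vance 1924:     {vance_1924:.1f} SD above average")
print(f"33% this year:  {strasburg_pace:.1f} SD above average")
```

With these rough inputs, Vance's 1924 comes out about 5.6 standard deviations above the mean and a 33% season this year about 3.2, so Vance stays ahead under this adjustment too.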

6. Mac says:

Also worth noting, Max Scherzer is currently sporting a 28.9% strikeout rate. If he kept that up and topped 30 K% by year’s end , it would probably be the worst overall pitching season of the >30 K% club.

7. Mac says:

More on the subject of Vance and K%+:

8. Well-Beered Englishman says:

I’m flattered to have an article written out of one of my comments – let alone such a fascinating read. Many thanks!

• Beard says:

You are mistaken. I believe the article was inspired by this comment

• Englishman Who Lost His Beard says:

Where you runnin’ off to, beard?

9. mcbrown says:

To expand on a nit highlighted above, using K%+ as a metric makes certain implicit assumptions about the distribution of strikeout skills across players over time that may not be valid. Is it the case that strikeout rates are rising “just because”, and all else being equal (particularly true talent), Strasburg the individual should be expected to strike out more batters in 2012 than 2011 just because more batters are striking out on average? Is it the case that rising strikeout rates reflect changes that the average pitcher has made that are unavailable to a pitcher like Strasburg because he is optimized in some sense, in which case Strasburg should be expected to strike out the same number of batters in 2012 as 2011 even as the average pitcher strikes out more batters? Is the reality somewhere in between?

I’m inclined to think the reality is somewhere in between. In the limit where the league average K% approached 50%, obviously no one could post a K%+ in excess of 200. That might argue for looking at some kind of logarithmic K%+ metric as a more fair comparison across time periods.

Another possibility is that ratios are the wrong way to look at K%, and the absolute difference is what matters. Or some combination of absolute difference and ratio.

I’m not sure how to even begin to approach the problem without assuming a conclusion.

• mcbrown says:

Or we can just use standard deviation, as someone else pointed out.

10. man07 says:

Why would you park adjust K%? Except for higher mounds back before Bob Gibson went off, I can’t think of any way the parks would affect strike outs.

Might could adjust for the mound heights if you could figure out a way to do that but I don’t think there’s a need to adjust for individual parks.

• RC says:

The fact that you can’t think of any reasons doesn’t mean they don’t exist. Some parks have some pretty strong K factors.

They can come from anything from the color of the walls to the width of foul territory.

11. Dan Rosenheck says:

No Rube Waddell on this list??

12. 86general says:

Strikeout rates are probably higher today largely for a single reason: Players don’t care nearly as much about striking out. In the early days of baseball, there was a sizeable negative stigma attached to striking out. If in a 2-strike count, for decades most players would alter their approach slightly, trying to put the ball in play rather than drive the ball. Today, I think in part owing to a better understanding of what productive offense is, players realize that a strikeout often is no worse than any other kind of out, and that an extra-base hit or a homer is much better than a dinky single. Therefore, they are more willing to swing for the fences even with 2 strikes. I believe this will change in the coming decade. In the end of the steroid era, we’re seeing fewer fly balls clear the fences, and umpires appear to be calling more called strikes as the strike zone improves. The strategy of trying for walks and homers becomes more difficult to successfully execute in this environment. I think as this trend evolves, eventually we will reach a point where a contact-hitting player who can bat .350 with just average or even below average slugging percentage will be an asset. In other words, as walks and homers become harder to come by, the value of slap singles goes up.

I really do hope things evolve this way. Homers will never go away, but the game is more exciting when a variety of different types of offense are all valuable. Players who can put the ball in play and run the bases well are exciting to watch. They almost went extinct in the homer/steroid era, but they may make a comeback.

• Hank G. says:

I think as this trend evolves, eventually we will reach a point where a contact-hitting player who can bat .350 with just average or even below average slugging percentage will be an asset.

Has there ever been a time when a contact-hitting player who can bat .350 has not been an asset? That’s pretty much describing Tony Gwynn.

13. pm says:

“[(Pitcher's K% / League Average K%) * 100].”

This method hurts modern players because league average rate includes relievers. So let’s say today the average starter has a 18 K% and pitches about 60% of the innings while relievers are 23 K% and 40%. That gives you a league average rate of 20%, but that is misleading because the relief pitchers are inflating that number. Whereas 90 years ago, the starters are taking up 90% of the innings therefore the number isn’t misleading.

#### +5
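The arithmetic in the comment above can be sketched in Python. The function name `blended_k_pct` is made up, the 18%, 23% and 60/40 figures are the commenter's hypothetical numbers rather than real league data, and the 33% pitcher is just an illustrative input.

```python
def blended_k_pct(starter_k_pct: float, reliever_k_pct: float,
                  starter_share: float) -> float:
    """League K% as a weighted average of starter and reliever rates.

    starter_share is the fraction of the workload handled by starters.
    The comment weights by innings; weighting by batters faced would be
    more exact, since K% is a per-batter rate.
    """
    if not 0 <= starter_share <= 1:
        raise ValueError("starter_share must be between 0 and 1")
    return starter_k_pct * starter_share + reliever_k_pct * (1 - starter_share)


league = blended_k_pct(18.0, 23.0, 0.60)  # about 20%

# The same 33% starter looks less impressive against the blended baseline
# than against a starters-only baseline.
vs_blended = 33.0 / league * 100
vs_starters = 33.0 / 18.0 * 100
print(f"league: {league:.1f}%, K%+ vs blended: {vs_blended:.0f}, "
      f"vs starters only: {vs_starters:.0f}")
```

On these made-up numbers the choice of baseline moves K%+ from about 165 to about 183.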

• Bill Petti says:

True, but I actually used starter-only league averages.

Right, but you throw harder when you know someone has your back.

Walter Johnson was once the K leader but he liked to pitch to contact so that he could get out of the inning.

14. Hurtlockertwo says:

“Whether hitters are more prone to strikeouts, pitchers are simply nastier now, or some combination of environmental and structural changes to the game ” Seriously?? Maybe it’s because every batter wasn’t trying to hit HR’s and just trying to get on base.

15. Brent says:

As several commenters have noted, the relative strikeout percentage, K%+, seems biased in favor of pitchers who pitched in low-strikeout eras.

Mathematically, the problem is that the range for K% is 0–1, and the “+” adjustment is a ratio. Ratios work for data that have a 0–infinity range. For example, suppose the league K% were to increase to .35; then to match Vance’s 1924 K%+ of 312, a pitcher would have to have a K% of 109%, which, of course, is impossible.

A possible solution is to make the ratio adjustment to the odds ratio, rather than to the percentage. The odds ratio is K%/(1 - K%), and it has a range of 0–infinity.

The “relative odds ratio,” K-Odds+, is:

[((Pitcher's K% / (1 - Pitcher's K%)) / (League Average K% / (1 - League Average K%))) * 100].

Based on this relative odds ratio metric, Vance’s top two seasons remain at the top of the list (K-Odds+ of 370 and 344), but Martinez’s 1999 season moves into third place (325). Vance’s 1926 season (314) is in fourth place, but then Johnson’s 2001 (296) and Gooden’s 1984 (293) take fifth and sixth. The seventh through tenth ranked seasons are Grove 1926 (285), Martinez 2000 (284), Johnson 2000 (283), and Johnson 1995 (283).

Altogether, I think the odds-ratio-based metric gives a fairer representation to the eras than the metric based on relative K%.

#### +10
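A short Python sketch of the relative odds ratio described in the comment above. The function names `odds` and `k_odds_plus` are invented for this example; the inputs are the K% and league K% values from the article's leaderboard.

```python
def odds(rate: float) -> float:
    """Convert a rate given as a fraction (0 < rate < 1) into odds."""
    if not 0 < rate < 1:
        raise ValueError("rate must be strictly between 0 and 1")
    return rate / (1 - rate)


def k_odds_plus(pitcher_k_pct: float, league_k_pct: float) -> float:
    """Relative odds ratio, scaled so that 100 is league-average.

    Inputs are percentages (e.g. 21.5 for 21.5%).
    """
    return odds(pitcher_k_pct / 100) / odds(league_k_pct / 100) * 100


# Leaderboard inputs from the article.
print(f"Vance 1924:    {k_odds_plus(21.5, 6.9):.0f}")   # about 370
print(f"Martinez 1999: {k_odds_plus(37.5, 15.6):.0f}")  # about 325
```

Because odds are unbounded above, this version has no ceiling of the kind the plain ratio runs into as the league rate rises.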

16. Jon L. says:

Great article! People talking about looking at standard deviations might have a good idea, but all you have to do is look at the list of pitchers and seasons this method generated to see that it’s finding some of the toughest pitchers to make contact against across several different eras.

17. J W says:

I would be careful about assuming that the changing run environment is responsible for the increase in strikeouts, rather than the other way around. I suspect if you took Strasburg back to 1924 he would strike out well over 21% of batters. Fundamentally, the flaw is simply that baseball’s eras have been so different, in terms of talent distribution, run environment, the parks themselves, and so on, that even adjusted comparisons really don’t tell you much. That’s especially true when you look only at a component stat like strikeout rate–which even with SIERA has a pretty linear effect on run prevention, rather than a proportional one–instead of an overall run suppression stat like FIP- (where Stras is, I think, comfortably in the lead among starters so far in his career).

18. james wilson says:

“Was it ever definitely documented that Feller threw 100 MPH?”
Pretty much. He was timed–at home plate–at 98, and another time, averaging 98 over 60 feet (in his street shoes). The difference between hand and plate is generally 8mph. Ted Williams said Feller was the fastest he ever saw. And he was also that rare bird to tell you that ballplayers are better now than in his day.

Walter Johnson was asked to stop in at a Connecticut ammo manufacturer to be timed by equipment which measured projectile velocity through time and distance. He was calculated to have averaged 102 over 60 feet. But getting struck out in that era was viewed as a personal shortcoming.

I witnessed a very elderly scout who never carried a gun calling the speed of every pitch, and there were three scouts there to confirm that he never missed. It was like a game. Old scouts who had seen Johnson, Feller, and Ryan, claimed Johnson was fastest.

• J W says:

Feller indeed was clocked at close to 100 mph–in 1946, the year he struck out over 23% of batters. Interestingly, Feller didn’t even have the highest strikeout percentage that year; Hal Newhouser struck out 23.4% of the batters he faced. Feller was 27 in 1946, and his record looked like this:

```
Year  Age    PA    K%   BB%
1936  17*   279  27.2  16.8
1937  18*   651  23.0  16.3
1938  19   1248  19.2  16.7
1939  20   1243  19.8  11.4
1940  21   1304  20.0   9.0
1941  22   1466  17.7  13.2
----  Did not play from 1942-44, military service.
1945  26*   300  19.7  11.7
1946  27   1512  23.0  10.1
1947  28   1218  16.1  10.4
1948  29   1186  13.8   9.8

* Did not qualify for the leaderboards.
```

Feller entered the majors at 17 in 1936. One of the things that makes finding a proper comparison for Feller hard is that he entered the league so early and fought in WWII for three years of what was probably his strikeout prime (1942 to 1944). Everyone knows that Feller had a 100 mph fastball when he was younger, and most of the names that immediately spring to mind are either Hall of Fame pitchers (Pedro Martinez, Randy Johnson, Roger Clemens, and of course Nolan Ryan) or fireballing relievers (Aroldis Chapman, Billy Wagner, Eric Gagne). Feller, though, was a starter–a highly successful one–and did not have particularly good command. That drops out everyone I just listed but Ryan, who would be an excellent comparison (he entered the league early) if it weren’t for two things: he pitched in a very different era from today’s, and he, like many Hall of Famers we remember, was an all-time great older pitcher. Feller, on the other hand, fell off hard after his year 28 season. Before that season, he had never had a K% below 17.7; after it, he had just one above 13. So we’re looking for a modern-era pitcher who entered professional baseball in high school, throws high ’90s with spotty command, and fell off hard around his age 28 season. Ubaldo Jimenez was the name that immediately sprang to mind, at least so far.

Here are Feller and Jimenez’s stats superimposed over one another. Jimenez entered low-A baseball at 18, while Feller was in the majors, so you should almost certainly adjust his numbers in the minor league seasons downwards.

```
Age  PA_B   PA_U  K%_B  K%_U  BB%_B  BB%_U
17   279*   ----  27.2  ----   16.8   ----
18   651*  288+*  23.0  22.6   16.3   10.1
19   1248  664+*  19.2  21.8   16.7   10.2
20   1243  176+*  19.8  34.7   11.4    6.8
21   1304  588+*  20.0  22.3    9.0   12.2
22   1466  648+*  17.7  23.1   13.2   12.8
23   ----   354*  ----  19.2   ----   10.5
24   ----    868  ----  19.8   ----   11.9
25   ----    914  ----  21.7   ----    9.3
26   300*    894  19.7  23.9   11.7   10.3
27   1512    822  23.0  21.9   10.1    9.5
28   1218   428*  16.1  16.1   10.4   13.3
29   1186   ----  13.8  ----    9.8   ----

* Did not qualify for the leaderboards.
+ Minor league season.
```

Feller had practically unparalleled velocity during the time he pitched, but it’s not clear to me that it was any faster than Ubaldo’s. Ubaldo may not be an all-time great or have a ticket punched to the Hall of Fame, but he certainly can light up the radar gun. People pointing to the low strikeout totals in that era as reflective of batters’ philosophies being different are missing the point, I think. I’m certain that is a part of it, but there are very few pitchers even today who strike people out with an under 90 mph fastball, which is what the vast majority of pitchers were presenting them. I’m sure that with modern training regimens, medicine, coaching and scouting, and so on, as well as earlier integration and more international talent, there would have been a lot more pitchers throwing in the high ’90s. But I don’t see what relative K%, by itself, really brings to the table, except perhaps a measure of “impressiveness.”

As for Johnson, like many other good things, the myth that Johnson threw 100 mph appears to be just that–a myth:

“Although a lack of precision instruments prevented accurate measurement of his fastball, in 1917, a Bridgeport, Connecticut munitions laboratory recorded Johnson’s fastball at 134 feet per second, which is equal to 91.36 miles per hour (147.03 km/h), a velocity which was virtually unique in Johnson’s day, with the possible exception of Smoky Joe Wood. Johnson, moreover, pitched with a sidearm motion, whereas power pitchers are normally known for pitching with a straight-overhand delivery. Johnson’s motion was especially difficult for right-handed batters to follow, as the ball seemed to be coming from third base.”

I think it’s safe to say that Strasburg would have had high strikeout totals in any era.

• Hurtlockertwo says:

All true, but hitters were not trying to hit HR’s in the 1900-1920 years either. Walter Johnson faced hitters just trying to get on base.

• 39Bailey says:

Ted Williams said Steve Dalkowski was the fastest pitcher he ever faced.

Apparently, Ted Williams said a lot of shit, some of which contradicted other statements. Much like the rest of us.

19. gdc says:

The various comments about park effects for K% might want to also include which might have a better background to pick up the ball. In the case of Coors, the thinner air is supposed to give breaking balls less movement.

20. Don Draper says:

Great article,
Sincerely
Dick W

21. Jack says:

Fun read. My appreciation for Pedro and Randy Johnson continues to go up in the years post-replacement.