The Strange Career of Wes Ferrell (SP Leverage, Part 4)

This is the fourth article in a never-ending series on starting pitcher leverage. If you know the gist of these suckers, you can skip this paragraph. For the rest of you, starting pitcher leverage refers to the once-common practice of a team intentionally using its pitchers disproportionately against particular opposing teams. It could be an ace starting all the time against the best opposing teams, or southpaws starting against the most left-leaning offenses. For this study, I figured out that leveraging existed from the earliest days of baseball up to the 1960s, and thus I looked at the usage patterns for virtually every pitcher worth looking at. I ended up determining the leveraging for over two-thirds of all GS from 1876 to 1969. For this I invented a stat called AOWP+. Scroll down to see exactly how this stat works. Short version: it’s set up like ERA+ or OPS+, centered on 100. A higher score means the pitcher was used more against the best teams, a lower score means more against the worst teams, and if he’s used evenly against all, his AOWP+ will be 100. So much for that.

So far I’ve looked at the best and worst leveraged careers and single seasons, and tried to determine how much impact leveraging had on a pitcher’s numbers. There’s a lot of ground I’d still like to cover, but there’s an issue I must contend with first: the specter of Dick Thompson.

The problem

I mentioned in this series’ debut article that Dick Thompson’s work inspired this entire venture. The notion of starting pitcher leveraging is fairly well known (find an old ChiSox fan who remembers the Go-Go Sox and he’ll tell you about all the times Billy Pierce started against the Yanks), but Thompson is the only person I know of who had the insight to apply what’s known of leveraging to an individual pitcher in judging his career. In particular, in his book, “The Ferrell Brothers of Baseball,” he argued that Wes Ferrell is far better than anyone realizes because of how his teams used him.

Years ago he used to post on Baseball Primer detailing nuggets he’d learned. Most memorably, he claimed Ferrell from 1929 to 1936 was as good as Lefty Grove. For example, he’d point to 1930 and 1931 when Grove didn’t pitch much against the Yanks and never had to face his own squad’s potent offense, while the Indians routinely loaded Ferrell up against those teams. Since then I’ve come across members of the sabermetric community as prestigious as Rob Neyer mentioning Wes Ferrell as an example of a pitcher who was better than his numbers indicate because of his usage.

This was brilliant and frankly revolutionary research by Dick Thompson, and I salute him for having the idea to examine leveraging, the determination to see the project through, and the willingness to make his information public.

There’s just one problem. He’s all wrong.

Wes Ferrell was actually a rather poorly leveraged pitcher. Not always, mind you, but while he was with the Red Sox, calling him poorly leveraged is a massive understatement. On the whole, leveraging actually diminishes, not enhances, his value. This realization was one of the most jarring things I uncovered in this study. Here’s his AOWP+ info for his career:

Year          GS     AOWP   TOWP AOWP+  Team
1928           2     430     514   84   CLE
1929          25     502     495  101   CLE
1930          35     511     496  103   CLE
1931          35     513     498  103   CLE
1932          34     496     489  101   CLE
1933          26     513     498  103   CLE
1934          23     468     500   94   BOX
1935          38     481     498   97   BOX
1936          38     483     504   96   BOX
1937          35     492     501   98   BOX/WAS
1938          26     474     496   96   WAS/NYY
1939           3     482     490   98   NYY
1941           3     524     516  102   BOS
All          323     493     497  99.16

When Dick Thompson pointed to the early 1930s and how the Indians used Ferrell often against the Yanks and A’s, he was right. In 1930 Ferrell started six games against the Yanks and seven against the Athletics. He had no more than five starts against any other team. The next year he started six times against the A’s and four against the Bronx Bombers (plus seven more against the third-place Senators). However, that was merely common for an ace in those years, and by no means was it the most remarkable part of his usage.

For Ferrell, the key lies in Boston. Sure, his AOWP+s don’t look that bad for those years, but back in the day, when leveraging was pretty common, you rarely saw a big-name pitcher like Ferrell, who still was winning 20 games a year, consistently fall several points below 100. If you divide all 110 of his Boston starts into seven categories (games against the best available opponent, second-best, third-best, and so forth down to the seventh-best, meaning the worst opposing team), here’s what his usage pattern looks like with the Red Sox:

Opponent      GS
Best          10
2nd Best      14
3rd Best      16
4th Best      11
5th Best      18
6th Best      20
Worst         21

Pretty neat, huh? Only one step out of place. And that gives him the benefit of the doubt, because in 1936 both the White Sox and Senators had a .536 mark. He had six starts against the former but only two against the latter. I listed the Sox as the third-best and Washington as the fourth-best. Flip it around and his usage looks even more bottom heavy. It’s even more amazing when you realize that contemporary ace pitchers were more likely to start against the best available team. In fact, never, in the entire history of baseball, has there been a pitcher so talented as Wes Ferrell, who—while still in his prime—was as poorly leveraged as he was by the Boston Red Sox. For perspective, here’s his entire career:

Teams        Cle     Box     Rest    Total
Best          28      10       6       44
2nd Best      20      14       8       42
3rd Best      27      16       7       50
4th Best      20      11       9       40
5th Best      18      18      10       46
6th Best      21      20       8       49
Worst         23      21       8       52
Total        157     110      56      323
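
For anyone who wants to reproduce this kind of rank tally, here is a minimal Python sketch of the bucketing, not the actual process behind these tables. It assumes each season is stored as a mapping from opponent to that opponent's winning percentage and the pitcher's starts against it. Ties in winning percentage go to the team the pitcher faced more often, which is how the 1936 White Sox and Senators were ordered above. The function name, the data layout, and team "X" are all made up for illustration.

```python
from collections import Counter


def starts_by_rank(seasons):
    """Tally starts by opponent rank (1 = best) across seasons.

    Each season maps opponent -> (winning_pct, starts_against).
    """
    totals = Counter()
    for season in seasons:
        # Best record first; ties go to the team faced more often.
        ranked = sorted(season.values(), key=lambda rec: (-rec[0], -rec[1]))
        for rank, (_pct, starts) in enumerate(ranked, start=1):
            totals[rank] += starts
    return dict(totals)


# One toy season: the .536 tie and the 6 and 2 starts come from the text;
# team "X" and its numbers are invented filler.
season = {"CHW": (0.536, 6), "WAS": (0.536, 2), "X": (0.600, 1)}
print(starts_by_rank([season]))  # {1: 1, 2: 6, 3: 2}
```

Run over a real pitcher's seasons with all seven opponents, the same loop would produce the kind of seven-row tally shown in the tables.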

And since Thompson compared Ferrell to Grove, here’s the great one’s leverage score, and below that, to finish off the parallel, how many times he started against the various teams in his career (since with Ferrell, when teams were tied, I put the team he faced the most on top, I’ll give Grove the same courtesy):

Year          GS     AOWP   TOWP AOWP+  Team
1925          18     508     488  104   A's
1926          33     501     491  102   A's
1927          28     486     488  100   A's
1928          31     478     481   99   A's
1929          37     467     473   99   A's
1930          32     471     477   99   A's
1931          30     471     473  100   A's
1932          30     488     484  101   A's
1933          28     499     499  100   A's
1934          12     510     500  102   BOX
1935          30     500     498  100   BOX
1936          30     535     504  106   BOX
1937          32     532     498  107   BOX
1938          21     501     489  102   BOX
1939          23     516     485  106   BOX
1940          21     498     495  101   BOX
1941          21     487     495   98   BOX
All          457     496     489  101.43

Rival         A's     BOX     All
Best          46      32      78
2nd Best      35      28      63
3rd Best      33      36      69
4th Best      40      21      61
5th Best      32      32      64
6th Best      38      24      62
Worst         43      17      60
Total         267     190     457

First, Grove’s marks with the A’s were low for an ace back then. Other aces weren’t leveraged (Walter Johnson being the best example) because their teams used them as workhorses, throwing them out there as often as possible. Grove was never a workhorse like the Big Train. Also, the first two articles in this series showed lefties were especially likely to be used disproportionately against the best teams, which makes Grove’s pedestrian showing that much more unusual. Instead, Grove’s teammate (and fellow southpaw) Rube Walberg picked up the slack for him, and thus became baseball’s fourth-best-leveraged starter of all time.

However, Grove bests Ferrell in every way: he has the better career mark and the higher peak seasons, while Ferrell has the lower single seasons. Grove had a greater percentage of his starts against the best available team, and fewer against the worst. Even in terms of pure AOWP, without looking at leveraging, he beats Ferrell .496 to .493, a key point because part of Thompson’s argument was that Grove never had to face Philly’s own fantastic offense. Added bonus: the difference is most notable in the years they were teammates. It’s the damnedest thing.
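
The start shares behind that comparison can be checked straight from the career totals in the two rank tables. A quick Python sketch, where the dictionary and function names are mine and purely illustrative:

```python
# Career starts vs. the best and worst available opponents, plus total GS,
# copied from the "Total" / "All" columns of the two rank tables.
ferrell = {"Best": 44, "Worst": 52, "Total": 323}
grove = {"Best": 78, "Worst": 60, "Total": 457}


def share(counts, bucket):
    """Percentage of career starts that fell in the given rank bucket."""
    return 100 * counts[bucket] / counts["Total"]


for name, counts in (("Ferrell", ferrell), ("Grove", grove)):
    best = share(counts, "Best")
    worst = share(counts, "Worst")
    print(f"{name}: best {best:.1f}%, worst {worst:.1f}%")
```

That works out to roughly 13.6% of Ferrell’s starts coming against the best available team and 16.1% against the worst, versus 17.1% and 13.1% for Grove.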

Trying to Figure This One Out

Well, Dick Thompson did use innings pitched while I used GS, so that could be it. But I have a lot of trouble seeing that explaining the difference. Both men completed a large majority of their starts, so that shouldn’t make much difference. They both had numerous relief appearances, but in relief the situation matters more than opponent when it comes to leveraging. Hmmmmmm…

Dick Thompson is a brilliant researcher. Crikey, he even won SABR’s highest honor, the Bob Davids Award, for his work. As a general rule of thumb, when a society called the Society for American Baseball Research singles you out for brilliant baseball research, you’ve done something really right. So how the heck could he be so far off on this one? Well, that’s where this really gets, uh, “fun.”

You see, Dick Thompson became aware of my research and conclusions about Wes Ferrell around two years ago. (I had less refined ways of quantifying this stuff back then, but it came out the same for Wes.) And oh lordy, you think that the people who firebombed Dresden made themselves some enemies. Without referring to me by name, in post #134 in this thread, Thompson in short order 1) dismissed it as “SABRmetric drivel,” 2) denigrated me as “the guy who tallies data from retrosheet and passes it off a[s] original research,” and 3) my personal favorite, said I “plagarize [sic] material by hitting a few computer keys.” Looks like I touched a nerve. I don’t want to rehash the entire thread (my response is post #154, under my handle of Dag Nabbit), but I do think there’s a connection between his botched interpretation and his emotionally venomous outburst.

The key to unlocking this mystery resides in a seemingly unconnected comment at the end of post #134. He mentions that he’s researching an obscure pre-integration black pitcher named Bill Jackman. From what he’s uncovered, Jackman’s a better pitcher than Dick Redding, Jose Mendez, either of the Foster brothers, and possibly the great Satchel Paige. Mind you, almost all those guys are in Cooperstown. Meanwhile, a few years ago I read a book, Cool Papas and Double Duties, in which about 30 Negro League experts and about as many former Negro Leaguers (many of whom died before the book came out) named up to 27 picks for the best Negro Leaguers not in Cooperstown at that moment. None mentioned Jackman.

There’s a theme underlying Thompson’s interpretations. He argues his subjects of interest are better than anyone thinks. Far better. Jackman’s as good as Satchel Paige. Wes Ferrell was comparable to Grove. Even Rick Ferrell, one of the most denigrated Hall of Fame selections of all time, gets Thompson’s defense, as Thompson marshaled a series of quotes from old-time baseball men talking about how great Rick was.

One gets the sense that Thompson doesn’t just research his players, but falls in love with them. And when that happens, he becomes blinded to any and all negative information about them and entranced solely by the wondrous parts. This is another reason why I think using IP instead of GS wouldn’t explain the difference. The problem with his Wes Ferrell research fits a larger pattern of interpretational bias.

He ends up making claims so beyond what’s reasonable that some schmuck like me can poke some serious holes in an interpretation he spent years working on. It’s a shame because it’s great research, and a fantastic starting point, but it’s a cautionary tale on the need to keep a critical eye on what you’re doing. As a general rule of thumb, when some random schlub can spend an hour poking around Retrosheet and provide evidence your interpretation doesn’t hold water, you’ve done something really wrong.

This leaves me in a conundrum. Dick Thompson’s work on Wes Ferrell was the stimulus for all the work I’ve done on starting pitcher leverage. His insight into examining usage patterns is something I still treasure. He’s a far better researcher than I’ll ever be in his dedication to mining information on a specific player. Being an expert, however, does not mean one’s interpretations are sacrosanct and above question.

One final comment: the charge of plagiarism I find especially unfounded. All I have to say about that is that unless one makes no distinction whatsoever between using Retrosheet and plagiarizing Retrosheet, the charge is wholly without merit. Sorry for the detour from the main thrust of this series, but when you engage in a major study like this, and the person who inspired it has such open contempt for it, you really need to address that. Next article, this series will get back on track as I look at something that previous articles have shown to be extremely important to the concept of starting pitcher leveraging: platoon leveraging.

References & Resources
What the heck is AOWP+?: The stat I invented to judge pitcher leveraging. It’s AOWP/TOWP*100. AOWP is Average Opponent Winning Percentage. TOWP is Team’s (Average) Opponent Winning Percentage. To figure AOWP for a single season, you take the number of starts a given pitcher had against each opposing team and multiply that by that team’s winning percentage. After doing this for all rival squads, add up the products and divide by the pitcher’s total GS. The result is his AOWP. The same logic applies to TOWP, only here you look at how many games the team played against all rivals. If a pitcher’s used evenly, his AOWP will be the same as the TOWP, and he’ll have an AOWP+ of 100. If he’s used more against better teams, he’ll have a higher AOWP+. I calculated AOWP+ for 659 pitchers who started 182,000 games, including over two-thirds of all games from 1876 to 1969.
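
To make the arithmetic concrete, here is a minimal Python sketch of the single-season calculation described above. The function names, the three-opponent league, and every number in the example are invented for illustration; none of it comes from the study’s data, and real seasons in this era had seven opponents.

```python
def weighted_opp_pct(games_vs, win_pct):
    """Opponents' winning percentage, weighted by games against each one."""
    total_games = sum(games_vs.values())
    weighted = sum(n * win_pct[team] for team, n in games_vs.items())
    return weighted / total_games


def aowp_plus(pitcher_starts, team_games, win_pct):
    """AOWP / TOWP * 100 for one pitcher-season."""
    aowp = weighted_opp_pct(pitcher_starts, win_pct)  # pitcher's own starts
    towp = weighted_opp_pct(team_games, win_pct)      # team's full schedule
    return 100 * aowp / towp


# Hypothetical three-opponent league (invented numbers, not real data)
win_pct = {"A": 0.600, "B": 0.500, "C": 0.400}
team_games = {"A": 22, "B": 22, "C": 22}
ace_starts = {"A": 6, "B": 4, "C": 2}    # loaded up against the best team
even_starts = {"A": 4, "B": 4, "C": 4}   # spread evenly

print(round(aowp_plus(ace_starts, team_games, win_pct), 1))
print(round(aowp_plus(even_starts, team_games, win_pct), 1))
```

In this toy league the pitcher loaded up against the best team comes out around 107, while the evenly used one lands at 100. Winning percentages are written as decimals here rather than the three-digit form used in the tables, which makes no difference to the ratio.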

