Did the Rocket win St Louis the World Series?

If you read the THT mailbag last Wednesday, you probably noticed Anthony Weekley’s question asking whether the Astros would have made the postseason had Clemens returned at the start of the season rather than waiting until June 22.

To remind you, here is Anthony’s question in full:

In looking back at the 2006 season, I wonder what might have been for the Houston Astros if Roger Clemens had decided to come back earlier than he did. Would two, four, or six weeks more of Clemens’ production been enough for the Astros to have taken the Central from the Cardinals? If Clemens had come back earlier than he did, might we have had a different World Series champ than the St. Louis Cardinals? Would the Detroit Tigers have accomplished a most remarkable story by winning the 2006 World Series?

My answer was that, had the Rocket not postponed his 2006 debut to the height of midsummer, had he repeated his second-half performance in the first half AND had he not seen a dropoff in performance from the additional workload, then the ‘stros would have saved 16 extra runs, which is a shade over 1.5 wins. I concluded that the Astros may have tied with the Cards but would likely not have overhauled them in the NL Central lollygag. Probably.

Having mulled over the issue for the past week, I’m not sure whether I was 100% right. The theory is fine but there are many other factors to consider in such a small sample. Each game has a specific context. It may sound obvious but remember that pitchers win games only if they allow fewer runs than their team scores. To see what I mean consider Jaret Wright. Wright, a quite terrible pitcher, will win more games pitching for the Yankees than he would for Houston simply because as a Yankee you have Jeter, Rodriguez and Giambi coming in to bat for you rather than Everett, Ausmus and Taveras. And I don’t mean “win” in the purely statistical sense of a win-loss record; I mean whether, given the performance of the relief corps and the batters in a particular game, a starter can prevent enough runs to either win or tie the game.

Do you follow? Take Game 1 of the World Series as an example, which the Cardinals won 7-2. Reyes pitched eight strong innings, giving up two runs. The bullpen gave away nothing and the Cardinals’ batters drove in seven runs. All things being equal, and ignoring for the minute that they never are, Reyes could have conceded six runs and still won the game. In other words you’d even expect Jaret Wright to bring home the victory.

So, to fully answer Anthony’s question we need to look at each individual game that Clemens would have pitched in and work out whether his presence would have changed the result, all other things being equal. That doesn’t quite give us a full answer because we also need to consider what toll pitching for the full year would have taken on Clemens’ arm. Stepping on to the mound for Opening Day is all well and good provided your arm doesn’t fall off by the All-Star break.

April to June

Let’s look at the Astros’ rotation when the season opened. Because a couple of early games with the Giants were rained out, the Astros fielded a four-man rotation for the first two weeks of the season.

Opening Day Rotation

Midway through April Brandon Backe went on the DL, and Taylor Buchholz and Fernando Nieve were promoted. The rotation, and each starter’s ERA until midsummer’s day, looked like this:

  Starter    ERA
1 Oswalt     3.21
2 Pettitte   5.34
3 Rodriguez  4.45
4 Buchholz   6.00
5 Nieve      4.67

Give or take a skipped start, this is how the rotation stayed until Clemens rolled up for duty on June 22.

We all know how badly Pettitte pitched before the break, but an ERA of 5.34, really? What’s more surprising was that he wasn’t even the worst starter on the staff: Taylor Buchholz managed to post an ERA of 6.00 over the same period (err, maybe that isn’t surprising after all). As I alluded to in my answer to Anthony’s question, Clemens replaced Nieve, who had a semi-respectable ERA of 4.67.

Let’s take a look at all the games that Nieve started for the Astros and how he fared:

Date        Win / Loss Nieve IP   Nieve ER Nieve RA Bullpen RA  RS
16-Apr      Won        4          2        2        3           8
25-Apr      Won        4 2/3      2        2        1           4
2-May       Won        7          4        4        1           8
7-May       Loss       5 1/3      5        5        0           3
12-May      Won        5 1/3      1        1        1           12
17-May      Loss       3 2/3      6        7        3           1
23-May      Loss       7          2        2        2           1
28-May      Won        6 2/3      4        4        0           5
4-Jun       Loss       5          2        2        4           4
9-Jun       Won        7 1/3      2        2        0           7
15-Jun      Won        5 2/3      2        2        0           3
TOTAL                  61 2/3     32       33       15          56

Despite his 4.67 ERA, Nieve actually pitched reasonably well. In games that he started Houston won seven times and lost four. In total, Nieve allowed 33 runs to score in his starts. Had Clemens been in his place, runs allowed would have dropped to about 17 assuming he extended his June to September performance back to April. So, how many of the games that Nieve started would the Astros have won had Clemens been on the mound?
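The arithmetic behind these figures can be checked with a short script. This is only a minimal sketch: the helper names are made up for illustration, the 17-run figure for Clemens is taken from the text as given, and the conversion of roughly 10 runs per win is a common rule of thumb assumed here rather than something the calculation above spells out.

```python
from fractions import Fraction


def parse_innings(text):
    """Convert baseball innings notation such as '61 2/3' to a Fraction."""
    return sum(Fraction(part) for part in text.split())


def era(earned_runs, innings):
    """Earned run average: earned runs allowed per nine innings."""
    return float(9 * earned_runs / innings)


# Totals from the table of Nieve's starts.
nieve_ip = parse_innings("61 2/3")
nieve_er = 32
nieve_ra = 33

# Figure quoted in the text for Clemens over the same innings.
clemens_ra = 17

# ASSUMPTION: roughly 10 runs are worth one win (a common rule of thumb).
RUNS_PER_WIN = 10

runs_saved = nieve_ra - clemens_ra
wins_added = runs_saved / RUNS_PER_WIN

print(f"Nieve ERA as a starter: {era(nieve_er, nieve_ip):.2f}")
print(f"Runs saved with Clemens: {runs_saved}")
print(f"Approximate wins added: {wins_added:.1f}")
```

Under those assumptions the script reproduces the 4.67 ERA and the 16 runs saved, which is where the "shade over 1.5 wins" estimate comes from.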

We can use the Weibull distribution to work out what Clemens’ runs-allowed profile would have been for a given number of innings pitched. For example, here it is for four, seven and nine innings pitched:

Runs allowed  4 IP    7 IP     9 IP
0             26%     12%      8%
1             39%     24%      18%
2             25%     24%      21%
3             10%     19%      19%
4             3%      12%      14%
5             1%      6%       10%
6             0%      3%       6%
7             0%      1%       3%

If we apply this distribution to every game that Nieve started and assume that both the runs scored by the Astros offense and the runs allowed by the bullpen remained the same, we can estimate the probability that Houston would have won each game had Clemens started. Here are the game-by-game winning probabilities had Clemens pitched the same number of innings as Nieve:

Date       Result   Win prob if Clemens pitched Loss prob if Clemens pitched
16-Apr     W 8-5    100%                        0%
25-Apr     W 4-3    87%                         13%
2-May      W 8-5    100%                        0%
7-May      L 5-3    82%                         18%
12-May     W 12-2   100%                        0%
17-May     L 10-1   0%                          100%
23-May     L 4-1    0%                          100%
28-May     W 5-4    96%                         4%
4-Jun      L 6-4    10%                         90%
9-Jun      W 7-2    100%                        0%
15-Jun     W 3-2    80%                         20%
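The step from a runs-allowed profile to a single-game win probability can be sketched as follows. This is a hedged illustration rather than the exact spreadsheet calculation: the function name is hypothetical, the profile is the rounded 7 IP column from the earlier table (renormalized because the rounded percentages do not sum to exactly 100), and counting a tie after the starter and bullpen runs as half a win is an assumption, since the treatment of ties is not stated.

```python
# Clemens' runs-allowed profile for 7 IP, copied from the table above (percent).
PMF_7IP = {0: 12, 1: 24, 2: 24, 3: 19, 4: 12, 5: 6, 6: 3, 7: 1}


def win_probability(pmf, runs_scored, bullpen_ra, tie_weight=0.5):
    """Chance the team wins, holding offense and bullpen fixed.

    pmf maps runs allowed by the starter to a weight; weights are
    renormalized so rounded percentages still sum to one.
    ASSUMPTION: a tied score is credited as tie_weight of a win.
    """
    total = sum(pmf.values())
    prob = 0.0
    for starter_ra, weight in pmf.items():
        allowed = starter_ra + bullpen_ra
        if allowed < runs_scored:
            prob += weight / total
        elif allowed == runs_scored:
            prob += tie_weight * weight / total
    return prob


# 23-May: Nieve went 7 IP, Houston scored 1, the bullpen allowed 2.
print(win_probability(PMF_7IP, runs_scored=1, bullpen_ra=2))

# 2-May: Nieve went 7 IP, Houston scored 8, the bullpen allowed 1.
print(round(win_probability(PMF_7IP, runs_scored=8, bullpen_ra=1), 3))
```

The May 23 game comes out at zero because the bullpen alone allowed more runs than Houston scored, which is why no starter could have rescued it.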

Interestingly, Nieve didn’t pitch in that many close games. The only game where the outcome would have substantially differed had Clemens been on the mound was the 5-3 loss to Colorado on May 7. The only other loss where Clemens may have swung the game was the 6-4 defeat to Cincinnati on June 4, but the odds would have been heavily stacked against him as the pen took a pasting.

If we calculate a weighted probability, then Clemens’ record over the period would have been 8-3 (technically 7.55-3.45) rather than 7-4, putting the Rocket 1 up.
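The weighted record is simply the sum of the per-game win probabilities. A minimal sketch, using the percentages from the table above (the variable names are illustrative):

```python
# Win probabilities with Clemens starting, in date order from the table.
win_probs = [1.00, 0.87, 1.00, 0.82, 1.00, 0.00, 0.00, 0.96, 0.10, 1.00, 0.80]

expected_wins = sum(win_probs)
expected_losses = len(win_probs) - expected_wins

# Houston actually went 7-4 in Nieve's starts.
actual_wins = 7

print(f"Expected record: {expected_wins:.2f}-{expected_losses:.2f}")
print(f"Extra wins over Nieve: {expected_wins - actual_wins:.2f}")
```

The gain is a little over half a win, which rounds up to the one extra victory quoted above.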

Innings Pitched

Another factor to consider is whether Clemens would have pitched more innings and therefore relieved some pressure from the weary bullpen.

Nieve didn’t tend to pitch deep into games, often not going more than six innings, but the Rocket probably wouldn’t have fared much better. In 2006, Roger averaged a shade below six innings per start. Had he pitched from April it’s likely that, to preserve his effectiveness, Garner would have used him as sparingly as he used Nieve. No spare change there I’m afraid.

Clemens’ Effectiveness

Given that Clemens would at most have added only one extra victory to the win column, the question of whether increased use would have doused some of his effectiveness becomes more important.

Clemens, at 44 years old, is certainly no spring chicken, and we might logically expect his stamina to decline at a fair clip. For an average pitcher the decline phase will start by the time he is 30, although this can vary by pitcher type. However, Clemens is not your average pitcher: he is a workhorse who appears to have an almost timeless quality! Applying standard aging curves won’t work as our sample size will reduce to about three.

There were two reasons why Clemens delayed his 2006 season. First, he was keen to represent the USA in the WBC and felt he needed time to recuperate after pitching in the Classic. Second, he knew that in the previous two seasons he had probably pushed his body close to the limit and was worried about reduced effectiveness as the postseason dawned. Probably the best proxy for Clemens’ effectiveness over the course of a season is to see what numbers he put up month by month for 2004 and 2005, his first two years pitching at Minute Maid.

           April    May      June     July     August    September
2004       1.95     2.81     3.05     3.21     4.25      2.57
2005       1.03     1.54     1.97     1.32     1.70      4.33
2006                         2.38     2.00     2.54      2.33
TOTAL      1.47     2.10     2.53     2.17     2.76      3.00

As expected, we see Clemens’ ERA erode as each full season progressed. His decision to start in June in 2006 was correct, as that was the only year in the last three where we haven’t seen a performance drop-off. Each of the previous two years was characterized by a particularly bad month. In 2004 it was August, where he posted an ERA of 4.25, which is still respectable mind you; in his legendary 2005 season it was September, where his ERA was 4.33, incredibly the only month where it breached the 2.00 barrier!

Based on this, a reasonably safe assumption is that Clemens’ ERA would have snuck up by 0.70 had he started the season in April. Over the course of the 90 or so innings pitched after the All-Star break, that is the equivalent of seven runs, or just under one win. We can’t say for certain but it may have been enough to negate the extra win that Clemens may have secured by coming back early. We’re back to all-square.
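The conversion from an ERA increase to runs is worth writing out. A small sketch, in which the function name is hypothetical and the 10-runs-per-win factor is an assumed rule of thumb rather than a figure given above:

```python
def extra_runs(era_increase, innings):
    """Extra runs allowed when ERA rises by era_increase over the given innings."""
    return era_increase * innings / 9


# ASSUMPTION: roughly 10 runs are worth one win (a common rule of thumb).
RUNS_PER_WIN = 10

runs = extra_runs(0.70, 90)   # 0.70 ERA rise over about 90 post-break innings
wins = runs / RUNS_PER_WIN

print(f"Extra runs allowed: {runs:.1f}")
print(f"Approximate wins lost: {wins:.1f}")
```

That works out to about seven runs, or roughly 0.7 of a win, which is close to the gain of a little over half a win from the earlier game-by-game estimate.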

Pulling it Together

Okay, we’ve done all the grunt work so let’s calculate what difference it would have made to the playoff picture. Here’s a reminder of what the standings looked like at the end of 2006:

Team       Won      Loss     GB
St Louis   83       78       -
Houston    82       80       1.5

The ‘stros were 1.5 games back because the Cardinals didn’t play their final game of the season against the Giants. Depending on the outcome of that match-up Houston would have had to win either one or two more games to force a play-off or either two or three extra games to win the division outright. What were the odds?
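The games-back figure and the win thresholds described above follow directly from the standings. A minimal sketch with illustrative function names, assuming each extra Astros win replaces one of their actual losses:

```python
def games_back(leader, trailer):
    """Games behind, given (wins, losses) tuples for leader and trailer."""
    (lead_w, lead_l), (trail_w, trail_l) = leader, trailer
    return ((lead_w - trail_w) + (trail_l - lead_l)) / 2


def extra_wins_needed(cards_final_wins, astros_wins):
    """Return (wins to force a play-off, wins to take the division outright)."""
    to_tie = max(cards_final_wins - astros_wins, 0)
    return to_tie, to_tie + 1


cards = (83, 78)    # final standings, one game unplayed
astros = (82, 80)

print("Games back:", games_back(cards, astros))

# The unplayed Cardinals game against the Giants could have gone either way.
for label, cards_wins in (("Cards lose", cards[0]), ("Cards win", cards[0] + 1)):
    tie, outright = extra_wins_needed(cards_wins, astros[0])
    print(f"{label}: {tie} extra win(s) to tie, {outright} to win outright")
```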

   Cards last game    Clemens performance  Astros p-off odds  Cards p-off odds
1  Cards win          N/A                  0%                 100%
2  Cards lose         Maintained           39%                61%
3  Cards lose         Declined             12%                88%
   TOTAL                                   17%                83%

Hmm … low. Either way, the Cardinals would very likely have made the postseason. The most probable outcome would have been scenario three, which would have resulted in the Astros’ postseason probability being a meager 12%. So, to answer our lead-off question: No, in all likelihood the Rocket choosing to delay his 2006 debut didn’t help St Louis win the World Series.

Concluding Thoughts

Looking back at all this, one of the more revealing insights was Garner’s decision to demote Nieve to the pen. In all, Nieve’s record as a starter was an unimpressive 2-3, but the fact remains that Buchholz was worse. It is amazing to think that a pitcher as good as Roger Clemens would have been lucky to win the Astros even one extra game. It just goes to show why pitcher won-loss records are rightly vilified in the analytical community. Who knows? Had the Rocket replaced another pitcher such as Buchholz or even his great friend, Andy Pettitte, the outcome might have been different. Fortunately for the Cardinals that didn’t happen.

One final thing: The Rocket remains one of the best pitchers in the game. David Gassko’s excellent piece in the THT annual pegs him as the best overall. I pray that we see him pitching next season. Now, if only he’d take a pay cut to play for the Atlanta Braves … nah, it ain’t happening.

References & Resources
All pitching data came from David Pinto’s excellent daily database at Baseball Musings. Also I’d like to thank THT’s Sal Baxamusa for answering some questions that I had on the Weibull distribution (and also providing a Weibull spreadsheet).

