Finding Value in Pitcher Inconsistency

I’d like to talk to you today about pitcher evaluation.

I don’t mean evaluation in the sense of determining a pitcher’s talent level, his future value, or even his market value. I mean a pitcher’s past value. Or, perhaps, because value is so often misunderstood and misinterpreted, we’d be better off speaking in terms of contribution. That is, how do we determine the extent to which a player contributed to his team’s success (or failure)?

Of course, this goal of evaluating a pitcher’s past contribution isn’t new, and it hasn’t gone unaddressed. In fact, the entire purpose of WAR is to quantify a player’s value to his team. Whether with FIP or RA/9 or something in between, the various versions of WAR attempt to determine the extent of a pitcher’s contribution to the team.

But WAR doesn’t consider context. That is, it looks at a pitcher’s season as a whole, whether through the lens of FIP or something else, without taking into account the order and relative importance of the events it takes as input.

One simple example illustrates this point. Consider the following two hypothetical starting pitchers — Pitcher A and Pitcher B — and two of their hypothetical starts:

Pitcher     Start 1      Start 2      Total
Pitcher A   3 IP, 8 R    9 IP, 0 R    12 IP, 6.00 RA/9, 6.00 FIP
Pitcher B   6 IP, 4 R    6 IP, 4 R    12 IP, 6.00 RA/9, 6.00 FIP

(I didn’t include the FIP inputs, but assume the FIP is proportional to the RA/9 for each start.)
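For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The function and variable names are made up for illustration; the only formula involved is RA/9 = 9 × runs allowed / innings pitched.

```python
def ra9(innings, runs):
    """Runs allowed per nine innings."""
    return 9.0 * runs / innings


def season_ra9(starts):
    """RA/9 over a list of (innings pitched, runs allowed) starts."""
    total_ip = sum(ip for ip, _ in starts)
    total_runs = sum(runs for _, runs in starts)
    return ra9(total_ip, total_runs)


# The two hypothetical pitchers from the table: (innings pitched, runs allowed)
pitcher_a = [(3, 8), (9, 0)]
pitcher_b = [(6, 4), (6, 4)]

print(season_ra9(pitcher_a))  # 6.0
print(season_ra9(pitcher_b))  # 6.0

# Start by start, the two look nothing alike.
print([ra9(ip, runs) for ip, runs in pitcher_a])  # [24.0, 0.0]
print([ra9(ip, runs) for ip, runs in pitcher_b])  # [6.0, 6.0]
```

Same season line, very different starts, which is the whole point of the example.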

Pitcher A and Pitcher B were exactly the same with regard to their FIP-based peripherals and their run prevention. But they arrived at the total in very different ways. Pitcher A contributed negative value to his team in his first start, giving the team very little chance to win. In the second start, he almost guaranteed his team would win. On the other hand, Pitcher B pitched poorly in both starts. He gave his team some chance to win, but it wasn’t much.

We don’t have to guess which pitcher helped his team more. To evaluate each pitcher’s importance to his team while keeping each game in context, we have to shift perspectives. Instead of looking at each inning, as with RA/9, or at each plate appearance, as with FIP, we can look at each start and determine the probability of a team’s win, given the quality of the start.

FanGraphs already has one metric that evaluates players in a similar way: Win Probability Added, or WPA. WPA, though, looks at the entire game context. It includes the runs the pitcher’s team scores, in order to determine the extent to which the pitcher increased his team’s odds of winning. Obviously, the pitcher has (almost) no control over how much run support he gets; instead, we want to determine the probability that a team will win, given only the quality of the pitcher’s start.

Although the quality of a start can be determined in a number of ways, only three variables affect the team’s chances of winning: innings pitched, runs allowed and the base-out state if the pitcher leaves in the middle of an inning. The third factor can be addressed by looking at the run expectancy for the inning when the pitcher leaves, but for simplicity’s sake, let’s just consider the first two.

Results

Here, we’ll look at how often the starting pitcher’s team won, given the innings pitched and runs allowed. I pulled the data from 1993 onward, when the decline in relief innings per appearance began to stabilize, to get as large a sample as possible while still representing modern reliever usage. Below is a graph representing the results:

[Figure: win percentage by IP and RA]

Note that for readability and simplicity, the graph only shows starts with a whole number of innings pitched, and it’s limited to three or more innings pitched and seven or fewer runs allowed. Obviously, there are many more possibilities than just the ones above. The graph is simply intended to be a helpful visual reference for understanding a start’s value based on run prevention.
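As a rough illustration of how a table like this could be built (a sketch, not the code used for this piece), one can group starts by innings pitched and runs allowed and count how often the starter’s team won. The tuple layout, the function name `win_pct_table`, and the sample rows are assumptions for the example; the rows are invented, not Retrosheet data.

```python
from collections import defaultdict


def win_pct_table(starts):
    """Map (innings pitched, runs allowed) -> share of those starts the team won.

    `starts` is an iterable of (ip, runs, team_won) tuples.
    """
    games = defaultdict(int)
    wins = defaultdict(int)
    for ip, runs, team_won in starts:
        games[(ip, runs)] += 1
        if team_won:
            wins[(ip, runs)] += 1
    return {key: wins[key] / games[key] for key in games}


# Invented rows for illustration only.
sample = [
    (6, 3, True), (6, 3, False),
    (9, 0, True), (9, 0, True),
    (3, 7, False),
]

table = win_pct_table(sample)
print(table[(6, 3)])  # 0.5
print(table[(9, 0)])  # 1.0
print(table[(3, 7)])  # 0.0
```

With real data, it would also be worth keeping the game counts around, since rare IP/RA combinations will give noisy percentages.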

The results aren’t surprising. A nine-inning shutout almost guarantees a win for the pitcher’s team. A seven-run, three-inning start does not. Interestingly, the minimum required for a quality start (six innings and three runs) gives a team almost exactly a 50-50 shot at winning. A nine-inning, four-run start is significantly better than that minimum, even though it’s not considered a quality start.

With this data, including all starts, we can evaluate starting pitching using a different approach. To do so, we take each start, look at the pitcher’s innings pitched and runs allowed, and then pull the corresponding winning percentage for all equivalent starts from the data above. We can take that number as it is to mimic pitcher winning percentage, compare it to an average pitcher by subtracting 0.5, or compare it to a replacement-level pitcher by subtracting 0.38.
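The bookkeeping can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `start_values` is a made-up name, and `toy_table` holds invented win percentages, not values read off the chart. The 0.5 and 0.38 baselines are the ones named in the text.

```python
def start_values(starts, table, average=0.5, replacement=0.38):
    """Sum the looked-up win percentages over a pitcher's starts.

    `starts` is a list of (innings pitched, runs allowed) pairs and
    `table` maps those pairs to a team win percentage.
    """
    expected = [table[(ip, runs)] for ip, runs in starts]
    wins = sum(expected)
    n = len(expected)
    return {
        "wins": wins,
        "above_average": wins - average * n,
        "above_replacement": wins - replacement * n,
    }


# Invented win percentages for illustration only.
toy_table = {(3, 8): 0.05, (9, 0): 0.95, (6, 4): 0.40}

pitcher_a = start_values([(3, 8), (9, 0)], toy_table)
pitcher_b = start_values([(6, 4), (6, 4)], toy_table)

print(pitcher_a)
print(pitcher_b)
```

With these made-up percentages, Pitcher A comes out at about 1.0 expected wins and Pitcher B at about 0.8, even though their RA/9 is identical.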

Testing the theory: Jered Weaver and Matt Cain

A theory like the one presented above has little use or significance until we know how it actually affects player evaluation. In researching who I could compare to test the theory, I came across a fantastic case from the 2012 season: Matt Cain and Jered Weaver.

In 2012, Cain gave up 73 runs in 219.1 innings, which was good for a 3.00 RA/9. Cain never gave up more than five runs in a start, and he never pitched fewer than five innings in a start. However, he only had four starts where he didn’t allow a run to score, and seven others where he held the opponents to just one run.

Weaver was almost exactly the same as Cain as far as run prevention is concerned. He allowed 63 runs in 188.2 innings, which translated to a 3.01 RA/9. But unlike Cain, Weaver was inconsistent. He had one start in which he gave up eight runs in 3.1 innings, one in which he gave up nine runs in three innings and two other starts in which he failed to pitch past the first inning. On the other hand, he had nine starts in which he allowed no runs, and five more in which he only allowed a single run. In terms of starts with zero or one run allowed, Weaver led Cain 14 to 11.
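One detail that trips people up when reproducing these numbers: innings like 219.1 and 188.2 are box-score notation for 219⅓ and 188⅔, not decimals. A small Python sketch (the helper names are made up for illustration) that converts the notation and recomputes both RA/9 figures:

```python
def innings_to_float(ip_text):
    """Convert box-score innings such as '219.1' (219 and 1/3) to a number."""
    whole, _, thirds = ip_text.partition(".")
    return int(whole) + (int(thirds) if thirds else 0) / 3.0


def ra9(ip_text, runs):
    """Runs allowed per nine innings, from box-score innings notation."""
    return 9.0 * runs / innings_to_float(ip_text)


cain = ra9("219.1", 73)
weaver = ra9("188.2", 63)

print(f"Cain   {cain:.2f}")    # Cain   3.00
print(f"Weaver {weaver:.2f}")  # Weaver 3.01
```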

If we’re evaluating the two pitchers based solely on run prevention (ignoring park factors), they were nearly identical on a per-inning basis. But the distribution of quality across their starts was very different. Let’s see what the approach above says about these two pitchers.

Pitcher        Wins    Above Average   Above Replacement
Matt Cain      20.12   4.12            6.68
Jered Weaver   19.36   4.36            6.76

And the same numbers on a per-start basis (important since Cain had 32 starts to Weaver’s 30):

Pitcher        Wins    Above Average   Above Replacement
Matt Cain      0.629   0.129           0.209
Jered Weaver   0.645   0.145           0.225
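A quick sanity check on the two tables, sketched in Python with made-up helper names. The “Wins” totals and start counts are copied from above. One thing worth flagging: the “Above Replacement” columns are reproduced by a baseline of 0.42 wins per start (for example, 20.12 − 32 × 0.42 = 6.68), so that is the default assumed here. That value is inferred from the tables, not stated anywhere.

```python
# Season totals copied from the tables above.
rows = {
    "Matt Cain": {"starts": 32, "wins": 20.12},
    "Jered Weaver": {"starts": 30, "wins": 19.36},
}


def against_baseline(row, baseline):
    """Wins beyond what a pitcher at `baseline` win pct gets in the same starts."""
    return row["wins"] - baseline * row["starts"]


def summary(row, average=0.5, replacement=0.42):
    """Season totals and per-start rates: (wins, above average, above replacement)."""
    n = row["starts"]
    totals = (
        row["wins"],
        against_baseline(row, average),
        against_baseline(row, replacement),
    )
    return {"totals": totals, "per_start": tuple(t / n for t in totals)}


for name, row in rows.items():
    result = summary(row)
    print(name)
    print("  totals:   ", [f"{t:.2f}" for t in result["totals"]])
    print("  per start:", [f"{t:.4f}" for t in result["per_start"]])
```

The per-start rates are just the season totals divided by the number of starts, which is why Weaver can lead on a per-start basis while trailing in total wins.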

Well, that’s interesting. One would think that consistency — of which Cain was an excellent example — would be a desirable trait for a pitcher. However, using this WPA-based, support-neutral, per-start approach, Weaver — who was almost identical on a run-prevention basis — is given more credit than Cain. In fact, this approach gives him more wins above replacement than Cain, despite pitching two fewer starts.

This difference, while interesting, isn’t big. Over a season, this approach to evaluating pitching only made a difference of about a quarter of a win. The conclusion that consistency actually hurt the Giants’ chances of winning, all else equal, is somewhat surprising, though.

If we look at all pitchers in 2012 with this approach, and compare their rankings with this method to their ranking with RA/9, the largest differences confirm the above case study. Pitchers like CC Sabathia and Cliff Lee — both of whom are very consistent pitchers — were most hurt with this method; pitchers like A.J. Burnett and Johan Santana, who weren’t consistent, were helped.

Obviously, the best pitcher would be one who is perfectly consistent and amazing in every start, but this metric suggests that teams would win slightly more often if their pitchers skewed towards the extremes, rather than being reliably okay.

Future Considerations

I’ll be the first to admit there are some issues with the methodology above. I didn’t adjust for the run environment or the park, and the sample size for a few IP/RA combinations was likely not large enough to be confident in the accuracy of the win percentages. That being said, my goal here wasn’t to create a perfect metric, but to introduce a different, hypothetically better, method of evaluating a starting pitcher’s past value.

There are a number of paths on which I, or others, can take this approach in the future. The most obvious is to adjust the win probabilities for park and league, so as to more accurately represent each start’s true value.

Secondly, we can employ this approach using more sophisticated methods of pitcher evaluation. Using straight run prevention is the natural first step, but one could also use FIP, RE24 or even something like Game Score in a similar way. Once one determines the probability that an average team wins given the “score” of the start, the rest is essentially just addition.

The title of this piece refers to starting pitcher consistency, and that is certainly the practical implication of the approach that I offered above. However, the other purpose of this approach is philosophical. With regard to evaluating past performance, the most important unit of measurement for the starting pitcher, the fundamental one, is the start. The goal is to win, and the way to win is to pitch as well as possible in each start. It is only natural, therefore, to evaluate pitchers not in terms of their effectiveness in each plate appearance, or their effectiveness in each inning, but their effectiveness in each start. There are many ways to do so, and the above approach is almost certainly not the best. But approaching the issue of pitcher evaluation from this standpoint adds a new, and I believe better, perspective, one that I, or others, will hopefully expand on and improve in the future.

Much of the above data is courtesy of Retrosheet. It should also be noted that a similar idea to the one above was presented at Baseball Prospectus nine years ago, but none of my ideas or writing were taken from that piece. 






Matt is the founder of SaberSim, a daily sports projections and analytics company. Follow him on Twitter @MattR_Hunter and @SaberSim, or email him here and tell him all the things he should do to make the site better.


suicide squeeze
Member
2 years 11 months ago

Good stuff.

One thing I pondered when reading this is whether consistency is a skill that holds across seasons. I understand that less consistent pitchers appear to be a little more valuable, but I wonder if that ability to pitch more great starts is a repeatable skill. If it isn’t, then I don’t know if I would want to give a less consistent pitcher more credit (much like I’m not a fan of how WAR is so dependent on HR rate, which seems to be quite fickle).

James Gentile
Guest
2 years 11 months ago

Oh that IP/RA chart is absolutely fantastic, Matt. Already downloaded and saved for future reference.

Nicely done!

MDL
Member
2 years 11 months ago

That’s a great chart for possibly redefining the “Quality Start” metric. Or it’s at least a step in the right direction.

Moves Like Munenori
Guest
2 years 11 months ago

It also gives at least some support to the metric. Nine innings and four runs is the only example that doesn’t meet the criteria for a QS and still gives the team a positive chance of winning. The fact that it is so evenly sliced at 6 innings, 3 runs was surprising.

JayT
Guest
2 years 11 months ago

That’s what I was thinking too. I was quite surprised that 8 innings 4 runs was significantly worse than 6 innings 3 runs. That’s definitely against my gut feeling.

nd910
Guest
2 years 11 months ago

It seems counter-intuitive at first, but think about the difference between those six outs in the bullpen vs with the starter. By the 7th, the starting pitcher is usually over 90-95 pitches and has no platoon advantage to work with.

This type of result really makes me question the traditional SP/RP roles.

novaether
Member
2 years 11 months ago

Nine innings and four runs isn’t the only example. You’ve got 5 innings and 2 runs and other smaller stuff too.

The thing that really bothers me is that (according to this chart) 0 IP and 0 R is better than 6 IP and 3 R, yet the latter is a quality start. We can pick our arbitrary definition of “quality” (and perhaps there is value in resting the bullpen), but certainly it should be better than giving your team <50% chance of winning, no?

Xeifrank
Guest
2 years 11 months ago

Will the inconsistent pitcher tax the bullpen more? The bullpens would get the same overall innings to eat, but the inconsistent pitcher would cause the bullpen to eat more innings in short periods of time, leaving the bullpen vulnerable to a back to back poor start.

AJP13237
Guest
2 years 11 months ago

I think it might be plausible; however, the inconsistent pitcher would save the bullpen a lot of innings on his good days, so it might balance out.

KDL
Guest
2 years 11 months ago

Would a simple innings per start average answer this question…or is that too simple a solution?

KDL
Guest
2 years 11 months ago

For reference:
Cain in 2012: just over 6.2 innings per start.
Weaver in 2012: just under 6.1 innings per start.

So the cumulative effect appears to be negligible.
It would be a matter of whether there is a negative/positive/measurable impact on having bullpen innings bunched together.

Xeifrank
Guest
2 years 11 months ago

Right. The effect is having the bullpen innings bunched together.

Would your bullpen be able to handle a 3ip start or a 9ip start following a 3ip start?

KDL
Guest
2 years 11 months ago

My initial thoughts:

Obviously fewer innings are better. You can distribute them to better relievers more easily.

But the more innings scenarios are also more likely to be games you’re already out of. (I mean if the starter gives up 6 in 3 ip…how often does that put you in a good spot to win?) And you won’t necessarily be running your best relievers out in these situations.

On one hand…the inconsistent pitcher is either turning over low leverage situations, or situations that are very favorable to the ‘pen.

On the other hand…what the hell do I know

I have no idea how to research this effect. Ideas anyone?
Because I suspect we’ll learn that the best rotation for bullpen effectiveness is going to employ both types pretty equally. Running the risk of too many 3 ip starts will burn out a ‘pen. But having to pitch 2-3 ip every single night adds up, too.

Neil Weinberg
Editor
2 years 11 months ago

Agree with James, love the chart. I, too, am curious if consistency is a skill. Another thing to consider for the future is strength of opponent. If you have a bad start against the Astros, your team has a better chance to come back. How much context is too much?

badenjr
Guest
2 years 11 months ago

Being consistent is only bad if you are consistently bad. By definition, if you are inconsistent, you will have some good and some bad starts. If you are pitching in the majors though, you better have more good starts than bad ones. I’d imagine that inconsistency is only a desirable trait because the inconsistent pitchers who survive in MLB are those who are very good most of the time, despite the occasional clunker of a start.

payroll
Guest
2 years 11 months ago

The Cain-Weaver comparison is enlightening, but I’d love to see a broader comparison of the two types of pitchers and the actual game outcomes of those starts. Because I suspect the inconsistent starters leave messier base-out situations, on average, and in real games that would make them as valuable or even less valuable than the starter you can consistently pull at the conclusion of the 6th after 100 pitches and 4 runs, and follow with a reliever who gets to pitch a full inning with the bases clear at the outset.

DJAnyReason
Guest
2 years 11 months ago

Option 2!

Pitcher A   3 IP, 5 R    9 IP, 0 R    12 IP, 3.75 RA/9, 3.75 FIP
Pitcher B   6 IP, 3 R    6 IP, 2 R    12 IP, 3.75 RA/9, 3.75 FIP

Which Pitcher helped their team more?

Sam
Guest
2 years 11 months ago

According to his chart, pitcher A provided about a .600 winning percentage, while pitcher B provided about a .550. So technically pitcher A, I guess…

Brian
Guest
2 years 11 months ago

I just like that this chart demonstrates the fallacy of innings eating pitchers, which has always been annoying to me whenever the Twins try to rationalize paying a Pavano or Correia a lot of money. 7 innings/4 runs is much worse than 5 innings/3 runs.

channelclemente
Guest
2 years 11 months ago

A question for consideration. How do ‘holds’ of men on base by the reliever(s) that follow a starter in his last inning of record figure into such an analysis?

Anon
Guest
2 years 11 months ago

I haven’t looked into the details, but the analysis is based around the per start basis you are investigating. Even if the metric contains flaws, it might also contain some good insight.

http://www.breitbart.com/Breitbart-Sports/2013/07/30/value-add-baseball-born

olethros
Guest
2 years 11 months ago

The biggest flaw is the word “Breitbart” in the URL.

Anon
Guest
2 years 11 months ago

NBC, CBS, Disney (owns ABC and ESPN), and Fox all have news and political commentary components. Does that disqualify NBC Sports, CBS Sports, Fox Sports, and ESPN from having legitimate analysis of sports?

If your biggest critique does not even address the content of the metric discussed, that speaks highly of the metric (or poorly of your effort).

olethros
Guest
2 years 11 months ago

Not at all. But Breitbart’s well-documented history of blatant falsehoods disqualifies them from ever being considered a reliable source. And more importantly, it disqualifies them from ever having me click intentionally on a link to their site, because unique visitors are how they make money.

Anon
Guest
2 years 11 months ago

You are saying blatant falsehoods from part of a company disqualifies a completely separate sports portion of that company from being legitimate. Good luck finding a company to provide you sports news and analysis; that standard disqualifies NBC, CBS, Fox, and Disney, all of which have at one time or another stated blatant falsehoods from the news/opinion portion of the company (as have lots of other companies including newspapers and radio stations).

If you want to punish an entire company for the political portion of that company, that is your choice. However, I disagree about that disqualifying the sports portion from being credible.

Oh, Beepy
Guest
2 years 11 months ago

It’s almost like we go to alternative sources for our sports information, like these crazy websites or something.

olethros
Guest
2 years 11 months ago

No. NBC, CBS, Disney occasionally make errors in their reporting, which are almost always later corrected. The Breitbart organization actively created and propagated utterly false stories, often with a pronounced racist component to them. Even Fox, horrifically biased as they are, doesn’t outright fabricate their stories. Breitbart is to news what Jenny McCarthy is to medical advice.

The Sauce
Guest
2 years 11 months ago

But the metric is silly. It’s basically similar to the one in the article, except instead of judging a pitcher on run prevention it judges a pitcher based on run support.

Anon
Guest
2 years 11 months ago

Since you mentioned a ‘pronounced racist component’, I’ll give you an example from NBC: http://news.yahoo.com/blogs/upshot/nbc-fires-producer-over-edited-zimmerman-911-call-201124740.html

People don’t get fired for occasional ‘errors in their reporting’; they get fired for blatant falsehoods.

olethros
Guest
2 years 11 months ago

You do realize that proves my point, don’t you? NBC investigated and fired the guy within days of the incident, and publicly corrected the mistake. Breitbart continued to support and pay O’Keefe to manufacture fake news and commit felonies.

Anon
Guest
2 years 11 months ago

How does that prove your point? The firing only occurred because of the attention from other media outlets. I only provided a link to a single extremely visible case, but that doesn’t mean it is the only occurrence. NBC currently staffs people who created/transmitted falsehoods similar in style and scale.

nsacip
Guest
2 years 11 months ago

I think it would be interesting to apply this approach to an entire team. And also look at hitting this way as well.

What is the contribution of the second moment of runs scored and runs conceded on a team’s record? On a team’s ability to win a five or seven game series? On a team’s ability to win three five or seven game series?

My intuition tells me that the “more inconsistent team” would do better.

olethros
Guest
2 years 11 months ago

Why in hell is pitching 4 innings worse than pitching three when allowing 2-6 runs?

chuckb
Guest
2 years 11 months ago

When I read your comment, I thought, “surely he read the table wrong.” Nope. This is a great question and I’ve got no guess. Hopefully someone can suggest a reason.

chel
Guest
2 years 11 months ago

Maybe this is skewed because of spot starts by relievers and the rest of the bullpen pitching the entire game.

olethros
Guest
2 years 11 months ago

Surely that doesn’t happen frequently enough to juke the data that much, does it?

Jim Lahey
Guest
2 years 11 months ago

Fewer outs for the offense to score if 4 innings are over?

Also, if the sample size is appropriate, I think this could be showing that players with the lead early tend to coast, and are more prone to giving up a big lead in the later innings?

Steven
Guest
2 years 11 months ago

The problem with your first explanation is that if you take your starter out after 2 innings and 6 RA, you are hoping your pen will go 2 and give up 0. But that is what the starter did by himself by going 4 and giving up 6.

The second one is probably not the case, but the first one cannot be the case.

Mike
Guest
2 years 11 months ago

I think the reason for the extreme spikes you’re seeing in 9-inning starts resulting in wins, even with multiple runs being given up, is that it’s pretty unlikely for a pitcher to be left in for a complete game when they’ve given up four runs, except in the case where the game’s result is no longer in doubt. I suspect the average runs scored by the pitcher’s team in those high-run complete games is significantly skewed toward the extremes.

Al Dimond
Guest
2 years 11 months ago

Yeah… the number of runs the starter’s team has scored definitely impacts when the starter is pulled. Especially in the National League, where a trailing starter is much more likely to be pulled for a pinch-hitter than a leading one. It might be interesting to split the results NL vs. AL, as in the AL starters are probably pulled for reasons more purely related to their own performance.

B N
Guest
2 years 11 months ago

This. Number of innings a pitcher is left in actually captures how much offense their team put up. It is in no way independent. Ironically, runs against is also not independent for the same reason (as if you’re left in longer, you’re more likely to give up more total runs). I’m not quite sure how to adjust for this, but it gets pretty obvious at the extremes.

Ron
Guest
2 years 11 months ago

I agree completely. I was going to comment on selection bias, especially in the 9th inning, and was reading all the comments to make sure it wasn’t redundant.

Nathaniel Dawson
Guest
2 years 11 months ago

Thought I’d point out that Bill James actually hypothesized the same thing in one of his Abstracts back in the 80’s. A team that gives up 0, 2, then 10 runs would likely have a better win/loss record than if they allowed 4, 4, and 4. Turn it around the other way, a team that scores exactly 4 runs per game would have a better win % than a team that averaged 4 runs per game but were inconsistent. The conclusion being it’s better to have inconsistent pitching and consistent hitting, rather than the other way around.

He was talking about it on a team level, rather than individual pitchers, but the concept is the same.

sgnthlr85
Member
2 years 11 months ago

Great stuff

Biesterfield
Member
2 years 5 months ago

This is correct. Because baseball is a zero sum game, what is good for one team is bad for the other team.

The reason why this whole thing exists in the first place is because winning percentage as a function of runs is not a linear relationship. It follows a Weibull distribution. If you regress standard deviation of runs scored on winning percentage, you get a negative relationship. If you regress standard deviation of runs allowed on winning percentage, you get a positive relationship. Which confirms Matt’s article.

George Resor
Member
2 years 11 months ago

For some of the later innings/runs combinations, you should consider looking at all games where a team allowed only that many runs over that many innings, because then you would get a larger population; it could also remove some bias introduced by complete games. If the starting pitcher is from the visiting team, he can only pitch 9 innings if his team is winning or tied, so that would definitely bias your system, making pitching 9 innings super valuable. Your graph would imply that having your starting pitcher allow 6 runs over 9 innings is better than 4 runs over 8 innings, which is not true.

Ruki Motomiya
Member
2 years 11 months ago

Isn’t the graph somewhat skewed by sample sizes? Not a lot of players are going to go a full 9 innings while giving up 4 runs and not have a reliever come in: Odds are the matchup is a blowout if that occurs. On the other hand, a six-inning three-run start is much more common. (Though as I read on, you addressed this later, so…good on you and I should read everything before typing up comments!)

Something else I’d be interested in: How many times was the game lost because a later reliever messed up the game, rather than the SP? Or, in other words, a chart like that, but with if they left with a lead or not.

My initial inclination is that the debate over whether you want an inconsistent or consistent pitcher depends on the consistency of your offense: The more inconsistent your offense (i.e., scores 8 runs one game, 0 runs the next game), the more consistent you want your pitching (because you don’t need a gem to win when you score something like 8, and a gem doesn’t help you if you don’t score any). Likewise, a consistent offense (scores 4 runs each game) might prefer an inconsistent pitcher (since the guy who gives up 4 runs per game makes it close and iffy, but the inconsistent guy means you’ll win his good starts and lose his bad ones as per normal).

It might also depend on the rest of your rotation: I suspect a rotation full of inconsistent guys would be wasteful, for example. While the playoffs ARE a crapshoot, it is also something to consider: You really don’t want a guy to go out and throw out 6 ER when you only have 5-7 games max to decide a winner.

DrEasy
Guest
2 years 11 months ago

Say the opponent always scores 4 runs a game, for argument’s sake. Say your pitcher A has an ERA of 8 (!) over a season, but with a very high standard deviation, so that in quite a few games he manages to allow fewer than 4 runs. He’s obviously going to win more games than pitcher B, who consistently allows 8 runs per game.

Conversely, if pitcher A has an ERA of 1 over a season, during which he occasionally implodes and gives up more than 4 runs per game, he is going to lose a few games, whereas a pitcher who consistently allows 1 run per game will never lose.

It seems to me that that’s all there is to it. Now if you don’t like the simplifying assumption of the opposition systematically scoring 4 runs, you can run a Monte Carlo simulation to allow for whatever standard deviation you’d like. Results will also be different if you use a different mean instead of 4, obviously.

DrEasy
Guest
2 years 11 months ago

Oops, it’s your team who needs to score 4 runs per game, obviously.
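A minimal Monte Carlo sketch of this thought experiment, in Python. Everything specific here is an assumption for illustration: the pitcher’s team scores a Poisson-distributed number of runs with mean 4, each pitcher’s runs allowed are drawn uniformly from a short list of outcomes, and a tie counts as half a win.

```python
import math
import random


def poisson(rng, mean):
    """Draw one Poisson-distributed integer (Knuth's multiplication method)."""
    limit = math.exp(-mean)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def win_rate(runs_allowed_outcomes, team_mean=4.0, trials=20000, seed=1):
    """Estimate how often the team wins behind a pitcher.

    Each game, runs allowed is picked uniformly from `runs_allowed_outcomes`
    and runs scored is Poisson(team_mean). Ties count as half a win.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        allowed = rng.choice(runs_allowed_outcomes)
        scored = poisson(rng, team_mean)
        if scored > allowed:
            total += 1.0
        elif scored == allowed:
            total += 0.5
    return total / trials


# Bad pitchers, both averaging 8 runs allowed per game.
steady_bad = win_rate([8])
erratic_bad = win_rate([2, 14])

# Good pitchers, both averaging 1 run allowed per game.
steady_good = win_rate([1])
erratic_good = win_rate([0, 0, 0, 0, 0, 0, 0, 8])

print(f"bad:  steady {steady_bad:.3f}  erratic {erratic_bad:.3f}")
print(f"good: steady {steady_good:.3f}  erratic {erratic_good:.3f}")
```

Under these assumptions the erratic pitcher should come out well ahead when both pitchers are bad and behind when both are good. Changing `team_mean` or the outcome lists shows how the answer moves with the run environment.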

Michael
Guest
2 years 11 months ago

Would like to get your opinion on who the two weakest pitchers on my team are? This is a H2H points-based league with a heavy emphasis on wins (+7 points), QS (+3 points) and losses (-5 points). My pitchers in question are Erasmo Ramirez, Alexi Ogando, CC Sabathia, Nathan Eovaldi, Randall Delgado, Joe Kelly, Gerrit Cole, Sonny Gray, Ryan Vogelsong. I have a trade in place for draft picks in exchange for Kyle Lohse and Ricky Nolasco. Which two pitchers should I drop though?

Michael
Guest
2 years 11 months ago

Also: I’ve already clinched the playoffs. So, this would be a move to prep for the playoffs instead of a win-now kind of move.

Oh, Beepy
Guest
2 years 11 months ago

As a fantasy fan, go die in a fire.

If the article is in green and on FanGraphs and not RotoGraphs, leave it alone.

Max
Guest
2 years 11 months ago

does anyone have a possible reason why 6 ip/3 runs is significantly better than 8 ip/ 4 runs? something to do with bullpen use perhaps?

Kevin
Guest
2 years 11 months ago

Sort of makes sense. You have to imagine that whatever relief pitchers you put in will cumulatively have an ERA lower than 4.50, so most of the time you would end up in a better position using your bullpen than getting 4.50 ERA out of your starter for those innings.

Oh, Beepy
Guest
2 years 11 months ago

It’s also probably pretty unfair to include things like 3IP 0ER starts where it was likely some kind of injury or rain delay forcing the pitcher to exit.

Also, apparently if you’re gonna give up 2 runs, you better not go 4 innings. 3 is better.

dodgerhater
Member
2 years 11 months ago

Along the lines of what suicide squeeze said, would this information be useful as a predictive measurement of a pitcher’s “value,” or would it have to be relegated to retroactively evaluating a pitcher’s performance? It would seem that, like he said, a pitcher who has been excellent but inconsistent may very well have been more valuable on a per-game basis in one season than a pitcher who has been above average but consistent, but you would run into snags when trying to predict his future success. The example I’m thinking of offhand is Oliver Perez: for one season he was arguably the best pitcher in baseball, but he was plagued by inconsistency, and ultimately the poor performances began to drastically outnumber the dominant ones.

beckett19
Member
2 years 11 months ago

Nathaniel Dawson-
Your point about James hypothesizing on this matter is spot on. I didn’t read the article you reference, but I did read one in “The Politics of Glory” where he wrote a computer program to simulate a career of a pitcher and measured how many pennants the team won in the career. James too found that a short shooting star-type career was more valuable than a long term consistent career.

Tom
Guest
2 years 11 months ago

Something that kinda pertains to this, I’ve often wondered what is more valuable, an oft injured player that posts a good WAR but only in 100 games with a replacement level backup, or a player that posts that same WAR but in 160 games.

Ruki Motomiya
Member
2 years 11 months ago

That would depend on the value of the player replacing him and the salary of the team in question.

The first player is much more valuable to, say, the Yankees who can pay for a good replacement player, rather than…say…the Rays who might need to take a flyer on someone.

Ruki Motomiya
Member
2 years 11 months ago

Also, the oft-injured player’s injuries may sap his skill, which could cause a quicker decline.

knucks
Guest
2 years 11 months ago

Maybe this shows why a knuckleballer can still have value, even though some days it flutters, some days it doesn’t.

Chuck
Guest
2 years 11 months ago

That chart is really interesting to me, too, particularly the “value” of pitching the ninth. For example, 9 innings, 3 runs is (slightly) better than 8 innings, 1 run. I’m guessing this is because if the game was truly a nailbiter, the closer would be brought in, so that if the starter is going into the ninth inning having allowed 3 runs, he probably has a bigger lead than the closer going into the game with only one run scored so far?

Chuck
Guest
2 years 11 months ago

Ugh. Duh – and also if they’re pitching the ninth inning on the road they are either tied or winning. Duh, Chuck. Duh.

Jon Roegele
Guest
2 years 11 months ago

Enjoyed this Matt!

I find it interesting the jump in win% between a starter going 4 and 5 innings. When you say you only included starts with a whole number of innings pitched, do you mean you rounded down, or that starts with partial innings are removed?

I’m wondering if this phenomenon could be from cases where a starter is left in to TRY to get the W, and ends up leaving part way through the 5th inning with the base/out state in rough shape for the next guy. It would depend how you handled the partial innings I guess as to whether this is possible.

If so, I’d love to see you (or someone else I suppose) look at the 4th-5th inning in more detail, to see how much average win% drops by trying to leave in the starter long enough to qualify for the W as opposed to taking him out when win% would be higher with a new pitcher. It just seems like the jump 4th-5th is so much higher than other innings, that something like this must be up.

Nice work Matt!

Steven
Guest
2 years 11 months ago

I believe by “only included starts with a whole number” he meant that he has the data for every start length and runs-allowed amount, including those that go 4.1, 4.2, 5, etc. Thus, when he analyzed Cain vs. Weaver, he performed no rounding.

He only gave the chart with whole numbers because it would be too big of a chart to post otherwise. Furthermore, I don’t think he rounded for the chart either; he simply omitted the data for the other starts. That was my impression, anyway.

Leo
Guest
2 years 11 months ago

Great work!

Have you thought of quantifying this variability through the use of standard deviation (or higher moments like kurtosis) somehow?
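As a rough illustration of that suggestion, the spread of a pitcher's runs allowed per start could be summarized along these lines. This is only a sketch: the function names are hypothetical, the two game logs are the hypothetical Pitcher A and Pitcher B starts from the article's opening example, and nothing here is taken from the author's own method.

```python
from statistics import mean, pstdev

def excess_kurtosis(values):
    """Population excess kurtosis; None when every value is identical."""
    m = mean(values)
    s = pstdev(values)
    if s == 0:
        return None  # no spread at all, so kurtosis is undefined
    fourth_moment = sum((v - m) ** 4 for v in values) / len(values)
    return fourth_moment / s ** 4 - 3.0

def variability_summary(runs_per_start):
    """Mean, standard deviation and excess kurtosis of runs allowed per start."""
    return {
        "mean": mean(runs_per_start),
        "stdev": pstdev(runs_per_start),
        "excess_kurtosis": excess_kurtosis(runs_per_start),
    }

# Hypothetical logs from the article's opening example (runs allowed per start).
pitcher_a = [8, 0]  # one blowup, one shutout
pitcher_b = [4, 4]  # the same outing twice

print(variability_summary(pitcher_a))
print(variability_summary(pitcher_b))
```

Both pitchers have the same mean, so only the spread separates them. Two starts are far too few for kurtosis to say anything useful; a full season of starts would be the realistic input.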

Luke
Guest
2 years 11 months ago

I would feel more comfortable if this used FIP.

But this article sits uneasy with me, and I am not sure why.

Probably because the article looks at pitcher ‘wins’ as the end goal, and we all know that there is not a strong correlation between a pitcher’s talent level and ‘wins’.

Carl
Guest
2 years 11 months ago

Incredible article. Would suggest the following:

Have 2 charts: 1 for SP and 1 for relievers.

Re-run the numbers for the top Cy Young Award candidates each year for the last decade and see who had the better W-L record by this method (and so should have won the Cy Young).

Billy
Guest
2 years 11 months ago

Would inconsistent starters be more useful to teams with mediocre offense, while more consistent starters of the same overall caliber would be more useful to good offensive teams? It would seem so.

swanized
Member
2 years 11 months ago

I think pitcher consistency is valuable to a team that scores a lot of runs, while bad offensive teams would prefer inconsistent starters.

Say you have a starter who always goes 9 innings and gives up 4 runs. On a team that averages 2 runs a game, that starter almost never helps them win games. A team that averages 6 runs a game would love to have that starter, though.

On the flip side, if you have a starter who alternates between 9 shutout innings and 9 innings with 8 runs allowed, the team that averages 2 runs a game would love to have that guy, while he’d be useless on a team that averages 6 r/g.
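A quick way to sanity-check this example is a toy model. The sketch below assumes, purely for illustration, that the team's runs scored in a game follow a Poisson distribution with the stated average, that the starter pitches a complete game every time, and that a game tied after nine innings is a coin flip. None of those assumptions come from the article, and the function names are hypothetical.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """Probability that a Poisson(lam) offense scores exactly k runs."""
    return exp(-lam) * lam ** k / factorial(k)

def win_prob(runs_allowed, offense_mean):
    """Win chance in a complete game with a fixed number of runs allowed.

    Assumption: a tie after nine innings counts as a 50/50 game.
    """
    p_tie = poisson_pmf(runs_allowed, offense_mean)
    p_fewer = sum(poisson_pmf(k, offense_mean) for k in range(runs_allowed))
    p_more = 1.0 - p_fewer - p_tie
    return p_more + 0.5 * p_tie

def compare(offense_mean):
    """Return (consistent, inconsistent) win rates for the two starters."""
    consistent = win_prob(4, offense_mean)  # always 9 IP, 4 R
    inconsistent = 0.5 * win_prob(0, offense_mean) + 0.5 * win_prob(8, offense_mean)
    return consistent, inconsistent

for offense in (2.0, 6.0):
    steady, streaky = compare(offense)
    print(f"{offense:.0f} r/g offense: consistent {steady:.3f}, inconsistent {streaky:.3f}")
```

Under these assumptions the direction of the claim holds: the alternating starter wins more often in front of the 2 r/g offense, and the steady 4-run starter wins more often in front of the 6 r/g offense. In this toy model “useless” overstates it, though, since the alternating starter still wins every shutout.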

CircleChange11
Guest
2 years 11 months ago

I recall some information from here or a similar site that concluded that if a pitcher has a lot of talent, consistency is better. But if they have lesser talent, they are better suited by inconsistency.

The name that jumped to my mind was Edwin Jackson. For some reason teams allow him to absorb more than his share of bad outings where he’ll pitch 5-7 innings and allow 6-10 runs, which makes his overall season stat lines look a lot worse than they could.

I also notice that when broadcasters use the term consistency, they really mean consistently high performance. No one seems to praise a pitcher that allows 4 runs over 6 IP every time out. But a pitcher that mixes in 2 shutouts with 2 stinkers will be lambasted for inconsistency.

Obviously managers like consistency because they can plan for it.

Brandon Firstname
Guest
2 years 7 months ago

I’m almost halfway through writing a community article that does pretty darn exactly what you’ve done here. The research is the same, the presentation is the same. It’s all pretty crazy, just how darn close my article is to this.

I started out with a hypothetical two starts, went on to find the win values of all of the different IP/runs allowed, and then went on to calculate pitcher wins above average and WAR. If only I would have seen this article earlier, instead of doing all the digging myself :)

You handled the information very well, though, good article which I’m glad I found so that I didn’t end up submitting something pretty identical on accident.

obsessivegiantscompulsive
Member
2 years 6 months ago

Very interesting article, I found it fascinating!

You mention Game Score as a potential metric to use. Have you given any thought to using the PQS (Pure Quality Start) methodology from Ron Shandler’s book? It is a saberized method for evaluating a quality start.

I’ve been using it to examine how teams do when their pitchers have a quality start vs. not, and obviously, they win a lot, but so far I’ve been finding win rates of 70-80%, which puts a number on how much it affects winning.

The PQS methodology looks at the percentage of time the starting pitcher is dominant (4 or 5 PQS) and this relates squarely with your study here, trying to examine how consistent a pitcher is in delivering a quality start.

It is shocking to me that a more predictably good pitcher like Cain could be penalized for being good at what he does, while a more variably good pitcher would benefit. I look forward to your future studies into this phenomenon.
