(it takes me a while to get there, but in this post I propose an alternative to the current FanGraphs Points scoring system for pitchers)

Last week, LuckyStrikes asked me to do a post talking about and defending the scoring system used for pitchers in ottoneu FanGraphs Points leagues (I’m the one who developed it).  He pointed out that Doug Fister has been a top-20 pitcher thus far in FanGraphs Points, and went so far as to say that any scrub pitcher in San Diego or Seattle seems to do well in this system.

Here’s the thing: Fister arguably has been a top-20 pitcher thus far (or, at least, right on the fringes of top-20).  He has 2.7 WAR, which ranks exactly 20th in MLB right now among pitchers, with a 3.09 ERA and a 3.13 FIP in 125 innings!  He’s been fantastic.  In fact, the entire Seattle rotation has been fantastic:






Honestly, if a scoring system did NOT rank all of these pitchers highly, I’d be pretty concerned.  Maybe their xFIP indicates some of them won’t carry this forward.  But we’re scoring based on 2011 performance, not what they’ll do in the future.

To further illustrate the point, here’s a comparison of the top 100 starters’ FanGraphs Points totals with their current FanGraphs WAR:

Comparison of FanGraphs Points totals to fWAR for pitchers.

While not tremendously surprising given that they are both “built” upon FIP, it is the case that FanGraphs Points for pitchers tracks closely with FanGraphs pitcher WAR.  But there are differences.  Some of that is the fact that WAR corrects for park effects, while FanGraphs Points does not.  But more importantly, while WAR uses replacement level as its baseline, FanGraphs Points is designed to scale more like an absolute runs statistic such as wRC, so that it matches up with the hitting points.  That means that FG Points will give a lot more credit for playing time than WAR, which is the main reason that you see more scatter on the left side of this figure.  Some pitchers have thrown a lot of innings, but have not performed well in those innings, and so they accumulate little WAR but a decent number of FG Points.  Again, this is by design.  To get a better comparison to WAR, you really should look at points above a replacement player in the same number of innings, which I’m not doing here.

The point is, the FG Points system works pretty darn well in terms of capturing season-level pitcher value.  That said, while I’ve gotten almost no questions about the hitter points (I assume everyone loves them), I’ve fielded a ton of questions over e-mail and in other contexts about how pitching is scored in ottoneu.  Some people simply do not like it.  The reason I think that there are so many questions about the pitcher scoring is that, on a start-to-start level, it doesn’t always jibe with our perceptions.  Take, for example, Francisco Liriano’s no-hitter:
9 IP, 0 ER, 0 H, 0 HR, 6 BB, 2 K.
It was worth 31 points in the FanGraphs Points system, which is the score you’d expect from a fairly average, solid start.  And it was a NO HITTER.  Perhaps a really lucky no-hitter, but a no-hitter nonetheless.

What about Madison Bumgarner’s disastrous start:
1/3 IP, 8 ER, 9 H, 0 BB, 1 K
That was worth +4 points!  A positive score when it was a historically bad appearance.

The reason they are scored as they are is that FanGraphs Points uses only innings pitched, strikeouts, walks, and home runs to score pitchers (just like FIP).  It works great at the season level, but at the single-start level it can lead to some outcomes that are admittedly counterintuitive.  That doesn’t necessarily mean that they are wrong, or in need of fixing.  But it takes a leap of faith to accept that Bumgarner’s start wasn’t really as bad as it appeared.

A Proposed Alternative

In an effort to come up with an alternative, I’ve been fiddling with a modified points system that includes point penalties for hits allowed.  I’ll provide discussion of the methodology at the end of the post, so scroll down for that.  But here’s what I’ve come up with:

IP: +7.4
K: +2
H: -2.6
BB: -3
HBP: -3
HR: -12.3
SV: +5
HLD: +4

The only differences between this system and FanGraphs Points are the points attributed to innings (previously +5), hits (previously 0), and home runs (previously -13).
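
To make the two sets of weights concrete, here is a minimal Python sketch that scores a single pitching line under each.  The weights for the current system are inferred from the differences just listed, and the names (`NEW_WEIGHTS`, `OLD_WEIGHTS`, `score`) are placeholders of my own for illustration, not anything taken from the ottoneu site.

```python
# Point weights per event. NEW_WEIGHTS is the proposed list; OLD_WEIGHTS is
# inferred from the stated differences (IP +5, H 0, HR -13).
NEW_WEIGHTS = {
    "IP": 7.4, "K": 2.0, "H": -2.6, "BB": -3.0,
    "HBP": -3.0, "HR": -12.3, "SV": 5.0, "HLD": 4.0,
}
OLD_WEIGHTS = {**NEW_WEIGHTS, "IP": 5.0, "H": 0.0, "HR": -13.0}


def score(line, weights):
    """Sum weight * count over the stats present in the line.

    Stats that are absent from the line contribute nothing.
    """
    return sum(weights[stat] * count for stat, count in line.items())


# Liriano's no-hitter, as quoted earlier in the post.
liriano = {"IP": 9, "H": 0, "HR": 0, "BB": 6, "K": 2}

print(round(score(liriano, OLD_WEIGHTS)))  # 31
print(round(score(liriano, NEW_WEIGHTS)))  # 53
```

Partial innings should be entered as true fractions (4 1/3 innings as `4 + 1/3`, not `4.1`), or the innings credit will be slightly off.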

Here’s a comparison between the current FanGraphs Points and this new system:

As you can see, it doesn’t make much of a difference at the half-season level.  The correlation is 0.984 between the two scores, and the standard deviation of the difference among the top 200 pitchers is 32 points.  By season’s end, based on past years, the standard deviation will be ~46-50 points.  So we’re talking about a difference of one or two starts’ worth of points (or less) for two thirds of all pitchers.  Furthermore, the average scores track very well, with a slope very close to one.  Better pitchers score slightly (~20 points?) higher in this system, while poorer pitchers score a comparably lower amount.  It’s not a big difference.

The pitchers who get the biggest bump moving to this new system are those who have an ERA that is substantially better than their FIP: Justin Verlander (+108 pts), Jared Weaver (+87 pts), Josh Beckett (+84 pts), etc.  And the pitchers hurt the most are (mostly) those whose FIPs are lower than their ERAs: Chris Carpenter (-47 pts), Jeff Francis (-65 pts), Jake Westbrook (-63 pts), etc.  So, what we’re doing in this system is rewarding pitchers for getting “lucky” and penalizing pitchers for getting “unlucky.”

What do we gain from this?  Here is a table showing all of last Sunday’s starts, along with their scores by FanGraphs Points and this new points system:

Three takeaways:

1. For the most part, it doesn’t matter much.  Most games are scored within 4-7 points of each other under the two systems.

2. It does matter in those games in which pitchers give up far more or far fewer hits than expected.  For example, Jake Peavy had a bad start on Sunday, lasting only 4.1 innings and giving up 5 runs.  The old system rates it as a not-good-not-terrible 20-point start: he walked 2, struck out 2, allowed no home runs, but didn’t last 5 innings.  The new system rates it as a bad, near-zero start because he gave up 10 hits in that time, which led to a lot of runs.

Similarly, Cole Hamels was very good over 8 innings on Sunday, striking out six, walking two, and allowing only 3 hits and one run.  FanGraphs Points gives him 46 points for the effort, which is great.  But this new system gives him 11 more points because he only allowed three hits, bumping his total up to 57 points overall.

Two other examples: Francisco Liriano’s no hitter I mentioned earlier?  Instead of 31 points, it’s worth 53 points under this new system.  And Madison Bumgarner’s disaster?  Now, it’s worth -19 points instead of +4.

3. It does tend to be the case that the best starts get a few more points under this system, while the poor starts get slightly fewer points.  That means that the spread for what a pitcher might score on a given day has increased under this system.  It also makes streaming more dangerous, which is probably a good thing.

Is This a Change Worth Making?

I have run this by Niv Shah, who owns/runs the ottoneu games, and he is considering changing over to this system in ottoneu points leagues and/or in pick six at some point (though he has yet to make a decision).  And unless I uncover significant problems with it, I’m also going to propose we move to this in the Yahoo league I run, which uses pretty much the same points system as ottoneu.  As I see it, these are the pros and cons to changing the system:

Pros:

* Individual starts are more intuitively scored in the new system, and are a better match for what happened on the field.
* Overall season scores don’t change a lot (few pitchers would see a real change in value), but when they do they more closely follow stats like ERA than FIP (i.e. they match on-field performances better).
* The rare pitcher who has the ability to consistently outperform his FIP will be rewarded in this system.
* Fantasy managers may be able to use knowledge of FIP/xFIP and such to gain an advantage over less savvy managers, as in most other fantasy leagues.

Cons:

* Change means that some pitchers will inevitably see a change in their value.
* Fielders will have more influence over pitcher scores than before, and the same is probably true for park effects.
* Lucky pitchers get rewarded more than they used to when they are lucky, and unlucky pitchers get penalized.  This is how baseball works, of course, so we could view this as a “Pro.”  But we need to acknowledge that this is, by and large, what we’re imposing by adopting this system.

To me, the pros of a system that better matches real baseball outweigh the cons.  A Google spreadsheet showing current 2011 statistics and scoring by both systems can be found here.  Please feel free to take a look.  I’d be interested to hear what you think about this in the comments below.

A few words on the methods (you can skip this if you want)

To figure out appropriate coefficients for hits and home runs under this new system, I used the pitcher statistic base runs equation found here to estimate total runs allowed for the 2010 season.  I then used the +1 method to determine how many extra runs would come from a hit or a home run.  This simply means you add one to the home run total and see how much your estimated total runs increases while holding the other factors constant.  Using this, an extra home run (but no extra hits) gives +1.23 runs, which equates (using 10 points per run, as in the hitting scoring) to -12.3 points.
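
For anyone who wants to try the +1 method themselves, here is a rough Python sketch.  Two loud assumptions: the `base_runs` function uses a commonly cited pitcher version of Base Runs in which total bases are estimated from hits and home runs, which may not be exactly the equation linked above, and the `totals` are made-up numbers roughly the size of one team's 2010 pitching line, not real data.

```python
def base_runs(h, bb, hr, ip):
    """Estimate runs allowed as A*B/(B+C) + D.

    ASSUMPTION: a commonly cited pitcher Base Runs variant; the equation
    referenced in the post may differ in its coefficients.
    """
    total_bases = 1.12 * h + 4 * hr  # estimated, since 2B/3B are not inputs
    a = h + bb - hr                  # baserunners
    b = (1.4 * total_bases - 0.6 * h - 3 * hr + 0.1 * bb) * 1.1  # advancement
    c = 3 * ip                       # outs
    d = hr                           # runs that score automatically
    return a * b / (b + c) + d


def plus_one(estimator, totals, stat):
    """Runs added by one extra `stat`, holding every other total constant."""
    bumped = {**totals, stat: totals[stat] + 1}
    return estimator(**bumped) - estimator(**totals)


# HYPOTHETICAL totals, for illustration only.
totals = {"h": 1420, "bb": 530, "hr": 150, "ip": 1445}

hr_value = plus_one(base_runs, totals, "hr")  # extra HR, hits held constant
hit_value = plus_one(base_runs, totals, "h")  # extra hit, HR held constant
print(round(hr_value, 2), round(hit_value, 2))
```

Because `plus_one` takes the estimator as an argument, the same helper works with any other run estimator you would rather plug in.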

An extra hit provided an extra 0.515 runs.  However, I opted to divide this total in two before converting to points, which resulted in the current point value of -2.6 points.  I did this for two reasons.  First, it adds some regression into the system.  Hits are important, but they are also subject to a lot of things that pitchers cannot control, and therefore I think it’s appropriate to reduce their impact a bit.  Second, if you don’t do this, you start seeing the scores for top pitchers soar from the 1200 range up into the 1400s, which completely obliterates hitter scores.
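
The conversion from run values to point penalties is simple arithmetic, sketched here in Python.  The function name and the `regression` parameter are my own labels for the halving step described above.

```python
POINTS_PER_RUN = 10  # same scale as the hitting points


def penalty_points(runs_per_event, regression=1.0):
    """Convert a marginal run value into a (negative) point penalty."""
    return -runs_per_event * regression * POINTS_PER_RUN


hr_points = round(penalty_points(1.23), 1)                    # -12.3
hit_points = round(penalty_points(0.515, regression=0.5), 1)  # -2.6
print(hr_points, hit_points)
```

Without the halving (`regression=1.0`), a hit would cost about 5.2 points instead.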

Innings Pitched totals were just increased until the scores were on the same scale as the previous system.  I wish I could claim there was a better justification for them being 7.4 points, but there really isn’t.  It’s just a value that “works.”
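
One way to formalize "on the same scale" is to ask for the innings weight that leaves the total points across a pool of pitchers unchanged.  That is an assumption on my part about what the scale matching means, not the procedure actually used, and the three stat lines below are invented purely for illustration.

```python
def calibrate_ip_weight(lines, old_ip=5.0, old_hr=-13.0, new_hr=-12.3, new_h=-2.6):
    """IP weight that makes total points across `lines` equal in both systems.

    Only the IP, H, and HR weights differ between the systems, so every
    other stat cancels out of the comparison.
    """
    ip = sum(p["IP"] for p in lines)
    h = sum(p["H"] for p in lines)
    hr = sum(p["HR"] for p in lines)
    # Solve old_ip*ip + old_hr*hr == w*ip + new_h*h + new_hr*hr for w.
    return old_ip + ((old_hr - new_hr) * hr - new_h * h) / ip


# HYPOTHETICAL season lines, not real pitchers.
pool = [
    {"IP": 200, "H": 180, "HR": 18},
    {"IP": 180, "H": 185, "HR": 22},
    {"IP": 150, "H": 160, "HR": 20},
]

ip_weight = calibrate_ip_weight(pool)
print(round(ip_weight, 2))
```

With league-like hit and home run rates this lands in the mid-7s, which is at least in the neighborhood of the 7.4 that “works.”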

I expected the home run scores to decrease more than they have once we added hits, maybe getting them into the -8 point range.  The reason they’ve continued to be important is that they, combined with hits, are used to estimate doubles and triples.  Once you start adding in points for hits, and adding more points for outs to counter this, you need the points on home runs to keep ground ball pitchers valuable.

Also, I considered changing the values of walks, HBP, and strikeouts as well.  However, the spreadsheet indicated that the value of walks and HBPs should increase in magnitude (to -3.3 pts), and the value of strikeouts should decrease substantially.  This would mean adding even more points to innings pitched to offset those losses, which led (like with the hit points) to extreme values at the top of the scale, and to a loss in value to certain pitchers that seemed inappropriate.  Therefore, I opted to keep them as they are.

