# A Random Walk with FIP

Recently, I have begun to notice more and more disdain for Defense Independent Pitching Statistics (DIPS). A sizable group believes that DIPS metrics such as Fielding Independent Pitching (FIP) are poor because certain pitchers consistently “outperform” their FIP. More specifically, some starting pitchers consistently post Earned Run Averages (ERA) lower than their FIP, which would imply that there is something FIP fails to account for. While there is no denying that FIP is imperfect, all metrics are imperfect, so saying so is somewhat trivial. Unfortunately for those who use Matt Cain and the like as poster boys for “Why FIP is Flawed”, a small handful of counterexamples is incapable of delegitimizing a stat like FIP.

**Thought Experiment**

Let’s begin by making some assumptions:

1) FIP is a perfect statistic that accurately measures a pitcher’s true talent level.

2) ERA = FIP + ε, where ε can be seen as the luck or error term, so ε = ERA − FIP.

3) ε is independent and symmetrically distributed around zero, so ERA is symmetrically distributed around FIP.

4) There are 100 starting pitchers in the league (there are in fact about 150, but we’ll use 100 for simplicity).

Now that we have established this idealized situation, we can begin our thought experiment. Because ε is symmetrically distributed, a pitcher is exactly as likely to have an ERA lower than his FIP as he is to have an ERA higher than his FIP. It follows that in year 1, with 100 pitchers, we expect half of them, or 50, to outperform their FIP. Of those 50 pitchers, we would expect 25 (.5*50) to outperform their FIP again in year 2 by pure chance. Of those 25 pitchers, we would expect 12.5 to outperform their FIP in year 3 by pure chance. We can continue down this path, halving the number from the year before. In year 4, we would expect about 6 pitchers to have continued to outperform their FIP, and by year 5 we would expect just over 3 pitchers to have consistently outperformed their FIP by pure luck.
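The halving argument is just repeated multiplication by one half. Here is a minimal sketch of that arithmetic; the function name `expected_streakers` and its parameters are my own illustration, not part of any library, and it relies on the assumptions listed above.

```python
def expected_streakers(n_pitchers: int, years: int, p: float = 0.5) -> list[float]:
    """Expected number of pitchers who beat their FIP in every one of the
    first k years, for k = 1..years, when each year is a coin flip with
    probability p (p = 0.5 under the symmetry assumption)."""
    return [n_pitchers * p ** k for k in range(1, years + 1)]


counts = expected_streakers(100, 5)
for year, count in enumerate(counts, start=1):
    print(f"Year {year}: {count:g} pitchers expected to still be on a streak")
```

With 100 pitchers and five years, the list comes out to 50, 25, 12.5, 6.25, and 3.125, matching the numbers in the paragraph above.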

Because we started with 100 pitchers, we expect about three of them to outperform their FIP in 5 consecutive years by randomness alone. Many people point to those three pitchers and say, “Clearly, FIP is not accounting for something those three pitchers do.” Within this “simulation”, we can completely discount that argument, because we have assumed FIP to be perfect. Thought experiments are nice because they make a phenomenon easy to comprehend and visualize, but there is not a lot to glean if the experiment is completely incongruent with reality.
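For anyone who would rather see the randomness play out than take the arithmetic on faith, here is a rough Monte Carlo sketch of the thought experiment. Everything in it is an assumption for illustration: the function names are hypothetical, and ε is drawn from a normal distribution with a standard deviation of 0.5 runs, a value I picked arbitrarily. Any distribution that is symmetric around zero would behave the same way.

```python
import random


def simulate_league(n_pitchers=100, years=5, rng=None):
    """Count pitchers whose ERA lands below their FIP in every simulated year.

    Assumption: epsilon = ERA - FIP is an independent draw each year from a
    normal distribution centered on zero (sigma = 0.5 is arbitrary).
    """
    rng = rng or random.Random()
    streakers = 0
    for _ in range(n_pitchers):
        # A negative epsilon means ERA < FIP, i.e. the pitcher "outperformed".
        if all(rng.gauss(0.0, 0.5) < 0 for _ in range(years)):
            streakers += 1
    return streakers


def average_streakers(n_leagues=2000, seed=2012):
    """Average the streak count over many independent simulated leagues."""
    rng = random.Random(seed)
    total = sum(simulate_league(rng=rng) for _ in range(n_leagues))
    return total / n_leagues


print(f"Average five-year streakers per 100-pitcher league: {average_streakers():.2f}")
```

The average should land near the 3.125 pitchers predicted above, even though FIP is "perfect" by construction in every simulated league.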

**Reality Check**

I began by working backwards, looking at starting pitchers in 2011. Thankfully, our leaderboard has a stat called E-F, which is precisely the ε term I described above. 50% of the pitchers outperformed their FIP in 2011 (this is a nice start). Of the 50% that outperformed their FIP in ’11, 41% (a bit lower than we would expect) outperformed their FIP in 2010, giving us about 22% of our original group of pitchers (in the thought experiment we had 25% at this point). Of the remaining 22%, 52% outperformed their FIP in 2009, giving us about 11% of the original group, very close to the expected 12.5%. Of the remaining 11%, 56% outperformed their FIP in 2008, giving us about 6% of our initial group, almost identical to the 6.25% from the thought experiment. Finally, of the just over 6% that outperformed their FIP for four consecutive years, 60% outperformed their FIP in 2007, giving us a not so unexpected final total of 3.6%. If you look at how these numbers compare to our thought experiment above, the similarities are staggering.

I went through the same process again, but this time starting in 2007 and working my way to 2011. The results were 56%, 25%, 11%, 6.7%, and finally 4% of the original starting pitcher group in 2007, 2008, 2009, 2010, and 2011 respectively, which again is strikingly similar to our idealized situation above. The pitchers that outperformed their FIP for 5 consecutive years were Ted Lilly, Jeremy Guthrie, and of course, Matthew Cain.
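To make the comparison concrete, the year-over-year rates from the backwards pass (2011 to 2007) can be chained together and set next to the idealized 50% rate. This is only a sketch: the inputs are the rounded percentages quoted above, so the running totals can differ slightly from the figures in the text, which were presumably computed from the unrounded leaderboard data. The function name `cumulative_share` is my own.

```python
from itertools import accumulate
from operator import mul

# Share of the remaining group that outperformed FIP in each year, as quoted
# in the text for the backwards pass (2011, 2010, 2009, 2008, 2007).
reported_rates = [0.50, 0.41, 0.52, 0.56, 0.60]
# The thought experiment assumes a coin flip every year.
idealized_rates = [0.5] * len(reported_rates)


def cumulative_share(rates):
    """Running product: share of the original group still on a streak."""
    return list(accumulate(rates, mul))


observed = cumulative_share(reported_rates)
idealized = cumulative_share(idealized_rates)
for year, obs, ideal in zip(range(2011, 2006, -1), observed, idealized):
    print(f"{year}: observed {obs:.1%} of original group vs idealized {ideal:.1%}")
```

The final observed share works out to roughly 3.6% against an idealized 3.125%, which is the comparison the paragraph above is making.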

This doesn’t mean that Cain & Friends have solely benefitted from luck, nor does it mean that FIP is in fact perfect, but it does mean that using Cain and others like him to discredit FIP doesn’t make sense.


Many have a more nuanced view. If you project next-year ERA from one-year statistics (i.e. 160-240 innings), FIP will on average be more accurate than ERA. If you project next-year ERA from longer-term statistics (say, 600+ innings), ERA will on average be more accurate than FIP. FIP is a tool, and (like all tools) has its limitations.