Should we be trying to predict FIP instead of ERA?

In the past months, I’ve written constantly about ERA estimators. During that time I have moved all the way from being a major advocate for more advanced, complex ERA estimators (for example, xFIP and SIERA) to my current stance: I’m really not sure there is any use for more complex estimators. Honestly, moving that far across the spectrum was not easy, but all the numbers that I found backed the simpler estimators.

I have decided to move away from the world of estimating runs, for at least one article. Based on a suggestion from Colin Wyers, I’ll attempt to find the best way to predict the individual components of fielding independent pitching (strikeouts, walks, home runs). The ability to project those elements for a pitcher in the coming season would be valuable to a major league team.

I tried a slew of different multiple regressions using measures based on PITCHf/x data and from Baseball Info Solutions, but no measure that I could find was more successful at predicting those three components than the components themselves. However, this failed research brought me to a different, much simpler idea.

I have spent the past months trying to find the best ways to predict runs. The problem there is that the number of runs a pitcher allows is affected by a lot of things outside of a pitcher’s control, which is why we have measures like FIP. I’m currently in the camp of using FIP, on a per-season basis, as my starting point for analyzing a pitcher’s true performance rather than runs allowed or ERA.

So, in the midst of trying to predict the components of FIP, I decided instead to attempt to predict FIP. If FIP is better than ERA at describing a pitcher’s performance, then why don’t we try to predict FIP instead of runs?

My goal had thus changed from finding the best measure for predicting runs to finding the best measure for predicting FIP.

The study

I took a sample of starting pitchers who threw at least 120 innings in Year X and at least 100 innings in Year X+1 for the years 2004-2012, the same sample I used for predicting runs in an earlier article.
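The sample construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Season` record layout, the `pair_seasons` helper, and the pitcher rows are hypothetical, and I am assuming that both seasons of a pair must fall within 2004-2012. Filtering to starting pitchers is left out and would need to happen upstream.

```python
from collections import namedtuple

# Hypothetical record layout; the field names are assumptions for illustration.
Season = namedtuple("Season", ["pitcher", "year", "innings", "fip"])


def pair_seasons(seasons, min_ip_x=120, min_ip_next=100,
                 first_year=2004, last_year=2012):
    """Return (Year X, Year X+1) season pairs that meet the innings cutoffs."""
    by_key = {(s.pitcher, s.year): s for s in seasons}
    pairs = []
    for s in seasons:
        nxt = by_key.get((s.pitcher, s.year + 1))
        if nxt is None:
            continue  # the pitcher has no following season in the data
        if s.year < first_year or nxt.year > last_year:
            continue  # assumption: both seasons must fall inside the window
        if s.innings >= min_ip_x and nxt.innings >= min_ip_next:
            pairs.append((s, nxt))
    return pairs


# Made-up rows, not real pitchers or real statistics.
seasons = [
    Season("Pitcher A", 2010, 200.0, 3.50),
    Season("Pitcher A", 2011, 150.0, 3.80),
    Season("Pitcher B", 2010, 130.0, 4.20),
    Season("Pitcher B", 2011, 90.0, 4.00),   # fails the Year X+1 cutoff
    Season("Pitcher C", 2011, 110.0, 3.90),  # fails the Year X cutoff
    Season("Pitcher C", 2012, 180.0, 3.60),
]
pairs = pair_seasons(seasons)
```

Each element of `pairs` holds one pitcher’s Year X season and Year X+1 season, which is the unit of observation for the regressions that follow.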

I tested both complex and simple estimators:
FIP
Predictive FIP (pFIP)

I used a simple linear regression to test each of these estimators in Year X versus each starter’s FIP in Year X+1. I used r-squared to measure the explanatory power of each estimator.

The r-squared tells us the percent of variation in FIP in Year X+1 explained by the estimator in Year X.
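For readers who want to see the calculation, here is a minimal sketch of r-squared for a one-predictor least-squares fit, written in plain Python. The function name `r_squared` and its arguments are my own labels for illustration, not code from the study; `x` stands for an estimator’s Year X values and `y` for FIP in Year X+1.

```python
def r_squared(x, y):
    """R-squared of an ordinary least-squares line fit of y on x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares: variation in y left unexplained by the line.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return 1.0 - ss_res / syy
```

With a single predictor, this value is the same as the squared correlation between the two columns, which is why it can be read as the share of variation explained.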

The results (n= 703)

Estimators r^2
SIERA 32.20%
xFIP 32.12%
pFIP 31.37%
FIP 28.67%
K-BB 27.12%


The results of this test were fairly interesting. The two main conclusions that I drew from them were:
1. FIP is easier to predict than runs allowed

2. The complex estimators are better than FIP itself at predicting FIP

The r-squareds for the various estimators in this study ranged from .27 to .32, which is much higher than the r-squareds that I found when testing the same estimators for this sample against runs allowed (the range was from .17 to .20). Thus, it seems that FIP is easier to predict than runs allowed or ERA.

This conclusion may seem obvious to those well versed in sabermetrics. FIP has only one factor (home runs) with a substantial amount of variability, while ERA has multiple factors that, on a per-season basis, promote a great deal of variability (especially batting average on balls in play, BABIP). Logically, it would make sense that the statistic that is less affected by random variation would be easier to predict.

While the fact that FIP is easier to predict than ERA may seem logical to some, I think statistical evidence to back that conclusion is interesting and useful.

In the introduction to this article, I briefly discussed my swift ideological move away from more complex estimators. Yet, based on this sample, the two complex estimators that I referred to (xFIP and SIERA) did the best at predicting future FIP.

Is the fact that more complex estimators do a better job of predicting FIP in a subsequent season than FIP itself enough evidence to back their usefulness?

My answer: Possibly.

There are a few issues with the question I raised.

First and most important: Is predicting FIP more important than predicting runs? I may be in the minority here, but I would argue that the answer to that question could be yes.

If I worked in a major league front office, I think I’d be more concerned with how a pitcher would perform independent of fielding. Because I know what to expect from my team’s defense, I could combine what I knew about a pitcher’s future FIP with what I knew about my defense to get a final (better) prediction of how many runs I could expect that pitcher to give up.

The second issue has to do with my simple statistic (predictive FIP). It is true that the complex estimators were significantly better than FIP itself at predicting FIP; however, they were not significantly better than pFIP at predicting FIP.

The r-squareds for SIERA and xFIP were higher than pFIP’s, but the difference was not statistically significant (at an alpha level of .05). So, I cannot say with much certainty that this study is a feather in the cap of more complex estimators.
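One way a reader could check whether a gap between two r-squareds is distinguishable from noise is a paired bootstrap, sketched below. To be clear, this is not necessarily the test used for the comparison above, which is not specified here; it is a swapped-in technique for illustration, and the names `bootstrap_r2_difference`, `est_a`, `est_b`, and `target` are hypothetical.

```python
import random


def r_squared(x, y):
    """Squared Pearson correlation (equals r-squared with one predictor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0  # degenerate resample with no variation; treated as zero
    return sxy * sxy / (sxx * syy)


def bootstrap_r2_difference(est_a, est_b, target,
                            n_boot=2000, alpha=0.05, seed=0):
    """Percentile interval for r2(est_a, target) - r2(est_b, target).

    Pitcher-season pairs are resampled with replacement, and the same
    resampled rows are used for both estimators so the comparison stays paired.
    """
    rng = random.Random(seed)
    n = len(target)
    diffs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        a = [est_a[i] for i in idx]
        b = [est_b[i] for i in idx]
        t = [target[i] for i in idx]
        diffs.append(r_squared(a, t) - r_squared(b, t))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

If the resulting interval includes zero, the data do not clearly separate the two estimators at that alpha level, which is the situation described above for SIERA and xFIP versus pFIP.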

Finally, I’d like to leave these questions with the community.

1. Should we be trying to predict FIP rather than ERA?

2. Does this study help back the usage of more complex estimators?

Any comments or emails would be much appreciated.

References & Resources
All statistics come courtesy of FanGraphs.

Glenn DuPaul

@Brad what will happen with what? ERA or FIP?

Brad Johnson

If I recall (and I might not), the article was just about the best use of FIP and xFIP and not about how they relate to ERA.

It might have been more correct for me to say “FIP is best used to describe what SHOULD HAVE happened”

Brad Johnson

This jibes with previous research at FanGraphs that suggested that FIP is best used to describe what happened while xFIP is useful to predict what will happen.

Glenn DuPaul

@vivaelpujols K-BB predicts FIP better than ERA.

The r-squared for FIP was .2712
The r-squared for ERA was .1841. 

K-BB does better than most of the other estimators at predicting ERA, but, like the other predictors, it predicts FIP better than it predicts ERA. It just happens to be worse than the others at predicting FIP because it lacks the home run component.

I hope that answers your question.


I find it interesting that K-BB does not predict FIP as well as it does ERA.  Any guesses on why this is the case?

Jon Roegele

Interesting article, Glenn.

I’m partial to FIP myself, and have also been recently looking at BB% and K% estimators. As you noted though, year-to-year these two are pretty stable. I guess home runs are the place to focus.

Predicting ERA seems like a much more complicated, and thus perhaps interesting task. This is a fun area of research! I look forward to reading more of your work on this topic.