Regression, Where Art Thou?

One of the statistical terms we mention quite a bit here when evaluating players is regression. The dictionary definition of the word is to go back or return to a previous state. With regard to baseball players, we use the word to describe what is likely to happen to players who are either over- or underachieving at any given point in a season.

Chipper Jones was hitting .400 halfway through the season. Did we expect him to continue that torrid pace all year? No, his performance was expected to regress as more plate appearances were accrued. The term can be a bit confusing because it is so often used with overachieving players, but, just like Chipper Jones, it goes both ways… to clarify, that’s a switch-hitter joke. Regression can also refer to a player like Robinson Cano, who performed so poorly at the beginning of this year that you knew he just had to get better. His regression resulted in an improvement.

Three pitchers we have discussed multiple times on this site in posts involving this very term are Joe Saunders, Ervin Santana, and Gavin Floyd. All three are in the midst of career years, but either their ERA-FIP differential or their true talent projections told us that they would be unlikely to sustain their performance levels all year. Since all three have made 27 starts, let’s compare the first 18 to the most recent 9 (keep in mind that the FIP is crude here, merely adding a flat 3.20 instead of the exact league constant):

Joe Saunders, LAA
18: 120.1 IP, 105 H, 14 HR, 31 BB, 63 K, 3.07 ERA, 4.44 FIP
9:  50.1 IP,   62 H,  5 HR, 17 BB, 18 K, 5.19 ERA, 4.79 FIP

Gavin Floyd, CHW
18: 111.2 IP, 87 H, 17 HR, 47 BB, 75 K, 3.63 ERA, 5.10 FIP
9:  55.1 IP,  56 H,  6 HR, 17 BB, 44 K, 3.58 ERA, 3.94 FIP

Ervin Santana, LAA
18: 121.1 IP, 106 H, 12 HR, 32 BB, 112 K, 3.56 ERA, 3.43 FIP
9:  62.1 IP,   57 H,  6 HR, 11 BB,  71 K, 2.70 ERA, 2.89 FIP
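For anyone who wants to check the crude FIP figures, here is a minimal sketch in Python. It assumes the common form of FIP without hit batters, (13*HR + 3*BB - 2*K) / IP plus the flat 3.20 mentioned above, and it assumes innings written like 120.1 mean 120 and one-third. The function names are made up for illustration and are not part of any library.

```python
def innings_to_float(ip: str) -> float:
    """Convert baseball innings notation ('120.1' = 120 1/3) to a decimal."""
    whole, _, outs = ip.partition(".")
    return int(whole) + (int(outs) if outs else 0) / 3


def crude_fip(ip: str, hr: int, bb: int, k: int, constant: float = 3.20) -> float:
    """Crude FIP: assumed form (13*HR + 3*BB - 2*K) / IP + flat constant."""
    return (13 * hr + 3 * bb - 2 * k) / innings_to_float(ip) + constant


# Joe Saunders, first 18 starts and most recent 9, from the lines above
saunders_first_18 = crude_fip("120.1", hr=14, bb=31, k=63)
saunders_last_9 = crude_fip("50.1", hr=5, bb=17, k=18)

print(f"First 18: {saunders_first_18:.2f} FIP")  # 4.44
print(f"Last 9:   {saunders_last_9:.2f} FIP")    # 4.79

# ERA minus FIP: a negative gap means the ERA looks better than the peripherals
era_fip_gap = 3.07 - saunders_first_18
print(f"ERA-FIP gap, first 18: {era_fip_gap:+.2f}")  # -1.37
```

A gap of more than a run in the first 18 starts is the kind of ERA-FIP differential that hinted Saunders would have trouble sustaining his early ERA.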

Oddly enough, only Saunders has experienced any type of regression over his most recent nine starts. Floyd and Santana have improved. While Floyd’s ERA is essentially the same in the split, his FIP is much better, meaning the performance has been more skill-driven than before. Santana has been pitching lately like the guy on the Mets with the same last name, if not better. With only four or so starts remaining for each of these players, barring some Sabathia-in-April-type performances, not much damage can be done to their overall stat lines.

Still, one year of data isn’t enough to evaluate a player, and the true talent level will give us a much better estimate. This is why, even though Floyd and Santana are pitching very well, I would have to imagine they do not yet inspire full confidence in fans of the South Siders and Halos. The last important thing to remember is that regression does not always result in a bad season. Even though Saunders has performed poorly lately, he is not that bad, and he still has a very solid ERA. He isn’t as good as we were “led to believe” early on, but he is not as bad as his most recent starts suggest. The jury is still out on Floyd and Santana. Hopefully, at this time next year, we’ll know if they are for real or flashes in the pan.

Eric is an accountant and statistical analyst from Philadelphia. He also covers the Phillies at Phillies Nation and can be found on Twitter.

5 Comments
Bill Krevski
15 years ago

Santana has always had #1 starter stuff. This year he commands it and in turn has become a #1 starter. I see no reason to expect any sort of regression; he’s as legit as they come. But you don’t have to take my word for it: just turn on the Angels every 5th day and see for yourself.