Fluke Watch: Deciphering pitching illusions

After the discussion in the comments section of the last Fluke Watch, I thought I’d spend this column explaining its goal and the theories upon which this work is based. Consider this a mini-primer, or an FAQ of sorts.

The Problem

Suppose you have pitcher X, who has the following statistics in 2009 and 2010:

2009: 6.0 K/9, 4.0 BB/9, 1 HR/9, and an ERA of 4.42, with a BABIP of .300.
2010: 6.0 K/9, 4.0 BB/9, 1 HR/9, and an ERA of 3.10, with a BABIP of .215.

If I asked you whether Pitcher X’s 2011 ERA was going to be closer to his 2009 or 2010 ERA, you’d pretty quickly say 2009. This is because you can see that Pitcher X’s peripherals have remained the same from one year to the other, with the ERA improvement clearly coming as a result of a greatly reduced BABIP, which is almost certain to regress.

Now suppose instead we have Pitcher Y, who pitched a full season in 2010 and one month in 2011 with the following statistics:

2010: 6.0 K/9, 4.0 BB/9, 1 HR/9, and an ERA of 4.42, with a BABIP of .300.
2011: 8.0 K/9, 4.0 BB/9, 0.8 HR/9, and an ERA of 3.46, with a BABIP of .300.

Now, what performance would you expect to see out of Pitcher Y for the rest of 2011? Unlike Pitcher X, whose performance “improvement” was pretty clearly the result of just random variance (luck), Pitcher Y’s improvement seems to stem from an improvement in his peripherals. Thus, we’re more inclined to believe that this improvement is real and Pitcher Y can continue to do well the rest of the year.

But this belief relies upon a key point, one which is far from clear: that Pitcher Y can keep up his improved peripherals for the rest of the year. How can we be certain of this? Random variance in a small sample (one month) can certainly account for improved peripherals. His improvement STILL could be an illusion.
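To get a rough feel for how much one month of strikeouts can move on luck alone, here is a minimal Python sketch. It treats every batter faced as an independent trial with a fixed strikeout probability, and it assumes about 4.3 batters faced per inning and a 30-inning month. Both figures, like the function names, are illustrative assumptions rather than anything measured for Pitcher Y.

```python
from math import ceil, comb


def prob_at_least(n, p, k):
    """Exact binomial probability of k or more successes in n trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def chance_of_hot_stretch(true_k9, observed_k9, innings, bf_per_inning=4.3):
    """Chance that a pitcher whose true rate is true_k9 posts observed_k9 or better.

    bf_per_inning is an assumed league-wide figure, not a measured one.
    """
    batters_faced = round(innings * bf_per_inning)
    k_per_batter = true_k9 / 9 / bf_per_inning
    strikeouts_needed = ceil(observed_k9 * innings / 9)
    return prob_at_least(batters_faced, k_per_batter, strikeouts_needed)


# A true 6.0 K/9 pitcher showing 8.0 K/9 over one month versus a full season.
one_month = chance_of_hot_stretch(6.0, 8.0, innings=30)
full_season = chance_of_hot_stretch(6.0, 8.0, innings=200)
print(f"one month:   {one_month:.3f}")
print(f"full season: {full_season:.6f}")
```

Under these assumptions the one-month probability lands in the single-digit percent range, while the full-season probability is vanishingly small. That is the sense in which a month of improved peripherals is weaker evidence than a season of them.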

The Pitch-f/x solution

It’s here that Pitch-f/x can be of some use. Improvements in a pitcher’s peripherals (or, really, any improvements, including in BABIP) can stem from three things: a change in how the pitcher is pitching, external factors, or luck.

External factors are the easiest of the three to explain. Perhaps a pitcher has issues pitching at home (let’s say in Colorado) and has made two-thirds or more of his starts at home. Perhaps he’s faced a whole bunch of weak-hitting teams, or teams that have a lot of lefties or righties. We can easily adjust for such factors when making our future projections.

Then there’s luck or, really, random variation. A pitcher could throw the same exact pitches to the same exact locations against the same exact batters 100 times* and the results would NOT be the same each time.

*Presume for this example that the batters don’t gain any experience from each of these at-bats.

What does hold over the long run is that if a pitcher pitches the same exact way and there are no special external factors, his overall peripherals will remain the same. This shouldn’t be surprising to anyone.

But a lot of the time, improvement in pitcher results is due to a change in how the pitcher makes his pitches. This can happen in many different ways:
- The pitcher could add a new pitch
- The pitcher could eliminate a pitch
- The pitcher could adjust his distribution of pitches
- The pitcher could adjust where he aims the pitches
These things, and others, are completely within the purview of Pitch-f/x. If you look at the Pitch-f/x data, or just take a rough look at the graphical displays provided by Texas Leaguers, FanGraphs, or Joe Lefkowitz’s site, you can spot such changes in how a pitcher pitches, and in many cases this is pretty easy to do. Some changes of course are subtle and can only be seen by looking at the data itself, which is what this column aims to do.
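As a rough illustration of spotting a change in pitch distribution, the Python sketch below compares a pitcher's usage shares across two periods and flags any pitch type whose share moved by more than a chosen threshold. The pitch counts, the 5-percentage-point threshold, and the function names are all hypothetical; real Pitch-f/x data would first need its pitch classifications checked.

```python
def usage_shares(counts):
    """Convert raw pitch counts by type into shares of all pitches thrown."""
    total = sum(counts.values())
    return {pitch: n / total for pitch, n in counts.items()}


def usage_shifts(before, after):
    """Change in usage share (after minus before) for every pitch type seen."""
    b, a = usage_shares(before), usage_shares(after)
    return {p: a.get(p, 0.0) - b.get(p, 0.0) for p in sorted(set(b) | set(a))}


def flag_changes(shifts, threshold=0.05):
    """Pitch types whose share moved by more than the (assumed) threshold."""
    return [p for p, delta in shifts.items() if abs(delta) > threshold]


# Hypothetical pitch counts, invented for illustration only.
last_year = {"four-seam": 1500, "slider": 600, "changeup": 300}
this_year = {"four-seam": 200, "two-seam": 110, "slider": 130, "changeup": 60}

shifts = usage_shifts(last_year, this_year)
flagged = flag_changes(shifts)
for pitch, delta in shifts.items():
    marker = "  <-- changed" if pitch in flagged else ""
    print(f"{pitch:10s} {delta:+.1%}{marker}")
```

With these invented counts, the four-seamer and the two-seamer are flagged while the slider and changeup are not. A fixed threshold ignores sample size, so a small shift in a short stretch should be read with caution.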

Obviously, not every change in a pitcher’s pitches will explain the change in a pitcher’s results. But when there is a change in results, particularly in the peripherals, we would expect there to be one of these visible-with-Pitch-f/x changes. And, indeed, we do see such shifts (a change in pitch usage, such as a switch from a four-seamer to a two-seam fastball, is a common explanation for a real change in results).

Of course, not all aspects of pitching are captured by Pitch-f/x. For example, a pitcher could learn to deliver his pitches in a more deceptive way, and Pitch-f/x would not be able to detect it (Pitch-f/x can detect roughly the pitch’s release point—sort of—but not the pitcher’s actual method of delivery). Similarly, a pitcher’s tipping of his pitches through his actions prior to his delivery will not be seen in Pitch-f/x data.

There are, of course, other things Pitch-f/x doesn’t pick up that I’m not listing here, but the system captures a bunch of things that could change in a pitcher’s motion so as to cause a real, noticeable change in his results.

So when a pitcher seems to improve or get worse but hasn’t changed his pitching at all according to Pitch-f/x, the logical conclusion is that the change is the result of luck or random variation. It follows that the pitcher’s results will regress to his career numbers (or to the numbers from previous years if career numbers aren’t usable for some reason).

Which year is the outlier?

“If there is no change in a pitcher’s pitches this year, why couldn’t last year be the outlier and this year’s numbers be the real thing?”

This is a frequent comment on Fluke Watch posts. The answer is simple: sample size. If you have two samples that are the result of the same exact pitches but have different results, generally the larger sample is more likely to be indicative of the true numbers. In other words, if a pitcher hasn’t changed anything, it’s more likely his numbers will regress to what he did over a whole season last year than that he will continue what he’s been doing for one or two months of this season.

Really, when a pitcher doesn’t change anything and has better or worse results, we should expect his results to regress, long-term, to the overall average results of his pitches across all of the samples.

So for Jhoulys Chacin, as discussed in the last article, we should expect his groundball rate to regress to his average groundball rate over the last year and two months (essentially his career rate) rather than simply to his groundball rate from last year. Due to the sample sizes involved, of course, this means he’ll be expected to regress to a rate much closer to last year’s than to this season’s.
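The arithmetic behind that expectation is just a pooled (sample-size-weighted) average. The Python sketch below uses made-up groundball counts, not Chacin's actual numbers, to show why the pooled rate sits much nearer the larger sample.

```python
def pooled_rate(samples):
    """Pooled rate over (successes, opportunities) pairs.

    Equivalent to averaging each sample's rate weighted by its size.
    """
    successes = sum(s for s, _ in samples)
    opportunities = sum(n for _, n in samples)
    return successes / opportunities


# Hypothetical (groundballs, balls in play) pairs, for illustration only.
last_season = (230, 500)   # 46% groundball rate over a full season
this_season = (90, 150)    # 60% groundball rate over two months

expected = pooled_rate([last_season, this_season])
print(f"pooled groundball rate: {expected:.1%}")
```

With these invented counts the pooled rate is a little over 49 percent, about three points above last season's rate and more than ten points below this season's.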

I hope this answers the standard comments and questions that we see on Fluke Watch posts. If there are any other questions, please comment. Next time, we’ll be back to looking at pitchers.


Who would be an example of this, in 2011?