IT’S SO *** **** HARD TO THINK

WITH ALL THESE DUCKS EVERYWHERE!

In August of 2011, I introduced **Should Hit** (in three iterations: ShH, SHAP!, and Complete SHAP!). Should Hit is essentially a simple regression of walk rates, strikeout rates, home run rates, and BABIP on weighted runs created plus (wRC+). In both its calculation and its simplicity, it is very similar to FIP — but its uses and impact are quite unlike FIP.

Like FIP with groundball pitchers, the formula has some biases — known, accepted (by me, at least) biases. For instance, because it ignores doubles and triples completely, Should Hit naturally undervalues players who excel at the extra bags and overvalues the sluggers stuck at first. It presumes a certain number of doubles and triples for every player based on their home run rate and other peripherals — all poor proxies for something that is a verifiable skill or weakness in many players.

Ultimately, though, the tools (ShH and its brethren) work rather well. For the curious thinker, ShH can admirably predict what a player might hit with a normal/career BABIP or if their BB% or K% or HR% changes. However, at the time of its uncovering, I was wrongly under the impression that the current FanGraphs iteration of the wRC+ formula **did not include stolen bases**. It mattered little to me at the time — the only reason I thought the uncovering was so interesting to begin with was that only four peripherals could explain almost 93% of the variation within wRC+ (and that is still amazing to me!)

But today, we are going to add in SBs and stand back with a decanter of thought and ask ourselves: “What the hell did we just make here?”

Studying economics in graduate school, I learned the one, core truth about the art of research. Allow me to relay the basic gist of my conversation with the professor:

“So [this economist] came up witd* [this idea], but afterwards, when nations and international bodies tried to implement his idea and it didn’t work, what did the economist do?” asked Turkish Professor.

“Write a book about it?” I said.

“Yeah, he went out and wrote a bunch of papers explaining why it didn’t work and why everybody didn’t understand it. And that’s how the field works: You come witd an idea; you publish some papers on it; and then you spend the rest of your career saying it’s right.”

**He was Turkish. That’s how he said “with” — he also said “turd” instead of “third.” And that’s why we loved him.*

So, I’m going to break this — shall we say — Turkish rule of American economics by admitting my first impressions were off. As noted above, I initially thought wRC+ (here at FanGraphs, not as a whole) did not include steals, so by disregarding steals I was only doing myself a favor.

I tried to look for a suitable proxy for doubles and triples — a proxy specifically because I felt including *actual* doubles and triples would sully the defensive independent ideals of the tool. Unfortunately, as I observed here, the most logical proxies — steals and speed scores — were terrible predictors of doubles and triples.

So, I (and I still think rightly, given those circumstances) disregarded steals as a proxy for extra base hits. But let us pause a moment to consider some facts about the very nature of defensive independent statistics — more importantly, why? Why should it have been so important to me to keep the metric defensive independent?

- Defensive independent stats became important following the earth-sundering study by Voros McCracken in ancient America, some 100 sabermetric years ago, wherein he discovered and concluded: **Pitchers cannot control where the ball goes.** This led eventually to Tom Tango’s FIP, which to this day remains one of the absolute best pitching evaluation metrics available.
- **Hitters — unlike pitchers — actually *do* have some control over a pitch’s in-play result.** Not only do players normalize in a wider range of BABIPs, they also can have differentiating skill levels with regards to hitting doubles and triples (which is essentially a product of gap power and foot speed) as well as infield singles and RBOE.
- **Why, then, would you need a defensive independent hitting metric?** Defense plays into a hitter’s stats *much* less than it does a pitcher’s — hitters are interacting with (as in hitting against) a different defense at least every four games, while pitchers are interacting with (as in playing in front of) virtually the same defense throughout the season.
- Defensive independent hitting, therefore, is not going to give us any clearer information than wOBA or wRC+ if it comes to us as a **static metric**.

You, dearest reader, will notice I have carefully avoided referring to ShH and the like as metrics or statistics — because, in truth, they are not. ShH, SHAP!, and Complete SHAP! are *tools* at best. They can give a nice static number, but they are best served in helping us identify fluctuations and their causes.

But let’s save that bit for Part 2. ;)

As promised, I have once again run my Regression Machine, with the dial set to “Include Steals This Time.” More specifically, I regressed BB%, K%, HR%, BABIP, and SB-rate (as in: SB divided by PA — I’ll explain! Just wait a second!) on wRC+. Which gives us approximately this, what I call **Fielding Independent Offense** (FIO, pronounced *fye-oh*):

**FIO** = -51.57 + 275.21(**BB%**) – 180.52(**K%**) + 1184.34(**HR%**) + 151.75(**SB/PA**) + 422.14(**BABIP**)

I call this Fielding Independent Offense instead of Batting or something similar because it includes stolen bases, which happen on the bases rather than at the plate.
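For anyone who wants to poke at the formula, here is a minimal Python sketch of it. The coefficients are copied from the formula above; everything else is my assumption for illustration: the function and argument names, the choice to pass every rate as a fraction of plate appearances (0.085 rather than 8.5), and the sample stat line, which is a made-up hitter and not a real player.

```python
# Coefficients copied from the FIO formula in this article (2009-2011 fit).
FIO_INTERCEPT = -51.57
FIO_COEFFS = {
    "bb_rate": 275.21,   # BB / PA
    "k_rate": -180.52,   # K / PA
    "hr_rate": 1184.34,  # HR / PA (assumed denominator)
    "sb_rate": 151.75,   # SB / PA
    "babip": 422.14,
}


def fio(bb_rate, k_rate, hr_rate, sb_rate, babip):
    """Return Fielding Independent Offense, which lives on the wRC+ scale.

    All inputs are fractions (0.085 for 8.5%), not percentages.
    """
    inputs = {
        "bb_rate": bb_rate,
        "k_rate": k_rate,
        "hr_rate": hr_rate,
        "sb_rate": sb_rate,
        "babip": babip,
    }
    return FIO_INTERCEPT + sum(FIO_COEFFS[name] * value for name, value in inputs.items())


# Hypothetical hitter: 8.5% BB, 18% K, 2.5% HR, 1.5% SB per PA, .300 BABIP.
baseline = fio(0.085, 0.18, 0.025, 0.015, 0.300)
# Same hitter, but with a .330 BABIP: the "what if" use of the tool.
lucky = fio(0.085, 0.18, 0.025, 0.015, 0.330)

print(round(baseline, 1))          # 97.9
print(round(lucky - baseline, 1))  # 12.7
```

The second call is the use I care about: hold everything else still, move one peripheral, and see how far the output shifts.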

Here is what it looks like compared to wRC+:

And here is what ShH looks like compared to the wRC+ of that same group:

So we gain a sliver of R-squared, which is nice, but more importantly, you will notice the plotting is much, shall we say, tighter in the first graph — fewer dots scattered away from the line like salt spilled from a diner salt shaker. Also, the standard deviation of the difference from wRC+ is considerably lower, and the median and average both equal exactly 0 — unlike ShH, where they equal 1 and slightly more than 0, respectively.
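The comparison numbers in the paragraph above (R-squared, plus the average, median, and standard deviation of the difference from wRC+) can be reproduced with something like the sketch below. The function name is mine, and I am assuming R-squared is computed as one minus the ratio of residual to total sum of squares, which matches a regression's R-squared when the estimates are its fitted values. The four data points are toy numbers to show the mechanics, not real players.

```python
import statistics


def compare_to_wrc_plus(estimates, wrc_plus):
    """Summarize how closely a set of estimates (FIO or ShH) tracks wRC+."""
    diffs = [est - actual for est, actual in zip(estimates, wrc_plus)]
    mean_actual = statistics.fmean(wrc_plus)
    ss_res = sum(d * d for d in diffs)                       # unexplained variation
    ss_tot = sum((w - mean_actual) ** 2 for w in wrc_plus)   # total variation
    return {
        "r_squared": 1 - ss_res / ss_tot,
        "mean_diff": statistics.fmean(diffs),
        "median_diff": statistics.median(diffs),
        "sd_diff": statistics.stdev(diffs),  # sample standard deviation
    }


# Toy inputs only: four invented estimate / wRC+ pairs.
summary = compare_to_wrc_plus([100, 110, 90, 120], [102, 108, 90, 120])
print(summary)
```

Running the same function once on FIO and once on ShH, over the same group of hitters, gives the side-by-side comparison.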

**Important to note here:** My Regression Machine only included — brace yourself — three years of data. Yeah, I’m crazy, right?

Here’s the deal, and I looked at this a LOT with ShH; the simple fact is that the MLB run environment is *really* rapidly changing. Using data from 2000 to present (or even larger) would essentially be like writing an article about the life on top of Mount Everest whilst I’m actively scaling down its side. We do not know quite where, or if, the league will bottom out (or if it will go back up). Run production has gone down two years in a row and trying to use more than three years of data — as I did with my very first incarnations of ShH — just creates an upward, steroid-fueled bias.

So instead, I am using 2009 to 2011 combined seasons of “qualified” hitters, while just happily saying, “When this whole run environment thing calms down, I’ll go ahead and recalculate everything, so shut up.”
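If you would rather recalculate things yourself than wait for me, the regression is nothing exotic. Below is a rough sketch of ordinary least squares using only the Python standard library. To be loud about what is assumed: this is not my actual Regression Machine, the function name is invented, and the "players" are noise-free fakes generated from the published coefficients, so all it demonstrates is that the procedure recovers coefficients it already knows. Swap in real BB/PA, K/PA, HR/PA, SB/PA, BABIP, and wRC+ columns to do a real fit.

```python
import random


def fit_ols(rows, targets):
    """Ordinary least squares via the normal equations (X'X)b = X'y.

    rows: one list of predictors per player; an intercept column is added here.
    Returns [intercept, coef_1, ..., coef_k].
    """
    design = [[1.0] + list(row) for row in rows]
    k = len(design[0])
    # Augmented matrix [X'X | X'y].
    aug = [
        [sum(x[i] * x[j] for x in design) for j in range(k)]
        + [sum(x[i] * y for x, y in zip(design, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


# Published FIO coefficients: intercept, BB%, K%, HR%, SB/PA, BABIP.
TRUE_COEFFS = [-51.57, 275.21, -180.52, 1184.34, 151.75, 422.14]

random.seed(0)
rows, targets = [], []
for _ in range(300):  # 300 fake hitters, not real data
    row = [
        random.uniform(0.03, 0.16),  # BB / PA
        random.uniform(0.08, 0.30),  # K / PA
        random.uniform(0.00, 0.07),  # HR / PA
        random.uniform(0.00, 0.08),  # SB / PA
        random.uniform(0.25, 0.37),  # BABIP
    ]
    rows.append(row)
    targets.append(TRUE_COEFFS[0] + sum(c * v for c, v in zip(TRUE_COEFFS[1:], row)))

fitted = fit_ols(rows, targets)
print([round(c, 2) for c in fitted])
```

With real data the targets carry noise, so the fit will not land on round agreement like this; a numerical library's least-squares routine would also be the more robust choice than hand-rolled elimination.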

Anyway, we’ll get to the testing of this puppy in a little bit, but for now, let us end by taking one more look at the doubles and triples problem. To do this, I tried a couple of different methods until I arrived at this: I took the doubles and triples for each player and divided them by their PAs (working with this 2009 to 2011 dataset again). I multiplied these rates by their *The Book* run expectancy, and then added them together, effectively creating a wRC for doubles and triples per plate appearance. Comparing that to FIO minus wRC+, we get this:

Yeah, the bias is still very much there, but at least it only explains about 14% of the deviations away from wRC+. In an interesting, if not frustrating, twist, the R-squared here is larger than in the same study using ShH — meaning the doubles and triples bias explains more in FIO than in ShH. But when we consider that FIO uses SB and is thus explaining more overall — and that the coefficient is less steep in the FIO bias study (-127 versus -141 with ShH) — then it pretty much comes out more of a wash.
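The bias check described above boils down to two small steps, sketched here in Python. The function names are mine, and the run values are deliberately left as arguments: the 0.75 and 1.0 in the example are placeholders I picked to show the arithmetic, not the figures from *The Book*, and the three data points at the bottom are invented.

```python
def extra_base_runs_per_pa(doubles, triples, pa, run_value_2b, run_value_3b):
    """Run value contributed by doubles and triples, per plate appearance."""
    return (doubles / pa) * run_value_2b + (triples / pa) * run_value_3b


def simple_regression(x, y):
    """Regress y on x; return (slope, intercept, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept, sxy ** 2 / (sxx * syy)


# Placeholder run values (NOT The Book's numbers): 0.75 per double, 1.0 per triple.
rate = extra_base_runs_per_pa(doubles=30, triples=5, pa=600,
                              run_value_2b=0.75, run_value_3b=1.0)
print(round(rate, 4))

# Invented points: x is the doubles-and-triples rate, y is FIO minus wRC+.
slope, intercept, r_squared = simple_regression([0.03, 0.04, 0.05], [2.0, 0.0, -2.0])
print(slope, intercept, r_squared)
```

A negative slope means what the text says it means: the more of a hitter's value comes from doubles and triples, the more FIO undersells him relative to wRC+.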

Ideally, the R-squared would be near zero, but **FIO is not meant to replace wRC+**, so this problem is less of a THING, more of a thing.

If we look at the specific places where FIO and wRC+ deviate from each other, we can actually notice a really interesting and important bias:

Tell me if you can spot some common traits in the 11 players FIO overvalued the most:

*Player, FIO minus wRC+*

Ian Stewart, +13

Willie Bloomquist, +10

Todd Helton, +10

Ryan Spilborghs, +9

Elvis Andrus, +9

David Murphy, +9

Alexei Ramirez, +8

Scott Podsednik, +8

Troy Tulowitzki, +8

Pedro Feliz, +8

Carlos Gonzalez, +7

That’s right, the majority of them play for — or have played for — the Colorado Rockies, the Chicago White Sox, or Texas Rangers. We see a similar, though not as dramatic, bias on the other side of the spectrum:

Josh Willingham, -8

Kevin Kouzmanoff, -8

Shane Victorino, -8

Ryan Braun, -8

Will Venable, -8

Rickie Weeks, -8

David Eckstein, -9

Carlos Quentin, -9

Chase Utley, -10

Jose Reyes, -10

Andres Torres, -11

A number of speedy guys who play in pitchers’ parks. It makes sense. FanGraphs’ wRC+ accounts for parks, FIO does not (and frankly does not need to).

So FIO has some bias — as did wRC+ — but overall, it comes *even closer* to predicting a player’s offensive output than ShH does. Does it completely ignore fielding? No. Heavens no. BABIP has a LOT to do with fielding (which is one of the reasons FIO makes a better tool than a stat), as do SBs. In fact, by ignoring caught stealing — which merely helps to keep the calculations simple — we are assuming each of these base-thieves is a steal master and that the defenses they play against are all equal, neither of which is true.

Let’s get to Part 2 already, and use this freakin’ thing!

For digestive and giggling purposes, here are the same players in the same order as the most recent graphs, except set to the tune of ShH minus wRC+. It helps illustrate how the average change from switching between FIO and ShH does not fully represent the change it makes on a player-by-player basis:

(ALSO: Note the difference in scale used here and the preceding graph.)

*Go here for Part 2.*