Defensive Independent Hitting, Or ShH

Maybe there is a better way to predict how well a hitter is doing? Rather than glancing at his OBP and SLG and OPS or his wOBA and wRC+ and then mentally calibrating that number according to an inflated or deflated BABIP, maybe we can find a simple means of combining the key elements into a single formula.

Well, I believe I have stumbled onto just such a formula.

Th’other day, when I was trying to solve the mystery of the Tampa Bay Rays and their utterly broken run expectancy chart, I began ruminating about the relationship between walks, strikeouts, and an ability to create runs. You see, the Rays tend towards true outcomes: lotsa walks, lotsa strikeouts. So, for some strange reason (be it bad luck or bad hitter-type chemistry), the Rays seem unable to reach a standard run expectancy with the bases loaded.

Anyway, I began to investigate this trifle and produced an interesting comparison:



The greener area (fewer walks, more Ks) is obviously worse, just as the white area is obviously better. In between, however, is the tenuous white-greenish area of mixed results.

What particularly piqued my interest was not really a new finding of any sort, but merely the visual representation of the different means of success among MLB offenses.

For instance, the Rangers put a lot of balls in play. They walk very little and strike out very little, but are still a successful offense (116 wRC+ or 16% above average). Meanwhile, the Yankees and Red Sox have a lot less defensive dependence, taking bases on walks and striking out much more often. These teams have similar results, but lie at different points on the spectrum.

So I axed myself: “What does the relationship of plate discipline look like with respect to run scoring?”

Not a profound or unanswered question by any means, but a fun exercise. The relationship is neither surprising nor overwhelmingly strong:

What surprised me with this dandy little regression (and what made me wander down the rabbit hole) was the high R-squared and minuscule p-values (not shown). I did not expect that the BB/K ratio would explain ~47% of the variation in wRC+. When I think of great hitters, I usually do not immediately think about their balance of walks and strikeouts.

I then pondered: “How deep does this relationship go? Could there be a defensive independent means of evaluating a hitter?”

If half of a team’s offensive variation comes from walks and strikeouts, then maybe homers could make up the rest of that variation? Well, a half dozen regressions later, I concluded two thangs:

1) Defensive independent events — walks, strikeouts, and homers — have a very strong correlation with park-adjusted run scoring (wRC+).

2) And BABIP fills any and all remaining gaps.

The beauty of BABIP is that it encapsulates basically the junk drawer of remaining elements. BABIP has luck, speed, and defensive dependence in it, so the resulting R-squared climbs well above .9.
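The two-step regression described above can be sketched in a few lines. This is only an illustration on synthetic, made-up player-seasons, not the data behind the article: the helper name `fit_ols`, the slopes used to generate the target, and the noise level are all assumptions chosen for the demo. It fits wRC+ on the three defense-independent rates alone, then again with BABIP added, and compares R-squared.

```python
import numpy as np

def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns (coefficients, R-squared)."""
    A = np.column_stack([np.ones(len(y)), X])      # prepend the intercept column
    coefs, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coefs
    ss_res = float(resid @ resid)                  # unexplained variation
    ss_tot = float(((y - y.mean()) ** 2).sum())    # total variation
    return coefs, 1.0 - ss_res / ss_tot

# Synthetic player-seasons (made up for illustration; rates are fractions of PA)
rng = np.random.default_rng(0)
n = 200
bb = rng.uniform(0.04, 0.15, n)
k = rng.uniform(0.08, 0.30, n)
hr = rng.uniform(0.005, 0.06, n)
babip = rng.uniform(0.25, 0.36, n)

# Arbitrary made-up slopes (intercept, BB, K, HR, BABIP) plus noise stand in for wRC+
true_coefs = np.array([-50.0, 250.0, -150.0, 1000.0, 450.0])
X_all = np.column_stack([bb, k, hr, babip])
wrc_plus = true_coefs[0] + X_all @ true_coefs[1:] + rng.normal(0.0, 3.0, n)

coefs_di, r2_di = fit_ols(X_all[:, :3], wrc_plus)  # walks, strikeouts, homers only
coefs_all, r2_all = fit_ols(X_all, wrc_plus)       # same, plus BABIP

print(f"R-squared without BABIP: {r2_di:.3f}")
print(f"R-squared with BABIP:    {r2_all:.3f}")
```

Adding a regressor can never lower the in-sample R-squared, so the second number is always at least as large as the first; how much larger depends on how much of the remaining variation BABIP carries.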

I took these two little, unsurprising yet key, facts and slung them at a decade’s worth of hitters. Then, I looked at the more recent era — let’s call it the Dying Ball Era please! — and produced this:

I call it Should Hit, as in: Yuniesky Betancourt should hit 80 wRC+ with a normal (career) BABIP. Of course, if you put in a player’s present BABIP instead of their career mark, then you should get something like above, where there’s a nearly one-to-one relationship.

The formula is simple, which is why I love it. Regressing wRC+ on K%, BB%, HR%, and BABIP (from 2009 through 2011), we get (approximately) this:

Should Hit = -60 + 277(BB%) + -184(K%) + 1133(HR%) + 465(BABIP or xBABIP)
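As a quick check on the formula above, here is a minimal sketch in Python. It assumes the rates are entered as fractions of plate appearances (0.085 rather than 8.5), which the article does not state outright but which is the scale on which the coefficients produce sensible values. The function name `should_hit` and the sample inputs are hypothetical.

```python
def should_hit(bb_rate, k_rate, hr_rate, babip):
    """Approximate Should Hit (ShH) using the coefficients quoted in the article.

    Assumption: bb_rate, k_rate and hr_rate are fractions of plate
    appearances (e.g. 0.085), and babip is in the usual .300 form.
    """
    return -60 + 277 * bb_rate - 184 * k_rate + 1133 * hr_rate + 465 * babip

# Hypothetical, roughly average-looking inputs (illustrative, not a real player)
average_line = should_hit(bb_rate=0.085, k_rate=0.180, hr_rate=0.027, babip=0.300)
print(round(average_line, 1))  # 100.5
```

With those made-up inputs the estimate lands right around 100, which is where a league-average hitter should sit on the wRC+ scale.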

Walks and strikeouts and home runs normalize way more quickly than BABIP, which can go crazy for whole seasons. Should Hit — or ShH (pronounced shh, as in shut up) for those who love acronyms* — allows us to use the three more stable (and more adjustable) elements of hitting to our advantage.

Not only do walk, strikeout, and homer rates stabilize quickly, they also have some of the highest variations through a player’s career, as individuals are constantly changing their approach or dealing with pitchers adapting to them. Whereas BABIP is a slow-swinging pendulum (constantly based around a consistent point, but never quite there), BB%, K%, and HR% are small needles quickly finding exact points which change slightly almost every season.

So, using Should Hit, I can predict how any player would perform given an array of BABIPs. Take this year’s anomaly, Casey Kotchman. After getting off-season surgery on his eyeballs, Kotchman has instantly gone from a worse-than-league-average hitter to a 134 wRC+ hitting machine. Most non-Rays fans (and myself) look at Kotchman’s crazy .360ish BABIP and say: “I know what comes next. Pride goes before destruction, a haughty spirit before a fall, and a high BABIP before a nasty regression — or something like that.”

Anyway, given the peculiar Best Shape of My Life story preceding his resurgence, Kotchman has earned a slew of devotees committed to believing he will keep his pace up. If we put his numbers in Should Hit, we get this:

So, if we think Kotchman’s new vision can really sustain his ultra-high BABIP, then he can legitimately be a 120 to 130 wRC+ hitter. But the truth is he’s not walking much and is striking out more than usual. Should Hit does not like that and says, ’twere his BABIP to revert to career norms, he’d be having maybe the second-worst season of his career.
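The "array of BABIPs" idea can be sketched as a simple sweep: hold the walk, strikeout, and homer rates fixed and vary only BABIP. The rates below are placeholders for a low-walk, low-power contact hitter, not Kotchman's actual line, and the helper names are hypothetical; rates are again assumed to be fractions of plate appearances.

```python
def should_hit(bb_rate, k_rate, hr_rate, babip):
    """Should Hit with the article's approximate coefficients (rates as fractions)."""
    return -60 + 277 * bb_rate - 184 * k_rate + 1133 * hr_rate + 465 * babip

def babip_spectrum(bb_rate, k_rate, hr_rate, babips):
    """Map each candidate BABIP to its Should Hit estimate, other rates held fixed."""
    return {b: round(should_hit(bb_rate, k_rate, hr_rate, b), 1) for b in babips}

# Placeholder rates (illustrative only, not any real player's numbers)
spectrum = babip_spectrum(
    bb_rate=0.07, k_rate=0.11, hr_rate=0.02,
    babips=[0.280, 0.300, 0.320, 0.340, 0.360],
)
for babip, shh in spectrum.items():
    print(f"BABIP {babip:.3f} -> ShH {shh}")
```

Because the formula is linear, every .020 of BABIP moves the estimate by about 9.3 points (465 × .020), whatever the other three rates are.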

Now, ShH is not an xBABIP tool. If you want an xBABIP calculator, then go here. For ShH, I prefer to just use a little common sense and career BABIPs, assuming a player has more than one season of data.

Because I luv crowd-sourcing, here’s a Google Doc with the Should Hit formula. Feel free to download or save a copy and play with it to your heart’s darkest desires.

For the Google Doc, just input the walk and strikeout rates of a player’s current season. Then add their present home run total (or career totals work well enough too) and the present (or career) plate appearances — this calculates the home run rate.

Then, input their present BABIP. You will notice the resulting wRC+ is not as high as it is in their present 2011 season. I believe this comes from a calibration issue due to the Dying Ball Era’s lowered expectations.
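The spreadsheet inputs described above boil down to one extra step before the formula: turning a home run total and a plate appearance total into a rate. The sketch below assumes HR% = HR / PA, as the text describes, and does not claim to mirror the Google Doc's actual cell layout. The function names and the sample totals are hypothetical.

```python
def home_run_rate(home_runs, plate_appearances):
    """Home runs per plate appearance, as a fraction."""
    if plate_appearances <= 0:
        raise ValueError("plate_appearances must be positive")
    return home_runs / plate_appearances

def should_hit_from_totals(bb_rate, k_rate, home_runs, plate_appearances, babip):
    """Should Hit from the spreadsheet-style inputs (rates as fractions of PA)."""
    hr_rate = home_run_rate(home_runs, plate_appearances)
    return -60 + 277 * bb_rate - 184 * k_rate + 1133 * hr_rate + 465 * babip

# Hypothetical inputs: 10 HR in 400 PA, with made-up walk and strikeout rates
present = should_hit_from_totals(0.08, 0.15, 10, 400, babip=0.330)
career = should_hit_from_totals(0.08, 0.15, 10, 400, babip=0.300)
print(f"HR rate: {home_run_rate(10, 400):.3f}")
print(f"ShH at present BABIP: {present:.1f}")
print(f"ShH at career BABIP:  {career:.1f}")
```

Running the same rates through both a present and a career BABIP gives the pair of numbers to compare against each other.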

Modern “plus” metrics have lower standards right now because of the league-wide depression in offenses. Originally, as noted above, I used a whole decade of data to form Should Hit’s slopes. Unfortunately, the distance between 45-degrees and the regression narrowed each time I sliced off a season of the Steroid Era.

So, the present rendition has a bit of a calibration issue. All that means is you will want to put in their present BABIP first, then your predicted BABIP (or an xBABIP) for comparison’s sake. Or, you can just use my Should Hit (Advanced!) which adds this layer for you:

In closing:

1) Play around with Should Hit and ShHA! and let me know what you think.

2) Are there problems in my reasoning? Let me know. I’ve used what I hoped was the simplest reasoning and the simplest methods (linear regressions), but maybe it’s more complicated than it appears to me.

3) Let me know what you think about the external validity of this little tool. As I mentioned, the Steroid and Dying Ball Eras have thrown significant monkey wrenches into league averages, but I would like to think I avoided those problems. Also, has anyone else done something similar to ShH in the past? I imagine others more brilliant have long explored defensive independent hitting already, but I could not recall such a thing.

Finally, have fun!

*I do not like acronyms in sabermetrics. They make otherwise simple concepts seem complex and alien to the uninitiated. Initially, I wanted to avoid an acronym altogether, but in the name of reasonable spreadsheet column widths, I decided to go with ShH. And in the name of younger audiences, I decided against ess-hit (written “SHit”) for more obvious reasons. If you must, though, just think of it as expected weighted runs created plus, xwRC+, which looks like a virus in your registry.




Bradley Woodrum (@BradleyWoodrum) writes about Chicago sports at Cubs Stats and about cats and economics at Homebody and Woman.

100 Responses to “Defensive Independent Hitting, Or ShH”

  1. Bob Loblaw says:

    Gotta include LD%, GB%, FB%, HR/FB% or this isn’t very helpful.

      • Bryce says:

        We can test how helpful those would be, no?

      • Yuck. I just wrote a long response and my internet crapped out on me.

        Here’s the cliff-noted version:

        @Bryce: The problem with line drive rates is how they are subject to a scorer’s opinion and perspective. Walks, strikeouts, homers, and BABIP are clearly defined, unmistakable events.

        Moreover, the correlation and predictive power of Should Hit seems to be fine without the batted ball data. One of the reasons I like the formula so much is that it is simple (just 4 variables) and has a crazy strong correlation (.95+ R-squared).

        We could indeed add more variables, but it doesn’t seem necessary.

      • Barkey Walker says:

        Ks and BBs are no less subjective. If you want to say once it has happened, then let’s go with this: once the scorer puts it in the book, it is there. Same diff as a K or BB.

    • James says:

      You mean integrate an xBABIP calculator? “ShH is not an xBABIP tool.”

  2. Alex says:

    Reminds me of ProOPS.

  3. Chops says:

    This is an awesome article. Thank you for this.

  4. tangotiger says:

    Bradley:

    Take this:
    277(BB%) + -184(K%) + 1133(HR%)

    Divide each by 92, and you get this:

    (3*BB% – 2*K% + 12.3*HR%) * 92

    As you can see, you’ve basically separated a player’s batting line into a FIP component and a non-FIP component.

    • Yeah, that is essentially what I wanted to do — I think. (NOTE: The article’s original title, which didn’t fit, was “Defensive Independent Hitting, Or Something Like That.”)

      The idea is HR, BB, and K — things a batter actively controls — plus BABIP — something that is kind of determined by the less-changing hitter’s profile — will ultimately equal the batter’s true talent level.

      So yeah, it’s basically FIP + defense, which ought to equal ERA (or, more accurately in this analogy, RA9). Except here, we can adjust the defense element according to our expectations (the xBABIP element).

      ShH in itself is not predictive, but it can be in the right hands.

      Does any of this make sense?

    • Telo says:

      Yea, this was my thought while reading the article – it’s FIP for hitters. Very nicely presented though, Bradley. I enjoyed it. (Sans the omission of the obvious FIP correlation, which probably should’ve been discussed.)

      • Oh yeah, it’s definitely FIP for hitters. I didn’t mean to not mention it. I guess I figured — given the title and Should Hit’s formula — that the FIP similarities were more obvious than they were.

        I certainly wasn’t trying to be a sneaky FIP thief. I love FIP.

      • Telo says:

        Hahah I figured you weren’t trying to pull a fast one. It may have just been a nice conceptual intro/segue was my thought.

  5. Corey says:

    What decade did you use? As you said, if the Steroid Era changes the slope towards 45 degrees, then using the 80s could propel it downward in an environment of repressed batting runs.

  6. Dan says:

    Isn’t this basically saying that there’s a correlation between HR and ISO generated from doubles and triples? You directly account for walks with BB%, hits with BABIP and K%, and homers with, well, homers. All that’s left in wOBA that isn’t a direct input here is the distribution of hits on balls in play, and it already makes basic intuitive sense that a guy who hits more homers would likely have more extra-base hits in general.

    In other words, I think it can be made more specific by using it as an ISO or 2B/3B calculator by using HR-Rate to regress other XBH totals and create a more detailed estimate of a player’s line, providing a more thorough insight than simply a wRC+ estimate. wRC+ is great, but the more detail, the better IMO.

    Then again, I could be totally wrong…

    • Dan says:

      This is exactly what I was thinking, but having trouble typing.

    • I was really surprised to find such a strong correlation in the absence of doubles or triples. Indeed, it caught me off guard, but it makes sense that homers could be a decent proxy for these two.

      I’m not really sure I understand your ISO or 2B/3B suggestion. I would need to see it, as all nerds would, in the form of a formula.

      • balagast says:

        I agree HR’s will tend to be a good indicator of ISO, but there will be some circumstances where it just doesn’t work very well.

        A perfect example of one is Jose Reyes this season, his ISO is being fueled by his crazy amount of triples.

  7. Sean says:

    When this is ironed out, can we get it on the leaderboards?

  8. Yirmiyahu says:

    So doubles/triples power doesn’t exist in the world of ShH. Big flaw.

    • mcbrown says:

      That is kind of the point. Whether or not this accomplishes the stated goal (and I’m not yet sure it does, for a few reasons), that goal as I understand it was to come up with a hitting equivalent of FIP, which relies entirely on the true outcomes. How would one include 2B and 3B in a “fielding independent” calculation without invalidating the “fielding independent” aspect?

    • It apparently doesn’t need to exist. Trust me when I say: I’m just as flabbergasted as you. I’m guessing somehow with HRs as a proxy for power and BABIP as a filler for all other hits, it somehow works out (at least mostly works out).

      When we think about it though, FIP ignores doubles and triples too. So maybe there’s something about those two hits that ties them to defense more than we realize.

      • Yirmiyahu says:

        I understand that you’re using HR’s as a proxy for those things. And it makes sense to me that pitchers wouldn’t have any particular skill that influences that (beyond luck, defense, and park factor). But, there’s a very specific type of hitter who has line drive power, as an identifiable skill, and your system ignores that.

        I understand that no stat like this is going to be completely accurate, but your stat has a more systematic flaw in that it ignores a certain type of skill.

      • Matt says:

        Intuitively that makes sense, that doubles and triples are defense-dependent. Hard hit grounder to third can easily be a triple or a groundout based on who is playing third and where they are positioned. So in that sense, is this basically FIP for hitters?

      • @Matt: Yeah, this is very similar to a FIP for hitters. The only extra variable, of course, is the BABIP.

      • @Yirmi: Yeah, I want to look into it for myself, but if there is indeed a systematic flaw in it, I would like to fix it. I’m certainly not married to this present rendition of Should Hit and would happily do whatever to improve it.

    • Yirmiyahu says:

      Yeah, Bradley, I put a huge sample of guys into your formula to see how accurately it judged their actual RC+. Your stat systematically overrates big, slow, Three True Outcomes sluggers by assuming they will hit more doubles and triples than they do. And it systematically underrates line drive hitters with lots of doubles/triples by assuming they don’t have power based on a lack of HR’s. It also ignores stolen bases.

      • Good stuff! I had not noticed that issue with my larger regressions.

        Perhaps the solution would be to include doubles and triples into the regression? The problem then becomes that, with BABIP, these inputs would be double counted.

        Hmm… Any ideas?

      • Toffer Peak says:

        I’m glad you already did the work because I was going to note in the comments referring to PrOPS earlier that this was the big downfall of PrOPS as well; it overrates TTO players and underrates “hitters”. For proof sort players by OPS-PrOPS on this page: http://www.hardballtimes.com/thtstats/main/index.php?view=props&linesToDisplay=50&orderBy=ops_minus_props&direction=ASC&qual_filter=1&season_filter=2007&league_filter=2&pos_filter=All&Submit=Submit

        You will find that players that tend to be overrated by PrOPS were TTO players such as Dunn, Burrell, Howard, Uggla, etc. Players who were underrated by PrOPS included “hitters” such as Utley, H. Ramirez, Chipper, Sandoval, etc.

        While PrOPS/ShH may work when you throw enough players into the regression it ultimately has too much systematic bias to be of much use. You’re better off just expanding a player’s sample size to get a handle on their performance.

      • Nathan says:

        So perhaps one could replace BABIP in ShH with Slugging, Balls in Play or maybe an xSlgBIP stat? Or augment the stat with the ISO equivalents? (That’s just one more number to plug in).

        Is there anything like xSlgBIP or xIsoBIP?

  9. mcbrown says:

    Bradley, I’m curious about what led you to add BABIP to the mix. It has a different denominator (balls in play) than the other ratios in the formula (plate appearances). I’m having trouble wrapping my mind around that in terms of mapping this regression onto a “real world” interpretation.

    • Hmm… I don’t believe the different denominators should complicate the regression, though smarter minds could easily correct me on this.

      In terms of the real-world interpretation, the regression resulted from me asking: “How much do walks, strikeouts, homers, and luck and speed and everything else in BABIP determine a player’s hitting ability?”

      BABIP is essentially the Greek letter tagged onto the end of an economist’s equation (something economists love to do) which symbolizes “er’thing else.” In this case, we have an actual value for everything else, BABIP.

      • mcbrown says:

        I agree that it doesn’t complicate the regression – the regression is what it is. I think what I’m wondering is whether adding a ratio on a different denominator has introduced any logical problems.

        As some other commenters have said or hinted at, essentially what this equation does is take a direct wOBA or wRC calculation, remove some inputs and then add back a new input. We might just as easily use some other rate or rates in place of BABIP, or not remove 2B and 3B in the first place, since BABIP itself is defense-dependent.

        Sorry, I am having trouble articulating what I’m getting at. I would really need to reflect on this for a while to pin down exactly what I think the issue is here.

      • Dan says:

        yay epsilon!

      • mcbrown says:

        And after reflecting further I realize I am ignoring the biggest change in going from the standard wOBA/wRC calculation to your ShH – the change in weights, or the actual “FIP-izing” of the true outcome inputs (not to mention adding K’s, which are nowhere to be seen in wOBA). Something still doesn’t sit quite right about the mixing of denominators, but I think I have convinced myself that whatever issue may lie down that line of thought is a secondary effect.

        Thanks for this – you’ve certainly provoked a lot of thought from me!

      • Al Dimond says:

        There’s one reason I can think of that BABIP’s different denominator matters when comparing players: players can have vastly different BIP/PA ratios. BABIP matters more to players that put the ball in play more.

      • @Al: Ooh, good point. I need to think on that.

      • Al Dimond says:

        @Bradley: Maybe, instead of BABIP, you use BABIP*BIP%? I guess that simplifies to H/PA, or H%…

        But because the usefulness of this metric is looking at what a hitter’s offensive contribution would be with different BABIP, you’d probably hold BIP% constant and just vary BABIP for a player when drawing the “Kotchman spectrum” graphs. These graphs would then look more interesting — a TTO guy would have a different slope than a guy that puts the ball in play a lot.

      • corey says:

        Glad you pointed this out, because it’s what I’ve been thinking. By adding BABIP you’re making the formula define itself. I’m a little surprised your r-sq isn’t 1, probably a result of small errors in wRC+. Since I don’t really know exactly how wRC+ is calculated I can’t quite comment on that, but assuming it’s a more or less accurate reflection of baseball success (which it’s certainly trying to be, otherwise nobody would use it), then you’re basically defining runs created from its origins of walks, strikeouts, homers, and ability to find holes in the defense. There’s nothing left except perhaps stolen bases. So you’re defining runs created from its origins here; of course you have an absurdly high r-sq. If you broke it back down a little bit and took out BABIP I think you might have something productive here and I would really like to see what it looks like. Conceptually BABIP is obviously the last piece, but I think you need to drop it from the model at least initially, which has the added benefit of allowing us to see the added r-sq from BABIP.

        Also, why did you exclude p-values? They’re important and it casts suspicion on your data. Particularly since I’ve seen some people on this site try to make p<0.2 significant.

  10. Jerome S. says:

    Been waiting for something like this. Brilliant, absolutely brilliant. If we use FIP for pitchers, why aren’t we using FIH for hitters in some respect?

  11. Morse says:

    I can only imagine the player at the top of the 3rd graph is Jose Bautista.

  12. Derek says:

    Nice Article Thanks!!!!

    First thing, the Red Sox and Yankees don’t strike out A LOT more: they strike out 17% and 17.5% of the time, compared to the Rangers’ 15%. That is not a huge difference.

    Second thing, this is great. It is really like FIP for hitters. Yes there will be players who will be consistent outliers (Jose Reyes) just like there are pitchers who are constant outliers of FIP.

  13. I think you have a real insight with this construction, nice work. This is a harebrained question perhaps, but can one generate coupling, or interaction parameters between players that you sequence together, as in a batting order or portion of a batting order, to maximize an effect? Maybe I haven’t posed the question too clearly, but perhaps you can see where I’m going. Your comment about economists reminded me of something I read years ago in options theory reminiscent of that.

    • An excellent question! I myself have been wondering this thing, but with my limited intellect, it’s kind of like trying to eat a mountain — I’m not sure where to start, or if it’s even edible.

      This line of thinking sources directly with my frustration with the Rays. It seems like some of their offensive inefficiency comes from the collection of such similar hitter profiles. Therefore, a guy like Casey Kotchman (a contact hitter) can prove more valuable to the Rays relative to his value on other teams.

    • If there were such a way of relating performance, that would at the very least make trade acquisition time an interesting exercise in looking for synergy.

    • Where to start? I don’t have a reference, but the ‘coupling discussion’, if I recall, came up in the discussion of options pricing and securitization. That would have been sometime in the 1970s or early ’80s, when people thought Scholes was a shoe insert.

  14. Dan says:

    I keep seeing FIP for hitters, but really it is FIP plus BABIP for hitters, which is kinda throwing me for a loop. So let me get this straight, the use of this would be to ask “What would this player’s wRC+ be if he had a different BABIP?”

    • Dan says:

      I am thinking the reason this works is due to the BABIP and HR together? HR obviously ups wRC+, and as for the doubles and triples hitters, I am guessing that they hit lots of line drives which would be seen in an increased career BABIP? Is this a true statement?

      • “What would this player’s wRC+ be if he had a different BABIP?”

        Yes, that is precisely its biggest and best use.

        “I am guessing that they hit lots of line drives which would be seen in an increased career BABIP…”

        That seems to be the case, but probably could use more investigation.

  15. JTripp says:

    No one has said this yet so I will… shhhhhhhhHeeeeeeeeeeeeeeeeeeeet

    This is a very impressive piece of work. I’m amazed that the Three True Outcomes have that much of an impact on wRC+.

  16. Randy says:

    When I read the title, I immediately thought “FIP” and didn’t think it needed to be mentioned explicitly.

    • Yeah, whenever I see DIPS or defensive independent anything I think FIP, but it’s unfair of me to assume that word association is the same for everyone else. Indeed, I should have somewhere explicitly said: “This is a FIP for a hitter, plus BABIP.”

      • RC says:

        FIP isn’t defense independent. It’s highly reliant on BABIP, and actually thinks a pitcher is better if his BABIP is higher.

  17. Nick #2 says:

    Great work.

    I think you should call it FIB though, Fielding Independent Batting. Relates the stat directly to FIP and it’s a quick to tongue one syllable acronym.

  18. Raul Ibanez says:

    My favorite part was when you axed yourself that question.

  19. jim says:

    this is what i love about fangraphs: shit i don’t really fully comprehend, but which remains interesting and analytical.

  20. Vijay says:

    I said “Oh snap!” when I saw the R^2 of .94. Out loud. I haven’t said “Oh snap” since 2003.

  21. Mike says:

    So Bradley, when the Rays offer you a job are you going to take it?

  22. James M. says:

    If you want to incorporate 2B/3B, the easiest way is to substitute SABIP for BABIP. Of course LWTS would be more accurate but I don’t think it will make a big difference.

  23. Izzy says:

    This is an interesting invention. Perhaps you can use it for your next market inefficiency article. Bill James recently came up with something similar (I think it’s similar anyway) called the Abe Lincoln Scores: His formula counts all bip as the same regardless of whether it was a triple or a double play.
    His looks like this:
    (K*0)+(BIP*1) + (BB*2)+(HR*4). The idea is similar, right?

    • corey says:

      That can’t possibly be right, K*0 will always =0, so if it equals 0 why would it be included in the equation?

  24. satanorsanta says:

    As a Phillies fan I decided to check a few players against this for this season of play. Ryan Howard ShH 119 wRC+ 124. Shane Victorino ShH 127 wRC+ 152. Chase Utley ShH 123 wRC+ 147. Placido Polanco ShH wRC+ 88. And since he was talked about upthread, Jose Reyes ShH 123 wRC+ 152.

    It looks like ShH works okay for slow/average players but is pretty bad for fast players.

  25. Joey says:

    Hummm… shouldn’t this be called sHIT?

    anyway, joking aside, if it is finalized and incorporated I personally think either xHIT, FIH or xRC+ would maybe be a bit more of a self-explanatory name. ShH isn’t horrible of course, but it is fairly foreign sounding where something like xRC+ would fit right in with everything else around here already. One would see it and probably be able to figure out what it tells you pretty easily.

    Otherwise, interesting findings and fantastic job!

    • Really, my preference is that wRC+ would eventually just be called “hitting,” rather than the tongue-twisting acronym it is now. Likewise FIP should be “pitching.” Of course, if I can’t have my druthers, then the most accurate name for ShH would probably be xwRC+ (pronounced “ex-werker plus”?).

  26. jts5 says:

    excellent work. i was just thinking about this the other day

  27. david says:

    why did i have to stumble on this article at 11 pm? now i’m gonna be up all night fooling around with this.

    thanks for all the good work.

    • Wes says:

      Me too. Just ran the #’s on my most disappointing investment of the season: Evan Longoria. He’s doing everything right except not hitting it right at people. I hope he starts getting some hits to fall in soon.

  28. michaelfranko says:

    This is a pretty awesome article.

    Honestly it pretty much answers every possible question you would ever really want to know about variations in hitters’ performance year to year.

    In short, it is perfect; good work.

  29. Matt P says:

    The question I would ask is whether hitting doubles, triples and home runs is just luck. I would think that this isn’t the case.

    Therefore, instead of using HR%, I would use ISO. By using ISO, you are taking into account a player’s ability to hit for power. Also, I would recommend looking into adding a park factor. It’s harder to hit for power in certain stadiums.

    When I did a regression using ISO instead of HR%, ISO was statistically significant (P-Value less than .0001) and I had an increase in the R^2 value.

  30. Trickman says:

    Rather than regressing on wRC+, why don’t you regress on something less period dependent like wRC/100?

    You’d eliminate the impact of changing eras resulting in different normalizations and might be able to eliminate some more of the noise as a result.

    We all know that 100 wRC+ today is not going to result from the same BB%, K%, and HR% rates today as in 2004. The actual weighted runs created won’t be impacted by the league-wide shifts in offensive or defensive performance — there’s just a lower bar for what constitutes average.

  31. acerimusdux says:

    I think you would be better off including all extra base hits, not just HR. I have a measure which I have found extremely helpful for forecasting minor leaguers for future success, simply extra bases per strikeout, calculated (TB-H)/SO.

    Obviously, hitters do have a great deal of influence on doubles and triples, not just HR. It doesn’t make sense to just lump doubles and triples in with BABIP.

    There have been many attempts to divide offense into patience and power, with isolated power and isolated discipline being amongst others that have proven flawed. For valuing current performance, it seems generally best to use rates per PA as a measure. But maybe for predicting future performance, using SO as the denominator is more accurate?

    I haven’t really used BB/K in this way before, but I’ve used XB/SO, or (TB-H)/SO, for years with minor leaguers in this way, and it generally provides a very strong signal, not for how good a player is or will be, but for which minor leaguers will continue to produce at higher levels. What it really indicates is whether current performance will translate to future performance. Maybe it does that for MLB players as well?

    I’d love to see you plug XB/SO into your regressions and see if it improves things at all. It seems to me BB/SO and XB/SO would be the natural pair, conceptually.

  32. Blue says:

    Given the strong R-squared relationship, I’d be tempted to reduce the number of years of data down to, say, 2008-2011. You might lose a bit of R-squared, but the parameter estimates would better reflect the current offense environment.

    • Blue says:

      Never mind, I just caught that you did exactly this.

      If you’ve got the steroid-era years, you might want to do a quick run for a regression for seasons in the middle of that time and compare the parameter estimates with the 09-11 population.

  33. TheMooseOfDeath says:

    Bradley, if anyone were to heatedly argue with you about ShH (especially if they happen to be named Scott), please please please let it devolve into this:

    http://www.youtube.com/watch?v=mlv7Bp-L2MM
