Intro to Splits

As you’ve probably noticed, David unveiled split data as the newest addition to the site yesterday. This is something that has been in the works for quite a while, and David worked long and hard on getting this on the site. For the first time, we’ll be able to really break down how a player performs against different pitcher types, as things like xFIP by handedness of batter have not previously been available.

However, as RJ noted a bit this morning, we do want to encourage wise use of split data, because these are the types of numbers that can be abused at times. In practice, any split is a smaller subset of a larger sample, and when you reduce your sample size, you increase the amount of noise in the number. There’s no way around that.
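To put a rough number on that noise, here is a minimal Python sketch. It assumes each at-bat is an independent trial with a fixed chance of a hit, which is a simplification, and the .270 hitter and the 600 and 300 at-bat samples are made-up illustrations rather than anything from the splits pages.

```python
import math

def batting_average_standard_error(true_avg: float, at_bats: int) -> float:
    """Binomial standard error of an observed batting average.

    Assumes every at-bat is an independent trial with the same hit
    probability, which is a simplification of real baseball.
    """
    return math.sqrt(true_avg * (1 - true_avg) / at_bats)

# Hypothetical .270 hitter: a full season versus one half of a split.
full_season = batting_average_standard_error(0.270, 600)
half_season = batting_average_standard_error(0.270, 300)

print(f"600 AB: +/- {full_season:.3f}")
print(f"300 AB: +/- {half_season:.3f}")
print(f"noise ratio: {half_season / full_season:.2f}")
```

Under these assumptions, cutting the sample in half widens the standard error by a factor of the square root of two, from roughly 18 points of batting average to roughly 26.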

In fact, you can slice and dice numbers enough ways to always find some way that a player performed abnormally. Whether it’s batting average against lefties on Tuesdays or FIP in alternating months, these are the kinds of numbers that really mean nothing. They are the kinds of splits that give rise to things like the “lies, damn lies, and statistics” cliche. When looking at split data, we’d suggest limiting your conclusions to effects that are well known – platoons, parks, pull or opposite field results, etc.
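A quick simulation shows how easily that happens. The sketch below is only an illustration under stated assumptions: a hypothetical hitter whose true talent is exactly .270 in every situation, with his season carved into 20 arbitrary 30 at-bat splits. The function name and all of the numbers are invented for the example.

```python
import random

def simulate_split_averages(true_avg, at_bats_per_split, n_splits, seed=0):
    """Simulate batting averages in arbitrary splits for a hitter whose
    true talent is identical in every split (a deliberate assumption)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    averages = []
    for _ in range(n_splits):
        # Each at-bat is a hit with probability true_avg.
        hits = sum(rng.random() < true_avg for _ in range(at_bats_per_split))
        averages.append(hits / at_bats_per_split)
    return averages

# Hypothetical .270 hitter, 600 AB cut into 20 meaningless splits.
splits = simulate_split_averages(0.270, 30, 20)

print(f"lowest split:  {min(splits):.3f}")
print(f"highest split: {max(splits):.3f}")
```

Nothing about the simulated hitter changes from split to split, so any gap between the lowest and highest numbers is pure noise. That is the gap you would be tempted to explain if you went looking for a story.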

Finally, you also want to keep the overall performance of the league in a specific situation in mind when looking at split data. We’ll get league averages by situation on the site in the not too distant future, but here’s a sneak peek at some batted ball league averages (2002-2009), so that you can compare players against a baseline for each type of batted ball:

Bunts: .376/.376/.377, .336 wOBA
Grounders: .231/.231/.253, .214 wOBA
Flies: .217/.212/.602, .328 wOBA
Liners: .727/.723/.974, .734 wOBA
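If you want to work with these baselines yourself, here is a minimal Python sketch. The league numbers are the ones listed above; the dictionary, the function names, and the example .800 wOBA on line drives are hypothetical and only there to show the arithmetic. Isolated power is computed the usual way, as slugging minus batting average.

```python
# League averages by batted ball type, 2002-2009, as listed above:
# (AVG, OBP, SLG, wOBA)
LEAGUE_BASELINES = {
    "Bunts":     (0.376, 0.376, 0.377, 0.336),
    "Grounders": (0.231, 0.231, 0.253, 0.214),
    "Flies":     (0.217, 0.212, 0.602, 0.328),
    "Liners":    (0.727, 0.723, 0.974, 0.734),
}

def isolated_power(ball_type: str) -> float:
    """League ISO for a batted ball type: SLG minus AVG."""
    avg, _obp, slg, _woba = LEAGUE_BASELINES[ball_type]
    return slg - avg

def woba_vs_league(ball_type: str, player_woba: float) -> float:
    """How far a player's wOBA on one batted ball type sits from the baseline."""
    league_woba = LEAGUE_BASELINES[ball_type][3]
    return player_woba - league_woba

for ball_type in LEAGUE_BASELINES:
    print(f"{ball_type:<10} ISO {isolated_power(ball_type):.3f}")

# Hypothetical player with an .800 wOBA on line drives.
print(f"Liners vs league: {woba_vs_league('Liners', 0.800):+.3f}")
```

The point of the comparison is that a .700 wOBA on liners is below average for that batted ball type, even though it would be an absurd overall number.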

It really is stunning how important hitting line drives is. Unless you’re regularly pounding fly balls over the wall, any other batted ball type is just not very productive. In fact, when you look at the BABIP split for fly balls, you see that 87 percent of non-HR flies result in outs. Line drives are where it’s at.

We’ll have more on the proper way to use split data over the next few days. Enjoy them, find interesting nuggets hidden away, but also remember to use them judiciously. You don’t want to voluntarily cut your sample size in half if you don’t have a reason to.

Dave is a co-founder of and contributes to the Wall Street Journal.

21 Responses to “Intro to Splits”

  1. B N says:

    (Head scratch) I’m clearly missing something here as far as a minor detail. For Flies and Liners, do we know why the OBP is less than the AVG? Is this counting plays where guys are getting thrown out when going for an extra base or something of that sort?

  2. BS says:

    I’ve always been curious about this, but how are flies and liners distinguished? Is it arc angle? Ball flight height? Or is it a more subjective determination?

  3. Lucas A. says:

    The league averages would be an amazing addition.
    Dave, is there any chance of defense-independent pitch value statistics based on batted-ball type making way onto the site?

  4. don says:

    .376/.376/.377 – okay, who has a video of the extra base bunt?

  5. glassSheets says:

    Joe Mauer’s batted balls by pull-center-opposite are great fun to look at

  6. Detroit Michael says:

    It’d be nice to have a refresher about the level of platoon persistence. I know the research has been done before, but did the statheads reach consensus on whether a batter who clobbers LHP with an exceptionally large historical observed platoon split can be expected to do that going forward? Or should one just project his overall batting ability and expect a normal platoon split?

  7. Pip says:

    Super addition!

    As a future improvement, I humbly request pitcher splits by role (starter, reliever).

  8. Joel says:

    Dear Eric Van, …

  9. went9 says:

    Thanks for the splits Dave. It’s cool to have lefty/righty splits at your fingertips. It just keeps getting better here.
    Again, thanks much.

  10. labe says:

    here is just something i was thinking about….
    this may be exactly what tra* does, so i’m not sure if it’s anything new.
    knowing the average slugging percentages per ball in play allows us to create a babip sort of idea for “isolated power against”.
    using regressed batted ball % i calculated what a pitchers expected iso against should have been. just to name a few pitchers, some chosen because i expected a large difference…
    pitcher      expected iso   iso   diff
    kershaw:     109            82    -27
    volstad:     115            201   +86
    u.jimenez:   98             97    -1
    nolasco:     126            174   +48

  11. labe says:

    so it’s easier to read:
    ………….expected iso…..iso…..diff
    kershaw: 109…………….82…. -27
    volstad: 115…………..201… +86
    u.jimenez: 98……………97…… -1
    nolasco: 126………… 174….. +48

  12. FireOmar says:

    any chance we could get day/night splits? had to go over to for those. Miguel Cabrera day game hangover related.

  13. Dan says:

    Any plans to add these to the “leaders” pages so they can be sorted, etc?
    Keep up the great work.
