So Bad We Don’t Qualify

The Astros released Bill Hall last week to make room for the returning Jason Bourgeois. The 31-year-old Hall had been awful this season for the Astros, producing 0.9 wins below replacement. His fielding has eroded in recent years, and his offense, aside from what appears to be a fluky uptick last season, has left much to be desired for almost five seasons. His wOBAs since 2006? Try .317, .297, .261, .342, .269.

Sure, there was some reason to think that the .342 might be more indicative of his offensive proclivities, but the signing was odd for a team like the Astros. To guarantee $3 million at the major league level and include an option for next season suggests that the team believed Hall had some upside. They still have Jeff Keppinger under control, but his injury opened up a spot. Since Hall isn’t a puts-butts-in-seats kind of guy, it probably would have made more sense to use some farmhand in the spot until Keppinger returned.

Where did his -0.9 WAR rank, you ask? Good question, I say, because the theme of this post shifted from why signing Hall made little sense for the Astros to something I noticed while scanning the leaderboards. By default, our leaderboards show only batters who qualify for the batting title. Sorting the players by WAR from the bottom up, I expected to see Hall’s name toward the top of the list. It wasn’t there. The list went:

- Chone Figgins (-1.2)
- Aubrey Huff (-0.9)
- Miguel Tejada (-0.9)
- Dan Uggla (-0.8)
- Orlando Cabrera (-0.8)
- Hideki Matsui (-0.8)

Hall was nowhere to be found, even though his -0.9 was the same as Huff’s and Tejada’s mark. Hall has been so bad this season that he doesn’t even qualify for most leaderboards.
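For anyone who wants to see the effect of that default filter, here is a minimal sketch. It is not how our leaderboards are actually built; the function name and data layout are made up for illustration. The WAR values are the ones quoted in this post, and the qualified flag only records whether a player showed up on the default list above or is one of the non-qualifiers discussed in this post.

```python
# Each row: (player, WAR, qualifies_for_batting_title).
# WAR values are copied from this post. The boolean is an assumption-free
# shorthand: True for names on the default (qualified) list, False for the
# non-qualifiers the post goes on to discuss.
PLAYERS = [
    ("Chone Figgins", -1.2, True),
    ("Aubrey Huff", -0.9, True),
    ("Miguel Tejada", -0.9, True),
    ("Dan Uggla", -0.8, True),
    ("Orlando Cabrera", -0.8, True),
    ("Hideki Matsui", -0.8, True),
    ("Magglio Ordonez", -1.1, False),
    ("Bill Hall", -0.9, False),
    ("Edwin Encarnacion", -0.9, False),
    ("Willie Harris", -0.8, False),
    ("Dan Johnson", -0.8, False),
    ("Reid Brignac", -0.8, False),
    ("Tyler Colvin", -0.8, False),
]


def trailerboard(rows, qualified_only=True):
    """Return rows sorted from worst WAR to best (hypothetical helper).

    With qualified_only=True, non-qualifiers are dropped first, which
    mimics the default leaderboard setting described in the post.
    """
    pool = [r for r in rows if r[2]] if qualified_only else list(rows)
    # sorted() is stable, so players tied on WAR keep their input order.
    return sorted(pool, key=lambda r: r[1])


if __name__ == "__main__":
    print("Default (qualified only):")
    for name, war, _ in trailerboard(PLAYERS):
        print(f"  {name} ({war})")

    print("No playing-time filter:")
    for name, war, _ in trailerboard(PLAYERS, qualified_only=False):
        print(f"  {name} ({war})")
```

With the filter on, Hall never appears; with it off, he lands in the group tied at -0.9 alongside Huff and Tejada.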

Granted, WAR is a counting statistic that shouldn’t require any type of playing time filter, but this struck me as incredibly interesting. The correlation makes sense: play poorly and your playing time will be cut. But Hall and his colleagues in this regard had to be monumentally bad to the point that they soared to the top of the trailerboard in such a small relative sample of plate appearances. Here’s a quick look at these players:

Magglio Ordonez (-1.1 WAR): The captain of this club, Ordonez has been hampered by an ankle injury. That might explain part of his .207 wOBA in 106 plate appearances, but not all. Fun fact (as long as you aren’t Magglio Ordonez): Jose Bautista’s wOBA is exactly 2.5 times as high. While Ordonez isn’t a full-time fielder anymore, his efforts in 73 outfield innings this season were three runs below average. It’s too early to close the book on him given the injury and his success at the plate over the last three seasons, but his ability to recover from said injury remains to be seen.

Bill Hall (-0.9 WAR): In 158 PAs he struck out a whopping 37.5 percent of the time, compiling a .224/.272/.340 line with a -7.7 UZR. His line drive rate improved to right around 24 percent, and his BABIP followed suit at .341, but when the ball is rarely put in play those rates don’t matter all that much. Of players with at least 50 PAs, only Adam Dunn, Ryan Langerhans, Kelly Shoppach, Ramon Castro, and Ian Stewart have whiffed more frequently. Dunn is in the midst of an unusually horrific season; Langerhans and Stewart are having trouble holding onto major league jobs; and Shoppach and Castro are catchers, so they can get away with it given their place on the defensive spectrum.

Edwin Encarnacion (-0.9 WAR): Some players are on here as a direct result of their suckitude with the bat. Encarnacion, aptly nicknamed E5, makes this list primarily on the lack of merit of his glovework. In just 43 games his UZR is a spectacularly bad -9.1. If you’re more into the fielding percentage metric, consider this factoid Dave Cameron discussed recently. As of May 27th, Encarnacion had a .784 fielding percentage. The next worst was Andy LaRoche at .892, a gap of 108 points. Encarnacion is one of the worst fielders in baseball history, and if he can’t even match the lower league average with the bat, what actual purpose does he serve?

Willie Harris (-0.8 WAR): Harris is primarily a pincher these days, hitting for pitchers, running for slowpokes, or replacing defenders late in games. The last part is particularly ironic given that he has stunk, in limited action, in both the infield and outfield. The assumption that, because he is fast or was once fast, he still has the speed, instincts, and first steps necessary to ably field certain positions is laughable. His .220/.310/.300 line should improve, but it’s high time the Mets started using him strictly as a pinch-hitter and pinch-runner instead of thinking his fielding abilities from 2004 are still in the repertoire.

Dan Johnson, Reid Brignac, Tyler Colvin (-0.8 WAR): Based on my Twitter feed, I thought Dan Johnson was one of the best players in baseball over the last few years. There isn’t really much to say for DJ, who walks quite a bit but doesn’t have much power outside of Japan and isn’t an elite fielder. Getting a .350 OBP and a .430 SLG from a first baseman is okay in a transition year, with a stopgap, but he couldn’t even get to that point.

Brignac was on a league average pace last season, hitting a bit below average, but with solid fielding skills at the toughest infield position. This season, he has a .171 wOBA, and while the fielding skills haven’t gone away, he has a .171 wOBA.

Colvin hit .254/.316/.500 last season, leading many to wonder if he was “for real”. He has only stepped to the plate 85 times this season, but has a mere seven hits. His walk rate has improved, and the strikeout rate isn’t substantially worse. His major issue is a 12.7 percent line drive rate and a microscopic .094 BABIP. He’ll definitely get another chance, and he cannot possibly be this bad. He might not be for real in terms of blossoming into a hard-hitting corner outfielder, but he also isn’t anywhere near as bad as he has looked this year.

Some of these players have clear and established levels of productivity that suggest their performance this season is nothing other than small sample silliness. For others, like Hall, it more likely signifies the decline or end of a career. It’s easy to stink in the major leagues if skills slip by even a small margin, but it seems very difficult to be so bad as to not qualify for a leaderboard.




Eric is an accountant and statistical analyst from Philadelphia. He also covers the Phillies at Phillies Nation and can be found on Twitter.


29 Responses to “So Bad We Don’t Qualify”

  1. Jason T says:

    Bugger off, Chone.

  2. Jay says:

    Colvin is 0 for 23 on ground balls. Maybe Darwin Barney can teach him how to get those to go through.

    • CircleChange11 says:

      At the ML level, I would expect poor contact (medium speed) ground balls to be recorded for outs 23 times out of 23.

      Hard-hit ground balls are a different story, as are very weak ground balls.

      Actually, in terms of just getting on base, you’re likely better off at the ML level hitting a weak, weak, grounder rather than a routine one.

      Colvin’s also a lefty, and if he’s hitting grounders, he’s likely “rolling over” on outside half pitches. Those balls are not hit with authority.

      Batters have demonstrated that if you hit balls hard enough, you can get groundball hits even with a “lefty shift” on.

      Just saying “ground balls” isn’t specific enough. If we watched all 23 at bats where he grounded out, would we classify ANY of them as being good at bats or good contact?

      I really like Colvin (even as a cardinal fan), but the league adjusts, advance scouts scout, etc.

      I feel bad for Bill Hall … always seemed to be a good guy, gives a lot of effort, would play anywhere. I know he’s had a good career, and made a living playing ball … I just hate to see veteran guys struggle toward the end. I know it’s the “circle of life” and all that … just tough to watch sometimes.

  3. Hayson Jeyward says:

    So… are small sample UZR measurements valid or not? It seems that we see references to these tiny 2-month sample sizes daily in one article or another, despite being admonished and warned by other authors that these sizes are essentially meaningless at this point in the season.

    Are the 146 defensive innings played by Willie Harris or 73 by Magglio Ordonez thus far in 2011 really worth using?

    Obviously various authors have different opinions, but it seems that using UZR in such limited sample sizes is erroneous and misleading.

    • Reverend Black says:

      People never get this. Small UZR samples have very low predictive value, but that doesn’t make them “invalid”. They are real data and should be cited as freely as anything else. What you must do is wait much longer before drawing conclusions about a player’s ability based on UZR samples smaller than ~3 seasons.

      This isn’t different from offensive stats except in the size of the sample required. You don’t take a week’s worth of PA and draw a firm conclusion about a hitter’s ability. But you also don’t consider that data “invalid”.

      • Luke in MN says:

        “This isn’t different from offensive stats except in the size of the sample required.”

        This isn’t right. It is not at all clear how well the defensive metrics are actually measuring what they’re attempting to measure. There’s no way to “check” and see that the data is actually measuring things correctly in small sample sizes. That’s not to say it’s entirely worthless in small sample sizes, but it’s very different from offensive statistics, which measure whatever they’re measuring with nearly perfect and verifiable accuracy.

      • don says:

        Except that offensive events are much easier to classify. A single is a single, and except in the relatively rare cases where a fielder dove and missed and the scoring decision is a little subjective, there isn’t much more to it. A foot or two may be all that separates a sharp grounder an average third baseman would get to from one that would take a very good defensive play to turn into an out, and I don’t think that the batted ball classification is quite precise enough to be sure that UZR is even a perfect measure of what has happened in small sample size.

      • Bronnt says:

        What the two commenters above me just said. Small sample sizes in UZR are going to have a lot of noise wherein some fielders will see a lot of balls more sharply hit than others within certain zones that will make the dataset messy. Over very large sample sizes, these things get smoothed out to the point that you can rely on it a bit more, but it’s a bit troubling to say that the data is perfectly accurate and acceptable at face value in small samples.

        I mean, for an example I’m familiar with, Jordan Schafer of the Braves has been in Atlanta for a total of 10 games. He’s made a total of 29 plays (19 in zone, 10 out of zone). UZR says that he’s already saved FOUR runs above AVERAGE. I fully believe he’s a really good defensive centerfielder, but to say that he’s been 4 runs better than an average centerfielder over 90 defensive innings is insane.

        Yes, I did pull out an ultra-small sample size, but that’s the point. It’s not just that it’s a non-predictive sample, it’s also a sample that I do not believe could accurately be describing what’s really going on: that somehow an average centerfielder (by all means a good defensive player) allows 4 more runs to score on just 29 total plays.

      • Reverend Black says:

        Luke – you seem confused about what the word “this” refers to in the sentence you quoted. I haven’t said that UZR isn’t different from offensive stats, I’ve spoken to the issue of sample size in relation to validity. With respect to that issue, there isn’t a difference between offensive stats and UZR.

        There are validity issues that come with metrics like UZR, but they are related to measurement, not sample size.

      • Reverend Black says:

        Don – “A foot or two may be all that separates a sharp grounder an average third baseman would get to from one that would take a very good defensive play to turn into an out and I don’t think that the batted ball classification is quite precise enough to be sure that UZR is even a perfect measure of what has happened in small sample size.”

        Shifting the goalposts a bit. Offensive stats, as you had just acknowledged, aren’t going to perfectly measure (reflect) what happened either. I certainly haven’t claimed or implied here that UZR perfectly reflects anything. We are talking about validity only as it is affected by sample size.

        The issues you are pointing out are related to sample size only indirectly. In other words, they are issues related to the ‘mechanics’ of the metric itself, which are not mitigated significantly by a sample getting larger.

      • Reverend Black says:

        Bronnt – “Over very large sample sizes, these things get smoothed out to the point that you can rely on it a bit more, but it’s a bit troubling to say that the data is perfectly accurate and acceptable at face value in small samples.”

        That would be a troubling thing to say. I’m glad I’ve never said it.

        “I fully believe he’s a really good defensive centerfielder, but to say that he’s been 4 runs better than an average centerfielder over 90 defensive innings is insane.”

        With all respect, this is a conclusion for which you haven’t shown your work. The “makers” of UZR have shown theirs.

        “Yes, I did pull out an ultra-small sample size, but that’s the point. It’s not just that it’s a non-predictive sample, it’s also a sample that I do not believe could accurately be describing what’s really going on”

        Skepticism is fine, but more is required to conclude that UZR data is invalid in small samples.

      • matt w says:

        Another point to what the previous commenters said is that UZR doesn’t claim to measure on-field happenings. OBA, for instance, measures how often a player reached base safely per plate appearance (not counting fielders’ choices, I think). That’s an objective measurement of what happened on the field — if someone’s OBA is .300, you know they reached base 30% of the time, no matter what the sample size is.

        UZR (rolling the tape) “puts a run value to defense, attempting to quantify how many runs a player saved or gave up through their fielding prowess (or lack thereof).” That’s not a measure of what happened on the field; it’s a measure of what happened on the field compared to what would have happened on the field if an average fielder had been playing. You need a reasonable sample size to make any judgments about what would have happened on the field if an average fielder had been playing. So in a small sample size, there’s not enough basis for saying that fielder X saved (UZR many runs) compared to the average fielder.

      • Bronnt says:

        R. Black-

        UZR DOES have validity concerns, for the reasons Luke specified. There’s no objective, verifiable measure of comparison. I can’t recreate the circumstances of every game with the substitution of a league average defense. Even in a small sample size, I can say “This player reached base 4 times in 5 plate appearances,” and I can definitely say it’s better than average over the sample cited, even though it’s meaningless.

        Due to difficulties involving the data uptake of UZR, there are reasons to be concerned that subjective elements are coloring the data. Considering that UZR entirely ignores things like pop-ups and line drives for infielders, may lack the proper camera angle at various points for accurate classifications, and runs the risk of a video scout overlaying his subjective views onto the data (which is obviously a concern, since otherwise BIS wouldn’t rotate their scouts), validity concerns increase in small sample sizes for UZR.

      • Reverend Black says:

        Matt – “You need a reasonable sample size to make any judgments about what would have happened on the field if an average fielder had been playing.”

        Why do you say that? The average fielder is a useful mathematical fiction in the exact same way the average hitter and average pitcher are. (No questions about the validity of FIP- or wRC+ offered yet.) They are dynamic, season to season, but that’s not really a challenge to the validity of the metric; it’s just an important thing to remember about what the numbers are expressing.

      • Reverend Black says:

        Bronnt – We seem to be talking right past each other now. I haven’t and won’t defend every aspect of UZR from every situation. I commented in response to the idea that it’s sample size specifically that undermines the validity of UZR data. Sample size only undermines the data’s reliability (predictive value).

        There are a handful of serious issues & limitations with UZR data in general – no disagreement there. But to say that the data isn’t valid in small samples because of them is to say that the data is never valid. Validity isn’t a sliding scale like reliability is.

      • matt w says:

        “The average fielder is a useful mathematical fiction in the exact same way the average hitter and average pitcher are. (No questions about the validity of FIP- or wRC+ offered yet.)”

        But FIP- and wRC+ don’t claim to measure the players’ performance versus what the average player would have done in their stead; they claim to measure the players’ performance versus the average of what other players actually did. FIP- can easily be expressed in terms of the average of others’ FIPs; UZR isn’t expressed in terms of the average of any statistic for other players.

        The problem isn’t the average, it’s the counterfactual nature of what’s being measured. FIP- works because it doesn’t say “An average pitcher would’ve given up this many runs compared to this pitcher,” it says “This pitcher’s FIP was a certain percentage above or below the average.” UZR says “This player made a certain number of plays more or less than the average player would have made given the balls that were hit to them.” How are you going to express that without the supposed mathematical fiction and without the counterfactual?

      • matt w says:

        Think of this — suppose that I developed a statistic, WZR, designed to quantify how many runs a center fielder gave up in the field compared to how many Willie Mays would’ve given up in the field on the same plays. Willie Mays isn’t a fictional construct; but WZR would still be counterfactual, because we can’t compare the player’s performance to Mays’s performance on the same balls. We still need to come up with some way of measuring the counterfactual, WWWMHD (what would Willie Mays have done)? And looking at a small sample of fly balls, we wouldn’t be able to come up with an accurate measure of WWWMHD; so in a small sample WZR couldn’t be said to measure what actually happened on the field.

        That’s why the issue isn’t with the mathematical fiction of the average fielder; it’s with the counterfactual nature of the statistic.

    • gort says:

      I think when you’re 1) looking at extremes, and 2) not really attempting to make scientific projections, SSS isn’t such a big deal

    • Bryz says:

      The small sample UZR numbers aren’t really valid. However, they do show what has happened so far, but should not be used to show what to expect in the future. E5’s UZR is terrible right now, in fact his 2011 UZR/150 at 3B is -73.6. His career UZR/150 is -13.1.

      Long story short, E5 has been absolutely horrific on defense thus far, but we can’t assume that it will stay that way for the rest of the season on the basis that he’s been a much better defender (albeit still bad) in his career.

  4. j6takish says:

    What happened to Chone Figgins? Players with his skill set don’t typically fall off of a cliff like that. Defense, OBP, and Stolen bases aren’t exactly ‘old player’ skills

    • Bronnt says:

      Last year, Figgins’ problems were largely a result of being asked to play a defensive position which he, historically, did not play very well. This year, his problems seem to be mostly BABIP related. He was always a guy that relied on a high BABIP and a very solid BB rate. Sometimes when guys are pressing hard, and their average goes down, they get more aggressive and don’t take as many walks as they should, which might be why his BB rate is declining. But right now he’s 121 points below his career BABIP. There might be some correlations in his batted ball data, but he’s not this bad a hitter.

    • Neuter Your Dogma says:

      Or….His problem is he got his 36M guaranteed in 2010 and decided to retire.

    • CircleChange11 says:

      Be careful not to confuse the “symptoms” for the “problem”.

      The “symptoms” are his defense, or his BABIP, etc.

      That’s not what the “problem” is. I don’t know what it is either, but low BABIP isn’t a problem … it’s the symptom of a problem.

      Pitchers seem to be in the strike zone more than in the past, whether it’s due to a “new era” (if you will), the emphasis on “making them hit their way on”, or whatever. But there’s really not a reason not to challenge Figgins on every pitch. Walking him would be the dumbest thing you could do.

  5. Pat says:

    I think a lot of this data just goes to show that it’s not easy to find an average player these days. I like WAR and think it makes a ton of sense but these “replacement players” are starting to seem like a myth. You really don’t know what to expect anymore from a veteran utility player or a AAA callup, and if every team knew they would give them a mediocre performance, then no one would sign a guy like Hall for more than the minimum.

    • DO YALL WANT A HAM says:

      If you want to see a replacement player, just look at half of the Tigers’ lineup on any given day.

  6. Shattenjager says:

    I was always amazed by the fact that David Ortiz once had -0.5 WAR in a season . . . in 25 plate appearances.

  7. beastwarking says:

    Figgins :(

  8. Joel says:

    WWWMHD (what would Willie Mays have done)?

    That is exceptional!
