It Took This Long to Release Davies?
The Royals officially released Kyle Davies earlier this week following poor performances in parts of five seasons. Acquired from the Braves in exchange for half of Octavio Dotel’s 2007 season, Davies seemed like a solid return. He threw hard, complemented the heater with a decent yet underwhelming curve, and was not yet eligible for arbitration. In other words, he was exactly the type of project the Royals were looking for. While the trade itself was good for the team at the time, Davies has arguably been the worst regular starter in baseball over the last five seasons. His presence on the Royals roster this season, as well as his career numbers, invites discussion regarding tendering contracts, building rosters and large gaps between ERA and estimators.
Since debuting in 2005, Davies has thrown 768 innings with below average peripherals, poor controllable skill marks and abysmal run prevention rates. He has a career 6.4 K/9 and 4.3 BB/9, and a 39 percent groundball rate. He doesn’t miss many bats, exhibits poor control and struggles to keep the ball on the ground. The anti-Halladay, if you will. He stranded runners at a meager 67 percent clip, which hurt mightily given his 1.62 WHIP and .318 batting average on balls in play. Plenty got on and plenty came around to score, translating to a 5.59 ERA. Though his xFIP and SIERA were a bit kinder — they are identical at 4.91 — they didn’t exactly vouch for his performance.
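For readers who want to see how those rate stats relate to raw totals, here is a minimal Python sketch. The function names are my own, and it assumes a flat 768 innings with no partial innings. Because it works backward from the rounded rates quoted above, the totals it prints are approximations, not Davies’ official counting stats.

```python
def per_nine(count, innings):
    """Events per nine innings, the form used for K/9 and BB/9."""
    return 9.0 * count / innings


def whip(walks, hits, innings):
    """Walks plus hits per inning pitched."""
    return (walks + hits) / innings


def total_from_per_nine(rate, innings):
    """Invert a per-nine rate back into an approximate raw total."""
    return rate * innings / 9.0


# Career figures quoted in the article (rounded, so results are approximate).
INNINGS = 768
approx_strikeouts = total_from_per_nine(6.4, INNINGS)
approx_walks = total_from_per_nine(4.3, INNINGS)
# WHIP times innings gives the implied total of walks plus hits.
approx_baserunners = 1.62 * INNINGS

print(round(approx_strikeouts), round(approx_walks), round(approx_baserunners))
```

In other words, the quoted rates imply roughly three baserunners via walk or hit for every two innings, with only about one and a half strikeouts per walk.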
Since 2007, the year he was traded to the Royals, 83 pitchers have thrown 600+ innings. Among them, Davies has the sixth lowest WAR (5.6), the worst ERA (5.40), third worst SIERA (4.87), fifth highest BABIP (.313), second worst WHIP (1.58) and second worst strand rate (66.8 percent). No matter how one slices it, Davies has been the worst, or one of the worst, starting pitchers in baseball. His career, however, has been anomalous in the sense that most pitchers with numbers this bad don’t have the benefit of throwing almost 800 big league innings.
His release isn’t all that surprising, at least when compared to the shock accompanying his being tendered a contract this season in the first place. The Royals already had a crowded rotation and paying Davies $3.2 million was a complete and utter waste.
Dayton Moore has done wonders in rebuilding a barren farm system since taking over, but he has shown virtually no sign whatsoever that he can competently build around those prospects with major leaguers. Tendering Davies a contract of that ilk was just another negative feather in that cap.
Now, it’s easy to look at his WAR and estimators — 4.21 SIERA this season, 2 WAR last season — and conclude that he isn’t that bad. Generally, that argument would pass muster, but I would argue that Davies is a reverse-Zambrano, a Bondermanian disciple whose ERA will never match what the controllable skills suggest. Experiencing a large disconnect between ERA and FIP in a single season isn’t tremendously uncommon, but the number in our sample shrinks substantially when a single season extends to multiple ones.
Since 2005, Davies’ ERA has been 70 points worse than his FIP, which is by far the largest gap among qualifying pitchers. Behind him are Mark Hendrickson, Jeremy Bonderman, Ricky Nolasco and Javier Vazquez. While Hendrickson isn’t noteworthy for anything other than his height and basketball skills, the other three are all infamous for posting ERAs in excess of their estimators. Pitchers in this mold, and those on the opposite end, tend to defy what estimators suggest as some portion of their skillset prevents the two sets of information from forming a consensus.
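To make the ERA-minus-FIP gap concrete, here is a rough Python sketch of the standard FIP formula and the gap expressed in “points,” reading one point as 0.01 runs. The function names and the sample stat line are hypothetical, not Davies’ actual totals, and the 3.10 constant is only a typical value; the real constant is set each season so that league FIP matches league ERA.

```python
def era(earned_runs, innings):
    """Earned runs allowed per nine innings."""
    return 9.0 * earned_runs / innings


def fip(hr, bb, hbp, k, innings, constant=3.10):
    """Fielding Independent Pitching.

    The constant is an assumed, typical value; it varies by season.
    """
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / innings + constant


def era_minus_fip_points(era_value, fip_value):
    """Gap in 'points', where one point is 0.01 runs."""
    return round((era_value - fip_value) * 100)


# Hypothetical season line, for illustration only.
sample_era = era(earned_runs=120, innings=200)
sample_fip = fip(hr=25, bb=90, hbp=5, k=140, innings=200)
gap = era_minus_fip_points(sample_era, sample_fip)

print(f"ERA {sample_era:.2f}, FIP {sample_fip:.2f}, gap {gap} points")
```

A positive gap means the pitcher allowed more earned runs than his strikeouts, walks and home runs alone would suggest, which is the side of the ledger Davies has lived on.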
It is without question that, over a smaller sample, ERA estimators inform on more levels about a pitcher’s effectiveness than ERA itself. But over a longer period of time, and over multiple seasons, ERA tends to emerge as a more predictive source of information. In Davies’ case, I would argue that his 768 innings are a sufficient sample to determine that his run prevention skills are quite poor, and that his estimators don’t paint a more accurate portrait of his performance.
Of course, since WAR is contingent upon his controllable skills, it is likely that his value has been inflated, and that his career 5.9 WAR tally should be lower. On a non-guaranteed, one-year deal, perhaps he is worth a flier, but for a team like the Royals, he didn’t make sense last year and made even less sense this season. While he finds himself in the company of Bonderman, Vazquez and Nolasco, there is a clear difference between those three and Davies: their estimators suggest all-star level production.
Davies will undoubtedly latch on elsewhere whether it’s this year or next, and might even pitch another few seasons, but teams shouldn’t count on much from him. It’s more likely than not that his career ERA is closer to his true talent level than his estimators, even without having thrown 1000 total innings yet.












I just don’t understand why they wanted to continue to put a middle reliever into the rotation. He makes me think of Kyle Snyder, who became marginally effective for a little while in long relief on the back of some still mediocre peripherals. He was consistently terrible in the rotation, though. But why would you want to pay $3.2m to find out if you have a marginal middle reliever?
In addition to the whole “being a lousy major league pitcher” thing, there is also the matter of drinking, apparently.
Davies was arrested for drunk and disorderly in Florida last night:
http://florida.arrests.org/Arrests/Hiram_Davies_5672104/
I’d guess the one caused the other.
This is actually an excellent argument for why WAR should be based on actual run scoring results, not component-based estimates of what the runs allowed “should” be. If the estimators are wrong for a player, WAR is wrong, and if the estimators aren’t wrong, there’s not much difference anyway. I understand why it’s done your way, because it makes WAR more predictive of future performance, but it throws it into question as a summary of what actually happened. As you acknowledge here, there are players for whom the components never seem to add up, and ERA is probably better for longer periods.
Baseball Reference, as I understand it, uses actual event based WAR, which is why they have a better -2.6 WAR for Davies. That sounds more reasonable, there’s no way you could convince anyone that Davies was significantly above replacement level (except 2008).
Just to clarify my own comment, I realize that FIP relies on “actual events” like K’s, BB’s and HRs, but since these have been used because they are typically predictive of runs allowed (esp. in the future), it might make more sense to use the runs, rather than what helps you predict them. Cut out the middle man, since WAR is typically used as a “what happened” type stat, rather than a predictive tool.
A player who is consistently “unlucky” with runs allowed (FIP-ERA) hasn’t accumulated value despite the poor results; he has shown potential for future value. Davies is a good example of this, and right now, so is Brandon Morrow – WAR here loves him, 6.9 over the last two years, while bbWAR gives him 2.7. If I was thinking about valuing him as an asset, maybe using fWAR is better. But in evaluating how much value towards winning he has provided, I lean to the bbWAR.
fWAR’s way is correct because it identifies what the pitcher is responsible for. brWAR simply averages the defense’s contributions across all pitchers and calls it done.
But pitchers are responsible for not pitching well with runners on base (strand rate is the typical culprit for large differences). It is a skill, and one with an identifiable change in approach: pitching from the stretch.
The article itself notes that fWAR doesn’t work for Davies and a series of other pitchers. brWAR doesn’t seem to have this problem as it correctly says Davies was awful. Acknowledging that for small samples fWAR is probably better, what type of pitcher does brWAR have trouble with? I probably knew this at one point…
I truly believe using FIP is more accurate than using ERA, however, we all have to understand that no stat is, in and of itself, perfect. Using FIP for WAR paints a very accurate portrait of true pitcher skill, but there are always going to be outliers. Kyle Davies is probably one of the guys where ERA is a better indicator. Most other pitchers, the opposite is true. So it’s not a one-or-the-other proposition here. Using FIP is accurate probably 95 percent of the time — but the other 5 percent are often publicized much more.
“But pitchers are responsible for not pitching well with runners on base … it is a skill …”
Can you post some research to back this assertion up?
Look, pitchers do something physically different to throw the ball when there are runners on base, and they differ in how well they keep runners from stealing. Do either of these things need research to back them up? Is it really a leap to say that some pitchers are better at this than others? Pitching from the stretch is not some finer, invisible point of the game, it’s a major change to the pitching motion that every player has to figure out. Some actually get better. I am aware that pitchers generally don’t lose velocity with the change in motion (although this would seem to indicate they are trying to throw harder with runners on too, or that the windup is useless).
I don’t have any comprehensive research, but someone like Javier Vazquez, with a way higher fWAR (56) than brWAR (37), was also way better with no one on base (.708 OPS) as compared to with men on (.776 OPS), with a drop in K% (19% of PA with men on, 22.4% with bases empty). If I recall, he stunk at preventing steals too.
I’m not in strong disagreement that FIP is better than ERA (although RA would be preferable to me for the modern low error total game), I just think they tell you different things.
Returning to Morrow, has he currently had a good season? I would argue that no, not really. It’s an encouraging season, but his results have not been very good so far. FIP says that’s unlikely to continue (although it’s been two years now, it’s getting annoying as a Jays fan), which is a rational evaluation of the “potential” of his year to this point. But I don’t think it’s a rational evaluation of his actual year. Using Morrow is a bit unfair though, since he’s the biggest outlier for two years in a row now.
OK, ending my long-winded point: Usually FIP and ERA match reasonably well, so most of the time it won’t matter which one you use. When FIP and ERA don’t agree, that means the results have been either notably worse or better than expected based on BB, K, and HR rates. FIP tells you what to expect going forward much better than ERA, that’s been shown. But what it’s been shown to be better at is predicting future ERA (RA?), which kind of acknowledges that runs against prediction is the goal. I’m not sure why you reverse this process, and are essentially predicting past, already known, results.
Actually, keep it as it is, and that way both remain available. Not that you were going to change it based on my comments…
Eric, in fairness, it’s not like the Royals were running out Mark Belanger all those years at shortstop along with Utley at 2nd and Beltre at third.
So there’s that too.
“Dayton Moore has done wonders in rebuilding a barren farm system since taking over, but he has shown virtually no sign whatsoever that he can competently build around those prospects with major leaguers.”
I know this is the knock on Moore, but considering that this is the first year any of those prospects have made the majors, how do we know this? He did a credible job filling in and building around them with Cabrera and Francoeur, both cheap pickups who have outperformed expectations.
Seems to me that what Moore was doing in years prior was biding his time trying to keep some sort of team on the field while he was building what he wanted to put together. His stated goal all along was to field a team consisting mostly of home grown players. The Royals didn’t have a farm system to speak of for him to build around when he took over, no trade chips to flip for kids, and as far as free agents go they weren’t exactly a popular destination for quality players.
Well, SOMEbody has to be the worst, right?
Wow, it’s almost hard to believe he was that far above replacement during those years… 5.6 WAR! HoF prob.
That’s mostly because WAR is based on FIP and, as the article shows, FIP almost certainly overestimates Davies’ abilities.
Is it FIPdif or something that is the ratio between ERA and FIP? I think over an equivalent three season sample the sum of that stat should be renamed DAVIES. It’s not bad luck when you suck that bad over that long of a period of time.
Excellent candidate to pitch in Japan.
so in other words: all of these stupid stats you use are complete garbage.
Isn’t FIP a predictive stat? Then why would it be used in WAR, which is supposed to show how a player performed in the past?
If it’s in the past you just call it an estimate. Still a predictor.
Where is the evidence that ERA itself, over the longer period of time, becomes more predictive than ERA estimators? Not questioning its existence, would just like to know how extensive that evidence is, what it suggests about how long a period of time is needed, and so on.
Richie, this was discovered when Matt and I originally tested SIERA, and further confirmed in Colin Wyers’ attempt to debunk estimators. When you look at a larger sample, the noise associated with the smaller samples is reduced, not eliminated, but reduced, and ERA becomes more accurate. The issue, though, is that people don’t really use estimators for larger samples. The majority of people use them to help with fantasy roster decisions. I can’t access the links at work right now, but hopefully that points you in the right direction.
So ERA itself just gradually overtakes its estimators over time in terms of predictive power? No particular spike after ‘XXX’ amount of innings or anything?
Oh, and thank you!
But is he THE worst starting pitcher of the past 7 seasons or so, or just close to being the worst? I thought you kind of hedged on that. I would like a definitive opinion. There have been so many terrible pitchers to choose from: Adam Eaton comes to mind. Is Ramon Ortiz still pitching somewhere? I know Jo-Jo Reyes is, and though he probably doesn’t have the IP yet the historical losing streak has to be a point in his favour. Surely Davies has some serious competition we’d have to consider before we laud him worst of the bunch.
When are we going to get a pitcher WAR based on tERA or a similar metric that takes batted ball type into account?
Vogelsong was pretty bad for a long time, and look at this year. This guy sucks, but who knows if he can put it together?
I can’t believe I’m typing this: I swore off protecting Kyle Davies after he put up miserable start after miserable start this season (only to throw a gem and keep us on the hook). In 2008, he took a step up in every category. Even the skeptics at Royals Review were wondering if he’d turned a corner. In 2009, he slipped back, but the K’s were up a bit, and it was simply the walks that were the problem (we and Dayton Moore told ourselves). Then in 2010, he looked OK again (partly b/c of the run environment, but we didn’t all really realize that yet) – not great, but maybe 2008 wasn’t a mirage after all. So when Moore gave Davies the 2011 season to prove once again that he was not worthy of a starting role, we didn’t really mind.
As mentioned above, the ERA you’re talking about happened with the likes of Jose Guillen, Yuni, Billy Butler, and in general one of the consistently and repeatedly worst defenses during his time in KC. It seems like the argument that ERA is a better predictor needs to consider this – maybe you did, but didn’t mention it.
“Crowded rotation”? If you consider Bruce Chen, Sean O’Sullivan, and Jeff Francis in your opening day rotation “crowded”, then I guess so. Yes, the 3.4 M was risky, but giving Davies one last chance to prove himself wasn’t such a bad move (and with Moore, it was–relatively speaking–a good move).