Evaluating the Royals

The Kansas City Royals have lost 11 consecutive games, and at 3-13, they have the worst record in baseball. Even their most ardent supporters are walking away. Whatever optimism their farm system had generated before the season began has been washed away in a sea of losses, and now, the Royals just look like the same old last-place team they’ve always been.

That’s the story if you just look at wins and losses, anyway. If you look a little deeper and ask why the Royals are currently 3-13, though, the story becomes a lot more interesting.

At the plate, the Royals are averaging just 3.56 runs per game, third worst in the American League, where the average is 4.47. They are just 0.03 runs per game better than the vaunted Mariners offense. So, the offense has been a problem, right? Well, sort of, but not in the way you might think. The Royals’ overall line on the season is .255/.315/.413, good for a .316 wOBA. The average AL team is hitting .253/.320/.412 with a .320 wOBA, so overall, KC has been just slightly below average at the plate in their first 16 games. So, how does a team with an average batting line score nearly a run less per game than expected?

Timing. With the bases empty, the Royals have posted a .333 wOBA, fourth best in the American League. With men on base, that’s fallen to .298 – third worst in the league. With runners in scoring position? .275, ahead of only the Oakland Athletics.

The Royals have been pretty good at starting rallies, and absolutely atrocious at getting anything out of them. If the Royals’ hits had come in a more normal distribution of situations, they’d have scored 68 runs instead of 57. This isn’t just a minor blip – the Royals’ lack of situational hitting has been a major factor in their overall record.

But, just adding 11 runs doesn’t just fix a 3-13 team, right? So, let’s look at their run prevention. They’ve given up 5.06 runs per game, fourth worst in the American League. Most of the optimism that existed about the Royals before the season related to their young hitting, so it’s not a big surprise that the run prevention hasn’t been all that good. However, the pitching hasn’t actually been as bad as you might think.

BB%: 9.8%, 12th in AL (average is 8.3%)
K%: 18.6%, 8th in AL (average is 18.6%)
HR/9: 1.00, 7th in AL (average is 1.09)
FIP: 4.16, 10th in AL (average is 4.10)
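
For anyone who wants to see how walks, strikeouts, and home runs roll up into FIP, here is a minimal Python sketch of the standard formula. The counting stats in the example are made up for illustration and are not the Royals’ actual totals, and the 3.10 constant is an assumption: the real constant is recalculated for each league and season so that league FIP matches league ERA.

```python
def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Fielding Independent Pitching.

    hr, bb, hbp, k: home runs, walks, hit batters, strikeouts allowed.
    ip: innings pitched as a true decimal (4.2 in a box score means 4 2/3).
    constant: assumed league constant; it changes from season to season.
    """
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Hypothetical staff totals, chosen only to show the mechanics.
baseline = fip(hr=16, bb=60, hbp=5, k=115, ip=144.0)

# Same staff with ten extra walks: FIP goes up, everything else held equal.
more_walks = fip(hr=16, bb=70, hbp=5, k=115, ip=144.0)

print(f"baseline FIP:   {baseline:.2f}")
print(f"with +10 walks: {more_walks:.2f}")
```

Because a walk carries a weight of 3 and a strikeout a weight of -2, a staff that is average at strikeouts and a bit better than average on home runs can still land on the wrong side of average once the walk rate climbs.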

They’ve issued way too many walks (I’m looking at you, Jonathan Sanchez), but they’ve been exactly average in getting strikeouts and slightly above average at keeping the ball in the yard. The walks help push their FIP slightly worse than average, but they’re a lot closer to the middle of the pack than they are to the teams whose pitching staffs have truly been costing them games. So, if they have a 4.16 FIP, why are they giving up 5.06 runs per game?

Let’s go back to the splits.

Bases Empty: 3.63 FIP, 5th in AL
Men On Base: 4.77 FIP, 12th in AL
RISP: 4.00 FIP, 9th in AL

It’s basically the same story as on offense. With the bases empty, their pitchers have been pretty darn good. Put men on base, and they’ve been a lot worse, even above and beyond the normal difference that pitchers generally face pitching out of the stretch. But FIP doesn’t even tell the whole story here. Back to the splits we go, but this time, we’ll look at BABIP instead of FIP:

Bases Empty: .295 (t-10th in AL)
Men On Base: .317 (11th in AL)
RISP: .369 (14th in AL)

The Royals have given up more hits on balls in play than an average team in each situation, but they’ve been far and away the worst in baseball in BABIP with RISP – the Red Sox are the next worst AL team, and even they are about 30 points better than KC, coming in at .337. The AL average in that situation is .299.

Put together the situational FIPs and BABIPs, and opposing hitters have posted just a .705 OPS against the Royals pitching staff when the bases are empty, but an .841 OPS with men on base and an .840 OPS with runners in scoring position. As with the hitting, the distribution of when the hits have been allowed has been a huge factor in the team’s lack of production.

As it stands, the Royals have scored 57 runs and allowed 81. With a more normal distribution on timing of hits, though, it’d be pretty close to 70 for both RS and RA. And, as everyone who has been beaten to death with Pythagorean expectation over the past 20 years knows, a team’s runs scored and runs allowed are a better evaluator of how a team has played than simple wins and losses. Pythag suggests that the 57/81 split in their RS/RA means that the Royals have played more like a .313 team than a .188 team, and their underlying components of run scoring and run prevention suggest that they’ve played more like a .500 team than a .313 team.
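
The Pythagorean step can be sketched in a few lines of Python. This is an illustration under assumptions rather than the exact calculation behind the numbers above: the post doesn’t say which exponent or variant of the formula was used, so the function defaults to the classic exponent of 2, and the function name is made up for this example. With that exponent the 57/81 split comes out around .331 rather than exactly .313, but the ordering is the same: well above the actual .188, well below the .500 implied by roughly 70 runs scored and 70 allowed.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and runs allowed.

    exponent=2.0 is the classic form; other exponents (1.83 is a common
    choice) are used elsewhere. Which one applies here is an assumption.
    """
    if runs_scored < 0 or runs_allowed < 0:
        raise ValueError("run totals cannot be negative")
    if runs_scored == 0 and runs_allowed == 0:
        raise ValueError("need at least one run scored or allowed")
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


actual = 3 / 16                               # the 3-13 record
observed = pythagorean_win_pct(57, 81)        # runs actually scored and allowed
redistributed = pythagorean_win_pct(70, 70)   # "pretty close to 70 for both"

print(f"actual record:       {actual:.3f}")
print(f"from 57 RS / 81 RA:  {observed:.3f}")
print(f"from 70 RS / 70 RA:  {redistributed:.3f}")
```

Passing `exponent=1.83` moves the 57/81 estimate a little but doesn’t change the conclusion, which is the point: the gap between the record and the run totals is large no matter which flavor of the formula you prefer.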

So, how should we evaluate the Royals after 16 games? The results have been atrocious, but simply changing the unsustainable distribution of their performance would drastically alter their record. There’s a reason we all look at context-neutral statistics when trying to evaluate the worth of an individual or a team, because context-specific performance contains wild fluctuations that generally hold no predictive value.

The fact that the Royals have been – by far – the worst team in baseball in the clutch during the first couple of weeks of the season doesn’t tell us anything about how they’re going to perform in those situations going forward. Looking beyond the standings reveals that the Royals have actually performed decently at times, and perhaps all this panic in Kansas City is an overreaction to events that just won’t continue. The standings tell us what has happened, but they generally don’t tell us anything about why those things have happened.

That’s why, when Sports Illustrated approached us about helping them come up with a different approach to the traditional weekly Power Rankings column, we agreed to team up with them and offer an alternative to the “re-write the standings in text form” approach. SI still wanted the column to reflect a snapshot of a team’s season performance to date, so just like with any other approach, small sample size is going to offer up some early season weirdness, such as the Royals coming in at 7th in the rankings released yesterday. No one thinks the Royals are really the seventh best team in baseball, including us. In fact, if the rankings were re-done to reflect last night’s game as well, they would have already fallen to 9th, and their .525 WAR Winning % puts them closer to 17th than to 8th. The ordinal ranking looks bizarre, of course, but what Team WAR is saying is that the Royals have played like a team that should have won roughly as many games as they lost.

That shouldn’t be all that controversial of a statement, honestly. The Royals are 3-13 instead of 8-8 because of a lack of performance in the clutch. Pretty much every study ever done on the issue shows that clutch performance has no predictive value, and when evaluating a player or a team’s overall abilities, their clutch performance should almost always just be ignored. Because WAR is a context neutral metric, it does exactly that, and so it will return results that do not line up perfectly with a team’s win-loss record.

This is a feature, not a bug. By rating teams simply based on how they hit, pitch, and field, the SI Power Rankings are going to highlight situations where a team’s win-loss record might not be telling the whole story. If you prefer a set of power rankings that simply reflects the current standings, there are a ton of places offering just such a write-up. With these rankings, you are very likely to see some rankings that look strange, but they’ll give you the chance to see something you might not have seen otherwise. And to me, that’s the entire point of advanced metrics – to shine a light on the non-obvious story, and help us understand the why behind the result.




Dave is the Managing Editor of FanGraphs.


pssguy
Guest
pssguy
4 years 4 months ago

You pretty much say no value should be placed on the results so why hitch yourself to the SI waggon? First impressions and all that

William Strunk Jr.
Guest
William Strunk Jr.
4 years 4 months ago

Cluster Luck plays havoc at small samples on uber-derivative functions like W-L and ERA, so it’s useful to examine the component parts of such functions to ferret out the Cluster Luck or other anomalies (which you would expect to smooth out with more inputs).

For that reason, most GMs probably ignore WAR, RC/G, ERA, W-L, ERA+ etc, because the GM is not ranking an entire player pool, the GM’s first filters (need, budget, health, availability) shrink his player pool to a tiny group of players. Ranking a tiny group is subject to Cluster Luck and small sample anomalies, so the GM focuses on the base components: bb/k, bb%, K%, etc and scouting. Sabermetrics provides quantum metrics for the front office to supplement its scouts, and broad derivative models for the historians, fans and fantasy players to sift large helpings of disparate data.

I think.

steven
Guest
steven
4 years 4 months ago

as far as pitching with the bases empty versus pitching with runners on base, aren’t some pitchers stronger in the windup versus the stretch? if the royals have a few of those, this could explain the disparity- to some degree.

Baltar
Guest
Baltar
4 years 4 months ago

Excellent post.

Colin Wyers
Guest
4 years 4 months ago

I’m sorry, but what on earth are you talking about? According to wRC+ and FIP, as taken from Fangraphs, the Royals should have scored 4.21 RPG and allowed 4.54 RPG. Hitting better in the clutch would have improved their runs scored, but they don’t get to an above-average rank without their +8 team UZR.

geo
Guest
geo
4 years 4 months ago

But they do have that +8 team UZR. You can’t leave that out.

Colin Wyers
Guest
4 years 4 months ago

Dave Cameron did, when he explained how the Royals have such a high team WAR. Instead he talked about clutch hitting.

Colin Wyers
Guest
4 years 4 months ago

In other words, if the Royals had hit to their wRC+ and pitched to their FIP, they’d have a Pythagorean win expectation of .465. Or, sure, seventh best team in baseball.

James Gentile
Member
4 years 4 months ago

When Colin Wyers yells at Dave Cameron it’s like hearing Mom and Dad argue all over again.

sam
Guest
sam
4 years 4 months ago

#6org

Billy
Guest
Billy
4 years 4 months ago

This is just the worst. It’s sad, really.

chuckb
Member
chuckb
4 years 4 months ago

You mean your comment? Because I was thinking exactly the same thing.

jim
Guest
jim
4 years 4 months ago

oh, good one

sprot
Guest
sprot
4 years 4 months ago

smh

John
Guest
John
4 years 4 months ago

#7org

Mr. Deez
Guest
Mr. Deez
4 years 4 months ago

So if the Royals got their “situational BABIP” up they’d be a .500 team instead of 3-13 so naturally that means they’re #7 in all of MLB.

Heisenberg
Guest
Heisenberg
4 years 4 months ago

“And to me, that’s the entire point of advanced metrics – to shine a light on the non-obvious story, and help us understand the why behind the result.”

Except when you subscribe to UZR, which is not a solid metric.

CJ
Guest
CJ
4 years 4 months ago

Which fielding system do you prefer and why?

UZR is a Joke
Guest
UZR is a Joke
4 years 4 months ago

They all suck pretty hard. In a few years when FieldFx is in full force, everyone will be talking about UZR being the RBI of fielding statistics. There is opportunity value in UZR, which is why a single season’s worth of data is not accurate.

asdfasdf
Guest
asdfasdf
4 years 4 months ago

Nerd.

LIz Phair
Guest
LIz Phair
4 years 4 months ago

I read a lot of advanced baseball articles and you my friend are the best around.

regfairfield
Guest
regfairfield
4 years 4 months ago

You had the opportunity to go “yeah, this is a blip caused by small sample size UZR but in the long run this is a decent method”, but instead you decided to double down on the results and imply that you’re smarter than the people who disagree with you.

sprot
Guest
sprot
4 years 4 months ago

YEP

jim
Guest
jim
4 years 4 months ago

you seem surprised, are you not familiar with dave cameron’s work?

tom
Guest
tom
4 years 4 months ago

Just read the last line of the post:

“And to me, that’s the entire point of advanced metrics – to shine a light on the non-obvious story, and help us understand the why behind the result.”

Here I was thinking the point of advanced stats was to better assess and objectively measure performance (and in some cases project it). When the primary purpose (“the entire point”…. really?) is to be contrarian or ahead of the curve or be viewed as smarter than the layman, it leads to the advanced stats being misused.

Yes it is boring to read an obvious conclusion based on advanced statistics, but the over-riding need to be “non-obvious” has led to some really poor application of advanced stats and an increasing trend of bad analysis and misuse of statistics

Here’s a crazy strawman…. maybe the variation is in the #’s with no one on base and the men on base #’s are more representative. Given the absurdly small samples we are dealing with there is no way to know. Why is the assumption that the men on base #’s are the outlier? Well because that’s what fits the story being told…. a more fundamental approach to the analysis (including regression) would sort some of this out.

Matthias
Member
Member
4 years 4 months ago

“Why is the assumption that the men on base #’s are the outlier?”

Because there is a lot of data to suggest that teams performing very differently with men on versus with the bases empty tend to regress toward a more common level of performance somewhere in between.

Tom Tango summarized some results here:
http://www.tangotiger.net/clutch.html

The premise is that clutch exists, but a lot of data is required to show a little clutch. The conclusion is that much of the spread in performance in various situations is due more to “luck”, or randomness, than to a true difference in skill ability.

So basically we have to adjust our null hypothesis to the overall expectation–that teams will regress toward past expectations–and wait until there is enough evidence to reject that notion.

Matthias
Member
Member
4 years 4 months ago

I should clarify, Tango’s research was in reference to individual players. However the concept still applies. Larger samples are needed before we should accept that the Royals can’t hit or pitch “in the clutch.”

tom
Guest
tom
4 years 4 months ago

My point is not whether or not there is clutch hitting…. it’s the assumption that the no one on base # is the “real” # and the Royals are just getting unlucky and that of course is the # things will regress to.

Given the size of the sample…. where is the “in between”?

The tone of the article suggests it’s obviously closer to the no one on base # and it’s a luck thing – given the sample size I don’t see how that can be assumed.

chuckb
Member
chuckb
4 years 4 months ago

Interesting stuff, Dave. Thanks for posting.

It goes without saying that we shouldn’t draw too many conclusions from a sample consisting of 10% of the baseball season.

RichW
Member
RichW
4 years 4 months ago

Except that is what SI suggests. It seems ridiculous to me.

SI say

By utilizing WAR, we can better identify which teams are actually playing well and will likely sustain their success going forward.

Blob
Guest
Blob
4 years 4 months ago

This is where the numbers can only take you so far.

Anyone who has watched the Royals all year knows that they are willingly and serially sacrificing runs on the basepaths. Why is there no mention of that here?

Hitting stats tell you they should be scoring runs. That they are not is far from just the lack of “clutch” and luck that Cameron suggests.

Ignorant Tool
Member
Ignorant Tool
4 years 4 months ago

So should we expect there to be a win streak/ course correction (trending upward) in the near future for the Royals or is their wins/losses based solely on clutch performance – which has no predictive value?

M W
Guest
M W
4 years 4 months ago

Not sure how anyone thought the Royals would be a good club this year when looking at their opening day rotation.

hawkinscm
Member
hawkinscm
4 years 4 months ago

Everybody in the rotation has been, overall, about the same as expected or better. Chen has been good. Hochevar has been mostly good. Sanchez has been Sanchez. Mendoza has been Mendoza (PCL ERA leader!!), and Duffy has been awesome. If I had known that would happen before the season began, I would have put them down for 82 wins.

Dr.Rockso
Guest
Dr.Rockso
4 years 4 months ago

This is #6org all over again.

Rob McMillin
Guest
4 years 4 months ago

“As it stands, the Royals have scored 57 runs and allowed 81. With a more normal distribution on timing of hits, though, it’d be pretty close to 70 for both RS and RA. And, as everyone who has been beaten to death with Pythagorean expectation over the past 20 years knows, a team’s runs scored and runs allowed are a better evaluator of how a team has played than simple wins and losses. Pythag suggests that the 57/81 split in their RS/RA means that the Royals have played more like a .313 team than a .188 team, and their underlying components of run scoring and run prevention suggest that they’ve played more like a .500 team than a .313 team.”

Isn’t it equally likely that the model is broken? While it’s been some time since the Angels outperformed their pythagorean totals, they managed to do so consistently for several years before coming back to earth. Maybe what the Royals are telling us is that these models don’t work well for small sample sizes, and cases where the hits don’t come in key situations.

Wobatus
Guest
Wobatus
4 years 4 months ago

Hope Rany reads this before he finds a high ledge.

Anders
Guest
4 years 4 months ago

It seems that all you are doing is replicating the traditional rankings and text power rankings, except your input is not W-L record but WAR. I guess this could be slightly more instructive, but the error bars on WAR have to be high enough that it’s going to take half a season before what you are producing is meaningful in any real way. You acknowledge this pretty explicitly when you say that the Royals aren’t actually the seventh best team in baseball. At least the traditional power rankings usually have common sense tilting the scales.

If you are trying to tell people how good each team in the majors is, why not just regress heavily and weigh ZiPS projections for the first month or two of the season until there’s a decent sample size of games accumulated; it wouldn’t explicitly reflect how good the teams have been during the season, but it would much more accurately reflect how good the teams in baseball are, which presumably is the point of a power ranking.

Roombaugh
Guest
Roombaugh
4 years 4 months ago

The commenter who discussed baserunning is spot on. The Royals have had no problem creating run scoring opportunities, but have gotten thrown out with kamikaze baserunning all year. Those average with runners in scoring position numbers also don’t count the number of times that they have sacrificed with a runner on second to get him to third (this happened TWICE last night). It doesn’t matter if the team is built to play optimally based on stats if the manager refuses to follow sabermetric strategy.

CJ
Guest
CJ
4 years 4 months ago

Forget “sabermetric” strategy. Talk to any guy in the street who watches baseball and he’d say that they’re throwing away runs.

I fully expect insane baserunning to regress to more normal levels.

Cosmicsniper
Guest
Cosmicsniper
4 years 4 months ago

I know you like numbers, Dave, but to me it’s not so difficult to imagine a young team putting extra pressure on itself to perform. Double-digit losing streaks and players in slumps, especially young players, have their way of putting extra pressure on athletes to perform. It’s magnified when pitchers put runners on bases (“here we go again”) or when a hitter tries to do too much at the plate because he knows his team needs it. Even a guy like Pujols isn’t immune to this.

It’s a small sample. The numbers will be more evenly distributed after they win a few. But it’s also not rocket science. Show me any up and coming young team with an 11 game losing streak and I’d bet you’ll see the same uneven distributions…unless the team was projected to really suck.

It’s not a tabula rasa for the hitters…if it were, you’d see much more consistency…there’d be no reason to focus on anything but the mechanics and approach to hitting. Those things get all screwed up in slumps. The fact that it’s reflected in the numbers isn’t a surprise.

Nice work though!

mister_rob
Guest
mister_rob
4 years 4 months ago

Using small sample WAR (which uses ridiculously bad small sample fielding as a component) to make any kind of ranking list is assinine

According to this website, the 2 most valuable defensive players by far thus far have been Alex Gordon and Alfonso Soriano. Gordon has been so dramatic in his fielding that despite a .256 wOBA he is positive in the WAR column. And despite Soriano’s .176 wOBA and .420 OPS he does not have a negative WAR.

It’s ridiculous, plain and simple

clint hulsey
Guest
4 years 4 months ago

Good article as usual Dave, but it shouldn’t surprise people that 3-13 is fluky. One doesn’t have to be a sabermetrician, a scout, or a math major to know that the Royals aren’t 3-13 bad. Your arguments were good though

joele
Guest
joele
4 years 4 months ago

agree. these are good arguments for showing that 3-13 isn’t necessarily deserved, but to me wins and losses will always be the best measurement of how good a team is.

hoser
Member
hoser
4 years 4 months ago

I am a big fan of both Rany and Dave’s writing and their thinking. This seems like an interesting way to promote the new “Fangraphs/SI Power Ranking”. However, I see a problem.

If the Royals are just throwing away outs when they have men on base or in scoring position, wouldn’t that skew the results?

Rany’s article seems to point out worse than questionable base running and sacrifice bunting choices. These problems can only occur with runners on.

It seems that with bad base stealing (worse than every other team except the Dodgers) and inefficient sacrifice bunting, the departure from a “more normal distribution on timing of hits” is not entirely random. To the extent that the disparity is due to bad manager/player decisions, wouldn’t its continuation be affected by whether the decision making process changes?
Discounting the disparity in scoring as a lack of performance in the clutch seems to ignore the baserunning and bunting practices, which only apply when runners are on base.

tom
Guest
tom
4 years 4 months ago

The “bug” in these rankings is the failure to regress data. Simply tagging things with “sample size” and saying take with a grain of salt and then proceeding to use the #’s anyway is frankly…. ridiculous and a complete abuse and misuse of statistics.

In addition to UZR, there can be issues with BSR, FIP (HR/FB outliers, early season park effects, opponent quality over such a small sample), and even the hitting stats (BABIP, parks again, etc…)

This is why there is this thing called “regression”… the point of it is to account for sample size, not simply to have it both ways and say “the data isn’t clean, but hey let’s analyze it anyway as if it is”.

Seems like having this published in SI with such an obvious flaw in methodology is not shining a good light on the SABR community.

CJ
Guest
CJ
4 years 4 months ago

I’m not entirely sure what you mean. If you regress to the mean, you maintain the ordinal ranking of the teams.

And saying “sample size” and posting the data anyway isn’t an abuse of statistics. It’s a sample. You’re saying “here’s a sample of data of the 2012 KC Royals”.

Ranking these samples in order of some “performance” also isn’t terrible. Pretty meaningless, though.

Antonio Bananas
Guest
Antonio Bananas
4 years 4 months ago

I think common sense would tell you that any team with a win % below .370 or above .648 is probably flukey due to small sample size.

How much of every team’s current win % is influenced by the quality of teams they have played? Dodgers have 7 games against the Padres, 3 against the Pirates, 3 against the Astros, 3 against the Brewers, and 1 against Atlanta. That’s why they have such a good record. Not BABIP luck (although that’s probably there), not pitchers out-performing their peripherals (although that’s probably true too). Teams just haven’t all played similar competition. KC’s schedule has been fairly brutal. Of their 16 games, 13 of them have been against teams projected to be above .500 (Blue Jays around .500 in the AL East counts). Looking ahead, May looks to be pretty rough too.

I love B-R’s SRS. It uses an approach similar to what you’d use in college football: it measures the quality of the teams you’ve faced and the run differential. I think in a small sample size, this is something that could be useful.

RichW
Member
RichW
4 years 4 months ago

On the SI site they indicate nothing about sample size or regression. Is the casual Baseball fan supposed to take away from the rankings that Kansas city is a likely playoff team? If not then what is the purpose?

sportsczar
Member
sportsczar
4 years 4 months ago

If you’re going to be so brash as to use “asinine” in a comment, please take the extra one second to spell it correctly. Otherwise, you look asinine yourself. I mostly agree with your statement; I mock your presentation of it.

I am a huge fan of Dave’s, but the argument for ranking the Royals seven is unconvincing, apparently to the casual fan and learned statistician alike. In all fairness, Dave qualified his post. “No one thinks the Royals are really the seventh best team in baseball, including us. In fact, if the rankings were re-done to reflect last night’s game as well, they would have already fallen to 9th, and their .525 WAR Winning % puts them closer to 17th than to 8th.”

The worst record in baseball might actually belong to the 17th-best team? Now that I can buy. Maybe.

Metsox
Member
Member
Metsox
4 years 4 months ago

The Royals have been caught stealing 8 times already. I also feel like I can remember them getting thrown out in a few other cases. Probably wouldn’t make a huge difference though….

TecJug
Guest
TecJug
4 years 4 months ago

“But, just adding 11 runs doesn’t just fix a 3-13 team, right?”

If six of those 13 losses are by one run, I’d say adding 11 runs in the right places could do more than fix a 3-13 team.

Tom
Guest
Tom
4 years 4 months ago

I don’t think Sanchez saw you staring at him. That was an easily winnable game for KC last night…if Sanchez hadn’t walked 7 guys in 4.2 IP. The Tribe has been the opposite with the timely hitting so far, amazing KC didn’t get blown out.

objectiveobserver
Guest
objectiveobserver
4 years 4 months ago

Gotta defend the orthodoxy at all costs, eh Dave?

I think it was Emerson who said “A foolish consistency is the hobgoblin of little minds.”
