The Orioles and Accepting Random Variation

Two years ago, the Baltimore Orioles gave a middle finger to the concept of regression to the mean. For six months, they won game after game by a single run, relying on a bullpen that posted the highest WPA in history to make the postseason despite the skepticism of sabermetric writers everywhere, including here on FanGraphs. The story of their season was essentially told in two numbers: 93-69 record, +7 run differential. To O’s fans, it was a fantastic season, but to writers like those found here, it was essentially a fluke.

The 2012 Orioles were a decent team that managed to distribute their runs in about the most effective manner possible, but there’s just no evidence to suggest that this is a repeatable skill over significant periods of time. And sure enough, after going 29-9 in one-run contests in 2012, the 2013 Orioles went 20-31 in games decided by a lone run. For one year, the Orioles defied the odds, but as we’d expect, they couldn’t get that to carry over into the next season, and they won eight fewer games despite playing at basically the same level as the previous year.

But now, it’s 2014, and the Orioles are doing it again, though not quite to the same degree. Their 24-17 record in one-run games isn’t quite so crazy, but they are outperforming what context-neutral models would suggest based on their overall performance to date. As Jeff noted, the Orioles are #2 in Clutch performance this year, winning five more games than their underlying statistics would have suggested. And once again, their bullpen leads the Majors in WPA, though it’s not quite the historic performance of two years ago.

And while Orioles fans may have been able to accept random variation as the explanation for 2012, the fact that they’re doing it again just two years later leads to suspicion that perhaps the Orioles, or maybe just Buck Showalter, have figured out how to game the system. Here are a few comments I received yesterday, both in my chat here and on Twitter.

Comment From baltic fox
Re: bullpen talent. Maybe most guys can’t predict bullpen talent in advance, but Buck Showalter apparently can. Having a lot of contacts in baseball and lots of experience at evaluating players isn’t something that can be measured with metrics.

Comment From King Flops
Is there any amount of differentiation from predictive models that would make you question the models over simply reciting, “luck/coinflips/randomness” like a SABR doll whose string was just pulled

LossOfConsortium says
Std deviation is obviously going to occur, but when you’re to the point where the O’s have outperformed their projections 3 years in a row (significantly), don’t you maybe have to think there is a potential flaw? I love game stats, but maybe there are still some things about the game that numbers don’t solve. I understand that is Sacrilege here, but still. Perhaps a fantastic back end of the bullpen and #1 defense instill confidence in SPs who then outperform their projections? Who knows. But being so narrow minded to definitively find std deviation…?

@OsUncensored
If you are lucky for a long period of time doesn’t that mean it’s no longer luck. Doesn’t that mean there is talent or skill involved?

The sentiment expressed in those last two comments is not at all uncommon, and is actually a perfectly rational statement. At some point, when reality diverges from a model’s expectations for long enough, it is entirely correct to question whether the model works. So, let’s actually dive into the data, and look at whether the Orioles are actually evidence that run estimators are missing something.

While we only have the BaseRuns Standings page here on FanGraphs for 2014, David Appelman was kind enough to send me the data for all years back to 2002. To see how unusual this Orioles run is, I looked at every rolling three-year window for every team in baseball from 2004 through 2014. With 30 teams and 11 windows apiece, this gives us a total of 330 data points, and should allow us to answer some questions about the relationship between a team’s three-year BaseRuns numbers and their corresponding win-loss record.

With those 330 data points, I looked at the cumulative difference in winning percentage between a team’s actual record and their BaseRuns expected record. Between 2012, 2013, and 2014, the Orioles have beaten their BaseRuns expected record by a combined .129 winning percentage, or .043 per season. Over 162 games, that’s a difference of seven wins, and again, this is over three seasons. Of the 330 three-year windows we’re looking at, that ranks as the ninth largest difference, confirming what we already knew; what the Orioles are doing is unusual.
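As a quick check on that arithmetic, the per-season gap and its equivalent in wins can be worked out in a few lines of Python. The inputs are the figures quoted in the paragraph above; the variable names are illustrative and not part of any FanGraphs data set.

```python
# Figures quoted in the text; the variable names are illustrative.
combined_gap = 0.129       # three-year sum of (actual WP - BaseRuns WP)
seasons = 3
games_per_season = 162

gap_per_season = combined_gap / seasons              # points of winning percentage
wins_per_season = gap_per_season * games_per_season  # the same gap expressed in wins

print(f"{gap_per_season:.3f} per season, about {wins_per_season:.1f} wins per 162 games")
```

The same conversion (winning percentage gap times 162) applies to any other gap quoted in this post.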

Team      Season   WP      bsrWP   Difference   3-Year Total
Angels    2009     0.599   0.530   0.069        0.234
Angels    2010     0.494   0.445   0.049        0.222
Astros    2010     0.469   0.403   0.066        0.182
Angels    2008     0.617   0.513   0.104        0.162
Astros    2009     0.457   0.415   0.042        0.140
Angels    2011     0.531   0.515   0.016        0.134
Twins     2008     0.540   0.490   0.050        0.133
Twins     2010     0.580   0.530   0.050        0.130
Orioles   2014     0.576   0.532   0.044        0.129
Yankees   2014     0.517   0.479   0.038        0.126

As you’ll note from that table, there’s a lot of overlap, which makes sense, because one outlier season can be counted in multiple rolling time frames. For instance, the 2008 Angels (who beat their expected record by 104 points, easily the most of any team in the sample) are included in three of the top four data points in that table. The 2010 Astros were the ninth-highest single-season overachiever, but that three-year window makes the list because it also includes the 2008 Astros in the rolling total, and the 2008 Astros had the fifth-highest single-season gap between actual record and expected record.
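The overlap is easy to see with a small sketch of how a rolling three-year total is built. The single-season values below are the Angels rows from the Difference column of the table; the helper name `rolling_three_year` is a hypothetical one chosen for illustration, not the method used to build the original data.

```python
# Single-season "Difference" values for the Angels, copied from the table.
angels_diff = {2008: 0.104, 2009: 0.069, 2010: 0.049, 2011: 0.016}

def rolling_three_year(diffs, end_season):
    """Sum the single-season gaps for the three seasons ending in end_season."""
    return sum(diffs[year] for year in range(end_season - 2, end_season + 1))

# The 2008 season feeds the window ending in 2010; the one ending in 2011 drops it.
for end in (2010, 2011):
    print(end, round(rolling_three_year(angels_diff, end), 3))
```

The two totals match the 0.222 and 0.134 listed in the table, and the drop between them is almost entirely the 2008 season leaving the window.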

But this is just what Orioles fans are suggesting; there are examples of teams who have beaten BaseRuns not just once, but followed it up by doing it over several years. The Angels, in fact, beat their BaseRuns expected record five years in a row, averaging 60 points of winning percentage per season over those years. That’s 10 wins per year over what BaseRuns expected, for five consecutive seasons. Clearly, we have to acknowledge that it is possible to consistently win more games than the model suggests, at least over a five year stretch. It’s happened, and not all that long ago.

But here’s the thing; the existence of an outlier does not prove that a model is broken. In fact, the existence of the right amount of outliers is actually evidence that the model works really well. The question isn’t whether we can find outliers in the data; the question is whether there are more outliers than we’d expect given a normal distribution.

The normal distribution essentially states that, in a sample of data with a given mean, the results will be distributed around that mean in a way that isn’t biased one direction or the other. Most of the results will be closer towards the mean, with fewer and fewer examples as we get further away from that average, and roughly an equal proportion on both sides. The normal distribution is often called a bell curve, because, well, it takes the shape of a bell.

Statistically, the normal distribution has a rule that suggests that 68% of the results will fall within one standard deviation of the mean, 95% will fall within two standard deviations, and 99.7% will fall within three standard deviations. In order to see whether the presence of teams like the Angels, Astros, and Orioles proves that BaseRuns is missing something, we can measure just how many standard deviations from the mean they have been over these three-year windows, and what the overall distribution of all 330 data points is.
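For readers who want to try this on their own numbers, here is a minimal sketch of the measurement described above. The `sample` list is made-up data, not the real 330 windows, and the function names are hypothetical. It uses the population standard deviation (`pstdev`), which is an assumption; the original analysis may have used the sample version.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Express each value as a number of standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]

def share_within(values, k):
    """Fraction of values that sit within k standard deviations of the mean."""
    zs = z_scores(values)
    return sum(abs(z) <= k for z in zs) / len(zs)

# Hypothetical three-year gaps, NOT the real FanGraphs data.
sample = [-0.09, -0.05, -0.03, -0.01, 0.0, 0.01, 0.02, 0.04, 0.06, 0.13]

for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(sample, k):.0%}")
```

With the real data in place of `sample`, the three printed shares are the numbers to compare against the normal distribution's rule.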

Let’s start with the chart of all 330 data points.

[Figure: 3YrStDev, a histogram of all 330 three-year data points]

It’s not perfectly distributed, but that looks very close to a normal distribution. If you prefer to see it in a curve rather than a histogram, here’s the same data, just presented with the drawn line.

[Figure: BellCurve, the same data drawn as a curve]

That’s a bell curve, with a very slight skew to the left, as more teams have underachieved than overachieved over the sample we’re looking at. If we had 3,300 data points instead of 330, it’s likely that skew would go away.

But, while the chart certainly makes it look like the data is distributed normally, do the numbers actually match up to the 68-95-99.7 rule? Well, here’s a table so you can see for yourself.

1 SD     2 SD     3 SD
67.9%    96.1%    99.4%

Remember, the rule targets are 68.3%, 95.4%, and 99.7%. Yeah, I’d say it’s safe to call this a normal distribution. In other words, there are about as many outliers as we’d expect to find given this many data points. The existence of the Angels, Astros, and Orioles isn’t evidence that BaseRuns is broken; it’s evidence that the model follows the normal distribution, and the fact that there aren’t more examples like them suggests that the model works pretty darn well.
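The rule's targets can be reproduced from the normal distribution itself rather than taken on faith. The sketch below uses the standard identity that the share within k standard deviations is erf(k / sqrt(2)), and compares it to the observed shares from the table above. The function name is illustrative, and the outlier counts are a rough comparison that treats the 330 windows as if they were independent, which overlapping windows are not.

```python
import math

def normal_share_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

observed = {1: 0.679, 2: 0.961, 3: 0.994}   # observed shares from the table
n_windows = 330

for k, obs in observed.items():
    expected = normal_share_within(k)
    print(f"{k} SD: expected {expected:.1%}, observed {obs:.1%}, "
          f"expected outside: {n_windows * (1 - expected):.1f}, "
          f"observed outside: {n_windows * (1 - obs):.1f}")
```

At two standard deviations, for instance, a normal distribution would put about 15 of 330 windows outside the band, against roughly 13 in the observed data.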

For the record, the Orioles’ 2012-2014 record is 2.02 standard deviations from the mean. In other words, under a normal distribution we’d expect a gap this large in either direction about once in every 23 data points, and an overachievement this large about once in every 46, so in a sample of 330 windows we shouldn’t be too shocked that the Yankees have beaten their BaseRuns expected record by a nearly identical amount over the last three years. What the Orioles have done isn’t actually all that crazy, and it doesn’t come anywhere near the level of suggesting that they have figured out how to exploit a flaw in the model that can be sustained over the long term.
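A rough way to put a number on how often a 2.02 standard deviation result should turn up is to compute the tail area of the normal distribution directly. This is a sketch under the assumption that the three-year gaps really are normal and independent (overlapping windows are not fully independent, so treat the expected count as a ballpark). The function name is illustrative.

```python
import math

def upper_tail(z):
    """P(Z >= z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))

z = 2.02                      # the Orioles' 2012-2014 distance from the mean
one_sided = upper_tail(z)     # chance of overachieving by at least this much
two_sided = 2 * one_sided     # chance of a gap this large in either direction
n_windows = 330

print(f"one-sided: {one_sided:.3f}, two-sided: {two_sided:.3f}")
print(f"expected windows at least this far above the mean: {n_windows * one_sided:.1f}")
```

The one-sided figure comes out a little above 2%, which across 330 windows works out to around seven teams this far above expectation.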

Now, again the presence of the normal distribution does not mean that BaseRuns is a perfect model. I’m not attempting to assert that the model is beyond reproach. What I will say, however, is that if we’re going to identify flaws in the model, we cannot use the existence of the 2012-2014 Orioles as evidence. It isn’t evidence of that.

And really, the evidence also pushes back against this being a Buck Showalter effect. After all, these aren’t Showalter’s first three years as an MLB manager. From 2003 to 2006, he managed the Texas Rangers; in three of those four years, the Rangers lost more games than expected, and his overall average winning percentage in Texas was 16 points lower than the BaseRuns model. In his first two years in Baltimore, the team outperformed, but just slightly so, averaging 12 points of winning percentage more than the expectation. Of the nine seasons managed by Buck Showalter in the years for which I have BaseRuns data, his average bump in winning percentage amounts to 1 point of winning percentage per year.

Really, if there’s a manager who could have staked a claim to figuring out the way to beat BaseRuns, it was Mike Scioscia. From 2002 to 2011, Scioscia’s Angels beat their BaseRuns expected records by an average of 41 points per year, winning more games than expected in eight of those ten years. The 2007-2009 Angels were 3.7 standard deviations from the mean in terms of performance over expected record, nearly twice as far from the mean as the 2012-2014 Orioles. After a full decade of beating expectations, certainly Scioscia should be the one guy we should expect to do it, right?

Well, the 2012 Angels won fewer games than BaseRuns expected, and so did the 2013 Angels, and so have the 2014 Angels. After beating the model for five straight years, Scioscia is now on a three year losing streak. This doesn’t erase what he’s done previously, but if we’re to explain how Scioscia figured out how to beat BaseRuns, we have to also explain why he forgot how to do it a couple of years ago, and hasn’t remembered since. And those Angels are the most extreme outlier. If they couldn’t keep doing this, there’s no reason to think anyone else can either. And, for the record, the Astros and Twins — the two other franchises who regularly beat their expected record during our sample — are also on similar losing streaks since the end of their runs.

I understand why Orioles fans are frustrated with being told, for the second time in three years, that their team isn’t as good as its record. We don’t like accepting randomness as an answer, and when someone tells us that regression is coming and then it doesn’t come as soon as they said it should, confirmation bias kicks in, allowing us to believe that the prediction was wrong all along. It is difficult for human beings to observe a repeated event over any real length of time and not find a cause for the results.

But this is why we should be skeptical of our abilities as observers rather than of models that actually work really well in a great majority of the cases. In a competition where the spread in talent level is not that large, randomness is going to play a significant role in the outcome. In Major League Baseball, a team can control, to a large degree, how many and what types of baserunners they create and allow, but there’s just not any evidence that converting those baserunners into runs at a higher than expected level is a real skill, or that distributing the runs a team scores or allows in an advantageous way is something that teams can control. As simple as it might sound, the best way to evaluate a team’s performance is by simply counting up the value of the individual plays and mostly ignoring the order in which they occur.

Sometimes, the ball will bounce your way more often than others, and the small spread in talent among teams means that context-specific performance can skew the standings by as many as 17 wins, though +/- 10 wins is more normal for an outlier within a given season. When we see a team that wins 10 more games than BaseRuns suggests, we shouldn’t conclude that BaseRuns is stupid and wrong; we should conclude that yep, that’s baseball.
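To get a feel for how much spread pure chance can produce, here is a small simulation sketch. It is not BaseRuns and uses no FanGraphs data: it assumes a hypothetical team whose true chance of winning every game is fixed at .500, treats games as independent coin flips, and replays a 162-game season many times. The function name, the seed, and the number of replays are all illustrative choices.

```python
import random
from statistics import mean, pstdev

def simulate_season_wins(true_wp, games=162, rng=random):
    """Count wins in one simulated season of independent games."""
    return sum(rng.random() < true_wp for _ in range(games))

rng = random.Random(2014)   # fixed seed so the run is repeatable
wins = [simulate_season_wins(0.500, rng=rng) for _ in range(10_000)]

far_from_81 = sum(abs(w - 81) >= 10 for w in wins) / len(wins)
print(f"average wins: {mean(wins):.1f}")
print(f"standard deviation of wins: {pstdev(wins):.1f}")
print(f"share of seasons 10+ wins away from 81: {far_from_81:.1%}")
```

In theory the standard deviation for this setup is the square root of 162 x 0.5 x 0.5, or about 6.4 wins, so double-digit swings show up regularly even though nothing about the simulated team ever changes. This is only an analogy for the gap between BaseRuns and actual record, not a model of it.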

And it’s part of what makes the game great. If there weren’t any variance, and every team won exactly as many games as expected, the sport would be rather boring. We should celebrate the variance that exists, allowing surprising teams to rise up and reward their fans with exciting and unexpected wins. We just shouldn’t allow those exciting rare wins to make us think that the rules apply to everyone except for our favorite team. Embrace randomness, but embrace it for what it is, and don’t try to turn it into something it isn’t.






Dave is the Managing Editor of FanGraphs.


Aaron (UK)
Member
1 year 9 months ago

But, the standings!

Shorebird Bob
Guest
1 year 9 months ago

Yeah, who cares about what the standings say?

Mike Trout, Mike Carp and AJ Pollock walk into a sushi bar, and the sushitender
Guest
1 year 9 months ago

I think the O’s could win the WS and we’d still be having these debates, instead of watching the actual games.

Kevin Quackenbush's Beard
Guest
1 year 9 months ago

We can do both, you know.

Well-Beered Englishman
Guest
1 year 9 months ago

Wow, both of you are really bringing your best to the username game.

Dave is a co-founder of USSMariner.com and contributes to the Wall Street Journal.
Guest
1 year 9 months ago

I’m too lazy to come up with anything clever like that.

Ballfan
Guest
1 year 9 months ago

is there an award for complete un-originality in the name game?

Name
Guest
1 year 9 months ago

I haven’t gotten one yet.

$1 Farmer
Guest
1 year 9 months ago

How bout an award for middle-of-the road mediocre name.

Grammar Police
Guest
1 year 9 months ago

“Mike Trout, Mike Carp and AJ Pollock walk into a sushi bar, and the sushitender says: ”

A quick look at the name, and all I saw was shitender. Darn near made me spit!

Bartolo Colon
Guest
1 year 9 months ago

Waiting for the first joke about me and a shitender.

Nick
Guest
1 year 9 months ago

I think that you may be missing the point, well for me at least. I completely understand and accept the methods being used. What I would like to look at more closely is what causes the variance. Randomness can and does occur. What I would like to understand is whether we are attributing all of the variance to randomness and overlooking things that could have an impact. Is there something in the way the Orioles are built to suggest that they may be more prone to random events? Could a team with strengths in homeruns, bullpen and defense be impacted differently? Does the team’s lack of top of the rotation quality pitching, rather relying on #2 and #3 type starters make them more prone to random events? Have the Orioles found a market inefficiency with one-dimensional players that Moneyball 2.0 will be written on? These are the types of questions we should be spending time answering rather than defending the formula that works for the mass.

Surrealistic Pillow
Guest
1 year 9 months ago

There are no questions to answer just yet, is the point, because the results are normal.

I flipped a coin 10 times and it came up heads on 8 occasions. What questions should I be asking?

Nick
Guest
1 year 9 months ago

That is a bad comparison.

David
Guest
1 year 9 months ago

It isn’t. You’re assuming causality for an outlying result that does fit within the accepted range of outliers.

LaLoosh
Guest
1 year 9 months ago

it’s bad because 1 coin flips is basically the equivalent to 10 baseball games. try and get 80% heads if you flip the coin 486 times and then come back to me.

LaLoosh
Guest
1 year 9 months ago

schnizz! supposed to read “10 coin flips”

Crumpled Stiltskin
Guest
1 year 9 months ago

Just because outliers should be expected does not mean we shouldn’t ask WHY those outliers exist? And perhaps that’s not always because of randomness.

For instance, if we take the results of the SAT, they are distributed in a bell curve also, but the “outliers” in such a test aren’t random, they are people better at taking the test. (Not saying success at the test signifies much else than being good at taking it, but there is a non-random reason why someone falls in the 99th percentile and others fall in the 50th percentile.)

Another example is the hot hand question in basketball, where the researchers foolishly asserted there is no hot-hand after looking at the data and finding it fell within a standard model, rather than the much more sensible assertion, that the data confirms not only that the hot hand exists (and the cold hand), but also that hot streaks and cold streaks should be expected. Just because it’s impossible to predict, when and for how long such streaks will occur, does not mean that the reason the streaks themselves occur is random.

Jake
Guest
1 year 9 months ago

Suppose there was a test people, how many of those 99th percentile would suddenly drop to 50th percentile in subsequent years?

Jake
Guest
1 year 9 months ago

I mean, “suppose there was a test people could take annually….”

indyralph
Member
1 year 9 months ago

Of course it’s fine to ask questions about why there is variance. You are assuming that the Orioles were the first people to ask and answer these question, and that you are the second. And that Dave and every other statistically minded analyst is just blatantly ignoring them. Five years ago we were talking about Pythagorean records, then it was second order Pythagorean records, then it was Pythagenpat, and now it’s BaseRuns. People are making the models better. They are asking the right questions. And all that’s come of it is a better model with nothing to support that outperforming the individual events is sustainable.

TADontAsk
Member
1 year 9 months ago

“it’s bad because 10 coin flips is basically the equivalent to 10 baseball games. try and get 80% heads if you flip the coin 486 times and then come back to me.”

But the Orioles aren’t winning at an 80% rate, nor is their “true” winning percentage a 50/50 chance like a coin. That’s the issue with the coin analogy – that people tend to think of everything as 50/50. In this example, if your coin was 55/45, then it’s quite possible to get 57-58% heads even over a larger sample of 1-3 “seasons”.

John Choiniere
Member
1 year 9 months ago

@Crumpled Stiltskin

My apologies for being blunt, but the SAT results idea isn’t a good comparison here, because Dave isn’t charting the number of wins, he’s charting the difference between the number of wins observed and the number predicted.

Drosophila's Tremendous Phallus
Guest
1 year 9 months ago

“Just because it’s impossible to predict, when and for how long such streaks will occur, does not mean that the reason the streaks themselves occur is random.”

That’s exactly what randomness is. It’s wholly related to the observer and the amount of information an observer has. Anything “impossible to predict” is the definition of random.

B N
Guest
1 year 9 months ago

@Drosophila: That is certainly NOT the definition of “random.” For example, “chaotic” would equally apply. The “random number generator” on your computer (unless you are using system/external entropy) is not random. Despite this, unless you know the random seed, a cryptographically-secure RNG cannot be predicted with any accuracy until you have an absurd number of samples (i.e., for some of them, more than the number of particles estimated to exist in the universe). But it’s not random. Not even a little. It’s usually a simple, recursive algorithm. Maybe a few thousand lines of code.

What you are looking for is “unexplained variance.” This covers: random (happens due to a process with implicit probabilities), chaotic (happens due to a process that is deterministic, but is impossible to predict until you hit it, such as calculating the digits of pi), and unobserved (e.g., you just can’t see the factors that cause it).

Pale Hose
Guest
1 year 9 months ago

I agree that the coin flip analogy is poor. In flipping a coin all deviation from the mean can be explained by randomness. The same assumption cannot be applied to the BaseRuns model unless it is assumed that the model perfectly represents reality. I don’t believe that there is anything meaningful about the Orioles’ deviation from the mean, but it is still worth exploring, in general, what factors can cause deviation outside of randomness.

olethros
Guest
1 year 9 months ago

We already know what the factor is – sequencing. The O’s are getting hits in bunches, and allowing them singly. The randomness is the order in which events occur. If you want to investigate if there’s some skill the O’s have at getting hits when men are on, and preventing hits when men are on, go for it. You won’t find anything, though. See also the St. Louis Cardinals.

Pale Hose
Guest
1 year 9 months ago

I perfectly understand and accept that sequencing/randomness are a large part of the deviation. I am skeptical that it represents 100% of the deviation. Like I said in the original post, the comment is not about the Orioles specifically, but about models in general.

olethros
Guest
1 year 9 months ago

So what is it then? Pitch framing? Outlandishly good defense? Or is the O’s BA with runners on higher than their overall team BA, and their BA against the opposite?

If the latter, it’s sequencing. Otherwise you’ve got to find some skill or set of skills the O’s have that are worth what, 80 runs per season?

LaLoosh
Guest
1 year 9 months ago

I might suggest that in tighter run differentials, bullpens probably log a higher rate of high leverage innings, heightening their importance on game outcomes. Maybe this is not factored/weighted properly…

LaLoosh
Guest
1 year 9 months ago

I’m not suggesting that models could even predict how effective the O’s pen has been but just making an attempt at explaining the models’ ineffectiveness.

Pale Hose
Guest
1 year 9 months ago

My question is what can cause deviation from the mean outside of sequencing/randomness? Some possibilities could be pitch framing, lineup optimization, measurement error in small samples of defensive statistics, etc. My original point was that as long as there are things outside of randomness that can cause deviation from the mean, then the coin flip analogy does not work.

Go Nats
Guest
1 year 9 months ago

All hitters hit better with men on base, but with variance in that added hitting. Perhaps the orioles have players that improve more than other hitters with men on base. Plus, perhaps they are better at getting guys out on the basepaths? Those two abilities seem like they would account for more than their share of close wins.

JJ
Guest
1 year 9 months ago

You fail to address why/how the Orioles have positioned themselves as an outlier in a normal distribution. Saying it’s luck of the draw is not believable. Why is there no credit given to the Orioles for playing over their heads? Do you think there is zero human element to a nuanced game? The game changes, and the Orioles found a way to win in today’s game. Don’t close yourself off to creative thinking.

mattdecap
Guest
1 year 9 months ago

This would be a better question to ask if the Orioles consistently did this year after year. What the article is saying, I think, is that one team being an outlier twice in three years is not that weird, or at least not weird enough to start questioning it.

Nick
Guest
1 year 9 months ago

No it wouldn’t be. You guys are missing the fact that each year the variables change. Roster construction, opposing teams, etc. The only real constant are the core players and the manager – but even then their competition level changes.

bmarkham
Guest
1 year 9 months ago

“You fail to address why/how the Orioles have positioned themselves as an outlier in a normal distribution. Saying it’s luck of the draw is not believable.”
That you consider it not believable is your limitation, not Cameron’s or BaseRuns’. Outliers don’t have a why, it’s random. Haven’t you ever played poker? That is also a game of both skill and luck.

“Why is there no credit given to the Orioles for playing over their heads?”
See, this is further evidence that you’re not even understanding what we’re talking about here. Dave is not talking about the Orioles “playing over their heads” in the sense of outplaying say, their preseason projections. He’s talking about the fact that they’ve benefited from sequencing in the form of both run scoring and run prevention. In other words the cumulative value of each individual plate appearance outcome says that on average they would win less games than they have. And as Dave mentions in the article, teams have not been able to repeat this feat reliably.

bmarkham
Guest
1 year 9 months ago

“The game changes, and the Orioles found a way to win in today’s game. Don’t close yourself off to creative thinking.”
A Cardinals fan could have said this last year, when the Cardinals shattered the record for BA w/RISP. Now this year with a very similar cast of players suddenly they’re hitting much worse w/RISP.

Again, randomness. Not everything is signal, some is noise. The fun of baseball analysis is separating the two.

CoolWinnebago
Guest
1 year 9 months ago

“Does the team’s lack of top of the rotation quality pitching, rather relying on #2 and #3 type starters make them more prone to random events?”

You cant be more prone to RANDOM events, they are random. I get your point, but the orioles havent done anything statistically significant.

GiveEmTheBird
Guest
1 year 9 months ago

Are they random events or just different events than the model would have picked? “the orioles havent done anything statistically significant” – only compared to the model. If the model doesn’t value the odd combination of low on-base / high home-runs, bad starting pitching / good defense / good bullpen then perhaps their “random”ness is really just something the model doesn’t do well in predicting.

CoolWinnebago
Guest
1 year 9 months ago

No. Not compared to the model. Compared to the sample. They havent done anything that separates themselves from the sample in a statistically significant way.

GiveEmTheBird
Guest
1 year 9 months ago

To CoolWinnebago @ 1:59pm:
I don’t really follow. I guess there are two things here:
1) the predictive models, I am wondering whether they are valuing the odd construction of the Oriole’s team very well. Said differently the Orioles may have constructed their team to have more variation from the model’s predictions.

2) the ordering of events (BaseRuns vs. actual outcome), which I think is the sample you are talking about. The Orioles are 9th out of a sample of 330 – isn’t that pretty far out on the sample’s distribution curve? I think that does show they have gotten ‘luckier’ than average (just look at the games on the West coast where their HR prone offense was silent but their mediocre pitching only allowed 1 run and they eked out a win). No argument there – but I don’t think it explains the Orioles out of first place in the ALE, probably just explains them out of 7-8 games in first place.

bmarkham
Guest
1 year 9 months ago

“If the model doesn’t value the odd combination of low on-base / high home-runs, bad starting pitching / good defense / good bullpen then perhaps their “random”ness is really just something the model doesn’t do well in predicting.”

But that is simply a baseless assertion. Do you have any other teams in mind with those same characteristics that did better than BaseRuns? You’re just describing your favorite team and then assuming that the model (which you doubtlessly know nothing about) MUST simply systematically underrate those teams. It’s incredibly lazy analysis, and it’s insulting to someone like Dave Cameron who has put tons of hours into this analysis for you to so lazily disregard it just so you can feel better about your favorite team.

Joe
Guest
1 year 9 months ago

The point is that the result actually is within the range of reasonable expectations, so there’s no reason to believe that the dominant effect is anything but noise.

I do think its an interesting question as to what teams have the largest degrees of projected variance, and that really is not too hard to answer as it should be baked right into the projections if we got the full data set (i.e. the full output of the simulations) rather than just the 50% number they give us here.

AlbionHero
Guest
1 year 9 months ago

“The point is that the result actually is within the range of reasonable expectations”

The last 3 years Fangraphs and other advanced stats sites have projected the Orioles to be a low/mid 70s win team. They consistently outdo that by 10-20 wins. How is that within the range of reasonable expectations? I expect they’ll once again project the O’s in last place with 74 wins next year and the O’s will win 85-93 again.

indyralph
Member
1 year 9 months ago

Please re-read the post and think about this some more. It’s solid statistically and very well written. And your response is basically “Nuh-uh”.

AlbionHero
Guest
1 year 9 months ago

Of course I don’t take the piece seriously, he’s still trying to act like the Orioles aren’t an elite team when they’ve proven otherwise. I could really care less about what the stats say when the on field performance says otherwise… since thats all that really matters. He’s trying to find excuses for why he constantly discredits the Orioles.

CoolWinnebago
Guest
1 year 9 months ago

why are you even on fangraphs then

AlbionHero
Guest
1 year 9 months ago

I like the stats, but they’re not the end all and should be taken with a grain of salt. Teams can outdo them or underdo them, and often do.

buddy boy
Guest
buddy boy
1 year 9 months ago

this isn’t about preseason projections. base runs isn’t preseason projections. it’s based on actual events and the resulting expected runs scored and runs allowed and thus expected win %.

Jason B
Guest
Jason B
1 year 9 months ago

Well that went sailing right past.

kevinthecomic
Guest
kevinthecomic
1 year 9 months ago

why are you so angry? i thought ignorance was bliss!

JosephK
Guest
JosephK
1 year 9 months ago

AlbionHero cares more about actual results than what would theoretically happen if you could play the season out 1,000,000 times in alternative universes and observe the distribution of outcomes. That makes him/her completely normal, and it’s a big reason why sabermetrics will never go completely mainstream.

Balk
Guest
Balk
1 year 9 months ago

“Teams can outdo them or underdo them, and often do” is what statisticians refer to as the

Nick Mandarano
Member
1 year 9 months ago

You COULDN’T* care less.

Nick
Guest
Nick
1 year 9 months ago

Actually it’s not. The entire set of data is within the expected results – i.e. the number of outliers is within the expected number. They are still an outlier.

indyralph
Member
Member
indyralph
1 year 9 months ago

Your questions are interesting, but the answer is already here. The 330-team sample is made up of teams built in all manner of ways. If there were a particular way of building a team that would cause it to be an outlier, you would see more outliers than the model suggests. To accept that there is a cause for the Orioles being an outlier, you must accept that there is something about the Orioles that is unique (in the literal sense, not the way people use unique these days) or exceptionally rare. Such a thing would likely be rare enough that it would not take such an effort to present itself.

Nick
Guest
Nick
1 year 9 months ago

The game is also different now than it was then, so comparing the ways in which those teams were outliers may be more complicated. What worked 10 years ago probably doesn’t work now, and vice versa. There were a lot more 30+ home run hitters in 2004 than there will be in 2014.

indyralph
Member
Member
indyralph
1 year 9 months ago

The intent of the game is fundamentally the same, score more runs than the opponent. The fact that teams have changed the way they go about that should lend itself to a wider variety of team constructions. And if team construction has a bearing on producing outliers, you should see more outliers correlating to certain constructions. The whole point of sample size is that the larger it gets, the less the impact of things that don’t matter and the more clearly it identifies the things that do matter.

Pale Hose
Guest
Pale Hose
1 year 9 months ago

“The point is that the result actually is within the range of reasonable expectations, so there’s no reason to believe that the dominant effect is anything but noise.”

I agree with this comment, Joe. The sentiment that I am getting around here is that because the result is reasonable that there is no reason to rigorously validate the model. That I disagree with.

LaLoosh
Guest
1 year 9 months ago

I think that if prediction models were consistently off by 15-20 wins per year most people wouldn’t waste time referring to them. The idea that the O’s have been within the “range of reasonable expectations” in attempting to justify the prediction models is laughable.

My take is that the models have difficulty quantifying defensive runs saved from team to team and that the O’s may be benefitting more from this than we think.

haishan
Guest
haishan
1 year 9 months ago

I have news for you: Prediction models are consistently off by double-digit wins per year. Doing that three consecutive years for the same team is less common, but it happens, and probably not more than we’d expect by chance.

BaseRuns isn’t a prediction model, though. It’s based on what happened on the field — it just attempts to strip away “clutch” events, because it turns out that “clutch” hitting and pitching doesn’t have any predictive value in general.

LaLoosh
Guest
1 year 9 months ago

Yeah they are and it shows how little attention we should pay to these models.

Iron
Guest
Iron
1 year 9 months ago

What do you mean by ‘consistently’? If, for instance, a model is off by two standard deviations about five percent of the time as you just finished reading…

LaLoosh
Guest
1 year 9 months ago

Hysterical. Any time that someone wants to apply practical reasoning, stat heads have a cow.

Jason B
Guest
Jason B
1 year 9 months ago

…just like any time a fan feels their team is being “disrespected” by said models, they have an entire cattle farm.

Cow
Guest
Cow
1 year 9 months ago

Dude, leave me and my friends out of this.

TerryMc
Guest
TerryMc
1 year 9 months ago

One key here is the substitution of the word “expected” with “predicted”. BaseRuns doesn’t predict anything. Instead it takes the sum of all events that have happened in real life, strips away the context (sequencing), and then generates an expected total. The difference between that expected total and the “real life” total that includes sequencing is the discussion here. There is no discussion about prediction.

John Stamos
Member
John Stamos
1 year 9 months ago

Read this one: http://www.fangraphs.com/blogs/clutch-baseball-teams-arent-clutch-baseball-teams and look at the very first chart (Clutch Score vs. Actual Win% minus Baseruns Win%).

Basically, you are trying to find a correlation between the difference in BaseRuns performance and the actual outcome. The Clutch Score correlates nicely to a linear regression. Clutch describes how well a group (player/team/other) performs as it relates to the game leverage, or essentially sequencing on offense and pitching/defense. The article contains many anecdotes following this line of thinking and I think in turn addresses your position on the “market inefficiency.” The article pretty clearly lays out how clutch is not predictive so there’s no market to take advantage of there.

Justin
Guest
Justin
1 year 9 months ago

I agree with Nick.

Dave has done nothing here but show that the distribution is approximately normal, which says absolutely NOTHING about whether there is skill involved in the results.

To repeat my example downthread, you’d get results pretty much just like this if you did the same analysis on wOBA for hitters. The distribution would be roughly normal, and you’d find that there are the expected number of outliers. Going from there to drawing the conclusion that wOBA differentials are based on only luck would be pure insanity.

indyralph
Member
Member
indyralph
1 year 9 months ago

To repost my response downthread: The skill set underlying Mike Trout’s wOBA (ISO, contact rate, walk rate, etc.) can be consistently correlated with wOBA. Nobody has presented any statistical argument of traits that can cause win% to beat BaseRuns. More importantly, the skills underlying wOBA can be correlated year-to-year. Jeff demonstrated yesterday, pretty convincingly, that beating BaseRuns is correlated with clutch, and that past clutch is not correlated with future clutch.

Dave has demonstrated convincingly that the actual results align with what BaseRuns model produces. If you think there is a problem with the model, the burden is on you to find a statistical argument against it.

Jianadaren
Guest
Jianadaren
1 year 9 months ago

>Nobody has presented any statistical argument of traits that can cause win% to beat BaseRuns

I got a trait: situational pitching. Pitchers like Tom Glavine appeared to have a true skill in minimizing damage. When the bases are empty, hitters are challenged, resulting in lower OBP and higher SLG. In dangerous situations, the hitters are pitched around, resulting in more OBP but lower AVG and SLG.

Pitchers like that appear to be able to predictably influence sequencing in their favour, which could go some way to explaining deviations from BaseRuns.

http://www.baseballprospectus.com/article.php?articleid=12733

Cato the Elder
Guest
Cato the Elder
1 year 9 months ago

The difference is that wOBA is relatively predictive, and efficient event sequencing, or “clutchiness,” is not. That point has been shown again and again, and there is really little imperative to reinvent the wheel every time this discussion is had. The point about the normal distribution is only to illustrate that the Orioles’ run of outperforming their expected record is well accounted for by the existing model.

Justin
Guest
Justin
1 year 9 months ago

To both indyralph and Cato,

I am in no way arguing that clutchiness is a skill, and I agree that it’s almost certainly not. I’m only pointing out that this analysis by Dave doesn’t come close to showing it.

indyralph
Member
Member
indyralph
1 year 9 months ago

No, because the article that Jeff wrote yesterday, and Dave linked to above, does show it.

olethros
Guest
olethros
1 year 9 months ago

Dave’s analysis wasn’t intended to show that. It was intended to show that the Orioles’ outperformance of their expected production is not an indication that the model describing that expected production is wrong. Look at it this way – every single season, we would expect 1 team to outperform the way the O’s have, another to underperform to the same degree, and the balance of the league to fall somewhere between those two extremes.

Justin
Guest
Justin
1 year 9 months ago

indy,

You got me. He linked to something that shows it then proceeded to write an article with an irrelevant analysis. Conceded.

olethros,

“Look at it this way – every single season, we would expect 1 team to outperform the way the O’s have, another to underperform to the same degree, and the balance of the league to fall somewhere between those two extremes.”

And this is the issue. Dave’s definitions of the magnitude of overperformance and underperformance are based on standard deviations drawn from the observed population. If there were skill involved in over/under performance, then the standard deviation would be larger, causing the Orioles and others to not look too strange.
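The point about standard deviations drawn from the observed population can be checked with a small simulation. What follows is a minimal sketch under stated assumptions, not anything taken from the article: the function names, the 4-win spreads, and the 20,000 simulated teams are all hypothetical. It compares a world where beating BaseRuns is pure luck with one where a persistent team skill is added on top, and reports two things for each: the share of teams beyond two observed standard deviations, and the year-to-year correlation of the over/under performance.

```python
import random
import statistics


def pearson(xs, ys):
    """Plain Pearson correlation, written out to avoid version-specific helpers."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate(skill_sd, luck_sd, n_teams=20000, seed=0):
    """Return (share of teams beyond 2 observed SDs, year-to-year correlation).

    Each hypothetical team has a fixed "sequencing skill" (in wins) plus
    independent luck in each of two seasons. All numbers are illustrative.
    """
    rng = random.Random(seed)
    skill = [rng.gauss(0.0, skill_sd) for _ in range(n_teams)]
    year1 = [s + rng.gauss(0.0, luck_sd) for s in skill]
    year2 = [s + rng.gauss(0.0, luck_sd) for s in skill]
    mean = statistics.fmean(year1)
    sd = statistics.pstdev(year1)  # SD taken from the observed population
    share = sum(abs(x - mean) > 2 * sd for x in year1) / n_teams
    return share, pearson(year1, year2)


luck_only = simulate(skill_sd=0.0, luck_sd=4.0)
luck_plus_skill = simulate(skill_sd=4.0, luck_sd=4.0, seed=1)

print("luck only:       outlier share %.3f, repeat correlation %.3f" % luck_only)
print("luck plus skill: outlier share %.3f, repeat correlation %.3f" % luck_plus_skill)
```

In this toy setup the outlier share should look about the same in both worlds, because the yardstick stretches along with the data. The year-to-year correlation is where the two worlds separate, which is the repeatability test other commenters are pointing to.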

indyralph
Member
Member
indyralph
1 year 9 months ago

The correlation between clutch and overperformance is 0.67. If something else has a stronger correlation, it would very likely be casually observable in the same way that the relationship between clutch and overperformance was casually observed by Jeff. If your argument is that no statistical model is perfect, well, you might as well just have posted that the sky is blue. Otherwise you should come up with some suggestion of why the Orioles, of 330 teams, are the exception.

Andy
Guest
Andy
1 year 9 months ago

No, wOBA is not distributed normally, for two reasons. First, MLB players are a select group of the general population. Even if wOBA reflected purely physical attributes like height, weight, muscular strength, reaction time, vision, etc., which are to a large extent normally distributed, players at this level represent the far end sliver of the curve, not the entire normal distribution. And second, hitting a baseball reflects more than just physical gifts, a lot of training goes into it, which brings in social factors that are not normally distributed. These same social factors are why wealth is not normally distributed, and is particularly skewed for professional athletes.

But the larger issue is that even if wOBA were normally distributed, this could be explained in terms of chance. The normal distribution would be the outcome of the distribution of genes in a population, which is governed by chance.

Costanza
Guest
Costanza
1 year 9 months ago

True, but wOBA for hitters would be repeatable, year-after-year. This article just showed that overperforming BaseRuns expected winning % is not repeatable. The fact that the Orioles have done it for 3 years is well within an expected outcome; an outlier or an oddity.

If it’s a skill, they’ll do it again. There is little evidence to suggest they’ll do it again, outside of the probability due to chance.

Nick
Guest
Nick
1 year 9 months ago

I disagree with that statement. Each year is different. The roster construction will be different. Their opponents roster construction will be different. Even the part that is attributed to randomness will be different. If it were related to a skill, and you played this exact season over and over again then you would expect them to repeat the results. Simply because they don’t repeat it with a different set of variables doesn’t mean that it wasn’t skill the first time.

Andy
Guest
Andy
1 year 9 months ago

I agree with this entirely. I’m not disagreeing at all with the conclusions of this article. I was just pointing out that wOBA is not normally distributed.

Andy
Guest
Andy
1 year 9 months ago

“For instance, if we take the results of the SAT, they are distributed in a bell curve also, but the “outliers” in such a test aren’t random, they are people better at taking the test.”

But why are they better at taking the test? In very large part, because of certain genes, which are distributed randomly in a population. Just because a cause for some phenomenon can be identified at some level of analysis does not mean that, at a broader level of analysis, that cause can’t be understood as the product of random forces.

Andy
Guest
Andy
1 year 9 months ago

I think a lot of people here are under the mistaken view that an effect is either the product of random chance or of some cause or set of causes. This is a false distinction. While some phenomena may be the direct result of chance, e.g., quantum effects, even when effects have well known causes, these causes may themselves be largely the product of chance. We arbitrarily draw a line at human behavior and say that some people are better at some skills than others, but a deeper analysis may reveal chance at work in these differences.

In fact, if you want to pursue this issue seriously, you get thinkers like Meillassoux who argue that even the fundamental laws of nature must have originally been the product of chance. And indeed, any alternative to that conclusion seems to pose major problems for the scientific worldview.

Justin
Guest
Justin
1 year 9 months ago

Andy, you seem to have some serious misunderstandings about probability and chance and what humans mean when they use those terms.

Andy
Guest
Andy
1 year 9 months ago

Not at all, I’m just aware of and comfortable with different levels of analysis. The fact that something is a product of skill does not mean that it can’t also be the product of chance.

The outliers in an IQ test ARE random, in the sense that a priori we know that there will be outliers, and about how many at various points, but we don’t know who they will be.

Now to the extent that SAT scores reflect IQ, the same holds. If you want to argue that SAT scores also reflect other factors, such as social ones (upbringing, access to special tutors, etc.), I’m fine with that, but to the extent that such social factors are involved, then a normal distribution doesn’t result. Again, a classic example is wealth. Wealth is not normally distributed, it follows a power curve, and the reason it does is because social factors play a huge role.

Andy
Guest
Andy
1 year 9 months ago

I’ll just add that I understand that you and others are arguing that the Orioles may be exhibiting a skill without any reference to the chance distribution of that skill. And Dave’s point is that there is no skill involved at all, not even a skill that is ultimately the product of chance in a population.

But if this point were correct, you would expect to see the results repeated over a long period of time–just as someone with a high IQ demonstrates superior mental skill over a long period of time–and Dave’s analysis provides no evidence for this.

Andy
Guest
Andy
1 year 9 months ago

Let me rephrase it. The argument some are making here is that 1) even the products of skill can be distributed normally; 2) the winning of teams vs. baseruns is distributed normally; therefore, 3) skill could be involved in winning more than predicted by base runs.

But even without Dave’s analysis, this is a very weak argument. The fact that some things that require skill have something in common with some things that don’t involve skill doesn’t allow us to say anything a priori about whether skill is or isn’t involved. At best, this argument just puts us where we are before an analysis like Dave’s—an untested possibility.

Justin
Guest
Justin
1 year 9 months ago

Andy, your rephrasing is much better. I have no disagreements with your last post.

“At best, this argument just puts us where we are before an analysis like Dave’s—an untested possibility.”

This is my point entirely. Dave’s analysis doesn’t add anything, and leaves us exactly where we were before. I’ll repeat myself: I am NOT making an argument that overperforming BaseRuns is a skill, I am only pointing out that Dave’s analysis doesn’t do anything to disprove that notion.

KK-Swizzle
Guest
KK-Swizzle
1 year 9 months ago

I don’t purport to be a statistics expert, but I’ll try to help. When developing statistical models, you make assumptions and predictions, then you compare reality to what you expected. If they align, you have developed some measure of evidence in support of your assumptions. In this case, the assumption is that no team has a significant ability to control the sequencing component to scoring runs and winning games. The corresponding prediction is that teams’ winning percentage relative to BaseRuns performance will follow a normalized bell curve. The data presented, then, is a rather strong indicator that a team’s performance in close games is indeed random.

Now that we have data, we can create qualitative explanations of why this is the case, and test those with an entirely new statistical study. Or we can question the result and develop a counter-study that looks FOR teams’ ability to outperform BaseRuns and/or preseason projections. I’m in medical school, so I’ll leave these tasks for someone with more free time :) Does this make sense? In summation, it’s good to question results: you can never have too much data. Just make sure to do it in the right way!

AC
Guest
AC
1 year 9 months ago

I read somewhere that if you ask a human to write down a string of 100 random digits, and also have a computer spit out a string of 100 random digits, you can almost always tell which came from the human and which came from the computer.

The computer’s results will actually be more random, and a human’s results will be what they THINK is random. The key: look for digits repeated three times in a row. A human will almost never do that, and will bounce around the numbers more, while a computer correctly treats every digit as an independent event, and triple-repeats will usually happen in a 100-digit sample.

The takeaway is that we, as people, are just really crappy at doing and understanding randomness. The whole point is that there is no explanation when something is random. It’s just something that happens sometimes.

That’s where the O’s appear to be at. Sometimes a crappy hitter hits a HR, but that doesn’t mean that he turned the corner into Mike Trout. And sometimes a near-perfect hitter strikes out, but that doesn’t mean he’s lost his ability to make contact. Over the course of a series, or a month, or a season, those things get lost in the randomness. And over the course of a career, Buck Showalter and the O’s bullpen will have good years and bad years. Sometimes, like the example above, there will be strings of three in a row, but that’s evidence OF randomness, not evidence against it.
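The triple-repeat claim above is easy to check. The sketch below is illustrative only (the function names, trial count, and seed are assumptions, not from any cited source): it computes the chance that a string of 100 independent, uniformly random digits contains the same digit three times in a row, once exactly by tracking run lengths and once by simulation.

```python
import random


def has_triple(digits):
    """True if any digit appears three times in a row."""
    return any(
        digits[i] == digits[i + 1] == digits[i + 2]
        for i in range(len(digits) - 2)
    )


def prob_triple_exact(n=100, base=10):
    """Exact chance of at least one triple repeat in n uniform digits.

    run1 / run2 hold the probability that no triple has occurred yet and
    the current run of identical digits has length 1 / 2.
    """
    run1, run2 = 1.0, 0.0
    for _ in range(n - 1):
        # A different digit resets the run to 1; the same digit extends 1 -> 2.
        # Extending a run of 2 would create a triple, so that mass is dropped.
        run1, run2 = (run1 + run2) * (base - 1) / base, run1 / base
    return 1.0 - (run1 + run2)


def prob_triple_simulated(n=100, trials=20000, seed=0):
    """Monte Carlo estimate of the same probability."""
    rng = random.Random(seed)
    hits = sum(
        has_triple([rng.randrange(10) for _ in range(n)])
        for _ in range(trials)
    )
    return hits / trials


exact = prob_triple_exact()
simulated = prob_triple_simulated()
print("exact: %.3f  simulated: %.3f" % (exact, simulated))
```

Under these assumptions the probability should come out somewhat better than a coin flip, so "usually" is fair, though a truly random 100-digit string without a triple is not rare either.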

Fiers at the taco bell
Guest
Fiers at the taco bell
1 year 9 months ago

Randomness does occur, the Orioles are within the bounds of randomness. Check. Ergo, the model is perfect, now I’ll write a nice inflammatory piece of Oriole fan bait.

Or… you could approach this from a different angle, and wonder whether there is something about the Orioles (maybe, maybe not…) that could inform a better model.

Fred
Guest
Fred
1 year 9 months ago

Hi Dave – Great article. This is a really thoughtful response to all of the questions you were getting in the chat yesterday. I think that in these discussions, a team out performing preseason projections and having a better record than component events would suggest are being conflated a bit. The Orioles have done both and they are two separate issues, but I’m not sure the difference has been completely clear throughout these conversations.

The Ancient Mariner
Guest
The Ancient Mariner
1 year 9 months ago

As H. L. Mencken put it, “For every complex problem there is a solution which is simple, easy to understand, and wrong.”

Kevin
Guest
Kevin
1 year 9 months ago

Yeah…but he has an obvious bias in this case.
#anachronistichomer

Matthew
Member
Member
1 year 9 months ago

As it was featured on Effectively Wild today, something worth looking into as well is runs prevented on the basepaths by defenses. Essentially, BsR allowed by a team’s defense. Something I’d love as a feature.

Joe
Guest
Joe
1 year 9 months ago

Thank you Dave for doing this. I was getting annoyed by the people who were dismissively telling people “no, this is just random variation, you don’t get it” without just SHOWING them the analysis. That’s all you have to do, which you did, finally.

AlbionHero
Guest
AlbionHero
1 year 9 months ago

What this article doesn’t take into account is that the Orioles have the “Norfolk Shuttle”… they constantly bring players up and down from AAA when they are hot or cold, so just because players who were bad earlier hurt the team’s advanced stats doesn’t mean that it’ll have any impact on their future play. As far as I know, the O’s have the most transactions between AAA of any team the last 3 years. The O’s play who’s hot now and don’t care about players “Regressing to the mean”.

Calvin
Guest
Calvin
1 year 9 months ago

If the players they brought up from AAA were always on the ‘hot’ side and performed on the ‘hot’ side while in MLB, it would show through in their performance both in reality and context neutral.

Except, it has only shown through in their real wins/losses, as Dave has pointed out above. This is no different than if a regular MLB player was outperforming his context neutral stats (i.e. hitting home runs when the game was tied instead of when there is an 8-run lead).

That’s not to say the Orioles have not done a good job here. I have no clue. I just don’t think it’s the cause of them outperforming their context neutral stats.

MW
Guest
MW
1 year 9 months ago

…except the concept of “hot” and “cold” players is essentially a fallacy.

AlbionHero
Guest
AlbionHero
1 year 9 months ago

Yet it works in reality and it messes up the projections because people refuse to accept that players are human beings who have times they feel better and play better and aren’t just robots.

AK7007
Member
AK7007
1 year 9 months ago

The problem is that there’s no way to know when that “hot streak” will end – you start “playing the hot hand” without any way of knowing if today is the day that they will return to earth.

The question is, are they playing better because they feel better, or do they feel better because they are playing better? It’s probably more of the latter, and might explain why streaks aren’t super predictive.

I’d like to know how much of that “hot streak” is due to BABIP fluctuation? I’m looking forward to MLBAM batted ball data to help with that. Are players that say they are “dialed” hitting balls harder and on a line? Or are they hitting the same, but getting lucky with hits falling in?

sk
Guest
sk
1 year 9 months ago

the “hot hand” is a myth.

Nick
Guest
Nick
1 year 9 months ago

Please stop commenting, you are making Orioles fans who actually want to investigate this look dumb.

AlbionHero
Guest
AlbionHero
1 year 9 months ago

Cameron and the other “Experts” have been trolling Oriole fans for years by constantly making backhanded comments about how they’re just lucky or not that good, so I believe I have the right to not take what he says seriously anymore. He doesn’t even try to argue any of the real reasons why the Orioles buck the trends, he’s just trying to find excuses and why he’s not wrong.

Nick
Guest
Nick
1 year 9 months ago

Sure – but trying to refute that with stuff like the “Norfolk Shuttle” and “Hot Hand” and a real lack of any measurable evidence isn’t going to help.

AlbionHero
Guest
AlbionHero
1 year 9 months ago

Any O’s fan who watches the games knows what a huge impact calling up and recalling players has on the Orioles’ success. To just dismiss it would be trying to ignore reasons why the O’s outdo the projections.

Shankbone
Guest
1 year 9 months ago

There is a definite trend of backhanded comments for some teams and absolute lionization of others, depending on the front office. You can shorthand your contempt with “#6 org” if you like. The Giants have been in this boat. Now F/G doesn’t bother with any articles on them anymore. Good luck on the latest “lucky” run, may it go deep in the postseason.

Jason B
Guest
Jason B
1 year 9 months ago

“The Giants have been in this boat.”

Yep, the Giants took their turn as the most whiny fanbase for a while (nobody believes in us! Rings!) then last year it was the Braves (nobody believed in us!) and this year it’s the O’s fans’ turn (nobody believes in us!). Essentially, it’s any team that is doing well after not being expected to do as well.

Why they so badly need validation from Dave Cameron (or anyone) is beyond me.

Nick
Guest
Nick
1 year 9 months ago

I am an O’s fan who watches the games and I disagree. I would put more value on Adam Jones, Nelson Cruz, Manny Machado, etc. Yes – they occasionally get impact from a player who is called up but it is not the reason they win. Tell me the last position player who was called up and helped win a game? Caleb Joseph?

emdash
Guest
emdash
1 year 9 months ago

He’s not ‘trolling’ you. He’s stating an opinion you disagree with, and it makes you mad. That’s more on you than it is on him.

Wobatus
Guest
Wobatus
1 year 9 months ago

Hey Nick, not a call-up, and not a response to your basic point, but shout out for Steve Pearce. Designated for assignment in April, and more recently slumping and was behind Delmon Young etc on the depth chart. Still worth 2.8 WAR this year. With Machado ailing again and Davis handling 3rd Pearce is again asked to fill-in where needed. He’s been one of the reasons for the O’s “surprising” success.

Nick
Guest
Nick
1 year 9 months ago

100% agree about Pearce. The thing is the Orioles have a way of signing a ton of guys like Pearce and making the most out of the ones that stick. This model may not reflect that. They’ve done it with others in the past couple years. Hell they got value out of Omar Quintanilla. That was more towards my Money Ball 2 reference. Maybe they have found some market inefficiencies in the way they collect people who have some skillset in something that is hidden by their deficiencies.

nard
Guest
nard
1 year 9 months ago

“He’s not ‘trolling’ you. He’s stating an opinion you disagree with, and it makes you mad.”

The irony.

Well played, probable pedophile.

Beimel53
Guest
Beimel53
1 year 9 months ago

Except if that was true it would also raise their projected winning% as well. How can you not get the concept that this is about sequencing?

single, single, HR = 3 runs
HR, single, single = 1 run

In both instances the same 3 things happened, but the randomness of the order in which they occurred affected the team’s winning percentage.
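The sequencing point can be made concrete with a toy scorer. This is only a sketch under a deliberately simplified assumption that is not from the thread: every single moves each runner up exactly one base, and a home run clears the bases. The function name and event labels are hypothetical.

```python
from itertools import permutations


def runs_scored(events):
    """Score a sequence of events under a simplified base-running model.

    Assumption (illustrative only): a single ("1B") advances every runner
    exactly one base; a home run ("HR") scores the batter and all runners.
    """
    bases = [False, False, False]  # first, second, third
    runs = 0
    for event in events:
        if event == "HR":
            runs += 1 + sum(bases)
            bases = [False, False, False]
        elif event == "1B":
            if bases[2]:
                runs += 1  # runner on third scores
            bases = [True, bases[0], bases[1]]
        else:
            raise ValueError("unknown event: %r" % (event,))
    return runs


# Same three events, every distinct ordering.
for order in sorted(set(permutations(["1B", "1B", "HR"]))):
    print(", ".join(order), "=", runs_scored(order), "run(s)")
```

A context-neutral estimator sees the same two singles and one home run in every ordering, while the scoreboard sees anywhere from one to three runs.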

AK7007
Member
AK7007
1 year 9 months ago

“But Buck Showalter is smart enough to tell the players to only do things in the proper sequence!” *sarcasm*

chuckb
Guest
chuckb
1 year 9 months ago

Unfortunately he lacked that skill last season. Maybe he only had that skill in even-numbered years.

Mk
Guest
Mk
1 year 9 months ago

Maybe the O’s are really good at stealing signs.

RichW
Member
RichW
1 year 9 months ago

Wow so the Orioles can predict injuries, suspensions, and babies being born. That’s a real skill.

Shuck Bowalter
Guest
Shuck Bowalter
1 year 9 months ago

Would creating a model based on the idea of BaseRuns, but using weights to account for runs scored/surrendered against stronger or weaker competition (i.e., weighting runs scored against better opposing pitchers higher, etc.), help better describe a team’s true talent level?

Or would this not really tell us all that much?

Arbitration Clock
Guest
Arbitration Clock
1 year 9 months ago

“In fact, the existence of the right amount of outliers is actually evidence that the model works really well.”

Question: isn’t the issue less that a model has outliers, which is normal, but more that we can *predict* which teams will be outliers (at least according to O’s or A’s fans–that Showalter/Scioscia are doing/did something that didn’t show up in our traditional). Isn’t that a fundamentally different allegation from the one you answered? Say you tell me that 6% of the time we should expect a coin thrown 4 times to come up heads every time, and you proceed to lay out 30 or so coins. I then pick out a coin and you throw 4 heads in a row. Well within probability, absolutely. But then you ask me to do it again, and again I find a coin to do that…doesn’t that make you say: hmmm?

This isn’t to say we should throw out models, but continue to improve them, using more and more inputs as more and more information comes to light. What does Showalter say, for instance, about his team’s unique alleged skill?
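The arithmetic behind the coin example can be written out directly. This is a minimal sketch assuming fair, independent coins and exactly 30 of them (the comment says "30 or so"); the variable names are hypothetical.

```python
from fractions import Fraction

# Chance that one fair coin comes up heads 4 times in a row.
p_four_heads = Fraction(1, 2) ** 4

# Chance that at least one of 30 fair coins does it (assumes exactly 30 coins).
n_coins = 30
p_some_coin = 1 - (1 - p_four_heads) ** n_coins

# Chance that a coin named in advance does it, on two separate occasions.
p_named_coin_twice = p_four_heads ** 2

print("one coin, 4 heads:          %.4f" % float(p_four_heads))
print("some coin among 30:         %.4f" % float(p_some_coin))
print("named coin, twice in a row: %.4f" % float(p_named_coin_twice))
```

The gap between the last two lines is the distinction being drawn here: some coin producing 4 heads is close to expected, while a pre-selected coin doing it twice would be surprising if the coins were fair. Whether the Orioles were "selected in advance" in that sense is what the rest of the thread disputes.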

indyralph
Member
Member
indyralph
1 year 9 months ago

Your example has n = 2. Manny Machado walks 6% of the time. If he walked twice in a row, would you say “Boy, that’s super interesting. We must find out why he walked twice in a row.”

Arbitration Clock
Guest
Arbitration Clock
1 year 9 months ago

It’s not that Machado walks twice in a row, it’s that I predict that he does. It’s the difference between saying: tonight, one of the 36 starting hitters in two games will walk in consecutive ABs vs saying: Machado will be that hitter. And then he does.

indyralph
Member
Member
indyralph
1 year 9 months ago

I assure you that your prediction of what happens and what happens are independent events.

Jason B
Guest
Jason B
1 year 9 months ago

But as Dave points out, if you had predicted that about the Orioles last year (outperforming projections due to their “clutch” performance the previous year), you would have been badly wrong.

And if you had said “Gee Scioscia has this all figured out! He’s got the SECRET SAUCE!” you would be wrong for three years running now.

Which is not to say there’s not *some* signal there, but the evidence seems to lean toward it being more noise than signal.

Arbitration Clock
Guest
Arbitration Clock
1 year 9 months ago

@Jason B. Let me say, I agree with you generally here and let me refocus what I want to say:

What I’m saying is that the repeatability of certain teams on this list suggests that there is something else at work here which isn’t being measured. Don’t ask me what it is–I don’t know–but there is enough evidence to suggest that statistical variation doesn’t account wholly for teams like the 2012-2014 O’s (lumping 2013 in there as well because…what are seasons but arbitrary endpoints and the team is roughly the same since 2012). The original comment that I’m reacting to is:

“In fact, the existence of the right amount of outliers is actually evidence that the model works really well.”

But this isn’t necessarily correct when it’s the same team or players beating the model. To me, that suggests the opposite. Think of the Matt Cain xFIP stuff from years ago. It’s completely normal to expect a few players to have that kind of ERA-xFIP split, right? Just based on normal variation. But when a player does it enough times, the odds that that player’s variation is solely due to statistical variation shoot way, way down. That doesn’t *prove* that Matt Cain possesses a skill to limit HR rates, but the possibility that that is the case does rise. Moreover, would we expect him to be able to do that every single year? No, of course not, else players would be stat-generating robots with no luck involved, which is why it’s super tough to project player performance.

And so, you can see my problem with the Angels example you and Dave mention. The Angels team now is drastically different than it was in 2008. I think it’s a confluence of factors that produce this ability for a team, not one single front office, manager, or player but many things working in concert–which makes sense, because if it were something simple like the manager, it would be easily isolated. Which is something I suspect you’d agree with.

You’re right, of course, I can’t quantify it–and if I had bet on the Angels “skill”, I would have lost big. But the fact that certain combinations of teams/managers appear on the good ends of these lists suggests this is not totally explainable by statistical variation.

And this is my quarrel with saying: we expect outliers in a correct model. Mike Scioscia

Arbitration Clock
Guest
Arbitration Clock
1 year 9 months ago

And instead of getting 5 PAs, you only get 2. No no, my example seeks to illustrate the importance of questioning givens, like whether I can spot a weighted coin.

Fiers at the taco bell
Guest
Fiers at the taco bell
1 year 9 months ago

The Scioscia strawman is interesting.

That the current iteration of the Angels don’t overperform their baseruns doesn’t invalidate the fact that earlier iterations that were based on entirely different skills did.

Entirely different teams with entirely different players. The older versions were built on speed, defence and an elite bullpen… the kind of mix that might allow a team to overperform.

Pujols Hamilton Trumbo
Guest
Pujols Hamilton Trumbo
1 year 9 months ago

The hell you say

Jason B
Guest
Jason B
1 year 9 months ago

“That the current iteration of the Angels don’t overperform their baseruns doesn’t invalidate the fact that earlier iterations that were based on entirely different skills did.”

That’s a solid plan, then. Let’s find what things help us outperform our baseruns and then totally stop doing them, because we don’t need that kind of help!

Bill
Guest
Bill
1 year 9 months ago

I would like to commend and thank you for writing this article. One long-standing gripe among us fans of teams who are labeled outliers is a lack of substantive response from the stats community when questions are asked about why projection systems get their particular team wrong over multiple seasons. You’ve made your well-supported case, and it is appreciated.

I would like to know, do you stand by your assertion that the Orioles are “an okay team with an inflated record”? They went 17-7 in 26 games since the ASG against the best the American League has to offer. They weren’t beating Texas or Minnesota, they were beating the Angels, the Blue Jays, and the Mariners. They are currently 19 games over .500 in a division that, heading into yesterday, was tied for best overall winning percentage. The remark smacks of Keith Law’s infamous “There is literally nothing the O’s can do to prove they’re a good team” in September of 2012.

I could see the “inflated record” remark if they were beating up on bad teams, but they’ve actually been beating teams even you would agree are good. So why the “inflated record” remark in yesterday’s chat?

LaLoosh
Guest
1 year 9 months ago

some people get very invested in their views and can be close-minded to accepting any deviations.

indyralph
Member
indyralph
1 year 9 months ago

The article demonstrates that a sample of 330 teams playing 442 games apiece will contain outliers. It should go without saying that over a sample of one team playing 27 games, outliers will occur much more often.
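
To put a rough number on the sample-size point: if each game is treated as an independent coin flip with a fixed win probability (a simplifying assumption made here for illustration, not something the article claims), the spread of observed winning percentage shrinks with the square root of games played. The function name `win_pct_sd` is made up for this sketch.

```python
import math

def win_pct_sd(games: int, p: float = 0.5) -> float:
    """Standard deviation of observed win% over `games` independent games.

    Assumes every game is an independent Bernoulli trial with win
    probability `p` (a simplification; real games are not identical).
    """
    return math.sqrt(p * (1.0 - p) / games)

if __name__ == "__main__":
    # Compare a hot month, a full season, and a multi-season sample.
    for n in (27, 162, 442):
        print(f"{n:>3} games: SD of win% = {win_pct_sd(n):.3f}")
```

Under those assumptions a 27-game stretch has roughly two and a half times the spread of a full season, so a true .500 team running 100 points of winning percentage hot over a month is only about one standard deviation from expectation.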

Bill
Guest
Bill
1 year 9 months ago

That doesn’t preclude the Orioles from being a good team, though. They are a good team that wasn’t projected to be, but still, a good team. You can be an outlier and be a good team. It just means you weren’t expected to be. To deny actual results and label the O’s as having an “inflated record” doesn’t do any service to defending the projection, which DC actually did fairly well.

In fact, I’d say labeling the O’s as an “okay team with an inflated record” actually hurts his otherwise well-crafted argument, because he’s playing into the very stereotype he’s trying to puncture.

bmarkham
Guest
bmarkham
1 year 9 months ago

Did you read the article? Hint: it has nothing to do with projections. BaseRuns is not a projection. This is all about analyzing results, which say that the Orioles’ cumulative plate appearance outcomes will on average lead to fewer wins than they have so far.

This is all about sequencing of plate appearance outcomes into runs, and those runs into wins. Of which there is a natural amount of randomness.

Bill
Guest
Bill
1 year 9 months ago

@bmarkham:

None of what you stated is an argument that the O’s are not a good team. So, what’s the defense for DC’s statement that “The O’s are an okay team with an inflated record”? A team that’s 19 games over .500 is a good team, no matter how they got there.

Baroque6
Guest
Baroque6
1 year 9 months ago

Thanks for this article, Dave. I sincerely love the intelligent snark that characterizes sentences like this: “…but if we’re to explain how Scioscia figured out how to beat BaseRuns, we have to also explain why he forgot how to do it a couple of years ago, and hasn’t remembered since.”

Pujols Hamilton Trumbo
Guest
Pujols Hamilton Trumbo
1 year 9 months ago

Yeah!

John Stamos
Member
John Stamos
1 year 9 months ago

I think I’m ready to discuss normal distributions and how they apply to baseball at my next dinner party.

dkdc
Guest
dkdc
1 year 9 months ago

Most of what you are doing here is simply re-proving the Pythagorean theorem. I’m sure there are plenty of Orioles fans who believe they have a magical ability to out-perform their pythag, but that’s not going to be a widely held belief amongst your audience.

To me the more interesting questions are:

1) Why does the Orioles’ BaseRuns record continue to be so much better than their projected record? Strip away all of the sequencing, and you still have a team that is scoring and preventing BaseRuns at much better rates than the projection systems expect. At the beginning of the season (and the beginning of 2013, and the beginning of 2012) they were projected to have a negative BaseRuns differential and even now that projection is just barely positive. Yet this team has had a consistently positive BaseRuns differential for three years.

2) Ignoring actual record for a minute, is there something they are doing that allowed their actual run differential to be better than their BaseRuns differential? Most of that outperformance is on the run prevention side – is there something about the type of defense they have (double plays, catchers controlling the running game, outfield kills) or their pitchers’ ability to pitch from the stretch or even something else that lets them pitch comparatively better than other teams with runners on base?

Wobatus
Guest
Wobatus
1 year 9 months ago

Yes, the fact that they have outperformed their pythag, both standard and baseruns, is different from the fact that they are beating their projections consistently. That even their baseruns record is better than what was projected.

They had the great record in one run games and high WPA from the relief staff in 2012. But their fielding wasn’t great per uzr.

Then last year they had the poor record in one run games. But they still outperformed the projections. And their fielding was great.

This year the WPA is back up, ok, random maybe, but the fielding is once again extremely good, second only to the Royals. They are 6th in wOBA hitting and 2nd in fielding, which helps prop up a fairly mediocre staff getting a boost from Zach Britton in the close, winnable games. Doing his best Jim Johnson impression. In fact, outdoing him as a worm burner. 76% groundball rate?

Anyway, a bit of a non sequitur to the article. They are a pretty good team, even by baseruns.

InspectorGadget
Guest
InspectorGadget
1 year 9 months ago

Dkdc’s first point times a million billion! A team’s W% beating its run differential or baseruns is one story. A team’s baseruns diverging from its predicted baseruns is a different story. I think the second story is more interesting and also has the most potential to tell us something about coaching. The first maybe tells us something about managerial in-game decision-making skills, but I think that overall what it’s telling us is that the luck factor looms larger than any manager’s ability to manufacture (or piss away) a win.

Someone
Guest
Someone
1 year 9 months ago

The expected error of pythag is 1-3 games, their record is 3 games better than their pythag.
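
For anyone who wants to check a pythag gap like this one, here is a minimal sketch of the Pythagorean expectation. The run totals in the example are hypothetical, chosen only to produce a +7 differential like the 2012 figure in the article, and the function names are made up for illustration.

```python
def pythag_win_pct(runs_scored: float, runs_allowed: float,
                   exponent: float = 2.0) -> float:
    """Pythagorean expected winning percentage.

    exponent=2 is the classic form; values near 1.83 are a common refinement.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)

def pythag_wins(runs_scored: float, runs_allowed: float,
                games: int = 162, exponent: float = 2.0) -> float:
    """Expected wins over `games` at the Pythagorean winning percentage."""
    return games * pythag_win_pct(runs_scored, runs_allowed, exponent)

if __name__ == "__main__":
    # Hypothetical run totals, picked only to give a +7 run differential.
    expected = pythag_wins(700, 693)
    print(f"expected wins: {expected:.1f}")
```

With those illustrative totals a +7 differential works out to roughly 82 expected wins over 162 games, which is the scale of gap the 2012 discussion is about: about 11 wins short of a 93-win record.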

Tito Landrum
Guest
Tito Landrum
1 year 9 months ago

Dave,

A big reason why this is so hard for Orioles fans, like myself, to take, is that we watch nearly every one of their games from start to finish. We know exactly which games the Orioles were fortunate to win and which games they “shouldn’t” have lost. We’ve actually witnessed all the ups and downs of the season thus far. We know the team’s strengths and weaknesses. It’s hard and upsetting to hear things that may be contrary to your vision of your team when you’ve invested so much time in them (and when we know you haven’t invested anywhere near the same amount of time) and you think you have a pretty clear idea of what your team is, and isn’t, capable of.

Jon
Guest
Jon
1 year 9 months ago

But you can’t tell that by watching the games. That’s sort of Dave’s point. He’s looking at a lot more factors like balls that should have been caught, or the randomness over a season in batters being LOB.

Tito Landrum
Guest
Tito Landrum
1 year 9 months ago

Oh, I absolutely get that and I appreciate it greatly. However, and I’m sure you would agree, there is also a ton of information you can decipher by watching your team play, especially when you’ve been watching nearly every inning that team has played since April. Heck, since April 2012. Or 2005, or 1997, or 1979….

:)

AK7007
Member
AK7007
1 year 9 months ago

“However, and I’m sure you would agree, there is also a ton of information you can decipher by watching your team play”

However, and I’m sure you would agree, there’s way more to cloud your judgement by watching your team play every day. I can still remember when I thought Kirk Rueter was a good pitcher because he was “a good Giant.” Since those days I’ve woken up, but I can’t say that watching the games every day helped.

Tito Landrum
Guest
1 year 9 months ago

@ak7007 I agree with you. However, if you knew me then you’d know that is not really an issue for me. I can look at the team objectively despite my fandom. But you don’t know me so I understand where you’re coming from. All I’m saying is I have a ton more information than the typical non-Oriole fan that could potentially be helpful.

my jays are red
Guest
my jays are red
1 year 9 months ago

so basically you’re saying “I’m a fan of the Orioles so I’m going to defend them no matter what, even when statistical analysis does not cooperate”

Tito Landrum
Guest
Tito Landrum
1 year 9 months ago

Ugh. No, that’s not what I’m saying, at all. And there is nothing statistical saying the Orioles aren’t a good team. They are a good team, a very good team, just as Dave Cameron says. All I’m doing is just trying to fill in the reasoning of many fans. O’s fans are no different than fans of ANY OTHER team when they feel they are being slighted. You want to read some of the harshest criticisms of the O’s? Go read the message board at Orioles Hangout. :)

my jays are red
Guest
my jays are red
1 year 9 months ago

everyone understands the reasoning of fans who watch every game and think their team is better. At Fangraphs we’re trying to understand *why* results occur. I understand that you’re trying to defend fandom generally speaking, and Orioles fans are just like every other fan, defensive about their team when attacked. I just don’t see your post being relevant to the statistical discussion which is ultimately what we’re here for.

RichW
Member
RichW
1 year 9 months ago

So if the O’s are being harshly criticized by fans despite the good performance is it because they think the O’s are a much better team or that they can’t recognize good results when they see them?

Tito Landrum
Guest
1 year 9 months ago

@RichW, I agree with you. However, if you knew me then you’d know that is not really an issue for me. I can look at the team objectively despite my fandom. But you don’t know me so I understand where you’re coming from. All I’m saying is I have a ton more information than the typical non-Oriole fan that could potentially be helpful.

Tito Landrum
Guest
1 year 9 months ago

@my jays. Sorry for wasting your time. My intent was to add to the discussion just from a different vantage point. I greatly enjoy, and learn from, “you here” at fangraphs. Maybe someday I too can be part of this community. ;)

Tito Landrum
Guest
1 year 9 months ago

Sorry RichW, please ignore my previous comment; it was directed at a different poster. I just didn’t keep track very well. To answer YOUR question, I think it’s just the common mentality of “nobody messes with my brother but me”. Many O’s fans absolutely hate the approach at the plate that many of the players have and it is also frustrating to watch starting pitchers who just don’t seem to have put-away pitches. Incidentally, the fact that the O’s don’t walk much or strike out many batters probably has a lot to do with why projection systems don’t like them very much.

my jays are red
Guest
my jays are red
1 year 9 months ago

Through your use of sarcasm you’re admitting that your “analysis” is subjective fandom, so thank you for proving my point.

Tito Landrum
Guest
Tito Landrum
1 year 9 months ago

You win… I don’t understand what I’m saying or doing wrong to cause so much vitriol (perceived on my part). I don’t think I’m saying anything controversial or rude. I respect your opinion and I even AGREE with it. I’m not really sure what it is about what I’m saying that you, or anyone else, are trying to pick apart. I’m a regular reader of fangraphs and I find the site an invaluable source of great information.

It is not hard for me to separate my love of the Orioles with the facts presented here at fangraphs and other sites.

I think you may be adding a certain tone to my comments that is not inherent when I type them.

AC
Guest
AC
1 year 9 months ago

So you’re saying the biased observer is in a better position to make the call than an unbiased observer? With all due respect, if you tried to apply that principle to any other endeavor, even you would probably reject the premise. And rightly so.

Tito Landrum
Guest
Tito Landrum
1 year 9 months ago

Sure I’m biased but I don’t believe I’m acting on that. I just have a ton of context that I can offer when it comes to the Orioles. I feel as though I have a good handle on what the team’s strengths and weaknesses are.

bmarkham
Guest
bmarkham
1 year 9 months ago

So tell me, from watching every inning of every game, what have you learned about the O’s ability to sequence plate appearance outcomes into wins at a rate better than what BaseRuns expects?

I’m all ears.

Tito Landrum
Guest
Tito Landrum
1 year 9 months ago

Sigh… Everybody just wants to pick fights… Everybody reads malcontent into everybody’s comments…

In no way am I disputing BaseRuns nor do I dispute any of the prediction models used here or anywhere else. Some do, I don’t. I always find them interesting and informative. It doesn’t hurt my feelings that the O’s are outperforming their talent level. Because, when you think about it, that’s an incredibly awesome thing for your team to be doing. Often times these types of teams are the most fun to root for. So far this season is shaping up to be just as fun, from a fan’s perspective, as 2012.

What I’m saying is, from a wins and losses standpoint, when you see the games unfold on a daily basis there is tons of context that you can add to the statistics. Sometimes it’s injuries, mechanical issues, rest (too much or lack thereof), swinging bunts, broken bat bloops, and many, many, many more. In my way of thinking, it’s a large part of what makes baseball so great.

And, I’m being completely sincere, and not in any way sarcastic, when I say we can all probably agree to that.

Jon
Guest
Jon
1 year 9 months ago

Dave,

First, great article. Very enlightening. But, here is what I think is a little confusing to me about the ultimate conclusion you reach. It stems from your comment “regression is coming.”

As I understand it, you are not saying that the O’s performance is really outside the model’s prediction. If it’s not outside what the model says “can be expected,” even if it’s with a lower expectation than some other outcome, then why do we say the outcome is “fluky” or even “unexpected”? I didn’t see it in your article, but how many standard deviations from the mean are the 2014 O’s? And if it’s around 1, isn’t their record actually a reasonably expected one?

Second, the “regression is coming” comment always bothers me. Maybe you can set me straight if I don’t understand it correctly. But, my understanding is that “regression to the mean” is NOT predictive of a counter outcome. So, the O’s have been playing above their most expected outcome to date. Regression to the mean does not indicate – to me – that we can expect equally poor performances in the future to offset it. My understanding of regression to the mean is best explained in the coin flip model. If you flip a coin 10 times, and 7 out of 10 times you get heads, you have a 70% “heads rate.” The mean is 50%. In the future, we’d expect the mean to occur, not a 30% heads rate to “balance” it out. And so, if we flip a coin another ten times, and the next ten times we get 5 heads, then we have 50% for that set. Regression to the mean in that context is that in flipping it that extra ten times the “heads rate” has gone from 70% to 60% (12/20). So “regression to the mean” does not mean to expect a 30% heads rate to balance the 70% heads rate. Instead, regression to the mean indicates that the more you flip the coin, the closer the overall rate will get to the mean.
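
The coin-flip arithmetic above can be sketched in a few lines. This is only an illustration of the expected-value reasoning for a fair coin; the function name `expected_overall_rate` is made up here.

```python
def expected_overall_rate(observed_heads: int, observed_flips: int,
                          future_flips: int, p: float = 0.5) -> float:
    """Expected cumulative heads rate after `future_flips` more flips.

    Past flips are fixed ("in the bank"); only the future flips are
    expected to land at the true rate p.
    """
    expected_heads = observed_heads + p * future_flips
    return expected_heads / (observed_flips + future_flips)

if __name__ == "__main__":
    # Start from 7 heads in 10 flips and keep flipping a fair coin.
    for extra in (0, 10, 100, 1000):
        rate = expected_overall_rate(7, 10, extra)
        print(f"after {extra:>4} more flips: expected overall rate = {rate:.3f}")
```

The overall rate drifts from 70% toward 50% without any stretch of sub-50% flipping: the two surplus heads are never paid back, just diluted by a growing number of ordinary flips.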

So, let’s take the O’s. They have already played above their “mean” or “expected rate” using BaseRuns. This doesn’t mean they are going to play below it in the future. What we’d expect going forward is that they would play at it. So, if they play at it for the rest of the season (not over or under) my gut tells me they still win the AL East comfortably. Regression to the mean occurred, and the O’s fans think it didn’t and they outperformed some model. They didn’t really, or at least not for a long time. But the O’s did outperform it for a short, but relatively significant period of time that allowed their 2014 record to look better than the “expected outcome.”

indyralph
Member
indyralph
1 year 9 months ago

Past performance is already in the bank. Dave’s comments are forward looking. Future performance will not be such to bring the total result to the mean, but future performance can be expected to be in line with the mean going forward.

olethros
Guest
olethros
1 year 9 months ago

Eventually, in any given set of 10 coin flips, you will expect to see a 30% heads rate, provided you stand there flipping the same coin in sets of 10 for long enough. You expect the overall rate to be 50% in 1000 coin flips, but in any given set of 10, you expect to see a 9-1 split (heads or tails) about 2% of the time, a 7-3 split about 23% of the time, a 6-4 split about 41% of the time, and an even 5-5 split about 25% of the time.
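
The exact frequencies for a fair coin come straight from the binomial distribution, which is a handy check on round numbers quoted in a comment thread. A minimal sketch, with function names made up for illustration:

```python
from math import comb

def prob_exact_heads(k: int, n: int = 10) -> float:
    """P(exactly k heads in n flips of a fair coin)."""
    return comb(n, k) / 2 ** n

def prob_split(k: int, n: int = 10) -> float:
    """P(a k-to-(n-k) split in either direction, heads or tails on top)."""
    if 2 * k == n:
        # An even split has only one direction to count.
        return prob_exact_heads(k, n)
    return prob_exact_heads(k, n) + prob_exact_heads(n - k, n)

if __name__ == "__main__":
    for k in (5, 6, 7, 9):
        print(f"{k}-{10 - k} split either way: {prob_split(k):.1%}")
```

For sets of 10 this gives about 24.6% for an even split, 41.0% for 6-4, 23.4% for 7-3, and 2.0% for 9-1, so lopsided-looking sets are routine even with a perfectly fair coin.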

Jon
Guest
Jon
1 year 9 months ago

Right, but what I’m saying is that this is often portrayed in a confusing manner by all sabermetricians. When an O’s fan hears this, they say “Ok, I’ll wait for the time when we play terrible.” And that’s not what Dave is saying will happen, I don’t think. I think he’s saying, you’ll stop playing above your mean. That is not the same thing as saying “you’re going to start playing below your mean expected outcome.”

olethros
Guest
olethros
1 year 9 months ago

In a sample the size of a single season, yes.

I think the real problem here is that a lot of people don’t understand the difference between regression in its narrow statistical meaning and its meaning in general discourse. Similar to the problem many people have differentiating between the use of the word theory in a scientific context and in common usage.

Ned
Guest
Ned
1 year 9 months ago

Dave means the former, not the latter. That’s why his statement is regression to the mean, not regression below the mean. And the mean, in the O’s case, is their base runs win %. Which is still a good baseball team.

PackBob
Guest
PackBob
1 year 9 months ago

A great explanation.

I also think there is a bit of semantics going on, a failure to communicate.

Dave is stuck on seeing this in a completely statistical way (and it is a site based on statistical analysis after all), while Orioles fans are stuck on the fact that no matter the reason, the Orioles have been good.

Dave states that the Orioles have performed better than expected and also notes that the Angels did this for 8 of 10 years. So, why then can’t the Orioles do the same? The answer is that they could, but it is just unlikely.

It really doesn’t matter if it is random or not solely in terms of current results. The Orioles may not be intentionally beating expectations, but the fact remains that they are beating expectations.

Dave’s stance tends to paint the Orioles’ good season as a mirage while Orioles fans want to believe it will continue. There is no reason it couldn’t continue, as the Angels have already shown. It’s two different viewpoints with neither willing to budge an inch from their mountaintop.

I agree fully with Dave’s analysis although I also think there could be underlying factors that are not random and also could not be expected to continue as a repeatable skill. Just a simple belief that the team is going to win all 1-run games might for a time increase the odds of winning 1-run games, but would evaporate once that belief ceased to be real.

A last nitpick. If Buck Showalter instills confidence into his team, that could be something that improves the Orioles chances of winning games. But if, and it’s certainly an if, he is doing that with the Orioles, it doesn’t mean that he could do it with any team. Stating that he didn’t do it with the Rangers so he couldn’t do it with the Orioles is like saying Beltre couldn’t hit with the Mariners, so there is no way he could hit anywhere else.

EthanB
Member
EthanB
1 year 9 months ago

I’m going to play devil’s advocate a little here, because I agree with Dave that this is essentially all random variation. But let’s say there was a true talent of beating BaseRuns, and it was normally distributed. Isn’t this exactly what we would expect to see? In this case, the spread of talent in sequencing would just increase the standard deviation of the normal distribution, and some of it would be due to random variation and some due to differences in true talent.

Justin
Guest
Justin
1 year 9 months ago

Yes, this exactly. Dave is essentially assuming his conclusion by using the observed standard deviation. You’d see very similar results if you looked at the wOBA distribution, but no one is arguing that Mike Trout is just getting lucky.

What he needs to do is figure out what the standard deviation is under the assumption that there’s no skill involved in beating BaseRuns. If the observed standard deviation is larger, then there are skill effects involved.
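
The test described here has a simple shape: estimate the spread you would see with no skill at all, then ask whether the observed spread is wider. The sketch below uses a plain coin-flip null (independent games at the expected win probability), which is an assumption made for illustration and is almost certainly too wide for the BaseRuns question, since BaseRuns is built from the same games’ events; a real version would simulate sequencing directly. The function names and the standard deviations in the example are hypothetical.

```python
import math

def null_sd_wins(games: int = 162, p: float = 0.5) -> float:
    """SD of (actual wins - expected wins) if games are independent
    Bernoulli trials at the expected win probability p (no skill)."""
    return math.sqrt(games * p * (1.0 - p))

def implied_talent_sd(observed_sd: float, null_sd: float) -> float:
    """Talent SD implied by var(observed) = var(noise) + var(talent).

    Returns 0.0 when the observed spread is no wider than the null spread,
    i.e. when there is no room left for a skill component.
    """
    excess_var = observed_sd ** 2 - null_sd ** 2
    return math.sqrt(excess_var) if excess_var > 0 else 0.0

if __name__ == "__main__":
    print(f"coin-flip null SD over 162 games: {null_sd_wins():.2f} wins")
    # Hypothetical inputs: observed SD of 5 wins against a null SD of 4 wins.
    print(f"implied talent SD: {implied_talent_sd(5.0, 4.0):.2f} wins")
```

The variance decomposition is the same one EthanB describes: if skill in beating BaseRuns existed, it would show up as observed spread in excess of what noise alone produces.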

indyralph
Member
indyralph
1 year 9 months ago

The skill set underlying Mike Trout’s wOBA (ISO, contact rate, walk rate, etc.) can be consistently correlated with wOBA. Nobody has presented any statistical argument for traits that can cause win% to beat BaseRuns.

indyralph
Member
indyralph
1 year 9 months ago

More importantly, they can be correlated year-to-year. Jeff demonstrated yesterday, pretty convincingly, that beating BaseRuns is correlated with clutch, and that past clutch is not correlated with future clutch.

Justin
Guest
Justin
1 year 9 months ago

Yep, agree with all of that, and with Jeff’s analysis yesterday.

Dave’s argument is still bad.

Pale Hose
Guest
Pale Hose
1 year 9 months ago

This may be buried too deep in the comments for Dave to see, but I would like his opinion on this issue. In Dan Szymborski’s 10 lessons learned about creating a projection system he noted that results tended to be stickier in-season than season-to-season. Has there been any similar research on the BaseRuns model? Is there less regression to the mean when confined to a single season?

John Stamos
Member
John Stamos
1 year 9 months ago

The stickier part of in-season projections has more to do with rate stats rather than sequencing.

Pale Hose
Guest
Pale Hose
1 year 9 months ago

The two have to be related to some extent, right? I would think that a stickier rate stat (BABIP) would have a strong influence on sequencing.

John Stamos
Member
John Stamos
1 year 9 months ago

BABIP still doesn’t have any additional impact on sequencing, just that there are a different quantity of hits given the same amount of plate appearances. It says nothing about when those hits are bunched (or sequenced).

grant
Guest
grant
1 year 9 months ago

Excellent explanation of standard deviation and normal distribution. Expressed more cogently than my stats professor did in university. Can’t help but think this article is a better stats primer than most of the text books.

Kris
Guest
Kris
1 year 9 months ago

This is just like the ‘trust the projections’ theme on here a month ago. Of course it works in the aggregate over the long run, that’s what a good model does and is nothing new or particularly interesting.

The outliers are worth investigating further than ‘it’s just randomness,’ as that’s what makes baseball fun and interesting.

Chito Martinez
Guest
1 year 9 months ago

This was an excellently laid out explanation of the framework underlying Dave’s sentiment that the Orioles are not an “elite” team as many of the team’s fans see them.

Sadly, emotional O’s fans are never going to get past the sense that he’s saying their team is worse than “not elite,” they are “not good.” The tone of this article was more measured, but I can see how Dave’s responses to dumb O’s questions in the chat ruffled more than a few feathers. I think a few throwaway comments that the team is “good but not great” would have placated a lot of the saber-challenged objectors.

Then again, what do I know? I’m more bothered that O’s fans are coming off as knuckleheads than I am about these analyses.

StephenPAdams
Member
StephenPAdams
1 year 9 months ago

Well, yes and no. I think Dave is quite simply saying that they’re not elite, but it’s very possible that they’re very good.

If you have a spectrum of:
Average | good | very good | great | elite

Very good is generally a solid spot to be in regards to baseball. And there’s nothing wrong with that.

nard
Guest
nard
1 year 9 months ago

Very is a very meaningless word.

StephenPAdams
Member
StephenPAdams
1 year 9 months ago

Not quite so.

She’s good looking.

She’s VERY good looking.

See what I did there?

nard
Guest
nard
1 year 9 months ago

Yup. Confirmed you don’t get it.

Jimmer
Guest
Jimmer
1 year 9 months ago

In fairness some of the comments coming from the fans in that chat were rude also and it’s not like he hasn’t been dealing with these same kind of comments by Orioles fans for awhile.

Chito Martinez
Guest
1 year 9 months ago

I agree. I am more lamenting the fact that this whole thing makes O’s fans look like bozos. I just think it’s possible fewer bozos would be crawling out of the woodwork if Dave used some variation of “good” as an adjective for them.

Is there ever a time when a team’s fanbase comes off as collectively rational and intelligent on the interwebs?

Go Nats
Guest
Go Nats
1 year 9 months ago

NATS FANS are far more intelligent and rational than ANY other fanbase.

My proof!

Not one of them made a negative comment in this article!

So there!

StephenPAdams
Member
StephenPAdams
1 year 9 months ago

Yup, confirmed you’re an uptight dickfor.

ben
Guest
1 year 9 months ago

Maybe the difference between Scioscia’s Angels outperforming their expected wins and underperforming is Mike Trout. It could be that our current models slightly exaggerate the impact a single player can have. (Maybe Trout is so unpleasant to be around that his teams will underperform for his entire career… we need to figure out how to quantify personality for a player’s WAR).

On a hopefully more serious note, the other similarity between those Angels teams and these Orioles teams (at least as I remember the Vlad/Salmon/Erstad/Garret Anderson Angels–they did have a 3-outcomer in Glaus), is that they were another team full of aggressive (and effective) hitters who didn’t walk much. Maybe teams who manage above-average offenses with below average walk rates are able to outperform their expected records?
Another constant with these Orioles teams has been that their starting pitchers don’t strike people out, which, again, might be another commonality between the teams listed above.

On an unrelated note, I don’t understand why so many people felt compelled to commend a blogger for writing a blog defending models he espouses. Relax, statheads, you’ve won.

Jason B
Guest
Jason B
1 year 9 months ago

“I don’t understand why so many people felt compelled to commend a blogger for writing a blog defending models he espouses. Relax, statheads, you’ve won.”

And conversely, why people get their panties in a bunch when a model or a writer doesn’t like their super duper fantastic team.

StephenPAdams
Member
StephenPAdams
1 year 9 months ago

At least you can admit that they’re a super duper fantastic team.

FWIW, I like most of the writers. Except for Joe Sheehan, though. But that’s largely because he’s a Yankees shill.

Jason B
Guest
Jason B
1 year 9 months ago

You can take comfort in knowing he’ll be shilling for a pretty depressing team for these next couple of years at least :)

Jimmer
Guest
Jimmer
1 year 9 months ago

yes, except they won a lot of games in 2012 once they brought him up and they’ve won a lot with him this year. I’m pretty sure it’s not Trout who is making them not reach the wins they are supposed to have.

Tommy "5 Runs All Earned" Hunter
Guest
Tommy "5 Runs All Earned" Hunter
1 year 9 months ago

Worth investigating for those who might be interested in discovering value somewhere other than strikeouts and OBP.

StephenPAdams
Member
StephenPAdams
1 year 9 months ago

The Orioles model from 2012 to 2014 has been centered around the following: defense, average rotation, and a superb bullpen. The offense is centered around the homer and is really pretty poor regarding OBP. They have little to no speed on the club as well.

All that said, I think what shakes the model are defensive metrics not exactly being all that perfect… and perhaps more importantly: how Buck handles the bullpen. At any given time he has the A team (Miller, O’Day, Britton), the B team (Webb, Brach, McFarland) and the C team (Matusz, Hunter). You’ll see a close run differential because when the team is losing… Buck will go with the poor bullpen arms and let the score run up. This can occur many times throughout the course of the year… which can really throw off run differential.

If you really want to get crazy, since the offense relies on the homer… if they’re facing a particular tough pitcher then there’s going to be little to no offense generated. It’ll be a close game… and that’s when Buck’s A team in the bullpen comes in. And if the game goes to extras? The B squad (which is still very good). Generally this means the O’s can hang around long enough to win. And they do: 2012, 2013, and 2014.

This isn’t discounting the model. After all, if you add up the little things: defensive metrics not being perfect (and the O’s being elite there), Buck and Dan’s superb use of the bullpen (And the Norfolk Shuffle/Shuttle), and a hot/cold offense overly reliant on the homer… well it all adds up, doesn’t it?

Wobatus
Guest
Wobatus
1 year 9 months ago

UZR does have the Orioles as very good at fielding though. They’re 3rd in the majors. And I can see the point about the relievers, although most teams have their relieving similarly staggered, with the worst guys pitching in blowouts.

worstfan_NA
Guest
worstfan_NA
1 year 9 months ago

we are knuckleheads, whose hubris shows as the man who can never get the chip off his shoulder.

whatever happened to the saying that winning is contagious? are people who play sports just as terribly superstitious as the guy who swears he can control his dice rolling?

as someone who played sports growing up, I do believe in the unselfish team concept. Is it possible that a team that is consistently defying the expectations has the mojo working for them? I don’t mean to suggest that the Orioles are definitely best_team_NA, but maybe their players have bought into the team vision and therefore are good at winning?

or could it be somewhat explained by the fact that they lead the majors in home runs, but are very close to the bottom in walks. Maybe they have an approach at the plate that is different from how most other teams approach the plate and skews the results. That is, they go up there looking to hit. but that doesn’t exactly pass the sniff test either. I don’t know. maybe it is working as intended.

a lot more entertaining to believe crackpot ideas, though.

StephenPAdams
Member
StephenPAdams
1 year 9 months ago

Perhaps quite hilarious to me is Brian Cashman’s asinine comments prior to the 2014 season. I say they’re asinine quite simply because of the 2013 and 2014 Yankees. If the O’s regressed in 2013 (as Cashman so wonderfully 20/20 visioned it), we will see the same crashing down from the 2014 Yankees… regardless of how many overpriced vets they acquire (McCarthy, Capuano, Drew, Prado, Headley, etc.).

If anything people should be applauding Duquette and Showalter. It really starts to call into question Cashman’s ability to GM. Imagine how he’d do in an environment where he didn’t have unlimited funds?

Let’s put aside the fact that the Yankees rotation to start the year was largely geriatric and/or injury prone outside of Tanaka. They were bound to get hurt.

But let’s look to the offense. Is there ANY excuse for how bad the Yankees offense is? Last night the Yankees trotted out a lineup that was nearly $110 million for the year. They only had *2* players making less than $10m in that lineup: Gardner (nearly $6m) and Cervelli ($700k).

So, all that said… let’s applaud Duquette and Buck and not give the Yankees and Cashman a pass. It’s amazing how much credit a man can get for winning in New York (Cashman, Torre). Let’s see how they’d do in Oakland? ;)

DNA+
Guest
DNA+
1 year 9 months ago

The Yankees had one old starting pitcher. ….he’s the only guy in the rotation all year.

StephenPAdams
Member
StephenPAdams
1 year 9 months ago

Kuroda – 39
Sabathia – 34
Tanaka – 25
Pineda – injury prone
Nova

Tanaka, Pineda, and Nova are all relatively young. Pineda is injury prone. Nova just stunk this year and got hurt.

Let’s not give the Yankees any credit for weathering this storm the way they have. They’ve vastly played above their heads. In much the same way as the 2012 Orioles. But they’re not nearly as talented.

DNA+
Guest
DNA+
1 year 9 months ago

Right, so they started the year with a single old guy who has been remarkably reliable, a 34 year old that was pretty much a lock for 200 innings every year, and three young guys, one of whom was injury prone and only made the starting rotation by winning the job out of spring training, beating out two other young pitchers (Phelps and Warren).

I get that you don’t want to give the Yankees any credit, but at least be realistic.

StephenPAdams
Member
StephenPAdams
1 year 9 months ago

My point is that the pitching (outside of the bullpen) for the Yankees has been very good (I’ll give them that), but they’re pushing out a lineup (not even counting Soriano, who was cut) that is not only pretty old… but also very bad. Oh, and well over $100m.

So, should we give Cashman credit for acquiring McCann, Beltran, Jacoby? Trading for Soriano last year? Picking up Headley, Drew, AND Prado? All the while they’re one of the worst offenses in all of baseball.

It’d be one thing if their overall pitching statistics were bad, that’s not the case. Their offense is terrible and there is no excuse for it, right?

DNA+
Guest
DNA+
1 year 9 months ago

Again, in the real world, the Yankees bullpen has been outstanding this year.

StephenPAdams
Member
StephenPAdams
1 year 9 months ago

You have a strange definition of outstanding. The Yankees bullpen ERA is nearly 4 (3.93). Their WHIP is great as is their SO/9, but really it’s largely centered around 2 pieces: Betances and Robertson. Everyone else is largely flawed. We saw that last night.

DNA+
Guest
DNA+
1 year 9 months ago

ERA, WHIP and SO/9? …ok, you win.

StephenPAdams
Member
StephenPAdams
1 year 9 months ago

What other imaginary metric are you using? Throw away Betances and Robertson and the Yankees bullpen has been largely awful. What else do you want to go off of? It’s going to be a really tough sell to twist metrics to fit your argument.

Wobatus
Guest
Wobatus
1 year 9 months ago

Steve, throw away Betances and Robertson and they are largely awful? Betances and Robertson are 2 of the best relievers in baseball by WAR, xFIP what have you. And if one went merely by WAR it’s Betances 2.3, Robertson 1.8, Warren .8, Kelly .7, Thornton .5 (since gone) to Britton .9, O’Day .9, Webb .8, Hunter .5, Brach .3.

That’s the teams’ top 5 in WAR. Warren and Kelly have not been awful.

OK, WAR isn't the be-all for relievers. Britton's xFIP is better than his FIP. Having a great year.

In any event, all teams concentrate their relief talent and pitch the really good pitchers in the close games (unfortunately too often only the close games they are winning).

Oh, and the O's added Miller. Doing something you blasted Cashman for, trading young talent in Eduardo Rodriguez. Not to stick up for cashman, but getting McCarthy, Capuano, Headley etc for whatever run they can make were decent fill-in-the-hole moves which didn't cost much (although I like Solarte).

And I'd much sooner root for the O's than the Yankees (and they are better in baseruns and the standings), just that I'm not sure I can agree with some of your argument. The Orioles are better than the Yankees, but it's because their hitting is much better and their fielding too. Not because their relief pitching is better. It isn't. And whatever Cashman's drawbacks, these late season deals have been pretty good. The cupboard was a little bare before those moves.

Your O's are set up pretty well for the next few years I think, better than the yanks, so ya got that.

StephenPAdams
Member
StephenPAdams
1 year 9 months ago

O’s relief core is by and large better than the Yankees. And deeper.

FWIW, I hated the ERod trade.

emdash
Guest
emdash
1 year 9 months ago

Cashman over the last two years has done considerably better than anyone had a right to expect considering that the free agent market isn’t what it used to be, that ownership won’t accept rebuilding as an option, and the budget suddenly no longer being unlimited yet loaded down with unproductive high-priced players.

StephenPAdams
Member
StephenPAdams
1 year 9 months ago

I’m sorry, but trading away what little farm system Cashman had left for guys like McCarthy (lol), Headley (lol), Prado (not a bad pickup, actually), etc.

Come on, he basically went gangbusters this offseason on McCann, Jacoby, and Beltran. So we’re supposed to say he’s done “considerably better”? Come on.

DNA+
Guest
DNA+
1 year 9 months ago

They didn’t trade prospects for McCarthy and Headley. They traded Vidal Nuno and Yangervis Solarte.

StephenPAdams
Member
StephenPAdams
1 year 9 months ago

Did I call them prospects? No, I said what little farm system they had. They traded Nuno, Solarte, De Paula. The Yankees don’t have much left in the farm system to trade. And that makes me happy. Cashman is really screwing their club. Now, is this because of the Steinbrenners? Quite possibly. But they’re in a heap of trouble going into 2014 and beyond.

DNA+
Guest
DNA+
1 year 9 months ago

The point is that none of those players had any future with the team. Nuno and Solarte are the very definition of replacement players.

StephenPAdams
Member
StephenPAdams
1 year 9 months ago

Some would argue that Solarte did given the pretty terrible state of the NYY infield. Going into next year, who do they have? Jeter is gone. Headley is a FA. Johnson was traded. Roberts cut. Solarte traded. Teixeira is still on the club, but injury ridden and making > $20m. That’s a terrible state to be in. I guess they’ll take the chance of re-signing the slew of “vets”, yeah? Headley (3B), Drew (2B)? Not sure what they’d do utility wise. They’ve traded away a lot of their options.

The OF is really the only somewhat set position (Gardner, Jacoby, Prado) in addition to catcher (McCann, and Cervelli as BC).

Go Nats
Guest
Go Nats
1 year 9 months ago

Man I thought I irrationally hated the Skankees….

kevinthecomic
Guest
kevinthecomic
1 year 9 months ago

the commentary to this article is a nice, albeit sad, microcosm of what is wrong with the country today. the vast majority of the population does not have even a rudimentary understanding of basic mathematical concepts (and, yes, understanding the normal distribution and its implications is a basic concept). this lack of understanding puts America at a competitive disadvantage, economically, versus many (most?) other countries in the world.

on the other hand, for those of us who do understand math, there is a lot of money to be made due to the demand for math skills far outstripping the supply of math skills. so, AlbionHero (and others), continue your obstinance, the rest of us will be laughing all the way to the bank. thanks.

indyralph
Member
indyralph
1 year 9 months ago

I didn’t realize there was practical application of math skills required in the comedy world? I guess there’s probably a bunch of good inverse pareto jokes…

JJ
Guest
JJ
1 year 9 months ago

I think one-sided thinking – me against them – is the reason things are the way they are. Maybe too much focus on the bank. If everyone is just looking to laugh at the other point of view and stomp over them, may it be that they are too biased toward their own opinion? The ‘maths’ are not in question here, so no need to toot your own horn. The question is, with an always changing dynamic in the game of baseball, is there any way that the Orioles found a formula for recent success? I think it is entirely possible that the Orioles play it according to their formula, and are winning despite predictive models and expected results. The game has shifted to being more pitching dominant, so maybe the game handling can change to meet the current environment.

Vil
Member
Vil
1 year 9 months ago

I enjoy fangraphs and Dave Cameron is a big reason why. He educates me and that makes me a better baseball fan.

The Orioles aren’t an elite team. If Chris Tillman is your best starter, then you’re definitely outperforming expectations. The Orioles are not a team that draws a lot of walks–with the exceptions of Markakis and Davis—and more baserunners is always a good thing. It’s perfectly reasonable to expect that the A’s or the Tigers will dispatch the Orioles.

And injuries to the Jays, Yankees, and Red Sox along with underperforming players on all of those teams have contributed to their success.

But I did try to make one point with Dave yesterday but the chat format isn’t a good place to do that–most comments selected are short.

When I was a young man in high school, and later in recreational leagues, I played football, baseball/softball and basketball.

As with a poor boss at work, there is no more dispiriting feeling than playing for a team where the coaches do a poor job of preparing their players to succeed. Conversely, there’s a bounce to your step as you approach a game or even a practice when you’re confident that your coach or coaches know what they’re doing, and instead of screaming in your face when you make a mistake, the coaches calmly pull you aside and point out what you did wrong. Morale is just generally higher, just as it is in the workplace when you’re working for managers who aren’t clueless and aren’t martinets.

Ask the Orioles players, and almost to a man they will tell you that they believe in Showalter’s skill in managing a team. Or at least that Buck will do a better job than many of his peers. Dave said yesterday there’s no way to prove that, and at best it might mean 2 extra wins. I would argue 3 extra wins, but I can’t prove it. But in my sports experience, confidence isn’t a trivial factor in winning games. There’s a surge of adrenaline and perhaps testosterone when you step on to the field/court knowing that your team will play smarter than the other guys and that we will execute the plays as they were taught to us.

Many years ago, Earl Weaver made this statement (and I paraphrase since I don’t have access to an archive of Weaverisms): “At least 40% of the time, and perhaps more, you don’t have to beat the other team. All you have to do is sit back and wait for the other team to make the mistakes that create the opportunity for your team to win.”

tz
Guest
tz
1 year 9 months ago

Great point.

I think the big issue overall is that luck is SO big of a factor in baseball performance that many items that could have an impact kind of get buried under the statistical fluctuation.

For example, if sheer luck can cause a team to perform +/- 10 wins above expected, it’s hard to pinpoint a manager’s positive impact if the true value is around +/- 2 or 3 wins. By the time you get enough years of data to put a credible number on it, the data is probably stale (for example, using Showalter’s track record from previous managing might not capture any improvement he has made over the years).

I think Dave’s response to you reflects this. There’s no denial that a manager could very well add, say, 5% to his players’ performance baselines by his leadership skills, but it’s next to impossible to pinpoint a real number for it. Intangibles do exist, but they’re intangible.
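To put rough numbers on the luck-versus-manager point above, here is a minimal sketch. It assumes every game is an independent coin flip at a fixed true win probability, which is a simplification, and the function names (`win_sd`, `seasons_to_detect`) are made up for illustration.

```python
import math

def win_sd(games: int = 162, p: float = 0.5) -> float:
    """Standard deviation of season wins under a simple binomial model."""
    return math.sqrt(games * p * (1 - p))

def seasons_to_detect(effect_wins: float, games: int = 162,
                      p: float = 0.5, z: float = 2.0) -> int:
    """Seasons needed before z SDs of averaged luck shrink to effect_wins.

    The SD of the per-season average over n seasons is sd / sqrt(n),
    so we solve z * sd / sqrt(n) <= effect_wins for n.
    """
    sd = win_sd(games, p)
    return math.ceil((z * sd / effect_wins) ** 2)

print(round(win_sd(), 2))        # pure-noise SD of wins for a true .500 team
print(seasons_to_detect(2.0))    # seasons needed to see a 2-win effect at ~2 SD
```

Under these assumptions a +/- 2 win manager effect sits well inside a single season's noise, which is the "stale data" problem described above.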

Bill
Guest
Bill
1 year 9 months ago

“The Orioles aren’t an elite team.”

No, they aren’t, but they ARE better than “an okay team with an inflated record” as DC asserted yesterday.

David Robertson's Calves
Guest
David Robertson's Calves
1 year 9 months ago

If Buck is fostering good morale and that is causing the O’s to play better, that influence would already be reflected in their BaseRuns and would not be a valid explanation for why their BaseRuns projection doesn’t align with their actual W-L record.

In other words, the supposed “bounce in [the Orioles’] step” due to the claimed Buck Effect could explain a scenario where the Orioles were scoring more runs and allowing fewer than expected given past performance, not the present one where the Orioles are winning more than expected given their BaseRuns.

Go Nats
Guest
Go Nats
1 year 9 months ago

Confidence is reflected in overall stats. Confident teams will play all aspects of the game better than teams with little confidence. They will also likely play better relative to their physical skills. What confidence will NOT do is sequence your results in such a way as to produce you more wins relative to your stats which is random. Dave is talking about sequencing not overall stats.

In other words, being better coached and more confident may lead to your team having a much lower ERA, a higher UZR/150, and a higher slugging percentage overall, but WHEN you hit for those extra bases and prevent those extra runs is still random (yet matters hugely).

Brandon Fahey
Guest
Brandon Fahey
1 year 9 months ago

“The Orioles aren’t an elite team.” Not to be trite, but this is why you play the game. Let’s talk after the season and the playoffs end. Although since the point of this article seems to be in part that the games (W-L record) aren’t as meaningful for measuring performance of a team as various sophisticated statistical measures, maybe this comment is meaningless. Statisticians rightfully argued that ERA, FIP, WAR, etc. are better measures of a pitcher’s performance than W-L record. It seems like that point is being taken to an extreme to say that W-L of a team is less important for measuring how “good” the team is than various statistical models built to do that. I think that some of the negative reactions to that statistical mindset are based on the belief that these models fail to pick up some important attributes that go into winning that are relatively hard to quantify such as perhaps mental toughness, strategy, ability to perform under pressure, ability to win “big games” (i.e. games where both teams are trying harder than usual such as the Yankees-Orioles game last night).

Just some thoughts.

JosephK
Guest
JosephK
1 year 9 months ago

If you have this discussion after the season and playoffs end, you’re likely to encounter exactly the same sentiment as you are today around these parts, irrespective of outcome. It’s one of the great divides between sabermetrically-oriented fans and more traditional ones: the latter puts a great deal of emphasis on actual, on the field results (Cabrera drove in 3 runs with a double) while the former is more interested in the underlying probabilities that drive those outcomes (on average, across all base-out situations, a double is worth X number of runs). I’m more in the sabr camp myself, but I’m frequently annoyed with analysts and others deriding those benighted fans who so stubbornly insist on using actual outcomes to measure team/player success, e.g., the commenter at the top of this thread attempting to make a mockery of those who would use the standings (the ONLY thing that matters to most fans) to argue the Orioles are an elite team.

chuckb
Guest
chuckb
1 year 9 months ago

You’re asserting that the Orioles are winning because they’re more confident and they’re more confident because of Showalter’s leadership.

Maybe it’s the other way around. Maybe they’re more confident because they’re winning (and that’s even assuming that they’re more confident than other teams, something there’s absolutely 0 evidence of). But even if it’s true, you’re assuming that the confidence caused the winning and that Showalter caused the confidence.

Even if they are more confident, you’re assuming causation when it’s at least as likely to be simply correlation and equally likely that the causation works the other way around & Buck just happens to be along for the ride.

DNA+
Guest
DNA+
1 year 9 months ago

Just for fun I plotted a histogram of HR/PA for the 2013 season. It pretty well follows a normal distribution. Now, this could be because home run hitting skill is equivalent between players and the variation is stochastic, or it could be because the skill itself is normally distributed among MLB players, or it could be a combination of the two. I like the last possibility. …in any case, the fact that HR/PA is normal in MLB does not mean that we should have no interest in understanding the causes of the variation.
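The "combination of the two" idea can be sketched with made-up numbers: give each hypothetical player a true HR/PA skill drawn from a normal distribution, then add binomial luck on top. Every parameter below (600 PA, 3% mean skill, 1% spread) is an assumption for illustration, not 2013 data.

```python
import random
import statistics

PA = 600  # hypothetical plate appearances per player

def simulate_hr_rates(n_players=500, mean_skill=0.03, sd_skill=0.01, seed=0):
    """Return (true HR/PA skill, observed HR/PA) for made-up players."""
    rng = random.Random(seed)
    skills, observed = [], []
    for _ in range(n_players):
        # Skill differs by player; clamp to a valid probability.
        p = min(max(rng.gauss(mean_skill, sd_skill), 0.0), 1.0)
        # Luck: each plate appearance is an independent trial at that skill.
        hr = sum(rng.random() < p for _ in range(PA))
        skills.append(p)
        observed.append(hr / PA)
    return skills, observed

skills, observed = simulate_hr_rates()
var_skill = statistics.pvariance(skills)                         # spread from talent
var_noise = statistics.mean([p * (1 - p) / PA for p in skills])  # spread from luck
var_obs = statistics.pvariance(observed)                         # what a histogram shows
print(var_skill, var_noise, var_obs)
```

In this toy setup the observed spread is roughly the skill spread plus the luck spread, so a bell-shaped histogram on its own cannot say how much of each is present.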

Jason B
Guest
Jason B
1 year 9 months ago

You’re misunderstanding the point. It’s not that we shouldn’t have interest in what’s causing the variation, it’s “Where are any explanations for it that will stand up to some sort of rigorous analysis and/or provide some predictive value going forward?”

DNA+
Guest
DNA+
1 year 9 months ago

I’m not missing the point at all. I agree that research efforts should be directed in this area.

Sean Patrick
Guest
Sean Patrick
1 year 9 months ago

First off, excellent article Dave.

To the dissenters, what I have a really hard time seeing is how any of the Orioles’ oft-cited skills would play a role in them beating their expected record via BaseRuns. I agree that Buck Showalter and the Orioles are good at all the below things. What I don’t see is how those talents would cause there to be a difference between their expected wins via BaseRuns and their actual wins.

A) Defense: Nearly every aspect of defense is already contained in the BaseRuns model. The model takes into account how many runners get onto base, how many singles are stretched to doubles, doubles into triples, etc. It takes every EVENT into account. What it doesn’t count are the order and timing of these events. I don’t see any argument as to how the Orioles are using defense to change the sequencing of events. In other words, why would their defense suddenly get better when there are base runners or runners in scoring position as opposed to when the bases are empty?

B) Home Runs: Home runs are clearly among the best events possible for a baseball team attempting to win a baseball game. Again, though, the value of home runs is baked into BaseRuns. BaseRuns does not take into account WHEN the home run occurs. To buck the model, the Orioles would have to be talented at choosing when to hit home runs. I see no proof that this is the case.

C) Relief Pitching: The argument would go that since relief innings come in high leverage situations, they should be given more value, therefore making the Orioles a better team than BaseRuns says they are. The issue here is that this does not take into account the whole slate of events that brought the game to a high leverage situation. Late innings are still factored into a team’s overall BaseRuns score. And, in addition, they may seem more important in retrospect, but it took a whole game’s worth of events to lead to a high leverage situation. I would, however, be interested to see what the data says on whether there is any correlation between relief pitching talent and beating BaseRun expectations.
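The "events, not order" point in A through C can be illustrated with a small sketch. It uses one commonly cited simple form of BaseRuns (A*B/(B+C) + D); the coefficients are an assumption here and the version FanGraphs runs may differ. The event log is hypothetical, not Orioles data.

```python
import random

def base_runs(ab, h, tb, hr, bb):
    """Simple BaseRuns estimate from season totals only: A*B/(B+C) + D."""
    a = h + bb - hr                                       # baserunners
    b = (1.4 * tb - 0.6 * h - 3 * hr + 0.1 * bb) * 1.02   # advancement factor
    c = ab - h                                            # outs
    d = hr                                                # runs that always score
    return a * b / (b + c) + d

def totals(events):
    """Collapse a sequence of plate-appearance outcomes into counting stats."""
    bases = {"1B": 1, "2B": 2, "3B": 3, "HR": 4}
    hits = [e for e in events if e in bases]
    return dict(
        ab=sum(e != "BB" for e in events),   # walks are not at-bats
        h=len(hits),
        tb=sum(bases[e] for e in hits),
        hr=events.count("HR"),
        bb=events.count("BB"),
    )

# Hypothetical event log in one fixed order.
season = ["OUT"] * 400 + ["1B"] * 90 + ["2B"] * 30 + ["HR"] * 25 + ["BB"] * 40
shuffled = season[:]
random.Random(0).shuffle(shuffled)           # same events, different sequence

print(base_runs(**totals(season)) == base_runs(**totals(shuffled)))  # True
```

Because the estimate only sees totals, any reordering of the same events gives the same number; a team can beat it only through when events happen, not how many.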

Name
Guest
Name
1 year 9 months ago

Exactly.

Not once here has Dave stated that Showalter hasn’t helped the Orioles to improve their BaseRuns baseline. In fact, someone noted in a previous article how the O’s have overcome down years from a number of players by having other players improve (Pearce, etc.)

What the random “luck” variation has done is simply to give the O’s a commanding lead in the AL East, instead of just being in a dogfight for the top. But like you point out Sean, the positive value from Showalter and the O’s style of play are all already reflected in BaseRuns.

Nathan
Guest
Nathan
1 year 9 months ago

Hypothetically, if the Orioles were shifting in such a way as to be better at defending in situations with runners on base than without runners on base, they would outperform what BaseRuns expects, right?

B N
Guest
B N
1 year 9 months ago

This. Also, “small ball” is literally a code-name for trying to maximize your odds of getting a certain (low) number of runs, typically at the expense of maximizing the expected number of runs in that situation. There are quite a few ways to design a team that will make it “better in close-and-late”. These ways do not necessarily mean that you will have a better team (since the skills that may make you overperform your “runs scored” expectations may sacrifice not having to be close in the first place). However, it will mean that you’ll have a persistent differential from what context-neutral would expect.

Go Nats
Guest
Go Nats
1 year 9 months ago

NO! If they used shifts well with men on base then that would lower the runs they give up overall. Not WHEN they gave up runs. They would have to shift sometimes and not shift other times against exactly the same hitter with exactly the same counts and base runners but with different scores. Then have the shifts work better to get what you explain. In other words, actively choose to suck some of the time. Actively choosing a lesser option with less chance of success when the game is not close. Slacking off when they get big leads (and still not losing) and when they are far behind. I doubt they are doing that and still winning. I choose to believe they are trying their best at all times, but being human means they sometimes make mistakes, the mistakes are random, but the Orioles have been lucky that their mistakes rarely cost them games when the game is close.

DavidKB
Guest
DavidKB
1 year 9 months ago

This is a really well done analysis. I hope it helps people to understand your arguments. The challenge (for you, as a communicator) is that it’s so hard to get a “feel” for random distribution even if you’re looking at the whole dataset. If you cherry pick one performer it’s even harder. There is almost no way other than numerical analysis to decide whether we’re looking at a good distribution or a bad one. Unfortunately not so many people are willing to go that far in their understanding.

As a side-note, I’m curious about the skew on the underperforming side. It would be an undertaking, but I’d love to see whether that skew can be attributed to anything consistent (e.g. particular teams). Not an easy thing to tease out statistically I know…

VeveJones007
Guest
VeveJones007
1 year 9 months ago

First I just want to say that I completely understand wanting to look at the numbers. But here’s the problem: if you’ve watched the Orioles play since the break, it would be obvious to a non-baseball fan that they are one of the best teams in baseball.

32 straight games against teams over .500 and they are 22-10, winning 8 of 9 series.
5th in the AL in FIP and 5th in wOBA during that span.
+32 against some of the best teams in baseball (for a net of +1 over pythag)

Yes, the Orioles had two months of mediocre to below average baseball to start the year, but they aren’t the same team. The rotation has settled down and the bullpen has settled into its roles. I can’t filter BaseRuns standings for the past 1-2 months, but I imagine they would demonstrate (like the pythag over the past 32 games) that the O’s aren’t really lucky any more. They’re just good.

DNA+
Guest
DNA+
1 year 9 months ago

Basically, teams look really good when they are winning and really bad when they are losing.

VeveJones007
Guest
VeveJones007
1 year 9 months ago

Did you read the article? It basically says that despite winning most of their games over the past 2 2/3 years, the Orioles aren’t as good as their record indicates based on the data. However, I provided data that says that the Orioles have been as good as their record indicates over this stretch of 32 games against teams over .500.

DNA+
Guest
DNA+
1 year 9 months ago

Thirty-two games chosen because the team was winning, as opposed to chosen randomly. If you choose to look at the games only during winning streaks, of course the team is going to look good.

Jason B
Guest
Jason B
1 year 9 months ago

DNA+ is right on. I can rejigger the start and end points to show that just about any team is really good:

Toronto, May 12-June 6: 20-4
KC, July 22-present: 17-4
Houston, May 24-June 14: 15-6
Miami, April 26-May 8: 10-2

etc.

Again, not to say that the Orioles are bad/not good/destined to miss the playoffs/etc., just that we shouldn’t take any given win streak as a sure sign of how good a team is, nor any losing streak as a way to define how bad a team is.
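The start-and-end-point problem can be sketched with a quick simulation. It assumes a team that is exactly a coin flip every game and asks how good its best 32-game stretch looks; the 119-game length matches the 87 + 32 games discussed in this thread, and everything else is an illustrative assumption.

```python
import random

def best_window(results, width):
    """Most wins in any run of `width` consecutive games."""
    wins = sum(results[:width])
    best = wins
    for i in range(width, len(results)):
        wins += results[i] - results[i - width]   # slide the window by one game
        best = max(best, wins)
    return best

def simulate(seasons=2000, games=119, width=32, p=0.5, seed=1):
    """Average best-window win total for a team with true win probability p."""
    rng = random.Random(seed)
    total = 0
    for _ in range(seasons):
        results = [1 if rng.random() < p else 0 for _ in range(games)]
        total += best_window(results, width)
    return total / seasons

avg = simulate()
print(avg)   # average wins in the hand-picked best 32-game stretch
```

A randomly chosen 32-game window for this team averages 16 wins, so whatever the simulation prints above 16 is the bonus that comes purely from picking the window after the fact.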

VeveJones007
Guest
VeveJones007
1 year 9 months ago

Here’s the problem with saying that these past 32 games are arbitrary:

This article would NOT have been written had these 32 games not occurred. The Orioles are 19 games over .500, so these past 32 games represent 63% of their improvement over .500. Of that 63%, they have only outplayed their pythag by +1 win.

We’ll have the rest of the season to determine which is a better indicator of the O’s true talent: the first 87 games or the last 32. Having watched the last 32 games, I see nothing that indicates that a regression is coming, especially with a significantly softer schedule on the horizon. Despite the smaller sample size, I don’t see data that would indicate that a regression is in store. This team has one of the best records in baseball since the break and they deserve it. It wasn’t “random variation”.

nard
Guest
nard
1 year 9 months ago

/Reads article

NUH-UH!

VeveJones007
Guest
VeveJones007
1 year 9 months ago

I’m saying that there is no face validity for full-season data if you’ve watched the last 32 games. True or false: teams can improve or get worse over the course of a season?

Beimel53
Guest
Beimel53
1 year 9 months ago

Which is completely beside the point because this article isn’t about the Orioles. The Orioles’ BaseRuns could be great during the last 32 games. The point of the article is about random variance.

chuckb
Guest
chuckb
1 year 9 months ago

Pay attention to their best months and ignore their worst months.

And trust me, my eyes tell a more accurate story than the numbers do.

nard
Guest
nard
1 year 9 months ago

Figured this was coming. Ha, baltimores.

worstfan_NA
Guest
worstfan_NA
1 year 9 months ago

please die.

nard
Guest
nard
1 year 9 months ago

Don’t be the Christian South of baseball, please.

Arya
Guest
Arya
1 year 9 months ago

There you are, nard! I knew you’d show up! And I knew that when you did show up, you’d post comments just like this one!

You honestly add nothing to anything. You’re the guy who gets in everyone’s face about being “right”, except that you’ve proved nothing yourself and haven’t even formed an intelligent argument. You can’t stand that the Orioles keep winning games, so you make fun of them for being lucky. It’s not only glorious, but incredibly one-dimensional.

But we’re sure enjoying it, I can promise you that! :)

nard
Guest
nard
1 year 9 months ago

Cry more, Arya.

Arya
Guest
Arya
1 year 9 months ago

Hahaha. Weird, another one-liner. Didn’t expect that. Do you have anything of actual substance to say?

But that wasn’t crying, nard. It was appreciation for your transparency.

nard
Guest
nard
1 year 9 months ago

Please submit more haughty, overwrought definitely-not-crying-but-also-definitely-not-contributing-anything-either posts, then. They amuse me.

FYI Dave Cameron doesn’t love you.

Arya
Guest
Arya
1 year 9 months ago

Nice job. That’s the longest post I’ve ever seen from you!

But nah, if I was crying, you’d know. What do I have to cry about? My team is in first by 7.5 games. :)

I come to the comments section to read intelligent discussion, and this is something you provide absolutely none of–something you have exhibited in multiple comments sections on this site. True, my replies don’t add much either I suppose, but I guess I couldn’t resist.

FYI Dave Cameron probably doesn’t love you either. Just sayin :) He’s a smart guy, so he probably sees you for the transparent, Oriole-hating troll you are. If he even bothers to read this garbage, which I sincerely doubt.

Steve
Guest
Steve
1 year 9 months ago

When a 10-20 win variation is within the expected range of results, I’m really not sure what the value of predictive models is. Honestly, the local sports paper has been as good as Fangraphs at predicting the standings for years now.

indyralph
Member
indyralph
1 year 9 months ago

I’m as tired of those silly Ruben Amaro jokes as the next guy, but it’s really tough when you lob a softball like this.

LaLoosh
Guest
1 year 9 months ago

I said this above. Saying that 15-20 win variations are expected within the first std dev is fine but it renders these models virtually useless.

This whole article reads like another in the Sabre-series of “when I’m right, I’m right, and when I’m wrong, I’m still right.”

How about let’s attempt to explain why the models consistently underrate the O’s… Maybe they are better at controlling sequencing than anyone thinks…?

indyralph
Member
indyralph
1 year 9 months ago

Would it make you feel better if, everyday, Dave posted an Rsquared regression between some new variable and baseruns % – win%? And then posted, “Yep, this still means nothing.” This assertion that a website and professionals who are dedicated to asking and answering these question would ignore them because it’s the Orioles is just mind boggling to me.

LaLoosh
Guest
1 year 9 months ago

what’s mind-boggling is that someone would attempt to gain some edge by using an “Rsquared regression” as part of their reply… I guess you got me there.

indyralph
Member
indyralph
1 year 9 months ago

Why would you engage, extensively, in a statistical debate and then admit that you have not even a basic familiarity with statistical terms?

LaLoosh
Guest
1 year 9 months ago

isn’t it great how some people feel compelled to jump right into insulting people as soon as there is some basic disagreement… strike another one for the keyboard tigers society.

indyralph
Member
indyralph
1 year 9 months ago

I really don’t see where I insulted anyone. It’s not a comment on your personal character if you don’t understand statistics. I just want some of the people who disagree to offer some suggestion of what could be done differently, rather than assume it could be done and has not been.

LaLoosh
Guest
1 year 9 months ago

first you made the Ruben Amaro comment (which was obviously intended as a pejorative) and then the remark that my post was “mind-boggling.”

I must be the one who mistakenly entered what I thought was a baseball discussion and ended up in a stats class…

Jason B
Guest
Jason B
1 year 9 months ago

“How about let’s attempt to explain why the models consistently underrate the O’s”

*Whoosh*

But OK, we’ll bite. What theories do you have, that can be adequately tested and can be shown to be predictive going forward?

Steve
Guest
Steve
1 year 9 months ago

indyralph, you know statistical terms, so can you explain the value of a model that has a standard deviation of error that is basically entirely inclusive of the sample set?

indyralph
Member
indyralph
1 year 9 months ago

Not really sure I understand your question. There are roughly 100 data points of 330 outside of one standard deviation. It’s basically a binomial distribution – two possible outcomes with a set probability. Since it’s a binomial distribution, it can be approximated by a normal distribution given a large enough sample size. The observed results should fall within a normal distribution, as they do here. People use a coin flipping example because they know the probability is 50%. In most real life applications, you don’t know the actual probability. We don’t know for sure whether the Orioles are a .500 team or a .510 team or a .550 team. We can surmise based on what we’ve observed and test statistically. A model cannot predict the randomness; that is what randomness is. What we can say is that over a large sample, the model should produce results in a similar distribution to the observed results. Observed results should converge to the mean over an infinite sample size, but of course that’s not possible. A model cannot work for a single data point – it can only work in broad application. But that doesn’t mean there is no value. If you want proof of the value of a predictive model, just look at what the A’s or Rays do compared to the Phillies. It might not work every time, but if it works most of the time it has value compared to not doing it at all. That’s the point I was making with my snarky comment above. Apologies if you had genuine curiosity.
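For anyone who wants to see the binomial point in numbers, here is a minimal sketch. It assumes a team with a true 50% chance of winning every game and fully independent games, which is a simplification and not how BaseRuns works. It only shows how much spread a 162-game season produces from chance alone, and how often a season lands more than one standard deviation from the mean.

```python
import math
import random

GAMES = 162          # length of an MLB regular season
TRUE_WIN_PROB = 0.5  # assumed true talent, chosen only for illustration

# Standard deviation of wins for a binomial(GAMES, TRUE_WIN_PROB) season.
sd_wins = math.sqrt(GAMES * TRUE_WIN_PROB * (1 - TRUE_WIN_PROB))

def simulate_season(rng):
    """Count wins in one simulated season of independent games."""
    return sum(rng.random() < TRUE_WIN_PROB for _ in range(GAMES))

rng = random.Random(0)  # fixed seed so the run is repeatable
seasons = [simulate_season(rng) for _ in range(10_000)]
mean_wins = GAMES * TRUE_WIN_PROB

# Share of simulated seasons that land more than one SD away from the mean.
outside = sum(abs(w - mean_wins) > sd_wins for w in seasons) / len(seasons)

print(f"SD of wins: {sd_wins:.2f}")
print(f"Share of seasons more than one SD from the mean: {outside:.2f}")
```

The standard deviation comes out a bit above six wins, and roughly three seasons in ten fall outside one standard deviation, which is in the same neighborhood as the 100-of-330 figure quoted above.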

Aficionado
Guest
Aficionado
1 year 9 months ago

We are getting a bit ahead of ourselves regarding the usefulness of the model and the meaning of its prediction-errors.

Every model has limitations to its predictive powers (hence the typical designation of standard deviations and confidence intervals, etc.). When a model’s prediction error is within the expected range of that model’s prediction errors, it does not mean that the error is in fact random. It just says that the model predicted the outcome within the limitations of its predictive powers.

The error could still in fact be random, but it may just as well be that other independent variables that influence an outcome have not been identified or used to make the prediction. If one concludes all prediction errors are in fact random, one implicitly accepts that all future research is futile as the “errors” contain no additional “information” and the best model has been determined. Fortunately this is not the case.

DNA+
Guest
DNA+
1 year 9 months ago

Well said.

LaLoosh
Guest
1 year 9 months ago

Fine. Then let’s try and identify the causes of the consistent outperforming of the models.

DNA+
Guest
DNA+
1 year 9 months ago

I support that.

…my own pet hypothesis is that batting outcomes depend on the run environment of the game. In high scoring games hits and runs are easier to come by because you are facing increasingly worse pitchers in increasingly lower leverage situations. Perhaps teams with high variance in runs allowed or runs scored per game perform differently than do teams with lower variances that play a higher percentage of meaningful innings.

LaLoosh
Guest
1 year 9 months ago

that is an interesting take. taking it further, perhaps teams with lower run differentials have greater reliance on bullpen performance… and their RPs are logging a higher percentage of high lev IP than are other pens…?

DNA+
Guest
DNA+
1 year 9 months ago

Anecdotally, the fact that the Yankees outperform their baseruns isn’t so surprising to me. They have decent pitching, terrible hitting, and a great bullpen. In the minority of games where the starting pitchers falter early, the game is basically conceded straight away since the offense has no chance to recover and the back of the bullpen pitchers are allowed to fill innings and give up runs. Often, when they lose, they lose big.

However, when they win, it means the offense has managed to score one or two more runs than the opposition over the first six innings, allowing Betances and Robertson to lock it down. When they win, the margin is always small.

To me, the Yankees seem like a team built to outperform their run differential and their baseruns.
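A toy illustration of the "lose big, win small" idea, with invented scores. The Pythagorean expectation (exponent 2) is used here only as a simple stand-in for a context-neutral estimate. It is not BaseRuns, and the ten games below are hypothetical.

```python
# Hypothetical ten-game stretch: (runs scored, runs allowed) per game.
# Six narrow wins and four blowout losses, invented for illustration.
games = [(3, 2)] * 6 + [(2, 8)] * 4

runs_scored = sum(rs for rs, _ in games)
runs_allowed = sum(ra for _, ra in games)
actual_win_pct = sum(rs > ra for rs, ra in games) / len(games)

# Pythagorean expectation with exponent 2, a simple stand-in for a
# context-neutral model (this is an assumption, not the BaseRuns formula).
expected_win_pct = runs_scored**2 / (runs_scored**2 + runs_allowed**2)

print(f"Run differential: {runs_scored - runs_allowed:+d}")
print(f"Actual win%: {actual_win_pct:.3f}")
print(f"Pythagorean win%: {expected_win_pct:.3f}")
```

In this made-up stretch the team wins 60% of its games while its run totals suggest something closer to 26%. Whether a real roster can produce that shape on purpose, season after season, is exactly what is being debated here.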

haishan
Guest
haishan
1 year 9 months ago

I think everyone would like to do that; people who design these models (for fun!) aren’t generally like “oh okay, this model seems to work pretty well, no more improvements necessary!” There are probably useful things that BaseRuns doesn’t account for, but given that the model works quite well 95%+ of the time, there aren’t likely to be very many obvious, significant, and easy-to-implement improvements.

haishan
Guest
haishan
1 year 9 months ago

(Which doesn’t mean it’s not worth looking for some that are two out of the three.)

Hoary Trope, Jr.
Guest
Hoary Trope, Jr.
1 year 9 months ago

“identify the causes of the consistent outperforming of the models.”

That the models are really coarse-grained and crude? I mean, no offense, but “linear weights” models are not exactly a life-like replica or simulation of the game of baseball. Linear weights and logarithmic weights work shockingly well for modeling a lot of things, but they’re still very much “first-hack” approaches. You wouldn’t go trying to beat the stock market with them, I’ll tell you that.

It’s not like we don’t have better models:
– Hierarchical linear models: The next step up, we actually acknowledge structure and properly weight it!
– Bayesian models: A lateral shift over, we allow ourselves joint conditional probabilities that account for all sorts of things that AREN’T independent.
– Markov Models: We account for causality across series of events, over time. We can then calculate the differences between different policies for selecting actions.

Given that we’re not using any of these most of the time, is it any wonder that we’d have plenty of unexplained variance? Our operational model amounts to literally a weighted line in a hyperspace. I mean, it’s a weighted line with some great data behind it, with weights that are quite predictive but uhm… it’s a line.

Bill
Guest
Bill
1 year 9 months ago

If every comment arguing the Orioles’ point were erased but one, I’d leave this one. Well said.

Wobatus
Guest
Wobatus
1 year 9 months ago

Yes, well said. It’s kind of why I think the coin flip analogy is probably a bit flip. There are so many more variables at play.

Wobatus
Guest
Wobatus
1 year 9 months ago

This is probably completely off base, but is there any possible correlation between being better than baseruns and having a good fielding team?

The Royals and O’s are 1st and 3rd in the majors in team fielding and 1 and 2 in actual wins above baseruns model. I know it’s a hoary trope and not true, D don’t slump. It likely isn’t clutch either.

Hmm, I think my next (and first) novel should be called Hoary Trope.

LaLoosh
Guest
1 year 9 months ago

I mentioned something about this above – that run prevention is probably underrepresented in some models.

Also that bullpens might be underweighted in close run differential scenarios.

The other thing that I haven’t seen mentioned is that gross run differential says absolutely nothing about the distribution of those runs. Is there really much to glean from runs for and against? It’s almost like citing the average temperature of a place that has 4 seasons like NY. It provides very little intuitive data.

Hoary Trope, Jr.
Guest
Hoary Trope, Jr.
1 year 9 months ago

That’s a fine name for a novel, but a bad name for a first-born.

Johnny U
Guest
Johnny U
1 year 9 months ago

Fangraphs writers talk about how the Orioles and Royals have overperformed their true talent but they only project the Nationals, Tigers, A’s, and Angels to produce more WAR going forward.

Ummm… what?

http://www.fangraphs.com/depthcharts.aspx?position=ALL&teamid=2

Orioles Fanatic
Guest
Orioles Fanatic
1 year 9 months ago

Dave, how dare you question my Orioles and their magical winning abilities! They are clearly wizards who can defy science. *Buck Showalter casts fire! It’s super effective against spread sheets!*

another troll
Guest
another troll
1 year 9 months ago

sub-par effort.

BDF
Guest
BDF
1 year 9 months ago

Fantastic article. I’m not particularly well educated in statisticality, but why do we assume normal distribution? Is it just because experience shows that distributions tend to be normal and so we assume that unless proven otherwise? Seems inadequate, and also seems unlikely that there would be some distribution pattern that holds over a diverse range of phenomena.

Thanks!

B N
Guest
B N
1 year 9 months ago

It’s mainly because if you take a bunch of random independent samples from any well-behaved distribution and add them up, the sums eventually tend to look normal-ish. As a result, tons of things look normal if you have enough samples. So basically, if you can assume your samples are pretty independent (e.g., one sample doesn’t impact the other one) and they’re well-behaved (e.g., they don’t occasionally give you an infinite number or a few other things), you can emulate a normal distribution with them.

This is kind of a weird finding, but it’s well… cause math. The normal distribution just happens to be what you get when you do the integral operations that capture information about the moments (e.g., mean, variance) of most distributions.

However, just because you get a normal distribution from it, this does NOT mean:
1. Your samples are independent. Plenty of normal distributions arise from samples that aren’t independent, but still end up normal when you lump them together.

2. Your samples are even random! You can emulate a normal distribution from a chaotic sequence, which is still deterministic. This is the core principle behind most random number generators in computers.

By comparison, there’s no reason to think that baseball is particularly random. It is almost entirely dependent upon chemical processes and classical physics, which are fairly deterministic. I have yet to see proof that quantum mechanics play a huge role in dingers or K’s. Mainly, when we talk about “random” in baseball, that is just code words for the result of “chaotic processes that we cannot measure usefully.” And there’s certainly no reason to think that an actual normal distribution governs anything in baseball, except maybe sub-atomic particles (and even that is a pretty dicey assumption, what with discrete quantum states…)
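To make the "deterministic but looks normal" point concrete, here is a small sketch. It uses a linear congruential generator with commonly cited constants as the deterministic sequence. The seed, the group size of 12, and the sample count are arbitrary choices for illustration.

```python
def lcg(seed, a=1664525, c=1013904223, m=2**32):
    """Yield a fully deterministic sequence of values in [0, 1)."""
    state = seed
    while True:
        state = (a * state + c) % m
        yield state / m

stream = lcg(seed=2014)

# Sum 12 consecutive values and subtract 6. For independent uniform
# draws this sum would have mean 0 and variance 1.
samples = [sum(next(stream) for _ in range(12)) - 6 for _ in range(20_000)]

mean = sum(samples) / len(samples)
var = sum((x - mean) ** 2 for x in samples) / len(samples)
within_one_sd = sum(abs(x) <= 1 for x in samples) / len(samples)

print(f"mean={mean:.3f} variance={var:.3f} share within one SD={within_one_sd:.3f}")
```

Nothing in that sequence is random, since the same seed gives the same numbers every time, yet the sums have roughly the mean, variance, and one-standard-deviation share you would expect from a normal distribution.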

TKDC
Guest
TKDC
1 year 9 months ago

“Well, the 2012 Angels won fewer games than BaseRuns expected, and so did the 2013 Angels, and so have the 2014 Angels. After beating the model for five straight years, Scioscia is now on a three year losing streak. This doesn’t erase what he’s done previously, but if we’re to explain how Scioscia figured out how to beat BaseRuns, we have to also explain why he forgot how to do it a couple of years ago, and hasn’t remembered since.”

This is condescending and ignores reality in not just baseball but just life in general. If I am a car salesman and I outsell everyone in the area for 10 years, and then I don’t for the next 3 years. It doesn’t mean that I “forgot” whatever it was that made me successful, it means that other things have changed and I have not adjusted with the times. Or maybe I’ve just gotten old and tired and don’t care anymore?

Anyway, I’m not a fan of the Orioles (or the Angels), and I generally agree with what you’re saying (since the math is right there, Jesus, how could you not?) but I do think dismissing any sort of effect Scioscia may have had for 10 years based on what the Angels have done the past three is wrong.

Chito Martinez
Guest
1 year 9 months ago

Even as an O’s fan I think the most interesting to come of this discussion would be a deep-dive into that 10 year stretch the Angels had.

Bill
Guest
Bill
1 year 9 months ago

It would be interesting to see what they had in common with the current Orioles team. Perhaps it’s nothing and Cameron is right, but I’m betting otherwise.

B N
Guest
B N
1 year 9 months ago

If I recall correctly, this analysis WAS done a few years ago, though I don’t recall the link. I think the big factors were some small-ball (sacrificing higher expected runs to get the minimal runs you need in late-and-close games), shut-down reliever matchups, and more aggressive baserunning in late-and-close games. Though I may be recalling it wrong.

I think a big thing is that late-game situations can have slightly different conditional winning probabilities, because the expected number of runs scored by the opponent shrinks. Or, in other words, you have more information about how many runs you’ll probably need to score to win. Early in the game, you don’t have that info: maybe the other team will get 10, maybe they’ll get 0. If it’s bottom of the 9th inning and you’re tied, you know you need exactly one run. This lets you maximize your choices for getting at least that one run (if you strategize properly). There are analyses that account for this (usually based on Markov chains), but linear weights (e.g., WAR) ignore those because they target the player as a unit of analysis.

I think the Angels picked guys who had skills that could be applied to implement better late-game strategies, then applied some of those strategies.

Ruki Motomiya
Guest
Ruki Motomiya
1 year 9 months ago

I think this also makes sense. If you only need one run to win the game, such as in the 9th, you would want to make decisions that maximize the chance of scoring one run, even if they lower the chance of scoring two runs. (Since you only need one.)
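A quick sketch of that trade-off. Both run distributions below are invented for illustration and are not estimates from real play-by-play data. The point is only that the approach with more expected runs is not necessarily the one with the better chance of scoring at least once.

```python
# Hypothetical probabilities of scoring 0, 1, 2, or 3 runs in an inning
# under two approaches. These numbers are made up for illustration.
swing_away = {0: 0.70, 1: 0.15, 2: 0.09, 3: 0.06}
small_ball = {0: 0.62, 1: 0.30, 2: 0.06, 3: 0.02}

def expected_runs(dist):
    """Average runs per inning under a given distribution."""
    return sum(runs * prob for runs, prob in dist.items())

def prob_at_least_one(dist):
    """Chance of scoring one or more runs."""
    return 1 - dist[0]

for name, dist in [("swing away", swing_away), ("small ball", small_ball)]:
    print(f"{name}: expected runs={expected_runs(dist):.2f}, "
          f"P(at least one run)={prob_at_least_one(dist):.2f}")
```

With these made-up numbers, swinging away is worth more runs on average (0.51 vs. 0.48) but small ball scores at least once more often (38% vs. 30%). In a tied bottom of the 9th the second number is the one that matters.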

Ruki Motomiya
Guest
Ruki Motomiya
1 year 9 months ago

This post basically said one of the few things I wanted to say.

I also think that it should be noted that doing something for a long time and then not doing it is NOT necessarily evidence that it was not repeatable: If, say, Miguel Cabrera puts up 10 awesome seasons, then 3 bad ones, the likely explanation is that he is aging, not that he got lucky. Similarly, if the Angels did it for 10 years and then stopped, what if they aged or the skills they had deteriorated? What if it was the result of personnel changes causing talent to be developed differently and thus losing the edge? Or even opponents “figuring out” what they did and finding a way to address it?

My problem with ascribing the Orioles and Angels so much to random variation, and saying that eventually they stopped, is that it assumes a team is in a vacuum and does not change. Yes, the Angels stopped doing it after 10 years, but I am pretty sure a lot changes in 10 years, and that can affect it as much as random chance.

As for if they actually figured anything out? No idea, but just saying “Random chance and show me otherwise to get me to look into it” is not the answer: It should be looked into and if nothing is found, you can go “It appears to be random variance and maybe in the future we can look at it more as new metrics and ways to measure baseball come out”.

John Elway
Member
1 year 9 months ago

Glad I never posted that Joe Flacco isn’t the best QB ever drafted by a Baltimore team.

Just neighing.

(#KeepNotGraphs)

telejeff
Member
telejeff
1 year 9 months ago

No, that would be Johnny Unitas.

Word
Guest
Word
1 year 9 months ago

The Steelers drafted Unitas, smart guy.

Brandon Fahey
Guest
Brandon Fahey
1 year 9 months ago

I am starting to come to the conclusion that advanced metrics miss some basic things such as strategy. In particular, some teams are built such that they will win an inordinate number of close games. Take this year’s Yankees. They are not that good of a team and have a negative run differential (mainly their offense is kind of weak). They have outperformed that partly due to luck. But the fact that they, overall, have had good starting pitching and a great back end of the bullpen means that they will win a larger percentage of close games. The starters allow them to grab a lot of early, albeit narrow, leads and the bullpen typically keeps them. This just seems like a basic point that gets ignored in these analyses.

So much
Guest
So much
1 year 9 months ago

Confusion

Dee P. Gordon
Guest
Dee P. Gordon
1 year 9 months ago

Showalter uses his pitchers a lot better than my manager.

DAKINS
Guest
DAKINS
1 year 9 months ago

I think I have the reason. The O’s have somehow found a way to skew the win-loss records between themselves and the Blue Jays. They have taken away wins the Jays should have right now and replaced them with losses the O’s should have.

This is my only explanation for why my team is so far behind them in the standings.

/sarcasm

Rick Dempsey
Guest
Rick Dempsey
1 year 9 months ago

Everyone, forgive Dave for the poor quality of his “analysis”. He’s not aware that the error of any predictive model can be decomposed into different components: bias, variance, and irreducible error. Irreducible error corresponds to random variation, whereas the others correspond to flaws in the predictive model and/or the data used to train the model. For a single data point — the performance of the 2014 O’s — it’s impossible to know which type of error is happening.
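For reference, the decomposition being described is usually written as follows for squared-error loss. This is the standard textbook form, not something taken from the article. The symbols are assumptions for illustration: $y = f(x) + \varepsilon$ is the observed outcome, $\hat{f}(x)$ is the model's prediction, and $\sigma^2$ is the variance of the noise term $\varepsilon$.

```latex
\mathbb{E}\!\left[\big(y - \hat{f}(x)\big)^2\right]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\!\left[\big(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\big)^2\right]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible error}}
```

The expectation is taken over repeated training sets and noise draws, which is why a single observed miss cannot be assigned to one of the three terms.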

So just ignore Dave, he doesn’t know any better. He probably analyzes data in Excel…

nard
Guest
nard
1 year 9 months ago

Jesus wept.

Paul
Guest
Paul
1 year 9 months ago

Man, this whole advanced stats thing really takes the fun out of the game for some fans. I prefer to think my Orioles are magical and a good team because they win games and have been winning for the last 3 years. It’s fun being excited about a winning team and it gets annoying for someone to pull out a bell curve and explain why the Orioles are not as good as they seem and just an incredibly rare fluke.

There is nothing wrong with the BaseRuns model. It makes sense and is a reasonable model for the masses. However, that doesn’t mean there isn’t a better measurement or predictive model out there that would have fewer outliers.

Jimmer
Guest
Jimmer
1 year 9 months ago

So, continue to think that. ESPN and MLB Network and places like that will most likely go along with that….or maybe the local paper, or whatever. If looking for statistical analysis brings you down, don’t come here. The fact that this team shouldn’t be doing what it’s doing doesn’t mean it isn’t doing it. The wins count. Enjoy the fact that they are having a season that’s an outlier. Heck, win the W Series. Enjoy the playoffs. People are getting WAY too worked up over this…I would think they’d spend more time just enjoying the fact that they are winning. I wish my team was winning…

JJ
Guest
JJ
1 year 9 months ago

I think the analysis, stating the Orioles as an outlier, actually proves that their current formula is the reason they are winning. They found a way to put a team together and manage games that has been successful. Other teams have not found a winning formula in the current environment using their own formulas. I don’t think there is any denying that they are winning ballgames. It’s beyond the scope of mathematicians to understand the feel for the game.

Bill
Guest
Bill
1 year 9 months ago

Yes, but is that formula repeatable? What about their current composition allows them to consistently outperform the metrics? If you can’t quantify it, it can’t be copied. Maybe they know something we don’t. Or maybe they just stumbled on something. The Angels’ fall from outlier grace should provide an example that unless you understand what you are doing, you won’t be able to sustain it.

Ruki Motomiya
Guest
Ruki Motomiya
1 year 9 months ago

The Angels’ fall from outlier grace could also mean “After 10 years of having a good team/outperforming team, changes can occur either player-wise or organization-wise that cause it to stop”.

Someone
Guest
Someone
1 year 9 months ago

Having a high WPA for the bullpen isn’t just luck. There are a few things that allow Baltimore to get the most mileage out of a good, but not the best, bullpen. The talent in the Orioles bullpen is very concentrated in a few pitchers. The Orioles don’t choose their relievers at random, and in a high leverage game state, they are going to go to one of their top notch relievers. When it is 11-1, that’s when AAAA guy comes in. There gets to be a point in a blowout where the managers no longer contest the game, and will give their lesser players playing time. At that point most collective team data is meaningless.

Bill
Guest
Bill
1 year 9 months ago

Yes, but going into a season with a number of top notch relievers is not an easy task, even for the best talent evaluators. Pedro Strop was great in 2012 but terrible in 2013. JJ blew 10 saves in 2013 after blowing none in 2012. Zach Britton was a failed starter; they couldn’t have predicted his success. Hunter has been rough this year after two years of success. Matusz has been bad this year after being great last year. I agree that a lot of the O’s ability to outperform their ratings has to do with their bullpen; however, this is not an easy thing to duplicate. Relievers are too unpredictable.

Wobatus
Guest
Wobatus
1 year 9 months ago

Mr. Adams above was making this same argument generally, but every team’s bullpen usually has some top notch guys and then filler. Sure, some are better top to bottom. He was arguing that outside of Betances and Robertson the Yankee bullpen stinks. Of course, Betances and Robertson are 2 of the top relievers in baseball by WAR, xFIP, whatever. And actually Warren and Kelly are pretty good too. And, well, Yankees actually beat their baseruns prediction as well.

fish
Guest
fish
1 year 9 months ago

Hey Dave,

Since you only have limited BaseRuns data available, are you willing to share the dataset you used in the article? I’d like to replicate your results.

thanks,
a person

chuckb
Guest
chuckb
1 year 9 months ago

Contact David Appelman. Dave makes it pretty clear in the article that he got his “limited” data from him.

___
Guest
___
1 year 9 months ago

Right, cause it’s not like they work for the same website or something.

Dennis
Guest
Dennis
1 year 9 months ago

Hi Dave:

The presence or absence of a normal curve in a large sample has zero bearing on whether the outliers have a causal factor. Coin flips follow a normal curve, but so does human height, which isn’t exactly random on the individual level. While you are correct that you can’t infer incorrectness of underlying assumptions due to the presence of outliers, you also cannot infer that because the outliers follow a normal distribution, that said outliers are due to random variation.

well, this
Guest
well, this
1 year 9 months ago

is correct. And a competent statistician should have been able to tell you this. And 5 minutes of thinking about some of the other things in the world that have normal distributions should have been able to tell you this.

Justin
Guest
Justin
1 year 9 months ago

This is what I’ve been trying to argue but you said it in a much better way.

Ruki Motomiya
Guest
Ruki Motomiya
1 year 9 months ago

This is basically the other point I had in mind, I believe. Just because it can be random variation does not mean it is.

Arc
Guest
Arc
1 year 9 months ago

Sure you can. Thread desperately needs more Bayes.

well, this
Guest
well, this
1 year 9 months ago

Please explain.

Tommy "5 Runs All Earned" Hunter
Guest
Tommy "5 Runs All Earned" Hunter
1 year 9 months ago

Maybe the three-year rolling averages address this, but as I just found out I’m on the C team I figure I could be wrong. Does this analysis adequately address the fact that these events are not independent, i.e., they happened with the same team (many of the same players, same manager, same GM), whereas the whole outlier to normal distribution and randomness argument rests on a premise of independent events?

JLuman
Guest
JLuman
1 year 9 months ago

I feel like this discussion is lacking the utilization of Bayesian inference.
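For what it's worth, a bare-bones version of what a Bayesian treatment could look like. The one-run records are the ones quoted in the article (29-9, 20-31, and 24-17 to date). The prior is purely an assumption for illustration: a Beta prior worth 100 wins and 100 losses, i.e. a fairly strong belief that one-run games are close to coin flips.

```python
# One-run game records (wins, losses) as quoted in the article.
one_run_records = {2012: (29, 9), 2013: (20, 31), 2014: (24, 17)}

# Assumed Beta prior: strength of 100 "pseudo-wins" and 100 "pseudo-losses".
# This choice is hypothetical; a weaker prior would move further with the data.
PRIOR_WINS = 100
PRIOR_LOSSES = 100

wins = sum(w for w, _ in one_run_records.values())
losses = sum(l for _, l in one_run_records.values())

raw_rate = wins / (wins + losses)
# Posterior mean of a Beta-binomial model: add observed results to the prior.
posterior_mean = (PRIOR_WINS + wins) / (PRIOR_WINS + PRIOR_LOSSES + wins + losses)

print(f"Observed one-run record: {wins}-{losses} ({raw_rate:.3f})")
print(f"Posterior mean win rate in one-run games: {posterior_mean:.3f}")
```

Under that prior the three seasons nudge the estimate from .500 to about .524, well short of the raw .562. How skeptical the prior should be is the real disagreement, and the sketch does not settle it.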

mch38
Member
mch38
1 year 9 months ago

It’s the good old “How large must a model train set be before it’s just a train” argument.

Kevin
Guest
Kevin
1 year 9 months ago

As an Orioles fan I don’t understand why people have a problem with this. Isn’t this pretty much exactly the type of story society tends to love? A team overcoming its limitations and performing well above what is expected of them?

pft
Guest
pft
1 year 9 months ago

Lack of evidence is not proof, something sabermetrics just does not get.

It may be that the Orioles have a bunch of players, a manager and coaches who know how to win beyond their true talent level. There is no statistical test that can prove this at the 95% level due to SSS.

That said, I do think the weaker AL East has played a big role in the Orioles success this year. No good Yankees and Red Sox teams, and the Rays are average at best.

Arc
Guest
1 year 9 months ago

No one so much as implied it was. You are boxing shadows.

B N
Guest
B N
1 year 9 months ago

“The question isn’t whether we can find outliers in the data; the question is whether there are more outliers than we’d expect given a normal distribution.”

This is an exceptionally stupid way of framing the problem. Imagine a model with three states: {0, 1, 2}. Second, assume that state 1 occurs 95% of the time. I could “predictively” model this by just guessing 1 all the time. Indeed, if I wanted to maximize some return over this distribution, I SHOULD guess 1 all the time. And I could apply a normal distribution to claim that my model is fine because the outliers (0, 2) don’t occur very often.

But that’s stupid. The question is, can I find information that I can use to determine WHEN I should guess 0 or 2? That’s the whole concept of predictive modeling. You don’t say a predictive model is good because outliers aren’t more common than you’d expect in a normal distribution (“It’s okay, I only get hit in the face about as much as you’d expect by chance, given the distribution of violence in the world…”). Your predictive model is good when you’ve applied all the available information, in all possible ways, yet you still cannot eke out any more predictive accuracy (“I figured out that when I called people bad names, they hit me in the face! This new information is life-changing!”).
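The three-state example can be written out directly. The 100 observations and the "signal" attached to each one are invented. The signal stands in for whatever extra information a better model might use.

```python
# Hypothetical data: (signal, state). State 1 occurs 95 times out of 100.
observations = (
    [("none", 1)] * 93
    + [("low", 0)] * 3 + [("low", 1)]
    + [("high", 2)] * 2 + [("high", 1)]
)

def always_one(signal):
    """Baseline: ignore the signal and always guess the common state."""
    return 1

def use_signal(signal):
    """Guess 0 on a 'low' signal, 2 on a 'high' signal, otherwise 1."""
    return {"low": 0, "high": 2}.get(signal, 1)

def accuracy(predict):
    """Share of observations where the prediction matches the state."""
    hits = sum(predict(signal) == state for signal, state in observations)
    return hits / len(observations)

print(f"Always guess 1: {accuracy(always_one):.2f}")
print(f"Use the signal: {accuracy(use_signal):.2f}")
```

The baseline is right 95% of the time and its misses are rare, but the rule that uses the signal gets to 98%. The baseline's misses were not noise, they were information it never looked at.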

Fiers at the taco bell
Guest
Fiers at the taco bell
1 year 9 months ago

It’s one thing to prove that the Orioles are an outlier (but an outlier that is within the bounds of a normal distribution).
It’s another thing altogether to decide that your model which is effectively a univariate regression (Wins = f (baseruns) + Error) is a perfectly specified model. That is that your error term is only reflective of noise, and not of any missing variables.
Outliers are interesting because they make you consider if there is anything you’re missing (that is, if you have an inquisitive econometric mind).
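Here is a small simulation of the missing-variable concern. Everything in it is hypothetical: the "bullpen" factor, the coefficients, and the standardized units are assumptions chosen to make the effect visible. They are not estimates of anything in baseball.

```python
import random

rng = random.Random(1)  # fixed seed so the run is repeatable
n = 2000

# Hypothetical standardized inputs: the model's estimate and an omitted factor.
baseruns = [rng.gauss(0, 1) for _ in range(n)]
bullpen = [rng.gauss(0, 1) for _ in range(n)]
noise = [rng.gauss(0, 1) for _ in range(n)]

# Assumed "true" process: wins depend on both, but the model only sees baseruns.
wins = [1.0 * b + 0.5 * h + e for b, h, e in zip(baseruns, bullpen, noise)]

def mean(xs):
    return sum(xs) / len(xs)

def covariance(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def correlation(xs, ys):
    return covariance(xs, ys) / (covariance(xs, xs) * covariance(ys, ys)) ** 0.5

# Fit the univariate regression wins = alpha + beta * baseruns by least squares.
beta = covariance(baseruns, wins) / covariance(baseruns, baseruns)
alpha = mean(wins) - beta * mean(baseruns)
residuals = [w - (alpha + beta * b) for w, b in zip(wins, baseruns)]

print(f"Correlation between residuals and the omitted factor: "
      f"{correlation(residuals, bullpen):.2f}")
```

The residuals here are a sum of normal variables, so they would pass any bell-curve eyeball test, yet they correlate at roughly 0.45 with the factor the regression left out. A normal-looking error term does not by itself show the model is fully specified.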

Arc
Guest
1 year 9 months ago

Good thing that wasn’t decided or even implied, then.

David
Guest
David
1 year 9 months ago

It would be really interesting to do an analysis of the teams that have historically outperformed their expected wins over a long period of time to see if they have anything in common that might explain their success and isn’t fully taken into account by current models.

Ruki Motomiya
Guest
Ruki Motomiya
1 year 9 months ago

One of the other things I was wondering is, couldn’t BaseRuns be off and that is causing an issue, such as overrating or underrating a certain type of skill? To use a hypothetical, what if defense in reality saved slightly more runs than BaseRuns thinks it does and thus is a bit more valuable? How would you test this? Interested to know.

telejeff
Member
telejeff
1 year 9 months ago

This is a logical, and well-written defense of models. But, it misses the mark in two important ways:

1. Most of the debate is about why pre-season models have been so wrong about the Orioles each of the past three years. This article, however, defends the performance of in-season, evolving models, which have done much better. I suspect there are plenty of other examples of teams regularly out-performing pre-season projections, but that does not dismiss the idea that we should be wondering how those management teams manage to keep doing so. After all, it was the consistent overperformance relative to conventional wisdom of the MoneyBall A’s, Earl Weaver Orioles, and other past teams that led us to statistical analysis in the first place.

2. While the O’s out-performance relative to in-season, evolving models in 2 of the past 3 years probably isn’t noteworthy yet, the Angels’ consistent out-performance for an entire decade is absolutely noteworthy and should be explored. It is almost impossible for that performance to be the result of random variation. Moreover, in my estimation, this hypothesis is bolstered rather than discounted by the Angels’ more recent under-performance. The past few years, the Angels have pursued a different strategy in building the team, relying on Pujols, Hamilton, and other high-priced free agents rather than the plug-and-play gamers that surrounded Vlad Guerrero in the prior decade.

So, yes the models are very good at predicting in-season results, except when it came to the 2002-2011 Angels. And, whatever those Angels teams were doing absolutely should be explored because it is hard to say it was the product of random variation.

And, models don’t seem to be very good at pre-season predictions, however, at least not in some cases, such as the Orioles of the past 3 years. It would be also interesting to see more about that: is 3 straight years of over-performance compared to pre-season projections noteworthy? Or, is it common? And, if not common, is it the product of random variation?

Fiers at the taco bell
Guest
Fiers at the taco bell
1 year 9 months ago

Made this exact point about the Angels earlier. Agree entirely that throwing out the ten years of overperforming Angels teams because the same manager was there is problematic. You can do it if you subscribe to the view that everything about the Angels’ overperformance was driven by Scioscia, but if you think it might have more to do with the speed+defence+*fundamentals*+elite bullpen model that they employed for those years, it’s probably not going to work.

To me, those Angels teams were designed to manufacture and hold one-run leads, so there might be something to it.

Matt Shoemaker at the In N Out
Guest
Matt Shoemaker at the In N Out
1 year 9 months ago

Nice name

Go Nats
Guest
Go Nats
1 year 9 months ago

I started to write a long explanation about your lack of understanding about statistics and modeling, but I am bored and want to read other things.

telejeff
Member
telejeff
1 year 9 months ago

Doubt it was boredom. Would love to see you try to explain how random variance explains the observation that 1 team out of 30 substantially exceeded “expected records by an average of 41 points per year, winning more games than expected in eight of those ten years.” The odds of that occurring are very small.

chuckb
Guest
chuckb
1 year 9 months ago

It seems that more teams should change their names to begin with the letter A. That’s clearly the new market inefficiency.

telejeff
Member
telejeff
1 year 9 months ago

Dave is right about the relationship between the Orioles’ performance and records the past three years. The observed variance is likely the result of random variation.

The Orioles’ performance (not record) the past three years has substantially exceeded pre-season projections. This is a different matter. The likely reason, however, is that pre-season projections aren’t very good, even when based on statistical modeling. There are too many variables that can’t be estimated.

Dave’s dismissal of the variation between the 2002-11 Angels’ performance and record fails, however. It is implausible that random variation can explain 1 team out of 30 beating “expected records by an average of 41 points per year, winning more games than expected in eight of those ten years.” Either we witnessed an exceedingly rare event or the Angels were doing something positive that has not been accounted for.
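
The "eight of those ten years" half of that claim can be checked with a binomial tail, sketched below. The assumptions are mine and are labeled in the code: each season is treated as an independent coin flip for beating the expected record, and the 30 teams are treated as independent of one another.

```python
from math import comb


def binom_tail(n: int, k: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Assumptions (not from the article): beating the expected record is an
# independent 50/50 event each season, and teams are independent.
p_one_team = binom_tail(n=10, k=8, p=0.5)

# Chance that at least one of 30 teams has such a decade.
p_any_team = 1 - (1 - p_one_team) ** 30

print(f"a given team, 8+ of 10 years: {p_one_team:.4f}")
print(f"at least one of 30 teams:     {p_any_team:.4f}")
```

For a single pre-chosen team the chance is only about 5%, but across 30 teams it comes out above 80%. So the count of winning years alone is weak evidence. The part of the claim this sketch does not touch is the size of the gap, the 41-point average, and judging that would need the season-to-season spread of the model's misses.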

baycommuter
Guest
baycommuter
1 year 9 months ago

On the subject of the Yankees outperforming, the Angels team that outperformed so much in 2009 was similar: part of their bullpen was good, part was awful. Scioscia (wisely) kept bringing the bad guys into lost games and let them get killed, a formula for overperformance.

DNA+
Guest
DNA+
1 year 9 months ago

Fun things about this thread:

1) The Orioles seem to have a lot of passionate fans. Who knew?

2) Dave Cameron publishes a website full of normally distributed data that reflect nonrandom differences in baseball skill or performance, yet comes to the conclusion that a normal distribution in predicted versus observed wins must mean that the variance is entirely random.

worstfan_NA
Guest
worstfan_NA
1 year 9 months ago

You could be snarky and cynical and say that he is in denial that his system does not accurately measure what he is trying to measure: that is, creating a cookbook guide to building a good baseball team.

It wouldn’t be such a big deal if a random scrubbie team that was supposed to be really terrible was only moderately terrible, or if a moderately terrible team became really terrible, but the fact is that some teams, year over year, have been able to achieve a winning blueprint that diverges from what this website is trying to sell.

It’s not an elegant blueprint. Real baseball people will laugh at this sort of analysis. Go ahead and take all the clubhouse cancers in the world because they have the desirable on-field numbers. The manager has to be able to manage the egos on a team.

Is it so hard to grasp that good teams may have that unmeasurable x factor? Or does the locker room have no impact on the game whatsoever?

deadhead
Member
deadhead
1 year 9 months ago

Guys, the thing we are missing here is, Dave Cameron really knows his stuff. He wouldn’t have written it if it weren’t true. I heard he and Brian Kenny had a three way with Joe Maddon on top of The Book, after a couple bottles of Pinot Greesh. That’s the type of experience you just can’t discount.

Jdm
Guest
Jdm
1 year 9 months ago

This is silly. A model airplane is not supposed to fly like a real airplane but to mimic it. The same goes for statistical “models” of reality. Just as a model airplane doesn’t capture all of reality, BaseRuns does not capture every component of winning. Models give us approximations of reality and a way to think about what we might expect. If you try to model every component that might impact your dependent variable, you will overfit the data. The concern with models is systematic bias, which this one doesn’t seem to have. There is possibly an issue with serial correlation, but this is the best “model” we currently have. It is fair to ask what other predictive components might be added, but I think it’s silly to point to singular data points and use them to try to prove that a model is wrong when it works very well for the vast majority of data points.

Jonathan Judge
Member
Jonathan Judge
1 year 9 months ago

I’m late to this party, but this article was really wonderfully conceived and written. Nicely done.

Oblarg
Guest
Oblarg
1 year 9 months ago

The analysis here, as far as I can tell, is basically “the distribution of residuals from our model is normal, therefore the residuals are random noise.”

This is poor statistics. That a distribution is normal says nothing about whether or not it is /predictable/, which is the key aspect of randomness. Yes, random noise will be normally distributed. Not all normal distributions are random. This is a necessary condition, but hardly a sufficient one. In fact, as the standard deviation is calculated from the /observed deviation from the model/ in the first place, the thrust of this article can be essentially boiled down to the claim that “the Orioles’ deviation from the model is not out of line with the deviation we observe in the rest of the data set.” This is true, but the prudent reader should respond, “so what?” One can imagine a model with no predictive value whatsoever in which the same is true. It provides no insight into the nature of that deviation and to what degree it can or cannot be predicted by other data available to us.
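
To make the distinction concrete, here is a small simulation sketch with entirely synthetic data. It builds two sets of residuals that share the same normal distribution with variance 1. In one they are pure noise redrawn every year, and in the other half the variance comes from a persistent per-team effect. The sample size and the 50/50 variance split are illustrative assumptions, not estimates from baseball data.

```python
import random


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


rng = random.Random(0)
N = 3000  # hypothetical number of teams, far more than real baseball offers
HALF_SD = 0.5 ** 0.5  # standard deviation that contributes half of a unit variance

# Case 1: residuals are pure noise, drawn fresh in each of two years.
noise_y1 = [rng.gauss(0, 1) for _ in range(N)]
noise_y2 = [rng.gauss(0, 1) for _ in range(N)]

# Case 2: a persistent team effect plus fresh noise; total variance is still 1.
team_effect = [rng.gauss(0, HALF_SD) for _ in range(N)]
skill_y1 = [t + rng.gauss(0, HALF_SD) for t in team_effect]
skill_y2 = [t + rng.gauss(0, HALF_SD) for t in team_effect]

# Year-to-year correlation is what separates the two cases.
corr_noise = pearson(noise_y1, noise_y2)
corr_skill = pearson(skill_y1, skill_y2)
print(f"pure noise, year 1 vs year 2:  {corr_noise:+.3f}")
print(f"team effect, year 1 vs year 2: {corr_skill:+.3f}")
```

A histogram of either case looks like the same bell curve. Only the year-to-year correlation tells them apart, which is why the shape of the residuals alone cannot settle whether they are predictable.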

I often wonder if writers here have taken more than an introductory college stats course, as while the math itself is usually correct, there are /huge/ gaps when it comes to justified inference from the numbers. For example, why is it that despite working with a purely observational data set with no coherent motivation for the underlying models, we never see data partitioned into model sets and test sets? This is something that anyone who understands experimental design should know is necessary – you cannot generate confidence in your model by /testing it against the same data you used to formulate it/. Inference does not work that way.
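
A minimal sketch of the partitioning described above, using made-up data. The synthetic seasons, the "about 10 runs per win" slope, the noise level, and the 70/30 split are all assumptions chosen for illustration. This is not how BaseRuns or any FanGraphs model is actually fitted.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    return my - b * mx, b


def rmse(a, b, xs, ys):
    """Root mean squared error of the line a + b * x on (xs, ys)."""
    return (sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)) ** 0.5


rng = random.Random(1)

# Synthetic team-seasons: (run differential, wins). Illustrative numbers only.
seasons = []
for _ in range(300):
    run_diff = rng.gauss(0, 100)
    wins = 81 + run_diff / 10 + rng.gauss(0, 4)
    seasons.append((run_diff, wins))

# Partition BEFORE fitting: 70% to build the model, 30% held out to test it.
rng.shuffle(seasons)
split = int(0.7 * len(seasons))
train, test = seasons[:split], seasons[split:]

# Fit on the training set only, then score both sets with the same line.
a, b = fit_line([x for x, _ in train], [y for _, y in train])
train_rmse = rmse(a, b, *zip(*train))
test_rmse = rmse(a, b, *zip(*test))

print(f"fitted: wins = {a:.1f} + {b:.3f} * run_diff")
print(f"train RMSE: {train_rmse:.2f}   held-out RMSE: {test_rmse:.2f}")
```

The held-out error is the number that supports any claim about predictive value. If it came out much worse than the training error, that would point to a model tuned to the data it was built on.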
