FanGraphs Baseball - Comments on Offensive Volatility and Beating Win Expectancy

Channelclemente will loooooooooooooooooove this!

Comment by Albert C. — January 3, 2013 @ 9:30 am

My guess here is that “blowout” type games say more about the losing team than the winner. I.e., you have a blowout because the starting pitcher was terrible that day, went out early, and you spent 8 innings hitting the last 3 guys in the bullpen (not because everyone on your offense was great that day).

So, at a particular level of runs scored, a team with fewer blowouts is probably a better team. Essentially, if a team has lower volatility, more of their runs were talent-based rather than opportunity-based.

Comment by RC — January 3, 2013 @ 10:11 am

The next question regarding the effect of individual hitter volatility on team volatility will be very interesting to see. If it is, and hitter consistency/volatility is a repeatable skill, then you will be able to project team volatility in the future. Just a gut feeling, but I think that most of team volatility will probably come from randomness in sequencing, and possibly injuries, that will be impossible to predict.

Comment by murphym45 — January 3, 2013 @ 10:57 am

By “If it is”, I really meant, “If hitter volatility does have a significant impact on team volatility”. Poor wording on my part.

Comment by murphym45 — January 3, 2013 @ 11:01 am

Bill, fascinating. A couple of questions.

1) Where did .67 come from?

2) Perhaps this calls for a rank order correlation (Wilcoxon? Spearman?)
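(For reference, Spearman’s rho is just the Pearson correlation computed on the ranks. Here’s a self-contained sketch with made-up volatility numbers, not real team data:)

```python
import statistics

def ranks(xs):
    """Average 1-based ranks; ties receive the mean of their positions."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman rho = Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = statistics.mean(rx), statistics.mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Hypothetical data: team volatility vs. wins above expectation
vol   = [0.84, 0.91, 0.77, 1.02, 0.88]
delta = [ 2.1, -1.5,  3.0, -4.2,  0.5]
print(round(spearman(vol, delta), 3))
```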

Comment by Pizza Cutter — January 3, 2013 @ 12:46 pm

I think there is a subtle flaw here, which perhaps Pizza Cutter’s suggestion of rank ordering might help.

Embedded in all of this is the assumption that two teams with equal run totals had equal offensive performances overall, just with some variation in the distribution of runs. This is generally fine for first-order analysis, and similar assumptions underpin much of our analytic knowledge of baseball. However, you are looking at a (definitionally) second-order question, and for this purpose the equivalency assumption may very well break down, not least because runs per game are non-normally distributed.

Consider this extreme example (in a world of only two teams where fractional runs are allowed):

Through 161 games, Team A has scored exactly 5 runs per game, every game.

Through 161 games, Team B has scored exactly 4.907 runs per game, every game.

In game 162, Team B wins 20 to 5.

Team A finishes with a record of 161-1, averaging 5 runs per game with a zero standard deviation.

Team B finishes with a record of 1-161, averaging 5 runs per game with a non-zero standard deviation.
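A quick check of the thought experiment above with Python’s statistics module, confirming that the two teams’ averages are near-identical while their volatilities diverge:

```python
import statistics

# The two-team thought experiment: (near-)identical runs per game,
# radically different volatility and records.
team_a = [5.0] * 162             # 5 runs in every game
team_b = [4.907] * 161 + [20.0]  # one 20-run outburst in game 162

for name, runs in [("Team A", team_a), ("Team B", team_b)]:
    print(name,
          round(statistics.mean(runs), 3),    # runs per game
          round(statistics.pstdev(runs), 3))  # volatility (population SD)
```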

Now, the question: Is Team B an equal offensive team that suffered for having more volatility, or is Team B an inferior offensive team whose inherent inferiority was masked by volatility? We don’t know for sure without filling in more details of game 162, but it is plausible that the 20-run game, like all extreme run outputs, might reflect lack of effort on Team A’s part (i.e. the game was out of hand early and they put in their worst pitcher) as much as skill on Team B’s part.

I know this is a silly example, but the underlying problem (that a single high-run game can raise a team’s total runs scored or allowed by a non-trivial amount, and the distribution of run-scoring is asymmetric) is serious. Is volatility an attribute that can differentiate between two otherwise equal teams, or is it a consequence of an inferior team getting lucky and beating up on mop-up relievers a bit more frequently than average?

You could perhaps address this by summing ln(runs) over each game or something like that. But without addressing this in some way or another I think high volatility, rather than being an attribute which explains sub-par performance, is a consequence of an underlying process which causes us to overestimate the team’s win expectancy.
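The ln(runs) idea can be sketched like this, using the same hypothetical teams as above (handling shutouts, e.g. with log(runs + 1), is left aside here):

```python
import math

team_a = [5.0] * 162
team_b = [4.907] * 161 + [20.0]

# Summing log(runs) damps the influence of a single blowout: a 20-run
# game contributes log(20), roughly 3.0, to the total rather than 20.
# (Zero-run games would need special handling, e.g. log(runs + 1).)
log_a = sum(math.log(r) for r in team_a)
log_b = sum(math.log(r) for r in team_b)
print(round(log_a, 2), round(log_b, 2))  # Team A comes out ahead
```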

Comment by mcbrown — January 3, 2013 @ 2:19 pm

Very interesting. I would have guessed that greater volatility was better and that the difference was nearly negligible. Now I am much smarter in two ways.

Comment by Baltar — January 3, 2013 @ 2:38 pm

There’s another obvious next step: updating the Pythagenpat formula with a volatility adjustment.

Take some dataset of team wins and expected wins. Calculate the volatility metric for each team-season, then fit the relationship between volatility and (expected wins – actual wins). That fitted relationship becomes an adjustment to the original Pythagenpat formula: New Exp Wins = Old Exp Wins + F(Vol). Then, use some metric like AVERAGE((exp – act)^2) to see how much better the adjustment is than the original.
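That pipeline might be sketched as follows. The exponent form ((RS+RA)/G)^0.287 is the commonly cited Pythagenpat version; the slope and league-average volatility inside F(Vol) are made-up placeholders standing in for values you would actually fit from data:

```python
def pythagenpat_wins(rs, ra, games=162):
    """Pythagenpat expected wins with exponent x = ((RS+RA)/G) ** 0.287."""
    x = ((rs + ra) / games) ** 0.287
    return games * rs ** x / (rs ** x + ra ** x)

def adjusted_wins(rs, ra, vol, slope=-3.0, league_vol=0.87, games=162):
    """New Exp Wins = Old Exp Wins + F(Vol), with F a hypothetical
    linear fit. The slope and league_vol values are placeholders,
    not fitted parameters."""
    return pythagenpat_wins(rs, ra, games) + slope * (vol - league_vol)

# Hypothetical team: 750 runs scored, 700 allowed, above-average volatility
base = pythagenpat_wins(750, 700)
adj = adjusted_wins(750, 700, vol=0.95)
print(round(base, 1), round(adj, 1))  # higher volatility shaves off wins
```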

Comment by CaptWBligh — January 3, 2013 @ 4:31 pm