This is what I expected, since runs scored should roughly follow a Poisson distribution. There are some reasons why the Poisson distribution is not exactly correct (runs come in bunches). If the Poisson distribution held, the standard deviation of runs would go as sqrt(runs), so that RS_Vol = sqrt(runs)/runs = 1/sqrt(runs). The actual data do not show this form precisely, but the trend of a decreasing coefficient of variation with increasing runs was certainly present. Toying with the data, I found that a revised measure of volatility, sigma(Runs^(1.5))/Runs, is close to independent of runs.
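To make the 1/sqrt(runs) argument concrete, here is a minimal sketch that compares the theoretical Poisson coefficient of variation with a simulated one. The function names, the seed, and the choice of 4 runs as the mean are illustrative assumptions and are not taken from the data discussed above.

```python
import math
import random
import statistics

def poisson_cv(mean_runs):
    """Coefficient of variation of a Poisson variable: sqrt(mean) / mean."""
    return 1.0 / math.sqrt(mean_runs)

def sample_poisson(mean_runs, rng):
    """Draw one Poisson variate using Knuth's multiplication method."""
    threshold = math.exp(-mean_runs)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def simulated_cv(mean_runs, games=20000, seed=0):
    """SD / mean of simulated Poisson run totals (seed is arbitrary)."""
    rng = random.Random(seed)
    runs = [sample_poisson(mean_runs, rng) for _ in range(games)]
    return statistics.pstdev(runs) / statistics.mean(runs)

if __name__ == "__main__":
    for mean_runs in (3.0, 4.0, 5.0, 6.0):
        print(mean_runs, round(poisson_cv(mean_runs), 3),
              round(simulated_cv(mean_runs), 3))
```

Under a pure Poisson model the two columns should track each other closely and fall as the mean rises. Real run data would be expected to deviate, since runs come in bunches.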

Comparing actual wins with Pythagorean expected wins should remove the link between runs scored and volatility. However, since the volatility measure is correlated with runs scored and runs allowed, I would worry that the reported effects are really trends of expected-minus-actual wins with runs scored and allowed. It would be interesting to see if these trends remain with a modified measure of consistency that is independent of runs scored and allowed.
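For reference, the actual-versus-Pythagorean comparison can be sketched as below. The default exponent of 2 is the classic form and is an assumption here; the function names are hypothetical and the exponent is left as a parameter so another value can be substituted.

```python
def pythag_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Pythagorean expected win%: RS^x / (RS^x + RA^x)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)

def pythag_residual(wins, losses, runs_scored, runs_allowed, exponent=2.0):
    """Actual win% minus Pythagorean expected win%."""
    actual = wins / (wins + losses)
    return actual - pythag_win_pct(runs_scored, runs_allowed, exponent)
```

A positive residual means the team won more often than its runs scored and allowed would suggest. That residual is the quantity one would then check against any consistency measure.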

Given the time, I will try the Weibull distribution analysis discussed above.

http://cybermetric.blogspot.com/2010/04/how-much-does-team-consistency-matter.html

I first ran a regression with team winning percentage as the dependent variable and runs per game and opponents’ runs per game as the independent variables. Then I added two variables in a second regression which measured consistency. HITCON was the standard deviation (SD) of runs per game divided by runs per game (just the SD would not be right since high scoring teams will have a greater SD). PITCON does something similar on the pitching side.
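A minimal sketch of the consistency measure described above, SD of runs per game divided by mean runs per game. Using the population SD and applying the same formula to runs allowed for PITCON are assumptions; the comment does not spell out either detail, and the function name is hypothetical.

```python
import statistics

def consistency(runs_by_game):
    """SD of per-game runs divided by mean per-game runs.

    Lower values mean a more consistent team. Population SD is an
    assumption; the sample SD would differ only slightly over a season.
    """
    return statistics.pstdev(runs_by_game) / statistics.mean(runs_by_game)

if __name__ == "__main__":
    scored = [2, 6, 4, 3, 5, 4]    # made-up per-game runs scored
    allowed = [1, 9, 0, 7, 3, 4]   # made-up per-game runs allowed
    print("HITCON", round(consistency(scored), 3))
    print("PITCON", round(consistency(allowed), 3))
```

These two values would enter the second regression alongside runs per game and opponents' runs per game.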

I found that the more consistent hitting teams win more for a given average runs per game while the less consistent pitching teams win more.

I found that it is much more important to score and prevent runs than to become more consistent (or less, on the pitching side).

The league shape parameter is typically around 1.85 or so, and is equivalent to the exponent in the Pythagorean (actually, Pythagenpat) win% equation. But individual teams have different shape parameters based on their actual run distributions. The difference between the team shape parameter and the league shape parameter is a better measure, in my opinion, of volatility. This would require fitting a Weibull distribution for every team, but that’s a relatively easy programming task (I would love to do this if I had the time).
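As a rough illustration of the per-team fit, here is a sketch that estimates a Weibull shape parameter by the method of moments rather than a full maximum-likelihood fit. This is a swapped-in shortcut, not necessarily the procedure the comment has in mind. It uses the fact that, for a two-parameter Weibull, the coefficient of variation depends only on the shape k. The function names and the absence of any shift or binning of the run totals are assumptions.

```python
import math
import statistics

def weibull_cv(shape):
    """Coefficient of variation of a two-parameter Weibull with this shape."""
    g1 = math.gamma(1.0 + 1.0 / shape)
    g2 = math.gamma(1.0 + 2.0 / shape)
    return math.sqrt(g2 / (g1 * g1) - 1.0)

def weibull_shape_from_cv(cv, lo=0.1, hi=50.0):
    """Solve weibull_cv(shape) == cv by bisection (CV decreases with shape)."""
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if weibull_cv(mid) > cv:
            lo = mid   # shape too small, CV still too large
        else:
            hi = mid
    return (lo + hi) / 2.0

def shape_gap(team_runs_by_game, league_shape):
    """Team shape minus league shape, the volatility measure suggested above."""
    cv = statistics.pstdev(team_runs_by_game) / statistics.mean(team_runs_by_game)
    return weibull_shape_from_cv(cv) - league_shape
```

A team with a larger shape than the league has a tighter run distribution relative to its mean. The league value itself would need to be estimated from the same data rather than assumed.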

My belief from having studied this is that you can’t *construct* a team to have greater or smaller volatility, but that it does happen and that it explains a great deal of deviation from Pyth win%. If anyone finds a way to predict, a priori, the conditions required for controlling the volatility (or shape) of a team’s run distribution, I’ll buy you a beer.

In the meantime, it’s a heck of a lot easier to stock up on hitters that hit the snot out of the ball and pitchers with great breaking stuff than it is to squeeze blood out of the run distribution stone. It’s a really fun analytical thing to look at but I can’t see the practical consequence.

Thanks for a well thought out article.

Remember, however, that when a manager, announcer, fan or other baseball type calls for a player to be consistent, he is really calling for the player to be consistently good, not for the player to play closer to his average.