Adjusting Linear Weights for Extreme Environments

Well, it’s my first assignment as a real writer, having been promoted for my Community Research articles on pitcher BABIPs and ERA estimators, and I’ve been thrown into the deep end of the pool: linear weights.  It’s a tricky subject, but I’ll try to walk you through both the problems with linear weights and how they can be overcome.  This article series mainly draws from various works of Tom “Tango,” a.k.a. “tangotiger,” the creator of wOBA and FIP, as well as from David Smyth’s BaseRuns.  I’ll go deeper and deeper down the rabbit hole of stat geekishness as the series goes on, eventually emerging with a spreadsheet version of Tango’s Markov run modeler that I made for you all to play with.  The Markov model mainly outshines wOBA in extreme run environments, such as unusual offenses or extreme ball parks.

Who cares about extreme run environments?

Nerds like me, I guess?  Tom Tango cared enough to come up with ways to address the shortcomings of his original wOBA formulation.  If you’ve ever wondered how valuable a certain player is to your favorite team, maybe you should care too; that low-OBP slugger might be more valuable to your low-OBP team than wOBA suggests.  On the other end, a typical walk last year was worth considerably more to the high-OBP Cardinals than it was to the low-OBP Mariners (around 0.04-0.065 more runs each… which adds up over a season).

Let me pose a hypothetical: if a team’s offense hits one home run per game, and nobody ever reaches base otherwise, how many runs per game would that team average?  The answer is 1, right?  So why do the linear weights used in wOBA, for example, say a home run is worth “1.4 runs above average, and 1.7 runs above the run value of the out”?  Well, wOBA and the other linear weights models of scoring break down in unusual circumstances, because they assume that certain conditions will be at the league average.  In this case, the linear weights overestimate how many runners will be on base when the home run is hit, as they go by what’s typical.  That’s fine for a team that has a pretty typical number of runners on base for each home run, but not for a team like this one.
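To make the hypothetical concrete, here is a minimal sketch using one commonly cited simple version of David Smyth’s BaseRuns.  The coefficients and the `base_runs` helper are assumptions for illustration, not necessarily the exact formulation this series will end up using.  Because BaseRuns treats home runs as automatic runs and scores everyone else at a rate that depends on the team, a lineup that only ever hits solo homers comes out at one run per homer.

```python
def base_runs(ab, h, tb, hr, bb):
    """Estimate runs with one commonly cited simple version of BaseRuns.

    The coefficients below are an assumption for this sketch.
    """
    a = h + bb - hr                                      # runners who reach base
    b = (1.4 * tb - 0.6 * h - 3 * hr + 0.1 * bb) * 1.02  # advancement factor
    c = ab - h                                           # outs
    d = hr                                               # runs that always score
    return a * b / (b + c) + d

# One game for the hypothetical team: 27 outs plus a single solo home run.
solo_hr_game = base_runs(ab=28, h=1, tb=4, hr=1, bb=0)
print(solo_hr_game)  # prints 1.0
```

With nobody else reaching base, the first term is zero, so the estimate is exactly the number of home runs hit.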

So, what’s a home run really worth?

That depends on how many runners are on base, right?  So, between 1 and 4 runs, you might say.  Of course, the home run hitter deserves full credit for driving himself in, but he doesn’t get to take the credit for runners getting on base ahead of him — only for driving them in. So, we now have to divvy up credit for the runs scored on the homer between the players involved.  How do we do that?  Well, you don’t get 1 credit for each RBI and each run you score, because then a team would end up with twice as many credits as actual runs scored.  It would be closer to a half credit for each RBI and run, but it’s more complicated than that.  Let’s look at the 4 runs produced by a grand slam, for example:

  • The hitter gets 1 run credit, for the RBI and the run he scored himself, plus whatever driving in the 3 runners is worth
  • The 3 runners get credit for getting on base, plus for the RBI potential of the type of hit (or walk) that got them on base.  In the circumstance of being driven in by a HR, for the purposes of run-scoring potential, it doesn’t really matter whether they got on via walk, single, or whatever, since a HR drives them all in.

That’s actually kind of complicated, when you think about what goes into “whatever driving in the runners is worth.”  That depends on how likely those further down in the order would have been to drive them in instead.  That isn’t a simple thing to figure out; it in turn depends on things like:

  • the initial base positions of the runners
  • the rates of BB, 1B, 2B, 3B and HR (etc.) of subsequent batters
  • the speed and base-stealing abilities of those on base
  • the number of outs

Then, of course you also have to consider how many runners are likely to be on base when a home run is hit.  This is the most important factor of them all in this equation, if you ask me.

Interactions

If your eyes haven’t glazed over yet, hopefully you see this: different types of hits have different values to different teams.  The fact is, a team’s on-base ability, slugging, and speed all interact with each other when it comes to the process of scoring runs, such that one factor can add or subtract value from another.

I’ll now break down some of the ways the abilities impact each other:

  • The more runners that are on base, the more value any subsequent hit has, all else equal, as there are more RBI opportunities. (Up to a point… more on that later)
  • If the team has a very high OBP, it will be able to sustain longer rallies, and will therefore be less dependent on the home run to score runs (i.e., singles, walks, etc. will be more valuable relative to the home run, compared to low OBP teams).
  • In a low-OBP team, however, while a home run is likely to score fewer runs than it will in an otherwise similar high-OBP team, the value of a home run relative to other hit types will be greater, as the team will be less likely to rally.
  • Digging even deeper, if a team hits a lot of home runs, the average value of a home run actually drops, due to more runners having been cleared from the bases by previous home runs.
  • Base running ability becomes more relevant the closer to 1B the runner is, as only on a couple of particular types of batted balls (some grounders and some flyouts) will the speed of a runner on 3B make the difference between a run and a non-run.  So, good baserunning is relatively more important to a low-OBP team, as there will be fewer rallies that allow the runners to advance.
  • The abilities of the base runners are made less relevant, the greater the value of the hit that advances them; home runs and triples automatically clear the bases regardless of the speed of a base runner (only on rare occasions does a slow or stumbling runner on 1B prevent the batter from reaching 3B).  So, good base running is also more important to low-slugging teams.
  • This one is pretty important: the fewer outs the batter makes, the more opportunities (plate appearances) he allows his teammates and himself to have, which by itself allows the potential for more run-scoring.
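A few of the interactions above can be sketched numerically.  The example below is only an illustration under stated assumptions: it uses one commonly cited simple version of BaseRuns (coefficients assumed), two season lines that are invented for the purpose, and a hypothetical `marginal_value` helper that tacks one extra event onto a team’s totals and measures how many runs it adds.

```python
def base_runs(ab, h, tb, hr, bb):
    """One commonly cited simple BaseRuns version (coefficients assumed)."""
    a = h + bb - hr                                      # runners who reach base
    b = (1.4 * tb - 0.6 * h - 3 * hr + 0.1 * bb) * 1.02  # advancement factor
    c = ab - h                                           # outs
    return a * b / (b + c) + hr

def marginal_value(team, extra):
    """Runs added when one more event is tacked onto a team's season line."""
    bumped = {key: team[key] + extra.get(key, 0) for key in team}
    return base_runs(**bumped) - base_runs(**team)

# Hypothetical season totals, invented for illustration only.
low_obp = dict(ab=5500, h=1300, tb=2000, hr=150, bb=400)
high_obp = dict(ab=5500, h=1550, tb=2400, hr=150, bb=650)

one_walk = dict(bb=1)                    # a walk adds no at-bat
one_homer = dict(ab=1, h=1, tb=4, hr=1)  # a homer adds an AB, a hit, 4 TB

for name, team in [("low-OBP", low_obp), ("high-OBP", high_obp)]:
    bb_value = marginal_value(team, one_walk)
    hr_value = marginal_value(team, one_homer)
    ratio = hr_value / bb_value
    print(f"{name}: BB {bb_value:.3f}, HR {hr_value:.3f}, HR/BB {ratio:.2f}")
```

Under these assumptions, both the walk and the home run add more runs for the high-OBP team, while the home run is worth more relative to the walk for the low-OBP team.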

Most of the interactions above probably seem obvious to you now, yet linear weights formulas ignore them, or don’t properly deal with them.  They assume that a walk, or a single, or a home run is each worth a fixed value based on league averages.  Some systems, like wOBA, do now recognize that the values aren’t really fixed, and change them annually to reflect the run environments of the time:

[Figure: Single and HR values]

Tom Tango came up with these, but FanGraphs has the most updated list of wOBA constants here: http://www.fangraphs.com/guts.aspx.  To find the run value of each event (relative to the out), you divide that event’s wOBA weight by the wOBAScale constant of the same year.
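As a quick sketch of that division: the function name and the two numbers below are placeholders chosen for easy arithmetic, not actual constants from the FanGraphs page.

```python
def run_value_vs_out(woba_weight, woba_scale):
    """Convert a wOBA weight into runs relative to an out for the same season."""
    return woba_weight / woba_scale

# Placeholder inputs (hypothetical, not real constants for any season).
example = run_value_vs_out(woba_weight=2.0, woba_scale=1.25)
print(round(example, 3))  # prints 1.6
```

To use it for real, plug in the weight for the event you care about and the wOBAScale value from the same season’s row.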

Below, you’ll see five of the highest and five of the lowest years in terms of run values of singles (R/Single) and home runs (R/HR).  The run values include the value of not making an out. 

Year  R/Single  R/HR   R/G    AVG    OBP    SLG    OPS    wOBA
1930  0.8214    1.723  5.546  0.296  0.356  0.435  0.791  0.356
1936  0.7833    1.683  5.187  0.284  0.349  0.404  0.753  0.349
2000  0.7826    1.696  5.140  0.270  0.345  0.437  0.782  0.341
1994  0.7627    1.686  4.923  0.270  0.339  0.424  0.763  0.333
1950  0.7623    1.689  4.849  0.266  0.346  0.402  0.748  0.346
1908  0.6159    1.603  3.383  0.239  0.297  0.305  0.602  0.297
1968  0.6233    1.603  3.417  0.237  0.299  0.340  0.639  0.291
1917  0.6395    1.605  3.588  0.249  0.311  0.324  0.635  0.311
1943  0.6643    1.630  3.911  0.253  0.323  0.344  0.667  0.323
1976  0.6739    1.623  3.995  0.255  0.320  0.361  0.681  0.315

Notice any trends that explain why the values go up or down?  Blatant hint: look at runs per game (R/G).  More on that in the next article…
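If you’d rather check the hint numerically than by eye, here is a rough sketch that computes the Pearson correlation between R/G and each run value, using only the ten seasons in the table.  The `pearson` helper is written out by hand for illustration, and since these seasons were picked because they are extremes, the result will likely overstate how tight the relationship is across all years.

```python
from math import sqrt

# (R/G, R/Single, R/HR) for the ten seasons listed in the table.
rows = [
    (5.546, 0.8214, 1.723), (5.187, 0.7833, 1.683), (5.140, 0.7826, 1.696),
    (4.923, 0.7627, 1.686), (4.849, 0.7623, 1.689), (3.383, 0.6159, 1.603),
    (3.417, 0.6233, 1.603), (3.588, 0.6395, 1.605), (3.911, 0.6643, 1.630),
    (3.995, 0.6739, 1.623),
]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

runs_per_game = [r[0] for r in rows]
single_corr = pearson(runs_per_game, [r[1] for r in rows])
hr_corr = pearson(runs_per_game, [r[2] for r in rows])
print(f"R/G vs R/Single: {single_corr:.3f}")
print(f"R/G vs R/HR:     {hr_corr:.3f}")
```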

So, where do we go from here?  If a context-neutral stat for each season is good enough for your needs, then this is the end of the line, and you can just go with the wOBAs listed on FanGraphs.  But if you want to see how much a player impacts a particular team, or you want to analyze a hypothetical team (e.g. estimating how many runs your team will score next year based on projected stats), you’ll need to go deeper.  In the next part of the series, I’ll tell you more about that.


Steve is a robot created for the purpose of writing about baseball statistics. One day, he may become self-aware, and...attempt to make money or something?

Henry
Guest
3 years 6 months ago

This might be a stupid question, but by “assumptions that certain conditions will be at the league average” do you mean separate averages for NL/AL or is that distinction not being made?

Encarnacion's Dinosaur Arm
Guest
3 years 6 months ago

Awesome work. Do you have any idea if projection systems like Pecota currently consider team-specific variables when projecting next year’s standings?

kdm628496
Member
3 years 6 months ago

i LOVE where this is going. keep it up, steve!

Oh, Beepy
Guest
3 years 6 months ago

Excellent read. Congratulations on your staff spot, Mr. Staude.

Ty
Guest
3 years 6 months ago

This might be a dumb question, but is there any such thing as a +/- for wOBA (kind of like OPS+ or wRC+)?

Dan
Guest
3 years 6 months ago

I am intrigued. Any idea on when the following articles will be published? Not pressuring, just don’t want to miss them. Congrats. My favorite article in a while.

SrMeowMeow
Guest
3 years 6 months ago

Great addition. Best recent community posts. After three articles total, you slot in as my second-favorite regular writer. And Dave Cameron better look out if you keep this up.

In case it’s questioned, I am neither a) your relative, b) a paid shill, or c) joking. Congratulations!

pft
Guest
3 years 6 months ago

Here is a question. How much is a BB worth if a team walks a maximum of 3 times an inning for 9 innings and all outs are by strikeouts with no PB or WP?

In this example 27 walks would yield no runs. Obviously the walk is overvalued.

Seriously, I like the article but I look forward to seeing more of how the assumptions of linear weights are not very accurate for every spot in the order or for different teams/parks with different run environments, and how the current usage today is limited (e.g., limitations in determining who is MVP).

Tangotiger
Guest
3 years 6 months ago

Events are roughly independent of each other.

You are asking what would happen if, after the third walk in the inning, you are 100% guaranteed of getting strikeouts until the end of the inning. And you won’t have a WP or PB or BK or PK-error with the runner on third. And to have this repeat for nine innings (or more if extra innings).

That’s hardly independent, and so, I don’t see the point.

Joe Morgan
Guest
3 years 6 months ago

Digging even deeper, if a team hits a lot of home runs, the average value of a home run actually drops, due to more runners having been cleared from the bases by previous home runs.

Is this similar to that old adage that an announcer once said (I don’t think it was Morgan, but it might have been) that he’d rather have a double in the middle of a rally than a home run, because the double essentially “keeps the rally going”?

Brandon T
Guest
3 years 6 months ago

Of course, now we can talk about the WIN value of a home run. In a low run-scoring environment, like 1969 where a HR was only worth 1.6 runs, the run value of a home run — correct me if I’m wrong — is actually worth more wins, right?

Ed
Guest
3 years 6 months ago

Just wanted to say congratulations on your promotion! Looking forward to future articles on this subject (an interesting and important subject!).

Spit Ball
Member
3 years 6 months ago

Awesome stuff Steve. They did not hire you to the staff for no reason. When a relatively informed fangraphs reader (or so I like to think) has to read the article a few times to comprehend and fully understand, an excellent article has been written. Your mind obviously spends a great deal of its free time on all things sabermetric and ways to improve the field. That’s brains and passion combined. Keep it up. GREAT F%CK&NG HIRE. CONGRATULATIONS and THANK YOU!

Baltar
Guest
3 years 6 months ago

I am so glad to see a very good writer who is so conversant with statistics doing statistical articles for FanGraphs now. I’m really looking forward to your future articles. The possibilities are infinite.
