Win Values Explained: Part Five

For the last couple of days, we’ve been talking about the different components of the Win Value system. However, you may have noticed that we’ve been dealing entirely in runs. wRAA, UZR, the position adjustment, and replacement level are all expressed in terms of runs, not wins, and that’s why the sum of those numbers is categorized under Value Runs.

So, if all of our metrics deal in terms of runs, but we want to get to wins, we need to know how many runs it takes to make a win. This is actually quite a bit easier than it sounds, thanks to the pythag formula for expected win-loss records. For those not aware of pythag, you can get a good estimate of a team’s winning percentage by dividing the square of runs scored by the sum of the square of runs scored and the square of runs allowed. Or, in formula terms:

Pythagorean Winning Percentage = RS^2 / (RS^2 + RA^2)

So, if a team scored 775 runs and allowed 775 runs, they’d have a .500 Pythag Win%, or 81 wins and 81 losses. Even amounts of runs scored and runs allowed should lead to something like an even record. Not as scary as it sounds.

What happens if we subtract 10 runs from the runs scored column, so that we now have a 765 RS/775 RA team? Pythag spits out a .4935 win%, and .4935 * 162 = 79.95 wins. So, instead of 81 wins, you’re now expected to win just barely less than 80. By subtracting 10 runs, you lost a fraction more than one win.

The same thing happens if you add 10 runs to the runs allowed column: a 775 RS/785 RA team comes out at roughly .4936, or 79.96 wins. How about if you add 10 runs scored instead, so we have a 785/775 team? That gives a .5064 win%, or 82.04 wins. Again, 10 runs added equals about one win gained.
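
If you’d rather check that arithmetic yourself than take my word for it, here is a minimal Python sketch of the calculation. The function names, the 162-game constant, and the scenario labels are just illustrative choices made for this example, not part of any published implementation.

```python
GAMES = 162  # length of a full regular season (assumed for this example)


def pythag_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Pythagorean winning percentage: RS^x / (RS^x + RA^x), with x = 2."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def expected_wins(runs_scored, runs_allowed, games=GAMES):
    """Expected wins over a season of `games` games."""
    return pythag_win_pct(runs_scored, runs_allowed) * games


# The even 775/775 team, plus the three 10-run changes discussed above.
baseline = expected_wins(775, 775)
scenarios = {
    "10 fewer runs scored": (765, 775),
    "10 more runs allowed": (775, 785),
    "10 more runs scored": (785, 775),
}

print(f"baseline 775/775: {baseline:.2f} wins")
for label, (rs, ra) in scenarios.items():
    wins = expected_wins(rs, ra)
    print(f"{label} ({rs}/{ra}): {wins:.2f} wins ({wins - baseline:+.2f})")
```

Each 10-run change moves the expected win total by a little more than one win in the matching direction, which is the whole point of the exercise.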

For an even more precise look at the issue, you could use the improved PythagenPat method, which replaces the fixed exponent of 2 with one that adjusts to the run environment, but the conclusion is going to be essentially the same: 10 runs = 1 win.
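
For the curious, here is a rough Python sketch of that check. It assumes the commonly cited form of PythagenPat, where the exponent is (runs per game, both teams combined) raised to the 0.287 power; that constant varies a little between sources, so treat it as an assumption. The function names and the 10-run step are likewise made up for this example.

```python
def pythagenpat_exponent(runs_scored, runs_allowed, games, power=0.287):
    """Exponent = ((RS + RA) / G) ** power; 0.287 is an assumed, commonly cited value."""
    runs_per_game = (runs_scored + runs_allowed) / games
    return runs_per_game ** power


def pythagenpat_win_pct(runs_scored, runs_allowed, games, power=0.287):
    """Winning percentage using the run-environment-dependent exponent."""
    x = pythagenpat_exponent(runs_scored, runs_allowed, games, power)
    rs = runs_scored ** x
    ra = runs_allowed ** x
    return rs / (rs + ra)


def runs_per_win(runs_scored, runs_allowed, games=162, delta=10):
    """Estimate runs per win by adding `delta` runs scored and measuring wins gained."""
    base = pythagenpat_win_pct(runs_scored, runs_allowed, games) * games
    bumped = pythagenpat_win_pct(runs_scored + delta, runs_allowed, games) * games
    return delta / (bumped - base)


# Compare a low, a typical, and a high run environment for an average team.
for runs in (600, 775, 900):
    print(f"{runs} RS / {runs} RA: about {runs_per_win(runs, runs):.1f} runs per win")
```

For the 775/775 team used above, this lands very close to 10 runs per win. The figure drifts somewhat lower in low-scoring environments and somewhat higher in high-scoring ones, but 10 is a perfectly good rule of thumb.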

So, when you see value expressed in runs, but you want it in wins, just divide by 10. Likewise, if you see it in wins but you want it in runs, multiply by 10. It might sound like a cheap trick, but it’s reality – 10 runs add up to a win. A +20 run player is a +2 win player.




Dave is a co-founder of USSMariner.com and contributes to the Wall Street Journal.

14 Responses to “Win Values Explained: Part Five”

  1. dan says:

    Not sure if there’s a part six, but you might want to throw in the run environment part of that as well–the translation for runs to wins changes slightly depending on the run environment.

    • Dave Cameron says:

      I’m mostly trying to keep these basic enough for people who want to understand that the system works but aren’t going to try to re-engineer it on their own. So, I’m going to skip a lot of little details like that, and figure that the people for whom that kind of information matters will look at the more in-depth pieces written by guys like Tango.

      • studes says:

        That’s kind of too bad, David. I think we really suffer from not being able to go to one place to understand these new stats (and you know how hard that is for me). Blog entries aren’t a good way to organize this content; that’s not what they’re meant to do. And if you’re only covering the “superficial” level of explanation, without at least links to more detail, that’s not as helpful as it could be.

        I would suggest that, in your Glossary, you guys make a list of these blog entries, along with a list of blog entries that cover the details in more depth. I really think we need a central place that includes all the links to all the details, in a reader-friendly layout, in order to gain acceptance for your new stats.

      • Dave Cameron says:

        Yeah, we’ll end up doing something like that. I’m not trying to discourage discussion of the more in-depth parts of the system; I was just saying that, due to the nature of blog posting, I wasn’t going to be able to tackle everything in these write-ups.

  2. Terry says:

    This addition to FanGraphs is a rout as far as a major win for fans goes, because it draws attention to the best saber method of evaluating player worth available (and it’s so readily accessible that it hurts). This is pretty huge, and FanGraphs has taken leaps forward in the last month or so, from being a really, really good site to basically being one of the premier resources for baseball fans on the web.

    All of that well-deserved praise aside, the win values suffer from reliance on a single defensive measure and while they will likely be quoted very often as absolute truth (or maybe more softly put, as the final answer on the issue in discussions in blogs and boards across the web), they are at best a “rough estimate” starting point IMHO because of the way it handles defense (ironically a similar flaw as BP’s traditional approach to the problem of evaluating total player worth).

    Am I overstating this perceived weakness, i.e., complaining that the home run only cleared the fence rather than smashing a windshield in the parking lot?

    • philosofool says:

      This is certainly a good point. IMO, the best way to handle defensive stats is to average all the PBP data you have available. PMR-to-runs is available for free if you’re willing to search for it. And +/- (which has to be multiplied by .8 to get runs rather than plays) is available from Bill James’s low-cost site. The average of those should, I think, be regarded as reliable. However, when you find more-or-less consensus among your PBP values, just using UZR for your WAR projection will work fine. Any player in baseball can rise or fall .5 WAR from one season to the next, so we shouldn’t split too many hairs when producing this estimator.

      Also, even if our projection of defensive run prevention is less reliable than our projection of run production, it’s important not to do what we did for years and pretend like everyone is the same defender.

    • tangotiger says:

      Yes, Terry, you are overstating the case.

      There is uncertainty in ALL sample measures, without exception. The only thing that you can complain about is that Fangraphs doesn’t publish the uncertainty level.

      But, given the choice between what Fangraphs is showing, and what anyone anywhere is publicly showing, it’s no contest.

      You’ve got yourself a triple, and you are complaining that it’s not a Homerun, when the alternative is a single, or worse, talk-radio strikeouts.

  3. Terry says:

    Let’s pick on Dunn for a bit because he has a universally held reputation for being a very bad defender (so many people might take his -22 run rating at face value). He’s also a big-name FA this offseason, so his signing will undoubtedly generate a lot of commentary. I’m also a big fan of the Mariners/Reds/Rays, so it’s within the realm of possibility that he’ll still be on one of my favorite teams in ’09, which makes his value an important issue to me for selfish reasons.

    Anyway, UZR hated Dunn in ’08 (using either data set), but really UZR was much more down on Dunn’s glove last season than other systems. For instance, Dewan’s +/- agreed that Dunn was a bad defender but suggested he was roughly -10 runs bad (or maybe a little worse). JinAz’s system, using a different approach, pretty much agreed that Dunn was bad to the tune of roughly -10 runs. However, PMR basically thought Dunn was roughly a neutral defender in ’08. The CHONE projection for Dunn in ’09 is that of a -13 run defender (I’m going on memory here). It seems reasonable, looking at such a survey, to suggest Dunn’s true defensive value in ’08 was something closer to -15 runs at the worst and maybe as high as -10 runs (still no great shakes but dramatically different than -22).

    Assuming that argument is at least plausible, then the FanGraphs win value for Dunn’s ’08 could be off by something close to a win just from the way it handled Dunn’s defense (even applying a +/- 5 run margin of error to his UZR might at best still significantly lowball him and at worst lead to dramatically undervaluing him). Basically the disparity would be the difference between concluding the Reds paid Dunn almost exactly what they should’ve and concluding they dramatically overpaid him.

    To me that’s a little more than griping that a triple wasn’t a homer.

    Once again, a million kudos for making win values so easily and readily accessible. I think it’s the gold standard approach (and BTW, thank you Tango for being so unbelievably generous with your work). Win values are now in the mainstream baseball fan conversation, and that in and of itself is a huge step forward. It’s just that I think we should double-check the defensive part of the equation when relying only on a single metric before quoting the figure verbatim. That’s all I’m saying.

    • I’m not entirely sure I see the point of using three different defensive systems that use the same data. You wouldn’t average different offensive metrics like Base Runs, Runs Created, or wRC to get a better idea of how a player did on offense.

      If you want to average in a system that uses STATS data, or some other data source, then I’d buy into that, though unfortunately that won’t be happening anytime soon.

      • Dave Cameron says:

        David is totally right. Justin’s system and the CHONE defensive projections are secondary data, based off things like UZR and PMR. When trying to question the validity of those primary sources, using dependent secondary sources as counter evidence doesn’t work.

      • Terry says:

        Ignore JinAz and CHONE then…

        Using PMR’s estimate, Dunn was worth about $18M in ’08. Using FanGraphs’ UZR estimate, he was worth about $8M.

        Like I said, this isn’t meant to be argumentative and maybe even picking out Dunn happens to be a case of highlighting a fairly rare case, but that’s a large enough gap for even Bavasi’s moving van to drive through.

        In the end, it’s really about estimating what his value will be next season…

        I think Tango is exactly right, we’re missing a “brackets”.

      • Dave Cameron says:

        I know we had this conversation on USSM, but this bears repeating.

        Adam Dunn’s bUZR, 2005 to 2008: -16, -12, -15, -23
        Adam Dunn’s sUZR, 2005 to 2007: -7, -19, -29
        Adam Dunn’s +/-, 2006 to 2008: -18, -28, -23

        You keep acting like we have one data point that says Dunn’s bad and one data point that says Dunn is okay. We don’t – we have about 4,000 data points that say Dunn is a horrible fielder and one that says he isn’t. PMR is the outlier. There’s no reason to give it 50% weight here. It’s wrong.

  4. tangotiger says:

    Terry: thanks for the kind words.

    I wouldn’t call a 10-run difference “dramatically different”. What to say of STATS-based UZR and BIS-based UZR having a 112 run difference for Andruw Jones over six years? Super-duper dramatic?

    I could buy into averaging BIS-based UZR and BIS-based PMR, because they have different methodologies (even though they have very similar inputs). UZR and PMR and everyone else agree on the number of outs Adam Dunn made. The question is understanding the context under which he made those outs (who the batter, pitcher, park, runners, and outs were, and exactly where and how hard the ball went), and how many outs an average player would have made under that same context.

    If for example Adam Dunn made 235 outs in LF/RF, and MGL/UZR says that an average OF would have made 260, and Pinto/PMR says 255 (or whatever he is saying), and Dewan/PM says 250, is that so different?

    This is really the chief issue here. There’s an uncertainty level, and that’s what’s really missing.

  5. Matthias says:

    I realize I’m way late on this discussion, but I’d like to point out how starting pitchers and position players have a very different distribution of RAR. A pitcher’s RAR is concentrated in approximately 35 games, while a position player is likely to spread his RAR out over 150 games (for a full season, obviously).

    Using pythag W/L%, if we spread all the runs saved over all 162 games, we can get a very different value than spreading them out just over the 35 starts. As an example, say FIP leads us (I think this is what Fangraphs uses for WAR, right?) to believe that Halladay has saved the Phillies 56 runs over a replacement pitcher in 27 starts, and the Phillies have played a total of 130 games.

    We could say that an average NL team scores about 578 runs in 130 games (9-inning games for simplicity’s sake) and then check the difference in Pyth W/L when those runs are taken away from the Runs against part of the formula. My math gives me a WAR of 6.6.

    Now instead, say we only focus on those 27 starts. Halladay averages 7 2/3 innings per start. So I set his team’s Runs Scored to the average, or about 4.45 R/9, and then set his team’s Runs Against to the weighted average of his FIP and an average bullpen’s FIP. Halladay pitches 85% of every 9 innings when he starts, so I would weight the RA accordingly. Now we’ll get a difference in pyth W/L based only on those 27 games. I can’t finish the example very well because I’m not sure where all the replacement levels are, but I hope my point is understood. I have found in a few examples using my own R/9 formulas that the difference between the two methodologies can be as much as a full win or two.

    Does this defeat the simplicity and “presentability” of WAR? Have I botched the methodology? I just think 10 runs = 1 win is a little too simple to use for both hitters and pitchers (especially relievers).

    Thanks again for all the work you do :-)
