I get that the conversion needs to be scaled, but why not scale the average to 10? Or is the AL 10.17, and the whole MLB scaled to 10?

Comment by BJ — January 14, 2009 @ 6:02 pm

Basically why add 2 and then multiply by 1.5? Where does that come from?

Comment by BJ — January 14, 2009 @ 6:08 pm

The scale isn’t made up. It’s based on the amount of runs needed to create a win. I don’t really love pimping my own work, but check here:

http://statspeak.net/2009/01/when-ten-runs-isnt-a-win.html

Comment by dan — January 14, 2009 @ 6:37 pm

Please file this under “no question too dumb”:

If a hitter creates a run or runs in a game where a pitcher with a low personal run environment is pitching, wouldn’t that run or runs created have higher value?

IOW, should (and does) a hitter get extra win value credit for creating a run when a pitcher like C.C. is on the mound, vs. when a pitcher like Livan is out there?

Comment by shoewizard — January 14, 2009 @ 7:07 pm

So if I’m reading this right, a pitcher’s value is inversely proportional to about the square of his FIP, since a low-FIP pitcher makes each run more valuable and thus each run he prevents more valuable?

Comment by Jake — January 14, 2009 @ 8:15 pm

I actually read that before, and being a Yankee fan I also enjoy your work at PMA. OK, if I use your graph, it would seem like a league average 4.78 RA would equal a 9.56 RPG run environment, which would seem to indicate a runs/win of less than 10, while the formula states it is 10.17.

I am sure there is a logic behind this, but I am not sure what I am missing. I get that averaging the pitcher’s RA and the league’s is like getting the RA for a game environment between the pitcher and a league average opposing pitcher, but I don’t see where the +2)*1.5 comes from.

Comment by BJ — January 14, 2009 @ 8:29 pm
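
For readers following BJ’s question, the converter under discussion can be written out directly. This is only a sketch: the function name `runs_per_win` is made up here, and the formula is reconstructed from the “+2)*1.5” fragment and the 4.78 to 10.17 example quoted in the thread, not copied from the article itself.

```python
def runs_per_win(league_ra: float, pitcher_ra: float) -> float:
    """Linear runs-to-wins converter as quoted in the thread (hypothetical name)."""
    # Average the pitcher's RA with the league RA: the per-team run
    # environment of a game between this pitcher and an average opponent.
    env = (league_ra + pitcher_ra) / 2
    # Add 2, then multiply by 1.5.
    return (env + 2) * 1.5


# League-average pitcher in a 4.78 RA league.
rpg = 4.78 + 4.78  # combined runs per game for both teams
print(round(runs_per_win(4.78, 4.78), 2))
# Expanding the brackets gives the same line in terms of RPG.
print(round(0.75 * rpg + 3, 2))
```

Both prints should agree, since (RPG/2 + 2) * 1.5 expands to .75*RPG + 3. The “+2” and “1.5” are therefore just the intercept and slope of a straight line fitted to runs per win.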

Why can’t a great hitter like Pujols increase his run environment?

Comment by NickP — January 14, 2009 @ 10:58 pm

Hey guys.

I’m loving these results, but I wanted to point out that the 2008 numbers seem a bit off. Unless I am missing something, I don’t think Kenny Rogers (5.22 FIP in 173.2 IP) was a 2 win pitcher. Running the numbers really quickly without park effects (even though I’m assuming Comerica to be right around neutral), I’ve got him at 0.7, which seems to fall more in line with past results. Along the same lines, Nate Robertson and Justin Verlander seem like their win totals are too high as well, especially compared with past seasons. I look forward to seeing the rest of the explanations, as you will probably explain these results. Nice job on this project.

Comment by Eddie — January 14, 2009 @ 11:14 pm

Tango is usually dead-on, but I think the win converter formula is consistently a little too high, and quite a bit too high in very extreme low run environments.

A better converter seems to be:

((League RA + Pitchers RA)^0.72) x 2

For the example given in the article:

((4.78 + 4.78)^0.72) x 2

= (9.56)^0.72 x 2

= 5.08 x 2

= 10.16

The article formula is only 0.01 higher at that run level, but it overestimates by more at lower run levels.

For example, at 3.00 and 3.00 instead of 4.78 and 4.78:

My Formula = 7.27

Article Formula = 7.50

Difference = 0.23 higher

Although these are not large differences in most ‘normal’ environments, why use something that’s less accurate, when the calculation is not more complicated?

Comment by KJOK — January 14, 2009 @ 11:19 pm
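
KJOK’s comparison is easy to rerun. The sketch below puts the two converters side by side at the two run levels he quotes; the function names are invented for illustration, and the article formula is the one reconstructed from the thread.

```python
def rpw_article(league_ra: float, pitcher_ra: float) -> float:
    # Linear converter: ((average RA) + 2) * 1.5
    return ((league_ra + pitcher_ra) / 2 + 2) * 1.5


def rpw_kjok(league_ra: float, pitcher_ra: float) -> float:
    # KJOK's proposal: ((League RA + Pitcher's RA) ^ 0.72) x 2
    return 2 * (league_ra + pitcher_ra) ** 0.72


for ra in (4.78, 3.00):
    a = rpw_article(ra, ra)
    k = rpw_kjok(ra, ra)
    print(f"RA {ra:.2f}: article {a:.2f}, KJOK {k:.2f}, gap {a - k:.2f}")
```

The gap between the two grows as the run environment falls. That shows the formulas disagree at the extremes, but not which one is closer to actual results.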

I know how to derive pitcher winning%. I also know how to get from winning percentage to WAR. But how do you get value runs?

Comment by David — January 15, 2009 @ 12:07 am

That formula I provided was done years ago. I can retest to see what a better fit would be.

Comment by tangotiger — January 15, 2009 @ 8:22 am

Couldn’t we just use Pythag?

4.78+4.78=9.56

9.56^.28=1.88

There’s your exponent.

Then, 4.78*162=774.36.

So:

774.36^1.88/(774.36^1.88+774.36^1.88)=.5

Then

774.36^1.88/(774.36^1.88+y^1.88)=.506

Solve for y, and you get 764.54. 774.36-764.54=9.82.

(For a little more accuracy, we could just recurse over it a few times, perhaps. But refigure the exponent for the new RA value and you’re at the same number when rounding to two significant digits.)

Comment by Colin Wyers — January 15, 2009 @ 12:10 pm
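
Colin’s steps can be solved in closed form rather than by trial and error. The sketch below is one way to do it; the function name and its optional `win_pct` and `exponent` arguments are assumptions added here so that the rounded figures in the comment can be reproduced.

```python
from typing import Optional


def pythag_runs_per_win(ra_per_game: float, games: int = 162,
                        win_pct: Optional[float] = None,
                        exponent: Optional[float] = None) -> float:
    """Runs an average team must prevent to gain one win (hypothetical helper)."""
    rpg = 2 * ra_per_game                 # both teams combined
    # Pythagorean exponent from the run environment: RPG ^ .28
    x = rpg ** 0.28 if exponent is None else exponent
    runs = ra_per_game * games            # season runs scored by an average team
    # Target winning percentage: one win above .500 unless overridden.
    w = 0.5 + 1 / games if win_pct is None else win_pct
    # Solve runs^x / (runs^x + y^x) = w for y (season runs allowed).
    y = runs * ((1 - w) / w) ** (1 / x)
    return runs - y


# Rounded inputs, as in the comment (.506 and 1.88).
print(round(pythag_runs_per_win(4.78, win_pct=0.506, exponent=1.88), 2))
# Unrounded inputs (1/162 and 9.56 ** .28).
print(round(pythag_runs_per_win(4.78), 2))
```

With the rounded inputs this should match the 9.82 in the comment. Leaving 1/162 and the exponent unrounded gives a figure of roughly 10.1, so most of the distance from the article’s 10.17 appears to come from rounding the target to .506.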

Colin: the point of the Runs-to-win converter is in its simplicity and linearity. There’s no question that I would choose the Tango Distribution first, the PythagenPat as a close second, then everything else far down, if ease-of-implementation was not a concern.

Comment by TangoTiger — January 15, 2009 @ 1:20 pm

Using the data here as my source:

http://www.tangotiger.net/winactuals.html

Then the results of KJOK and my equation are virtually identical.

So, I don’t know the basis for KJOK’s claim that either formula is more accurate than the other.

In the above link, the average error for me is .00325 wins per game, while it’s .00335 for KJOK’s. I have no doubt that I can tweak KJOK’s parameters, while keeping the same framework, and get an error below mine. But, how much better could it be?

When you have two models that spit out similar results, choose the one that’s easier.

Comment by TangoTiger — January 15, 2009 @ 1:53 pm

If I change KJOK’s parameters to .73 and 1.97, then KJOK’s model comes in at .00324 error.

If I change mine to 1.6, 1.6, my model comes in at .00317 error.

Again, I don’t see it.

Comment by TangoTiger — January 15, 2009 @ 2:02 pm

Defining RPG as the runs per game for both teams, Tango’s formula is .75*RPG + 3.

.75*RPG + 2.75 is a better match (in fact, it should be the best match) for the Pythagenpat-based formula that KJOK is using. Any of them (the Pythagenpat RPG formula, Pythagenpat itself, +3, +2.75) will have nearly identical results with real teams.

Comment by Patriot — January 15, 2009 @ 3:02 pm

In that 20-group bin link I provided, a best-fit where I use .75 as one parameter, sets the other parameter as 3.01.

If someone wants to take the team-level data (say since 1919), and construct a best-fit linear equation, I’d encourage someone to do so. It’s possible that at that level Patriot is right.

Comment by TangoTiger — January 15, 2009 @ 3:14 pm

Ah, what the heck. Taking the 1,908 teams since 1919, and using games played rather than innings played, here we go.

If I force the .75 to stick, then I get an average error of .0202 wins, if the second parameter in the Patriot equation is anywhere from 2.66 to 3.09. That is, the error goes up to .02025 at those two extreme points. The absolute minimum point is at 2.86, where you get an error of .020198.

If I force it to stick to .80, then I get an average error of .02021 if the other parameter is 2.41.

If I force it to stick to .70, then I get an average error of .020194, with the other parameter as 3.31.

As you can see, at the team-level, where so many teams are stuck at the .500 level, and all at similar run environments, it doesn’t matter what you use, since you will get an average error of .0202 wins.

The binning as I do in the earlier link highlights the extreme run environments, and so makes the equation more sensitive. I have no problem if David wants to use anything he wants along these lines. But, I cannot see how a non-linear equation improves anything, other than possibly a smidge.

Comment by TangoTiger — January 15, 2009 @ 3:30 pm

For 1961-2003 (excluding ’81 and ’84), the RMSE for W% times 162 are:

Pythagenpat (x = .29) 3.950

Pythagenpat RPW (2*RPG^.71) 3.952

“+2.75” 3.951

“+3” 3.949

The “+2.75” version is the tangent line to the Pythagenpat RPW function at the point RPG = 9 (i.e. near the long-term average RPG). The actual average for that time frame is 8.74 RPG, and at that point the tangent line is .757*RPG + 2.70, which has an RMSE of 3.951.

Comment by Patriot — January 15, 2009 @ 3:31 pm
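
Patriot’s tangent-line claim can be checked with a few lines of calculus. This sketch differentiates the Pythagenpat RPW function quoted above (2*RPG^.71) and reads off the slope and intercept of the tangent at a chosen RPG; the function names are made up for illustration.

```python
def pythagenpat_rpw(rpg: float, z: float = 0.71) -> float:
    # Pythagenpat-based runs per win, as quoted in the thread: 2 * RPG^.71
    return 2 * rpg ** z


def tangent_line(rpg0: float, z: float = 0.71) -> tuple[float, float]:
    """Slope and intercept of the tangent to pythagenpat_rpw at rpg0."""
    # d/dRPG of 2 * RPG^z is 2 * z * RPG^(z - 1)
    slope = 2 * z * rpg0 ** (z - 1)
    # The tangent passes through (rpg0, pythagenpat_rpw(rpg0)).
    intercept = pythagenpat_rpw(rpg0, z) - slope * rpg0
    return slope, intercept


for rpg0 in (9.0, 8.74):
    slope, intercept = tangent_line(rpg0)
    print(f"RPG {rpg0}: {slope:.3f}*RPG + {intercept:.2f}")
```

At RPG = 9 the line comes out very close to .75*RPG + 2.75, and at 8.74 it should round to the .757*RPG + 2.70 quoted above.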

Good job. So, that pretty much settles it then, that the “+3” is indeed the best-fit for Patriot’s time period, and close enough for my time period.

Comment by TangoTiger — January 15, 2009 @ 4:29 pm

I dare you guys to make the site more awesome and make me more obsessed. (Good luck with that)

Comment by Samg — January 15, 2009 @ 9:03 pm

I should have known Tango was right all along. I was using Pete Palmer’s numbers as my ‘standard’ as I thought they were ‘empirical’ Runs to Wins factors (from Total Baseball), but Palmer was apparently using the equation RPW = 10*sqrt(RPG/9), which I guess just happens to come a little closer to my equation than to Tango’s at lower run levels.

I stand corrected.

Comment by KJOK — January 15, 2009 @ 9:04 pm

It is, but not a lot – there are eight other players in the lineup that determine a lot of the run environment. You can measure that value using theoretical team BaseRuns, but the difference isn’t that bad.

Comment by Colin Wyers — January 15, 2009 @ 9:57 pm

KJOK: I enjoyed rechecking that anyway, and Patriot is going to do a bit more on this, so I think we’ll all be better off for this.

Comment by tangotiger — January 15, 2009 @ 10:19 pm

Colin,

can you take a look at my question two posts above Nick’s?

It seems to me that if pitchers are going to receive credit (or blame) for creating their own positive or negative run environment, there needs to be some strength of competition adjustment added to the hitters win value formula.

If Stephen Drew smacks a 2-run double off of Jake Peavy, it’s more “valuable” than hitting a 2-run double off of Livan Hernandez, because of the different run environments each pitcher creates.

Conceptually, I realize this is all in the realm of WPA, however since the personal run environment factor is being introduced into the win values calculation for pitchers, it seems a counterbalance is needed for the hitters. Rather than that be a personal run environment for hitters, as Nick suggested, it seems some sort of strength of competition would be more appropriate.

I don’t know how this works (or doesn’t) mathematically. I’m just conceptualizing. Thanks for taking the time to respond, and I’m sorry if my question is born out of some fundamental lack of understanding of what’s going on here. I’m trying my best to keep up!

Thanks to everyone involved in the projects going on, and the efforts to increase baseball understanding.

Comment by shoewizard — January 16, 2009 @ 11:25 am

The thing is, it almost always evens out for hitters.

Comment by Samg — January 18, 2009 @ 2:42 pm

I guess it’s just ignored with the reasoning that hitters can’t control when they create their runs, against better or worse pitchers. Plus, it would be an insane amount of work to calculate.

Comment by Blueyays — September 4, 2013 @ 6:33 pm