Also, would it make sense to throw in HR/inning since we’re discussing things that raise pitch counts? Pitchers can (theoretically) control HRs, and they definitely increase pitch counts.

]]>P/batter = 1.785(K%) + 3.670(BB%) + 3.187

The big difference is moving it from PA to IP
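For anyone who wants to plug numbers in, here is a minimal Python sketch of the per-batter estimate. It assumes K% and BB% are fractions of plate appearances (0.20 for 20%) and that the walk coefficient is 3.670, since a literal 3670 would give thousands of pitches per batter. The function names are made up for illustration.

```python
def pitches_per_batter(k_rate: float, bb_rate: float) -> float:
    """Per-batter estimate: 1.785*K% + 3.670*BB% + 3.187.

    Assumption: rates are fractions of PA, and the walk
    coefficient is 3.670 rather than the printed 3670.
    """
    return 1.785 * k_rate + 3.670 * bb_rate + 3.187


def estimate_season_pitches(pa: int, so: int, bb: int) -> float:
    """Scale the per-batter estimate up to a season total."""
    if pa <= 0:
        raise ValueError("pa must be positive")
    return pa * pitches_per_batter(so / pa, bb / pa)
```

A pitcher with no strikeouts and no walks comes out at 3.187 pitches per batter, which is the intercept.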

Using the above equation on 2011 data (I used 2008 to 2010 data to fit my equation), here are the results:

Estimated Tango: 2889 pitches

Estimated Zimmerman: 2925 pitches

Actual: 2926 pitches

R-squared for the data and results

Tango (96.2%)

Zimmerman (96.5%)

Almost the same results. My per-IP equation will be off by more than the per-batter one, but I find it a little more useful for IP predictions.
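The comment doesn't say how the R-squared figures were computed. One common reading is the squared Pearson correlation between the estimated and actual pitch counts, and the sketch below assumes that reading. The function name is hypothetical.

```python
def r_squared(actual, estimated):
    """Squared Pearson correlation between two equal-length sequences.

    Assumption: this is what "R-squared" means in the comment;
    it may instead be a regression R-squared.
    """
    n = len(actual)
    if n != len(estimated) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_a = sum(actual) / n
    mean_e = sum(estimated) / n
    cov = sum((a - mean_a) * (e - mean_e) for a, e in zip(actual, estimated))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    if var_a == 0 or var_e == 0:
        raise ValueError("a constant sequence has no defined correlation")
    return cov * cov / (var_a * var_e)
```

Note that this measures how well the estimates track the actuals, not whether they are biased high or low.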

]]>You’re probably just wording things poorly here, but that doesn’t jibe at all with your formula.

Your formula says “Pitchers who don’t strike out or walk anyone average 13.5 pitches per inning and those numbers…”

]]>Also, since the fg data only went back to 2002, the sample shrank from 715 to 592, plus it looks like another 19 player seasons didn’t match up [likely players who have a Jr or something in their name; had to join baseball-databank ‘first’ and ‘last’ name fields to the fg name data.]

Anyways, on average the formula from this thread was off by 122 pitches and Tango’s by 93.

Splitting up into three equal piles, by (SO+BB) rate:

The top group averaged being off by 126 pitches with this formula and 78 with Tango.

The middle group, 120 and 93. The bottom group, 120 and 108.

So, this method was about as good at estimating pitches regardless of SO+BB rates, while Tango’s improves as SO+BB rates increase.
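A rough sketch of the comparison described above, for anyone who wants to rerun it on their own data: sort the player seasons by (SO+BB) rate, cut them into equal piles, and take the average absolute miss in each pile. The row layout and function name are assumptions, not what was actually used.

```python
from statistics import mean


def mean_abs_error_by_group(rows, n_groups=3):
    """Average absolute error per group, highest (SO+BB) rate first.

    rows: iterable of (so_bb_rate, actual_pitches, estimated_pitches).
    Leftover seasons go to the top groups when the split isn't exact.
    """
    ordered = sorted(rows, key=lambda r: r[0], reverse=True)
    if len(ordered) < n_groups:
        raise ValueError("need at least one row per group")
    size, extra = divmod(len(ordered), n_groups)
    errors = []
    start = 0
    for g in range(n_groups):
        end = start + size + (1 if g < extra else 0)
        chunk = ordered[start:end]
        errors.append(mean(abs(actual - est) for _, actual, est in chunk))
        start = end
    return errors
```

Run it once per formula on the same rows and compare the resulting lists pile by pile.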

]]>The top 20% in (SO+BB) rate averaged 32.1% and were estimated to have 2.5% more pitches than Tango’s formula.

The next group averaged 0.7% more estimated pitches.

The middle group, -0.1%

Next, -1.4%

Last, -2.0%

Granted, without a database full of actual pitch count data [which I don’t have], there’s no way to tell which is the more accurate estimator…

]]>pitches = 3.3×PA + 1.5×SO + 2.2×BB. Looking at 2000-2010, min 180 IP, here are the guys that the two formulas disagree about the most [per 180 IP]:

2000 Pedro +394

2004 JSantana +274

2004 RJohnson +272

2002 Pedro +264

2001 RJohnson +247

2003 JSchmidt +233

2005 Pedro +231

2005 JSantana +213

2002 RJohnson +213

2002 Schilling +209

2005 Clemens +206

2009 Lincecum +200

…

2002 Sturtze -200

2004 Lowe -209

2000 Haynes -209

2006 Silva -212

2005 Francis -212

2003 JJennings -213

2002 Sparks -217

2004 Lohse -217

2000 Lima -223

2004 JJennings -242

So, at least relative to Tango’s stat, this metric is assigning a lot more pitches to guys with high SO+BB rates and a lot fewer pitches to those with low SO+BB rates.
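The per-180-IP gaps listed above can be reproduced along these lines. This sketch covers Tango’s estimator (3.3×PA + 1.5×SO + 2.2×BB) and the rescaling step only; the other formula’s estimate is passed in as a number. It assumes IP is a true decimal (200.333, not the box-score 200.1), and the function names are hypothetical.

```python
def tango_pitches(pa: float, so: float, bb: float) -> float:
    """Tango's estimator: 3.3*PA + 1.5*SO + 2.2*BB."""
    return 3.3 * pa + 1.5 * so + 2.2 * bb


def disagreement_per_180_ip(other_estimate: float, pa: float, so: float,
                            bb: float, ip: float) -> float:
    """(other estimate - Tango estimate), rescaled to 180 innings.

    Assumption: ip is decimal innings, not box-score notation.
    """
    if ip <= 0:
        raise ValueError("ip must be positive")
    return (other_estimate - tango_pitches(pa, so, bb)) * 180.0 / ip
```

A positive number means the other formula assigns more pitches than Tango’s, as with the high-strikeout seasons at the top of the list.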

]]>