Spring Training Stats That Matter

As stat geeks, we are quick to tell less nerdy baseball fans that spring training stats mean nothing. Whether it’s the tiny sample size, the varying level of competition, or the experimenting with new pitches, mechanics and stances, there is a ton of noise clouding the data. Even with these obvious explanations, studies have still been performed to determine whether spring stats have any significance. Sure enough, historical studies have confirmed that spring training stats have limited value.

Several years ago, John Dewan of Baseball Info Solutions determined that hitters with more than 100 career at-bats and 36 spring training at-bats who produce a spring slugging percentage at least 200 points above their career average will often experience a power spike during the regular season. Although this study received lots of press and the theory became one of the few that people still cite today, it’s flawed. The study used slugging percentage, which includes singles, rather than isolated power, which counts only extra bases. If Michael Bourn, he of the .358 career slugging percentage, hit .500 in the spring with a .575 slugging percentage, he would meet Dewan’s power breakout criteria. Of course, that spring slugging percentage would be composed almost entirely of singles, so he should not be expected to enjoy a power surge.
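To make the distinction concrete, here is a minimal sketch of how the two stats are computed. The function names and the 40 at-bat stat line are hypothetical, chosen only so that the line works out to the .500 average and .575 slugging percentage from the Bourn example.

```python
def batting_average(singles, doubles, triples, homers, at_bats):
    """Hits per at-bat."""
    hits = singles + doubles + triples + homers
    return hits / at_bats


def slugging(singles, doubles, triples, homers, at_bats):
    """Total bases per at-bat; every single counts as one base."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * homers
    return total_bases / at_bats


def isolated_power(singles, doubles, triples, homers, at_bats):
    """Extra bases per at-bat (SLG minus AVG); singles contribute nothing."""
    extra_bases = doubles + 2 * triples + 3 * homers
    return extra_bases / at_bats


# Hypothetical spring line: 20 hits in 40 at-bats, 17 of them singles.
spring_line = dict(singles=17, doubles=3, triples=0, homers=0, at_bats=40)

print(f"AVG {batting_average(**spring_line):.3f}")
print(f"SLG {slugging(**spring_line):.3f}")
print(f"ISO {isolated_power(**spring_line):.3f}")
```

With this made-up line, the slugging percentage sits more than 200 points above .358 while the isolated power is only .075, which is why a slugging-based filter can flag a hitter who showed no real power.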

Other studies that have been done have simply looked at surface stats like ERA. However, I am not aware of any that have examined whether peripheral stats in spring training have any significance. When boldly predicting that Francisco Liriano would be a top 10 pitcher this year, I hypothesized that a pitcher’s spring strikeout and walk rates actually do mean something and may foreshadow a breakout or disappointing season. I also guessed that exceptionally strong springs were more significant than poor ones. So I decided to construct a study to test these two hypotheses.

The Study**:

I looked at 749 starting pitchers from 2007-2011 who threw at least 10 innings in the spring and 40 innings during the regular season. Each pitcher also had to have a Marcel projection to serve as a control, which accounts for the fact that Clayton Kershaw will likely have both a high spring strikeout rate and a high regular season strikeout rate. I chose to focus on K% and BB% because doing so eliminates BABIP luck, and because changes in these metrics are typically the drivers of a breakout or disappointing season.

The Results:

First, the correlations for K% between the three sets of stats:

            Season K%   Marcel K%   Spring K%
Season K%   1
Marcel K%   0.7211      1
Spring K%   0.4971      0.4489      1

Not surprisingly, the correlation between Marcel and Season is much higher (0.72) than Spring and Season (0.50). Since R-squared in a single regression is just the square of the correlation, the R-squared for predicting the Season K% using Marcel K% is 0.72^2, or 0.52. What we want to do is figure out if using any piece of a pitcher’s spring K% in conjunction with Marcel can increase that number. Running multiple regressions determined that indeed we can. The equation is:

K% = -2% + 0.90*(Marcel K%) + 0.18*(Spring K%)

The R-squared jumps to 0.56 (a correlation of about 0.75), and the p-value rounds to 0.000. Success!
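As a rough illustration, the arithmetic above can be sketched in a few lines of Python. The function name and the sample pitcher (a 20% Marcel K% and a 30% spring K%) are made up for the example, and the sketch assumes all inputs are in percentage points. The coefficients are the rounded ones quoted in the equation, so the output is approximate.

```python
import math


def blended_k_pct(marcel_k_pct, spring_k_pct):
    """Apply the rounded regression equation from the article.

    Inputs and output are assumed to be in percentage points (20.0 means 20%).
    """
    return -2.0 + 0.90 * marcel_k_pct + 0.18 * spring_k_pct


# With a single predictor, R-squared is just the squared correlation.
r_marcel_season = 0.7211
r2_marcel_only = r_marcel_season ** 2

# Going the other way: the correlation implied by the combined R-squared.
r2_combined = 0.56
implied_correlation = math.sqrt(r2_combined)

# Hypothetical pitcher: projected for a 20% K%, struck out 30% in the spring.
projection = blended_k_pct(20.0, 30.0)

print(f"Marcel-only R-squared: {r2_marcel_only:.2f}")
print(f"Correlation implied by combined R-squared: {implied_correlation:.2f}")
print(f"Blended K% projection: {projection:.1f}%")
```

For this hypothetical pitcher, a spring strikeout rate 10 points above his projection moves the blended projection to roughly 21.4%, which shows how modest the spring adjustment is.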

Next are the correlations for BB% between the three sets of stats:

             Season BB%   Marcel BB%   Spring BB%
Season BB%   1
Marcel BB%   0.608        1
Spring BB%   0.3792       0.368        1

Once again, Marcel is much better at predicting seasonal BB% than Spring is. Interestingly, BB% appears more difficult to project than K%, as both Marcel and Spring had lower correlations than in the K% table. The R-squared for predicting the Season BB% using Marcel BB% is 0.61^2, or 0.37. Like with K%, the idea now is to determine whether factoring some of a pitcher’s spring BB% into his Marcel projection increases that R-squared. Once again, it does! The equation is:

BB% = 0.87*(Marcel BB%) + 0.12*(Spring BB%)

The R-squared improves to 0.40, with another p-value that rounds to 0.000. So now we also find that a pitcher’s spring BB% does actually carry some significance.
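Here is a minimal sketch of what the BB% equation implies for an individual pitcher. The function name and the sample walk rates are hypothetical, and the sketch assumes inputs are in percentage points. It compares a very good and a very bad spring for the same Marcel projection.

```python
def blended_bb_pct(marcel_bb_pct, spring_bb_pct):
    """Apply the rounded BB% regression equation from the article.

    Inputs and output are assumed to be in percentage points (8.0 means 8%).
    """
    return 0.87 * marcel_bb_pct + 0.12 * spring_bb_pct


# Hypothetical pitcher with an 8% Marcel BB% projection.
marcel_bb = 8.0
good_spring = blended_bb_pct(marcel_bb, 4.0)   # walked 4% of hitters in the spring
bad_spring = blended_bb_pct(marcel_bb, 12.0)   # walked 12% of hitters in the spring

print(f"After a 4% spring BB%:  {good_spring:.2f}%")
print(f"After a 12% spring BB%: {bad_spring:.2f}%")
print(f"Gap between the two:    {bad_spring - good_spring:.2f} points")
```

Under these assumptions, an 8-point swing in spring walk rate moves the blended projection by just under one point, so the spring data nudges the Marcel projection rather than replacing it.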

Aside from trying to determine whether spring training K% and BB% rates mean anything for the upcoming season, I suspected that really strong performances mean more than poor ones. You cannot fluke your way into striking out a high percentage of hitters, but pitchers work on new pitches or their mechanics in the spring all the time, which can easily explain a weak performance.

Unfortunately, when I tested this, both ends of the spectrum carried the same weight. In fact, the poor performances were actually a smidge more significant than the strong performances. In other words, good and bad springs should be treated the same.

Last, I figured I would also test spring ERA one more time to see if we can glean anything from it. As expected, the results proved that it’s all noise.

Conclusions:

-Spring K% and BB% actually do mean something and may help identify breakout and bust performers for the upcoming season
-Good and bad springs carry the same level of significance and they should therefore be treated equally
-Spring ERA is completely useless

On Wednesday, I will identify which pitchers have posted the largest increases/decreases in their K% during the spring as compared to their projections. Then on Thursday I will do the same for BB%.

**A very special thanks to the amazing Matt Swartz for actually running the numbers, providing me with the results and explaining how to interpret them. You rock Matt! This also means that any math/study construction related questions should be directed at him.




Mike Podhorzer produces player projections using his own forecasting system and published the eBook Projecting X: How to Forecast Baseball Player Performance to teach you how to project players yourself. He can be heard live every Wed. night at 9 PM EST on the Fantasy Baseball Roundtable Show. He founded Pal Locale, an online community of Pals available for rent by the hour, and sells beautiful photos through his online gallery, Pod's Pics. Follow Mike on Twitter @MikePodhorzer and contact him via e-mail.

17 Responses to “Spring Training Stats That Matter”

  1. Chris says:

    I guess this question is for Matt – any goodness of fit tests performed to test the i.i.d. assumption?

    • Matt Swartz says:

      Are you concerned about normality?

      • Chris says:

        Less interested in the normality of the errors, and more interested in the assumption that the observational data used is i.i.d. – more specifically, interested in statistics which test that assumption.

  2. V says:

    Love it – I’ve always believed that the two things that you can’t fake are raw power (a player who can hit 5 HRs against MLB talent, even in spring training, meets the power threshold) and the ability to strike MLB hitters out. Those skills can change over time, but they are definitely skills, and spring performance can be analyzed for changes in skill.

  3. max says:

    What is the correlation between Spring ERA and Season/Marcel ERA?

    (I’m looking for a number here, to back up your assertion that “it’s all noise.”)

  4. MGL says:

    Matt, not COMPLETELY random. .02 or .04 is still something. As well, what is the P-value or confidence interval? It is possible that you made a Type II error, no, and that the true values are .1 or .2?

    In any case, we expect a much lower correlation with ERA, but since ERA is explicitly related to BB and K rates, of course it is not going to be completely random if K and BB rates are not.

    I think also we can infer the ERA correlation, at least an estimate of it, from the K and BB correlations, no? The only thing missing would be the HR correlation, since we know that the BABIP correlation between any two samples, let alone 10 IP of spring training and a whole season, is going to be very near zero.

    • Matt Swartz says:

      The standard errors are about .02, so the coefficients’ confidence intervals are something like (.14,.22) for Spring K% and something like (.08,.16) for Spring BB%.

  5. MGL says:

    Oh, and really good stuff, BTW! Although I am not surprised that there is some correlation and predictive effect. Why shouldn’t there be? Sure, pitchers work on stuff, they are not in mid-season form, and their competition is not quite the same as the regular season, but still, I would think that pitching is pitching. The principal limiting factor in terms of the meaning/value of ST stats is sample size. To me, 10 or 20 IP is not much of a sample to be that predictive, regardless of whether it is ST or the regular season. I would love to compare the predictive value/correlation of what you got to the same number of IP during the regular season (of course you have to exclude that sample from the regular season stats). I suspect that the ST stats are a little less predictive than the same number of IP during the reg season for the obvious reasons.

  6. pft says:

    The study did not look at ERA, so why is that part of the conclusion?

    The study showed that BB and K rates in ST are somewhat correlated with BB and K rates in the regular season. However, most people want to know if there is any meaning in terms of season runs allowed (or ERA), and BB and K rates are only part of the equation there.

    Furthermore, while the data may be significant for the population of pitchers, it is still rather meaningless for individual pitchers whose results may be driven by weaker competition, low arm strength, working on new pitches/mechanics, etc.

    • The study did look at ERA, but I didn’t publish the correlations because they were so low. My last sentence though stated “Last, I figured I would also test spring ERA one more time to see if we can glean anything from it. As expected, the results proved that it’s all noise.” The correlations I posted in a comment above:

      Spring to season ERA: 0.035
      Spring to Marcel ERA: .02

  7. MP says:

    Any chance that adjusting for location (Ariz. vs. Flo.) would help the predictiveness of the spring stats? I’m sure it’s not worth doing exact park factors, but I wonder if K and BB rates differ materially between the two ST leagues.

    • This is a good question and would require a lot more work. I am pretty sure the stadiums in Ariz. are much more hitter friendly, so park adjusting may help. But I’m not sure what effect they have on just K% and BB%.

  8. mulkowsky says:

    My long-ago college stats courses taught me that adding any variable will always increase R squared. You need to also look at adjusted R squared and the p-stat of each of the variables separately. (As a start.)

  9. I’ll try to get Matt in here to respond to the rest of the math related questions. So much for my college stats class…guess that knowledge disappeared from my brain soon after.

  10. Ben Bishin says:

    This is a very nice start. I have a couple of suggestions:

    1. One potential problem is that essentially you are fitting these observational data and then trying to use them to predict future performances. All these really tell us is that your model fits *these* data quite well, but we don’t really have any sense of how the model plays out going forward. It would be more compelling if you randomly took half your cases to fit the original model and then used the coefficients from that model to predict the other half of the cases. We could then see what the R2 would be on this other half to see if it retains its predictive power.

    2. It could be the case that once you control for K%, BB% becomes insignificant, so you may want to run a model with both included.

    3. The magnitude of the BB% coefficient is about 2/3 of the size of the K% coefficient, suggesting that K% has an effect about 50% larger than BB%.

    4. Finally, it seems that these effects are relatively small. If I understand your metrics correctly, a 1% increase in Spring K% corresponds to a .18% increase in regular season K% (with Marcel held constant). So a 10% increase in Spring K% corresponds to a 1.8% increase in regular season K%.

    Despite my suggestions, this is a very interesting first step. Well done!
