Sample Size
Description:
So we have all of these statistics, but when can we use them? Suppose a player goes three for three in their first game in the big leagues. Should we expect this player to continue batting 1.000 for the rest of the season? Of course not; that’d be silly. Three at-bats is far too small a sample to draw conclusions about a player, but that leaves us with a question: at what point do statistics become reliable? For the answers, see below:
Offense Statistics:
- 50 PA: Swing%
- 100 PA: Contact Rate
- 150 PA: Strikeout Rate, Line Drive Rate, Pitches/PA
- 200 PA: Walk Rate, Ground Ball Rate, GB/FB
- 250 PA: Fly Ball Rate
- 300 PA: Home Run Rate, HR/FB
- 500 PA: OBP, SLG, OPS, 1B Rate, Popup Rate
- 550 PA: ISO
Pitching Statistics:
- 150 BF: K/PA, Grounder Rate, Line Drive Rate
- 200 BF: Flyball Rate, GB/FB
- 500 BF: K/BB, Pop Up Rate
- 550 BF: BB/PA
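As a rough illustration, the two lists above can be turned into a simple lookup that reports which statistics have reached their listed threshold for a given number of plate appearances (PA) or batters faced (BF). The thresholds are copied from the lists; the dictionary layout and the function name `reliable_stats` are assumptions made for this sketch, not part of the original research.

```python
# Thresholds copied from the lists above.
# Keys are plate appearances (hitters) or batters faced (pitchers).
# The data layout and function name are illustrative assumptions.
OFFENSE_THRESHOLDS = {
    50: ["Swing%"],
    100: ["Contact Rate"],
    150: ["Strikeout Rate", "Line Drive Rate", "Pitches/PA"],
    200: ["Walk Rate", "Ground Ball Rate", "GB/FB"],
    250: ["Fly Ball Rate"],
    300: ["Home Run Rate", "HR/FB"],
    500: ["OBP", "SLG", "OPS", "1B Rate", "Popup Rate"],
    550: ["ISO"],
}

PITCHING_THRESHOLDS = {
    150: ["K/PA", "Grounder Rate", "Line Drive Rate"],
    200: ["Flyball Rate", "GB/FB"],
    500: ["K/BB", "Pop Up Rate"],
    550: ["BB/PA"],
}


def reliable_stats(sample, thresholds):
    """Return the stats whose listed threshold is at or below `sample`."""
    reached = []
    for cutoff in sorted(thresholds):
        if sample >= cutoff:
            reached.extend(thresholds[cutoff])
    return reached


# A hitter with 120 PA has only cleared the 50 PA and 100 PA thresholds.
print(reliable_stats(120, OFFENSE_THRESHOLDS))
```

A statistic that does not appear in either dictionary is simply never returned, however large the sample.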
In case it’s not obvious, you can tell a lot more about a hitter from one year of data than you can about a pitcher. All this data is from research that Pizza Cutter conducted, which can be found in the links below. If a statistic is not included, that means it did not stabilize over the intervals that Pizza Cutter tested (which went up to 750 PA / BF).
Also, a quote worth remembering: “In small sample sizes, a good scout is ALWAYS better than stats.”
Links for Further Reading:
525,600 Minutes: How Do You Measure a Player in a Year? – Statistically Speaking / Pizza Cutter
On the Reliability of Pitching Stats – Statistically Speaking / Pizza Cutter




Steve,
If GB rate and GB/FB are reliable at 200 PA, doesn’t that imply that FB rate is also reliable at 200 PA? I only ask because you have it listed at 250 PA. Sorry if this was nitpicking. I think the whole library is fantastic. Thanks!
In the same vein, isn’t BB/PA implicitly reliable at 500 BF? Thanks again.
Good point... I’m just quoting word for word from the research that was done. I can double-check all that, but otherwise I’d go with what is listed. Sometimes individual stats can be reliable in one context, but once you start mixing them together, the results aren’t always the same.
Also, geez, the formatting on this page did not transfer well. Time to fix that.
This came up in the original thread; here is Pizza Cutter’s answer:
“Those numbers are per PA, not per ball in play. So, for one player who always puts the ball in play LD + GB + FB may account for 95% of his plate appearances. For another guy who strikes out and walks a lot (we’ll call him “Adam Dunn” just to give him a name), LD + GB + FB might only cover 70% of his PA’s.”
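To make that point concrete, here is a small sketch comparing per-PA and per-ball-in-play rates for two hitters. The batted-ball counts are invented purely for illustration, chosen so that LD + GB + FB covers 95% of one hitter's PA and 70% of the other's, matching the percentages in the quote. They are not real player data, and the function names are hypothetical.

```python
def batted_ball_share(pa, ld, gb, fb):
    """Fraction of plate appearances that end in a LD, GB, or FB."""
    return (ld + gb + fb) / pa


def gb_rates(pa, ld, gb, fb):
    """Return ground ball rate per PA and per batted ball (LD + GB + FB)."""
    return gb / pa, gb / (ld + gb + fb)


# Hypothetical counts, NOT real data: a contact hitter and a hitter
# who strikes out and walks a lot, each with 600 PA.
contact_hitter = dict(pa=600, ld=120, gb=250, fb=200)
three_outcome_hitter = dict(pa=600, ld=90, gb=150, fb=180)

for name, counts in [("contact", contact_hitter),
                     ("three-outcome", three_outcome_hitter)]:
    share = batted_ball_share(**counts)
    per_pa, per_bip = gb_rates(**counts)
    print(f"{name}: batted balls cover {share:.0%} of PA, "
          f"GB/PA = {per_pa:.3f}, GB per batted ball = {per_bip:.3f}")
```

The gap between the per-PA rate and the per-batted-ball rate is much larger for the second hitter, which is why the per-PA rates for GB, FB, and LD need not move together the way they would if they summed to 100%.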