The absolute hardest part of being a fan or analyst is avoiding the use of small samples to make a definitive claim. Allowing unreliable information to shape an opinion or serve as the foundation for an important decision is a mistake. It happens most often in the opening week of the season, when the statistics produced are meaningless and everyone is susceptible to the manipulation of numbers out of sheer joy that baseball has resumed. Just because Ramon Hernandez went 4-for-5 yesterday does not mean he has regained his stroke and will have an all-star caliber season. The same can be said if he starts the season 12-for-20, or 21-for-40.
We have enough information about his true ability by now that 20 trips to the dish are nothing more than a blip in the dataset. But when can we be sure that a trend to open the season is actually indicative of a noteworthy change?
The answer is not cut and dried. It depends upon the specific statistic being examined. As I wrote back at the beginning of 2009, statistics “stabilize” at different thresholds of plate appearances. A trend can be considered significant for certain statistics sooner than it can for others. Having this knowledge can prevent rash decisions in fantasy leagues, and can help us avoid anointing players as having breakout years or writing them off completely. The results in the prior article were based upon the tremendous work of Russell Carleton, my former colleague at Statistically Speaking, and current analyst for the Cleveland Indians.
The method used was split-half reliability, which measures the correlation between different parts of the same dataset. An example would involve separating Matt Holliday’s even-numbered plate appearances from his odd ones, doing the same for every other hitter in the sample, and then running a correlation between the two bins across players. When that correlation reaches roughly 0.7, the statistic can be considered useful in forming opinions and noting trends. Correlations run from -1 to 1, and 0.7 serves as the benchmark because at that level about half of the variance in one bin (0.7 squared is 0.49) is shared with the other.
By ‘useful’ I am referring to the notion that our expectations moving forward can be more narrowly defined. The goal then becomes finding the lowest PA total at which the correlation reaches that level. Holliday has averaged around an 11% walk rate over the last three seasons. If in his first 200 PAs this season his walk rate has dropped to five percent, the split-half reliability test will tell us how closely his second 200 PA bucket will mirror the first.
If the correlation is close to 1.0 then we can say with an increased level of confidence that his walk rate will remain at five percent. Even though his past exploits are known, that specific statistic can be indicative of a change at around the 200 PA mark. When using numbers to form opinions, isn’t that type of assumed reliability the desired result? Nobody knows for sure what exactly will happen over those next 200 PAs, but tests like this help to reduce the range of possibilities and eliminate some guessing in the dark.
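For readers who want to see the mechanics, here is a minimal sketch of a split-half reliability check. It is not the original study's code: the league is simulated, and the function names (simulate_league, split_half_r), the spread of true walk rates, and the player count are all assumptions made for illustration. It also assumes the PA total is split in two, so each half holds half of the sample.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def split_half_r(players, n_pa):
    """Correlate even-PA and odd-PA walk rates across players.

    players: one list per hitter, 1 = walk, 0 = anything else.
    n_pa: how many of each hitter's PAs to use (assumed total, not per half).
    """
    even_rates, odd_rates = [], []
    for outcomes in players:
        sample = outcomes[:n_pa]
        even, odd = sample[0::2], sample[1::2]
        even_rates.append(sum(even) / len(even))
        odd_rates.append(sum(odd) / len(odd))
    return pearson(even_rates, odd_rates)

def simulate_league(n_players=300, n_pa=600, seed=0):
    """Hypothetical league: each hitter gets a fixed 'true' walk rate."""
    rng = random.Random(seed)
    league = []
    for _ in range(n_players):
        true_rate = rng.uniform(0.04, 0.16)  # assumed talent spread
        league.append([1 if rng.random() < true_rate else 0
                       for _ in range(n_pa)])
    return league

league = simulate_league()
for n_pa in (50, 100, 200, 400, 600):
    print(n_pa, round(split_half_r(league, n_pa), 2))
```

In a run like this the correlation should climb as the PA total grows, and the first total at which it reaches about 0.7 is the kind of threshold discussed here. The exact figures depend entirely on the simulated talent spread, so they should not be read as real stabilization points.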
The thresholds for various statistics offered on this site are as follows:
50 PA: Swing %
100 PA: Contact Rate
150 PA: Strikeout Rate, Line Drive Rate, Pitches/PA
200 PA: Walk Rate, Groundball Rate, GB/FB
250 PA: Flyball Rate
300 PA: Home Run Rate, HR/FB
500 PA: OBP, SLG, OPS, 1B Rate, Popup Rate
550 PA: ISO
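One way to put the thresholds above to work is a simple lookup. The sketch below copies the values from the list; the names STABILIZATION_PA, stabilized_stats, and pa_remaining are hypothetical helpers invented for this example, not part of any site tool.

```python
# Plate-appearance thresholds copied from the list above.
STABILIZATION_PA = {
    "Swing %": 50,
    "Contact Rate": 100,
    "Strikeout Rate": 150,
    "Line Drive Rate": 150,
    "Pitches/PA": 150,
    "Walk Rate": 200,
    "Groundball Rate": 200,
    "GB/FB": 200,
    "Flyball Rate": 250,
    "Home Run Rate": 300,
    "HR/FB": 300,
    "OBP": 500,
    "SLG": 500,
    "OPS": 500,
    "1B Rate": 500,
    "Popup Rate": 500,
    "ISO": 550,
}

def stabilized_stats(pa):
    """Return the stats whose threshold a hitter with `pa` PAs has reached."""
    return [stat for stat, needed in STABILIZATION_PA.items() if pa >= needed]

def pa_remaining(stat, pa):
    """Return how many more PAs are needed before `stat` reaches its threshold."""
    return max(0, STABILIZATION_PA[stat] - pa)

# A hitter one week into the season with about 20 PAs.
print(stabilized_stats(20))
print(pa_remaining("Walk Rate", 20))
```

A hitter with 20 PAs has reached none of the thresholds, which is the point of the whole exercise.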
How can this information be used? Well, it’s unlikely that anyone will rack up 50 plate appearances in the first week of the season, so it will take at least two weeks before the first trend on offense can truly be tracked. A hot start from an unexpected player might be interesting to flag for future evaluation, but cannot be considered noteworthy until much more of the season has passed.
My goal isn’t to serve as a wet blanket or Debbie Downer, but rather to shed light on seminal research that can help everyone keep some perspective in the opening week(s) of the season. Not only should fans and analysts avoid letting numbers produced at the start of the season shape their perceptions of a player, but the statistics themselves should not even be considered worth discussing until they surpass the aforementioned plate appearance marks.