Knock on wood, I certainly hope so. This piece isn’t a tribute to the area; rather, it is a discussion of the composition of the minor leagues and of those who reach the major leagues.
While this article became a study of the California League’s population, the concept began when I was thinking about Jake Lamb’s prospect status. Lamb signed with the Diamondbacks last June, and I stumbled upon him during his first Spring Training with the club, where he ranked among the 10 best prospects I saw in Arizona. Intrigued, I followed his injury-riddled season closely and thought he would never garner the attention I believed he deserved because of his age and collegiate pedigree (though Hulet ranked him higher than anyone else this offseason!). Suddenly, I found myself buried in Excel, attempting to discover Jake Lamb’s chances of becoming a major leaguer.
Statistical studies of prospects are difficult because the minor leagues are vast and rife with variables and failure. There are 189 teams across 16 full-season, short-season and rookie leagues, each stocked with talent that may never make a major league 25-man roster. With over 5,000 minor leaguers vying for 750 MLB roster spots, it can be easier to study only the successes.
Studying only the players who reach the major leagues may be easier, but often such studies snag on “survivorship bias.” Survivorship bias may be present when a study’s population consists of a select group amongst a larger class. If one is going to study success, it’s wise to study failure too. For a demonstration of survivorship bias, read Dave Cameron’s post on The Value of Hunter Pence.
I’ve categorized each California League hitter from 2006-2008 by age and signing status, removing those who were present for obvious rehab assignments. In hindsight, I could have asked Jeff Zimmerman to set a threshold for games played or plate appearances, but I did not. Instead, I combed through the players’ Baseball-Reference pages to confirm they were genuinely playing in the California League and to find their origin: College, High School or International Free Agent.
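For readers who want to repeat the exercise, the categorization above could be represented roughly as follows. This is only a sketch: the original work was done by hand from Baseball-Reference pages and in Excel, and the names `HitterSeason`, `Origin` and `study_population`, along with the example rows, are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Origin(Enum):
    """Signing background used to group players."""
    COLLEGE = "COL"
    HIGH_SCHOOL = "HS"
    IFA = "IFA"


@dataclass(frozen=True)
class HitterSeason:
    player: str
    season: int
    age: int
    origin: Origin
    rehab: bool    # True for an obvious rehab assignment
    debuted: bool  # True if the player later reached the majors


def study_population(rows):
    """Keep 2006-2008 seasons and drop obvious rehab assignments."""
    return [r for r in rows if 2006 <= r.season <= 2008 and not r.rehab]


# Made-up rows for illustration only; these are not real players.
rows = [
    HitterSeason("Player A", 2006, 21, Origin.HIGH_SCHOOL, False, True),
    HitterSeason("Player B", 2007, 23, Origin.COLLEGE, False, False),
    HitterSeason("Player C", 2008, 29, Origin.COLLEGE, True, True),  # rehab stint
    HitterSeason("Player D", 2009, 20, Origin.IFA, False, False),    # outside the window
]

population = study_population(rows)
print([r.player for r in population])
```

A games-played or plate-appearance threshold, had one been used, would be one more condition in `study_population`.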
Of course, a study cannot be used to make an absolute declaration about an individual, but its application can support an opinion. While I could apply the findings below to Jake Lamb, he is his own man. The results of others will not predetermine his success. For instance, additional context like the conclusion from Eno Sarris’s AFL All-Star piece should be considered when discussing Lamb’s future. In the end, Lamb was the impetus for this piece, not the subject.
Additionally, this is an incredibly small sample size. There are 664 players in this study of the California League (CALL) from 2006-2008. While 664 Baseball-Reference pages are voluminous when one (read: me) is clicking through them, in context the sample is minuscule.
HOW IMPORTANT WAS AGE RELATIVE TO LEAGUE?
In May I argued that individual prospect analysis requires greater context than a player’s age. Today, I am looking at a pool of players, an approach that ignores granular detail essential to the evaluation process. To analyze the importance of age relative to league (ARL), I created an area chart of those who reached MLB versus those who did not.
The average age of the sample was 22.8. It’s easy to see from the chart that those who reached the MLB from the CALL were nearly equally distributed around the mean. However, younger players had a higher success rate than older players, as evidenced by the blue area in the first chart and in the chart below.
The high success rate of young players supports our intuition. When young players are challenged by their organizations against older competition it is because they are thought to be exceptionally talented. Exceptionally talented players are more likely to make the MLB. Simple enough.
The debut rate of those who were 22 and 23, though, is higher than I expected. The sample’s debut rate was 21%, while the debut rates of 22- and 23-year-olds were 28.4% and 17.5%, respectively. These marks imply one should pay attention to players who are of league average age and not dismiss them as too old. Of course, these results are binary: debut or failure. They tell us nothing about the quality of a player’s career. As I mention later, that’s something worth evaluating in a future study.
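Because the sample is small, it helps to see how much uncertainty sits around a debut rate. The sketch below computes an approximate 95% Wilson score interval, a standard method that is not part of the original study. The counts are assumptions: 139 debuts is back-calculated from the 21% rate on 664 players, and the 80-player age group is purely hypothetical, since the size of each age group is not given here.

```python
import math


def wilson_interval(debuts, players, z=1.96):
    """Approximate 95% Wilson score interval for a debut rate."""
    if players <= 0:
        raise ValueError("players must be positive")
    p = debuts / players
    denom = 1 + z**2 / players
    center = (p + z**2 / (2 * players)) / denom
    half = z * math.sqrt(p * (1 - p) / players + z**2 / (4 * players**2)) / denom
    return center - half, center + half


# Assumption: about 139 debuts, back-calculated from 21% of 664 players.
overall = wilson_interval(139, 664)

# Assumption: a hypothetical age group of 80 players with 23 debuts.
# The real group sizes are not stated in the article.
age_group = wilson_interval(23, 80)

for label, (low, high) in [("whole sample", overall), ("one age group", age_group)]:
    print(f"{label}: {low:.1%} to {high:.1%} (width {high - low:.1%})")
```

The interval for a single age group is much wider than the one for the whole sample, which is why a gap between two adjacent ages should be read cautiously.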
Another way to look at age is by comparing athletes who are similarly situated within the league, rather than comparing them to the entire population.
Breaking up the CALL debutees and failures into College, High School and IFA classes still shows the importance of age within group.
WHICH BACKGROUND (COL/HS/IFA) HAD THE MOST SUCCESS?
The preprofessional development of college and high school draftees and international free agents is fascinating. Alas, it is a topic best left for another day. In short, it is important to know that college draftees spend between two and four years in a college program, high school signees are 18-year-olds who spurn college commitments to begin their careers, and international free agents are signed between the ages of 16 and 18. This snippet of information can inform our conclusions.
More than twice as many collegiate athletes reached the major leagues as did high school signees and IFAs, but the college population in the CALL was far greater. I suspect there are more college players than high school players in the sample for two reasons. First, college players debut in more advanced leagues than high school players. Thus, they may be more likely to reach the CALL before they burn out. Second, more college players are signed.
The success rate of HS and IFA players was significantly higher than that of college players, similar to the ARL results. I hypothesize that many of the collegiate failures are organizational fillers, whereas less desirable HS players will elect to attend college. For most college players, choosing between entering the workforce and chasing the dream of reaching MLB is an easy decision, even if the odds of success are long.
However, the equation is different for high school draftees. An organization will not offer a HS player a sizeable signing bonus to buy him out of his college commitment if they believe he is merely filler. Further, a HS player will likely elect for a college scholarship, an education and the opportunity to improve his draft status if he is offered a meager bonus. Thus, the quality of high school signees is likely higher than the quality of college signees because the college signees are watered down with organizational fillers.
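The difference between raw debut counts and debut rates is easy to lose, so here is a small sketch of the bookkeeping. The counts are invented for illustration and are not the study’s data. They are only shaped to match the pattern described above, where college players produce the most debuts but the lowest rate. The function name `debut_rates` is hypothetical.

```python
from collections import Counter


def debut_rates(records):
    """Summarize (origin, debuted) pairs as origin -> (debuts, players, rate)."""
    totals, debuts = Counter(), Counter()
    for origin, debuted in records:
        totals[origin] += 1
        debuts[origin] += int(debuted)
    return {o: (debuts[o], totals[o], debuts[o] / totals[o]) for o in totals}


# Invented counts for illustration only; not the actual California League numbers.
records = (
    [("COL", True)] * 20 + [("COL", False)] * 100
    + [("HS", True)] * 9 + [("HS", False)] * 21
    + [("IFA", True)] * 6 + [("IFA", False)] * 14
)

summary = debut_rates(records)
for origin, (made_it, players, rate) in summary.items():
    print(f"{origin}: {made_it}/{players} debuted ({rate:.1%})")
```

In this toy data the college group has more debuts than the other two groups combined, yet its rate is the lowest, because so many more college players are in the pool to begin with.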
Interestingly, there were far fewer IFAs in the sample than I had anticipated. Part of that can be explained by fewer IFAs signing annually than Rule 4 signees. In 2013 there were 1,216 HS and college players drafted. While I do not know how many IFAs signed, I assume the number is about one-fifth of that total (roughly 250). Moreover, because many are signed at 16, more of them wash out at the lower levels before reaching the CALL.
FURTHER QUESTIONS TO EXPLORE
Conceptually, this study is about identifying value, determining where an organization will see the most return on investment and exploring the composition of the minor leagues. In practice, the sample studied needs to be expanded before it can yield meaningful conclusions.
If I were to expand, I would like to see how debut rate changes from Rookie League to Double-A and how the debut rate and composition of different leagues of the same minor league level compare. Using this sample I also studied how quickly players debuted after their CALL stint, but I want a larger sample before publishing that data.
In this study I reviewed whether a player debuted and defined a debut as a success. However, success can be redefined using career WAR, wRC+ or another metric. Jeff Zimmerman and I are exploring different ways to approach incorporating a performance metric into a similar study.
Within this 664-player sample there was a correlation between Age Relative to League and debut rate. This finding, while limited by the sample size, supports the belief that players who are younger than the league average are prospects. However, it’s not the player’s ARL that makes him a prospect. Rather, when a player is placed in an advanced league it is a confirmation of his abilities by his organization. Thus, this study is an observation of organizations’ tendencies rather than a statement that organizations should challenge all of their minor leaguers with difficult assignments to increase their debut rate. With that said, there may be player development advantages to aggressive prospect placement, but if such advantages exist they are not being studied here.