FanGraphs Baseball


  1. Why are 96 players a big enough sample size to make determinations? Random chance seems to say I could pick a group of 96 players out of a hat and some nonzero percentage of them would experience similar decreases.

    In other words, could you show your work here as to why this is a sample size we can draw conclusions from as opposed to noise?

    Comment by Hunter fan — July 11, 2012 @ 9:23 am
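The "out of a hat" question can be put to a resampling check. The sketch below is only an illustration: it assumes a list of league-wide second-half-minus-first-half ISO changes is available, and both the synthetic league and the observed group mean of -0.02 are placeholders, not real data. The function name is made up for this example.

```python
import random

def share_of_random_groups_as_low(league_changes, group_size, observed_mean,
                                  draws=2000, seed=0):
    """Fraction of randomly drawn groups whose mean change is at or below observed_mean."""
    rng = random.Random(seed)
    as_low = 0
    for _ in range(draws):
        # Pull group_size players "out of a hat", without replacement.
        group = rng.sample(league_changes, group_size)
        if sum(group) / group_size <= observed_mean:
            as_low += 1
    return as_low / draws

# Placeholder league: 1000 made-up half-to-half ISO changes centered on zero.
_rng = random.Random(42)
league = [_rng.gauss(0.0, 0.05) for _ in range(1000)]

# Hypothetical observed mean change for a 96-player group.
share = share_of_random_groups_as_low(league, group_size=96, observed_mean=-0.02)
print(f"Share of random 96-player groups with a drop at least this large: {share:.3f}")
```

If that share is large, a drop of the observed size is what random groups of 96 routinely show; if it is tiny, noise alone is a poor explanation.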

  2. Nice piece, but I think this is a pretty obvious case of selection bias. If you limit your analysis to players who participate in the derby, you’re ignoring the fact that many of these players had abnormally good first halves and simply regressed (to the mean) in the second half.

    In other words, if someone hits 6 homers in the first half, and 18 in the second, (a) they aren’t going to be invited to the derby, and (b) idiot analysts won’t be around to explain the difference as being due to the fact that the batter’s swing was improved because he didn’t participate in the home run derby.

    Comment by Matt — July 11, 2012 @ 9:24 am

  3. I seem to remember Hanley Ramirez’s power declining considerably after his turn in the home run derby. Would anyone care to verify or refute that remembrance for me?

    Comment by Rob — July 11, 2012 @ 9:26 am

  4. Pause to consider that there was once a time when Brandon Inge could be in the Home Run Derby.

    Comment by Well-Beered Englishman — July 11, 2012 @ 9:27 am

  5. I’d feel more confident if a 1st half/2nd half baseline were established league wide. Maybe the data will show that HR Derby participants regress at the same level as all other hitters in the 2nd half of a 6+ month season.

    Comment by L.UZR — July 11, 2012 @ 9:35 am

  6. Wait, you did that. How did I misread?

    Comment by L.UZR — July 11, 2012 @ 9:37 am

  7. Selection bias, indeed.

    Comment by Ryan — July 11, 2012 @ 9:37 am

  8. This must be right. The Home Run Derby is essentially a selection of players who hit a lot of home runs in the first half. Some of the participants will be doing their usual thing, but a relatively high percentage will be those who outperformed their talent over the first half. It isn’t surprising that many hit fewer homers over the second half.

    It also isn’t surprising that there will be a couple of notable stories (Inge, Abreu) of 2nd half collapses. It’s probable that this will happen to some players in any sample of significant size that you pick.

    Moreover, I’m not sure how significant the collective decline actually is. The average ISO drops about 9%, but it’s still decent. Certainly, Inge’s case is not the norm.

    Comment by Benjamin — July 11, 2012 @ 9:38 am
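The selection argument can be illustrated with a toy simulation. Everything here is an assumption made for the example: each player has a fixed "true" half-season home run level, each half adds independent luck, and the eight biggest first halves get invited. There is deliberately no derby effect in the model, and the talent and luck spreads are invented, not estimated from real data.

```python
import random

def simulate_derby_selection(n_players=300, n_selected=8, n_seasons=200,
                             talent_sd=5.0, luck_sd=4.0, seed=1):
    """Return (mean HR change for invited players, mean HR change for everyone)."""
    rng = random.Random(seed)
    selected_changes, all_changes = [], []
    for _ in range(n_seasons):
        players = []
        for _ in range(n_players):
            talent = rng.gauss(12.0, talent_sd)        # true half-season HR level
            first = talent + rng.gauss(0.0, luck_sd)   # first half = talent + luck
            second = talent + rng.gauss(0.0, luck_sd)  # second half = talent + fresh luck
            players.append((first, second))
        all_changes.extend(second - first for first, second in players)
        # "Invite" the players with the biggest first halves.
        invited = sorted(players, key=lambda p: p[0], reverse=True)[:n_selected]
        selected_changes.extend(second - first for first, second in invited)
    return (sum(selected_changes) / len(selected_changes),
            sum(all_changes) / len(all_changes))

selected_mean, league_mean = simulate_derby_selection()
print(f"invited players: {selected_mean:+.2f} HR, whole league: {league_mean:+.2f} HR")
```

Under these assumptions the invited group declines on average while the league as a whole does not, purely because the invitation rule favors lucky first halves.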

  9. I am feeling deja vu for some reason…

    Here is a suggestion that would still be fraught with problems but is at least a new approach.

    Step 1. Crowd source a list of players who were named as possible participants but did not actually participate. They are presumably similar to the players who did participate in that they hit lots of HRs in the first half, but dissimilar in that they did not actually take any of the life-threatening derby swings.

    Step 2. Compare the second half performance of this sample of players to the sample of actual participants.

    Step 3. Profit.

    Comment by mcbrown — July 11, 2012 @ 9:50 am

  10. FACT: Brandon Inge would have been an all-time great, but he was consistently “pitched to like Babe Ruth”. Poor guy.

    Comment by bada bing — July 11, 2012 @ 9:51 am

  11. You need to compare the league-wide DISTRIBUTION of players’ first and second halves to the DISTRIBUTION of HR Derby participants’ first and second halves. You cannot just look at the league-wide means and declare that the HR Derby participants are significantly different. If the data has a lot of dispersion then your tiny sample size of HR Derby participants will fit comfortably inside it. In that case no explanation is needed (e.g. regression and lucky first halves for the HR Derby participants).

    I’m willing to bet this is just random variation. Your regression explanation makes intuitive sense, but there are plenty of All Stars that don’t fit the narrative. Players don’t make the All Star team and HR Derby just because they’ve hit a lot of home runs in the first half. Prince Fielder made the All Star team and participated in the HR Derby because he’s Prince Fielder. He has underperformed in the first half this year. Also, in the general population it is expected that half of all players will overperform in the first half.

    Comment by Jason H — July 11, 2012 @ 9:53 am

  12. I wonder how the second half performance of derby participants compares to other players who hit “x” amount of first half home runs.

    Comment by adohaj — July 11, 2012 @ 9:54 am

  13. Bil-ly But-ler! Bil-ly But-ler!

    Comment by Aaron (UK) — July 11, 2012 @ 9:56 am

  14. Shortly after the All-Star break in 2009 Inge hurt his knee, but being the gamer that he was (barf), he played with it injured: the team had no one else, the doctors told him (he claimed) that he couldn’t hurt it any worse, and getting it right would have taken 8-12 weeks, and he didn’t want to miss that much time.

    Couple that with the fact that he played over his head in the first half, and he got hit with the double whammy of an injury and probable regression to the mean in the second half. I only mention this because he seems to be the poster boy for this theory and because he was a huge contributor to the September collapse that year.

    Comment by Mike P — July 11, 2012 @ 10:01 am

  15. Or just skip the crowdsource and use the smallest number (or rate) of HR by a player in the derby in the year as a threshold. I did something like this after the Abreu debacle and found some very small effects of the derby on power, but not enough to worry.

    Comment by Elias — July 11, 2012 @ 10:02 am
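The threshold idea can be sketched for a single season as follows. The `PlayerSeason` record, the player names, and every home run total are hypothetical, invented only to show the mechanics; a real version would repeat this per year with actual first-half and second-half totals.

```python
from dataclasses import dataclass

@dataclass
class PlayerSeason:
    name: str
    first_half_hr: int
    second_half_hr: int
    in_derby: bool

def split_by_threshold(seasons):
    """Use the lowest first-half HR total among derby participants as the cutoff."""
    derby = [s for s in seasons if s.in_derby]
    threshold = min(s.first_half_hr for s in derby)
    controls = [s for s in seasons
                if not s.in_derby and s.first_half_hr >= threshold]
    return derby, controls, threshold

def retention(group):
    """Second-half HR as a fraction of first-half HR for the whole group."""
    return sum(s.second_half_hr for s in group) / sum(s.first_half_hr for s in group)

# Hypothetical one-season data set.
seasons = [
    PlayerSeason("A", 20, 14, True),
    PlayerSeason("B", 16, 12, True),
    PlayerSeason("C", 18, 13, False),
    PlayerSeason("D", 17, 15, False),
    PlayerSeason("E", 8, 12, False),   # below the cutoff, so not a control
]

derby, controls, threshold = split_by_threshold(seasons)
print(threshold, round(retention(derby), 2), round(retention(controls), 2))
```

Comparing the two retention figures is the point: if non-participants above the cutoff fade about as much as participants, the derby itself is not needed to explain the drop.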

  16. I suspect that home run derby guys don’t get three days’ rest like the rest of the league; that may make a difference too.

    Comment by Hurtlockertwo — July 11, 2012 @ 10:02 am

  17. Of course the home run derby guys get extra batting practice….???

    Comment by Hurtlockertwo — July 11, 2012 @ 10:03 am

  18. Then again, the crowd noise factor, plus field energy, minus gold ball bias…?? Mind boggling.

    Comment by Hurtlockertwo — July 11, 2012 @ 10:05 am

  19. I was utterly shocked by this actually. What next… All-Star Starter Yuniesky Betancourt?

    Comment by Jack — July 11, 2012 @ 10:35 am

  20. In terms of your methodology, would you consider Cano to have participated in the Home Run Derby this season?

    Comment by Evan — July 11, 2012 @ 10:48 am

  21. Why not compare individual post HR Derby 2nd half numbers to career numbers? Other than the large amount of work it would be…

    Comment by CommonCents — July 11, 2012 @ 10:52 am

  22. naa, Adam Kennedy would be a better SS HR derby rep. YuckNasty actually has a little pop for the position (key there, “for the position”)

    Comment by Cidron — July 11, 2012 @ 10:59 am

  23. Compare HR Derby participants to All Stars in general.

    Comment by Rob — July 11, 2012 @ 11:02 am

  24. Or when earlier this year he was hitting poorly in AAA on a rehab assignment. He claimed he wasn’t getting anything good to hit. Those AAA pitchers can be real tough on a 10 year MLB veteran.

    Comment by asdfasdf — July 11, 2012 @ 11:16 am

  25. Gregg Zaun has repeatedly said on Blue Jays telecasts that if the home run derby negatively affects swings, then players would do poorly every single game because batting practice always ends up as a glorified home run derby at the end.

    Makes sense.

    Comment by Allan G — July 11, 2012 @ 11:35 am

  26. Correct correct correct.

    And even if we do assume for a second that these players are tired out by prior performance, we could just as easily determine that they are tired out from their scorching first halves as from the homerun derby.

    Comment by JeffMathisCera — July 11, 2012 @ 11:50 am

  27. To take it a step further:

    How about Home Run Derby participants vs. other all stars since 2000 to test the regression theory?

    Or Home Run Derby participants vs. themselves in years they didn’t participate in the HRD?

    Comment by Will — July 11, 2012 @ 12:07 pm

  28. You know the HR derby sucks when batting practice is declared a glorified version of it.

    Comment by TKDC — July 11, 2012 @ 12:07 pm

  29. you’re affirming the point made in the post, yes?

    “The real issue then becomes regression. Players selected to participate in the event are individuals who performed at an elite level in the first half of the season. For many of them, their first-half numbers represented a level of performance significantly out of sync with their career numbers to that point.”

    Comment by Tim_the_Beaver — July 11, 2012 @ 12:12 pm

  30. Case closed. Gregg Zaun said it, therefore it be true!

    Comment by Subversive — July 11, 2012 @ 12:27 pm

  31. How about taking home run derby participants vs all other players that had equivalent home run totals by the all-star break. Obviously, with only 8 participants, there are other home run hitters that don’t participate in the derby. If the non-participating home run hitters have similar drop-offs, then it’s probably a better indicator of late season regression due to fatigue. If non-participating home run hitters maintain their performance, then it indicates that there might be something to the curse of the derby theory.

    Comment by Scott — July 11, 2012 @ 12:29 pm

  32. Would it be possible to make the 96 player data set available for others to play with– I’m sure I could dig them up online, but if someone just handed me a list of names [with lahmanID?] and years, I’d probably be more inclined to do something with this :)

    Likewise for many of the others commenting on this thread I’m sure.

    Comment by Eric R — July 11, 2012 @ 12:42 pm

  33. I really like Rob’s and CommonCents’ suggestions for possible comparisons.

    Comment by Bigmouth — July 11, 2012 @ 12:49 pm

  34. That’s what I thought too; they were essentially making the exact same point that the author made in the article.

    Comment by Jason B — July 11, 2012 @ 12:53 pm

  35. A sample size of one person, occurrence, or anecdote obviously neither proves nor disproves any theory. Hence the larger sample of 96 used in the article.

    Comment by Jason B — July 11, 2012 @ 12:54 pm

  36. Anecdotally, Adrian Gonzalez’s swing has not looked the same since his clinic last year and his power still has not returned. Again, small sample size of 1, but it’s stuck in my head for the last year.

    Comment by Matt — July 11, 2012 @ 1:04 pm

  37. Josh TweakerTough Reddick got snubbed! Best RF in the AL!

    Comment by sleepingcobra — July 11, 2012 @ 1:14 pm

  38. Can we look back and find ZiPS rest-of-season projections for HR Derby participants? It would at least be a start.

    Comment by Jeremiah — July 11, 2012 @ 1:26 pm

  39. But what if ZiPS already knows about The Curse???

    Trumbo is only projected for 14 more homers this season…. ZOMG THE CURSE!!!

    Comment by mcbrown — July 11, 2012 @ 1:49 pm

  40. Wait, I thought it was because of his knees that he isn’t an inner circle hall of famer, or was it mono, or was it changing positions, or…

    Comment by Ronin — July 11, 2012 @ 1:49 pm

  41. Upset you didn’t reaffirm first?

    “Damn, I shoulda concurred”

    Comment by Pat G — July 11, 2012 @ 1:49 pm

  42. But 96 could still be noise too…

    Comment by Hunter fan — July 11, 2012 @ 2:55 pm

  43. This also has selection bias issues as all All Stars usually have had especially good first halves.

    Comment by TheGrandslamwich — July 11, 2012 @ 3:05 pm

  44. Done.

    You can click on any column heading to sort. First Half+ and Second Half+ are comparisons of each player’s respective half to their career numbers. Initial sort is by Second Half ISO+. The numbers at the very top of each column are the averages for all 96 Derby participants.

    Interestingly enough, the collective AVG+ and OBP+ for each half is about the same, but the ISO+ is 123 for first half and 110 for the second half. As many have already stated, this higher ISO+ creates a selection bias. Those with higher ISO+ in the first half are more likely to be selected for the Derby. Only 10 of the 96 Derby participants had a First Half ISO+ less than 100 (i.e., an ISO less than career average), while 42 of 96 had a Second Half ISO+ less than 100.

    I think these comparisons are a good start, and would like to see research comparing the distribution of these 96 players to First and Second Half distributions of the entire league.

    Comment by elkabong — July 11, 2012 @ 7:51 pm
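For readers who want to reproduce the "+" columns, here is a minimal sketch. It assumes ISO+ is simply 100 times a half-season ISO divided by career ISO, which matches the description above (below 100 means below career average) but is not confirmed as the exact formula used for the table. ISO is slugging percentage minus batting average. The sample player line is hypothetical; the counts 10, 42 and 96 are the ones quoted in the comment.

```python
def iso(slg, avg):
    """Isolated power: slugging percentage minus batting average."""
    return slg - avg

def iso_plus(half_iso, career_iso):
    """Half-season ISO as a percentage of career ISO (assumed definition)."""
    return 100 * half_iso / career_iso

# Hypothetical player: career .280 AVG / .480 SLG, first half .294 AVG / .540 SLG.
career_iso = iso(0.480, 0.280)
first_half_iso_plus = round(iso_plus(iso(0.540, 0.294), career_iso))

# Counts quoted in the comment: participants with ISO+ below 100 in each half.
participants = 96
share_below_first = round(100 * 10 / participants)   # whole percent
share_below_second = round(100 * 42 / participants)  # whole percent

print(first_half_iso_plus, share_below_first, share_below_second)
```

So roughly one participant in ten was below his career ISO before the break, against more than four in ten after it.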

  45. Reaffirming and restating the conclusion of the post: hitters are not worse after the all star break; they are better before it.

    Comment by Tomrigid — July 11, 2012 @ 8:42 pm

  46. The fallacy that population averages apply to all individuals lives on.

    Just because the group shows no effect does not mean certain susceptible individuals will have no effect. It may be due to injury or a minor change in a player’s swing. Anyone participating in the HR Derby is simply rolling the dice.

    A-Gon coming off shoulder surgery last year participated in the HR Derby. His power numbers fell off a cliff in the 2nd half and he said his shoulder acted up. In fact, his power numbers have not recovered and since last year’s HR Derby he has 16 HR in 607 AB. Yikes. Maybe just a coincidence, but…..

    Comment by pft — July 12, 2012 @ 6:11 am

  47. I think that’s his point. Like mcbrown above, the goal would be to analyze players who had a strong first half of the season, and compare their 2nd-half numbers to the 2nd-half numbers of Derby participants.

    Comment by Monty — July 12, 2012 @ 9:08 am

  48. I only grabbed six years of data [2005-2010], but here are the average pre and post ASB HR totals by the HR Derby round the player made it to:

    Finals: 17.8/12.3
    2nd: 21.8/16.0
    1st: 20.5/19.4

    So, guys who didn’t hit enough to advance past the first round hit 95% as many HRs in the second half as in the first; guys who made it to the second round, but did not advance, hit 73%; guys who made it to the finals, 69%.

    So maybe a little HR derby doesn’t hurt much, but a lot of it is more detrimental.

    Since the top two buckets had 12 players in each and the bottom 24, let’s split that one in half as well. 11 of them hit four or more HRs in their one round. They averaged 21/12 [57%], the other thirteen 20/25 [125%].

    So guys in multiple rounds or who at least hit a decent number of HRs in the first round took a pretty big hit in the second half and the rest did OK…

    Comment by Eric R — July 12, 2012 @ 10:15 am
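The percentages above follow directly from the quoted averages. A short sketch of the arithmetic, using only the numbers given in the comment (the function and variable names are made up for the example):

```python
# Average pre/post All-Star-break HR totals quoted in the comment (2005-2010),
# keyed by the derby round the player reached.
averages = {
    "Finals": (17.8, 12.3),
    "2nd": (21.8, 16.0),
    "1st": (20.5, 19.4),
}

def retention_pct(first_half, second_half):
    """Second-half HR as a whole-number percentage of first-half HR."""
    return round(100 * second_half / first_half)

retained = {rnd: retention_pct(*pair) for rnd, pair in averages.items()}

# The first-round bucket split by HRs hit in the derby itself.
four_or_more = retention_pct(21, 12)   # the 11 players with 4+ derby HRs
fewer_than_four = retention_pct(20, 25)  # the other 13 players

print(retained, four_or_more, fewer_than_four)
```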

  49. So, essentially, Robinson Cano is going to hit 32 home runs in the second half.

    Comment by BronxBomber — July 12, 2012 @ 2:09 pm

  50. Exactly. It’s like the “Verducci Effect” theory on pitching. Of course they are due for a slight regression when you are looking at players who are performing at a high level.

    Comment by Hamilton Marx — July 12, 2012 @ 2:51 pm

  51. Maybe. SSS still applies to my data and even if it didn’t, maybe it’d be more like saying, “if we were to play out the remainder of the 2012 season 1M times, Robinson Cano would probably average around 32 HR”.

    Comment by Eric R — July 12, 2012 @ 5:34 pm
