Don’t hitters, on average, fare worse in the 2nd half of the season than in the first half, primarily due to playing in so many games and wearing down over the course of the season? Shouldn’t this factor (which I admit to not knowing the full research behind) be controlled for?
Comment by Judge Carl Robertson — July 12, 2011 @ 4:05 pm
Good question. OPS is easier to find splits for; the alternative was monthly wOBAs, which I could only access piecemeal and which are tougher to put together over such a large amount of data.
The reason I like OPS in this scenario is that I’m trying to compare each player against himself rather than player-to-player; in the latter case, I’d readily admit that OPS becomes much more flawed.
These players take a thousand swings a week. They have been taking a thousand swings a week for a thousand weeks in a row. If you think that a day of home run derby has anything to do with their future performance, you should ask a baseball player his opinion.
Most baseball players are players because they can play, not because they know that taking a walk in 10-plus percent of their plate appearances (to pull something random out of the sky) tends to be a good thing.
I’ll readily grant that I’d like to hear their opinion, but it wouldn’t sway my research a great deal.
One angle worth exploring is the split between players who advance multiple rounds in the derby. After all, you have to figure that hitters who take an extended run during the competition are more likely to experience fatigue that would alter their swings. The findings are interesting, if unsurprising.
Also wrong. These are the best athletes in baseball, in their prime. Not to be dismissive, but 15 swings, unless they are literally flying out of their shoes, does not fatigue them. Cano looked like he could’ve hit 100 last night and not broken a sweat. Swinging a bat is not like bench pressing your max. You don’t easily “burn out” in a session.
I think there are several things that could explain why OPS generally decreases in the second half. For one, the All-Star game is often comprised of the HOTTEST (and most popular) players in baseball. So most of those who made the game on the merits of a super-hot first half are VERY unlikely to go supernova and get even hotter in the second half.
I am not saying that the Derby does not screw some people up. But I do think regression is more likely than the hot getting hotter after the game.
I’m by no means a solid ballplayer at an advanced level, but it’s not uncommon to develop bad swing habits after one round of batting practice taken with a different approach than your usual rounds.
The proper way to do a study like this would be to compare the players that participated in the derby with those that did not and see whether there is a statistically significant difference in their performances post All-Star break. As a few have already mentioned, there is selection bias in looking only at derby participants, since they are generally chosen based on a strong first-half power showing.
Additionally, I’ve often wondered if the declines we attribute to Derby swinging are really just a case of overall fatigue made worse by the lack of a break at the All-Star game. Comparing derby participants to non-derby All-Stars should control for this.
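For what it’s worth, here’s a minimal sketch of that comparison in Python, assuming you’ve already assembled half-season splits for every All-Star (the file and column names here are hypothetical):

```python
import pandas as pd
from scipy import stats

# Hypothetical table: one row per All-Star season, with first- and
# second-half OPS and a boolean flag for derby participation.
df = pd.read_csv("allstars.csv")  # player, year, in_derby, ops_1h, ops_2h

# Compare the half-to-half *change*, not raw OPS, since both groups
# were selected partly on strong first halves.
df["ops_change"] = df["ops_2h"] - df["ops_1h"]

derby = df.loc[df["in_derby"], "ops_change"]
non_derby = df.loc[~df["in_derby"], "ops_change"]

# Welch's t-test: did derby participants decline more than
# All-Stars who sat the derby out?
t, p = stats.ttest_ind(derby, non_derby, equal_var=False)
print(f"mean change, derby:     {derby.mean():+.3f}")
print(f"mean change, non-derby: {non_derby.mean():+.3f}")
print(f"Welch t = {t:.2f}, p = {p:.3f}")
```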
I third the two posters before me. On average, the best players in the first half will perform worse in the second half. For a better comparison, you should try selecting a group of players who had comparable first halves to the derby participants, were not in the derby, and were not held out due to injury concerns – or at least another group of players who had great first halves.
Honestly, I think this is a very sloppy post. It’s never informative to hand-pick a few players entirely non-randomly, point to their numbers, and say, “hey, this must mean something!” In any reasonably large sample of players, you’ll find a few whose numbers changed drastically, but that doesn’t mean that you can immediately attribute it to the presumed cause. It’s better to look at numbers in the aggregate (i.e. combine all players’ numbers pre- and post-Derby) to better control for random variation. The question to ask has to be, “Did these players’ numbers decline further than would be expected based on regression to the mean?” I’m pretty sure that others have found the answer to that question to be “no.”
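To make “expected based on regression to the mean” concrete, here’s a toy sketch. The league average and the stabilization constant are assumptions pulled out of the air for illustration; a real study would estimate both from the data:

```python
# A toy regression-to-the-mean baseline: shrink each player's
# first-half OPS toward league average in proportion to sample size.
LEAGUE_OPS = 0.730        # assumed league-average OPS
STABILIZATION_PA = 500    # made-up constant; estimate it from data in real work

def expected_second_half(ops_1h: float, pa_1h: int) -> float:
    """Weight the observed first half against the league mean:
    few PA -> trust the league average, many PA -> trust the player."""
    weight = pa_1h / (pa_1h + STABILIZATION_PA)
    return weight * ops_1h + (1 - weight) * LEAGUE_OPS

# A derby pick with a 1.000 OPS over 350 first-half PA is "expected"
# to fall to about .84 even if the derby never touches his swing.
print(round(expected_second_half(1.000, 350), 3))  # 0.841
```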
The point is that these are the top .000001% of athletes ever to swing a baseball bat, and you are implying a mechanical (or physiological/muscle-memory?) swing change after an insanely tiny number of swings. It’s a drop in the bucket.
A player could tell you that he spends time during BP trying to do all sorts of weird things – going deep, going oppo, putting the ball on the ground, etc. Those sessions don’t have long-term effects on their swings.
I’m not attributing anything in terms of it being an absolute certainty. That’s why I sort of left it open at the end, to both spark conversation but also to hear opinions that are similar or different.
I think you forget that people who tend to make it to the Home Run Derby normally are very good players having very good first halves, so there is a natural regression that is bound to occur in the second half.
The use of batted ball data does not at all negate the obvious explanation that you have (necessarily) cherry-picked players who performed above expectations before the all-star break. Batted ball data is LESS luck-influenced than avg/slg/ops/etc. over large sample sizes, but it is not COMPLETELY uninfluenced by luck, especially over small samples. Like half a season or so.
As others have said, to draw any conclusions we need a control sample of some kind. Even with the use of batted ball data.
Weirdly, Ichiro’s career high in HR and ISO came in his worst full season yet by wOBA: 2005 (though this season will probably come out worse). All his gains from additional HR and 3B were lost due to a low BABIP; he hit fewer grounders that year than usual, but not the fewest of his career, and more fly balls but not the most of his career.
The big differences for Ichiro that year (compared to his career) were a 37-point drop in BABIP (causing a 25-point drop in BA) and a 38-point increase in ISO, and the net wOBA result was negative. That makes sense to me; the difference between a hit and an out is a lot bigger than the difference between a 1B and a 2B, a 2B and a 3B, or a 3B and a HR.
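That intuition lines up with the linear weights behind wOBA. The values below are rough approximations of the published weights (they vary a bit year to year), just to show the relative gaps:

```python
# Approximate wOBA linear weights (run values; they vary by season).
weights = {"out": 0.00, "1B": 0.90, "2B": 1.24, "3B": 1.56, "HR": 1.95}

# The out-to-single gap dwarfs every hit-to-hit gap, which is why
# trading singles (BABIP) for extra-base power can be a net wOBA loss.
print(round(weights["1B"] - weights["out"], 2))  # 0.90: out -> single
print(round(weights["2B"] - weights["1B"], 2))   # 0.34: single -> double
print(round(weights["HR"] - weights["3B"], 2))   # 0.39: triple -> homer
```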
Regarding the notion that the players take a thousand swings a week for a thousand weeks in a row and that nothing ever changes: that’s ignorant. If you honestly believed that, there would be no such thing as hot streaks or cold streaks. One bad BP sesh can put a player into a funk just as a good session can put a player into a groove.
And HR Derby participants have admitted that they have been fatigued after having a big round. The one that I remember off the top of my head is Josh Hamilton in 2008, but there have been more that have been tired due to trying to crank homer after homer, especially players that have traditional line drive swings.
If too much batting practice didn’t tire players out, they would all be in a cage for five hours before every game, and that’s not the case.
Players go on streaks affecting their batted-ball profiles all the time. I’d expect that there’s a bias towards players with better-than-usual batted-ball numbers in the first half in the ASG/HRD, not just a bias towards players with better triple-crown stats. Not because the voters are tracking batted-ball numbers, but because they’re correlated with triple-crown stats. Comparing to longer-term averages still makes sense.
The Bronx Blade just has that natural swing though. I don’t think it’ll affect his play.
Comment by Templeton1979 — July 12, 2011 @ 5:54 pm
This post essentially shows that mean reversion exists and so is not informative, except to prove that there is some luck in baseball (an exercise that FG probably doesn’t need to do again).
Mean reversion says that if you take the top-performing players in one time period and then look at another time period, they will perform worse. You can do the same for underperformers (they will perform better).
It doesn’t matter what stats you look at; they all exhibit mean reversion. Probably the only exception would be something like jersey numbers.
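A quick simulation makes the point – pure talent-plus-noise, with no derby effect modeled at all, and made-up spread parameters:

```python
import random

random.seed(42)

# 300 hypothetical hitters: a fixed true-talent OPS plus independent
# half-season noise in each half. No derby effect is modeled at all.
talent = [random.gauss(0.750, 0.040) for _ in range(300)]
first = [t + random.gauss(0, 0.060) for t in talent]
second = [t + random.gauss(0, 0.060) for t in talent]

# The 30 best first halves (the "derby picks") decline on average in
# the second half purely because selection also captured good luck.
picks = sorted(range(300), key=lambda i: first[i], reverse=True)[:30]
print(sum(first[i] for i in picks) / 30)   # roughly 0.88 in this setup
print(sum(second[i] for i in picks) / 30)  # roughly 0.79 - reversion only
```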
Comment by Barkey Walker — July 12, 2011 @ 6:51 pm
It’s a start, but there is a ton more data to look at to answer this. How about controls – average MLB first and second halves, comparable players from different years, poor players’ first and second halves, etc. – then comparing the guys in the derby to these other trends?
I bet the more data you look at, the weaker this correlation gets.
Still, maybe Abreu really did mess up his game. Maybe it happens to a guy or two, but I doubt it’s anything more than a random case.
Couldn’t you run a regression to get at this? I’m thinking a dependent variable indicating yes/no for dropoff in performance and HR derby participation as one of the independent variables? That would give you the degree to which a dropoff is attributable to the derby.
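Something like the sketch below would do it, assuming a table of All-Star player-seasons (hypothetical file and columns). Though arguably a continuous outcome – the size of the change – with a control for first-half performance would tell you more than a yes/no flag:

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical table: one row per All-Star season.
#   dropoff  - 1 if second-half OPS fell below first-half OPS, else 0
#   in_derby - 1 if the player took part in the derby, else 0
#   ops_1h   - first-half OPS, a control for selection on hot halves
df = pd.read_csv("allstar_halves.csv")

# Logistic regression: does derby participation predict a dropoff
# once first-half performance is controlled for?
model = smf.logit("dropoff ~ in_derby + ops_1h", data=df).fit()
print(model.summary())
```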
I think this question should be specific to All-Star entrants, not the league as a whole.
Don’t hitters having exceptional first halves, and in turn being voted in as All-Stars, on average fare worse in the 2nd half of the season than in the first half?
Comment by Sultan of Schwinngg — July 12, 2011 @ 7:58 pm
Without going to the tape, I’ll say “yes”.
Comment by Sultan of Schwinngg — July 12, 2011 @ 7:59 pm
Not sure if this was mentioned, because I didn’t feel like reading all the comments, but why not compare the derby contestants vs. non-derby contestants? I know that you are judging whether the consistent home run swings of the derby mess with the numbers, but you might as well test the numbers of the other top power hitters in the league, too.
Brandon, that is just a crazy statement. Have you ever even experienced a batting practice? Yeah, they shoot for the moon.
Comment by Sultan of Schwinngg — July 12, 2011 @ 8:06 pm
You mentioned Abreu (not a HR hitter, though he was in the first half that year) 9 times, suggesting he was your case study – naturally so, because his 2nd-half performance sticks out like a sore thumb. Any criticism of your doing that is warranted, imo.
Comment by Sultan of Schwinngg — July 12, 2011 @ 8:12 pm
I actually performed this study with a group of people at Middlebury College a few years ago. We found that there was a significant drop-off in OPS, as well as home run rate and batting average, for derby participants from before to after the derby. However, we did not find any statistically significant difference between the second half of the derby year and that of the prior year, leading to the conclusion that derby participants were (in general) selected based on outlier first halves.
This study was published in the Journal of Recreational Mathematics in Vol 35:4
There might be something to that (though your assertion that their first halves were solely attributable to luck is dubious, at best) but I’d like to see some data to support such a definitive conclusion.
Jim Edmonds participated in the 2003 home run derby after hitting 28 homers in the first half. He had a 1.066 OPS at the break and hit 11 homers with an .866 OPS after the break. He blamed his poor second half on the home run derby, saying that participating in it screwed up his swing for the second half.
I’d suspect that on average, players selected to the ASG have higher LD% than their usual rates. It’s one of the elements that impacts BABIP, so naturally players who overperform on hitting line drives in the first half will be more likely to have an elevated BA, which makes them more likely to be selected. Likewise, low popup rates help boost batting average, so again, lower-than-expected rates in the first half can make a player more likely to go to the All-Star game. Just because a metric is more “advanced” doesn’t mean it doesn’t fluctuate the same way other stats do. The question isn’t whether players see a change from the first half to the second; it’s whether there’s a difference between what the player does in the second half compared to a typical year.
Yes, Abreu always comes up at this time every year, so you’d think someone would finally tell us the truth about his 2005 season. At first, I thought this would be that time, but using simple pre- and post-ASG splits isn’t going to do the job.
The simple truth is that Abreu was then, and has been throughout his career, a very streaky hitter when it comes to HR. The fact is that his ‘second half’ had nothing to do with the HR derby but was just a continuation of a downturn that began in early June and ran through the remainder of the season. I honestly would have thought that someone at Fangraphs would have debunked the ‘Abreu got messed up by the HR derby’ narrative long before this.
Yes, Abreu had 18 HR at the break that season and 6 after. The context, though, is that he had only one HR in April, then had a monster May. He hit 11 HR in May, all of them by May 18th. After the first game of a June 4th doubleheader in which he hit 2 HR, he was sporting a triple slash of .340/.461/.609, pretty much his high-water mark. By the break, this had fallen to .307/.428/.526 (in just over a month, his OPS went from 1.071 to .954).
In reality, it wasn’t all downhill from the HR derby on, but rather all downhill from June 4th on. It seems to me that a site that prides itself on analysis and hates those ‘fake stats’ that people make up because something seems like it might have happened would have corrected this misconception long before this, but instead it somewhat plays into continuing it.
When you refer to a 2% line drive rate decrease from July to August… this is what, a change of 1 or 2 line drives? (meaning simple variation, classification error, a change in relative platoon ABs, etc.… or, of course, potentially real)
The batted-ball percentages are interesting, but when you start comparing one month to another, what might seem like a significant number when looking at percentages really is a minor delta when you consider that the average hitter probably has 100 balls in play or fewer in any given month – and power hitters who strike out and/or walk more than normal might have even fewer. Jumps of 5+% may have a little more value, but I still question that value when you start looking at sample size and the errors that could be involved. Throw in a couple of extra tough lefties (or righties), or a tough stretch against a bunch of #1 and #2 pitchers (while missing the #5s), and the numbers can vary quite a bit in small samples.
The other issue is given the sample size is so small we have no idea whether something like a minor injury may skew the #’s for a couple of players (this would tend to even out over larger samples, but we have no real way of knowing how big an impact it is on such a small sample of data)
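The noise floor here is easy to put a number on. Assuming ~100 balls in play a month and a true line-drive rate around 20% (both assumptions), plain binomial variation alone looks like this:

```python
import math

# Binomial standard error of a monthly line-drive rate.
ld_rate = 0.20   # assumed true LD%
bip = 100        # assumed balls in play for a regular in one month

se = math.sqrt(ld_rate * (1 - ld_rate) / bip)
print(f"standard error: {se:.3f}")  # 0.040, i.e. 4 points of LD%
print(f"~95% range: {ld_rate - 2 * se:.2f} to {ld_rate + 2 * se:.2f}")  # 0.12 to 0.28
```

By that math, a 2-point month-to-month dip in LD% is half a standard error; even a 5-point swing barely clears one.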
I have a hard time believing the premise. I think the post hoc explanations from players that the derby messed up their swing are more a way to explain away a down second half (based entirely on intuition and my gut, with absolutely zero data to back that up!).
Yes, he’s always been a streaky hitter. But what triggers the cold streaks? It could be a small injury, a bad at-bat, or a competitive batting practice where he is rewarded for hitting home runs. Was it the home run derby that triggered that particular cold streak? That’s the question being asked in this post.
This is the “correlation is not causation” website, right?
Here’s what I would look at FIRST in regards to HR Derby competitors …
Expected second-half regression versus actual second-half regression.
All we know is that often there is regression. What is not examined is how much (if any) regression should be expected.
Telo is more correct than anyone wants to give him credit for.
These guys take, and have taken, hundreds of swings per day… for years, to develop and maintain a swing to the point of habit.
Are we really saying that one day of HR Derby throws that swing out of whack to such a degree that it undoes years of work? Just imagine what playing golf would do!!! (sarcasm).
Do you/we realize just how often guys tinker around with their swing, or pitchers try new release points or arm paths in bullpen/batting sessions?
I wonder how many home runs each of these guys hits in pre-game batting practice. I saw ARod hit 13 home runs in approximately 25 batting practice swings at The Cell. It’s amazing his OPS didn’t fall off a cliff for the next month.
These guys don’t change their swings much for the HR derby and it’s evident by how many of them do so poorly.
Anyway … I think we should look at expected regression versus actual regression.
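Here’s a sketch of that expected-versus-actual comparison, reusing the shrinkage idea floated earlier in the thread – file name, columns, and constants are all assumptions:

```python
import pandas as pd
from scipy import stats

# Hypothetical table: one row per derby participant-season.
df = pd.read_csv("derby_participants.csv")  # ops_1h, pa_1h, ops_2h

LEAGUE_OPS = 0.730      # assumed league average
STABILIZATION_PA = 500  # made-up shrinkage constant

# Expected second half = first half shrunk toward the league mean.
w = df["pa_1h"] / (df["pa_1h"] + STABILIZATION_PA)
df["ops_2h_expected"] = w * df["ops_1h"] + (1 - w) * LEAGUE_OPS

# Paired test: do derby participants fall short of what regression to
# the mean alone predicts? If not, the derby itself is off the hook.
shortfall = df["ops_2h_expected"] - df["ops_2h"]
t, p = stats.ttest_1samp(shortfall, 0.0)
print(f"mean shortfall: {shortfall.mean():+.3f}, t = {t:.2f}, p = {p:.3f}")
```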
Comment by CircleChange11 — July 13, 2011 @ 11:40 am
That doesn’t make for as interesting a story.
Comment by CircleChange11 — July 13, 2011 @ 11:44 am
Read my whole post. The HR derby came in the midst of an extended cold streak that began in early June.
I’m sorry, but I just have a hard time believing that 15-20 swings in one night by a professional athlete would have enough effect to linger with them for 3 months. If it does have any effect, I would think it’s more that they develop bad habits by trying to alter their swing for power than that it’s an endurance issue.
A psychological explanation does it for me: Player hits for some power in the first half. Player gets selected to the HRD. Player gets inflated head. Player wants nothing more than to continue being a hero and hitting home runs. Player presses to perform and changes approach. Player tanks.