Nowhere But Down
Much of my work this week has focused on the ‘Clutch’ statistic kept here, attempting to shed light on its meaning and usage and to clear up the confusion surrounding it. A great discussion took place in the comments section of my post ‘All About Clutch’, wherein it was suggested that the best hitters in the league will struggle to post high clutch scores because, essentially, they are already so high up the performance chart that there is no higher ground to which their games could be raised. The inverse would then be true for poorer hitters: since their baseline is so low, much more room exists for game-raising performance.
The major confusion stemmed from the fact that a player with a .333 BA in situations with a high leverage index could be less clutch than one with a .225 BA in the same situations. The clutch statistic measures a player against himself, comparing his production to what that production would be in a context-neutral environment. Clearly, I would rather have the .333 guy up to bat in a crucial situation and, because of that, heads begin to spin when it is realized that the .225 guy could have a higher clutch score because in all other situations he hit .200, while the .333 guy posted the same BA in all situations, therefore failing to raise his game.
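To make the self-comparison concrete, here is a minimal sketch using the two hitters above. It is a simplified stand-in that uses batting average for readability; the Clutch statistic itself is built on win probability, and the function name `self_relative_gain` is made up for this illustration.

```python
# Simplified stand-in for the clutch idea: compare a hitter to himself.
# Batting average is used only for readability; the real statistic is
# based on win probability, not batting average.

def self_relative_gain(high_leverage_ba: float, other_ba: float) -> float:
    """Difference between a hitter's high-leverage BA and his BA elsewhere."""
    return round(high_leverage_ba - other_ba, 3)

# The two hitters from the example above.
weak_hitter = self_relative_gain(0.225, 0.200)    # raised his game
strong_hitter = self_relative_gain(0.333, 0.333)  # same in all situations

print(weak_hitter, strong_hitter)
```

Under this toy measure the .225 hitter comes out ahead, even though the .333 hitter is the one you would want at the plate.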
With this in mind I decided to do a little digging in order to see if this generally holds true. I took the qualifying major league players from 2000-2007, first found the average WPA/LI for each year, and then calculated the average clutch score for those with above-average WPA/LI as well as the average clutch score for those with below-average WPA/LI. Keep in mind that, in the results below, BA refers to the average clutch score for players with below-average WPA/LI (not batting average), with AA meaning the same for above-average players:
2000: 1.15 WPA/LI, -0.10 BA, 0.07 AA
2001: 1.39 WPA/LI, 0.05 BA, -0.10 AA
2002: 1.38 WPA/LI, -0.02 BA, -0.19 AA
2003: 1.15 WPA/LI, 0.03 BA, -0.32 AA
2004: 1.20 WPA/LI, -0.06 BA, -0.25 AA
2005: 1.15 WPA/LI, 0.01 BA, -0.27 AA
2006: 1.07 WPA/LI, 0.22 BA, -0.13 AA
2007: 0.98 WPA/LI, 0.03 BA, -0.14 AA
As you can see, other than in 2000 and 2007, the average clutch score for those with below-average WPA/LI was much better than that of their above-average colleagues. That is not to say that their clutch scores were earth-shatteringly spectacular, but rather that they were much higher and more indicative of game-raising performance. Deciding to go a little deeper, I looked at the top and bottom 10% in each year to see if the results differed (BA now being the bottom 10% of WPA/LI and AA the top 10%):
2000: 0.06 BA, -0.25 AA
2001: 0.03 BA, -0.54 AA
2002: 0.05 BA, -0.87 AA
2003: 0.02 BA, -0.39 AA
2004: -0.20 BA, -0.11 AA
2005: -0.01 BA, -0.46 AA
2006: 0.16 BA, 0.21 AA
2007: 0.34 BA, -0.27 AA
Here we get very similar results; those in the bottom 10% of WPA/LI generally post much higher clutch scores than those at the top. 2004 and 2006 are the exceptions to this “rule” but even they do not differ too heavily; they actually come within ten points of each other whereas every other year is vastly different in the average clutch scores.
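For anyone who wants to repeat the two splits above, here is a rough sketch of the procedure. The `season` data is invented purely for illustration (it is not the 2000-2007 sample), and the function names are hypothetical.

```python
from statistics import mean

def split_by_average(players):
    """players: list of (wpa_li, clutch) tuples for one season.

    Returns the average WPA/LI, then the mean clutch score of the
    below-average group and of the above-average group.
    """
    avg = mean(w for w, _ in players)
    below = [c for w, c in players if w < avg]
    above = [c for w, c in players if w >= avg]
    return avg, mean(below), mean(above)

def top_bottom_groups(players, fraction=0.10):
    """Mean clutch score of the bottom and top `fraction` by WPA/LI."""
    ranked = sorted(players, key=lambda p: p[0])
    n = max(1, int(len(ranked) * fraction))
    bottom = mean(c for _, c in ranked[:n])
    top = mean(c for _, c in ranked[-n:])
    return bottom, top

# Invented (WPA/LI, Clutch) pairs, NOT real player data.
season = [(-1.0, 0.4), (-0.5, 0.1), (0.0, 0.2), (0.5, -0.1), (1.0, 0.0),
          (1.5, -0.3), (2.0, 0.1), (2.5, -0.2), (3.0, -0.4), (4.0, -0.5)]

avg_wpa_li, below_avg_clutch, above_avg_clutch = split_by_average(season)
bottom_clutch, top_clutch = top_bottom_groups(season)
print(avg_wpa_li, below_avg_clutch, above_avg_clutch, bottom_clutch, top_clutch)
```

Players exactly at the average are placed in the above-average group here; that tie-breaking choice is an assumption, since the post does not say how such cases were handled.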
Based on these results it would seem that, yes, the players with below average performance are more likely to post higher clutch scores because they have more room to work with, so to speak. I would still rather take, with much confidence, those in the top 10% of WPA/LI in crucial situations, even though the clutch statistic, in its current state, will debit their performance for having nowhere to go really but down.
Now, to clarify the above paragraph: after some tests, I found no correlation between WPA/LI and Clutch, meaning that it is not a concrete rule that all good players will post lower clutch scores and vice versa. From these results, though, it does seem that those with a higher WPA/LI have more opportunity to post lower clutch scores.
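The post does not say which test was run, but one common way to check for a relationship like this is a Pearson correlation coefficient. The sketch below assumes that approach; the numbers are invented to show what a coefficient of zero looks like and are not real player data.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Invented values: clutch does not move in step with WPA/LI.
wpa_li = [0.5, 1.0, 1.5, 2.0]
clutch = [0.1, -0.1, -0.1, 0.1]

r = pearson_r(wpa_li, clutch)
print(round(r, 3))
```

A value near zero means there is no linear relationship, which is what "no correlation" would mean under this assumption.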



Solid
So I’ll open it up here: what would you (all readers, not just Chris) look for in a clutch statistic, perhaps to get rid of this potential bias?
I know this might sound like a stupid question, but what exactly does WPA stand for?
Win Probability Added
One could measure consistency in high leverage situations (I bet Joe Morgan would love this idea).
For all players that are above average, their clutch scores could be compared to themselves, and for those that are below average, they could be compared to the average player. So Albert Pujols would be compared to himself, and Juan Pierre would be compared to league average. In other words, have some minimum threshold for clutch.
Example: Let’s say the league average OPS is 800. If a normally 950 hitter posts an 850 OPS in high leverage situations, he is indeed not coming through in the clutch. If a normally 600 hitter posts 700 in high leverage situations, it is an improvement for him, but still by no means better than your average player, and therefore not really clutch, as his performance was still pedestrian.
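A quick sketch of that threshold idea, using the OPS numbers from the example (written as whole numbers, so 800 means .800). The function name and the choice to express the result as a simple difference are assumptions made for illustration.

```python
def clutch_vs_baseline(overall_ops, high_leverage_ops, league_ops=800):
    """Compare high-leverage OPS to the higher of the hitter's own OPS
    and the league average, so below-average hitters are held to the
    league standard rather than their own."""
    baseline = max(overall_ops, league_ops)
    return high_leverage_ops - baseline

good_hitter = clutch_vs_baseline(950, 850)  # compared to himself
poor_hitter = clutch_vs_baseline(600, 700)  # compared to league average

print(good_hitter, poor_hitter)
```

Both hitters land 100 points below their baseline, so neither would count as clutch under this proposal.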
My main problem with still using WPA when calculating clutch would be something like the following example:
Home team down by one, with two outs. Tying run (Pedroia) is on first. Manny comes up to bat and hits a ball into the gap for a double, but because Pedroia is not exactly fleet of foot he gets thrown out at the plate. Manny gets a negative WPA in a high leverage situation even though in most cases that hit would have been a game-tying double. It just doesn’t seem quite right, because the only thing “more clutch” Manny could do in my hypothetical situation is hit a home run.
The same thing applies to fielding errors: the negative WPA should go not to the pitcher but to the fielder.
You can’t debit a fielder for a bad play without crediting him for a good play, or else the average shortstop would have about five times as much negative fielding WPA as the average outfielder. Besides, so much of baseball is decided by chance anyway.
Example: Brad Ziegler came on to face Garrett Anderson with first and second and one out in the eleventh the other week. Anderson hits a line drive to short which doubles off Izturis to end the inning. Did Ziegler deserve those outs? Hell no. He got hit hard by the only batter he faced and yet walked away with a .226 WPA.
And that’s only one example. Stuff like this happens numerous times throughout the course of a single game: check swing singles, 400 foot fly outs. If the game were decided almost entirely by skill, as in basketball, then the best teams would be racking up .800 winning percentages and the worst would struggle to win a quarter of their games, as in the NBA, but that’s not the case. So why stop at errors? Why not play the what-if game and credit people for what “should” have happened? Batter hits a scorcher right at someone. Well, it’s not his fault that the fielder happened to be standing right there. Let’s give him credit for a hit.
Errors are a fact of life. Every pitcher deals with them. If you want to separate chance from skill, then WPA is not the tool for you.
I am not talking about WPA in general, just as it applies to clutch. I understand that a lot of baseball is luck and, as the saying goes, good players have good luck. If you are trying to figure out who is clutch, it would be best to remove aspects that are out of the hitter’s control. An intentional walk would be a good example of this, as an all-star hitter is much more likely to be walked. Think FIP, but for hitting and win value. How you would measure this I am not exactly sure.
If you haven’t noticed already, there’s some bias in the below-average WPA/LI group and the “bottom 10%” group. If the players with low WPA/LI numbers (aka bad players) don’t hit well in the clutch, then they wouldn’t be in the major leagues, and therefore wouldn’t be in your sample. IOW, the only thing keeping those bad players in your sample is their clutch hitting “ability.”
Unfortunately, it’s impossible to separate every ounce of performance into neat little piles of skill and luck. Just ask Norm Cash.
Dan, I’m not sure we can definitively say that the bottom 10% of below average WPA/LI players are in the major leagues solely based on their clutch ability. There are players in the league based on intangibles like being a good teammate or hustling, or even for quality defense despite lacking in offense.
There is a selection bias in baseball in that those called up to the major leagues are supposedly the best of the best of the best, and were weeded out from other minor leaguers for, amongst other things, their ability to handle pressure.
I think there may be some players in those groups that are in the majors strictly for their performance in high leverage situations but I would need some concrete proof that the only reason they are in the majors deals with clutch, IE, a Robert Horry scenario.
Though if you are saying that the only reason they are in the majors is because they have shown productivity in a split-area, then, from an offensive standpoint I would agree. Based on all of the confusion surrounding what clutch refers to, though, I wouldn’t label it. If your point is that these players would stink even more without their clutch scores as kept here, and therefore fall out of the majors, then perhaps there is something there, but there are other variables as well.
I think we would have to find players consistently in the bottom 10% and then determine if their major contributions deal more with intangibles, defense, or clutch hitting. Someone like Adam Everett, for instance, would be in the league due to his defense, regardless of his clutch score or below-average WPA/LI.
This isn’t exactly right, but let’s say that this is true: Players are in the major leagues because of their WPA (helping their teams win). A bad player (low WPA/LI) is in the major leagues because his WPA is higher than his ability (WPA/LI) would suggest.
Some of the bad hitters (low WPA/LI) are still in the majors because they still help their teams win (WPA/LI + Clutch) and are therefore in the majors because of their clutch hitting. This isn’t true for all bad players obviously, but I think it’s enough to skew the results a little bit.
Right, as long as we acknowledge it isn’t true for all bad players, as I would imagine there are more Adam Everett-types (poor WPA/LI, poor Clutch) than those you described, simply because of the defensive value.
The key though, as I mentioned in the final paragraph, is that no correlation was found between WPA/LI and Clutch, meaning there is no direct relationship; it isn’t a definitive conclusion that bad players will have higher clutch scores.
Despite no correlation, breaking the players into the halves and top/bottom 10% groups shows that those with lower WPA/LIs have averaged a higher clutch score most of the time. So, while we cannot say this will always be true, they certainly have more “wiggle room” so to speak.
I think much of the discussion here stems from the fact that this clutch stat only measures a raising of the game, without acknowledging the prior game being raised (why someone like Pujols may appear less clutch than Juan Pierre), and how it is merely an interesting stat, not a conclusively definitive one; and that some others commenting here would like it to be more than just something of interest. That is definitely something plausible to consider.
This seems like common sense to me, but wouldn’t players in the top of WPA/LI generally be performing well above expectations, and therefore be harder pressed to beat those expectations?
For instance, Lance Berkman is getting an average of 0.731 bases/PA. If he comes to the plate twice in clutch situations and gets a single, he’s hitting .500/.500/.500 (a 1.000 OPS!), yet he’s going to be unclutch.
For a player like Omar Vizquel on the other hand, he’s only getting 0.255 bases/PA. If he comes up 3 times in the clutch and gets a single, then he’s performing well above expectations. If he were to get a home run? God save us all! He can go 0/14 over his next at-bats in the clutch and still beat his projection.
(I calculated bases/PA by adding total bases, walks, and HBP — I realize they don’t all have the same value, but I figured it would make it easier to understand what the ‘expected’ at-bat would look like more than OPS or the like)
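Here is a small sketch of the bases/PA arithmetic in the Berkman and Vizquel examples above. The helper names are made up; the rates (0.731 and 0.255) and the scenarios come from the comment itself.

```python
def bases_per_pa(total_bases, walks, hbp, plate_appearances):
    """Bases per plate appearance: (total bases + walks + HBP) / PA."""
    return (total_bases + walks + hbp) / plate_appearances

def beats_expectation(expected_rate, bases_gained, plate_appearances):
    """True if the clutch sample's bases/PA exceeds the expected rate."""
    return bases_gained / plate_appearances > expected_rate

# Berkman: one single in two clutch PA is 0.500 bases/PA, below 0.731.
berkman_single_in_two = beats_expectation(0.731, 1, 2)

# Vizquel: one single in three clutch PA is about 0.333, above 0.255.
vizquel_single_in_three = beats_expectation(0.255, 1, 3)

# Vizquel: a home run (4 bases) followed by 0-for-14 is 4 bases in 15 PA.
vizquel_hr_then_0_for_14 = beats_expectation(0.255, 4, 15)

print(berkman_single_in_two, vizquel_single_in_three, vizquel_hr_then_0_for_14)
```

The last case works out to roughly 0.267 bases/PA, which still clears Vizquel's 0.255 rate.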
Furthermore, regression to the mean rearing its ugly head would boost those at the low extreme, and suppress those at the higher end. You’d expect people with worse performance to improve as much as you’d expect the performance of those who are doing better to get worse.
Ideally, rather than using performance in both situations, you’d run a running ‘true talent’ simulator using Marcel (or another projection system, though I like Marcel because the inner workings are free for all to see and calculate themselves) and then compare what performance is expected (with regression to the mean) against what they are doing in clutch performance (compared to the same true talent).
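As a rough illustration of what regression to the mean does to the extremes, here is a generic shrinkage sketch. It is not Marcel itself: the weighting scheme and the constant `k=200` are placeholder assumptions, and the OPS figures are invented.

```python
def regress_to_mean(observed, n_pa, league_mean, k=200):
    """Blend an observed rate with the league mean.

    k is a hypothetical number of league-average plate appearances mixed
    in; the smaller the sample (n_pa), the harder the pull to the mean.
    """
    return (observed * n_pa + league_mean * k) / (n_pa + k)

league_ops = 0.800

# Invented hitters, each observed over 200 plate appearances.
strong_estimate = regress_to_mean(0.950, 200, league_ops)  # pulled down
weak_estimate = regress_to_mean(0.600, 200, league_ops)    # pulled up

print(round(strong_estimate, 3), round(weak_estimate, 3))
```

Comparing clutch performance against estimates like these, rather than against raw performance, is the kind of adjustment the comment above is suggesting.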
I would take a wild guess and say that you’d get far more balanced results that way, even if it does seem counter-intuitive that someone outperforming their performance by .20 points of OPS in non-clutch situations, and .10 points in clutch situations is considered ‘clutch’.
I just have a difficult time with this definition of “clutch”. To me, being clutch, is contributing in high leverage situations (i.e. “coming through in the clutch”). It is not necessarily increasing your performance above the norm in these situations (i.e. “raising your game”).
This stat is only useful to see who can raise their game, and that’s limited by how much room there is for their game to be raised. For example, picture a D student who aces the final to pull a B- vs. the A student who aces the same final but has nowhere to go. The B- student gets a cookie from mom, while the A student gets another pat on the back and a yawn. Obviously, this stat is flawed, because the good players are more likely to show up as a negative, while the poor players are more likely to show up as a positive.
Huskyskins,
The whole point of this post was to investigate the concerns with the stats, so believe me, we’re well aware of the potential downsides. The entire point of this post, in fact, was to test if there was anything concrete to suggest only below average players will have high clutch scores. In the end, there is no correlation, meaning that there is no concrete rule or finding that all good players post lower scores and all bad players post higher scores; however, as the results in this post showed, the bad players do have much more wiggle room.
Understanding that there is no direct correlation player to player, we can all agree that this stat is only really helpful at pointing out players that perform opposite their norm (e.g. good players performing poorly and bad players performing well), because it’s very difficult to be better than really good or worse than really bad. So, a correction factor that is applied based on the player’s baseline stats’ distance from the mean of all players may be helpful in regulating the trend that you proved in your post, where the good players trend down (as a whole) and the poor players trend up (as a whole).