Thinking of beating out a ground ball between, say, Reyes and S. Drew…how quantifiable is the difference in speed? Reyes has a career speed score of 8.5 while Drew lags behind at 5.8. But from home to first, what is the difference? Maybe it’s 0.5 seconds, maybe less? The reaction times of the fielders, the strength of the throw to first, etc…do they make the 0.5 seconds mostly inconsequential?
Take Drew’s 2010 numbers. He was 56/184 on grounders. How many were the “bang bang plays”; 5-10%? The difference in speed may be mitigated by other factors in those cases as well….fielders, throw to first, etc. So I’m not surprised to see the total difference in hits to be 3-5 or so.
I wonder if another factor could be line drive score. That is, the grouping into 3 distinct groups for a batted ball (ground ball, line drive, and fly ball) is limiting. A hard ground ball is classified as just a ground ball. Does someone who hits hard line drives also tend to be more likely to hit hard ground balls? Is there a “solid contact” statistic? Does the fact that Matt Kemp tends to hit the ball harder than, say, Juan Pierre cause some difference?
Speed score might be the best thing we have to try and ascertain how fast players are, but it’s not really giving us an ordinal ranking of who the fastest are. If we had something like elapsed time from moment of contact to first, I’m sure you’d get a much higher correlation there.
(some day in the future…….)
Comment by Nathaniel Dawson — April 29, 2011 @ 9:15 pm
I do wonder, though, how many players there are who get high speed scores based solely on solid fundamentals. I really doubt there are enough of them to skew this sample, though.
More likely, something like line drive % might inform this graph a little more. I imagine there is a significant number of guys in this sample who are below-average hitters who stick around because of their speed, which makes me think their BABIP numbers could be deflated by weak contact. LD% might give us some indication of who’s just not hitting the ball hard enough to matter. Maybe just cut everybody off the list with a LD% below a certain threshold.
I feel like a problem when calculating speed vs GB BABIP is that it doesn’t include any analysis of defensive values…a ground ball to Adrian Beltre at third is less likely to result in an infield hit than a similarly hit ground ball to Michael Young at third (just for one example). It may be too complicated to calculate, and I’m not sure what values you’d use (especially since you’d have to analyze every individual grounder to every individual defender to make it work), but some sort of Speed/UZR vs GB BABIP *might* have more distinct results than this graph. I do admit that it’s only a hypothesis, though.
Here’s what I’m guessing the real story is. Speed actually confers a somewhat bigger advantage to BABIP on ground balls than the simple bivariate regression above shows. But the speedsters “give back” some of that advantage because, as a group, they don’t hit the ball as hard as the average batter. (Think of all the Juan Pierres who are in the group, whereas the Ryan Howards are at the other end of the spectrum.) I.e., some of their softer ground balls are finding infielders’ mitts whereas the stronger hitters are punching them through the infield.
I bet if you were to do a *partial* correlation between GB-BABIP and Speed Score, while covarying out something like ISO (isolated power), you’d see a much more impressive relationship. And that relationship would more faithfully represent the advantage a generic hitter gets on his ground balls due to his speed. I believe you’ll need a more sophisticated stat package than Excel to do this one, but Excel may have partial correlation in one of the add-in packages.
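For anyone who wants to try the partial correlation idea without a stats package, here is a rough sketch in plain Python. It uses the standard first-order formula, which removes the linear effect of a third variable (ISO here) from both Speed Score and GB-BABIP. The function names and the seven rows of numbers are made up purely for illustration; they are not real player data, and what they print says nothing about the actual relationship.

```python
from math import sqrt

def pearson(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z from both."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

# Hypothetical, made-up values (NOT real players): faster hitters get less ISO.
speed    = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
iso      = [0.250, 0.220, 0.200, 0.160, 0.150, 0.100, 0.080]
gb_babip = [0.245, 0.247, 0.249, 0.248, 0.256, 0.250, 0.255]

print("raw correlation:    ", round(pearson(speed, gb_babip), 3))
print("partial (ISO fixed):", round(partial_corr(speed, gb_babip, iso), 3))
```

Swapping in real Speed Score, ISO, and GB-BABIP columns would show whether holding power constant actually sharpens the relationship, which is the hypothesis above and not something this toy data can settle.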
This is exactly what I was going to say. LHB also get down the line much quicker. Is this factored into Speed Score at all?
Comment by DominicanRepublican — April 30, 2011 @ 8:36 am
Have you thought of doing an ANOVA by quartiles? That might allow you to just compare the fastest group to the slowest group. I think that all the variation in the middle might be what’s killing your model. I’d just break up the speed scores by quartile and then see if there are any differences there. I did that with LD% vs BABIP on pitchers and I got significant differences between the top and bottom groups. That’s really all you’re trying to prove anyway, right? With things like speed, we don’t really care about the guys with mediocre speed; it’d just be nice to find ways to profile the extremes, like Juan Pierre and Pablo Sandoval.
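Something like the following would do the quartile split and the one-way ANOVA F statistic by hand. This is only a sketch: the helper names are my own, the twelve rows are invented for illustration, and turning F into a p-value would still need an F-distribution table or a stats package.

```python
from statistics import mean

def quartile_groups(speed, outcome):
    """Sort by speed and split the outcomes into four near-equal groups, slowest first."""
    ordered = [o for _, o in sorted(zip(speed, outcome))]
    n = len(ordered)
    cuts = [round(i * n / 4) for i in range(5)]
    return [ordered[cuts[i]:cuts[i + 1]] for i in range(4)]

def anova_f(groups):
    """One-way ANOVA F statistic: between-group variance over within-group variance."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical, made-up values (NOT real players).
speed    = [1.5, 2.2, 2.9, 3.4, 4.0, 4.6, 5.1, 5.8, 6.3, 7.0, 7.7, 8.5]
gb_babip = [0.210, 0.225, 0.218, 0.232, 0.228, 0.240,
            0.235, 0.246, 0.238, 0.252, 0.249, 0.262]

groups = quartile_groups(speed, gb_babip)
print("quartile means:", [round(mean(g), 3) for g in groups])
print("F statistic:   ", round(anova_f(groups), 2))
```

Comparing only `groups[0]` and `groups[3]` would be the "fastest vs. slowest" check; the full F test also uses the middle two groups.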
This is not particularly surprising. People always talk about speed when BABIP comes up, but clearly the biggest component to BABIP is ability to hit the ball hard, consistently. Look at the career BABIP leaders among active players: Ichiro, yes, but also Votto, Holliday, Wright, Hamilton, Mauer, Cabrera.
Ground balls should obviously be skewed more towards benefitting speedsters than other batted ball types, but even so, Juan Pierre has a league-average career GB BABIP: every infield hit he beats out is offset by how few of his grounders are hit hard enough to find a hole.
...but isn’t that offset a little by the RH bunt hits vs. the LH bunt hits? LH have to lean a bit down the 3B line with the bat, then change direction/momentum to go down the 1B line. RH are leaning down the 1B line with the bat AND after contact they still go in that direction. Yes, LH get down the line faster on conventional swings, but RH get down faster on bunts. AND your speed guys are more likely to bunt for a hit than a “non speed” guy.
I agree, and I also think we have to consider defensive positioning. The defense knows who has the wheels, and may cheat in or break in quicker; conversely, they may lay back more against hard-hitting slow pokes.
To be honest, I don’t have the stats in front of me; I was assuming that since righties generally pull the ball more than lefties push it, they’d get on more because of the greater distance between the fielder and the bag. Simply put, because the left-side hole is much bigger than the right-side hole. Though, now that I think about it, that may be offset by the fact that lefties get out of the box faster. I’d really have to examine the stats, but I’m not quite sure where to find them.
Is the straightforward conclusion here not “Speedier players don’t leg out quite as many hits on groundballs as we thought”, but more likely “The Fangraphs “Speed” metric doesn’t measure ability to leg out groundballs very well”?
Comment by Felonius_Monk — April 30, 2011 @ 5:33 pm
Two things. First, what impact do defensive shifts have on the data, given they are exclusively employed against power hitters, and thus likely low-speed players? Second, since we’re just taking BABIP into account here, does the lack of home runs in the speed guys’ production also skew the data? I’m not sure what the effect would be exactly in either case, or if it would be more than negligible, but I am curious.
(2) Perhaps the composite “Speed” score is not the best predictor. I would be interested in seeing what the formula would be if we entered all the components that make up the “Speed” stat in a step-wise regression, and whether that would increase our R-squared. Maybe there are factors in Speed that just aren’t good for predicting BABIP. Let’s not just throw out this idea.
The L/R pull effect and distance for the throw should be pretty well offset by what I would assume is a much smaller effective fielding range on the right side, with the generally poorer-fielding 1st basemen and 2nd basemen, as opposed to 3rd basemen and the SS respectively. Plus you have the 1st baseman often holding a runner on 1st, reducing his range.
I say we just split L/Rs up and see the result and put some of this anecdotal guess work to bed.
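The split itself is easy to sketch: group hitters by handedness and compute the Speed Score vs. GB-BABIP correlation within each group. The tuple layout, function names, and the six rows below are hypothetical, made-up values for illustration, and switch hitters would need their own rule (here they would simply land in whatever label they are given).

```python
from collections import defaultdict
from math import sqrt

def pearson(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def corr_by_hand(players):
    """players: iterable of (bats, speed_score, gb_babip). Returns {bats: correlation}."""
    groups = defaultdict(list)
    for bats, spd, babip in players:
        groups[bats].append((spd, babip))
    return {
        hand: pearson([s for s, _ in rows], [b for _, b in rows])
        for hand, rows in groups.items()
    }

# Hypothetical, made-up rows (NOT real players).
sample = [
    ("L", 3.0, 0.230), ("L", 5.5, 0.245), ("L", 8.0, 0.251),
    ("R", 2.5, 0.228), ("R", 5.0, 0.233), ("R", 7.5, 0.247),
]
for hand, r in sorted(corr_by_hand(sample).items()):
    print(hand, round(r, 3))
```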
Or attempt to come up with a hard/soft hit GB correction based on LD rate? It would be somewhat subjective, as we don’t have data to prove such a thing, but a reasonable enough assumption might be made to improve the accuracy of the result….maybe…
The “outlier” on the right isn’t really an outlier. It’s not that far off the line. The reason the slope isn’t good is the point. It’s hard to predict BABIP from speed score. There is a lot of variation, and only some of it is around the slope.
I’m not sure the BABIP of those guys is necessarily an indicator that power leads to a higher BABIP. The reason those players have some of the best career BABIPs is that they are the best in baseball at hitting line drives. All of those players have career line drive percentages of 21-22%+, while other players like Albert Pujols and A-Rod, who have comparable if not greater power than some of the players listed, are lower on the list because they simply hit fewer line drives.
Even though stats have grown a lot, the problem I have with this is the problem I have with a lot of stats. Too many variables. We have been “putting up the microscope” and getting more and more particular: ERA+, adjusted ERA+, all the other shit I don’t even know. You can’t get a totally exact measurement. Waaaay too many variables. The amount of time since the infield was last watered, where the infield was playing, the pitcher, the handedness of the hitter, time of day, maybe sun in the eye of a third baseman, quality of fielder, accuracy of the umpire, etc. Tons of stuff. Even the “speed score” can’t necessarily be trusted as accurate.
Not only that but let’s say stats show that speed doesn’t increase your average. That may be true, however let’s say it stresses the defense out and they aren’t as mentally stable. They make more errors (would BABIP be accounted for here?), maybe the next batter gets a better pitch selection, maybe the defense is more edgy, lots of intangibles that you can’t really measure.
I thought speed scores weren’t considered significant over the course of one season; at least in the Bill James handbook, he says that they are better measured after 2 seasons.
I agree with the comment above about breaking out each individual component of speed score…..try a model with all 5-6 components of speed score, and see if any of the coefficients are not significant (likely).
BTW, which version of speed score is being used here?
Not true. Infield hit% is NOT used for these speed scores. From the glossary link:
“Speed score (Spd) is a statistic developed by Bill James that rates a player on speed and baserunning ability. Different locations include slightly different components, but the FanGraphs version consists of, “…Stolen Base Percentage, Frequency of Stolen Base Attempts, Percentage of Triples, and Runs Scored Percentage.”
One thing to consider is that players with more power (and perhaps lower speed scores) will hit harder groundballs, which are more likely to get out of the infield and make the player’s speed unimportant as to whether or not it turns into a hit… it might be worthwhile to take a look at BABIP on groundballs fielded by infielders vs. speed score. My guess is that would carry a higher correlation than just BABIP on all groundballs.
Quite small? That’s around a 6% difference in BABIP from the start of your line to the end of it. I mean maybe considering we’re going the full range from slowest to fastest to get that 6% means it’s not a huge difference, but 6% on GBs is the separation between a lousy infielder and a good infielder.
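To put that gap in terms of hits, a back-of-the-envelope calculation helps. The sketch below reads "6%" as 60 points of GB-BABIP (.060), which is an assumption about what is meant, and borrows the 184 ground balls quoted for Drew’s 2010 season earlier in the thread as a sample workload. The function name is my own.

```python
def extra_hits(babip_gap, ground_balls):
    """Expected extra hits from a BABIP gap applied over a number of ground balls."""
    return babip_gap * ground_balls

babip_gap = 0.060      # assumption: "6%" read as 60 points of GB-BABIP
ground_balls = 184     # Drew's 2010 grounders, as quoted earlier in the thread

print(round(extra_hits(babip_gap, ground_balls), 1))  # about 11 hits
```

That figure is the slowest-to-fastest extreme over a full season of grounders; two players closer together on the speed scale would be separated by proportionally fewer hits.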
I have crunched the numbers on this for my baseball game. I don’t have the notebook in front of me, but when you hit the ball to the 3B your chances of reaching base go way up compared to hitting it to the 1B. The 2B and SS are about a wash; the only other place better than that is to hit it right up the middle past the pitcher.
There are the obvious time issues at 3B: you have to field the ball relatively cleanly in order to make the throw in time. And if you’re a faster runner, this makes the window close even faster for a 3B or SS. Though there could be lots more.