Thoughts on Carl Crawford and FIELDf/x
There’s been quite a lengthy discussion covering multiple angles at Dave’s post on the Carl Crawford signing with the Red Sox. From the unprecedented value of the contract for a player of that type to how the Boston lineup should look to what this means for the future of free agency, a variety of interesting debates stem from a transaction of this magnitude. The one thought that intrigued me most, however, was the argument that playing left field at Fenway Park could diminish the value of Crawford’s defensive range because of the Green Monster. The idea is that Crawford’s speed and range may be better served in right field, since the left field wall limits his ability to chase down balls: many that would be outs in other ballparks bounce off the wall instead.
At the same time, Boston’s regular right fielder J.D. Drew is known to have a pretty good cannon of an arm. Even at the age of 34, Drew was rated a 62 in arm strength and a 69 in arm accuracy by Tom Tango’s Fans Scouting Report, the second of which was tenth among rightfielders last season. FSR rated Drew a better thrower than the other Sox outfielders, be it Crawford, Jacoby Ellsbury, or Mike Cameron. Conventional thought also says to place Drew in right field in order to maximize the utility from his good arm, mostly in situations for runners going from 1st to 3rd.
So on the one hand, you have the possibility of the Crawford/Ellsbury tandem roaming the deepest parts of Fenway in center-right, maximizing Crawford’s fielding value. On the other hand, Drew’s plus arm is a good fit for right field as well. With expected values pulling against one another, how can the Red Sox outfield arrangement be optimized according to Fenway’s ballpark dimensions?
Without fielding metrics to accurately weigh the value of throwing ability against defensive range given Fenway’s dimensions, we may have to rely on scouts to make an assessment. However, I believe that proper analysis of FIELDf/x will answer such questions definitively. Several months ago, a group of highly intelligent and single-minded individuals gathered in a darkly lit room (as it appeared on webcam) at the home city of your 2010 World Series Champions to listen to presentations on FIELDf/x, what looks to be the future game-changer of baseball analysis.
FIELDf/x records high-resolution images 15 times a second, identifying every human on the field, with each image assigned a time stamp. It also records events, such as when the pitcher releases the ball, the batter hits the ball, the fielder gains possession of the ball, and the fielder throws the ball. Whereas PITCHf/x gives us about 250 pitches per game, there may be up to 1 million FIELDf/x data entries recorded per game. This comes out to over 2.4 billion lines of data for each season describing the locations of fielders, baserunners, and umpires sorted by game and time stamp, scaling to the petabyte level in memory.
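The season total is simple arithmetic. As a rough check, assuming a 30-team league in which each team plays 162 games (so 2,430 regular-season games) and taking the upper bound of 1 million entries per game at face value:

```python
# Back-of-the-envelope check of the season-wide record count.
# Assumptions: 30 teams, 162 games per team, two teams per game, and
# the upper bound of 1 million FIELDf/x entries per game.
TEAMS = 30
GAMES_PER_TEAM = 162
ENTRIES_PER_GAME = 1_000_000


def season_entries(teams=TEAMS, games_per_team=GAMES_PER_TEAM,
                   entries_per_game=ENTRIES_PER_GAME):
    """Return the number of data entries in one regular season."""
    games = teams * games_per_team // 2  # each game involves two teams
    return games * entries_per_game


print(f"{season_entries():,}")  # 2,430,000,000
```

Postseason games and games with fewer than 1 million entries would move the total in opposite directions, so this is only an order-of-magnitude figure.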
You can already start imagining how FIELDf/x can inform the Red Sox about their outfield. For instance, let’s look at Crawford’s case. With a full FIELDf/x database on a server dynamic enough to display timed animation of fielding routes, we can draw a baseline for True Defensive Range (coined by Greg Rybarczyk). Looking at away teams, we can estimate how a leftfielder’s TDR is affected when playing at Fenway versus when playing at other ballparks. Distribution curves of varying TDR values can be plotted for leftfielders at Fenway versus that of leftfielders at other ballparks, giving us an idea of how Crawford’s TDR should be affected by Fenway if we look at his TDR comparables.
For Drew, we can improve our assessment of arm strength by accounting for 1st-to-3rd baserunning situations and runners thrown out at the plate while accurately categorizing every throw a right fielder makes at any ballpark. Those situations can be compared with would-be 1st-to-3rd baserunning situations that are hit to left field instead (where the runner at 1st doesn’t think twice about passing 2nd). In whatever method is chosen, throwing ability can be more accurately assessed, whether it’s comparing its value in LF and RF at Fenway or comparing rightfielders across baseball. The theory is then that Crawford’s plus-plus range and plus arm versus Drew’s plus range and plus-plus arm can be quantitatively assessed in order to figure out who is a better fit in LF or RF at Fenway.
This seems like a lot of work in order to make one decision, deciding on switching two corner outfielders who are both already good at defense. However, what can be taken away from this FIELDf/x thought experiment is the breadth of questions that FIELDf/x can be utilized to answer — if assembled and analyzed correctly.
When FIELDf/x is fully operable in all 30 MLB ballparks, you can bet that some front offices will be all over this more readily than others. Some clubs will already have internal FIELDf/x-ready systems in place to filter out a large fraction of the massive dataset in order to store, read, mine, and analyze the meaningful bits (while other clubs will remain clueless). Whereas SQL databases are great for storing PITCHf/x data, they may not be enough to store FIELDf/x data. Clubs may have to find a more dynamic database system if they want to preview animation of fielding plays while sifting through terabytes of data.
You might see a few front offices hire a team of programmers and developers just for handling FIELDf/x data. It may also help them to create a team made up of baseball minds, maybe a few analytically minded scouts, just to figure out what they want from this data. Given all the busy work and limited time a front office has, focusing on an organization’s needs (baserunning ability, infield defense, or outfielder throwing ability) may be more efficient than laying the framework for an exhaustive study of every aspect of fielding. Organizations will tackle FIELDf/x at different magnitudes as well as from different angles, extracting needs-based information for team-specific analyses.
Decisions such as whether or not the Red Sox should move Crawford to right field will soon be more answerable. FIELDf/x should pave the way for better analysis (and it already has) of any fielder’s reaction time, true defensive range, fielding routes, decision making, and more. As Rob Neyer said, FIELDf/x is going to change everything.
How would you use the data?













“Whereas SQL databases are great for storing PITCHf/x data, they may not be enough to store FIELDf/x data. Clubs may have to find a more dynamic database system if they want to preview animation of fielding plays while sifting through terabytes of data.”
Hmmm I already work as a data analyst and use a massive hadoop cluster.
Too bad I’m not actually the systems admin on it.
Assuming the software design team for FIELDf/x creates a strong output format for each record (as in, byte-by-byte positioning of each relevant piece of information), I don’t think it’d be very difficult to manipulate it with Perl, C, and R to create whatever you want with the data. 2.4 billion lines of data is not daunting in the world of large-scale analytics, especially when real-time performance can be ignored, as it can here.
I am very intrigued to learn more about FIELDf/x.
That was a retort about the SQL quote, not about anything in your post in particular. Just fyi.
The nature of the data, though, may require a different type of database. I don’t claim to be the biggest expert on databases and analytics in general, and I was wondering if some of the experts out there could comment on this.
For time-stamped position data such as that of FIELDf/x, I wonder if it would be particularly useful to group all lines of data by play (and categorized), so that a database admin could simply pull up, say, all double plays for a particular batter.
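To make the grouping idea concrete, here is a minimal sketch in Python. The record layout is entirely hypothetical (the `Frame` fields, the `play_id` key, and the play-type labels are my own assumptions, not anything FIELDf/x is known to emit); the point is only that once every line carries a play identifier, pulling all plays of one type for one batter is a simple lookup.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    """One hypothetical time-stamped position row."""
    play_id: str
    batter: str
    play_type: str   # e.g. "double play"; labels are assumed
    timestamp: float
    player: str
    x: float
    y: float


def group_by_play(frames):
    """Bucket rows by play_id, each bucket ordered by time stamp."""
    plays = defaultdict(list)
    for frame in frames:
        plays[frame.play_id].append(frame)
    for rows in plays.values():
        rows.sort(key=lambda f: f.timestamp)
    return dict(plays)


def plays_for(plays, batter, play_type):
    """Return the ids of plays matching a batter and a play type."""
    return [pid for pid, rows in plays.items()
            if rows[0].batter == batter and rows[0].play_type == play_type]


# Made-up sample rows, purely for illustration.
frames = [
    Frame("p2", "batter_a", "single", 0.07, "LF", 80.0, 250.0),
    Frame("p1", "batter_a", "double play", 0.13, "SS", 42.0, 140.0),
    Frame("p1", "batter_a", "double play", 0.07, "SS", 40.0, 141.0),
    Frame("p3", "batter_b", "double play", 0.07, "2B", 60.0, 150.0),
]
plays = group_by_play(frames)
print(plays_for(plays, "batter_a", "double play"))  # ['p1']
```

In a real database the same effect would come from an index on the play identifier rather than from an in-memory dictionary.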
It would be useful to automatically generate images or even animations of the examples below from Harry Pavlidis. Whether that can be done within a dynamic database system or through an external programming application, there are several possible solutions I suppose:
http://www.flickr.com/photos/84639650@N00/sets/72157624701563531/
Yes, I’m actually intrigued by what kind of data could be collected.
I’m sure math/stat geeks could have a field day w/ this stuff.
Pun not intended.
Albert,
That’s what I was getting at when I said that it greatly depends on how well the software team implementing FIELDf/x writes its output.
If the output is in some sort of fixed format where each line indicates a moment in time, with a setup where bytes 1-4 represent the batter code, bytes 5-8 the pitcher code, bytes 9-12 the first baseman code, etc., up until you reach the actual play data where (for example) byte 100 represents the result of the play (an A here would represent a double play, B a strike, C a home run, etc.), then it wouldn’t be too difficult to parse the data effectively.
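A rough sketch of what parsing such a record could look like, in Python. The layout is just the made-up example from the paragraph above (bytes 1-4, 5-8, 9-12, and byte 100), and the sample codes are invented; none of this reflects the real FIELDf/x output.

```python
# Hypothetical fixed-width layout, using 0-based slice offsets:
# bytes 1-4 -> [0:4], bytes 5-8 -> [4:8], bytes 9-12 -> [8:12].
FIELDS = {
    "batter": (0, 4),
    "pitcher": (4, 8),
    "first_baseman": (8, 12),
}
RESULT_OFFSET = 99  # byte 100 holds the result code
RESULT_CODES = {"A": "double play", "B": "strike", "C": "home run"}


def parse_record(line):
    """Split one fixed-width line into named fields."""
    if len(line) <= RESULT_OFFSET:
        raise ValueError("record is shorter than 100 bytes")
    record = {name: line[start:end] for name, (start, end) in FIELDS.items()}
    record["result"] = RESULT_CODES.get(line[RESULT_OFFSET], "unknown")
    return record


# Invented sample line: three 4-byte codes, padding, then the result.
sample = "B001P042F017".ljust(99) + "A"
print(parse_record(sample))
```

Because every field sits at a known offset, no delimiter scanning is needed, which is what makes this kind of format cheap to process at billions of lines.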
Has there been any discussion of Fieldf/x being a significant piece in helping San Francisco (which has had FIELDf/x in place for two years) win the World Series, and whether that is a major selling point to other front offices?
Well, even with the huge improvement you get from moving to essentially continuous variables over binned ones, the sample size is still too small for any non-Giants players (maybe some of the other NL West starters are close), so all they could really do would be to compare their players to each other and not really to players from other teams (i.e. analyzing who to acquire). Heck, most of their new outfielders over the last year were from the AL anyway.
The most likely possibility for SF benefitting from FieldF/X would be in defensive positioning, but with no other stadiums having the data yet, there’s no FX way to compare teams. (All home teams should do better than road teams in that regard, I would guess.)
I don’t know what SF does with the fieldF/X data, but the team as a whole has had absurdly high UZRs over the last two years. Not limited to the good numbers from people like Aubrey Huff and Pat Burrell.
How so?
Seems like it’s going to change the methodology of everything, but I bet our conclusions are rather similar. Our numbers will likely get more precise, but leading to a similar result.
Like pitch Fx, I think it’s fascinating, but it did not really tell us anything drastically new that we did not already know … i.e., pitcher’s best pitch, best pitchers in league, etc … and even when it does tell us something “different” there’s always SSS to contend with.
I love this stuff, and have been waiting for something to truly give us an accurate measurement of a player’s defensive value. But, I don’t think that means teams will be paying market value for defense (in terms of WAR and fielding runs), which to me would be along the lines of “changing everything”.
It is cool though.
Will this include hit f/x data? I’m very curious to see that. It seems to me that hit f/x could be used in conjunction with field f/x because the batted ball speed will affect reaction times, though I suppose field f/x might capture some of this on its own. The hit f/x would go a long way to improving xBABIP and valuing hitters, in my opinion.
FIELDf/x will give time-of-flight, which is the relevant number for evaluating the defensive player.
Man, I would really love some public hit f/x data…
Even without FIELDf/x, Crawford in LF in Fenway will tell us something interesting about established fielding metrics. We know he’s an excellent fielder, and that’s supported by available stats – so we’ll be able to tell if and to what extent Fenway’s LF somehow throws the metrics off. An unusual effect has often been suggested, mostly accompanied by comments that “Manny isn’t really all THAT bad.”
His time with the Dodgers, patrolling a much larger and more conventional LF, showed that Manny clearly wasn’t such a bad fielder. Also, Manny’s arm has been severely underrated for years for whatever reason. When he came up with the Indians, people raved about his arm and he played RF exclusively because of it.
Thing is that with so many variables that could be considered, if each front office develops their own system they’ll also be developing their own algorithms and those will be of varying quality. It might then be interesting to evaluate those versus each other because this level of data creates new inequalities. The differences may not be substantial but we should be able to see if and how they are exploited.
Theo Epstein has said that despite speculation that Fenway will limit Crawford’s defensive contributions, their research tells them that the exact opposite is true. So take that for what it’s worth.
I would not be surprised if they had highly advanced simulation systems, and they’ve run 1000s of simulations with him at different positions, etc., and are confident that having him in LF is their best move.
What I disagreed with was Crawford saying he only wanted to play LF.
My thought was to put him in the OF position that has the chance to get to the most balls. That’s the team’s priority.
A lefty playing LF can, with sufficient range, take away the line (extra bases). What I would like to see is a scatter-plot spray chart for opponent hitters against BOS’s pitching staff, to see which field has the most balls hit to it (LF, LC, CF, RC, RF), and then you can make some decisions based on that.
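The tally itself would be easy once you have batted-ball directions. A minimal sketch in Python, assuming each ball comes with a spray angle measured from the left field line (0 degrees) to the right field line (90 degrees), and assuming the five fields are equal 18-degree slices; both the angle convention and the sample angles are my own inventions for illustration.

```python
from collections import Counter

ZONES = ["LF", "LC", "CF", "RC", "RF"]
SLICE_DEGREES = 18  # 90 degrees of fair territory split five ways (assumed)


def zone_for(angle_deg):
    """Map a spray angle in degrees to one of the five fields."""
    if not 0 <= angle_deg <= 90:
        raise ValueError("angle is outside fair territory")
    index = min(int(angle_deg // SLICE_DEGREES), len(ZONES) - 1)
    return ZONES[index]


def zone_counts(angles):
    """Count balls hit to each field, keeping the LF-to-RF order."""
    counts = Counter(zone_for(a) for a in angles)
    return {zone: counts.get(zone, 0) for zone in ZONES}


# Made-up spray angles, not real data.
angles = [5.0, 12.5, 20.0, 44.0, 47.5, 60.0, 71.0, 88.0, 90.0]
print(zone_counts(angles))
# {'LF': 2, 'LC': 1, 'CF': 2, 'RC': 2, 'RF': 2}
```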
I do not think, however, that the Green Monster is going to reduce Crawford’s range. IMO, he probably did not rack up 150+ more catches than other LFs by catching balls at the track. I’m guessing it has more to do with balls to his sides and in front. In that regard the Monster could help him, because he essentially doesn’t have to worry about those balls that would be near the warning track at other stadiums. They may literally decide that they won’t give up bloop hits to LF or anything down the line.
Fielding Bible has Crawford at +57 plays over the last three years and that would appear to be about 13 going back or laterally and 44 coming in (they only break down bases that way, not plays). That’s probably a big part of him being a good fit for Fenway, where he will be able to play shallower than in an ordinary park.
“Without fielding metrics to accurately weigh the value of throwing ability against defensive range given Fenway’s dimensions, we may have to rely on scouts to make an assessment.”
I found this statement to be hilarious. This statement sounds like “Without any good single malts available, I had to drink Thunderbird wine” or “There weren’t any real lookers at the party, so I settled on someone that at least had most of her teeth.”
I suppose I am not quite as far down the sabermetric path as many others. I mean, I love that statistical analysis has made our understanding of the game more complete, but I don’t completely dismiss EVERYTHING a scout with 20 years’ experience has to say. The good ones add something to the discussion, don’t they? (Or am I too new to realize I just committed heresy?)
Incidentally, Drew’s arm has never been rated as well by defensive metrics as it has by the FSR.
A RF’s arm mainly comes into play for a throw to third. These throws rarely happen unless you have a runner on 1st, or maybe 2nd with less than 2 outs. It would be a pain in the ass, but I wonder if the Red Sox could maximize their value by having Drew in LF and Crawford in RF most of the time, but switch them when there is a runner on 1st, or on 2nd with less than 2 outs.
Huh, this is interesting. Is there any rule that says that a player has to start in a certain area, or that they have to at least be in a certain order (see: “the shift”)? If not, this is an idea, but definitely a pain in the ass.
There is only one rule regarding defensive positioning: the catcher must remain in foul territory behind home plate. You could have the other seven defenders form a cha-cha line behind second base if you really felt the need to.
Pretty exciting stuff if you’ve got the bucks. One quibble, Craw doesn’t have a good or even average arm. It’s pretty poor. Like Damon poor, but luckily without the “my sister throwing opposite handed” motion. The wall might do a good job of neutralizing some of that (and since he catches just about everything it rarely matters), but I can assure you that good teams can and will run on him at will. Look for the Rays, Angels, Rangers, and Twins to run all over him.
I’m not sure I understand why it matters so much which outfield position Crawford plays. If he’s in left and you think he has enough range to cover more ground than your average LF, shift the CF over towards right as necessary.
“At the same time, Boston’s regular right fielder J.D. Drew is known to have a pretty good cannon of an arm. Even at the age of 34, Drew was rated a 62 in arm strength and a 69 in arm accuracy by Tom Tango’s Fans Scouting Report, the second of which was tenth among rightfielders last season. FSR rated Drew a better thrower than the other Sox outfielders, be it Crawford, Jacoby Ellsbury, ”
Having watched almost every Red Sox game, this is not my opinion. JD’s arm strength may be living on past reputation.
An OF’er’s effectiveness is due as much to quickness of release and the ability to charge ground balls or get in good position on fly balls for the throw. JD does not do this well (maybe because of his back, hamstring, and shoulder problems).
He only threw out 1 runner last year (IIRC this was last among RF’ers with half the playing time, let alone equivalent playing time), and this assist was in September. There were several plays he had at the plate on SF that a good RF’er makes, but his throws were late and not even close to the plate.
Crawford will still have value in LF, especially if he plays shallow like Manny did and uses his speed to get to balls hit deeper. You don’t need a gun for an arm in LF, as Manny showed, just a quick release and accuracy.
The majority of a player’s OF UZR comes from range. The ARM component of UZR contributes minimally to how many runs he does or doesn’t prevent.
The run expectancy with a runner on 3rd isn’t much greater than with a runner on 2nd. Stopping first to third isn’t a huge deal, or that common of a scenario.
Stopping a player from 2nd to home is extremely common. This can happen from all fields. One may even argue it’s a more common situation from Fenway’s LF, since the LF’er is so close to the infield, while the RF’er is playing very deep.