Archive for Research
by Bradley Woodrum - March 2, 2012

Dare to dream.
On Thursday, we looked at Fielding Independent Offense (FIO), as well as the Should Hit formula, and decided to toss stolen bases into the equation. The results were, let’s say, brow-elevating.
Today, we are going to put that result — the FIO formula — into action.
In the timeless words of Sir Samuel Leroy Jackson: “Hold onto your butts!”
Read the rest of this entry »
by Bradley Woodrum - March 1, 2012

IT’S SO *** **** HARD TO THINK
WITH ALL THESE DUCKS EVERYWHERE!
In August of 2011, I introduced Should Hit (in three iterations: ShH, SHAP!, and Complete SHAP!). Should Hit is essentially a simple regression of walk rates, strikeout rates, home run rates, and BABIP on weighted runs created plus (wRC+). In both its calculation and its simplicity, it is very similar to FIP — but its uses and impact are quite unlike FIP.
Like FIP with groundball pitchers, the formula has some biases: known, accepted (by me, at least) biases. For instance, because it ignores doubles and triples completely, Should Hit naturally undervalues players who excel at the extra bags and overvalues the sluggers stuck at first. It presumes a certain number of doubles and triples for every player based on their home run rate and other peripherals, all poor proxies for something that is a verifiable skill or weakness in many players.
Ultimately, though, the tools (ShH and its brethren) work rather well. For the curious thinker, ShH can admirably predict what a player might hit with a normal/career BABIP or if their BB% or K% or HR% changes. However, at the time of its uncovering, I was wrongly under the impression that the current FanGraphs iteration of the wRC+ formula did not include stolen bases. It mattered little to me at the time; the only reason I thought the uncovering was so interesting to begin with was that only four peripherals could explain almost 93% of the variation within wRC+ (and that is still amazing to me!).
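To make the idea concrete, here is a minimal sketch of that kind of regression: an ordinary least-squares fit of a run-value index on four peripherals. Everything in it is an assumption for illustration. The function names, the coefficients, and the sample are made up (the hitters are synthetic, not real player seasons), and this is not the actual Should Hit formula.

```python
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows, targets):
    """Least-squares fit with an intercept, via the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def make_synthetic_sample(coefs, n=200, seed=0):
    """Build fake hitters as (BB%, K%, HR%, BABIP) plus a noiseless index."""
    rng = random.Random(seed)
    rows, targets = [], []
    for _ in range(n):
        row = (
            rng.uniform(4, 16),       # BB%, in percent
            rng.uniform(10, 30),      # K%, in percent
            rng.uniform(0.5, 6),      # HR%, in percent
            rng.uniform(0.25, 0.36),  # BABIP
        )
        rows.append(row)
        targets.append(coefs[0] + sum(c * x for c, x in zip(coefs[1:], row)))
    return rows, targets


# Hypothetical coefficients (intercept, BB%, K%, HR%, BABIP), NOT the real ones.
MADE_UP_COEFS = [-60.0, 3.0, -1.5, 12.0, 400.0]

if __name__ == "__main__":
    rows, targets = make_synthetic_sample(MADE_UP_COEFS)
    labels = ["intercept", "BB%", "K%", "HR%", "BABIP"]
    for name, value in zip(labels, fit_ols(rows, targets)):
        print(f"{name}: {value:.3f}")
```

Because the synthetic index is built without noise, the fit simply recovers the made-up coefficients. With real seasons the fit would be imperfect, which is where a figure like "almost 93% of the variation" comes from.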
But today, we are going to add in SBs and stand back with a decanter of thought and ask ourselves: “What the hell did we just make here?”
Read the rest of this entry »
by Bill Petti - February 28, 2012
When we last left the question on Park Factors’ effect on ERA estimators we found that the estimators performed the best in hitters’ parks when looking at starting pitchers. FIP and xFIP performed better than tERA or SIERA when predicting the next year’s ERA for this group of pitchers. For the other park types, the pattern looked similar to what we generally see — SIERA generally performs best, while all estimators provide better leverage over a pitcher’s YR2_ERA.
But what if we want to predict how pitchers with certain batted-ball profiles (fly ball vs. ground ball) will perform in different parks? If we’re trying to predict how C.J. Wilson (lifetime 1.68 GB/FB ratio) will perform moving from Texas to Anaheim, or how Michael Pineda’s (0.81 GB/FB ratio) move from pitcher-friendly Safeco to hitter-friendly Yankee Stadium will turn out, in which estimator(s) should we have more faith? That is the focus of Part III.
I used the same methodology as Part II to determine park type. I then coded each pitcher as ground ball or fly ball based on their GB/FB ratio. A pitcher’s GB/FB is one of the most consistent metrics (for starting pitchers, the year-over-year correlation is 0.87, the highest of all outcome metrics), so there was little concern about a pitcher changing their batted-ball profile between seasons. A GB/FB greater than 1 was coded as ground ball; less than 1 was coded as fly ball. In the end, 1,387 season pairs were included in the analysis:
Read the rest of this entry »
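The ground-ball/fly-ball coding rule in the excerpt above is simple enough to sketch. The function name is hypothetical, and the handling of a ratio of exactly 1 is an assumption, since the excerpt does not say how that case was coded.

```python
from typing import Optional


def batted_ball_type(gb_fb_ratio: float) -> Optional[str]:
    """Label a pitcher by GB/FB ratio using a cutoff of 1."""
    if gb_fb_ratio > 1:
        return "ground ball"
    if gb_fb_ratio < 1:
        return "fly ball"
    # Assumption: a ratio of exactly 1 is left unlabeled.
    return None


if __name__ == "__main__":
    # GB/FB ratios quoted in the excerpt.
    for name, ratio in [("C.J. Wilson", 1.68), ("Michael Pineda", 0.81)]:
        print(name, "->", batted_ball_type(ratio))
```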
by Matthew Carruth - February 24, 2012
Prior to the 2005 version, the Major League draft alternated picks between the American and National Leagues, much as the leagues used to alternate home-field advantage in the World Series before Bud Selig had to dip his meddlesome fingers in. As a result, there are some well-known times when the first overall pick did not go to the team with the worst record in the previous season.
The most recent was when the Padres got to select ahead of the Tigers in 2004 despite the 2003 Tigers having happened. Luckily for the Tigers, the Padres picked Matt Bush and the Tigers landed Justin Verlander. I don’t think they’re crying foul over that missed opportunity.
Read the rest of this entry »
by Bill Petti - February 22, 2012
In my series’ first part, I looked at the effect that Park Factors have on various ERA estimators. The original question I attempted to answer was whether certain estimators were better suited for predicting performance, depending on whether a park is hitter-friendly or pitcher-friendly. The short answer was that ERA estimators did a much better job in hitter-friendly parks than pitcher-friendly parks, relative to YR1_ERA.
One question I didn’t answer was whether the effectiveness of estimators in various types of parks also varied by pitcher role (i.e. starters versus relievers). Generally speaking, ERA estimators perform better when you restrict the analysis to starters only — since relievers tend to be more volatile year-over-year. The question is whether this same pattern will hold given park factors’ impact. And as predicted, ERA estimators do a better job predicting performance for starters versus relievers.
The current data set includes 533 pairs of starter seasons and reliever seasons where the pitchers threw in the same parks in the first and second years, and did so as starters or relievers both years. Before segmenting by park type, we see results that are consistent with previous analysis regarding ERA estimators and their predictive powers for starters and relievers:
Read the rest of this entry »
by Jeff Zimmerman - February 20, 2012
I have gone through all of the 2011 MLB transactions and compiled the disabled list (DL) data for the 2011 season. I have put all the information in a Google Doc for people to use.
Read the rest of this entry »
by Matt Swartz - February 16, 2012
This week, I’ve talked about the retrospective price of WAR on an aggregate level. What I haven’t studied is the retrospective price of WAR by position. I thought this was particularly important in light of my finding that positional adjustments didn’t matter much for arbitration salaries. Players who played tougher defensive positions were underpaid in arbitration, relative to those who played easier defensive positions. As it turns out, the price of WAR has been much higher for some positions.
Read the rest of this entry »
by Jeff Zimmerman - February 15, 2012
Intentional walks (IBB) are usually given to good and/or unprotected players in a lineup. Pitchers would rather face the next, weaker-hitting batter. The IBBs lead to an inflated walk rate (BB%) for hitters. By removing IBB from a player’s BB%, a true walk rate emerges. A problem I noticed was that when a player’s IBB% increases, so does their non-intentional walk rate (NIBB%). Here is an attempt at putting some numbers behind the assumption.
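As a rough sketch of the split described here, the walk rate can be broken into intentional and non-intentional parts. The function name and the example totals are hypothetical. It assumes the walk total already includes intentional walks and that every rate is taken per plate appearance, which may differ from the denominators used in the full article.

```python
def walk_rates(bb: int, ibb: int, pa: int) -> dict:
    """Split a hitter's walk rate into intentional and non-intentional parts."""
    if pa <= 0:
        raise ValueError("pa must be positive")
    if not 0 <= ibb <= bb:
        raise ValueError("ibb must be between 0 and bb")
    return {
        "BB%": bb / pa,             # all walks per plate appearance
        "IBB%": ibb / pa,           # intentional walks per plate appearance
        "NIBB%": (bb - ibb) / pa,   # walks with the intentional ones removed
    }


if __name__ == "__main__":
    # Made-up season line: 90 walks, 15 of them intentional, in 650 PA.
    for label, rate in walk_rates(bb=90, ibb=15, pa=650).items():
        print(f"{label}: {rate:.1%}")
```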
Read the rest of this entry »
by Bill Petti - February 15, 2012
(Note: I noticed a coding issue in the data, which resulted in three parks having a different classification. The data has been re-run to reflect the new results and the article updated to reflect the findings.)
Researchers have gone to great pains to highlight and account for factors outside of an individual player’s control when evaluating their performance and value. The standard for this is of course Voros McCracken’s seminal research into defense independent pitching and Tom Tango’s fielding independent pitching (FIP). While baseball is arguably the most “individualistic” of the major team sports, players do not perform in isolation from each other or from their environment.
Lately I’ve become more interested in how the physical environment of a team and its players affects their outcomes on the field. My initial research led me to look at whether a team’s home park and the degree to which it inflated or suppressed run scoring put the team at a fundamental advantage or disadvantage in terms of winning. The results suggested that hitter-friendly parks do, in fact, put a team at a fundamental disadvantage, likely due to the stress that playing 81 games a year in that environment places on the pitching staff.
In this article, I am concerned with how park factors may affect the various constructs we’ve developed to help us better evaluate a player’s talent and likely performance in the future. Specifically, to what extent do park factors affect the usefulness of various ERA estimators? It seems reasonable to assume that much of what happens when a ball is put in play is not controlled by a pitcher. However, given that some extreme parks are likely to exercise their own environmental force over the outcome of batted balls, it stands to reason that ERA estimators that factor in a pitcher’s batted-ball profile may do a better job in certain types of parks than others.
Read the rest of this entry »
by Jeff Zimmerman - February 14, 2012
Recently, one of our readers, Simon, noted that the Rockies might be targeting fly-ball pitchers with the recent additions of Guillermo Moscoso, Jamie Moyer and Jeremy Guthrie. I decided to examine if going after fly-ball pitchers was a practical method for limiting runs at Coors Field.
In an ideal world, the Rockies would love to have all extreme sinker-ball pitchers. The Rockies GM, Dan O’Dowd, stated this stance recently on Clubhouse Confidential.
“In an ideal world, every single guy in Colorado would be a heavy sinker ball guy who would have a tremendous ground ball to fly ball ratio.”
It is not an ideal world and he knows it. He goes on further to state:
“Unfortunately not all of our decisions are made in an ideal world. When we balance fly ball rates, we really try to balance soft and hard.”
Read the rest of this entry »
by Matthew Carruth - February 14, 2012
Read about the worst relative strikeout seasons here.
A natural extension of seeking to identify the worst pitching strikeout season in baseball history is to find the best. That covers the two extremes. I suppose I could do the most average strikeout seasons next, but (yawn) I had to go take a nap after just writing that sentence.
What I really enjoy about looking at baseball in this way is that it often gives me a fresh perspective on history that I’ve long lost the ability to recall. Such is the case here where exploring the topic of lots of strikeouts led me to a lot of reading about two pitchers in particular from baseball’s past that I hadn’t thought about, statistically, in a while.
Before I get to them, Pedro Martinez’s remarkable 1999 season deserves a digital nod of acknowledgement. It takes a mountain of talent to rack up enough strikeouts to more than double the league rate when that rate is already as high as 16%, but that’s what Pedro did in ’99, striking out 37.5% of batters he faced. It’s in the top ten of all time and the best since integration.
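The "more than double the league rate" claim is just a ratio, sketched below. The function name is hypothetical, and the inputs are the rounded figures quoted in the paragraph, so the result is approximate.

```python
def relative_strikeout_rate(pitcher_k_rate: float, league_k_rate: float) -> float:
    """Return a pitcher's strikeout rate as a multiple of the league rate."""
    if league_k_rate <= 0:
        raise ValueError("league_k_rate must be positive")
    return pitcher_k_rate / league_k_rate


if __name__ == "__main__":
    # 37.5% for Pedro in 1999 against a league rate of roughly 16%.
    print(f"{relative_strikeout_rate(0.375, 0.16):.2f}x the league rate")
```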
Read the rest of this entry »
by Matthew Carruth - February 13, 2012
In the past, while researching pitchers that had started teams’ Opening Day games, I came across the name Glenn Abbott. Abbott had been the Mariners’ Opening Day starter a few times including in 1979, the team’s third year in existence. During that season, Abbott would go on to be pretty awful over 518 batters faced. Notably, he struck out just 25 hitters that year, a 4.8% strikeout rate that I found fascinating in its ineptness.
Late last week, however, Jeff Sullivan reminded me of the 2003 Detroit Tigers and a Lookout Landing reader noted Nate Cornejo’s season that year in which Cornejo fanned a mere 46 hitters over a larger 842 batter sample. Amazingly, Cornejo netted 1.9 WAR that year. Cornejo’s strikeout rate was superior to Abbott’s by a touch, but it struck me that because of the changing nature of the game, with strikeouts more frequent in 2003 than in 1979, Cornejo’s season was perhaps worthier of enshrining.
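For reference, the two raw rates can be checked directly from the strikeout and batters-faced totals quoted above. The helper name is hypothetical. Adjusting for era would also need the league rates for 1979 and 2003, which this excerpt does not give.

```python
def strikeout_rate(strikeouts: int, batters_faced: int) -> float:
    """Strikeouts per batter faced."""
    if batters_faced <= 0:
        raise ValueError("batters_faced must be positive")
    return strikeouts / batters_faced


if __name__ == "__main__":
    seasons = [("Glenn Abbott, 1979", 25, 518), ("Nate Cornejo, 2003", 46, 842)]
    for label, so, bf in seasons:
        print(f"{label}: {strikeout_rate(so, bf):.1%}")
```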

Read the rest of this entry »
by Matt Klaassen - February 10, 2012
Earlier this week on Twitter, I was part of a discussion comparing Troy Tulowitzki and Evan Longoria, two of the best players in the game. I personally give Longoria a slight edge, but obviously Tulowitzki is great, too. If someone prefers him to Longoria, that is fine, and I could probably be talked into it. What really spurs this particular post is the discussion we had about comparing their offense. Keeping in mind that this was a casual discussion rather than a deep evaluation of “true talent” involving all of the necessary regression and adjustments, someone noted that over the last three seasons (2009-2011) the two players have had virtually identical offensive value per plate appearance: Tulowitzki has a 137 wRC+, and Longoria has a 136 wRC+. I argued that Longoria’s performance was more impressive given that the American League has superior pitching relative to the National League.
However, Dave Cameron made an interesting point: the Rockies play in the National League West, where hitters seemingly face a larger proportion of stud pitchers. Dave mentioned Tim Lincecum, Matt Cain, Madison Bumgarner, Clayton Kershaw, and Mat Latos in this connection. He also pointed out that Longoria did not have to face the Rays’ own excellent pitching staff. So I decided to look at it more closely. The point is not to settle the Longoria versus Tulowitzki dispute. Rather, I am interested in whether individual hitters face (or do not face) particular pitchers enough that they require a “divisional” adjustment of some sort.
Read the rest of this entry »
by Jesse Wolfersberger - February 2, 2012
The contracts that baseball players sign are some of the longest contracts in business — not just sports. When handing out nine- or ten-year deals, projecting salary inflation is critical, and yet getting an accurate forecast is nearly impossible.
Read the rest of this entry »
by Eno Sarris - February 2, 2012
Punxsutawney Phil is due to make his appearance today. He’ll survey the ground around him, take stock of the adoring fans, and prognosticate about the weather. With how bad our weatherpeople are at long-term meteorological predictions, maybe it makes sense for us to turn to a land-beaver for our winter forecast needs.
But what about the state of the ground in baseball today?
Read the rest of this entry »
by Matt Klaassen - February 1, 2012
It is officially February when baseball news is reduced to vague rumors about teams from which Roy Oswalt and Edwin Jackson may or may not be considering one-year offers. Well, that and Mark Teixeira saying he might bunt to beat the shift this season. Hoo boy.
There is something of interest in the Teixeira report, though. Sure, we do not know whether he is actually going to do it or not. Remember, this is the time of year when players say things like “I’m going to steal 20 bags this year” even if they have never stolen more than 10 in any season. Still, it is not a crazy idea. While sabermetric writing on the internet went through a phase of arguing that bunts are counterproductive to scoring and winning, research has progressed to show that bunts are not as bad as all that. In certain situations, they can be a good idea in terms of getting the win in a close game or simply “keeping the fielders honest” (also known as “game theory,” a term I am pretty sure Bruce Bochy uses frequently).
But what about a power hitter like Teixeira? Isn’t bunting always a bad idea for them? To answer this properly would require a great deal of complex thinking and programming. For now, let’s take a simple approach by looking at some data from 2011 to see whether Teixeira is simply blowing smoke or making sense.
Read the rest of this entry »
by Eno Sarris - February 1, 2012
Players leave money on the table every year. It’s true! Pitchers, in particular, have been signing away free agency years at below-market prices for a while now.
Consider the most recent big signing, Yu Darvish. He most likely would have made more money had he stayed in Japan for three years and come over as a free agent. Through the arbitration process in Japan, he was due around $27 million over the next three years, and his deal with the Rangers only pays him $25 million over the same time frame. Had he continued his dominance, and come over in three years, it seems likely he would have made more than $30 million over three years. He would have had the leverage of the unrestricted free agent.
But Darvish’s plight resembled that of the arbitration-eligible pitcher here in the States. He could only talk to one team, which should sound familiar. And he probably valued some non-monetary benefits that a long-term contract offered: security and the ability to compete against the best in the world. How prevalent is this sort of give-and-take in the normal process here in the States? How many pitchers have given up free agent years at below the going rate?
Read the rest of this entry »
by Dave Allen - January 23, 2012
This year’s Hall of Fame ballot had a very weak pool of first-year candidates. Bernie Williams was the leading vote getter among them with just 9.6% of the vote, and the only one to break the 5% cutoff to stay on the ballot. At the same time, the 14 returning candidates saw their vote total increase by an average of 7.1%, and five had increases of over 10%. Many have suggested that there is a relationship between these two facts; that is, with few good first-year candidates to vote for there were extra votes for the returning candidates.
Last year, David Roher at Deadspin/Harvard Sports Analysis Collective noted that over the 2000s the average number of votes per HoF ballot was fairly constant, between 5.35 and 6.6. This would suggest that in years with strong first-year candidates there would be fewer votes for returning candidates and vice versa. I wanted to more explicitly test this relationship and see whether it extended further back than just the 2000s.
I looked at every Hall of Fame vote from 1967, when the current voting rules were put in place. Along the x-axis is the average number of first-year candidates voted for. Along the y-axis is the average change in vote share for returning candidates compared to the previous year (here an increase from 60% to 65% would be denoted by 0.05).
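The y-axis quantity can be sketched as follows. The function name and candidate labels are hypothetical, and the vote shares are made up apart from the 60%-to-65% case used as the example in the text. The sketch assumes a returning candidate is anyone who appears on both years' ballots.

```python
def average_vote_share_change(previous: dict, current: dict) -> float:
    """Mean year-over-year change in vote share for returning candidates."""
    # First-year candidates have no previous share, so they are skipped.
    returning = [name for name in current if name in previous]
    if not returning:
        raise ValueError("no returning candidates")
    return sum(current[n] - previous[n] for n in returning) / len(returning)


if __name__ == "__main__":
    last_year = {"Candidate A": 0.60, "Candidate B": 0.30}
    this_year = {"Candidate A": 0.65, "Candidate B": 0.33, "First-Year C": 0.10}
    print(f"{average_vote_share_change(last_year, this_year):+.3f}")
```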
Read the rest of this entry »
by Eno Sarris - January 12, 2012
Wine and cheese make for a delectable combo. But the two foods don’t age the same. Wine takes much longer to turn to vinegar than your cheese takes to grow fuzzy green mold. That’s why wine is the one used in sayings by older men verifying their remaining virility.
Power, patience and contact are the components of a delectable (productive) hitter. And yet, like wine and cheese, it turns out that these different skills age differently. Ages 26 through 28 are often used to represent a hitter’s peak, but not all of their different faculties are at their apex in that age range. Let’s check the aging curves, once again courtesy of stat guru Jeff Zimmerman.
Read the rest of this entry »
by Eno Sarris - January 11, 2012
When Johnny says his last lines in “The Outsiders” — “Stay gold, Ponyboy. Stay gold.” — there’s more than a slight touch of mortality in the moment. There might even be outright pessimism about the directive. After all, the Robert Frost poem he’s referencing finishes: “Nothing gold can stay.”
Turns out Johnny and Frost know a little something about pitchers and strikeout rates. Thanks to the inestimable Jeff Zimmerman, we have strikeout aging curves for both starters and relievers. As dawn turns to day, it seems, pitchers also lose their gold.
Read the rest of this entry »