Ten Things I Learned About Next Season

Over the past few months, Chris Constancio and I have been working hard to put together a projection system for The Hardball Times 2007 Season Preview. While I am afraid to count all the time we put into this, I can say for sure that over the past few weeks, each of us has been spending as much time on these projections as we have at work (or in my case, studying and attending classes). With all the time and effort we’ve put into these projections, we sure hope it was worth it!

Anyway, last night we finally completed this project: three years’ worth of projections for almost 1,300 players. Putting the projections to rest was a huge relief, but it was also a learning experience. Using the most granular data available to us and complex aging analysis, we have generated what we believe to be extremely accurate projections three years into the future. And, well, what fun is generating all those numbers if you don’t look through them afterwards?

Of course, as soon as the final numbers were in, that’s exactly what I did, and I ended up with a bunch of interesting tidbits, adding to a list of things I noticed in the process of actually putting together the projections. I’ve cut the list down to 10, and ripping Dave off completely, I present to you 10 things I learned while building a projection system…

The American League is better than the National League

It’s true. Of course, I’m not the first person to make this observation; the definitive article on the difference between the two leagues was written here on THT by Mitchel Lichtman last summer. But knowing that the American League is better is not enough; to incorporate this fact into our projections, we had to calculate the difference between the two leagues in each statistical category, and see whether the disparity in pitcher quality differed from the disparity in hitter quality.

So what did we find? Well, the American League has been clearly the better league for the past three years. AL pitchers are particularly better at striking out batters than their National League counterparts, and a little better on balls in play. Meanwhile, American League hitters are better pretty much across the board.

We’ve factored these adjustments into our projections, so that we don’t overrate players going from the National League to the American League or underrate players going from the AL to the NL.
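To make the idea concrete, here is a minimal sketch of what a league adjustment could look like. It is not our actual method: the factor table, the stat names, and the `translate` function are all hypothetical, and the numbers are made up purely for illustration. The only thing taken from the discussion above is the direction of the effect, with hitters facing tougher pitching in the AL.

```python
# Hypothetical AL-to-NL ratios for a hitter's rate stats. A value above 1.0
# means the same hitter is expected to post a higher rate in the AL.
# These numbers are invented for illustration only.
AL_OVER_NL = {
    "k_rate": 1.03,   # AL pitchers strike out more batters
    "hr_rate": 0.98,
}


def translate(rates, from_league, to_league):
    """Translate a hitter's rate stats from one league context to the other."""
    leagues = {"AL", "NL"}
    if from_league not in leagues or to_league not in leagues:
        raise ValueError("league must be 'AL' or 'NL'")
    if from_league == to_league:
        return dict(rates)

    adjusted = {}
    for stat, value in rates.items():
        factor = AL_OVER_NL.get(stat, 1.0)  # stats without a factor pass through
        if from_league == "AL":
            factor = 1.0 / factor           # going AL -> NL reverses the ratio
        adjusted[stat] = value * factor
    return adjusted


nl_hitter = {"k_rate": 0.180, "hr_rate": 0.040}
in_al = translate(nl_hitter, "NL", "AL")
```

With factors like these, an NL hitter moving to the AL gets a slightly worse projected line, and an AL hitter moving to the NL gets a slightly better one, which is the correction described above.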

Adam Everett can field a little

He sure can. We at The Hardball Times have purchased detailed zone rating data from Baseball Info Solutions (look out for it in our Stats section some time soon), and based on that data, we have developed fielding projections for every major league player.

All those projections will be available in the 2007 Season Preview, but here are a few to whet your appetite: The best fielder in the game is Everett, who is projected to be 19 runs above average at shortstop. Scott Rolen does slightly better in terms of runs—he’s at +20—but at an easier position. And guess who else projects to 19 runs above average?

If you guessed Alex Rodriguez … well, you didn’t. But our data shows that A-Rod was so great in 2004-05 that even a poor season last year can’t keep him from being one of the best fielders in baseball. Before you scoff, I’d like to remind you that Rodriguez won two Gold Gloves at shortstop before coming to the Yankees. Would it be that surprising for him to be great at third base as well?

Barry Bonds will break Hank Aaron’s record

John Beamer asked what Bonds’ odds were a few weeks ago, and we’d like to add a piece to the puzzle: The Hardball Times projection for Bonds’ 2007 home run total is … 22. That would put him at 756, one above current home run king Hank Aaron.

What about some other historical marks? Craig Biggio needs 70 hits for 3,000; we project 144. Tom Glavine needs 10 wins for 300; we say he gets 11. Sammy Sosa … I enjoyed watching Sosa play during his prime too much to run a projection for him this season.

(Note: All projections, no matter how good, still include quite a bit of uncertainty in them. Don’t bet your mortgage on 756 home runs for Bonds or 300 wins for Glavine. But betting it all on red … sometimes that works out well.)

No matter how many adjustments we make, there is still an infinite number of things we haven’t accounted for

One of the frustrating things about putting together this projection system was that the more we did, the more we wanted to do. But we couldn’t do it all, because if we put these projections out in August, they wouldn’t be very meaningful. For example, we incorporated detailed park factors and accounted for a player’s batted ball distribution to strip away as much luck as possible, but one thing I wanted to do, and couldn’t, was include data from the fabulous Hit Tracker website in our projections.

Instead, you’ll have to settle for an essay in the back of the book on the luckiest and unluckiest players, according to the Hit Tracker data. What I did was look at how much distance atmospheric conditions (wind, altitude, temperature) added to each player’s home runs versus what we expected.

These calculations account for park, so that players from Colorado don’t all show up on the unluckiest list because they have to play at a high altitude. But if you had to pitch at Coors when it was 90 degrees outside, that’s something that we don’t expect to repeat next season.

I’ll throw out some names to entice you. Lucky: Noah Lowry, unlucky: Justin Verlander. Lucky: Adam Dunn, unlucky: Pat Burrell.

Albert Pujols rocks

Shocking, I know. We project that Pujols will hit .336/.429/.657 next season with 45 home runs and just 58 strikeouts. What’s more, Pujols is projected at an insane 16 runs above average as a fielder, putting him at 7.3 wins above replacement next season.

How impressive is that? Pujols’ 25th percentile projection of 5.6 wins above replacement is better than any other player’s average line! Don’t worry, though; Ryan Howard has around a 10% chance of being as good as Albert next season. Yeah, Howard definitely deserved the MVP.

No less impressive is Johan Santana, who we project will post a 2.67 ERA and be worth almost six wins above replacement. What I hadn’t realized, though, is that Brandon Webb is the same age as Santana, and almost as good. Don’t be fooled by his Chase Field-inflated ERA; Webb’s career ERA+ is a pristine 139.

A pair of former Marlins will shut down the AL East

Last off-season, the Florida Marlins let A.J. Burnett leave via free agency for Canada’s cheaper dollars, and traded Josh Beckett to the Red Sox for a rookie haul. How did that work out for the Blue Jays and Red Sox?

Not so good. Beckett was gopher-prone, and posted an ERA over 5.00, while Burnett made just 21 starts, most by the time the Jays were far out of the pennant race.

We forecast that both these pitchers will do significantly better this season, with ERAs in the mid-3.00s, and hopefully, both will stay healthy the whole year through.

Good players are prone to a greater range of outcomes

One thing we’re including in these forecasts is 75th and 25th percentile projections, which tell us how good or bad a player might be. Because these projections are denoted in wins above replacement, they are pretty heavily dependent on playing time.

To make the intervals around each player’s projection easier to understand, I’ve labeled each as “low” variability, “medium,” and “high.” The players with low variability in their projections are probably going to perform close to their predicted line, while guys with high variance will have a larger range of possible outcomes.

Because the percentiles depend so heavily on playing time, players who are projected to get more plate appearances tend to have more variability in their projections. For a mathematical reason I won’t go into, superstars tend to see even more variance in their percentiles.

Both effects make sense if you think about it: A bad season from some scrub won’t really have much of a negative impact on a team’s win total; a bad season from a superstar can bury you.
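Here is a rough sketch of why playing time widens the interval. It is an illustration under assumptions, not our actual calculation: I assume the uncertainty sits in a player’s per-600-PA value and is roughly normal, so the spread in wins scales with projected plate appearances. The function names, the cutoffs for the labels, and every number below are hypothetical.

```python
from statistics import NormalDist

# z-score of the 75th percentile of a standard normal (about 0.674)
Z75 = NormalDist().inv_cdf(0.75)


def war_percentiles(war_per_600, sd_per_600, pa):
    """Return (25th, 50th, 75th) percentile wins above replacement.

    Assumes a normal spread in per-600-PA value, scaled by playing time.
    """
    scale = pa / 600.0
    median = war_per_600 * scale
    half_width = Z75 * sd_per_600 * scale
    return median - half_width, median, median + half_width


def variability_label(p25, p75, low_cut=1.0, high_cut=2.0):
    """Label an interval by its width; the cutoffs are made up."""
    width = p75 - p25
    if width < low_cut:
        return "low"
    if width < high_cut:
        return "medium"
    return "high"


# Same per-PA talent and uncertainty, different projected playing time.
regular = war_percentiles(3.0, 1.5, pa=650)
part_timer = war_percentiles(3.0, 1.5, pa=200)

print(variability_label(regular[0], regular[2]))        # wider interval
print(variability_label(part_timer[0], part_timer[2]))  # narrower interval
```

Under these assumptions the everyday player’s interval is 650/200 times as wide as the part-timer’s, even though the two have identical per-PA profiles.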

Justin Morneau may actually be better than Joe Mauer

There’s been a lot of talk amongst excited Minnesota fans and greedy Yankees fans about which of the two young Twins stars is better. The stat-head consensus is generally with Mauer, but now I’m not so sure about that conclusion.

Our projections find that over the next three years, Morneau will be worth 0.40 wins more than Mauer, and that’s including a sizeable advantage for Mauer as a catcher. I realize that 0.40 wins is nothing, but Morneau also seems to have more upside than Mauer, and more importantly, he doesn’t play a position that’s known for breaking players physically.

If Mauer has to move from the catcher spot, he will lose more than two wins a year of positional adjustment, which would put him way behind Morneau. It’s true that Mauer is two years younger, but again, in terms of physical age, he likely isn’t ahead of Morneau, and is also aging faster.

How come our projections like Morneau so much? Well, Morneau happens to be a very underrated fielder—we project him to be 13 runs above average at first in 2007.

Let’s put the Wang disagreements to bed

Okay, my attempt at a dirty joke failed, but the point is still important. I can’t recall the last time I got as many pissed-off e-mails as I did after running a column suggesting that Chien-Ming Wang would see a strong decline next season.

That assumption was based on batted ball data, but only a year’s worth, with some pretty crude assumptions. These projections are based on up to four years’ worth of batted ball data as well as more basic outcome statistics. And the verdict is … Wang is awesome.

We project a 3.82 ERA next season despite a 63/41 K/BB ratio, mainly because Wang is projected to allow just 12 home runs all year.

Projections are tough to do

This system is the result of hundreds if not thousands of hours of work, which included a horrible false start that almost sank the whole project. We’re happy (we’d be ecstatic if we weren’t so tired) with the results, and we hope that you enjoy our Preseason Book and the projections we’ve worked so hard to generate.

But I have to acknowledge the awesome work done by other keepers of projection systems: Nate Silver, Dan Szymborski, Sean Smith, Ron Shandler, and Tom Tango, to name a few. There’s only so much you can do to make your projections stand out, and all these guys have done a great job generating outstanding sets of projections.

We hope you come to feel the same way about our effort too.
