The Fans and Marcel in 2011: Position Players

With the Super Bowl over and pitchers and catchers reporting in a week, baseball season is finally in sight. For me, one of the most interesting parts of the offseason is the Fans projections. I think they offer an incredible collection of information on what the crowd thinks about the coming year. As the offseason transitions to spring training, I wanted to take stock of these projections.

Last week Tango released the Marcel projections, which give a perfect baseline for comparison with the fan projections (you can also view the Marcels here at FanGraphs). Marcel takes three years of past performance, weights more recent seasons more heavily, regresses toward the mean, and then applies a simple aging factor to come up with its projections. It is designed as the minimum projection system, not taking into account park factors, minor league data or other more complicated inputs. By comparing the fans to it we can get a nice picture of what they are thinking.
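
The three steps just described (weight, regress, age) can be sketched in a few lines of Python. This is a minimal illustration for a single rate stat such as wOBA, not Marcel itself, which works on individual components. The 5/4/3 season weights, the 1,200 PA of league-average performance used for regression, and the aging adjustment around age 29 are values commonly cited for Marcel, but they are not stated in this post, so treat them as assumptions. The function name and arguments are hypothetical.

```python
def marcel_rate(rates, pas, league_rate, age,
                weights=(5, 4, 3), regress_pa=1200, peak_age=29):
    """Marcel-style projection of one rate stat (a sketch, not the real system).

    rates, pas: the last three seasons, most recent first.
    weights, regress_pa, peak_age: assumed constants (see the text above).
    """
    # Step 1: weight each season's PA, most recent season heaviest.
    weighted_pa = [w * pa for w, pa in zip(weights, pas)]
    total_pa = sum(weighted_pa)
    if total_pa == 0:
        weighted_rate = league_rate  # no track record: start from league average
    else:
        weighted_rate = sum(wp * r for wp, r in zip(weighted_pa, rates)) / total_pa

    # Step 2: regress toward the mean by mixing in league-average PA.
    regressed = ((weighted_rate * total_pa + league_rate * regress_pa)
                 / (total_pa + regress_pa))

    # Step 3: simple aging factor (assumed: +0.6% per year under the peak
    # age, -0.3% per year over it).
    if age < peak_age:
        age_adj = (peak_age - age) * 0.006
    else:
        age_adj = (peak_age - age) * 0.003
    return regressed * (1 + age_adj)


# A hypothetical 29-year-old with a .400 wOBA in each of three 600-PA
# seasons, in a .330 league, is pulled part of the way back toward .330.
print(round(marcel_rate([0.400, 0.400, 0.400], [600, 600, 600], 0.330, 29), 3))
```

The less playing time a player has, the more the league-average term dominates, which is why Marcel is cautious about players with short track records.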

Last year we saw that the fans were relatively optimistic, by about half a win, but it turns out they still did fairly well. Today I will look at the position players and break up the comparison into two parts: projected playing time and projected wOBA. I am just looking at players with 15 ballots and grouping all ballots, not separating out favorite-team ballots. In all cases the Marcel projections are on the x-axis and the fans' on the y-axis. The red line marks where Marcel equals the fans (this is not the line of best fit). Thus points below and to the right of the line represent players projected higher by Marcel, and those above and to the left represent players projected higher by the fans. Here are projected PAs.

There is an interesting triangle pattern. There is considerable variation in the fan-projected PAs for the group of players with low Marcel-projected PAs. At the top left of this triangle are players with high fan-projected PAs and low Marcel-projected PAs. I think they fall into two categories: those whose health the fans are optimistic about (Jacoby Ellsbury) and those at the top of a depth chart for the first time (Mike Stanton). For players with a high Marcel PA projection the fans are in agreement, which forms the point of the triangle at the top right.

With few exceptions the fans are much more generous with playing time than Marcel. Marcel regresses playing time heavily, something that the fans seem not inclined to do. Part of this might be selection bias. The players graphed are just those with over 15 ballots. Maybe fans are just not interested in projecting players who lost their starting jobs (who would fall below the red line). Overall I trust the fans with their better depth-chart knowledge, but want some of Marcel’s regression. So I might take 85% of the fans’ PA projections, or something like that.
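
To make the contrast concrete, here is a minimal Python sketch. For Marcel it uses the playing-time equation Tango gives in the comments below (PA = 200 + 0.5*PA1 + 0.1*PA2), and for the fans it applies the rough 85% haircut floated above. The function names, the 250-PA call-up, and the 650-PA fan projection are hypothetical, chosen only for illustration.

```python
def marcel_pa(pa_last_year, pa_two_years_ago):
    """Marcel playing-time forecast, using the equation quoted in the comments."""
    return 200 + 0.5 * pa_last_year + 0.1 * pa_two_years_ago


def discounted_fan_pa(fan_pa, factor=0.85):
    """Shrink a fan PA projection. 0.85 is a rough guess, not a fitted value."""
    return factor * fan_pa


# Tango's example: 600 PA last year, 400 PA two years ago.
print(marcel_pa(600, 400))  # 540.0

# Hypothetical midseason call-up: 250 PA last year, none the year before.
print(marcel_pa(250, 0))  # 325.0

# Hypothetical fan projection of 650 PA, discounted by 15%.
print(discounted_fan_pa(650))
```

The full-timer is pulled down from 600 to 540 and the call-up is pulled up from 250 to 325. Both move toward a middling workload, and that is the regression the fans mostly skip.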

Now to wOBA.

Here there is much more agreement. The relationship is a very tight linear fit, with the fans broadly agreeing with Marcel on wOBA. But there is still systematic deviation from the x=y line. There is a trend for fans to be lower on below-average players and higher on above-average players. I think this also stems from Marcel’s regressing. Below-average performance is going to be regressed up toward average, and above-average performance is going to be regressed down towards average. If the fans are less likely to regress we will see this pattern. For average players the fans are also higher than Marcel, so even on top of the lack of regression to the mean it looks like the fans are optimistic. Here I am a little more likely to trust Marcel over the fans.
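
The reasoning above can be illustrated with a toy model. In this sketch a Marcel-like projection shrinks an observed wOBA toward the league mean, while a fan-like projection keeps the observed value and adds a small optimism bump. The .330 league wOBA, the 0.7 shrink factor, and the 5-point bump are all made-up numbers for illustration, not estimates from the data.

```python
LEAGUE_WOBA = 0.330  # assumed league average, for illustration only


def marcel_like(observed, shrink=0.7):
    """Regress observed wOBA toward the league mean (shrink is hypothetical)."""
    return LEAGUE_WOBA + shrink * (observed - LEAGUE_WOBA)


def fan_like(observed, optimism=0.005):
    """No regression, plus a flat optimism bump (hypothetical size)."""
    return observed + optimism


def gap(observed):
    """Fan projection minus Marcel-like projection."""
    return fan_like(observed) - marcel_like(observed)


# Below-average, average, and above-average hitters.
for woba in (0.280, 0.330, 0.400):
    print(f"observed {woba:.3f}: fans - Marcel = {gap(woba):+.3f}")
```

Under these assumptions the gap is negative for the below-average hitter, slightly positive for the average one, and largest for the above-average one. That is the same shape as the deviation from the x=y line described above.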

Tomorrow I will look at pitchers.







Dave Allen's other baseball work can be found at Baseball Analysts.


10 Responses to “The Fans and Marcel in 2011: Position Players”

  1. phoenix2042 says:

    so if marcel does not look at minor league stats, then when it forecasts midseason call ups, it sees that they played 60 games or something and assumes they will only play 60 games next season too? how exactly does that work?


    • Dave Allen says:

      Close, it takes those 60 games, but then regresses it towards the mean. This is one of the limitations of Marcel and why I think the Fans will better predict playing time.


  2. Austin says:

    I would be really curious to see Tango test the fan projections with some broad adjustment made to them – since we know fairly well the nature and the extent of fan optimism based on past years, wouldn’t it make sense to take advantage of the information that they do provide while adjusting them for what we know is wrong with them? Maybe we could take away a couple of points of AVG and OBP and a few points of SLG, perhaps scaling the adjustments to the wOBA of the hitter. We would also want to take away a couple of UZR runs. Does this make sense? If we just made some broad alterations agnostic of the specifics of the projection, the fans would probably approach the established systems, and based on its recent track record, likely beat a laggard like PECOTA.


  3. tangotiger says:

    By far the biggest value to the Fan forecasts is the playing time (depth chart).

    For the “rate stats”, your suggestion is decent enough that you can look at various components and use those in a regression. I can think of something like SB, if the fan knows a guy has a hammy, etc.

    ***

    Marcel’s playing time forecast is simple enough:
    PA = 200 + 0.5*PA1 + 0.1*PA2
    where PA1 was PA last year and PA2 was PA two years ago.

    So, someone with 600 PA last year and 400 PA two years ago is forecasted at 540 PA this year.

    This is simply an equation based on historical precedent. Basically, if you find the last 30 players to have done that 600/400 split, you’ll see his PA in the next year was an average of 540.

    (Don’t hold me to that exact case though! I don’t guarantee the equation will work for every combination of PA. But, if someone wants to try this particular illustration, go ahead.)


  4. OB says:

    Danny Valencia is an interesting data point in that the fans expect more regression than Marcel despite Valencia being a league-average type hitter last year. That bucks the general trend.


  5. Ben Hall says:

    I find the playing time question an interesting one. It seems to me that when we look at the average playing time of a group of players, it will be strongly influenced by those outliers who had major injuries that resulted in a very small number of games played (and thus plate appearances). But it is very difficult to anticipate such injuries. I’m thinking of something like Ellsbury last year. I would guess that most projection systems had him with fewer PA than the Fans, and thus were closer. But I’m not sure if that’s particularly valuable, because everyone was going to be way off. This is an extreme example, but Pedroia or Youkilis offer similar examples (I’m obviously a Red Sox fan, and last year gives plenty of examples). Is a projection that is just 100 or 200 PAs off (rather than 200 or 300) better? I’m just making up numbers here, and my entire hypothesis may be off, but I’d be very curious to hear what other people think.


    • Rudy Gamble says:

      Ben -
      It matters what you’re looking for in a projection system. If you’re a fantasy baseball player (like me), then playing time estimates are paramount. Tom makes it quite clear that crowdsourcing the playing time estimates is better than any formula. I’d personally feel better using the fans’ playing time estimates and Marcel’s rate statistics but, if forced to choose, I’d prefer the fans’ rate stats over Marcel’s playing time estimates.

      If you’re looking at projections solely to understand a player’s potential, I’d stick with Marcel and focus on AVG/OBP/SLG.

      Rudy


  6. Tack says:

    When does fan voting end?

