
Evaluating 2012 Projections

Posted by Will Larson on February 21, 2013, in Research, Strategy

Hello, loyal readers.  It’s time for the annual evaluation of last year’s player projections.  Last year saw Gore, Snapp, and Highley’s AggPro forecasts win among hitter projections (http://www.fangraphs.com/community/comparing-2011-hitter-forecasts/) and Baseball Dope win among pitchers (http://www.fangraphs.com/community/comparing-2011-pitcher-forecasts/).  In general, projections computed using averages or weighted averages tended to perform best among hitters, while for pitchers, structural models computed using “deep” statistics (K/9, HR/FB%, etc.) did better.

2012 Summary

In 2012, there were 12 projections submitted for hitters and 12 for pitchers (11 systems submitted projections for both).  The evaluation only considers players for whom every projection system has a projection.

As Table 1 shows, Dan Rosenheck blew away the competition as the best forecaster, taking 1st among pitchers and 3rd among hitters.  My personal projections (Larson) took 2nd, with the Steamer projections taking 3rd overall.  Bringing up the rear were the Marcel, Guru, and CAIRO projections.

| System | Hitters | Pitchers | Average |
|---|---|---|---|
| Rosenheck | 4.40 | 1.25 | 2.83 |
| Larson | 4.40 | 3.75 | 4.08 |
| Steamer | 5.20 | 3.50 | 4.35 |
| CBS Sportsline | 3.60 | 5.50 | 4.55 |
| Fangraphs Fans | 4.60 | 6.25 | 5.43 |
| ESPN | 6.00 | 6.25 | 6.13 |
| ZIPS | 6.20 | 7.75 | 6.98 |
| Rotochamp | 8.20 | 7.00 | 7.60 |
| Marcel | 9.80 | 9.75 | 9.78 |
| Guru | 8.80 | 10.75 | 9.78 |
| CAIRO | 9.60 | 10.75 | 10.18 |
| Smith | na | 5.50 | na |
| GQE | 7.20 | na | na |

Table notes: Table 1 shows each system’s average rank across the fit (r^2) categories reported below.  For example, Fangraphs Fans took 7th in Runs, 2nd in HRs, 9th in RBIs, 3rd in AVG, and 2nd in SBs, for an average rank of 4.6 in the hitter categories.
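To make that note concrete, here is a minimal sketch (in Python, which the article itself does not use) of how an average rank like the ones in Table 1 can be computed from per-category fit values.  The two systems and four numbers are taken from the hitter fit table below purely for illustration.

```python
import pandas as pd

# Toy fit (r^2) values for two systems across two hitter categories,
# copied from the hitter fit table below just for illustration.
fits = pd.DataFrame({
    "system": ["Steamer", "Marcel", "Steamer", "Marcel"],
    "category": ["HR", "HR", "AVG", "AVG"],
    "r2": [0.431, 0.364, 0.339, 0.291],
})

# Higher r^2 is better, so rank within each category in descending order.
fits["rank"] = fits.groupby("category")["r2"].rank(ascending=False)

# Averaging those ranks across categories gives a Table 1 style summary.
summary = fits.groupby("system")["rank"].mean().sort_values()
print(summary)
```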

Detailed Forecast Analysis: Hitters

I look at two main bases of comparison.  The first is the Root Mean Squared Error (RMSE) of each forecast; the second is how well each forecast explains player-to-player variation once bias is removed.  Bias is important to consider because it is easily removed from a forecast, and it can mask an otherwise good forecasting approach.  For example, the Marcel projections show very little bias, giving them a low RMSE, but they are poor at predicting variation among players, which means they are not a terribly good forecast if you’re trying to rank expectations of future performance.
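As an illustration of the distinction, here is a minimal sketch, with made-up numbers, of an RMSE calculation and of what removing a constant bias does to it.  The article does not publish code, so this is just one way the idea could be implemented.

```python
import numpy as np

# Made-up actual and projected HR totals for a handful of players.
actual = np.array([25.0, 10.0, 32.0, 18.0, 7.0])
forecast = np.array([22.0, 14.0, 27.0, 15.0, 11.0])

errors = forecast - actual
rmse = np.sqrt(np.mean(errors ** 2))

# The bias is the mean error; subtracting it out shows how the system
# would look if its constant over- or under-projection were corrected.
bias = errors.mean()
rmse_no_bias = np.sqrt(np.mean((errors - bias) ** 2))

print(f"RMSE: {rmse:.2f}  bias: {bias:.2f}  RMSE with bias removed: {rmse_no_bias:.2f}")
```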

Hitter RMSE Results:

| system | R | rank | HR | rank | RBI | rank | AVG | rank | SB | rank | AVG rank |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Guru | 23.523 | 5 | 8.525 | 7 | 24.807 | 2 | 0.032 | 1 | 6.630 | 1 | 3.2 |
| CAIRO | 24.577 | 7 | 8.108 | 2 | 24.186 | 1 | 0.039 | 4 | 7.054 | 7 | 4.2 |
| GQE | 20.954 | 1 | 8.377 | 4 | 26.829 | 7 | 0.040 | 5 | 6.889 | 5 | 4.4 |
| ESPN | 23.950 | 6 | 8.066 | 1 | 25.113 | 3 | 0.042 | 12 | 6.729 | 3 | 5 |
| Larson | 21.284 | 2 | 8.153 | 3 | 26.316 | 5 | 0.041 | 11 | 6.779 | 4 | 5 |
| CBS Sportsline | 25.414 | 10 | 8.506 | 6 | 26.788 | 6 | 0.036 | 2 | 6.720 | 2 | 5.2 |
| Fangraphs Fans | 26.797 | 12 | 8.488 | 5 | 25.755 | 4 | 0.037 | 3 | 6.929 | 6 | 6 |
| Marcel | 23.299 | 4 | 9.032 | 10 | 27.048 | 8 | 0.040 | 6 | 7.746 | 10 | 7.6 |
| Rosenheck | 21.612 | 3 | 9.455 | 12 | 28.568 | 10 | 0.040 | 7 | 7.257 | 8 | 8 |
| ZIPS | 25.408 | 9 | 8.918 | 8 | 32.646 | 12 | 0.041 | 10 | 7.532 | 9 | 9.6 |
| Rotochamp | 26.103 | 11 | 8.929 | 9 | 28.334 | 9 | 0.041 | 8 | 7.973 | 12 | 9.8 |
| Steamer | 24.721 | 8 | 9.331 | 11 | 28.791 | 11 | 0.041 | 9 | 7.961 | 11 | 10 |

This table presents the RMSE of the forecasts.  The RMSE is essentially the average forecast error, in absolute value, and is expressed in the units of the statistic.  So, for HRs, each system is between 8 and 10 home runs off for each player, on average.  Here, we see that the Guru forecasts have the lowest RMSE overall, doing best at projecting AVG and SBs.  John Grenci’s GQE forecast does best for Runs, ESPN for HRs, and CAIRO for RBIs.

But what if a projection is great at ranking players but is terrible at projecting their actual output? Such a projection would still hold value because of the information contained in the player-to-player variation. In fact, this information is probably more valuable than the actual level of output for the player.

Hitter Fit Results:

| system | R | rank | HR | rank | RBI | rank | AVG | rank | SB | rank | AVG rank |
|---|---|---|---|---|---|---|---|---|---|---|---|
| CBS Sportsline | 0.336 | 5 | 0.430 | 5 | 0.388 | 1 | 0.303 | 6 | 0.602 | 1 | 3.6 |
| Larson | 0.375 | 3 | 0.438 | 1 | 0.381 | 2 | 0.291 | 9 | 0.560 | 7 | 4.4 |
| Rosenheck | 0.377 | 2 | 0.432 | 3 | 0.376 | 4 | 0.309 | 5 | 0.555 | 8 | 4.4 |
| Fangraphs Fans | 0.322 | 7 | 0.432 | 2 | 0.298 | 9 | 0.312 | 3 | 0.602 | 2 | 4.6 |
| Steamer | 0.326 | 6 | 0.431 | 4 | 0.356 | 5 | 0.339 | 1 | 0.545 | 10 | 5.2 |
| ESPN | 0.345 | 4 | 0.424 | 6 | 0.380 | 3 | 0.225 | 11 | 0.567 | 6 | 6 |
| ZIPS | 0.240 | 11 | 0.398 | 7 | 0.310 | 7 | 0.338 | 2 | 0.576 | 4 | 6.2 |
| GQE | 0.390 | 1 | 0.340 | 12 | 0.341 | 6 | 0.146 | 12 | 0.568 | 5 | 7.2 |
| Rotochamp | 0.270 | 8 | 0.360 | 10 | 0.291 | 10 | 0.277 | 10 | 0.586 | 3 | 8.2 |
| Guru | 0.268 | 9 | 0.343 | 11 | 0.300 | 8 | 0.298 | 7 | 0.550 | 9 | 8.8 |
| CAIRO | 0.204 | 12 | 0.369 | 8 | 0.261 | 12 | 0.309 | 4 | 0.500 | 12 | 9.6 |
| Marcel | 0.247 | 10 | 0.364 | 9 | 0.280 | 11 | 0.291 | 8 | 0.522 | 11 | 9.8 |

This table reports the r^2 of the simple regression actual = b1 + b2*forecast + e.  The b1 term captures ex-post bias, allowing b2 to better capture the information content in the forecast.  Here, we see that while CBS Sportsline is in the middle of the pack when it comes to accuracy, it’s far and away the best at measuring player-to-player variation, with the best SB and RBI projections.  The Larson projections were best for HRs, GQE for Runs, and Steamer for AVG.
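For readers who want to reproduce a fit number, here is a minimal sketch with made-up values.  With a single regressor and an intercept, the r^2 of the regression above is just the squared Pearson correlation between forecast and actual, since the intercept absorbs any constant bias.

```python
import numpy as np

# Made-up projected and actual values for a few players.
forecast = np.array([22.0, 14.0, 27.0, 15.0, 30.0])
actual = np.array([25.0, 10.0, 32.0, 18.0, 24.0])

# In the regression actual = b1 + b2*forecast + e, the intercept b1 absorbs
# constant bias, and r^2 equals the squared Pearson correlation.
r = np.corrcoef(forecast, actual)[0, 1]
print(f"r^2 = {r ** 2:.3f}")
```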

In general, the Fans, Rosenheck, and Larson projections are all essentially averages of other projections.  For hitters, this strategy seems to do best, a result that has held for the last three years.
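A minimal sketch of that averaging strategy, using hypothetical player names, systems, and numbers: each player’s consensus projection is simply the mean of the individual systems’ projections (a weighted version would weight each system by its past accuracy).

```python
import pandas as pd

# Hypothetical HR projections from several systems for two players.
projections = pd.DataFrame({
    "player": ["Player A", "Player A", "Player A", "Player B", "Player B"],
    "system": ["Steamer", "ZIPS", "Marcel", "Steamer", "ZIPS"],
    "hr": [28, 31, 26, 12, 15],
})

# Simple (unweighted) average across systems for each player.
consensus = projections.groupby("player")["hr"].mean()
print(consensus)
```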

Detailed Forecast Analysis: Pitchers

We can perform the same analysis for pitchers.

Pitcher RMSE Results:

| system | W | rank | ERA | rank | WHIP | rank | SO | rank | AVG rank |
|---|---|---|---|---|---|---|---|---|---|
| Rosenheck | 3.439 | 1 | 1.402 | 2 | 0.280 | 5 | 38.343 | 1 | 2.25 |
| Steamer | 3.777 | 4 | 1.368 | 1 | 0.271 | 1 | 41.520 | 4 | 2.5 |
| Larson | 3.682 | 3 | 1.403 | 3 | 0.273 | 2 | 40.938 | 3 | 2.75 |
| Smith | 3.606 | 2 | 1.428 | 5 | 0.290 | 7 | 38.750 | 2 | 4 |
| ZIPS | 4.376 | 12 | 1.413 | 4 | 0.275 | 3 | 44.135 | 8 | 6.75 |
| Marcel | 3.932 | 6 | 1.458 | 8 | 0.291 | 8 | 42.249 | 5 | 6.75 |
| Guru | 4.000 | 8 | 1.438 | 6 | 0.352 | 11 | 42.681 | 6 | 7.75 |
| CAIRO | 3.901 | 5 | 1.502 | 11 | 0.426 | 12 | 43.908 | 7 | 8.75 |
| ESPN | 4.158 | 10 | 1.502 | 10 | 0.279 | 4 | 46.271 | 11 | 8.75 |
| Rotochamp | 4.103 | 9 | 1.456 | 7 | 0.295 | 9 | 45.431 | 10 | 8.75 |
| Fangraphs Fans | 4.193 | 11 | 1.476 | 9 | 0.283 | 6 | 46.558 | 12 | 9.5 |
| CBS Sportsline | 3.968 | 7 | 1.523 | 12 | 0.301 | 10 | 44.182 | 9 | 9.5 |

Here, we see that Dan Rosenheck’s projections are best in terms of RMSE, leading in Wins and Strikeouts.  Steamer takes the other two categories, ERA and WHIP.  As before, overall fit is probably more interesting; those results appear in the table below.

Pitcher Fit Results:

| system | W | rank | ERA | rank | WHIP | rank | SO | rank | AVG rank |
|---|---|---|---|---|---|---|---|---|---|
| Rosenheck | 0.559 | 1 | 0.118 | 1 | 0.183 | 2 | 0.559 | 1 | 1.25 |
| Steamer | 0.495 | 5 | 0.113 | 2 | 0.187 | 1 | 0.503 | 6 | 3.5 |
| Larson | 0.502 | 4 | 0.102 | 3 | 0.165 | 3 | 0.506 | 5 | 3.75 |
| CBS Sportsline | 0.545 | 2 | 0.071 | 8 | 0.066 | 10 | 0.558 | 2 | 5.5 |
| Smith | 0.516 | 3 | 0.075 | 7 | 0.078 | 9 | 0.544 | 3 | 5.5 |
| ESPN | 0.468 | 7 | 0.060 | 9 | 0.153 | 5 | 0.517 | 4 | 6.25 |
| Fangraphs Fans | 0.479 | 6 | 0.081 | 6 | 0.136 | 6 | 0.500 | 7 | 6.25 |
| Rotochamp | 0.460 | 8 | 0.093 | 5 | 0.098 | 7 | 0.475 | 8 | 7 |
| ZIPS | 0.373 | 12 | 0.095 | 4 | 0.155 | 4 | 0.442 | 11 | 7.75 |
| Marcel | 0.430 | 10 | 0.045 | 12 | 0.081 | 8 | 0.459 | 9 | 9.75 |
| Guru | 0.426 | 11 | 0.045 | 11 | 0.059 | 11 | 0.455 | 10 | 10.8 |
| CAIRO | 0.434 | 9 | 0.058 | 10 | 0.035 | 12 | 0.428 | 12 | 10.8 |

The Rosenheck projections turn out not only to be the most accurate, but also to capture the most player-to-player variation.  For one projection system to do best on both counts is remarkable: Dan’s projections lead in Wins, ERA, and SO, and take 2nd in WHIP, behind Steamer.  Clearly, Dan knows something that the rest of us don’t.

I asked Dan the secret to his success, and he suggested that he uses some combination of structural modeling and forecast averaging to arrive at his projections.  This is in contrast to the Larson projections, which are weighted averages of forecasts with no structural models, or the Steamer projections, which are entirely structural.

Key Points

This is the third year that I have evaluated this set of forecasts.  For hitters, no single structural forecast ever appears to do very well.  However, these structural forecasts can be combined into a weighted average that seems to do better than any individual one.

On the other hand, pitchers appear to be much more forecastable by an individual system.  Averaging still works, but people have forecast pitchers well without averaging.

Congratulations to Dan on his forecasting dominance in 2012, and best of luck to everyone in 2013.

All of the non-proprietary numbers in this analysis can be found at my little data repository website, http://www.bbprojectionproject.com.  I encourage everyone to submit projections for consideration in 2013!

