Archive for March, 2007

Yankees-Tigers: Live Win Probability

Despite last night’s lack of live win probability, the Yankees-Tigers game is now up and running. There’s no need to refresh the page; live stats will just keep coming.

Don’t forget we’ll have every single game live this season starting opening night. If you have any feedback, please let us know!


Dodgers-Angels: Live Win Probability

Update: It still hasn’t shown up, and I’m assuming it won’t. There may be two live spring training games tomorrow.

Update: So, we were supposed to be getting live data for this game, but obviously it’s not being delivered. I’m hoping it will start to show up, even if it’s late.

We should have the Dodgers-Angels game live tonight. There have been a number of improvements since yesterday, and you should no longer need to refresh the page since it should update by itself.

Overall we’re still getting our feet wet with the live data. I’m sure our live data displays will evolve significantly over time. If you do get a chance to check out the game tonight, we’d certainly like to hear your feedback, positive and negative.


Live Win Probability (Alpha)

If anyone wants to take a peek at some of our Live Win Probability stuff in its extremely early stages, you can check out the remainder of the Mets-Braves spring training game.

I’m not sure if there will be any other live games today, but there should be various games running for the 4 or so days left of spring training.

Please remember this is really an “alpha” product right now, and will hopefully be “beta” worthy by the time the season starts. It’s also only the scoreboard for now; eventually there will be the usual big graph, play-by-play and box scores for each game. We’ll probably have live pitch counts too.

You will have to refresh the page to get the graph to update for the time being. The F5 key is probably the easiest method of doing this.

Time for me to get back to work!


The Golem’s Mighty Swing

I was browsing a used bookstore this past weekend and stumbled upon a graphic novel by James Sturm titled The Golem’s Mighty Swing. I’ve been on a graphic novel kick lately and I couldn’t believe my luck in finding one that was actually about baseball.

Set in the 1920’s, it tells the tale of an all-Jewish barnstorming team called the Stars of David Baseball Club. Their fictional leader, Noah Strauss, was for a short time a member of the Boston Red Sox, playing behind Tris Speaker, Duffy Lewis, and Harry Hooper. Barnstorming in the 20’s was far from a lucrative profession, so when a sports agent approaches the team about dressing up one of their players as a golem (a mythical Jewish protector/destroyer), they have little choice but to say yes. Unfortunately, the golem only heightens the already prevalent anti-semitism in the small towns they play in.

The story is beautifully illustrated in black and white, and when the Stars of David take the field, Sturm has a knack for bringing the game of baseball to life. Within each comic frame, players are perfectly suspended in mid-motion, making a slide into second or a pitcher’s windup seem all the more real.

It’s a quick read, but if you’re a fan of baseball or graphic novels, it’s definitely worth your time.


Box Scores

Box Scores have been added to potentially make your life just a bit easier. They contain all the usual goodies including the position the player played and the order the player hit in the lineup.

If you see anything missing that you feel is essential, just let us know and we’ll try to cram it in, assuming we have it readily accessible in our database.


Pretty Good Daisuke, Pretty Good

After causing a major panic with his March 11th “bombing”, Daisuke Matsuzaka threw quite the gem yesterday. He struck out 7 while allowing only 1 walk and 1 hit in 5 2/3 innings of work against the Pirates. This no doubt gave Red Sox fans that warm fuzzy feeling that was sorely lacking in the 10 days between Daisuke’s starts.

While we learned last week that his March 11th start was fairly typical of high priced pitchers, the 7 strikeouts he recorded yesterday were a rare feat indeed. There have been only nine times this spring that a pitcher has struck out seven or more batters:

Ian Snell – Way back on March 6th, Snell threw 3 innings while striking out 7. Snell showed a lot of promise last year and this spring he’s showing why he’ll be the ace of the Pirates pitching staff (even if no one knows who he is).

Rich Harden – On March 15th the oft-injured Harden threw just 3-plus innings and struck out 9. Then he struck out 7 on March 20th in 5 innings. Overall, Harden has struck out 25 batters in a mere 13 innings of work this spring. Please stay healthy this year! There’s nothing I enjoy more than watching the Ks pile up.

Oliver Perez – He matched Harden on March 15th with 9 strikeouts, but it took him 5 innings to do it. He’s having a fine spring and he was dazzling to watch just three years ago. Perhaps he’ll find some of his 2004 magic in the Mets rotation this season.

Aaron Harang – Three days after Harden and Perez fanned 9, so did Harang. His spring has not been so stellar. He’s given up 28 hits which sets his H/9 at a mere 17-something. On the bright side, he’s still striking out a batter-an-inning, and has given up zero walks.

Scott Kazmir – He struck out 7 in five-plus innings of work on March 18th. Yet he’s walked 6 in 12-plus innings so far this spring. It will be interesting to see if he can recapture the much improved control he exhibited in 2006.

Brett Myers – He also struck out 7 on March 18th. He’s one of those guys who took the off-season “seriously” by shedding 25 pounds off his frame. He’s mentioned that he’s been a bit “uncomfortable” pitching at his new weight, but the discomfort isn’t showing in his stats.

Josh Beckett – The 2006 home run king struck out 8 on March 20th. He’s only given up a single home run in 16-plus innings this spring. He’s also given up just a single walk while he’s struck out 17 batters.


Manny’s Clutch Hitting

There’s an interesting thread going on over at Sons of Sam Horn that eventually delved into Manny Ramirez’s clutch hitting abilities.

There’s currently a stat displayed on FanGraphs called “Clutch”. This Tangotiger invention is the difference between a player’s Win Probability Added, once it has been Leverage adjusted, and his OPS Wins. Simply put, it’s the difference between how well a player did in his actual environment (which includes close-and-late and runners-on-base situations) and how he would have done in a context-neutral environment.

So if we look at Manny Ramirez, he had a -1.36 Clutch last season, and over the past 5 years he has a -4.80 Clutch. His -1.36 last season was among the worst when looking at qualified batters. Needless to say, Clutch suggests he’s anything but a clutch batter.

To further reinforce the point, let’s look at his high/medium/low leverage splits. I considered a Leverage Index of 1.70 or greater “High” and a Leverage Index lower than .75 “Low”. Anything in between was “Medium”.

LI Level     OBP   SLG    OPS
High        .439  .536   .975
Medium      .449  .623  1.072
Low         .426  .649  1.075

The more important the situation, the worse Manny Ramirez does. His on base percentage stays pretty similar, but his slugging percentage takes a rather large hit as the situation becomes more important.

All in all, the numbers in any situation are pretty damn good, but he certainly didn’t elevate his game when the game was on the line.
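
The leverage buckets used for the splits above can be sketched in a few lines of Python. This is only an illustration of the cutoffs stated in the post (1.70 or greater is High, lower than .75 is Low); the function name and the sample LI values are made up for the example.

```python
def leverage_bucket(li: float) -> str:
    """Classify a Leverage Index value using the cutoffs from the post."""
    if li >= 1.70:
        return "High"
    if li < 0.75:
        return "Low"
    return "Medium"


# Hypothetical LI values, for illustration only.
for li in (0.35, 0.75, 1.00, 1.70, 2.40):
    print(f"{li:.2f} -> {leverage_bucket(li)}")
```

Note that the boundaries are not symmetric: exactly 1.70 counts as High, while exactly .75 counts as Medium.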


The Top 10 – Week of 3/12/2007

Sometimes I find it fun to look at which players are being looked at the most on FanGraphs. To me, it’s a bit of a “buzz” meter for certain players. Obviously people are looking at players they’re interested in, and some of it is probably how visible certain players are on FanGraphs. There’s nothing scientific about this, but I thought FanGraphs visitors might enjoy seeing who their fellow visitors are most interested in. Here are the “results” for this past week (3/12 – 3/18).

1. Albert Pujols – This isn’t a huge surprise to me. He’s at the top of pretty much every list and is incredibly visible on FanGraphs as a whole. Apparently everyone wants to see just how great he is!

2. Johan Santana – Also not much of a surprise. He’s the best, hands down.

3. Corey Hart – The Brewers’ outfielder is having a strong spring, but #3? Maybe we have a disproportionate number of Brewers fans visiting the site, or perhaps people are just checking out his projections for their fantasy baseball drafts.

4. Adam Dunn – His extremely high strikeout totals and equally high walk totals make Dunn a constant topic for debate. Not to mention, his power seemed to disappear last season. His stats are always worth some extra scrutiny.

5. Ian Kinsler – Not quite as much of a surprise as Hart since this player is on most fantasy baseball managers’ radar.

6. Manny Ramirez – He’s constantly in the news. I don’t have anything else to add.

7. David Ortiz – His Win Probability numbers are always worth a look. He’s had two rather astounding back-to-back seasons and no MVP to show for it.

8. Derek Jeter – As popular a player as any and another player where it’s especially fun to look at his Win Probability numbers considering his “clutch” reputation. He didn’t live it down… last year.

9. Barry Bonds – Something would be wrong if he wasn’t in the top 10.

10. Ryan Howard – And in the 10-spot, the reigning NL MVP. Everyone must be wondering if he can repeat his 58 home run season.

If people are interested, this could become a weekly feature, with maybe the top 10 and “movers and shakers”.


Community Projections

Tangotiger is conducting his 2007 community projections:

“I’ve seen the results of six forecasting systems this year. (I’m sure some of you have seen more than that.) And all were based on some algorithm with little leeway for human interaction. Why is that? Because we can’t trust any single person’s opinion. But, what if we can get a consensus, a Wisdom of Crowds? Who knows more about whether Papelbon will be a starter or reliever this year: an algorithm or a Redsox fan? Who knows more about the number of games a 2006-injured Hideki Matsui will play in 2007: an algorithm or a Yankees fan? There are certain human observation elements that are critical for forecasting. That’s where you can come in, and why you are here.”

When you have a free moment, head over there and fill in the OPS and ERA projections for your favorite team!


Daisuke Matsuzaka – You’re Not Alone

All anyone can seem to talk about today is how the 103 million dollar pitcher, Daisuke Matsuzaka, was “bombed” yesterday in a spring training start. He gave up 2 home runs, to two “non-roster” players, and ended up surrendering 4 runs (3 earned) in 4 innings, which raised his ERA from 0 to 3.86. He also struck out 3 and walked none.
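
For anyone checking the math, ERA is just earned runs per nine innings. Here is a minimal sketch; the 7 total spring innings is an inference from the 3.86 figure (3 earned runs over 7 innings works out to 3.86) and is not stated in the post.

```python
def era(earned_runs: int, innings: float) -> float:
    """Earned run average: earned runs allowed per nine innings."""
    return 9 * earned_runs / innings


# Sunday's start alone: 3 earned runs in 4 innings.
print(round(era(3, 4), 2))   # 6.75

# Spring total, ASSUMING 7 innings overall (inferred, not given in the post).
print(round(era(3, 7), 2))   # 3.86
```

Partial innings should be passed as fractions, so 5 2/3 innings is `5 + 2/3`, not 5.2.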

What about the highly paid pitchers not named Matsuzaka? Surely some of them had an equally atrocious day. Here were the highlights from Sunday’s action:

Brad Penny ($8.5 Million): He gave up 9 hits and 4 runs yesterday in only 3 innings. He also struck out none and has a 12.86 ERA this spring.

B.J. Ryan ($9.4 Million): 1 inning, 4 hits, 3 runs, 1 strikeout.

Freddy Garcia ($9 Million): 3 innings, 5 hits, 3 runs, and 2 walks. He didn’t strike anyone out.

Mark Buehrle ($9.5 Million): 4 innings, 6 hits, 6 runs, 2 walks, and 4 strikeouts. His ERA stands at 11 this spring. It’s only 1.5 higher than he makes in millions.

And that was only yesterday. On Saturday:

Barry Zito ($18 Million): 4 innings, 5 hits, 3 runs, 2 walks and 4 strikeouts.

Everyone panic!


Play-by-Play

We’ve just added play-by-play data in the Play Log section for each game. At this time, play-by-play data includes the Leverage Index, Run Expectancy, Home Team Win Expectancy, and the Batter’s Win Probability Added (WPA) for each and every play of the 2002-2006 seasons.

Win Expectancy is calculated as the result of the play, while Leverage Index and Run Expectancy are calculated before the play happened.

We’ve also added BRAA, which is the difference between Run Expectancy at the start of the play and the end of the play.
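
To make the play log columns a little more concrete, here is a rough sketch of how a batter’s WPA and a per-play run value could be derived from them. It is not the site’s actual code. The `Play` fields, the 0.5 starting win expectancy, and the sample numbers are all hypothetical, and adding runs scored to the Run Expectancy change is the conventional run-value definition, which is an assumption here since the post only mentions the Run Expectancy difference.

```python
from typing import NamedTuple


class Play(NamedTuple):
    half: str             # "top" (away team batting) or "bottom" (home team batting)
    home_we_after: float  # home team win expectancy as the result of the play
    re_before: float      # run expectancy before the play
    re_after: float       # run expectancy after the play
    runs_scored: int      # runs that scored on the play


def batter_wpa(plays, start_we=0.5):
    """Batter WPA per play: the change in home win expectancy,
    sign-flipped when the away team is batting."""
    previous = start_we  # ASSUMPTION: the game starts at 50%
    result = []
    for play in plays:
        delta = play.home_we_after - previous
        result.append(delta if play.half == "bottom" else -delta)
        previous = play.home_we_after
    return result


def run_value(play):
    """Change in run expectancy plus runs scored (assumed convention)."""
    return play.re_after - play.re_before + play.runs_scored


# Hypothetical plays, for illustration only.
plays = [
    Play("top", 0.48, 0.50, 0.90, 0),     # away batter reaches base
    Play("top", 0.53, 0.90, 0.10, 0),     # away batter hits into a double play
    Play("bottom", 0.60, 0.50, 0.50, 1),  # home batter hits a solo home run
]
print([round(w, 2) for w in batter_wpa(plays)])
print([round(run_value(p), 2) for p in plays])
```

Because Win Expectancy is stored as the result of each play, the “before” value for a play is simply the previous play’s result.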

If you click on the play, you’ll get the pitch sequence for each play in a little pop-up box. The playoff games’ pitch sequence is a little screwed up right now. It “snakes” around, so for the first line it will be “pitch1, pitch2, pitch3, result” and then for the next line “result, pitch3, pitch2, pitch1”. We’ll try to get this cleared up soon.

We’ve also moved all the Win Probability graphs out of the Team section and into the Scoreboard section. We’re just trying to make things a little more organized and it will allow us to eventually vastly enhance our team stats.

If you have any problems or suggestions on how to improve the new scoreboard or play-by-play sections, don’t hesitate to let us know!


Preview: Scoreboard

Thought I’d show a quick preview of our new scoreboard. This way you’ll be able to quickly see all the day’s graphs in one convenient place.

You’ll also notice that the box score and play log links aren’t quite working yet. I’m hoping they will be sometime next week.


THT Projections: A (Quick) Closer Look

Earlier this week the much anticipated Hardball Times 2007 Season Preview was released, and with it a brand new projection system. I recently took a look at Bill James, CHONE, ZiPS, and the Marcel projection systems to see how they differed. Let’s throw THT into the mix and see where it has its major differences.

First off, let’s see how THT fares against the other projection systems in OPS and ERA as a whole when compared to the Marcel projection system (the simplest of the five).

System        ERA-R^2    OPS-R^2
ZiPS             .725       .908
Bill James       .714       .875
CHONE            .699       .865
THT              .681       .837

And in English: when comparing the other projection systems to the Marcel projection system, THT’s system is the least similar (when looking at batters with 300+ at-bats and pitchers with 100+ innings).
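
For the curious, an R^2 like the ones above can be computed as the squared correlation between two systems’ projections for the same players. The post doesn’t spell out the exact calculation, so treat this as a sketch of one reasonable reading. The five OPS values are the Marcel and THT projections for the batters listed further down in this post; since those are the players THT disagrees on most, the result is not comparable to the .837 in the table.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


# OPS projections for Thomas, Ramirez, Cano, Duncan, Cabrera (from this post).
marcel = [.874, .843, .852, .891, .787]
tht = [.982, .714, .766, .753, .715]

print(round(r_squared(tht, marcel), 3))
```

A value near 1 means the two systems rank and space the players almost identically; lower values mean more disagreement.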

So which batters does THT disagree on the most in terms of OPS?

Name            Bill James    CHONE   Marcel     THT    ZiPS
Frank Thomas          .939     .853     .874    .982    .892
Hanley Ramirez        .801     .791     .843    .714    .777
Robinson Cano         .860     .842     .852    .766    .836
Chris Duncan          .862     .776     .891    .753    .803
Melky Cabrera         .766     .796     .787    .715    .800
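
One plausible way to measure “disagreement” is THT’s projection minus the average of the other four systems. The post doesn’t say how the list was built, so this is an assumption, though it does happen to reproduce the order of the table above. The data is copied from that table; the function name is made up.

```python
ops = {
    "Frank Thomas":   {"Bill James": .939, "CHONE": .853, "Marcel": .874, "THT": .982, "ZiPS": .892},
    "Hanley Ramirez": {"Bill James": .801, "CHONE": .791, "Marcel": .843, "THT": .714, "ZiPS": .777},
    "Robinson Cano":  {"Bill James": .860, "CHONE": .842, "Marcel": .852, "THT": .766, "ZiPS": .836},
    "Chris Duncan":   {"Bill James": .862, "CHONE": .776, "Marcel": .891, "THT": .753, "ZiPS": .803},
    "Melky Cabrera":  {"Bill James": .766, "CHONE": .796, "Marcel": .787, "THT": .715, "ZiPS": .800},
}


def disagreement(projections, system="THT"):
    """One system's projection minus the mean of all the other systems."""
    others = [value for name, value in projections.items() if name != system]
    return projections[system] - sum(others) / len(others)


# Rank players by the size of the gap, largest first.
ranked = sorted(ops, key=lambda player: abs(disagreement(ops[player])), reverse=True)
for player in ranked:
    print(f"{player:<16}{disagreement(ops[player]):+.3f}")
```

This only re-orders the five players shown; the real list would come from running the same comparison over every batter with 300+ at-bats.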

Except for Frank Thomas, who THT projects is going to have a phenomenal season, THT has the lowest projection of the five systems for each of the other four players. It’s interesting to note that those four are also first or second year major league players. There’s generally a lot of disagreement about Chris Duncan and Hanley Ramirez, but the THT projections for Robinson Cano and Melky Cabrera appear to be the sole point of difference. Let’s look at the pitchers:

Name            Bill James    CHONE   Marcel     THT    ZiPS
Tony Armas Jr.        4.85     4.64     4.96    5.81    4.88
Carlos Zambrano       3.40     3.47     3.48    2.77    3.46
Cliff Lee             4.43     4.20     4.48    5.04    4.55
James Shields         4.03     4.29     4.72    5.03    4.70
Brandon Webb          3.53     3.60     3.65    3.07    3.85
Randy Johnson         4.31     3.77     4.33    3.43    3.63

THT clearly hates Tony Armas Jr. more than the other systems do, with his ERA about a point higher than the others, while it loves Carlos Zambrano, whom it has at about a .75 lower ERA than the other systems. I threw in Randy Johnson since he was next on the list. It looks like the projections are pretty well divided for him between the 4.30-ish ERA and the 3.50-ish ERA.

Anyway, the THT projections are certainly similar to the others, but there are clearly a number of key differences which are definitely worth a look. There’s also a lot more to projections than ERA and OPS, so I’m sure you’ll find many other unique aspects to THT’s projection system. Like with any projection system, we’ll have to wait and see which one happens to be the most accurate for 2007.


Bill James Projections – Updated

I thought I’d mention that the Bill James Handbook projections on FanGraphs have been updated with the latest and greatest.

“… many things happen during the offseason that change playing time for the coming season. That’s why we produce The Bill James Handbook: Projections Update with cutting-edge projections reflecting changes through the last couple days of February.

We adjust projections for many reasons, including:

-Playing time adjustments
-Free agent signings (including four Japanese rookies)
-Trades
-Injuries
-Ballpark changes”

I think FanGraphs has three of the four rookies in the database now with the exception of Daisuke Matsuzaka. He’ll show up next time he makes a spring training start. For those who can’t wait, the Bill James Handbook has him at 19-2 with a 3.13 ERA in 190 innings.

As always, if you’d like to dice, slice, and sort the Bill James Handbook projections to your heart’s content, you’ll have to purchase them here.

While we’re on the topic of projections, I’d like to give a quick shout-out to the new Hardball Times 2007 Season Preview. Besides the great commentary on teams from many of your favorite bloggers, it has player projections through 2009. I’m still digging in, but it’s full of fun and useful stuff.


Win Probability Changes

You may have noticed the Win Probability numbers have changed slightly. Don’t panic! There have been a few changes, for the better.

First off, we’re now using Tangotiger’s updated win expectancy tables which are no longer a flat 5.0 Runs per Game environment. Instead, we’re using the league-average run environment of the home team’s league. This now puts batters and pitchers on “equal footing” and you should now be able to accurately compare batters and pitchers using WPA.

Second of all, we’re also using Tangotiger’s run expectancy tables to calculate Batting Runs Above Average (BRAA) for both batters and pitchers. Once again the run environment is set at the league-average run environment of the home team’s league.

Next to BRAA there is a column titled “REW”, which stands for Run Expectancy Wins. This is a replacement for OPS Wins because we no longer need to estimate wins in a context neutral environment since we’re now using run expectancy.

Finally, Clutchiness has been shortened to Clutch (Clutchiness was excessively long) and is calculated as WPA/LI – REW.

Update (3/4/2007): Clutch has been switched back to being calculated with OPS Wins. More on this later.
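
As a rough sketch of the Clutch formula above: subtract a context-neutral wins figure (REW, or OPS Wins after the update) from WPA/LI. Treating WPA/LI as the sum of each play’s WPA divided by that play’s Leverage Index is an assumption here, since the post only gives the name of the stat. The function names and sample numbers are hypothetical.

```python
def wpa_li(plays):
    """Sum of per-play WPA divided by that play's Leverage Index.

    ASSUMPTION: this per-play reading of "WPA/LI" is not spelled out
    in the post. Each play is a (wpa, li) pair with li > 0.
    """
    return sum(wpa / li for wpa, li in plays)


def clutch(plays, context_neutral_wins):
    """Clutch = WPA/LI minus context-neutral wins (REW or OPS Wins)."""
    return wpa_li(plays) - context_neutral_wins


# Hypothetical (WPA, LI) pairs, for illustration only.
plays = [(0.10, 2.0), (-0.03, 0.5), (0.02, 1.0)]
print(round(wpa_li(plays), 3))        # 0.01
print(round(clutch(plays, 0.03), 3))  # -0.02
```

A negative result means the player did worse in his actual situations than his context-neutral numbers would suggest.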

Typically players remain in the same order, but their values have changed slightly. Batters should be slightly more valuable and pitchers slightly less valuable based on WPA scores.


Heath Bell – Maybe This Year

I’ll admit, I’m a Heath Bell proponent. Last year I expected big things from the 28-year-old reliever who ended up posting a 5.11 ERA in just 22 relief appearances. He didn’t quite live up to my lofty expectations:

“Don’t be surprised if he becomes an important piece of the Mets bullpen next season.”

Well Heath, it’s a new year and you have a brand new team (Padres) with new fans to impress. Let’s see where things went wrong last year and if they’re going to happen again this year.

He has pretty much everything you’re looking for in a relief pitcher: high strikeout rate, low walk rate, and he’s even a ground ball pitcher. He’s clearly mastered Triple-A where in 2006 his K/9 was over 14! Not to mention he posted an ERA of 1.29 in 35 innings.

Yet what has plagued him in the majors is his extraordinarily high batting average on balls in play (BABIP). His BABIP was .374 in 2005 and an insanely high .394 in 2006, which just happened to be the highest in the majors for pitchers with over 30 innings pitched. The same problem plagued him last year in AAA too, where he had a .378 BABIP, the 11th highest at the AAA level.

Typically with a BABIP this high, you’d think he was just getting unlucky, but it’s hard to ignore the past two years’ worth of data, so despite his incredible peripherals, maybe this is just who he is?

It’s clear the Mets, at least at the major league level, never had a whole lot of confidence in him. Of the regular relievers, he was used in the least important situations possible. His average Leverage Index (LI) was a measly 0.35, with a Leverage Index of 1 being an average situation (the higher the leverage, the more important the situation). The previous year was not much different: his LI was 0.65, the third lowest on the team.

His 2006 ERA of 5.11 is mainly the result of 3 games which were completely out of hand before he even entered the game.

– On 9/26 he entered the game with the Mets trailing by 6 runs and gave up another 6 runs.

– On 9/11 he entered the game with the Mets trailing by 6 runs and gave up 5 additional runs.

– On 7/2 he entered the game with the Mets trailing by 3 runs and gave up 4 runs and an additional 4 unearned runs.

So, if we take away these three horrible (meaningless) outings, his ERA ends up being 1.76. Maybe you question whether the game on 7/2 was completely meaningless. If we leave that one in, his ERA is still a pretty nice 2.72.

Heath Bell is getting a fresh start this year and despite his historically awful BABIP, his strikeout and walk rates are just too good to ignore. I’ll stick with my same prediction as last year: I’d be surprised if Heath Bell didn’t become an important fixture in the Padres bullpen.


Spring Training

Don’t worry Cactus and Grapefruit Leagues; you have not been forgotten. 2007 Spring Training stats are now included in the FanGraphs player stats pages. Unfortunately, the stats are very basic. But, at least you can get a feel for how your favorite players are doing.

These will be updated nightly and Spring Training leaderboards should be up sometime tomorrow.