FanGraphs Baseball



  1. Excellent as usual Dave. But would Kemp be doing as well if he were batting 6th or 7th? ;)
    P.S. So go to the next step. We probably gave the Dodgers a little less credit than they deserved. Do you think they can now make the playoffs in the NL West? I know that’s not exactly the object of the article, but your thoughts would be welcome. Take the next step!!! :)

    Comment by Chicago Mark — April 30, 2012 @ 12:26 pm

  2. Isn’t the point of getting off to a strong start to help your chances of making the playoffs? As a fan, I’m not really too concerned with what the Dodgers’ or Rangers’ winning percentage might be for the rest of the year, but with whether they’ll use that advantage to get to the playoffs. This article doesn’t address that at all. I’d be interested to see how many of those 45 teams made the playoffs, and how many would have had they played in the current format (three division winners plus two wild cards).

    Comment by Jake — April 30, 2012 @ 1:05 pm

  3. Yup. 22 games out of a 162 game season shouldn’t get anyone too riled up.

    Comment by Slartibartfast — April 30, 2012 @ 1:31 pm

  4. Make sure to tell David Schoenfield.

    Comment by Andrew — April 30, 2012 @ 1:49 pm

  5. Well, yes, the correlation between hot starts and true talent is low. However, there is still a modest positive correlation, and the start changes what a team needs to do the rest of the way. If a team has .554 true talent and starts 16-6, that has a big impact compared with starting 6-16, because you can only assume true-talent performance going forward. Teams don’t compensate for a cold start by overperforming later.

    Comment by Colin — April 30, 2012 @ 1:51 pm

  6. of course they can! what a silly question! they are already 10 games above .500 and have a 4-game lead in the division, that means if they play .500 ball the rest of the way they will finish 86-76, and the regression “rule of thumb” above pegs them as a better than .500 team.

    I think it’s obvious they’d have to be the front runners right now, this isn’t a division that had a clear-cut favorite coming into the season.

    Comment by batpig — April 30, 2012 @ 1:53 pm
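
The arithmetic in comments 5 and 6 can be checked with a minimal sketch. It assumes a 162-game schedule and that a team simply keeps its current record and plays the remaining games at a fixed winning percentage; the function name `projected_final_record` is made up for illustration.

```python
def projected_final_record(wins, losses, rest_win_pct, season_games=162):
    """Bank the current record, then play the remaining games at rest_win_pct."""
    remaining = season_games - wins - losses
    final_wins = wins + rest_win_pct * remaining
    return final_wins, season_games - final_wins


# Comment 6: a 16-6 team that plays .500 ball the rest of the way.
print(projected_final_record(16, 6, 0.500))  # (86.0, 76.0)

# Comment 5: a .554 true-talent team after a 16-6 start versus a 6-16 start.
hot_wins, _ = projected_final_record(16, 6, 0.554)
cold_wins, _ = projected_final_record(6, 16, 0.554)
print(round(hot_wins, 1), round(cold_wins, 1))  # 93.6 83.6
```

Under these assumptions the two starts differ by exactly the ten banked wins, since the rest-of-season projection is the same in both cases.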

  7. You’re mistaken. He did, in fact, address the importance of the first month in “banking wins” to help make the playoffs.

    Comment by Nick Lindner — April 30, 2012 @ 1:54 pm

  8. You’re a genius, batpig! I wasn’t asking your opinion though. I wanted to hear from Dave. That being said, I’d still peg the DBacks as the favorites. So what does DAVE think?

    Comment by Chicago Mark — April 30, 2012 @ 2:04 pm

  9. This article should be mailed to Orioles fans everywhere.

    Comment by BX — April 30, 2012 @ 2:08 pm

  10. I’ve now read this article three times to make sure, and I still don’t see it. The only reference to the playoffs or the post-season comes in the fifth paragraph: “The fact that these teams won 54 percent of their May-September games shows that the sample was primarily made up of playoff contenders, so we shouldn’t pretend that a strong start to the season is meaningless.”

    If you’re referring to the article he linked to, then I apologize. I don’t subscribe to either site, so I haven’t read that. In either case, I think it would be more useful to expand upon that snippet right there than to say that teams are unlikely to maintain a .700 winning percentage over the remainder of the year. It’s more of an impact-over-correlation thing.

    Comment by Jake — April 30, 2012 @ 2:12 pm

  11. The real question is “are the first month of games more predictive than a similar sample of games from other times in the season?” Using a restricted sample, you’ve shown that a small sample of games is a weak predictor of a team’s ability to win games going forward. Honestly, everybody knew this. It’s a good part of the reason why the season is so long. The question is whether the first 22 games are more predictive than 22 games from elsewhere in the season.

    To properly address this question you need to look at the record of all teams (not just ones arbitrarily selected that have won 70%) and calculate the difference between their win expectancy (linear extrapolation) based upon the first 22 games and the actual number of games they win. You then need to compare this to the same difference for other samples of 22 games (or windows of 22 games) chosen throughout the rest of the season. Where do the first 22 games fall in this distribution? ….probably smack in the middle.

    Comment by Jason H — April 30, 2012 @ 2:14 pm

  12. Jake, it’s in the last paragraph. Dave writes “A great first month to the season is mostly useful for putting wins in the bank that count in the final standings”

    Comment by vivalajeter — April 30, 2012 @ 2:36 pm

  13. Yeah! With a note that says, “Don’t enjoy the success your club is having after all those years in the wilderness.”

    (Or we could just let them enjoy it for a bit)

    Comment by Oliver — April 30, 2012 @ 2:36 pm

  14. Jason, with all due respect, you can decide the *real* question when you start writing your own articles. It’s silly to tell Dave that he’s asking and answering the wrong question, just because you might want there to be a separate article. Dave wanted to see how strong starts translate over the course of the season, so he looked at teams with strong starts. Why would he include all teams in the data, rather than just teams that had strong starts?

    Comment by vivalajeter — April 30, 2012 @ 2:42 pm

  15. There is one issue with the analysis. Among the clubs that go .700+ early are a disproportionate number that run away with it and for whom the late games do not mean much. It can be a different game after September 1, and this is particularly so for teams that have a lead of 10 games or more. I suspect that the correlation would be a little tighter if you used May 1-August 31. But not much.

    Comment by Mike Green — April 30, 2012 @ 2:49 pm

  16. No one uses mail anymore. Try faxing it.

    Comment by abreutime — April 30, 2012 @ 3:00 pm

  17. Viva,

    Ok that is fine. However, I don’t think Dave actually answered any question because he really didn’t compare his data to anything. He basically showed that small samples are not predictive.

    Comment by Jason H — April 30, 2012 @ 3:10 pm

  18. Chicago Mark, you don’t seem to “get” the Internet. You can ask specific people who write widely-read articles to answer your specific questions, but mostly they will ignore you. Then you can either scorn your fellow readers’ answers to those same questions and resign yourself to eternal monologue, or you can try not to be a dick and just engage in conversation.

    Comment by Anon21 — April 30, 2012 @ 4:18 pm

  19. I’m curious whether we can improve this regression algorithm using RS and RA. What’s the correlation between April Pythag and May-Sept. Pythag?

    Right now the Dodgers (Pythag = .60) and Mets (.40) are big over-achievers, while the Cardinals (.78) and Rangers (.76) are underperforming.

    Comment by philosofool — April 30, 2012 @ 5:06 pm
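
For readers unfamiliar with the "Pythag" record mentioned in comment 19, here is a minimal sketch of the classic Pythagorean expectation, assuming the traditional exponent of 2 (other exponents are in use, and the thread does not say which one the commenter had in mind). The run totals in the example are made-up illustrative numbers, not any team's actual April figures.

```python
def pythag_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Pythagorean expected winning percentage: RS^x / (RS^x + RA^x)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# Hypothetical team with 100 runs scored and 80 allowed.
print(round(pythag_win_pct(100, 80), 3))  # 0.61
```

Computing this for April and for May through September across many team-seasons, then correlating the two columns, would be one way to answer the question the comment raises.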

  20. I don’t get it. Was the correlation only run with the teams with .700+ winning percentages? If I understand correctly, then you are comparing the teams that won 80% to the ones that won 70% to see if the ones that won 80% fare better than the ones that won 70%. So we’re talking about a 2-3 win difference from the high end to the low end, and we shouldn’t be surprised that there isn’t much correlation. If you want to see if there is a correlation between the first month and the other months, why not use all the teams? What am I missing?

    Given that no team finishes the season with a .743 winning percentage, it is also unrealistic to expect the teams to continue this way. The fact that they won at a .549 clip (probably only 30% of teams win this many over the year) seems to say that a hot start is somewhat of a predictor of success.

    Comment by wahooo — April 30, 2012 @ 5:06 pm

  21. I don’t put much faith in Pythag records at this point in the year. Over the course of the season it might work out well because things even out, but in small samples they can get out of whack. The Mets gave up 13 runs in one inning in Colorado last weekend. Fluky events like that will have too much of an impact over the course of ~20 games.

    That’s not to say they’re not overachieving – they are, mainly because of their record in one-run games – but overall I don’t pay attention to Pythag until we’re further into the season.

    Comment by vivalajeter — April 30, 2012 @ 5:35 pm

  22. Did you do any work with slow starts? I only ask because the Angels are curious.

    Comment by JWTP — April 30, 2012 @ 5:37 pm

  23. Brilliant! “We should be careful not to overreact to the results of April performances, but also understand that they do carry some meaning,”

    Comment by jim mcAulife — April 30, 2012 @ 7:05 pm

  24. I fail to understand how a ~.550 winning % going forward “doesn’t mean that much”. Seems to me that it means a lot. .550 is nearly a 90-win pace. Sure, if you expected a team that started hot to do well, you might think you hadn’t learned much. But take a team at random, then learn they did this well in April, and thus could be expected to finish out the year playing .550 ball. Surely that’s a lot of information? I was expecting you to say something like .510 or .520, in which case I’d have agreed with the article title.

    Comment by Todd — April 30, 2012 @ 8:01 pm

  25. Strong teams regress. But by not including weak April records in your dataset, I’m not sure what is going on.

    I can take a glance at the all-time record wins for a season and figure out that you are not going to get a strong correlation (.743*162 = 120.366).

    If you did a Spearman’s Rank correlation of April Standings vs May vs September standings I think one might find a little more meaning in an early strong start.

    Comment by Nick44 — April 30, 2012 @ 9:23 pm
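
The Spearman rank correlation suggested in comment 25 is just a Pearson correlation computed on ranks rather than raw values. Below is a minimal standard-library sketch, assuming two equal-length lists of winning percentages (for example, April and rest-of-season) and average ranks for ties; the helper names are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because it only looks at ordering, this measure is not distorted by the fact that a .743 pace cannot be sustained over a full season; it asks only whether teams that ranked high in April still rank high later.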

  26. That’s not at all what Jake was talking about.

    Comment by bstar — April 30, 2012 @ 10:05 pm

  27. Excellent point here, Mike.

    Comment by bstar — April 30, 2012 @ 10:08 pm

  28. The 1984 Detroit Tigers started the season something like 25-5 (.833) and finished the season with a 79-53 (.598) record for a total of 104-58 (.642). This was a very good team that trailed off, but not that much to the point that they dogged it coming to the finish line. And to top it off, they won the WS vs. the Padres.

    Comment by Boomer — April 30, 2012 @ 10:19 pm

  29. Does this mean I will stop hearing pundits tell me that the Rangers have the division sewn up? I am SO tired of hearing that the Angels have no chance at the division title now. Is it likely that they win it? Well, it’s less likely than it was on April 4th. I just need to turn the volume off when I have a game on. Former players are the worst. No, I have not contributed anything to this discussion. I’m okay with that.

    Comment by sportsczar — April 30, 2012 @ 11:23 pm

  30. The question is which is more reliable, not whether it is reliable in some (non existent) absolute sense.

    Also, the Mets have a -20 run differential, so you can’t chalk it all up to a 13-run inning.

    Comment by philosofool — May 1, 2012 @ 12:15 am

  31. My guess is Dave thinks somebody already answered your question adequately.

    Comment by Sam Samson — May 1, 2012 @ 6:30 am

  32. Just like the Red Sox and Braves ran away with the Wild Card last year!

    I’m joking but still…

    Comment by Los — May 1, 2012 @ 9:36 am

  33. exactly what I was trying to say above, but better summarized by Todd. I’m not sure what these statistics really tell us. It is unrealistic to think that a team that is winning at a 70% clip will continue to win at that rate–the fact that the teams with that winning percentage do pretty well seems to say there is some correlation.

    Also, I disagree with the premise that the games in the bank are more important. A team with a .743 winning percentage will win 6 more games than a .500 team over 25 games, whereas a .549 team will win 7 more games over 147 than a .500 team. How is it that the banked games are more important?

    Comment by wahooo — May 1, 2012 @ 10:21 am
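
The comparison in comment 33 is simple enough to verify directly. The sketch below takes the commenter's figures as stated; note that 25 + 147 is more than a 162-game schedule, so the last line also shows the result for the 137 games that would actually remain. The function name is made up for illustration.

```python
def wins_above_500(win_pct, games):
    """Extra wins over a .500 team across the given number of games."""
    return (win_pct - 0.5) * games


print(round(wins_above_500(0.743, 25), 1))   # 6.1
print(round(wins_above_500(0.549, 147), 1))  # 7.2
print(round(wins_above_500(0.549, 137), 1))  # 6.7
```

Either way, the two stretches contribute a similar number of wins above .500, which is the commenter's point.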

  34. I think the “banking wins” statement is mostly wrong.

    Comment by wahooo — May 1, 2012 @ 10:24 am

  35. What are you talking about? It is well known that no team that was ever ten games out at any point has ever won its division. Also, former players bring much-needed insight as to how various clubhouses (both by sq. ft. and locker size) as well as a particular city’s restaurant and golf scene impact playoff races.

    Comment by bpdelia — May 1, 2012 @ 1:06 pm
