FanGraphs Baseball



  1. I’m not surprised by the lack of year-to-year correlation. Apparent “streakiness” is probably mostly the result of scheduling and luck.

    Comment by Pete — January 21, 2011 @ 11:53 am

  2. Man, this is absolutely fascinating. Brilliant analysis.

    Comment by bsally — January 21, 2011 @ 11:53 am

  3. incredible stuff… great job

    Comment by Adam D — January 21, 2011 @ 11:57 am

  4. amazing work

    Comment by chel — January 21, 2011 @ 11:57 am

  5. One of the best articles on FanGraphs since its beginning. I’m surprised they volunteered to be upstaged so thoroughly, to be honest, haha.

    Comment by Oscar — January 21, 2011 @ 12:13 pm

  6. Fantastic. That first scatter-plot is beautiful.

    Comment by MikeS — January 21, 2011 @ 12:15 pm

  7. totally fascinating. i had thought otherwise, using my limited observed and perception-impacted following of baseball; that there were such things as streaky and consistent players. i always thought it would help me to know this in my head to head weekly fantasy leagues. knowing these results, i will rest easy putting no stock into this factor.

    i would be interested in the results, however, for fielding. i’m sure there would be inherent sample size limitations of using, say, uzr, over a 7 day rolling period given that uzr is not thought to stabilize even over a full season, but there is the old thought that defense doesn’t slump. i wonder if the conclusions would be the same. or even the correlation between offensive ‘slumps’ and defensive ones. do players take their strikeouts into the field with them? it’s also said that speed doesn’t slump, so concluding that there is no correlation to player type is interesting as well.

    i look forward to more work from you, and especially appreciate your involvement in the comments threads, discussing the feedback.

    Comment by mike — January 21, 2011 @ 12:20 pm

  8. I wonder if the Red Sox use a similar analysis and try to pick up players that score lower on the streakiness scale so they have consistent production all season long.

    Trot Nixon, Mark Kotsay, Rocco Baldelli, David Ortiz, Victor Martinez*, Marco Scutaro, Edgar Renteria, Jacoby Ellsbury... (I know some didn’t play for the Sox at the time)

    *VMart is very interesting: one of the streakiest 2 years in a row, then he makes an appearance at the bottom of the list the following year. Weird.
    Went from 2005 p=.998 to 2006 p=.973 to 2007 p=.019

    Comment by Jim Lahey — January 21, 2011 @ 12:27 pm

  9. Yes, very interesting. I have to say, when I clicked on this article, my first thought was: Victor Martinez is ridiculously streaky and has been every year except 2007, I wonder if he was ever one of the streakiest in the league? Sure enough, Victor was top 5 in two years, 2007 he was bottom 5. I’m curious where he ranked in ’09, which I’d have pegged as one of his streakiest years, when he put up a .455 wOBA in April and a .245 in July … so, is Victor not a streaky player? I think he’s so streaky, he might just go an entire year being one of the most consistent players in the game.

    Comment by isavage30 — January 21, 2011 @ 12:34 pm

  10. You seemed to miss a bit there.

    Comment by Daniel — January 21, 2011 @ 12:43 pm

  11. Again, just wonderful stuff, Seth. I disagree with your conclusions though.

    The scatterplot, essentially a test for real effects, an autocorrelation test in frequentist terms … it shows nothing.

    The histogram (an order test) shows an obvious tendency, in the general population, towards streakiness. The skew can’t be missed with the eye.

    Autocorrelation is a Big Apple Near The Bottom Of The Tree Detector, and the order test is much more sensitive.

    I would think of the order test as being a seismograph, and autocorrelation being a count of the amount of stuff that fell off of shelves during a tremor. Given a powerful enough earthquake … the results will be near enough the same. Given a small tremor, autocorrelation is near useless.

    I read your results as saying that streakiness very clearly exists in the population of MLB hitters, but pinning the quality on any player using math … is like trying to corner a weasel in a round room. Perhaps scouts and managers have a better feel for it, especially if they know how the guy has been hitting in BP and if he’s been hitting the ball hard when he gets a hittable pitch. I dunno.

    Slightly stronger indication for streakiness than Albert found, which is a credit to your adaptation of his model. Huge props, brother. Hands down the best sabermetric article I’ve read in ages.

    Comment by Vic Ferrari — January 21, 2011 @ 12:47 pm

  12. Those two scatterpoint graphs tell such a great story even without the other graphs. Awesome work…thanks Seth.

    Comment by Aaron J — January 21, 2011 @ 12:51 pm

  13. “Streakiness” is reminiscent of “clutchiness”. I think there’s a belief that, while not a repeatable skill for most, clutchiness may be an actual skill for a few. Do you think this is true for streakiness as well?

    Comment by Esoteric — January 21, 2011 @ 12:51 pm

  14. Went to the google doc…

    2005: .998
    2006: .973
    2007: .019
    2008: –
    2009: .912
    2010: .922

    So streaky he was … consistent for an entire season?

    Comment by Jim Lahey — January 21, 2011 @ 12:59 pm

  15. Seth,

    Really great stuff. The only thing I would contend with is your parting remark. You claim that all players are streaky, yet also concede that you can’t find any year to year correlation of streakiness on a per player basis.

    Without any of the raw numbers in front of me to back this assertion up, I would guess that all of this streakiness you’ve found (or at least 98% of it) is due to the external factors brought up yesterday, factors that are just noise when trying to capture a player’s “true streakiness” (parks, pitchers, injuries, etc.).

    I say this because if you were to believe these numbers and take them at face value – that players do actually have streaky seasons above and beyond random distribution around the mean – then you would see SOMETHING year to year. Essentially, if something is under a player’s control (to any degree), you will find some year to year correlation, positive or negative, somewhere. It can’t be nonexistent. Otherwise, as is the case here, as I believe, we’ve just measured the amount of streakiness imparted on a player’s production, due solely to the nature of the game of baseball.

    Really enjoyed it though. Looking forward to more!

    Comment by Lee — January 21, 2011 @ 1:01 pm

  16. there are some extra words in there, sorry, kind of rushed that post. work sucks!

    Comment by Lee — January 21, 2011 @ 1:03 pm

  17. I’d love to hear the answer to that question too.

    Comment by SteveM — January 21, 2011 @ 1:11 pm

  18. I think that’s oversimplifying it, especially with regard to cold streaks. The variables are close to endless. You can’t discount focus levels as the season wears on, habits formed and lost, motivation, etc.

    Then you take in the fact that each streak could be caused by completely different things each time: undisclosed injury this week, upset stomach the next, recovery the week after, fighting with your wife after that. When you look at everything that could possibly affect an athlete’s performance, it’s no wonder that everyone is streaky and you can’t predict when or if they’ll streak.

    Comment by Matt — January 21, 2011 @ 1:11 pm

  19. Fascinating.

    Comment by Albert Lyu — January 21, 2011 @ 1:13 pm

  20. I love this site and read it religiously. In all of the countless (get it?) articles I’ve read here, however, this is by far the best. Interesting and highly applicable to the real world, and addresses one of the most common idioms in baseball. Add to it, the focus on the best player on my favorite team, and this is one brilliant article.
    What are the odds your next will suck? No correlation I’m guessing!

    Comment by SteveM — January 21, 2011 @ 1:14 pm

  21. “would guess that all of this streakiness you’ve found (or at least 98% of it) is due to the external factors brought up yesterday”


    The other side of that coin is that managers may be riding the hot hand, rightly or wrongly.

    If a young LH batter, a fastball hitter mostly, is just tearing it up … what does the manager do when the next game’s starting pitcher is a cagey old LH junkballer? My guess is that he leaves him in, even if he has a useful RH vet on the bench, a guy he’d normally sub in for him in that instance.

    If managers understood streakiness perfectly (and I doubt they do, though it’s always dangerous to assume others are fools) then they would be able to hide it perfectly from Seth’s math, as rigorous as it is.

    If the same methodology were applied to only RH batters vs RH pitchers … would the skew of the order test increase?

    I don’t know, but I would bet real money on the over.

    i.e. The exaggerating factors are likely less than the mitigating factors, or so I suspect.

    Comment by Vic Ferrari — January 21, 2011 @ 1:16 pm

  22. What you’re asking, I think, is this: when looking at that random-seeming scatterplot of “Batter Streakiness Across Seasons,” are several closely clustered dots different season pairs for the same guy? Or, put another way, are there individual players for whom r is high, despite it being low across the population as a whole?

    Comment by joser — January 21, 2011 @ 1:18 pm

  23. And to reiterate, or make my point clearer – the line I disagree with is:

    Rather, the truth seems to be that all players are streaky players. Being human, they have their ups and downs, and they are inherently streakier than random chance would dictate.

    Once you account for the X factors, I don’t think there’s actually any “streakiness” at all, as I assume we all are defining it (the ability to actually under or outperform your talent for extended periods of time.)

    Comment by Lee — January 21, 2011 @ 1:19 pm

  24. Great work here, and fun to read. Is it clear from statistical theory what the distribution of p-values should look like? The implicit assumption here seems to be that they should be uniformly distributed under the null that streakiness is randomly assigned to players. I’m not convinced. It might be better to adjust the streakiness scores from the simulation for each player to mean 0 and variance 1 (instead of obtaining the p-value), and then plot the distribution of these normalized values, which, I think, under the null should be distributed standard normal. This test might be isomorphic to the route you take, but that is not immediately clear to me.


    Comment by notdissertating — January 21, 2011 @ 1:21 pm

  25. Hey Vic, long time hockey reader… love your stuff.

    I think the sample is just too big for part time players to have an impact on what’s going on here. And if I understand your example correctly, you are saying that if a manager leaves in a “hot” part time player, even if he is clearly hurt by the platoon split during the next game, this would make him less streaky. I think it would make his propensity for streakiness greater, because in reality, just because he is hot doesn’t mean he’s going to be any more likely to hit the crafty old southpaw. So, he has a bunch of hot games in a row, then a game with an expected wOBA of crap. This is what Seth is measuring as streaky.

    I think there’s just way too much to wade through here to say with any certainty that players are streaky outside of the factors they can’t control.

    Comment by Lee — January 21, 2011 @ 1:26 pm

  26. Lee: Was that a response to my post?

    What if Seth modelled an imaginary league where there was significant streakiness with several players, but little with most others?

    How would the scatterplot look? What would autocorrelation tell us? How would the order test look?

    Good questions I think.

    Comment by Vic Ferrari — January 21, 2011 @ 1:33 pm

  27. Maybe I missed it in the first article (should go back and re-read it, I guess) but I still find myself wondering how much streakiness you get purely from random chance. I mean, if your typical high-in-the-order hitter gets at least 4 plate appearances per game and has a batting average better than .250, he should get a hit in every game — if he had no “streaks”. In other words, if (say) Ichiro had no streakiness whatsoever, not even the kind that comes from random noise, he’d get to his annual 200 hits by getting one in every game, plus an extra one about every fourth game. And no one would talk about DiMaggio’s 56 game hitting streak being perhaps the hardest record in sports to break, because many players would be shattering it every year.

    So while players are not dice, there’s some amount of streakiness we should expect from them even if they were, and that’s really the baseline we should expect them to exceed (or not) when we’re talking about them being streaky or consistent. After all, the argument against the “hot hand” is just that: even random events can come in streaks, so you shouldn’t be making decisions based on it. So at what point can we say a player really (probably) is a “hot” or “cold” hand, at least temporarily? Or, more precisely, how likely is any given streak, and so how much can be attributed just to random chance?

    Comment by joser — January 21, 2011 @ 1:41 pm
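
A back-of-the-envelope sketch of the baseline joser describes, assuming (purely for illustration, this is not from the article) that every at-bat is an independent trial with a fixed hit probability. The function names and the .330 example hitter are hypothetical.

```python
def p_hit_in_game(avg: float, at_bats: int) -> float:
    """Chance of at least one hit in a game if each at-bat is an independent trial."""
    return 1.0 - (1.0 - avg) ** at_bats


def p_streak_from_given_game(avg: float, at_bats: int, games: int) -> float:
    """Chance that a hitting streak starting on one particular day lasts `games` games."""
    return p_hit_in_game(avg, at_bats) ** games


# A .250 hitter with 4 at-bats per game: how often does pure chance leave him hitless?
p_game = p_hit_in_game(0.250, 4)
print(f"P(at least one hit) = {p_game:.3f}")
print(f"Expected hitless games in 162 = {162 * (1 - p_game):.1f}")

# Hypothetical .330 hitter: chance that a streak starting today reaches 56 games.
print(f"P(56-game streak from a given start) = {p_streak_from_given_game(0.330, 4, 56):.2e}")
```

Under these assumptions a .250 hitter with four at-bats gets at least one hit in about 68% of games, so roughly 51 hitless games in a 162-game season come from chance alone. That is the kind of random baseline any streakiness measure has to be compared against.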

  28. Sorry, it was a response to mine, before I read yours.

    I think.. the question you raise is very interesting, and I think it actually lies outside the scope of what Seth said, and what I objected to.

    He claimed that everyone across the league exhibits some amount of streakiness, and it’s shown when looking at the league wide plots. I think that this baseline of streakiness actually represents the true mathematical random distribution of events, given the external factors, and that the league is perfectly unstreaky, or averagely streaky. I hope this makes sense.

    However… you bring up an awesome point. From what we are looking at here, and I’m not sure by Seth’s work if this is true or not… would you even see those very streaky players (as you have created in your model league)? Would they poke their heads out from under all of the other average streakiness players?

    I’m not sure if what Seth did would catch those players or not. I’m sure he could tell us.

    I think you probably wouldn’t, but I wouldn’t be surprised if they really existed. However, I think the main point of this article was to say that the league exhibited streakiness as a whole, and that’s where I really disagreed with him. The league is randomly distributing the events, but it’s these outside factors that lead Seth to conclude they are exhibiting streakiness.

    Comment by Lee — January 21, 2011 @ 1:42 pm

  29. (And yeah, I know, wOBA is a better measure of offense and we don’t care about batting average; I’m just trying to keep my question simple and brief, and getting hits is going to be one of if not the first thing people think of when you start talking about offensive streakiness)

    Comment by joser — January 21, 2011 @ 1:47 pm

  30. Lee:

    Thanks on the hockey props. By the bye, I have executed precisely the same math as Seth for NHL EV shooting percentage, both of us stealing from Albert and converting to hypergeometric (is it a surprise I like Seth’s thinking :D ), this starting a couple of years ago.

    The files at are penner.php and wolski.php I think. I’ve never published links to them, they are php files and run very slowly. Hopefully there is a url advice comment there. If not, let me know.

    In any case, hockey players would appear to be extraordinarily streaky. This is because while managers’ actions in MLB mitigate, coaching actions in NHL exaggerate (i.e. you play hot shooters with good linemates more).

    The truth lies somewhere between the two. Either that or human nature has driven freakishly consistent personalities to baseball and erratic loons to hockey. That seems unlikely to me.

    The fact that even Seth’s order test shows only a moderate right skew is a testimony to the quality of field management in major league baseball. Or so I think.

    Somewhat ironically, this is the type of information generally used to display their foolishness.

    Hockey is the polar opposite of baseball though: coaching decisions to play the hot hand exaggerate the effect, while baseball managers’ decisions mitigate it.

    Comment by Vic Ferrari — January 21, 2011 @ 1:48 pm

  31. Lahey, are you drunk again? The article just said that streakiness is random; why would the BoSox assign value to something that doesn’t exist? If they did any analysis on streakiness, they would have done something similar to this and known that streakiness isn’t a skill.

    Vmart isn’t interesting except that he’s a good example of how streakiness isn’t a skill.

    Lay off the liquor, Jim.

    Comment by Ricky — January 21, 2011 @ 1:48 pm

  32. Wouldn’t the histogram just be reflecting that the mean streakiness is weighted a little bit towards being streaky? That is, the league as a whole is generally a little more streaky than a random distribution would imply. But I don’t think it tells us anything about any one player being more or less streaky across seasons, which would indicate it a skill.

    Comment by Travis — January 21, 2011 @ 1:50 pm

  33. Ron Gardenhire’s use of Jason Kubel and Brendan Harris are great examples of what Vic is talking about in his 1:16 PM post.

    I know there were comments about park factors in the part 1 comments section. But what about just home/road? I thought of this when reading how streaky 2009-2010 David Wright was compared to David Wright sans Citi Field. I realize a sample of 1 was the impetus for my comment and not significant observations, but I’m still interested in how it would turn out.

    Comment by glassSheets — January 21, 2011 @ 1:50 pm

  34. What events (we are using at bats) are you using to measure hockey players? Shifts?

    Comment by Lee — January 21, 2011 @ 1:53 pm

  35. Fantastic stuff. You might find it interesting to look at “hot hand” research from basketball. Yet another study has come out on it, one in a long line of studies. They point to a similar conclusion as your study. Human streakiness is random but no amount of explanation will get people to see it that way.

    From True Hoop on Thursday:

    “The book “Scorecasting,” by Tobias J. Moskowitz and L. Jon Wertheim is the latest to make a killer case that the hot hand really does not exist (or is far more scarce than most basketball players would admit). Their case may not matter, as they also include this line: “Amos Tversky, the famous psychologist and pioneering scholar who initiated the original research on momentum and the myth of the hot hand, once put it this way: ‘I’ve been in a thousand arguments over this topic. I’ve won them all, and I’ve convinced no one.’”

    Comment by Owen — January 21, 2011 @ 1:55 pm

  36. I was using shots directed at net, the script actually crudely graphs the Wright-style rolling average plot for each player. The Black Stat. If memory serves the url appendage is `shottype=1` for shots only. `shottype=2` for missed shots included, and `shottype=3` for all shots directed at net.

    I think the other url appendages, by way of example, are `team=PIT` and `player=87`… which would of course be Crosby. Let me know if that doesn’t work, I’m just going by memory.

    Comment by Vic Ferrari — January 21, 2011 @ 2:08 pm

  37. The nature of hockey, as you mentioned, is inherently more streaky. And while the management has something to do with it, I’d chalk most of it up to line chemistry, and in the case that you are using shifts as your time increment, you introduce all of the factors that come along with playing a specific team compared to another team (opposing shut down lines, defense quality, goaltending quality, home/road, etc.).

    This is the problem when you use the pure mathematical distribution of events. There are things that are natural to the game of baseball and hockey that look streaky when compared to the pure distribution, but this isn’t what we want to measure when we say “that guy is streaky.”

    Now, baseball is fascinating for this type (and any other statistical work) because events can really be pinned down on a single player’s shoulders (way more than any other sport, but of course there are endless factors involved as opposed to a player hitting a pitching machine in a vacuum.)

    This is why I think baseball appears to be way less streaky than hockey. There’s just less noise.

    Comment by Lee — January 21, 2011 @ 2:08 pm

  38. I read the previous post and I still wonder, why does it take on numbers from 0 to 1? Why might we expect it to be uniformly distributed?

    Comment by Barkey Walker — January 21, 2011 @ 2:16 pm

  39. This is really cool.

    I wonder if any of the minor streakiness effect is due to homestands and road trips. On a road trip or home stand of > 1 week you’d have a few days where the rolling 7 day window included a mix of home and road games, but also a few days where it were exclusively one or the other and you’d expect them to hit slightly better at home and slightly worse on the road.

    Comment by don — January 21, 2011 @ 2:21 pm

  40. At work so maybe I didn’t get everything, but I wonder if you are dismissing your analysis too soon. You seem to say that there is no correlation between a player’s streakiness from one year to another, but is that true for every player? What I want to know is, do the same players show up near that y=x line consistently on the scatter plot? Or, what players have a low variance of streakiness?

    Comment by Dan — January 21, 2011 @ 2:22 pm

  41. Matt,

    I agree with your thoughts. To me the key fact is that everyone seems to be streaky to some extent, so there’s no way to discern who’s streakier.

    I will say this though: when I checked the overall distributions of starting pitchers and relief pitchers (I did this after submitting the article), starters had a strong shift to the streaky side (much stronger than for hitters), and relievers had basically none. This may be evidence that strength-of-schedule (much more than park factor) is what’s driving this. One would expect a starter’s game-to-game shifts to be significantly impacted by the opposing lineup, whereas a reliever’s usage may be more consistent on a game to game basis (e.g. a lefty specialist always facing tough lefties) or more random (a closer coming in to pitch the ninth, regardless of who’s due up).

    Comment by Seth Samuels — January 21, 2011 @ 2:27 pm

  42. I disagree completely, Lee. Humans be humans.

    This MLB writer is very good though, I hope he carries on.

    Comment by Vic Ferrari — January 21, 2011 @ 2:29 pm

  43. Mike,

    I agree that UZR would probably be pretty noisy. You’re dealing with extremely small samples and a lot of uncertainty. The other issue is that I don’t actually know where to get daily UZR data.

    My suspicion though, would be that there’s not much of a difference. While we can identify isolated incidents, I doubt we’d find a big overall trend. My guess is the same with “taking it out to the field,” though it’s an interesting idea.

    When FieldFX hits, that should make those things a lot easier to look at.

    Comment by Seth Samuels — January 21, 2011 @ 2:32 pm

  44. Vic,

    I’m not familiar with the order test. Mind if I e-mail you to find out more?

    I appreciate the kind words though. And I love the earthquake analogy (especially as a northern Cali resident).

    Comment by Seth Samuels — January 21, 2011 @ 2:35 pm

  45. Travis,

    I don’t want to speak for Vic, but I think he (and I) would agree with that interpretation.

    Comment by Seth Samuels — January 21, 2011 @ 2:36 pm

  46. Player             Average Streakiness   Variance   Number of Years
      Matt Holliday      0.717                 0.003      6
      Aaron Hill         0.575                 0.113      4
      Jose Lopez         0.352                 0.011      5
      Kenny Lofton       0.764                 0.015      5
      Randy Winn         0.750                 0.019      8
      Adrian Gonzalez    0.806                 0.020      5
      Eric Chavez        0.353                 0.022      6
      Shannon Stewart    0.235                 0.024      5
      Juan Encarnacion   0.298                 0.025      5
      Scott Hatteberg    0.637                 0.026      5
      Rafael Furcal      0.658                 0.030      7
      Miguel Cabrera     0.516                 0.036      7
      Vernon Wells       0.605                 0.038      8
      Jose Vidro         0.271                 0.039      5
      Angel Berroa       0.753                 0.027      4
      Ron Belliard       0.448                 0.041      5
      Placido Polanco    0.452                 0.042      9
      B.J. Upton         0.507                 0.118      4
      Carl Crawford      0.647                 0.045      7
      Dan Uggla          0.631                 0.047      5

    Comment by Dan — January 21, 2011 @ 2:42 pm

  47. Sorry it is hard to read but above I provided the top 20 players in consistency of streakiness with 4 or more years in the analysis by Seth. The interesting players would be those with high or low average streakiness. We can say that these players are consistently streaky/unstreaky.

    For example, the number 1 there, Matt Holliday, was extremely consistent at being streaky, and Jose Lopez was consistent at being, well, consistent.

    Comment by Dan — January 21, 2011 @ 2:45 pm

  48. Esoteric,

    I do think that’s at least theoretically possible. I had been thinking of it more like BABIP, where you have knuckleballers who are just qualitatively different, but clutchiness may be a good example too.

    I do find a few guys who appear to be consistently in the same general range (e.g. Matt Holliday is in the .640-.800 range for the last six years, and Randy Winn has barely been below .700 since 2002).

    I don’t personally know a good way to test this probabilistically, but if anyone can suggest a metric, I’ll try to find the time to run it.

    Comment by Seth Samuels — January 21, 2011 @ 2:51 pm

  49. Well, your opinion is certainly respected by this commenter… but what do you say to the fact there is zero correlation year to year for these streaky players? That’s a hard pill to swallow.

    Comment by Lee — January 21, 2011 @ 2:57 pm

  50. Travis,

    The histogram (order test) tells us that there is streakiness in the population. In other words, it tells us that some players are more streaky than others, in no uncertain terms.

    Autocorrelation won’t pick that up. Gelman (who owns a much bigger brain than mine) has some good commentary on this subject lately. His blog is a good read, he’s an engaging writer, and I would guess he’s one of those rare academics that is also a good teacher.

    The acid test is a model, at least to my mind. Build a population of hitters, some consistent, some streaky … see how it shakes out. Does autocorrelation (i.e. real effects, Seth’s scatterplot) help you find it? Does the order test (Seth’s histogram) shed light?

    Since you have specifically assigned streakiness to players, and you know who they are … can you identify them with math?

    These are good questions, I think.

    Comment by Vic Ferrari — January 21, 2011 @ 3:00 pm

  51. Lee,

    I can’t disagree with you. To be honest, I hadn’t considered park/opponent effects until you brought it up yesterday (a major blindspot, I realize, but it happens). I added in the sentence mentioning that at the last minute, but the rest had already been written and submitted.

    In truth, I think your 98% statement is probably a bit of an exaggeration. It’s hard to tell really. I will say (as I noted in a comment above), the effect is strongest for starting pitchers, strong for batters, and nonexistent for relievers, which may give some credence to your claim.

    I left that particular line in because, honestly, it strikes me as probable and the data don’t necessarily refute it. If I had to guess, I would think that, in addition to park and competition effects, some players may be affected by, e.g., a bad breakup or a child’s illness or something (negative externalities strike me as more likely to have an influence than positive ones), while others might be less affected. These external events would be, for our purposes, completely random, so there’s a decent chance that it would still show up this way.

    Vic’s idea about simulating a season is an interesting one. If anyone has the time to code something like that, it would be great to see. I can try to adapt my code to do it, but it might take a while (I have lots of homework these days).

    Comment by Seth Samuels — January 21, 2011 @ 3:10 pm

  52. Notdissertating,

    I’ve run it that way too, it just doesn’t look as interesting. I think it’s not quite isomorphic (though it’s been a while since I took linear alg), but it doesn’t much affect the results. CI for the mean is (.08,.19), for the median it’s (.07,.22). For the uniform distribution it’s (.52, .55) for the mean and (.53,.59) for the median.

    Also, as a first-year grad student, I love the username.

    Comment by Seth Samuels — January 21, 2011 @ 3:27 pm

  53. Joser,

    I think yesterday’s column addresses your concerns. It lays out the method I use. The point is that this is after adjusting for that random variation (though it misses, as Lee has noted, the effects of park and opposing pitchers).

    Comment by Seth Samuels — January 21, 2011 @ 3:29 pm

  54. Q: Why does it take on numbers from 0 to 1?
    A: Because the metric is probability — a 0.9 score means that the players’ season was streakier than 90 percent of possible seasons, while a 0.1 score means the season was less streaky than 90 percent of possible seasons.

    Q: Why might we expect it to be uniformly distributed?
    A: Good question. It makes some intuitive sense, I guess. See my post above in which I ask the same thing.

    Comment by notdissertating — January 21, 2011 @ 3:30 pm
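
One way to picture the "streakier than 90 percent of possible seasons" reading is a shuffle test. The sketch below is a stand-in, not Seth's actual method (which, per the thread, adapts Albert's model using the hypergeometric distribution): it assumes streakiness is measured as the variance of 7-game rolling averages and compares the real game order against random reorderings of the same games. Both example game logs are made up.

```python
import random
import statistics


def rolling_means(values, window):
    """Averages over every run of `window` consecutive games."""
    return [
        statistics.fmean(values[i:i + window])
        for i in range(len(values) - window + 1)
    ]


def streakiness(values, window=7):
    """Spread of the rolling averages; larger when good and bad games bunch together."""
    return statistics.pvariance(rolling_means(values, window))


def streakiness_percentile(game_values, window=7, n_shuffles=1000, seed=0):
    """Share of random reorderings of the same games that are less streaky than reality."""
    rng = random.Random(seed)
    observed = streakiness(game_values, window)
    shuffled = list(game_values)
    less_streaky = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # keeps the season totals, scrambles only the order
        if streakiness(shuffled, window) < observed:
            less_streaky += 1
    return less_streaky / n_shuffles


# Made-up per-game values: all the good games first vs. strictly alternating.
bunched = [1] * 40 + [0] * 40
alternating = [1, 0] * 40
print("bunched season:    ", streakiness_percentile(bunched))
print("alternating season:", streakiness_percentile(alternating))
```

Because shuffling preserves the season totals, the score is conditional on what the player actually produced and only asks whether the ordering was unusual.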

  55. Owen,

    There is one factor in basketball that would be tougher to account for, which is that players who think they’re on a hot streak might be more likely to take bad shots. I’ve never seen a study that tries to adjust for this–I have heard of studies showing that it happens though. Maybe Scorecasting does (I plan to read it, for whatever that’s worth). In truth, I’m not sure you *could* properly account for this, since you’re dealing with, say, a probability shift from like 70% on a good shot to 20% on a bad shot, so you’re gonna get a lot of noise when you start valuing tough makes more than easy makes.

    I suppose in baseball that could equate to swinging at bad pitches, but I suspect you’re less likely to swing at something a foot off the plate than to try to launch a fallaway three from the corner (see Bryant, Kobe). Also, in baseball, you hear about guys saying the ball looks like a grapefruit or what have you. If they’re seeing the ball better (or “better,” if you prefer), that should mean taking bad pitches more effectively, as well.

    Anyway, I’m not at all prepared to say that streakiness is purely random. But there’s a distinction that’s not often made between something that is “random” and something that is “indistinguishable from randomness.” I’m inclined to believe that this falls in with the latter. Very little about our lives is individually random, but the overall picture has a whole lot of noise to it.

    I’ve used your post as a jumping off point to something else, I guess, but anyway those are my thoughts.

    Comment by Seth Samuels — January 21, 2011 @ 3:39 pm

  56. Yea, I think Vic’s idea of creating a model with streaky players would shed a lot of light on the situation. However, just thinking about the parameters of that model is an enormous task.

    You could easily take 10-20 players and model them so that their true wOBA fluctuates every 15 games or so. .380 for 15 games, 300 for 15 games. But this still leaves all of our existing noise. Our data for this project is so laced with things we don’t want to measure.

    After modeling our streaky players, we’d have data for 10 guys who were streaky in a vacuum, and real data on a league full of players who are (as I believe) totally averagely unstreaky, but play with many factors that make them look streaky.

    There just doesn’t seem like a good way to drill down to exactly what we are looking for. I wonder if tango has any thoughts on how to clear the noise. He posted part 1 yesterday on his blog. (and he said for the most part what I’ve been saying)

    Comment by Lee — January 21, 2011 @ 3:45 pm

  57. If we’re really talking about something random, then (I think) it should be uniformly distributed because each of these statistics is a percentile, relative to the player’s individual distribution (which is approximately normal). So each “true streakiness” score can be thought of as a p-value relative to the player’s individual distribution. If we’re just going by randomness, then just as we should see a t score less than -1.65 about 5% of the time, we should see a p-value less than .05 about 5% of the time.
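
    The uniformity argument can be checked with a small simulation. This is only a sketch: the player-specific null distributions below are hypothetical normals with made-up means and spreads, and the “real” season is drawn from the same null, which is exactly the pure-randomness case being described.

```python
import random

rng = random.Random(42)

def percentile_in_null(observed, null_draws):
    """Fraction of null draws at or below the observed value."""
    return sum(1 for d in null_draws if d <= observed) / len(null_draws)

def one_percentile():
    # Hypothetical player-specific null: mean and spread differ by player,
    # so raw statistics are not comparable across players but percentiles are.
    mu = rng.uniform(0.02, 0.06)
    sigma = rng.uniform(0.005, 0.015)
    null_draws = [rng.gauss(mu, sigma) for _ in range(300)]
    observed = rng.gauss(mu, sigma)  # the "real" season also comes from the null
    return percentile_in_null(observed, null_draws)

percentiles = [one_percentile() for _ in range(2000)]

# Count how many percentiles land in each tenth of [0, 1].
counts = [0] * 10
for p in percentiles:
    counts[min(int(p * 10), 9)] += 1

share_below_05 = sum(1 for p in percentiles if p < 0.05) / len(percentiles)
print(counts)
print(f"share below 0.05: {share_below_05:.3f}")
```

    If the null is right, each tenth should hold roughly 200 of the 2000 percentiles and the share below 0.05 should sit near 5%.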

    I realize I might be wrong about this assumption, even if the data bear it out, so I’m happy to accept corrections or comments.

    Comment by Seth Samuels — January 21, 2011 @ 3:45 pm

  58. Dan,

    Thanks for putting that together. I had actually been looking at that on my own computer when you posted. I guess the thing is, it’s hard to know whether those 20 players are that consistent because they really are, or because we’d just expect to see that happen sometimes in a big enough sample.

    I’ll see if I can tease that out.

    Comment by Seth Samuels — January 21, 2011 @ 3:48 pm

  59. Hmmm… I am guessing the sample would have to be really large to “expect” a player to have a variance of 0.003 over 6 seasons. Outlier? umm… maybe? If you do any testing on this subject, I think it would be even more interesting.

    Comment by Dan — January 21, 2011 @ 3:57 pm

  60. Is it that, conditional on their season total, this is the percentile of simulated streakiness that their actual season falls into?

    Comment by Barkey Walker — January 21, 2011 @ 4:02 pm

  61. That is right. If the simulation generates the true distribution, then the distribution will be uniform on [0,1].

    The problem is that you really need to remove opponent effects from this. To say it in a way that might appease the FG powers, if Longoria was playing third yesterday when I went to bat, then he is probably playing third today. A more traditional, pitcher-centric approach would say, if I was playing vs a Boston pitcher yesterday, I am probably playing vs one today. That story makes less sense because it could also lead to negative correlation, i.e., if I was playing vs the fifth pitcher in the rotation yesterday, I’m probably playing vs the first today. Also, if I faced a closer yesterday, I’m unlikely to face one today.

    Obviously, park effects matter too. Fly balls are not worth as much at Safeco Field as they are at Coors. This could also change how I approach my at-bats, i.e., a guy who hits 10-15 HRs per year is not going to swing for the fence in Safeco, but might in Coors.

    Comment by Barkey Walker — January 21, 2011 @ 4:12 pm

  62. How did you get a CI on a median?

    Comment by Barkey Walker — January 21, 2011 @ 4:17 pm

  63. The definition of p-values is that, under the null hypothesis, they are U(0,1). That is their one and only property. Yay Fisher for giving them to us!

    Comment by Barkey Walker — January 21, 2011 @ 4:18 pm

  64. I think this might be like the birthday problem (if you have 23 or so people in a room, chances are two of them share a birthday, far sooner than you would think).

    Your correlation is already a very good test.

    Comment by Barkey Walker — January 21, 2011 @ 4:21 pm

  65. F*ckin A, Ricky

    Comment by Bubbles — January 21, 2011 @ 4:22 pm

  66. I just ran it. Basically, Holliday looks like an outlier (we’d expect something that extreme about 0.0002 of the time). But there’s nothing in the distribution of variances to indicate non-randomness.

    I just did a quick-and-dirty nonparametric test, but basically I took the variance of 5 randomly selected numbers in the uniform distribution, 6 randomly selected numbers and so on, calculating the distribution of expected individual player variances. I then compared a player for whom we had five seasons to the distribution of five-number samples, and so on, so I got a p-value for each player’s variance. The distribution is not significantly different from a uniform dist:

    Low    High   Count
    0      0.1    14
    0.1    0.2    7
    0.2    0.3    14
    0.3    0.4    15
    0.4    0.5    14
    0.5    0.6    16
    0.6    0.7    10
    0.7    0.8    14
    0.8    0.9    10
    0.9    1      15

    So, there’s nothing here that indicates to me that these guys are actually the rare consistently streaky players. I think if that existed, we’d see a slight peak at the bottom end, and then smoothness above that.
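
    The quick-and-dirty test described above might look something like the sketch below. The assumptions are mine, not confirmed by the comment: the sample (n-1) variance is used, the p-value is the share of simulated players with a variance at least as small, and uniformity of the binned p-values is checked with a plain chi-square statistic against a table critical value. The 0.003 variance over 6 seasons is the figure quoted earlier in the thread.

```python
import random

def sample_variance(xs):
    """Unbiased (n-1) sample variance."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def variance_p_value(observed_var, n_seasons, sims=100_000, seed=0):
    """Share of simulated players whose variance is at least as small.

    Each simulated player gets n_seasons independent U(0,1) scores.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(sims):
        draws = [rng.random() for _ in range(n_seasons)]
        if sample_variance(draws) <= observed_var:
            hits += 1
    return hits / sims

# Variance of 0.003 over 6 seasons, as quoted earlier in the thread.
p_low_variance = variance_p_value(0.003, 6)

# Bin counts copied from the table above (129 players, 10 equal-width bins).
counts = [14, 7, 14, 15, 14, 16, 10, 14, 10, 15]
expected = sum(counts) / len(counts)
chi_square = sum((c - expected) ** 2 / expected for c in counts)
CRITICAL_5PCT_DF9 = 16.92  # chi-square table value for 9 degrees of freedom

print(f"p-value for variance 0.003 over 6 seasons: {p_low_variance:.5f}")
print(f"chi-square = {chi_square:.2f} vs 5% critical value {CRITICAL_5PCT_DF9}")
```

    A chi-square statistic well under the critical value is consistent with the reading that the binned p-values are not distinguishable from uniform.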

    Comment by Seth Samuels — January 21, 2011 @ 4:26 pm

  67. Okay, let’s do this (and it has been a while since I took any prob/stats or did anything like this so I may be wrong):

    For each player I took the max real streakiness – min streakiness.
    This is the probability that a random event would occur between that max and min.
    Now, exponentiate that to the number of years and you have the probability that the player would randomly have his streakiness within his max – min.

    So for Matt Holliday it is (0.80-0.64)^6 = 0.00168%. I am using players with over 4 years so a sample size of 130. That would not be expected.
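
    As a worked check of the arithmetic, the sketch below computes the figure above and, for comparison, the standard result for the chance that n independent U(0,1) draws have a range (max minus min) no wider than w, which is n·w^(n-1) − (n-1)·w^n. The first figure fixes where the interval sits; the second lets it sit anywhere. Treating season scores as independent uniform draws is the assumption here, and the function names are hypothetical.

```python
def prob_all_in_fixed_interval(width, n):
    """Chance that n independent U(0,1) draws all land in one pre-chosen interval."""
    return width ** n

def prob_range_at_most(width, n):
    """Chance that max - min of n independent U(0,1) draws is at most `width`."""
    return n * width ** (n - 1) - (n - 1) * width ** n

width, seasons = 0.80 - 0.64, 6
fixed = prob_all_in_fixed_interval(width, seasons)
anywhere = prob_range_at_most(width, seasons)
print(f"fixed interval: {fixed:.6%}   any interval of that width: {anywhere:.6%}")
```

    The second number is the more relevant one if the question is how often a random player’s six scores would cluster that tightly, wherever on the scale the cluster falls.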

    Actually, to me if you look at it for individual players, and not the league as a whole, it seems there is some correlation for many players.

    Please tear apart my logic and teach me a lesson.

    Comment by Dan — January 21, 2011 @ 4:31 pm

  68. yes. exactly. this is my understanding.

    Comment by notdissertating — January 21, 2011 @ 4:31 pm

  69. Barkey, median CI’s are bootstrapped. Sorry, should’ve noted that.
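
    For readers unfamiliar with the technique, a percentile bootstrap for a median can be sketched as follows. The resample count, the 95% level, and the random sample are illustrative assumptions, not details of the analysis in the article.

```python
import random
import statistics

def bootstrap_median_ci(values, level=0.95, resamples=5000, seed=0):
    """Percentile-bootstrap confidence interval for the median."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement, take the median each time, then sort.
    medians = sorted(
        statistics.median(rng.choices(values, k=n)) for _ in range(resamples)
    )
    tail = (1 - level) / 2
    lo = medians[round(tail * resamples)]
    hi = medians[round((1 - tail) * resamples) - 1]
    return lo, hi

rng = random.Random(1)
sample = [rng.random() for _ in range(200)]  # hypothetical streakiness scores
lo, hi = bootstrap_median_ci(sample)
print(f"median = {statistics.median(sample):.3f}, 95% CI = ({lo:.3f}, {hi:.3f})")
```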

    Comment by Seth Samuels — January 21, 2011 @ 4:58 pm

  70. Yes.

    Comment by Seth Samuels — January 21, 2011 @ 4:59 pm

  71. Dan,

    Down the rabbit hole we go. So yes, the p-value on Holliday’s season is about 1/5000. The thing is, if we focused on Holliday only, we’d be biasing the results. So, we calculate that p-value for everyone (I did it for the 129 players with 5 or more seasons). A hypothetical distribution for a league where *some* players were streaky would probably have a larger number of players concentrated at the bottom (since some would be there reliably while others would be there randomly) and be pretty uniform above that. This is, as I noted a little earlier, not something we see.

    Sticking with Holliday’s results for a minute, the p-value on his variance is 0.02%. But we don’t have a sample of one, we have a sample of 129. So the probability of seeing an outcome as extreme as Holliday’s isn’t 0.02%. Rather it’s 1-(probability that a single player is not as extreme as Holliday)^(number of players in sample), which translates to 1-(1-.0002)^129 = 1-.975 = .025. So the probability of seeing at least one result like Holliday’s in this sample is about 2.5%. That’s not a lot, to be sure, but it happens. We’d expect a result as extreme as Randy Winn’s (the next lowest p-value at .75%) about 62.1% of the time. This number very rapidly approaches 100% in a sample of this size.
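
    The calculation in this paragraph, the chance that at least one of n players posts a p-value at or below p, is 1 − (1 − p)^n if the players are treated as independent. A short sketch using the two p-values and the sample size quoted above (the function name is hypothetical):

```python
def prob_at_least_one(p, n):
    """Chance that at least one of n independent players is as extreme as p."""
    return 1 - (1 - p) ** n

holliday = prob_at_least_one(0.0002, 129)
winn = prob_at_least_one(0.0075, 129)
print(f"Holliday-level result somewhere in the sample: {holliday:.3f}")
print(f"Winn-level result somewhere in the sample:     {winn:.3f}")
```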

    So basically, we’re left with a question of whether we had the unfair coin come up tails, despite a 2.5% chance that it would happen, or whether Matt Holliday is the only player in the last ten years who is either consistently streaky or consistently unstreaky. My guess is the former.

    Hope that helps.

    Comment by Seth Samuels — January 21, 2011 @ 5:13 pm

  72. I would say about 5% of that population have probabilities less than 50% and are interesting to look at. Time to go home, very interesting and was fun to play around. I’ll def read your stuff…

    Comment by Dan — January 21, 2011 @ 5:35 pm

  73. great explanation of a really tough concept. i personally like the anecdote of the blade of grass on a golf course who is amazed and befuddled that of all the thousands upon thousands of blades of grass this little white ball landed right on top of her. what is the probability of that?! of course the golf ball had to land somewhere, so taken in context, it is only surprising from the perspective of any given blade of grass. matt holliday’s consistent streakiness is only surprising if you happen to be looking at it from his perspective – random chance suggests *someone* would exhibit crazy amounts of streakiness.

    for the interested reader, this episode of NPR’s Radiolab has the golf ball anecdote among other enlightening discussions of stochasticity:

    Comment by notdissertating — January 21, 2011 @ 5:47 pm

  74. As a sixth-year graduate student the username is perhaps more apt than it ought to be!

    Comment by notdissertating — January 21, 2011 @ 5:52 pm

  75. Seth,

    This has been a very interesting look at the topic of (probably) random fluctuations in player performance.

    When I first read the articles I had the same question as Sunny (from yesterday’s post) regarding why a player should be regarded as “streaky” if they start the year or end the year at a considerably different wOBA than the overall average wOBA. That is to say, if a player has a down month (or a good one) are they “streaky”?

    Let’s look at an absurd theoretical example of a player who plays through the All Star Break at a consistent 0.400 wOBA, but then plays the second half at a consistent, albeit lower 0.300 wOBA. We’ll assume that the same number of ABs occurred in both sections so that the weighting is equal. By your definition, this would seem to give a raw streakiness of 0.050 (as long as I’m interpreting it correctly), as the season wOBA was 0.350 and he spent the entire season either 0.050 above or 0.050 below the season figure.

    However, I would argue that a similar player with a 0.350 season wOBA but who went through two up cycles and two down cycles would be streakier. That is, first 40 games at 0.400 wOBA, next 40 games at 0.300 wOBA, next 41 games at 0.400 wOBA and last 41 games at 0.300. The raw streakinesses of the two scenarios are both 0.050 (so the “true streakiness” will be the same figure as well).

    For that matter, are the players above any more or less streaky than another 0.350 season wOBA player that spent 90% of the season at a respectable 0.378 and the other 10% sucking wind at 0.100?
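
    The three hypothetical seasons can be put side by side in a short sketch. It assumes raw streakiness behaves like a mean absolute deviation from the season average (the 1-norm reading used in this comment) and works on per-game talent levels rather than moving averages. The `crossings` count is a hypothetical extra, added only to show one thing the 1-norm does not see.

```python
def mean_abs_deviation(levels):
    """Average absolute gap between each game's level and the season average."""
    season = sum(levels) / len(levels)
    return sum(abs(x - season) for x in levels) / len(levels)

def crossings(levels):
    """Number of times the series switches sides of the season average."""
    season = sum(levels) / len(levels)
    above = [x > season for x in levels]
    return sum(1 for a, b in zip(above, above[1:]) if a != b)

two_halves = [0.400] * 81 + [0.300] * 81
four_cycles = [0.400] * 40 + [0.300] * 40 + [0.400] * 41 + [0.300] * 41
one_slump = [0.378] * 144 + [0.100] * 16  # 90% / 10% of a 160-game season

for name, season in [("two halves", two_halves),
                     ("four cycles", four_cycles),
                     ("one slump", one_slump)]:
    print(f"{name:12s} MAD = {mean_abs_deviation(season):.3f}  "
          f"crossings = {crossings(season)}")
```

    All three come out at roughly 0.050 on the deviation measure while differing in how often they change direction, which is the distinction being drawn here.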

    Again, my understanding of the method and/or my fast-and-loose area calculations could be off a bit (not to mention it certainly would take a great deal of skill to perform in these exact fashions), but I think that perhaps I just disagree with the notion that the 1-norm is the most effective in determining streakiness. I am not particularly helpful since I don’t have any alternate suggestions, just thought I’d add those thoughts in there.

    Additionally, it would be interesting to see whether the path length metrics you briefly mentioned yesterday show the same year-over-year independence that the “true streakiness” stat exhibits.

    Comment by EWolf — January 21, 2011 @ 10:41 pm

  76. Seth-

    Interesting work. The first scatterplot clearly indicates that there’s nothing about players that correlates with streakiness. Yet you have that odd histogram.

    You’ve combined 11 years of data into a single analysis–my hunch is that this histogram is reflecting some sort of yearly variation.

    I downloaded your data and recreated the bins. I then did a pivot table in Excel doing a count by year of the number of players in each of your bins. Finally I figured the mean and standard deviation for each year.

    Year   Mean   Std. Dev.
    2001   7.75   2.022895267
    2002   7.6    2.72222819
    2003   8.2    3.707744385
    2004   8      3.094987459
    2005   7.45   2.584875035
    2006   8      2.492092758
    2007   8.05   2.928534751
    2008   7.2    3.001753873
    2009   7.7    1.780005914
    2010   7.3    2.154554539

    There’s some pretty clear variation by year–especially 2003. It has the highest mean and the highest std. deviation. Whatever is causing the skew in your histogram is something that varies by year.

    Comment by George Purcell — January 22, 2011 @ 12:17 am

  77. Unfortunately, the noise is such that it tends to overwhelm any analysis. I am sure, for example, that many “streakiness” issues for a number of players in a given year are due to them playing hurt or going through personal issues (the latter being JD Drew in 2007). This increases the amount of streakiness in the larger population, as do scheduling and park effects, not to mention seasonal effects, aging (older players seem to be more streaky, esp. on the cold side), etc., and this may mask some individuals’ true streakiness.

    The data is imperfect and makes it difficult to find evidence of what you are looking for. It’s like clutch hitting, catchers’ impact on pitchers’ performance, etc. Sometimes one needs to look beyond the numbers, since the numbers are only as good as the data, and the data is not always correct or complete enough.

    As they say, the absence of evidence is not proof something does not exist.

    Nice study though, and at the very least I think it suggests that such true streakiness is not that frequent or significant.

    Comment by pft — January 22, 2011 @ 3:47 am

  78. EWolf,

    That is, as it happens, the exact same thought I had myself. That’s why I tried using the length of a LOESS curve around the moving averages, which I mentioned in the comments at some point. But that yielded results that were no different. So, given that I knew I was going to end up with a null result, I felt like it was better to explain it in terms everyone could relate to, rather than trying to explain local regression curves. But yes, I do agree with your premise.

    Comment by Seth Samuels — January 22, 2011 @ 4:36 am

  79. George,

    Yes, it does vary a bit from year-to-year. It doesn’t do so in any way that appears to be meaningful to me, but I’ll admit that I haven’t delved into that as closely as I did other things.

    That said, looking at the numbers you posted, I’m not sure where you’re getting your means from. My guess would be that you’re using my raw streakiness stat rather than my true streakiness stat, but even still, your results are rather different from what I get. Perhaps I’m missing something in your calculations.

    My quick and dirty analysis finds that, using nonparametric tests, the streaky-skew is highly significant in all years except 2003 and 2005. Perhaps something’s going on in those two years. I think it’s ultimately a function of opponent pitching, more than anything else.

    Comment by Seth Samuels — January 22, 2011 @ 5:05 am

  80. Pft,

    I totally agree with your assessment. Somewhere earlier in the comments I noted the importance of distinguishing between something that is random and something that is mathematically indistinguishable from randomness. I don’t at all believe that this analysis shows the former. I do think that, for the time being, there’s no reason to think that any streakiness we see would be meaningfully different from randomness.

    I have to ask though, you said older players seem to be more streaky. Is this something you’re pulling from my data? Or are you just saying that anecdotally?

    Comment by Seth Samuels — January 22, 2011 @ 5:08 am

  81. What I did was rank order true streakiness then place the measures evenly in 20 bins. I then did a simple count of the number of events in each bin that occurred in a given year and took the mean of the bin counts for each year.

    I don’t have a way to post the pivot table, but take a look at the bin counts for 2003:

    7 10 9 4 7 14 9 7 3 3 7 6 5 11 9 17 14

    There has also been a decrease in the number of players qualifying–and the number of these players was highest in 2003 (164). By contrast 146 qualified in 2010 and 154 in 2009.
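
    The binning procedure described in this comment can be sketched as below. The scores are random placeholders; only the player counts for 2003, 2009 and 2010 come from the comment. Using the sample standard deviation is an assumption, since the comment does not say which version the spreadsheet used.

```python
import random
import statistics
from collections import defaultdict

def bin_counts_by_year(records, n_bins=20):
    """records: list of (year, score). Bins are equal-count over all years pooled."""
    ranked = sorted(records, key=lambda r: r[1])
    total = len(ranked)
    counts = defaultdict(lambda: [0] * n_bins)
    for rank, (year, _score) in enumerate(ranked):
        counts[year][rank * n_bins // total] += 1
    return counts

rng = random.Random(0)
# Player counts are the ones quoted above; the scores themselves are made up.
players_per_year = {2003: 164, 2009: 154, 2010: 146}
records = [(year, rng.random())
           for year, n in players_per_year.items() for _ in range(n)]

summary = {}
for year, counts in sorted(bin_counts_by_year(records).items()):
    summary[year] = (statistics.mean(counts), statistics.stdev(counts))
    print(year, round(summary[year][0], 2), round(summary[year][1], 2))
```

    Note that the mean of the 20 bin counts for a year is always that year’s player count divided by 20 (164/20 = 8.2, 154/20 = 7.7, 146/20 = 7.3), whatever the scores are. Differences in the yearly means therefore reflect only how many players qualified; the standard deviation is the part that says anything about the shape of the distribution.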

    Comment by George Purcell — January 22, 2011 @ 10:10 am

  82. Terrific and fascinating work, about which I may say more in a bit. But first this important caveat: as cool as it is, the streakiness metric doesn’t seem to capture all of our subjective sense of a player’s streakiness.

    Johnny Damon in 2003 measures as consistent (.211), but he had a .703 OPS through July 7 and .812 afterwards. He was coming off a messy divorce and only the second half was consistent with his established talent, and I correctly argued at the time that only the second half was predictive.

    Carlos Pena in 2003 measures as tremendously consistent (.094), but he had a .589 OPS in his first 143 PA and a .596 in his last 116, and a .955 in the intervening 256. A few years later I argued that that prolonged “hot streak” in the middle, as well as similar streaks in 2004 (which did measure as streaky at .869) and 2005 (not enough PA to measure) argued for tremendous upside as a hitter, and that was correct, too.

    Todd Walker in 2003 was so streaky (subjective sense of those watching him every day) that the team sent him to a sports psychiatrist, but he measures at .220.

    (It’s presumably a fluke that all three of these examples are from 2003!)

    So it appears as if this streakiness metric is not good at characterizing seasons marked by prolonged streaks; it seems to nail micro-streakiness but not be good at macro-streakiness, as it were. I’d be curious to see what happened if the moving average was taken over a much longer time period.

    Comment by Eric M. Van — January 22, 2011 @ 12:26 pm

  83. I don’t know why people are assuming that the schedule (park and opposing pitcher variations) adds to streakiness. It’s clear that it would make a perfectly consistent hitter appear to be streaky, but it seems just as clear that it would make a maximally inconsistent hitter appear to be *less* streaky. If you were absolutely locked in and you face Pedro in his prime in a big park with the wind blowing in, you’re probably going to go 0-4. If you were slumping terribly and faced a AAA callup with the wind blowing out a gale, your odds of going deep despite the slump are much higher. The schedule just adds *noise* and regresses the actual streakiness towards the streakiness inherent in the schedule.

    Comment by Eric M. Van — January 22, 2011 @ 12:36 pm

  84. Fair enough. If the results are the same, then this is clearly a more accessible approach. It certainly got me thinking about the topic a bit.

    Comment by EWolf — January 22, 2011 @ 3:01 pm

  85. Eric,

    Thanks for the comments. If it helps, one of the other ways I ran it was taking the distance between a player’s maximum moving average and his minimum, and doing it that way gives Damon a .733, Pena a .603, and Walker an .884. I don’t particularly like that measure though, because I don’t think of the difference between the absolute max and the absolute min as really being about streakiness. As far as Damon goes, I tried adjusting it to measure performance before and after July 7, just to see the probability of his having as extreme a first-half second-half difference as he did in reality (.043 by wOBA). The p-value comes up as .13, which is not that bad (and that’s one-sided, to boot). So Damon’s second-half improvement just wasn’t all that extreme. More generally, though Damon did have some second-half improvement, his ups and downs just weren’t very extreme aside from a really severe cold stretch at the end of the season.
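
    One way to get a one-sided p-value for a first-half versus second-half gap is a shuffle (permutation) test, sketched below. This is a swapped-in technique for illustration: the comment does not say how the .13 figure was produced, and the season here is hypothetical 0/1 plate-appearance outcomes, not Damon’s actual data.

```python
import random

def split_difference_p_value(pa_values, split_index, shuffles=2000, seed=0):
    """One-sided p-value for the second-half minus first-half gap.

    Counts how often a randomly reordered season shows a gap at least as
    large as the observed one.
    """
    rng = random.Random(seed)

    def gap(vals):
        first, second = vals[:split_index], vals[split_index:]
        return sum(second) / len(second) - sum(first) / len(first)

    observed = gap(pa_values)
    shuffled = list(pa_values)
    hits = 0
    for _ in range(shuffles):
        rng.shuffle(shuffled)
        if gap(shuffled) >= observed:
            hits += 1
    return observed, hits / shuffles

# Hypothetical season: 300 PA at a .330 success rate, then 300 PA at .370.
rng = random.Random(7)
season = ([1.0 if rng.random() < 0.33 else 0.0 for _ in range(300)]
          + [1.0 if rng.random() < 0.37 else 0.0 for _ in range(300)])

observed_gap, p_value = split_difference_p_value(season, 300)
print(f"observed gap = {observed_gap:+.3f}, one-sided p = {p_value:.3f}")
```

    Shuffling keeps the season total fixed, so the test asks only whether the ordering of outcomes is unusual, in the same spirit as conditioning on the season line.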

    With Pena, part of it is that the windows you’re using are pretty arbitrary (e.g. if you look at his last 145 PA, he had an .805 OPS), and most of what you’re seeing there could well be random fluctuation, given the small sample size. Seriously, if you can get your hands on it, take a look at Pena’s moving average (I’d post it, but I don’t have the ability to do that). It’s just not that streaky.

    As for Walker, it’s kind of interesting, but he basically has an incredibly consistent season except for a massive cold spell for about a month and a half. But really, outside of that, the line is nearly flat. So, we’re left with a question of whether we want to place extra emphasis on his super-cold streak, even though he was so unstreaky for the remainder of the year. To be honest, I’m comfortable with the way the data treats him.

    Nothing is going to be perfect with this kind of measure, I realize, but there’s nothing with these three players that makes me think it’s systematically getting anything wrong.

    More to the point, as I’ve mentioned in earlier comments, I tried this with several different streakiness measures and still got the null result. So the main benefit of the way I presented it was that it was easiest to understand.

    Comment by Seth Samuels — January 22, 2011 @ 3:20 pm

  86. George,

    Got it, thanks. At any rate, as I said, the skew is pretty reliable from year to year, so I think it’s probably still an underlying thing, rather than just year to year variation. Also, there’s no theoretical reason that this should vary much from year-to-year, since performance effects are stripped out. It would be one thing if this included the switch to the unbalanced schedule or something, but without that there’s not any reason I can think of for the population to vary from one year to the next other than random variation.

    Comment by Seth Samuels — January 22, 2011 @ 3:32 pm

  87. Eric,

    I should also add that I’m glad you brought up Damon’s relationship problems. I think that kind of thing would be a big part of what drives the streakiness we observe. Such off-field events occur at times that are, from a baseball standpoint, completely random, and may well be part of why there’s no identifiable individual streakiness–since we can’t control for off-field noise.

    Comment by Seth Samuels — January 22, 2011 @ 3:34 pm

  88. It seems things have settled down here, so I’d like to thank everyone again for all your thoughts, comments, criticisms, and kind words. Hopefully I can find some time to do this more often, so this won’t be a one-off thing. If you have more questions or comments, I get an e-mail when you post here, so by all means continue to do so.

    Thanks again.


    Comment by Seth Samuels — January 22, 2011 @ 9:11 pm

  89. Hello,

    First of all, thank you for the interesting study.

    I seem to be a bit late to the discussion, but if you’re still around … have you looked into any correlation between streakiness and DL-time? My first thought upon seeing the skew of your last histogram was that it could be due in part to playing through/after injuries; this could also help account for the weak negative correlation with PA.


    Comment by MattD — January 24, 2011 @ 9:19 pm

  90. Matt,

    I haven’t, but that’s a good idea. Unfortunately, I don’t have any data on DL time, nor do I know where to get it. I could definitely see that being an explanation for underlying streakiness. If you have any idea where I might get something like that, I can try to run it.

    Comment by Seth Samuels — January 25, 2011 @ 4:21 pm

  91. Hi Seth,

    I looked at a somewhat similar topic one day: the consistency of players. If you wanted to take a look here’s the link:

    Instead of taking differences between a player’s seven day length and his seasonal mean for a metric, I used a time series model approach. I think this would be interesting using wOBA and trying to predict a player’s next day wOBA based on his past seven days. What’s nice about time series would be the lag (of seven days in your case) determined parametrically. What do you think?
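
    As a very rough first look at the time-series idea, one could check whether a player’s daily wOBA is correlated with his own values at lags of 1 to 7 days. This is a simpler stand-in for a fitted time-series model, not the approach in the linked piece, and the daily series below is hypothetical independent noise around .340.

```python
import random

def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at the given positive lag."""
    n = len(series)
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    num = sum((series[t] - mean) * (series[t - lag] - mean)
              for t in range(lag, n))
    return num / denom

rng = random.Random(3)
# Hypothetical daily wOBA with no true streakiness: independent draws.
daily = [rng.gauss(0.340, 0.150) for _ in range(160)]

acf = {lag: autocorrelation(daily, lag) for lag in range(1, 8)}
for lag, value in acf.items():
    print(f"lag {lag}: {value:+.3f}")
```

    For a series with no carry-over from day to day, these values should hover near zero; persistent positive values at short lags would be the signature of hot and cold streaks.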

    Comment by Kevin — January 25, 2011 @ 5:14 pm

  92. This is the only source I know about:

    I saw it when it was put up a while back, not sure if it’s been updated…

    Comment by MattD — January 26, 2011 @ 2:10 am

  93. Too many more articles like this and we’ll understand baseball and won’t need to read Fangraphs anymore.

    Comment by Jon — January 26, 2011 @ 5:42 am

  94. Bubbles, answer that. I gotta rock a piss off, buddy

    Comment by Ray — January 26, 2011 @ 11:24 am

  95. Kevin,

    Interesting piece. Honestly, I don’t think it would really work with wOBA for a few reasons. The first is just, most obviously, you’d have sample size issues, since your wOBA on the left side would be based on 4 or 5 plate appearances. Also, one of the main concerns with parametric methods is the need to rely on a particular set of assumptions about the distribution. I’d frankly be shocked if the error in a model like that were normally distributed. Not to say parametric methods are bad–they’re very useful. But it’s a weaker argument when assumptions don’t apply.

    It’s also worth noting that I’ve tried some variation on this in the past (it was a while ago, so I don’t remember exactly, but I think I did a logit model on walks, based on like 20 previous PA’s) and didn’t find anything there. If you have the time and the interest, I would still encourage you to go ahead with it, or at least to keep tinkering. You never know what you might find.

    Comment by Seth Samuels — January 26, 2011 @ 10:10 pm

  96. Here are a couple notes:
    (1) We need to first establish an agreed-upon definition of streakiness, because player performance always varies in baseball. To me, streakiness is “statistically significant variations in player performance which occur in a random pattern.” Using this definition, at least two of your streakiest cases are not streaky at all, because the variation in their performances is not random. Therefore their data, and similar cases, should not be used in your year-to-year correlation.

    (2) “The one relationship that was statistically significant was a weak negative correlation between streakiness and plate appearances (r = -0.061, p = 0.016). It is tempting to think that this may suggest that better players (who play more) are less streaky, but this is unlikely.” – Actually this correlation is just a statistical representation of something we know to be true. As n increases we approach the true mean of N.

    This is a fancy way of saying that as the number of ABs increases, every player’s averages in all stat categories will regress towards that player’s mean for each category. Since there is a maximum of 162 games, more ABs in essence means more AB/game. So if a player gets more AB/game, then their performance, when viewed on a game-by-game or series-by-series basis, will appear more consistent. To look at it from a logical standpoint, if every player got 20 ABs each game, then there would be a greater chance that they would post extremely consistent box scores.

    When you look at it this way it makes incredible sense to see this correlation from a statistical standpoint. This correlation is not meaningless, but is totally anticipated and would be shocking if it were not statistically significant, or at least closely approaching statistical significance, every season.
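
    The sampling-noise part of this argument is easy to illustrate. The sketch below assumes a hypothetical hitter with a fixed .280 chance of a hit in every at-bat and compares the game-to-game spread of his per-game average at 4 AB per game and at 20 AB per game. It shows only the arithmetic of sample size, not whether that effect explains the correlation reported in the article.

```python
import random
import statistics

def game_to_game_sd(ab_per_game, true_avg=0.280, games=162, seed=0):
    """Standard deviation of per-game batting average over one simulated season."""
    rng = random.Random(seed)
    per_game = []
    for _ in range(games):
        hits = sum(1 for _ in range(ab_per_game) if rng.random() < true_avg)
        per_game.append(hits / ab_per_game)
    return statistics.pstdev(per_game)

sd_4 = game_to_game_sd(4)
sd_20 = game_to_game_sd(20)
print(f"4 AB/game:  sd = {sd_4:.3f}")
print(f"20 AB/game: sd = {sd_20:.3f}")
```

    With a constant true average, the spread shrinks roughly with the square root of the number of at-bats per game.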

    (DAVID WRIGHT 2009) “Neither of the two David Wright seasons we looked at earlier makes the list, but his concussion-marred 2009 season was the streakiest in the league that year.” – In this case your variable is not streakiness. You are determining a player’s streakiness by analyzing performance statistics. These statistics, and your streakiness coefficient, are being confounded by a third variable.

    Wright was injured. Injury affects performance. Your streakiness coefficient identified his performance variations as significant and therefore tagged his performance as streaky. Unfortunately, your coefficient is neglecting the fact that these variations were not random. They were affected by a third variable which impacts performance. If you determined Wright’s streakiness coefficients for the smaller periods in between each of the injuries, you would likely get a very different value.

    When you minimize the effect of the confounding variable (injury), Wright’s 2009 streakiness coefficients are each likely to approach those in other seasons. You don’t expect that performance statistics (especially counting statistics) would correlate well when comparing an injury-marred season to a healthy season, so why would you expect that a streakiness statistic (if it exists) would correlate well from an injury-marred season to a healthy one?

    (BRENNAN BOESCH 2010) Brennan Boesch’s 2010 season wasn’t streaky in the least bit, but you list him as the streakiest player of 2010. Boesch was the model of consistency. He was consistently brilliant in his first 30-40 games or so, and consistently horrible the rest of the season. This change was due to a change in the way pitchers approached him, there is nothing streaky about it.

    If you isolated his performance from before pitchers adjusted to him and separated that data from his performance after they made those adjustments, you’d have two very different sets of statistics, and both would have similar streakiness coefficients. Boesch is not a streaky player because pitchers adjusted and he failed to adapt to that; he is just a younger player whose talent was negated by pitchers who had figured him out.

    Here your measure is again being affected by a confounding variable, although I have trouble giving that variable a name. It is not that common that a player is shut down as badly as Boesch was, but there was very little streak to it. Consistent brilliance and consistent darkness. Kind of like the extraordinarily predictable, and not streaky at all, rising and setting of the sun. So again your coefficient is identifying statistically significant differences, but failing to recognize that these differences are not occurring in a seemingly random pattern.

    (4) So injury and other events in a player’s season are confounding your variables and really reducing the validity of this study. Without controlling for confounding variables, there is no way to use statistics to represent a player’s streakiness. I use statistics exclusively in making fantasy determinations because I can’t watch a lot of games and because my eyes too often lie to me, only seeing what I want to see. Numbers can also be used to lie, and while your intention was to uncover the truth, you need to combine statistics and common sense in order to do that.

    I am thoroughly surprised that you didn’t see these problems when Boesch and an injury-riddled season popped up on your most streaky list. The only way to make any determination on the existence of streakiness requires using a very unscientific method. You need to pinpoint where the so-called streaks which your correlation has identified are occurring in the season. Then you need to attempt to correlate them with isolated events which you deem to have an impact on the player’s performance in the statistical category being evaluated. If you can correlate the player’s streak to a confounding variable, you must control for that variable or exclude the data.

    There is no way to do this objectively, and as such there is no way IMO that you can statistically evaluate streakiness. If you have any ideas for objectively eliminating (or significantly reducing) the confounding variables without eliminating too many valid cases, then please let me know, as I would be happy to brainstorm with you.

    The way I see it, there is no truly valid point in time to say, this is the game pitchers figured Boesch out. Additionally, how do you correct for playing a series in 100-degree Arlington? This is likely to affect performance, but should the data from this series be excluded or controlled for? You could make a case for it, but then you’d have to start controlling for playing in Minnesota in April. I just don’t see any way that you could statistically analyze streakiness, but I also see no reason to use this article and the data presented to discredit the idea that some players are streaky. I know you say that this article is not meant to be absolute proof that streakiness doesn’t exist, and I agree with you. I may even go one step further and say that this article provides very little proof at all; however, the ideas are sound and it is an excellent premise for more statistical analysis.

    Comment by Darryl Strawberry Fields — June 8, 2011 @ 10:35 am
