FanGraphs Baseball



  1. Totally spitballing while in class right now (stats class actually), one factor might be reliever WPA? Bullpen usage is a substantive part of managing baseball, and some kind of reliever WPA/162 or something like that could speak to how effective a manager is at using his relief pitchers. It also obviously speaks to how good his relief pitchers are, but I don’t know how you can get around that.

    Comment by ICEYhawtSTUNNAZ — November 15, 2011 @ 11:09 am

  2. This is exactly the path I started wandering down. Some sort of WPA added per bullpen appearance or something.

    But, like you noted, that’s going to be heavily influenced (so I suspect, at least) by the quality of the bullpen.

    Comment by Bradley Woodrum — November 15, 2011 @ 11:13 am

  3. What does a manager generally have control over?
    Lineup composition, SB attempts, bunts, hit and runs… Defensive alignments/shifts, pitching changes..
    WPA can be attributed to some of these moves, but generally I think there will need to be qualitative analysis done in conjunction with the numbers as all managers will have different cards (players) to play.

    Comment by Nathan — November 15, 2011 @ 11:21 am

  4. First thing that comes to mind is some sort of “synergy” metric. Does a manager exhibit an ability to produce more wins for a team than the sum of its parts would indicate it could? A lot more thought is required, but to first order you can compare combined WAR or RC vs. actual wins. This would be similar to the idea of clutch for a hitter and an analysis of consistency from year to year would need to be done to demonstrate a real-world skill.

    Comment by Josh Goldman — November 15, 2011 @ 11:22 am

  5. How about (as a crude first approximation) team WPA from the 7th innings onwards, adjusted by overall team WPA (i.e. their W-L record)?

    Then you could try to overlay some sort of adjustment for bullpen & bench quality, though perhaps the reason why the bench is quality is because the manager’s starting the wrong players. It’s all very knotty.

    Comment by Aaron (UK) — November 15, 2011 @ 11:23 am

  6. Continuing down this path a bit, maybe one could figure if there’s any trend in Leverage Index when he chooses to make a bullpen move/pinch hits/etc. Try to find out if making a move is more likely to be a success or failure given a certain amount of leverage.

    Comment by ElJosharino — November 15, 2011 @ 11:24 am

  7. I have felt for a while that an individual manager can’t do much to actually make his team perform better than expected but he can certainly do things to make it perform worse. The manager should be on the same page as management and implement management’s vision. If there is any truth to the story of Art Howe flatly refusing to play Billy Beane’s Moneyball players in the first portion of the season (as represented in the recent film) then here is a perfect example of how a manager can make things much worse.

    Joe Maddon will likely be presented as a manager who makes his teams better. I’d suggest that Maddon is simply the best current example of a manager who is totally buying in to what his GM is attempting to accomplish and is implementing this strategy completely. The vast majority of the credit in Tampa Bay should go to management for evaluating talent extremely well and by simply behaving completely rationally by exploiting inefficiencies and refusing to overpay for talent (like the very large majority of its irrational peers). I suppose the manager of the team has to have some credibility among the players, that the manager has to be viewed to a certain extent as one of them as opposed to some guy who is a tool of ownership/management. But this is not a baseball skill, this is a management skill, a people skill.

    Comment by Robbie G. — November 15, 2011 @ 11:27 am

  8. I personally don’t think it’d be useful, but you could compare team fWAR with actual win-loss record.

    Comment by Yirmiyahu — November 15, 2011 @ 11:27 am

  9. Could you look at teams that consistently outperform their WAR? Managers don’t seem to outperform their Pythag W-L, but maybe the manager plays a part in getting to those runs scored-runs allowed measures in the first place.

    Second, if we’re talking about the “clubhouse chemistry” aspect of managing, we also have to talk about it for the players, which makes it even more difficult to capture.

    Third, I bet that however you measure it, you find that the variability in manager performance has decreased over time. There are too many practices set in stone (closer in the 9th) for managers to deviate much from the script. But maybe early in baseball, someone like Joe McCarthy was innovating in ways that are measurable.

    Comment by cwendt — November 15, 2011 @ 11:30 am

  10. Another wrinkle in this is leaving starters in too long. Joe Maddon — whom I anecdotally and personally believe is the best active manager — has a tendency to let some pitchers go maybe a batter too long.

    Is this something that just seems like a trait of his because his pitching staff is exceptional and they’re pitching deep into many games, or is it an actual flaw? Would other managers make the same mistake with a similar rotation? I don’t know.

    Comment by Bradley Woodrum — November 15, 2011 @ 11:30 am

  11. Yes, definitely. There is no way a manager’s full value will be statistically observable. In fact, we can say the same thing — to a lesser extent — about players.

    Players are paid to produce, though, so we have a better idea whether they are producing, but managers deal with more ethereal and — like you said — qualitative matters. Still, let’s try to observe what we can and see what we find.

    Comment by Bradley Woodrum — November 15, 2011 @ 11:32 am

  12. Ooh, that’s a great thought.

    (WAR win total) - (actual win total) = (managerial residuals)

    There is a danger, I imagine, in attributing all the residuals to just the manager. For instance, the third base coach can cost or add a significant number of runs based on whether they consistently send runners in good or bad situations.

    Still, I think this is also the right track.

    Comment by Bradley Woodrum — November 15, 2011 @ 11:37 am
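
A minimal sketch of the residual idea above, assuming a team's WAR-implied win total can be approximated as a replacement-level baseline plus team WAR. The function name, the sample numbers, and the `replacement_wins` default of 48 are all assumptions for illustration, not official figures. The sign follows the formula as written (WAR-implied wins minus actual wins), so a negative residual means the team won more games than its WAR suggested.

```python
def managerial_residual(team_war, actual_wins, replacement_wins=48.0):
    """WAR-implied wins minus actual wins, following the formula above.

    replacement_wins is an ASSUMED baseline for an all-replacement team
    over a full season; swap in whatever baseline you trust.
    """
    war_implied_wins = replacement_wins + team_war
    return war_implied_wins - actual_wins


# Hypothetical team: 40 WAR and 91 actual wins.
print(managerial_residual(40.0, 91))  # -3.0: three more wins than WAR implied
```

As noted in the comment, the residual lumps the manager together with the coaching staff and plain luck, so a single season's number says little on its own.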

  13. Along the same line as ICEYhawtSTUNNAZ, how about measuring all the stats of pinch hitters against the expected stats of the people they replace? Also, you could compare situational stats against league averages to look for areas where the team consistently outperforms the league. Obviously, this is going to be biased for a good team when compared to a bad team, but it’s a start. I realize this is vague, so an example: if team A hits .200 in full counts with RISP and the league average is .225, then it counts somewhat against the coaching staff. Obviously, you’d want to drill down to as specific a situation as possible while still maintaining a decent sample size for comparison (so taking this down to batter/pitcher matchups, in the third inning of day games after night games, and full count with RISP is probably not good for a comparison). Clearly, the formula to calculate this could get to be massive and difficult to calculate quickly. Also, it would need to be normalized to a finite number of outs, as a team that plays above or below the averages would skew that direction in a longer game.

    Comment by Kevin — November 15, 2011 @ 11:37 am
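
To put a number on the sample-size worry in the .200 versus .225 example, here is a rough sketch that converts the gap into a z-score. It assumes each at-bat is an independent trial at the league rate, which is a simplification, and the 150 at-bats are a made-up sample size; `situational_z` is a hypothetical helper name.

```python
import math


def situational_z(team_avg, league_avg, at_bats):
    """Z-score of a team's batting average in one situation vs. the league.

    Treats each at-bat as an independent Bernoulli trial at the league
    rate (a simplifying ASSUMPTION), so small samples give small |z|.
    """
    std_err = math.sqrt(league_avg * (1.0 - league_avg) / at_bats)
    return (team_avg - league_avg) / std_err


# The example above: .200 vs. a .225 league average, over a hypothetical 150 at-bats.
z = situational_z(0.200, 0.225, 150)
print(round(z, 2))  # about -0.73
```

Under these assumptions a 25-point gap over 150 at-bats is less than one standard error from the league rate, which is why drilling down to very specific situations quickly stops being informative.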

  14. How about some measure of Actual Wins per WAR? A measure of how well the manager is able to turn the contributions of his players into wins. It would have to be adjusted for the number of wins of an all-replacement team.

    Comment by Hoof — November 15, 2011 @ 11:38 am

  15. I agree with this. We would need to control for team skill and luck (BABIP). It would be nice if we could isolate the effects of a manager, but we would likely be measuring an entire coaching staff. Coach of the year awards are usually given to the most overachieving team, so it would make sense to measure how much more than expected they can get out of their team.

    Beyond this, we need to define what we are trying to measure. Are we measuring their application of game theory, like good or bad decisions? Are decisions judged from the perspective of when they are made, or only retrospectively using results? Getting the most out of their players? Turning careers around? Influence in recruiting players? I think the first thing to define here is what events we are looking to measure. Once we have that, we can make some hypotheses and begin testing to see what matters.

    Comment by Steve — November 15, 2011 @ 11:39 am

  16. To address the geeky grammar stuff: “Managing” is never a verb. It is either a participle (which can be compounded with a linking verb to create a finite verb) or a gerund. The former is a verbal adjective and the latter is a verbal noun. You are correct that you are using the gerund but you are not exactly correct that you are using the noun as an adjective. Instead you are creating a compound noun and omitting the needed hyphen, an aspect of grammar that is quickly disappearing. Your phrase “Managing Statistics” is in fact an abbreviated form of “the Statistics of Managing,” where the “of” indicates an aboutness-relationship. The prepositional phrase is an adjectival phrase, which is why “Managing” as you use it appears to be an adjective itself. However, if we understood “Managing” as itself an adjective, we would not be able to recover the proper meaning of your compound noun phrase because we would be taking your phrase to mean “Statistics that manage,” this being the meaning of a participial use.

    Comment by LTG — November 15, 2011 @ 11:39 am

  17. That’s an interesting thought. Of course, the problem does persist, as I noted in a comment above, that the manager still decides when to pull his starter. If the manager consistently makes his starters slog through 7 innings, regardless of how many runs they allow, then his LI in the 7th through 9th would ultimately be lower.

    Comment by Bradley Woodrum — November 15, 2011 @ 11:40 am

  18. Oh yeah, there’s definitely going to be a lot of noise involved at first. Obviously the more complex it gets, the better it’ll be at filtering out that noise.

    Comment by Josh Goldman — November 15, 2011 @ 11:40 am

  19. As Josh Goldman suggests, you can put together a “synergy” estimate.

    Wins – TeamWAR and Wins/TeamWAR are a good start. I think comparing WPA and WAR might also have some value.

    As the other Nathan says, we can look at strategies. We can take a simulation-based approach and compare a manager’s actual strategies to the best strategies we can find via simulation. Lineup optimization is the easiest of these. So, for instance, runs_actual/runs_optimal would be a good indicator. And net change in run expectancy due to hit and runs and steals and such would be useful as well. Defensive alignments are a bit trickier, but again we can take a simulation-based approach.

    A manager’s (and staff’s) influence on player development is a different matter, though. How much of Jose Bautista’s improvement is due to the Jays organization? If it’s even a quarter, then that’s already a couple of wins by itself. Has Mark McGwire taught the younger Cardinals to hit with more patience? How can we tell these things? I have no idea there, since we’d credit the players’ improvement to the players themselves.

    Comment by Nathan A — November 15, 2011 @ 11:42 am

  20. But that would be factored in (if the manager was leaving the starter in for the 7th when they shouldn’t be, then their WPA for innings 7-end would be worse). It doesn’t take into account starters continuing into the 6th when they shouldn’t (you could draw the line at the 6th but you’ll include a lot more ‘normal’ innings without substitutions if you do).

    Comment by Aaron (UK) — November 15, 2011 @ 11:45 am

  21. An astute correction.

    Comment by Bradley Woodrum — November 15, 2011 @ 11:46 am

  22. Do managers miss enough games per season to do an analysis along the lines of “win % differential” for basketball players? For reference:

    Comment by Bryce — November 15, 2011 @ 11:47 am

  23. This might be a stretch, a lengthy calculation process, but could there be a “runs from optimal lineup” calculation? Compare the optimal lineup the manager could have used (based on maximizing runs) to the lineup they did use. No manager will be perfect, obviously, but the closer to zero, the more optimal the lineup construction.

    Comment by Semi Pro — November 15, 2011 @ 11:48 am
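
A toy sketch of the "distance from optimal lineup" idea, under loudly stated assumptions: lineup value is modeled as each hitter's wOBA weighted by a made-up plate-appearance share for his slot. The weights, function names, and sample wOBAs are all hypothetical, and the score is in toy units rather than runs. Under a purely linear model like this the best order is just the hitters sorted by wOBA, so this is not a stand-in for a real simulation-based optimizer.

```python
# ASSUMED relative plate appearances per lineup slot (1st through 9th);
# illustrative numbers only, not measured values.
SLOT_PA_WEIGHTS = [1.00, 0.98, 0.96, 0.94, 0.92, 0.90, 0.88, 0.86, 0.84]


def lineup_score(wobas_in_order):
    """Toy lineup value: each hitter's wOBA weighted by his slot's PA share."""
    return sum(w * x for w, x in zip(SLOT_PA_WEIGHTS, wobas_in_order))


def shortfall_from_optimal(wobas_in_order):
    """Optimal toy score minus the actual lineup's toy score (never negative).

    Under this linear model the best order is the hitters sorted by wOBA,
    best first; a real optimizer would simulate innings instead.
    """
    best = lineup_score(sorted(wobas_in_order, reverse=True))
    return best - lineup_score(wobas_in_order)


# Hypothetical lineup with the best hitter batting fourth.
actual = [0.330, 0.310, 0.350, 0.400, 0.340, 0.320, 0.300, 0.290, 0.280]
print(round(shortfall_from_optimal(actual), 4))
```

A shortfall of zero means the manager matched the toy optimum; the closer to zero, the better, as the comment suggests.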

  24. What about measuring lineup optimization by figuring each position’s performance vs. league average. Figure, for example, the wOBA of the #2 hitter vs. the league average #2 hitter.

    Comment by ElJosharino — November 15, 2011 @ 11:49 am

  25. Brad,

    Why does the world of managing need more statistics?

    Comment by Edwin — November 15, 2011 @ 11:49 am

  26. I did something like this in a college statistics project a few years back. I used only for the batting component, e.g. how the manager managed his position players (so that my project had a manageable scope).

    Methodology: I found a bunch of (about 5) different projection systems for each hitter’s offensive and defensive performance, then predicted based on that what each team’s optimal starting lineup would be. I compared how similar the actual team’s lineup was to the optimally projected one in terms of scoring runs (using a lineup analysis tool), and ranked managers within different projection systems.

    Every month, I revisited my project and compared the actual team’s performance in terms of RS (and individual players’ performance), and basically gave managers positive points if their lineup did better than an optimal one would have, and took away points for worse performances. Then, I redid the optimal lineup using that first month’s performance by each player, as well as anecdotally changing some lineups due to some injuries, etc (that part was somewhat less than scientific).

    I redid that process every month or so throughout the season, and tallied each manager’s month by month final score, to get a score for the year.

    Obviously, within this methodology, there is a big “luck” component in the short term, because a manager could start the wrong guy, or have a bad batting order, etc. etc. and get “lucky” that the guy hits much better than projected (although it could also just be the case that he had a talent or “behind the scenes” knowledge that allowed him to project better than Marcel or whatever other system). However, in a large enough sample (5 years of month-by-month data maybe?), I think this would lead to a pretty reasonable result as the luck evened out.

    Maybe something sorta similar to this could be applied to pitchers, to create a TMS (Total Manager Score)? Thoughts?

    Comment by Ty — November 15, 2011 @ 11:50 am

  27. used *data* in the first paragraph… sorry, long week.

    Comment by Ty — November 15, 2011 @ 11:50 am

  28. Comparing WAR to team records doesn’t seem to me to have any value. WAR, as the Fangraphs staff constantly remind us, is not a measure of true talent. It is more a measure of actual production and amount of playing time. The latter, at least, is more in the manager’s control than team wins and losses. Also, it could be argued that the manager’s people skills or lack thereof have motivated the players to be more or less productive (WAR) than their true talent would predict.
    WPA from the seventh inning on (adjusted to the talent of the relievers or not) sounds so silly to me as not to even merit discussion.
    This is an extremely difficult problem. Anybody who can come up with a method that will even partially measure the manager’s effect on the game will be a genius in my eyes.

    Comment by Husker — November 15, 2011 @ 11:51 am

  29. “A manager’s (and staff’s) influence on player development is a different matter, though. How much of Jose Bautista’s improvements are due to the Jays organization? If it’s even a quarter, then that’s already a couple of wins by itself.”

    A great point — and something I am addressing in my second piece.

    Comment by Bradley Woodrum — November 15, 2011 @ 11:51 am

  30. Hi Edwin.

    Well, philosophically, nothing needs more statistics. Baseball is a pastime for most, an occupation for others, and it has hummed along just fine for 150 years without managing statistics. Frankly, if we’re talking about need, there are some world economics issues that deserve the term (see: Somalia, Sudan, Indian slums, etc.), but baseball notsomuch.

    Managers, though, do not have any analytical stats for evaluation. Why should we spend so much time and effort evaluating all but the manager? We can look at a GM’s trades and signings, a player’s WAR, and an umpire’s strike zone, but only a manager’s wins? Surely there’s something else yet uncovered.

    Comment by Bradley Woodrum — November 15, 2011 @ 11:55 am

  31. M-war. Do it up fangraphs

    Comment by cdawg — November 15, 2011 @ 11:57 am

  32. Would have to say shut downs and melt downs should be part of mwar. So many times you see a manager bring out someone no one in their right mind would use.
    I don’t really see what else you could use…every thing else seems to be on the talent of the team not the manager.

    Comment by cdawg — November 15, 2011 @ 12:01 pm

  33. I had thought of this myself, but this system could unfairly punish managers who set lineups according to correct expectations, but then nagging injuries or unusual under-performance make them look bad.

    Comment by Bradley Woodrum — November 15, 2011 @ 12:07 pm

  34. What’s the old quote, “Everyone loses a quarter, everyone wins a quarter, I’m paid for what I do in the other half.” (Could be third third third, I forgot)
    But, maybe use fWAR W-L (subtracting 40 from W and 40 from L) / Actual W-L (subtracting 40 from W and 40 from L.)?

    Comment by Michael — November 15, 2011 @ 12:09 pm

  35. Well, generally the manager will have a say in who that third base coach will be, so some of that responsibility is still the manager’s.

    That is, the manager will choose who he wants as his coaches and what their role will be on the team.

    Comment by siggian — November 15, 2011 @ 12:17 pm

  36. I’d say a plus/minus system using WPA to determine if the move was the “correct” move. For instance, was the win probability added by a successful bunt worth the risk? If not, it’s a negative. Was the reliever matchup a favorable one based on splits? If not, it’s a negative. If so, a positive.

    Comment by Jeff — November 15, 2011 @ 12:20 pm

  37. One of the problems is that optimal lineups are just numbers. They ignore egos. It’s all fine to say a certain player should bat 2nd, but if he’s always seen himself as being “the man” and batting 4th, it might make sense for a manager to stick with batting that player 4th (provided he delivers good production) to avoid a confrontation. Many people talk about how conservative managers are in their lineups, but many players are equally conservative.

    This is a case of managing lineups vs managing personalities. Sometimes it wouldn’t be worth winning the lineup battle.

    Comment by siggian — November 15, 2011 @ 12:30 pm

  38. Brad,

    I think you took my question a little too far.

    I’m not against statistics. I like using statistics to measure batter or pitcher performance, or even GM performance, because those things are easily measured, and the value from using those statistics can be significant. These types of statistics help a team better deploy their resources. That is why we should spend so much time and effort evaluating these things.

    I don’t see value in coming up with a statistic to determine who the most valuable managers are, since managers have such a low impact on season success compared to player performance. I agree that there is more to a manager than W’s and L’s, but even if you come up with a statistic to better measure manager performance, how does this help a team?

    Comment by Edwin — November 15, 2011 @ 12:35 pm

  39. Would each event be worth 1 point, or the amount of lost/gained WPA?

    Comment by Bradley Woodrum — November 15, 2011 @ 12:44 pm

  40. If I were tackling this, I would first endeavor to create some sort of xWPA, creating a context neutral baseline so we could judge managers. I just thought of this, so suggestions are definitely needed. xWPA wouldn’t just be based on the base-out states, but also on the best guess true talent of the players involved, or expected to be involved assuming the manager makes no changes. xWPA would also have to include some sort of best guess for a true talent platoon split for each player since that’s what managers manage most. Anyways, for each play actual WPA – xWPA = manager WPA, which is in wins above replacement (where a replacement does nothing).

    This idea gives managers credit for players outperforming expectations and for having the players most likely to succeed in the right situations. In the long run, a manager that motivates his players right gets rewarded with this stat, but more importantly, managers that make canny substitutions are rewarded. As an added bonus, the end result is already in WAR. I think this is a good place to start. Who’s up for inventing xWPA?

    Comment by Jon S. — November 15, 2011 @ 12:46 pm
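
The bookkeeping behind "actual WPA – xWPA = manager WPA" can be sketched in a few lines. The hard part, producing a context-neutral xWPA for each play, is simply taken as a given input here; the function name and the sample plays are hypothetical.

```python
def manager_wpa(plays):
    """Sum of (actual WPA - expected WPA) over plays, per the idea above.

    Each play is an (actual_wpa, expected_wpa) pair. How expected_wpa is
    produced is the open question; here it is just a supplied number.
    """
    return sum(actual - expected for actual, expected in plays)


# Hypothetical plays: (actual WPA, xWPA)
plays = [(0.10, 0.04), (-0.05, -0.02), (0.00, 0.01)]
print(round(manager_wpa(plays), 3))  # 0.02
```

Because the sum cannot tell a canny substitution from a player simply beating his projection, it credits the manager for both, as the comment acknowledges.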

  41. I think the Plus/Minus idea is the best approach, but how do you separate out opportunity from this? Won’t some managers have more chances to gain/lose points simply by the random number of decision situations they find themselves in, or the managing style they prefer?

    Comment by Edwin — November 15, 2011 @ 12:46 pm

  42. I’ve given a lot of thought to this prior to this discussion.

    Here goes:

    Any time a manager changes the flow of the game by bringing someone into the game to replace another player can be measured. Was it the best possible solution? And then the follow up. How did it work? This is the case for relievers, pinch hitters, and pinch runners. I guess you could include batting lineup as well. How well is his lineup ranked vs. how well did it perform.

    Then there is the “strategy” side to things. The general play by play that a manager usually has control over. Intentional walks, sacrifice bunts, defensive positional alignment, pitch calling.

    Then we have to consider the moves that a manager didn’t make. Should he have brought in a reliever? Should we have seen a pinch hitter there? Would a sacrifice have been a better option for the team? etc. OR, why did he bring in that reliever, obviously it was the wrong choice.

    Perhaps on the reliever side of things, the available options could be ranked for the situation and the manager could get a rating based on who he brought in. Then the number could be altered based on the reliever’s performance.

    Just some ideas.

    Comment by dusto — November 15, 2011 @ 12:48 pm

  43. Chris Jaffe wrote a book on evaluating managers that introduced some advanced metrics. Well worth reading.

    Comment by Detroit Michael — November 15, 2011 @ 12:50 pm

  44. Dusto: I agree with your thoughts. Maybe WPA and the hypothetical xWPA would help quantify them?

    Comment by Jon S. — November 15, 2011 @ 12:53 pm

  45. I don’t see this as being a problem. If a manager’s optimal runs/actual runs ratio is seen to be below a given threshold, then you could easily say that he has to slot batter x into the cleanup spot for the sake of managing egos. It would be much like the fact that batters with low wOBAs (relative to their career numbers) have a low BABIP – that is, it would be noted as an explanation for poor performance.

    Comment by TP Baseball — November 15, 2011 @ 12:53 pm

  46. That’s likely a better measure of GM performance than managerial acumen. The Yankees will have better wOBA from their #2 hitter than the Orioles (for example) because they are able to get those players, not because Girardi is a better manager than Showalter.

    Comment by TP Baseball — November 15, 2011 @ 12:55 pm

  47. Another complicating factor is the way managers sacrifice future performance for present success. Billy Martin comes to mind as a manager who ruined the careers of many young pitchers by overpitching them to achieve short-term success. Therefore, there should be some mechanism to reward the sustained success of TRL, Earl Weaver, etc.

    Comment by Nate S — November 15, 2011 @ 12:59 pm

  48. I think there are many lines of research to be undertaken on this topic. The first thing that comes to my mind is player performance vs expectations. So say something along the lines of preseason ZiPS WAR vs actual end of season WAR for players managed. Granted this will ignore fielding, but I think it is a useful measure. Obviously this won’t take things like players playing through injuries into account, but I’d argue that this hits on something that should be attributed to the manager anyway.

    Comment by Billion Memes — November 15, 2011 @ 1:01 pm

  49. I would say each event would be worth 1 point.

    As for Edwin’s question, couldn’t you give a +1 for not doing something because that would be the right approach per WPA? Wouldn’t every manager be often put in a situation where a move could be made regardless of approach? For instance, a runner on 1st in the 3rd inning has nothing to do with manager style, it’s just a common situation. Then the question might be whether or not to bunt and that’s where the +/- system comes into play.

    I haven’t really thought this out completely. I was just throwing it out there.

    Comment by Jeff — November 15, 2011 @ 1:03 pm
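
A rough sketch of the plus/minus scoring discussed in this exchange, showing both options raised in the thread: one point per event, or the WPA gained or lost. It assumes some model has already supplied the expected WPA of what the manager did (including standing pat) and of the best option he passed up; those inputs, the function name, and the sample decisions are hypothetical.

```python
def plus_minus(decisions, weighted=False):
    """Score a manager's decisions against the best option he passed up.

    Each decision is (chosen_wpa, best_alternative_wpa): the expected WPA
    of what the manager did (doing nothing counts as a choice) and of the
    best alternative. Both numbers are ASSUMED inputs from some model.
    """
    total = 0.0
    for chosen, best_alt in decisions:
        if weighted:
            total += chosen - best_alt  # WPA gained or lost by the choice
        else:
            total += 1 if chosen >= best_alt else -1  # one point per event
    return total


# Hypothetical decisions: two small good calls and one costly bad one.
decisions = [(0.02, 0.01), (-0.01, 0.03), (0.00, -0.02)]
print(plus_minus(decisions))                           # 1.0
print(round(plus_minus(decisions, weighted=True), 3))  # -0.01
```

The two scorings can disagree in sign, as they do here, so the choice between one point per event and WPA weighting is not cosmetic.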

  50. It seems to me that based on the above comments there are some very interesting ideas, but perhaps too many, leading to us trying to boil the ocean. I would suggest that, contrary to what some have suggested, an all-encompassing “mWAR” is too great a task to take on at the moment. Rather, let’s look at a few select

    Comment by Kevin — November 15, 2011 @ 1:04 pm

  51. Actually, “managing” is a verb — it’s the present-tense form of the verb “to manage.” Were it not, it could be neither a participle nor a gerund.

    Also, I disagree with your construction. “Managing” is here a participle, because it is indeed functioning adjectivally to modify the noun “statistics” — that’s why it takes the “of,” because it’s an adjective in the genitive case. The thing is, the genitive can be either objective or subjective; one could indeed mean “statistics that manage,” but the other means “statistics related to managing.” (Which is which? Alas, I can never keep the objective and subjective genitive straight as to which means which order — whether subject and object in that case refers to the noun or the adjective.)

    Comment by The Ancient Mariner — November 15, 2011 @ 1:05 pm

  52. Similar to this, ESPN recently looked at managers who consistently beat their pythagorean W/L and Scioscia was the clear winner there, Gardenhire pretty good too.

    Comment by DD — November 15, 2011 @ 1:09 pm

  53. Rather, let’s look at a few select scenarios in which the manager plays a part, and worry about a WAR-like, universal statistic later. In no particular order, the things that a manager has control over, and how these things can be analyzed, isolating out the manager’s assets (aka players) to see which he is optimizing:

    Batting order
    Pitching Rotation
    Bunts (ignoring the difference between sacrifices and bunting for hits)
    Steals (I will not include hit-and-runs because it is not easy to tell when a hitter has chosen to swing at a pitch rather than being told to do so)
    Pitchouts (although this is a debatable one to be included here)
    Pinch Hitter usage
    Relief pitcher usage

    And in no particular order, the things which are in the Manager’s control (at least partly) which I would suggest are too difficult to quantify and measure, and where our efforts should not be focused:

    Team chemistry/synergy
    Managing egos
    Positioning of players
    Player growth/development (the Jose Bautista effect)
    Player skill

    Comment by Kevin — November 15, 2011 @ 1:13 pm

  54. I’d say one of the goals of a manager is to get his players to overachieve. Maybe we could see if the players on the team consistently outperform their projections (whether you use PECOTA, ZiPS, Marcel, or whatever).

    Comment by Eric — November 15, 2011 @ 1:21 pm

  55. Good point. Maybe then you take the context of the team into account. Say, using Runs Created you figure what percent of your team’s Runs Created come from the #2 spot, and compare that to league average? So if someone’s #7 spot is creating a greater percentage of his team’s runs than the rest of the league’s respective #7s, while the #2 spot is creating a lower percentage of team runs than the rest of the league’s #2s, then perhaps the manager in question loses points for not swapping the two.

    Comment by ElJosharino — November 15, 2011 @ 1:59 pm
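
One way to make the slot-share comparison concrete is sketched below. It assumes Runs Created totals are already available per lineup slot for both the team and the league; the function name and the sample numbers are hypothetical.

```python
def slot_share_gap(team_rc_by_slot, league_rc_by_slot, slot):
    """Team's share of its own Runs Created from one lineup slot, minus
    the league-wide share for that slot.

    Both arguments map slot number -> Runs Created (ASSUMED inputs).
    """
    team_share = team_rc_by_slot[slot] / sum(team_rc_by_slot.values())
    league_share = league_rc_by_slot[slot] / sum(league_rc_by_slot.values())
    return team_share - league_share


# Hypothetical team whose #7 hitter outproduces its #2 hitter.
league = {slot: 80 for slot in range(1, 10)}
team = dict(league)
team[2], team[7] = 60, 100
print(round(slot_share_gap(team, league, 2), 3))  # negative: #2 spot underproduces
print(round(slot_share_gap(team, league, 7), 3))  # positive: #7 spot overproduces
```

A negative gap at #2 paired with a positive gap at #7 is the pattern that would cost the manager points for not swapping the two.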

  56. I know one thing…… After all these statistics take off….. we will realize that there are about 28 league average managers…… Most of the overpriced managers will be let go, and then Joe Maddon will get a sweet raise…

    Comment by — November 15, 2011 @ 2:10 pm

  57. Yeah, but winning the lineup battle would be a manager skill just like wanting to make the better lineup would be in the first place. An inability to convince Bobby McPasthisprime that he needs to slide out of the #4 slot represents a failing on the manager’s part, just like putting Bobby there in the first place out of ignorance would be.

    Comment by NBarnes — November 15, 2011 @ 2:51 pm

  58. I think you’re on the right track. Start simple, then expand.

    As stated by ‘dusto’ above, for bunts, steals, pinch hitters and relief pitchers there should definitely be a way to take into account moves that weren’t made as well as the ones that were.

    With steals and bunts there should be a handicap of some sort to account for the times a player executes them on his own.

    Perhaps with the pinch hitter stat you could also look at not only whether the best bench player was used in a certain at-bat but also whether or not the guys on the bench are actually the optimal bench players at all. i.e. Are there guys rotting away in AAA who would be demonstrably (or at least hypothetically) better in that situation? (e.g. Mathis v. Conger)

    Comment by Adrastus Perkins — November 15, 2011 @ 3:18 pm

  59. I can tell you where not to look.

    I spent a few nights going over Pythagorean W/L and Pythagenpat data to see if Mike Scioscia really had some sort of elusive skill to enable the Angels to outperform their Pythagorean regularly.

    Going back to 2000 and comparing the +/- for every team up to the end of 2011, I found that the numbers themselves are random but collectively the distribution follows a pattern. It’s already known that the best and worst teams tend to outperform and underperform their Pythagorean, respectively, but for the teams in the middle it seems to be random flux.

    As for Scioscia, he’s at +25 for outperforming Pythag W/L since taking over the Angels in 2000. It looks great but it doesn’t mean anything. Torre was +41 with the Yankees but ?5 in his time with the Dodgers.

    I also thought that being with one team for a long period of time might be a factor as well so I looked at some managers who had at least 10 years with one club and how they fared Pythagorean-wise in their last 7 years with a team, giving them three years leeway. Once again, nothing. Bobby Cox was an average of ?3.42 his last 7 years with the Braves. There’s a chance he just got worse as a manager but I doubt it.

    I’ll post my findings in more detail in the community blog if anyone’s interested.

    Comment by Adrastus Perkins — November 15, 2011 @ 3:42 pm

  60. The question marks should be minus signs… Don’t know what happened there.

    Comment by Adrastus Perkins — November 15, 2011 @ 3:43 pm

  61. I see a lot of people focusing on the bullpen, but the thing we need to realize is that a lot of these teams are seeing the manager as someone whose most important job isn’t in-game managing, it’s handling personalities and motivation.

    The Cardinals in particular made it clear they were looking for someone with “leadership”, not experience managing games or handling certain situations like maybe the Cubs seem to be with their rigorous interview process.

    How can we quantify something like players just playing better?

    As a Cardinals fan I always noticed that La Russa seemed to have a large number of high-performing bench players, and it seemed to me that one of his greatest attributes was getting the bench players to contribute in positive ways throughout a season, more so than other managers I got to watch regularly. I have no clue if there’s any statistical evidence to support such an assertion, but I think we really need to be looking at things like this instead of just in-game management.

    There’s probably too much noise to pull much of anything on the manager’s influence out of all that, though, so I’m not sure where to start.

    In terms of in-game management, couldn’t we just look at the optimal WPA for managerial moves like bunts and pitcher changes for every game state and then compare that to what the manager actually did? Kind of a batting average for managers, with 1.000 being all moves the best? I mean we have tons of articles about this stuff every postseason, so I’m not sure why it’s never been compiled into something like this. Maybe it’s just too much data to sort through, but it doesn’t seem like it’d be too hard to make a program to do it.
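
    A rough sketch of that “batting average” idea, assuming we already had an expected-WPA estimate for every option available at each decision point (which is the hard part). The `Decision` class, the option names and the numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    options: dict  # move name -> estimated WPA of that move (hypothetical estimates)
    chosen: str    # the move the manager actually made


def manager_average(decisions):
    """Share of decisions where the chosen move had the highest estimated WPA."""
    if not decisions:
        return 0.0
    hits = sum(
        1 for d in decisions
        if d.options[d.chosen] == max(d.options.values())
    )
    return hits / len(decisions)


def wpa_left_on_table(decisions):
    """Total estimated WPA given up by not choosing the best available move."""
    return sum(max(d.options.values()) - d.options[d.chosen] for d in decisions)


# Made-up game log of four decisions.
log = [
    Decision({"bunt": -0.012, "swing_away": 0.004}, "bunt"),
    Decision({"pull_starter": 0.020, "leave_starter": 0.005}, "pull_starter"),
    Decision({"steal": 0.010, "hold": 0.000}, "steal"),
    Decision({"pinch_hit": 0.015, "no_change": 0.018}, "pinch_hit"),
]
print(manager_average(log), round(wpa_left_on_table(log), 3))
```

    The second function is there because a plain hit rate treats a tiny mistake and a huge one the same; the WPA given up is probably the more useful number.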

    We’d then want another metric that measures how often the “sub-optimal” moves produce positive or optimal-or-better results, to see who really is “pushing all the right buttons.”

    Comment by Samuel Lingle — November 15, 2011 @ 3:58 pm

  62. This.

    I mean… it’s a whole book. I don’t imagine it’s perfect, but it’s good. So there’s already more rigorous thinking than “practically none” out there, if you remember to look for it before claiming it doesn’t exist.

    Comment by Andy — November 15, 2011 @ 6:34 pm

  63. Assuming that Scioscia is a bad manager in lieu of data is getting away from this whole objective knowledge thing. Just sayin.

    Comment by john — November 15, 2011 @ 7:06 pm

  64. This is exciting. I feel like I’m watching the future of managerial grading begin right in front of my eyes

    Comment by YazInLeft8 — November 15, 2011 @ 7:26 pm

  65. I tried WAR-expected wins minus actual wins, but it had a 0 correlation from year to year. So instead I decided to do something else. I did the linear weights for SB and caught stealing, divided it by team speed score, and multiplied it by league average speed score. I converted this to above-average format. Then I found the wOBA of bunts and put it into wRAA format. I added them both together.
    The leaders were

    Mike Scioscia 14.25610337
    Joe Girardi 10.16566916
    Ron Gardenhire 10.11724406
    Buck Showalter 9.066025265
    John Farrell 8.63586869
    It is too time-consuming, but a good thing to do would be to use regressed wOBA based on PAs, and the same thing with UZR/150, to find out how many runs the team could have scored/saved with a perfect lineup, subtract actual runs scored/saved from that, then compare to league average. I would also add some wRAA for intentional BBs and WPA compared to WPA/LI.

    Comment by thomas — November 15, 2011 @ 7:39 pm

  66. In terms of a manager’s situational decision-making, evaluate him based on his tinkering vs. what the status quo would have looked like (i.e. if the manager was completely hands-off and never called for a bunt, mid-inning pitching change, hit-and-run, etc.) For example, if Matt Kemp is up with Dee Gordon on first base and one out, run a simulation of the probability of the Dodgers scoring a run that inning if Kemp is simply allowed to swing away. If Don Mattingly makes a move, like having Gordon steal, and the Dodgers score a run, award Mattingly accordingly (maybe by subtracting the difference between Gordon’s run and the probability of the Dodgers scoring otherwise). If Gordon is thrown out and the Dodgers do not score, deduct points from Mattingly (let’s say the Dodgers had a .3 chance of scoring, meaning Mattingly would lose .3 from his managerial score).

    The same concept would apply for a mid-inning pitching change, with the probability of pitcher x giving up a run weighted against the actual outcome, based on either a manager’s action or inaction in going to the bullpen.
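
    One way to read the scoring rule above, as a minimal sketch: credit for a move is the actual outcome (1 if the run scored, 0 if not) minus the simulated hands-off probability. The function names are hypothetical, and where the baseline probability would come from is left open. For pitching moves the sign flips, since a run allowed is bad for the manager.

```python
def move_credit(baseline_run_prob, run_scored, on_defense=False):
    """Outcome minus hands-off baseline; sign flipped for pitching moves."""
    outcome = 1.0 if run_scored else 0.0
    credit = outcome - baseline_run_prob
    return -credit if on_defense else credit


def manager_score(moves):
    """Sum credits over (baseline_run_prob, run_scored, on_defense) tuples."""
    return sum(move_credit(p, scored, defense) for p, scored, defense in moves)


# The example above: a .3 baseline, Gordon is thrown out, no run scores.
print(move_credit(0.3, run_scored=False))  # Mattingly loses .3
# Same baseline, but the steal works and the run scores.
print(move_credit(0.3, run_scored=True))   # Mattingly gains .7
```

    Scoring against the actual outcome keeps a lot of luck in the number, so it would take many moves before a manager's total meant much.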

    Obviously there are a ton of other variables that factor into run-scoring outcomes beside a manager’s moves, and this theory also does not account for the manager’s clubhouse presence or teaching ability.

    Comment by Alex L — November 15, 2011 @ 7:49 pm

  67. Why not just actually watch the games and count how many times a decision works out vs. how many times it doesn’t? You could use fWAR and the Pythagorean W/L, but those have faults themselves. If a guy attempts a steal and makes it, it’s a plus 1; if not, minus 1. You could say “well, it depends on the player,” which is true, but knowing your talent is part of managing.

    Comment by Antonio Bananas — November 15, 2011 @ 7:55 pm

  68. Trying to keep it simple and stupid here, and echoing some suggestions above:

    How about Team WPA – Team WAR?

    Comment by fang2415 — November 15, 2011 @ 8:23 pm

  69. Whoops, that’s too simple/stupid, because WPA gives you wins compared to 50/50.

    So you’d want (WPA winning %) – (WAR winning %). Something like:

    (81 + WPA/2) – (WAR + 50)

    (assuming 50 is the wins for a replacement-level team)
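
    A small sketch of this comparison, with one labeled departure from the formula above. If team WPA is summed over all batters and pitchers, each win adds +0.5 and each loss adds -0.5, so over 162 games wins work out to 81 + WPA, not 81 + WPA/2. The sketch uses that identity. The 50-win replacement level is the assumption stated above, and the function names and sample numbers are made up.

```python
def wpa_wins(team_wpa, games=162):
    """Wins implied by team WPA: each win is +0.5 WPA, each loss is -0.5."""
    return games / 2 + team_wpa


def war_wins(team_war, replacement_wins=50.0):
    """Wins implied by team WAR on top of an assumed replacement-level total."""
    return replacement_wins + team_war


def wins_above_war(team_wpa, team_war, games=162, replacement_wins=50.0):
    """Positive means the team won more games than its WAR total suggests."""
    return wpa_wins(team_wpa, games) - war_wins(team_war, replacement_wins)


# Hypothetical team: 90-72 record (team WPA of +9.0) with 38 team WAR.
print(wins_above_war(team_wpa=9.0, team_war=38.0))
```

    Under that identity the WPA side is just actual wins, so the whole thing reduces to actual wins minus WAR wins.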

    Comment by fang2415 — November 15, 2011 @ 8:36 pm

  70. Bill James wrote a book on this in the 1990s – he laid out 15 different statistics to measure managers by and has a variety of essays in the book on the problem of measuring managers and how he developed or decided to use the variables he did.

    On a site like this, I would hope that authors would do a bit more homework. It is especially irksome when a post like this one misses something as big as a book on managers by Bill James, someone who helped to popularize the field of sabermetrics.

    Comment by KB — November 15, 2011 @ 8:52 pm

  71. Hmm, just tried this over the last five years and I’m getting a 95% correlation between WAR wins and (WAR wins) – (WPA wins)… So all the good teams (BoSox, Yankees, Phillies, etc.) come out last. Can’t tell why, but then it is late and I am stupid. So either I’m screwing something up, or this method won’t work for some probably-obvious reason that I can’t see just yet.

    Comment by fang2415 — November 15, 2011 @ 9:14 pm

  72. Bill James’s manager statistics are in his annual handbook. They mainly show tendencies and don’t really give you much insight into how good a manager’s decisions are, only how often he did something. I have been meaning to get Jaffe’s book but have not, so I don’t know what he did in the book. If we really wanted to break new ground, we should set up something like Bill James’ Project Scoresheet, where a few volunteers for each manager would follow the manager and note the 5-10 decisions each game that are questionable or able to be studied. My guess is this would give us 10-15 issues that could be studied on a leaguewide basis. I think you would have to do this at this level to get a good read. For instance, some formula might take points away from a manager because he didn’t use his lefty relief specialist vs. Prince Fielder in the 8th inning. Someone closely following the team might know from watching the postgame interviews or reading the newspaper that the lefty wasn’t available because his wife had a baby, he threw 25 pitches the night before, he had a sore elbow, etc.

    Comment by Will — November 15, 2011 @ 9:44 pm

  73. There are so many different places to go with this topic, and so much inherent noise in almost all of them, that I wonder if it might be more productive to take an approach similar to Tom Tango’s Fan Scouting Report – allow fans to rate managers on the 20-80 scale in any number of different categories, such as:
    Use of in-game offensive strategies (hit and run, sac bunts, etc.)
    Use of substitutions (pinch hitters, runners, defensive replacements, double switches)
    Leveraging of bullpen
    Appropriate workload for relievers
    Appropriate workload for starters
    Managing/development of young talent
    Lineup construction
    Control of team, media relations, etc. (the intangibles)

    I’m sure I’m forgetting something, and each category could certainly be weighted appropriately to produce a final grade; it’s true that that grade would have a more subjective basis than might be ideal, but I have a hard time envisioning an objective statistic that cuts through the noise cleanly enough to provide a statistically reliable measurement of managing skill.

    Comment by Preston — November 15, 2011 @ 10:47 pm

  74. As Will said, the annual Bill James handbooks have a number of interesting managerial stats. Check them out.

    Comment by Benj — November 15, 2011 @ 10:59 pm

  75. It is hard to measure the moves a manager could have made, but you can look at the decisions they actually follow through on. Looking at things like bunts, stolen bases and pinch hitting, you can see the WPA of making the move compared to not making the move. Baseball isn’t that simple where it is plainly bunt or don’t bunt, pinch hit or leave the guy in, but at least it is something. You can have some kind of accumulated +/- based on the WPA of the manager’s active decisions as compared to the WPA of the passive opposing decision.

    Comment by Jack Burton — November 17, 2011 @ 5:20 pm

  76. I’m not assuming Scioscia is a bad manager. I’m just saying that outperforming the Pythagorean W/L is not an indicator that he’s a good manager.

    Another stat may very well find that he is.

    Comment by Adrastus Perkins — November 19, 2011 @ 5:01 am
