## Microeconomics And Offense (Part 1)

In microeconomic theory, there are two factors of production: capital and labor. Labor is the manpower used to create output, and capital is the machinery and technology that makes that labor work more effectively.

At its core, there are two main factors in scoring runs: getting runners on base and knocking those runners in. Looking at these skills as capital and labor reveals much about each team’s offense and can give insight into where a team needs to invest.

True, there are other elements of a team’s offense that contribute to scoring runs, stealing bases for example, but there simply is no substitute for setting the table and knocking runners in. These two variables explain 94% of the variance between teams in runs per game in the majors last year.

In this two-variable system, maximizing output depends on finding the optimal mix of these two inputs, given a budget constraint. The key concept is that there are diminishing returns for both inputs, and therefore it is inefficient to continually spend on one or the other. For example, it would be inefficient to have one worker try to use a warehouse full of machines, and likewise it would be inefficient to have 50 workers waiting their turn to use the company’s one machine. The optimal allocation will be some combination of the two inputs.
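As a rough illustration of the diminishing-returns idea, here is a minimal sketch. It assumes a Cobb-Douglas production function with equal weights, a budget of 10, and a unit price for both inputs; none of these numbers come from the analysis itself, they are placeholders chosen only to show why spending everything on one input is inefficient.

```python
# Hypothetical two-input production function (Cobb-Douglas form, assumed here).
def cobb_douglas(capital: float, labor: float, alpha: float = 0.5) -> float:
    """Output from capital and labor; alpha is the assumed weight on capital."""
    return capital ** alpha * labor ** (1 - alpha)

BUDGET = 10.0  # assumed budget; both inputs are assumed to cost 1 per unit

# Try spending 0%, 10%, ..., 100% of the budget on capital, the rest on labor.
capital_shares = [i / 10 for i in range(11)]
outputs = {
    share: cobb_douglas(share * BUDGET, (1 - share) * BUDGET)
    for share in capital_shares
}

# The split that produces the most output under these assumptions.
best_share = max(outputs, key=outputs.get)

for share in capital_shares:
    print(f"capital share {share:.1f} -> output {outputs[share]:.3f}")
print("best capital share:", best_share)
```

With equal weights the even split comes out on top, and the two all-in allocations produce nothing. Different weights or prices would move the optimum, but it would still be an interior mix.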

For baseball offense, the theory implies that a team with good lineup balance will score more runs than a team full of leadoff hitters or a team full of power hitters. A team looking to improve its offense will get more return for its dollar if it finds the right type of player. Toronto, for example, will get more return from a top-of-the-lineup table setter than from another middle-of-the-lineup slugger. Why? Because the Blue Jays led the league in clearing the bases last year, but the bases were just not occupied enough.

The blue lines in the above graph represent the league average. To no one’s surprise, Baltimore, Cleveland, Oakland, and Seattle lie in Quadrant II; below average in both categories. Seattle again shows us just how historically bad its offense was in 2010. The Mariners are not just far away from the other data points, they are on their own island.

Quadrant IV contains teams that were above average in both categories: Chicago, Tampa Bay, Texas, New York, Boston, and Minnesota. The Yankees led the league in runs per game by a quarter of a run, and that distance is shown quite nicely in the graph.

Quadrants I and III are where the interesting analysis lies. Kansas City and Detroit fall into Quadrant I, as they were above average in OBP, but below average in strand percentage. These teams were good at setting the table, but too often left runners on base. In the theory of capital and labor, these teams would get a good return on an investment in a run producer. True, the Tigers have Miguel Cabrera, but this graph shows that having Brennan Boesch and Carlos Guillen follow him in the lineup is holding Detroit back from reaching its offensive potential.

Quadrant III contains Los Angeles and Toronto. These teams were above average at clearing the bases, but below average at getting players on base. The Blue Jays are an interesting case. They show the potential to be the league’s top offense if they could more efficiently get on base. Toronto has a decent middle of the lineup with Jose Bautista, Aaron Hill, Vernon Wells, and Adam Lind, but they are handcuffed by the lack of a good leadoff hitter. Fred Lewis and DeWayne Wise simply did not get it done.

More thoughts on this concept next week and a look at the National League.

Only intermediate micro has two inputs and attempts to make reference to everything in terms of capital and labor. Once you start talking about production functions at an advanced level, you assign shares to different types of capital.

There are very specific, arcane reasons why something like a Cobb-Douglas is used in economics. To put it simply, it’s so you can preserve certain things economists find important while making it mathematically tractable. Those things are unlikely to be the ones people care about in baseball.

What you’re saying seems to apply to 1960s-1970s macro more than modern micro. Those models did boil everything down to capital and labor. However, contrary to the point I think you’re trying to make here, the most prominent of these models, the Solow growth model, says what really matters for economic growth is the effectiveness of the use of capital and labor (generally thought to be “technology”), not the growth rate of labor and capital (power and OBP) themselves.

I was thinking the same thing. I’ve never seen a microeconomist use Y = K*L type equations, only macroeconomists.

Also, isn’t using OBP and strand rate to predict runs a bit like using ERA to predict runs against? The 96% correlation suggests this is near definitional.

A Cobb-Douglas production function, which is a generalized version of what you just wrote, is commonly used in micro.

Ironically, however, to solve such a maximization problem, an economist would take the natural log of the function. A Cobb-Douglas would often be Y = K^(alpha)*L^(1-alpha), where alpha is between zero and one. If you take the natural log, you get ln(Y) = (alpha)*ln(K) + (1-alpha)*ln(L).

When you look at it that way, all of this economics stuff simplifies to a kinda pretentious version of linear weights.

Footnote: taking the natural log is permissible because logarithms are monotonic functions, and what economists care about is just the point that maximizes profit (or utility), not necessarily the value of the profit itself. After you figure out the optimal values of capital and labor, you can then reinsert them into the original equation and find out whatever else you care about.
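For readers who want the steps spelled out, here is a sketch of the log-transformed problem under a budget constraint. The symbols r (price of capital), w (price of labor), and B (budget) are assumptions added for illustration and do not appear in the comment above.

```latex
\begin{aligned}
\max_{K,\,L}\;\; & \alpha \ln K + (1-\alpha)\ln L
  \quad \text{subject to} \quad rK + wL = B \\[4pt]
\mathcal{L} \;=\; & \alpha \ln K + (1-\alpha)\ln L + \lambda\,(B - rK - wL) \\[4pt]
\frac{\alpha}{K} = \lambda r, \qquad & \frac{1-\alpha}{L} = \lambda w
  \quad\Longrightarrow\quad rK + wL = \frac{1}{\lambda} = B \\[4pt]
K^{*} = \frac{\alpha B}{r}, \qquad & L^{*} = \frac{(1-\alpha)\,B}{w}, \qquad
Y^{*} = (K^{*})^{\alpha}\,(L^{*})^{1-\alpha}
\end{aligned}
```

The optimal inputs come from the log form, and the last expression reinserts them into the original function, as the footnote describes.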

Yeah, that is what is used in intermediate micro classes. But I’ve never seen a microeconomist use it in anything but teaching intermediate microeconomics. I have seen tons of macroeconomists use it for their research.

“Yeah, that is what is used in intermediate micro classes. But I’ve never seen a microeconomist use it in anything but teaching intermediate microeconomics. I have seen tons of macroeconomists use it for their research.”

Cobb-Douglas functions are used all the time in empirical micro papers, but in the log-linearized form that darsox64 mentioned.

I enjoyed the post, irrespective of the drill down. Solow Growth models are Macroeconomics, FYI.

duh.

“What you’re saying seems to apply to 1960s-1970s macro more than modern micro.”

I wonder if with a little more work we could see if the Blue Jays led in strand rate because they didn’t get nearly as many runners on to begin with, it seemed like their team was built to either K or HR.

Frankly their team was built to wait until 2012, and this year’s team seems similarly constructed.

“The Blue Jays are an interesting case. They show the potential to be the league’s top offense if they could more efficiently get on base.”

How predictable is Strand Percentage? I was under the impression that it fluctuates much more than other measures because of the importance of sequencing. If so, potential top offenses would be ones with high OBP but above-average Strand Percentage.

In other words, my hunch is that you could rename your axes skill (x) and luck (y).

To be fair, they look incredibly correlated.

While this makes sense as a theoretical construct, I’m wondering what quantifiable factors go into improving a strand rate? I mean, “RBI men don’t really exist” seems to be part of the eightfold path to sabermetric enlightenment, right? So what would a team do to improve its strand rate besides getting a “run producer”? Does “improved offensive strand rate” translate into “get more hits” and “also, if those hits could be homers, that would be great too?” I’m a bit of a recreational stats enthusiast, so I really am curious if there is some written research about an “offensive strand rate” that I’m not familiar with that suggest certain definable skills contribute to it besides the number of hits/extrabase hits a team gets.

If so, don’t wOBA and, more crudely, OPS already account for what we’re looking at here? I feel like I must be missing something because it can’t be as simple as saying “A team with a low on-base percentage should try to improve its on-base percentage” and “a team with low slugging percentage should get some guys with more power.” Is there some sort of optimal mix you are suggesting here (to borrow your analogy, that it is better to have slightly more capital than labor, or slightly more labor than capital)?

That’s why you need guys who make contact, too. Hits are greater than walks when it comes to advancement of runners. And avoiding Ks (putting balls in play with a runner on 3B and <2 out) is a useful skill, as well.

It’s interesting to look at the top 4 OBP teams. 3 playoff teams plus the Red Sox.

Also, as a baseball simulation guy (big fan of Out of the Park Baseball series), I definitely agree that it takes balance. I’ve tried building teams based on speed and based on power, but the best teams are always the ones with specific players for specific roles. Just like an NBA team needs an Alpha Dog, a Sidekick, and role players, baseball teams need the leadoff guy, the power-packed middle, and the reliable lower order.

On a bit of a different note – has Fangraphs ever looked at the various software available for baseball simulation and rated/reviewed them? I would really like to get back into baseball sims but my experience basically started/ended with the old board games. I would appreciate any opinions on my options before I go out and buy one. Thanks in advance!

First, as Steven Sanders says above, isn’t the idea that strand rate is a function of luck rather than skill a basic of sabermetrics? Yes, it’s nice that people can advance runners, but how much is that a repeatable skill?

Second, budget isn’t the only constraint. There is a natural constraint that there must be 9, and only 9 players. While this may seem obvious, it changes the equation completely. What you would want to maximise is runs scored using some sort of Lagrange multipliers so that you can subject the equation to two constraints.

Small fluctuations in strand rate are likely due to luck. But over the course of the entire year, a baseball TEAM probably accumulates enough data that a strand rate is a fairly good indicator of “run producers” (for lack of a better word). We talk about how a pitcher’s strand rate (LOB%) fluctuates a lot, but that’s one pitcher. Here we’re talking about an entire team of players playing every day. That’s a much larger sample size, and less susceptible to the random bug.

However, I would agree with Barkley Walker (above) that strand rate seems almost definitional. For example, RBI correlate very strongly with runs scored, but that doesn’t mean they should be included in a regression model. RBI are like a byproduct of run scoring. Perhaps a look at ISO power, or slugging (perhaps weighted differently) would be a better y-axis variable?

Just a thought.

Isn’t the real constraint here the 27 outs (or maybe the three outs per inning)? To put this point another way, where is the evidence that returns to OBP diminish? The Yankees led the league in OBP, and led the league in runs scored. The fact that hitting for power (if that’s what strand % represents, which seems plausible) is also valuable does not imply that additional increases in OBP become less valuable at any point. After all, every way of reaching base that increases OBP will also advance a runner on first, and thus all runners on adjacent bases. I’m not sure if this is a separate point, but in economics the independence of factor supply and demand is crucial to diminishing returns analysis, while it’s not at all clear that OBP and strand % are genuinely independent (partly by virtue of the point I just made about reaching base = advancing runners on first and adjacent bases, partly because power hitters draw more walks). So while this is an interesting analysis, I don’t think it makes the case that any team is near the frontier where increased run scoring can only come from improvement on one of these measures rather than the other.

I agree with your last point–in fact every team could improve by improving both metrics–but that does not ruin the value of this analysis because teams can move diagonally as well.

Connect all the pairs of Strand% and OBP that result in the same number of runs (on average). Suppose when you do so you get a bunch of circles centered at Strand% = 0% and OBP = 100%. Suppose also (for convenience) that cost is proportional to the distance moved in the figure. Then every team would be best off moving directly towards Strand% = 0% and OBP = 100%.

Thus the best personnel moves would be those that move the team in that direction (it’s more complicated than this, but I don’t want to impose a realistic cost function).

How does this strand % relate to pitchers’ LOB%? I feel like if a pitcher had a LOB% of 56%, which looks like league average here, they’d be absolutely terrible.

Continuing your parallel with labor and capital, a Marxist view of capital implies that the sluggers derive profit solely through exploiting the value produced by OBP. Down with the elitist RBI producers!

On a more serious/worthy note, in the case of the Blue Jays, the OBP of that middle of the order was HORRENDOUS. Below .300 for Hill and Lind last year… I’m not sure that it’s so much a matter of an “unsuccessful leadoff role” as the fact that the team was generally poor in OBP from top to bottom. The prescription that follows from that isn’t for the Jays to “get a better leadoff hitter,” so much as “get better hitters.”

Also, I see the importance of lineup interaction in the NBA as based on the creative aspect of offensive play there, and even more so, sadly, the egos of professional athletes. I’m not sure that baseball lends itself to ball-hogging tendencies, and it is always better to have better offensive players than worse (as measured in shooting% and ability to create a decent shot through ball movement or beating one’s man off the dribble). Not having a “sidekick” may manifest itself as not having a second reasonable scoring option, allowing a defense to key on the top man, but this would be more a question of team offensive talent than role play.

Whether balanced lineups or a more concentrated level of talent is beneficial to scoring runs, and whether individual hitters should have specialized or diverse skills, are, I think, separate questions that are not addressed by a look at team OBP/strand rate.

Just wanted to add that I really enjoyed this. Looking forward to seeing more stuff from you at FG.

Completely agree. Looking forward to more!

With the exception of LAA and TOR, there also seems to be some sort of correlation between the two variables, which would make sense.

If people consider Strand Rate a luck thing, would the graph look very similar if Strand was simply replaced with BA w/RISP etc.?

And does this graph mean Solo Home Runs don’t make any difference at the end of the day?

Yeah the correlation for strand rate and OBP makes sense, especially if you think about limits.

If you take the limit as OBP approaches 0, you’d expect the strand rate to be 1. If no one ever gets a hit, how would you expect a runner on base to ever score?

Also, as the limit of OBP approaches 1, the strand rate would be 0. If no one ever made an out, then no one would ever get stranded.

By my calculations the league average team scored 710 Runs, had a .325 On-Base Average and had 2,010 men reach base in 6185 total plate appearances, so 35% of all men who reached base scored for the average team and 65% did not. I wonder what the percentage of base-runners who scored for each team is….
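The arithmetic in the comment above can be checked quickly. This is only a sketch using the three figures the commenter quotes (710 runs, .325 on-base average, 6185 plate appearances); the variable names are made up for illustration, and the calculation ignores details such as batters who score on their own home runs or reach on errors.

```python
# Figures quoted in the comment above (league-average team).
RUNS = 710
ON_BASE_AVERAGE = 0.325
PLATE_APPEARANCES = 6185

# Men reaching base, estimated as on-base average times plate appearances.
men_on_base = ON_BASE_AVERAGE * PLATE_APPEARANCES

# Share of those baserunners who came around to score, and the share who did not.
scored_share = RUNS / men_on_base
not_scored_share = 1 - scored_share

print(f"men on base:      {men_on_base:.0f}")
print(f"scored share:     {scored_share:.1%}")
print(f"not scored share: {not_scored_share:.1%}")
```

The results round to the roughly 2,010 baserunners and the 35% / 65% split quoted in the comment.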

I think the league average strand rate is something like 70-73%, due to runners that reach base, but are not stranded (double plays, CS, etc). My understanding was that any individual variance from that was by and large luck, although I would be interested to hear otherwise…

Yes. Liked this. Would like more on whether strand-rate is repeatable or just a fluctuating multiple of OBP (which could make this post non-sensical).

Still, the idea here is promising. Good work.

This is one of the most unique and thought-provoking articles on baseball I have read in a long time. It would actually be cool, as strange as it sounds, if you referenced some of your sources. Like, where does the 94% stat in the beginning come from? Nice work.

My guess is it’s the r^2 value for trying to predict runs scored based solely on OBP and strand rate.

The Blue Jays also led the league in solo home runs… by a large margin, if I’m not mistaken.

Wow, the team with lowest OBA also led in Strand %. (I’m not taking a shot at SEA, everyone knows they had a horrible year at the plate).

Hey look, SEA nearly matched their geographic location. All alone in the Northwest.

Who would have guessed that KCR would be on the “right” side of the graph? … y’know over there with the good teams.

Isn’t strand rate more a function of power and team speed? Stealing a lot while rarely getting caught, taking the extra base well, and hitting for power would seem to be key here.

This is a nice analogy and a good topic. When a graph illuminates something you may have an inkling of, but you still say “cool, I never thought of it like that”, it is highly worthwhile presenting to the sabermetrics community, in my opinion. Great work!

The hypothesis is interesting (that you need a balanced lineup to maximize runs scored), but I don’t think you really demonstrated it.

WOW I DIDNT REALIZE YOU HAD TO HOLD A DOCTORATE IN ECONOMICS JUST TO ENJOY A FUCKING GAME. YOU PEOPLE MAKE ME SICK.

Well now you know.

if this makes you sick, you are NOT going to want to see the pictures of your mother floating around teh intarewebs.

Amen, Neil. What does OBP even stand for?

rush kinda sucks

lol any member of Rush has more creative and technical talent in one toe hair than you or any artist you listen to have or ever will have in your entire bodies, so that would be a no…as for these so-called “advanced stats” that require computer science degrees to understand, they take the fun out of the game. Boiling down players to numbers and formulas is just a dry, cold, detached way to view a sport. Not only that, but these numbers you come up with are all DESCRIPTIVE, not predictive. They don’t help you understand what a player is going to do down the road. I follow stats like ERA and RBIs because they’re good enough for all major publications, broadcasters/ex-players, etc. so they’re good enough for me.

Why can’t a person enjoy baseball in a variety of ways?

To me that’s one of the best parts of baseball, you can get as nerdy as you want … sometimes depending even on your mood that day. There’s no reason why a person cannot enjoy an article comparing UZR and Dewan, and then ooh and aah at majestic BP home runs 5 minutes later.

I’ve enjoyed this great game as a player, coach, administrator, umpire, fan, dad, and stats geek trapped in an athlete’s body. Each aspect is enjoyable in a different way.

Also, Rush only sucks in the “they’re not Metallica” kind of way … but again, that’s a very narrow view of things.

Hmm … think this was said, but

1) Isn’t there an autocorrelation (if that’s the right “technical” term) issue here? Higher OBP => by definition lower strand rate?

2) Isn’t this macro, not microeconomics?

Where did you find the offensive strand rate numbers at?

This is what I would like to see done. Supposing there is an underlying production function that can be approximated using variables like Strand % and OBP, one could reveal the contours of this function. The iso-run lines would be concave curves with positive slope everywhere except along axes. In so doing, the marginal rates of technical substitution will be revealed which would suggest which direction teams should move.

It would be merely suggestive because these aggregate team measures do not account for the interactions at the event level. (See comments by Steven J. Sanders III and MB).

For starters, would you put runs/gm on the graph instead of the team names?

Isn’t this basically “On-Base Times Slugging” as a measure of team offense? And I can enjoy baseball just fine without a PhD in Economics, I only have a Master’s :)

My problem with this article is that you seem to assert that the labor : capital :: run producers : OBP guys analogy is true without providing any reasons why it’s true. I mean, I get that you can’t have 50 guys waiting to work on 1 machine and still be efficient. Why does that mean that the addition of another high OBP guy at the expense of a high SLG guy is inefficient? In other words, if Toronto added a .400 wOBA guy to their lineup, would it really matter if he derived most of his value from his OBP or his SLG? If so, why?

I just don’t see why there’s a basis for suggesting that your analogy is a workable one.

Getting on base and advancing runners are interactive: at least in part an x times y relationship, not simply x plus y. For an individual batter, OPS is a reasonable estimator of value, but for teams, On Base Times Slugging is better. (It is likely that On Base should have a larger exponent than Slugging, but that doesn’t change the basic concept of interaction.) The author here used Strand %, which is subject to more luck than Slugging, but the essential concept is the same as team OTS. For a team with fairly average OBP and SLG, your “what does it matter” comment would be accurate, but a team whose offense is significantly skewed one way or the other will do better to improve the weaker element.

To take a concrete example, suppose you have a team with 9 sluggers who hit a home run 10% of the time and otherwise make an out: OBP = .100, SLG = .400, OPS = .500. The team would average approximately .33 runs per inning: .10 for each of the first three batters, plus extra plate appearances if any of them connect.

Now you have the option to acquire Eddie Gaedel. He draws a walk 50% of the time and otherwise strikes out: OBP = .500, SLG = .000, OPS = .500. Leading off the first inning, if he draws a walk, the team will average about .60 runs (2 runs at 3*10%, minus the interaction, plus the extra PA’s for additional batters.) I’m not going to work out the exact value, it will be in that vicinity. If Eddie strikes out, the team will average about .22 runs that inning. So, 50% of .6 plus 50% of .22 = .41, certainly better than .33 with all sluggers.

This example is extreme, and the results are not all that dramatic; so, for any real life team, it is not obvious how much difference a high OBP vs. high SLG guy would make. But the principle at least is there.
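The figures in this example can be worked out exactly for a single inning. The sketch below is not the commenter’s own calculation; it uses two standard probability facts (the expected number of successes before a fixed number of failures, and the chance of at least one success before those failures) and assumes every batter after Gaedel in the inning is a slugger, ignoring the remote chance that the lineup turns over. All function and variable names are invented for illustration.

```python
def expected_homers(p_hr: float, outs_left: int) -> float:
    """Expected home runs before `outs_left` outs when each PA is a HR or an out."""
    return outs_left * p_hr / (1 - p_hr)


def prob_any_homer(p_hr: float, outs_left: int) -> float:
    """Chance of at least one home run before `outs_left` consecutive outs end the inning."""
    return 1 - (1 - p_hr) ** outs_left


P_HR = 0.10    # each slugger homers 10% of the time, otherwise makes an out
P_WALK = 0.50  # Gaedel walks 50% of the time, otherwise strikes out

# Lineup of nothing but sluggers: every run is a solo homer.
all_sluggers = expected_homers(P_HR, 3)

# Gaedel walks: each homer still scores its batter, and the first one also scores Gaedel.
after_walk = expected_homers(P_HR, 3) + prob_any_homer(P_HR, 3)

# Gaedel strikes out: the sluggers have only two outs left to work with.
after_strikeout = expected_homers(P_HR, 2)

with_gaedel = P_WALK * after_walk + (1 - P_WALK) * after_strikeout

print(f"all sluggers:    {all_sluggers:.3f} runs per inning")
print(f"after a walk:    {after_walk:.3f}")
print(f"after a K:       {after_strikeout:.3f}")
print(f"Gaedel leadoff:  {with_gaedel:.3f}")
```

Under these assumptions the exact values land close to the estimates in the comment: about .33 for the all-slugger lineup, about .60 after a walk, about .22 after a strikeout, and about .41 for the inning Gaedel leads off.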

Regarding microeconomics vs. macro, a glance at an old textbook, “Microeconomic Analysis” (Varian, 1984), shows that it certainly includes Cobb-Douglas type production functions; this seems basic to any discussion of the theory of the firm. The author here is concerned with individual choice by baseball teams, which is certainly micro in nature.

Mas-Colell is the current advanced micro theory textbook. Chapter five is on production. There is a Cobb-Douglas production function mentioned for _any_ two inputs, not necessarily capital and labor. It’s also clear that the example with two inputs is just a simplification performed so the math is tractable, not as an argument that everything is actually reducible to two such inputs, as this post suggested.

You don’t need to have a simple two-good case. The model can be expanded to three inputs as Y = K^(alpha)*L^(beta)*T^(1-alpha-beta), where T is land.

But when the model is parameterized and values for alpha and beta are identified, that is done by taking logs in the manner I mentioned above. When you take the log of a production function, it LOOKS EXACTLY LIKE LINEAR WEIGHTS. So obviously, this is still “micro in nature”, but it’s just a bunch of complicated dress that gets you back to linear weights.

Example:

Runs = Onbase^(alpha)*Power^(beta)*Contact^(gamma)*Baserunning^(1-alpha-beta-gamma)

The simplification to nothing but capital and labor is done to make the math solvable because most economists believe it is benign. They believe that to be so because of OTHER empirical research, not because the simplification itself is important.

Sure, if you take the logarithm of a multiplicative function, it looks linear — but it isn’t. Take 4*2=8. Now add one to either the 4 or the 2; you get 5*2=10 or 4*3=12. Clearly, you gain more by improving the weak score, and log(12) is still greater than log(10). A linear weights scheme that used the same weighting for the two components would yield 5+2=7=4+3. Linear and log-linear are not the same thing.
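The 4*2 example above is easy to reproduce. The sketch below simply recomputes the commenter’s numbers under a multiplicative score, its logarithm, and an equal-weight linear score; the function names are hypothetical.

```python
import math


def multiplicative(x: float, y: float) -> float:
    """Interactive score: the two components multiply."""
    return x * y


def linear(x: float, y: float) -> float:
    """Linear-weights style score with the same weight on both components."""
    return x + y


strong, weak = 4, 2

# Add one unit to the strong component, or to the weak component.
improve_strong = (strong + 1, weak)
improve_weak = (strong, weak + 1)

print("multiplicative:", multiplicative(*improve_strong), multiplicative(*improve_weak))
print("log of product:", math.log(multiplicative(*improve_strong)),
      math.log(multiplicative(*improve_weak)))
print("linear:        ", linear(*improve_strong), linear(*improve_weak))
```

The multiplicative score and its log both prefer improving the weak component (12 versus 10), while the equal-weight linear score sees no difference between the two moves (7 either way).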

Your more generalized production function might make sense; but it is less clear how the four components interact. It is not clear that a team which hits lots of singles will benefit more from extra power or extra walks than even more singles; and such a team may benefit disproportionately from good base-running, while a high power team will be the opposite. Contact, of course, adds to both OBP and SLG in the classic formulation; the proper way to adjust for baserunning may be to add bases gained into an adjusted SLG score, while subtracting outs-on-base from an adjusted OBP. I think I’ve seen formulas along those lines.

And economics uses logs because the relationships are often ones where % changes are more intuitive than the absolute values.

My point wasn’t that the production function I gave was “the” production function, but that nothing is gained by introducing production functions in the first place that wasn’t already captured by linear weights, unless the argument is that % changes are what matters.

I generally agree that sabermetrics is a form of economics — and is therefore a useless system, in my opinion, when it is used to point out how good high-priced players like Albert Pujols are; it’s about value and counterintuitiveness — but this seems to be pushing it. OBP & RBI’s as Labor and Capital? Ehhh… Don’t push it.

This seems like an overly cutesy and non-insightful line of analysis

Yes, I agree: Rajai Davis will most certainly improve the Blue Jays offense. And with better years from Aaron Hill and Adam Lind, I could see the BJ’s offense improving measurably.

But losing Marcum really, really sucks. 4th place, here they come.

Surprising that the White Sox ended up in Q4, and were the only non-playoff team to be there. I can’t help but remember so many people talking about how poorly constructed their offense was in 2010. Using a simplistic approach and just looking at these two variables, it wasn’t bad.

Let me start by saying I absolutely love economics. I majored in economics in college, and am now employed by the Foundation for Economic Education (yes I love the Austrian stuff).

With that said this post is comical. You used microeconomic theory to show that teams who get the most runners on base and strand the least of those runners will score the most runs? DUH! You can replace “OBP” with labor and “strand %” with capital but really you just stated the obvious.

Economics has its place in baseball. For example, micro-theory can accurately predict contract length/salaries for free agents. But don’t bring economics into an area of the game where it’s not needed just to try and look smart.

And so, the nonsense came to a close. Thank you Phil. Thank you very much!