Saber-Friendly Tip #3: On Decimals

If you’ve missed my earlier Saber-Friendly Tips, you can find them here.

As I have alluded to in the not-so-distant past, I feel like sabermetric writing should not be all the same. If you’re writing a piece that’s geared for other saberists — or for a very knowledgeable audience, like this one — then obviously very different rules apply than if you’re trying to cater your analysis to a broader audience. You can toss around multiple acronyms and discuss statistical concepts without much worry, while doing the same thing in other places could have you denigrated by your audience as a know-it-all, pompous jerkface.

We all love to poke fun at television announcers — whether at ESPN, MLBN, or elsewhere during game broadcasts — but they face a very difficult task: how do you give insightful analysis while still appealing to the wide range of different viewers out there? There are plenty of announcers out there that love stats and analysis (hello, David Cone!), and it’s no easy task to try and mesh those numbers into a game broadcast without scaring off all the viewers out there who don’t like math.

These same challenges apply to us saberists. What sort of an audience are we trying to reach, and how can we best do so? Today I want to suggest another way in which people can help make saber-stats easier to digest: rounding your numbers.

All too often, saber writers and bloggers neglect to consider the aesthetics of their posts. When I sit back and look at this piece, does it look like something that I would want to read? Does it have large blocks of text? Are there multiple acronyms in each paragraph? Are there too many links? As silly as some of these things may sound, all of them influence whether people are going to read a piece or not.

And one of the worst offenders against aesthetics in sabermetric pieces is the decimal point. Many of our traditional baseball stats are built around decimals (batting average, on-base percentage, slugging, etc.), and many of the new stats list out decimals to the point of insane specificity. Paragraphs end up looking like giant strings of numbers, separated by odd acronyms, making the piece dense and tough to access unless you really, really like sabermetrics.

So here’s my question: Does it really matter if I know that Evan Longoria has a 2.66 wFB/C, or is it enough to know that he has a 2.7 wFB/C? What difference does it make if we say someone has a 12% walk rate instead of an 11.7% walk rate? Is it sacrilegious to say someone has thrown 57% fastballs, when really it’s only been 56.7%? These are all very small changes, but they can go a long way toward making pieces more readable.

Of course, it all depends on what audience you want to reach. I’m glad that FanGraphs lists statistics in such detail, as it’s perfect for research purposes and lets people get as specific as they desire, but we sometimes forget that it’s okay to take these numbers and round a bit (especially percentages). Don’t round the stats that need that extra precision (I’m thinking of WAR and wOBA, specifically), but for the vast majority of stats, you should be able to make the numbers shorter and easier to read without losing any meaning.
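For anyone who pulls their numbers with a script, here is a rough sketch of what "round for the reader" could look like in Python. Everything named here is my own invention for illustration: the `DISPLAY_DECIMALS` table, the `display_stat` helper, and the specific decimal counts are assumptions, not anything FanGraphs publishes. The idea is simply to keep full precision in your data and trim it only at the moment you print it, while leaving stats like wOBA and WAR at the precision you would normally see them listed.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical display precision (decimal places) per stat.
# These choices are assumptions for illustration, not a FanGraphs convention.
DISPLAY_DECIMALS = {
    "wFB/C": 1,  # 2.66 reads fine as 2.7
    "BB%": 0,    # 11.7% reads fine as 12%
    "FB%": 0,    # 56.7% reads fine as 57%
    "wOBA": 3,   # left at three decimals (not trimmed)
    "WAR": 1,    # left at one decimal (not trimmed)
}


def display_stat(name, value, default_decimals=1):
    """Return `value` as a string rounded for reading, not for calculation."""
    decimals = DISPLAY_DECIMALS.get(name, default_decimals)
    # Build the rounding unit: 1 for 0 decimals, 0.1 for 1 decimal, and so on.
    quantum = Decimal(1).scaleb(-decimals)
    # Go through str() so the Decimal holds the number as written,
    # and round halves up the way a reader would expect.
    rounded = Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP)
    return str(rounded)


if __name__ == "__main__":
    print(display_stat("wFB/C", 2.66))        # 2.7
    print(display_stat("BB%", 11.7) + "%")    # 12%
    print(display_stat("FB%", 56.7) + "%")    # 57%
```

The unrounded values stay wherever you stored them, so nothing is lost for research; only the version that lands in the paragraph gets shorter.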

So cut loose. Be wild. And be conscious of the audience you’re trying to reach.



Steve is the editor-in-chief of DRaysBay and the keeper of the FanGraphs Library. You can follow him on Twitter at @steveslow.


state school grad
Guest
5 years 2 months ago

FIRST!!!! yay go fangraphs! all about carpenter and ubaldo! j danks too

mkd
Guest
5 years 2 months ago

Can’t we all agree that this is an offensive comment that needs to be removed? This must violate some part of the commenting guidelines.

evan
Guest
5 years 2 months ago

And after considering significant digits, that “less precise” figure may actually be the more accurate.

An excellent contribution to the dialogue: more numbers do not equal a more meaningful discussion. Thank you for stressing the idea that the meaning depends on the reception.

Albert
Guest
5 years 2 months ago

Wrong. Not what “significant digits” means. SDs have to do with using gradated measuring devices, then performing calculations with those measured numbers. Baseball stats are calculated from quantized values, and may therefore be listed to an arbitrary standard of exactness.

Point of the article about readability is still true.

micah
Guest
5 years 2 months ago

Except when they’re not (e.g., pitch speeds).

Also, any regression-based stat like wOBA or FIP has some degree of uncertainty built into the definition (running the regression on a slightly different population would give slightly different coefficients, and the coefficients themselves were presumably rounded off at some point). So reporting too many digits for those is also potentially misleading.

Telo
Guest
5 years 2 months ago

Wow. An entire article about rounding. Awesome.

Yirmiyahu
Member
5 years 2 months ago

Tomorrow’s topic: leading zeros.
Monday’s topic: why fangraphs’ version of GB/FB allows you to divide by zero

spaldingballs
Member
5 years 2 months ago

Fangraphs doesn’t post on saturday, loser

state school grad
Guest
5 years 2 months ago

it should, most fangraphers have nothing to do on the weekends

Yirmiyahu
Member
5 years 2 months ago

I posted that comment on Thursday. Clearly. I am a time traveler.

fredsbank
Guest
5 years 2 months ago

we’re all time travelers, we’re just stuck in ‘drive’

Bill
Guest
5 years 2 months ago

Chuck Norris can divide by zero.

mcbrown
Member
5 years 2 months ago

I think there is never a target audience for which more than 2 decimals on a ratio statistic or 0 decimals on a percentage are useful or necessary. For example, the difference between 0.32 and 0.28 is meaningful when talking about batting average, both to statheads and a broad audience. Rounding to 0.3 would discard too much information. However, there is no meaningful difference between 0.323 and 0.320, either to statheads or a broad audience – no one will hear “he is hitting three-twenty” and think “well, do they really mean three twenty-three or an even three-twenty?” The third decimal is pointless, other than for historical convention or for the purposes of ranking players for the batting title.

Jason B
Guest
5 years 2 months ago

With BA I think three decimal places has become an ingrained convention so it doesn’t really “hurt the ears” to hear it, or the eyes to read it. (Same for SLG, OBP, OPS.) For the new-fangled metrics (newer-fangled?) I think the article is quite relevant and worth considering, to make the article/stat as accessible as possible.

I do some writing in my line of work and ditched a decimal place to promote readability when discussing bank credit quality indicators a few months back – does the board need to know that loans were up 6.83%, or would 6.8% do, or maybe just 7%?

Dwezilwoffa
Guest
5 years 2 months ago

I disagree for the most part on this subject. If your audience is so skittish or unprepared that they cannot take an additional decimal point of accuracy, then they cannot accept the stats in the first place. Baseball is supposed to be about the numbers, and honestly it’s the acronyms that are confusing. I have no idea how some stats are created, but I can recognize their usefulness if the concept is explained. DIPS, FIP, xFIP: I can’t describe the formulas behind these, but I understand the theory behind them.

state school grad
Guest
5 years 2 months ago

baseball is about the fans! and the fans want baseball!

gnomez
Guest
5 years 2 months ago

In almost every instance, that extra decimal point is not statistically significant.

JT Jordan
Guest
5 years 2 months ago

On that note, can FG please round UZR figures? Otherwise you’re implying a sense of certainty that really just isn’t there.

mickeyg13
Member
5 years 2 months ago

I actually hate that it is so accepted to rely upon the way in which a number is presented to determine the precision with which it was measured. To me, the notion of so-called “significant figures” is a lazy shortcut around a much broader issue. The UZR figures represent the best estimate for a number that carries with it a reasonable amount of uncertainty; that they show 3 digits should not imply that they are precise to within 0.1. The precision is an entirely separate discussion.

Yirmiyahu
Member
5 years 2 months ago

3.4543, +/- 15.

acerimusdux
Guest
5 years 2 months ago

UZR is already imprecise. Why make it even more imprecise by excessive rounding error?

You also have to consider that some of these numbers are used in calculations. UZR for example is used as a part of WAR. You never want to do much rounding until your last calculation is done.

Often the result should be rounded for presentation purposes. For example, in writing a blog post discussing a particular player, you only include the level of precision meaningful to that discussion. But I’d prefer that FG make numbers available to as much precision as anyone would ever want. We all know how to round. Let users decide when to do so.

Barkey Walker
Guest
5 years 2 months ago

Why not go two digits for wOBA and drop the factor? Then it is just the number of runs expected per at bat. That is a really easy number to grasp, and it definitely does not need three digits.

Also, why is FG so opposed to ever explaining the slash lines? I’ve never taken the 10 seconds to try to figure out what they are, figuring that if the authors don’t want to take the 10 seconds to try to help me, slash lines can’t be that important.

GiantHusker
Guest
5 years 2 months ago

Slash lines are the perfect example of something that takes way too many numbers (and slashes) to convey very little useful information. Why include BA at all, and why not just use wOBA except in the contexts where the difference between getting on base and power is relevant?

Llewdor
Member
5 years 2 months ago

I would argue that the slash line is valuable simply because fans are used to it. They can automatically work out the rough value of any given slash line in their heads.

williams .482
Member
5 years 2 months ago

It gives a nice mix, showing both roughly how good the hitter was and how they got to be that good.

The only thing that annoys me about slashes is when people put OBP/SLG/OPS instead of AVE/OBP/SLG.

And a 350/400/500 hitter is not the same as a 250/400/500 hitter. The first one is a slap hitter with average power and okay-to-poor plate discipline who looks like they were probably lucky on balls in play. The second one is a power guy who walks a lot. You can’t tell any of that from 400/500.

Yirmiyahu
Member
5 years 2 months ago

Williams, except you can’t actually tell that stuff from the triple slash. Maybe your slugger has an ungodly Mark Reynolds-esque K rate, while the slap hitter has Tony Gwynn-esque bat control. They very well could have the same BABIP (say, .350).

In which case, it’s more likely the slugger who is getting lucky, not the slap hitter (based on the fact that a speedy slap hitter tends to have a naturally higher BABIP than a slow-footed, all-or-nothing slugger).

williams .482
Member
5 years 2 months ago

This is true. On the other hand, you still see the differences in power and discipline, which I would say are significant.

RMR
Guest
5 years 2 months ago

If you’re talking about performance, the specificity is fine. But when you start using it to infer talent, you’re often on more solid ground, and less liable to breed a false sense of certainty, with fewer digits. Too often that extra decimal suggests a level of precision that the measure simply doesn’t have.

matt w
Guest
5 years 2 months ago

I don’t think the problem with “Evan Longoria has a 2.66 wFB/C” is the .66 — I read this site pretty regularly, and I’m going to have to look up wFB/C.

matt w
Guest
5 years 2 months ago

Oh hey, you know what would really make that stat more friendly to non-SABRists? Putting its definition in the glossary on the page it appears on.

everdiso
Member
5 years 2 months ago

great article.

And whoever thought that K/9, BB/9, K/BB needed more than one decimal should be shot.

Rose Valorie
Guest
4 years 9 months ago

I enjoy your writing style truly enjoying this site. “Outings are so much more fun when we can savor them through the children’s eyes.” by Lawana Blackwell.
