Could Chris Davis Match Roger Maris?

Chris Davis, with 37 home runs so far this season, has been generating a lot of buzz lately — both on the field and more recently with some comments he made during the All-Star break. When he was asked about the all-time home run record, Davis said:

“In my opinion, 61 is the record, and I think most fans agree with me on that.”

I have no idea if most fans agree with him, but it probably shouldn’t be surprising that a guy within spitting distance of a 61-home-run season would view that as the mark to beat, rather than 73 home runs, which is essentially out of range. So, just for fun, let’s figure out what Davis’ chances are of matching Roger Maris.

At Tom Tango’s website, there was a discussion that tried to put a number on Davis’ chances of reaching that mark. Tango performed a “quick back-of-envelope calculation” to do so, but today, I’ll be providing you with an interactive tool that might make it easy for you to perform a more sophisticated calculation for situations like this (and many other types of situations).

Retracing my footsteps: Davis needs at least 24 more home runs to reach or surpass 61. That’s the first entry that needs to be made, in the “number of successes” box. The next step is to enter the number of trials, which in this case means plate appearances (PA). Next to that box, you’ll enter your best guess as to the probability of the player getting that number of PAs. You can enter as many as 10 possibilities, or as few as one; just make sure the probabilities you enter add up to 100%. Repeat the same procedure for the “true rate estimate,” which for this exercise is home runs per plate appearance.

For the assumptions I used, the calculator figures there’s only an 8.56% chance of Davis hitting at least 61 homers this season. But feel free to change the assumptions, either on this page or by downloading the spreadsheet.
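For readers who would rather see the arithmetic than the input boxes, here is a minimal Python sketch of that kind of calculation. It assumes the calculator weights a binomial tail probability by every pairing of a PA guess and a rate guess, and that the two guesses are independent of each other. The function names and the sample inputs are made up for illustration; they are not the assumptions behind the 8.56% figure.

```python
from itertools import product
from math import comb


def binom_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    if k > n:
        return 0.0
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(max(k, 0), n + 1))


def mixture_tail(k, pa_dist, rate_dist):
    """Chance of at least k successes when trials and rate are both uncertain.

    pa_dist:   list of (plate_appearances, probability) pairs
    rate_dist: list of (hr_per_pa, probability) pairs
    Assumption: the PA guess and the rate guess are independent.
    """
    for dist in (pa_dist, rate_dist):
        if abs(sum(w for _, w in dist) - 1.0) > 1e-9:
            raise ValueError("probabilities must add up to 100%")
    return sum(
        w_n * w_p * binom_tail(k, n, p)
        for (n, w_n), (p, w_p) in product(pa_dist, rate_dist)
    )


# Hypothetical inputs for illustration only, NOT the article's assumptions.
pa_dist = [(200, 0.25), (250, 0.50), (300, 0.25)]
rate_dist = [(0.055, 0.25), (0.065, 0.50), (0.075, 0.25)]

chance = mixture_tail(24, pa_dist, rate_dist)
print(f"Chance of at least 24 more HR: {chance:.2%}")
```

Swapping in your own lists of guesses is the code equivalent of filling in the calculator's rows, as long as each list of probabilities sums to 1.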

What is Davis’ True Home Run Rate?

First of all, what is a “true rate”? Well, for a fair flip of a fair coin, we intuitively know the true rate of heads is 50%. That doesn’t mean that if you flip a coin 100 times, it will come up heads 50 times; in fact, the binomial distribution that my calculator is based on says there’s only about an 8% chance of exactly 50 heads coming up over 100 flips. It also says there’s almost a 37% chance of the number of heads being at least five away from 50, after 100 flips.

You can try that yourself in the calculator by entering “100” as the number of trials, giving that a 100% probability, and setting “.5” as the true rate (also at 100% probability), with “55” as the number of successes (heads, in this instance). The answer box will say that there’s an 18.41% chance of getting at least 55 successes. Because a fair coin is just as likely to land 45 or fewer heads as 55 or more, you can double that number to get the chances that it will be at least five away from 50 in either direction.
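If you don’t have the calculator handy, the coin numbers can be checked with a few lines of Python. This is a plain binomial calculation using only the standard library; the helper names are mine, not anything from the spreadsheet.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X == k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))


exactly_50 = binom_pmf(50, 100, 0.5)    # about 8%
at_least_55 = binom_tail(55, 100, 0.5)  # about 18.41%
five_away = 2 * at_least_55             # doubling is valid only because p = 0.5 is symmetric

print(f"exactly 50 heads:       {exactly_50:.2%}")
print(f"at least 55 heads:      {at_least_55:.2%}")
print(f"at least 5 away from 50: {five_away:.2%}")
```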

Davis is not a coin, unfortunately, which makes it a lot harder to intuitively pinpoint his true home run rate. If he were a coin, his true rate would probably be very close to zero, since coins are very bad at hitting home runs. Anyway, here are some of his particulars to consider:

                HR/PA
2013 to date    .094
2012            .059
Steamer RoS     .060
ZiPS RoS        .062
Career          .056

“RoS,” by the way, is the updated projection for the rest of the season for each projection system. Clearly, ZiPS and Steamer aren’t buying that he can keep up this pace. For further context, the MLB leader in HR/PA last season was Josh Hamilton, at .068. Davis’ .059 was good enough for seventh in the majors. For my assumptions, I stuck pretty close to the ZiPS and Steamer RoS numbers, which I believe to be the best guesses of his true ability, though I tried to err slightly on the side of, “He may have legitimately made a big improvement,” so the probability-weighted average of my assumptions comes out to about .064.

One factor you have to worry about is that pitchers may start pitching around Davis or intentionally walking him more often as his reputation grows, which will of course hurt his HR/PA. I haven’t done the research to project how that might affect him, but yeah, it matters.

Projecting PAs

Chris Davis has been regularly hitting in the fifth spot in the Orioles’ lineup. If that continues, he’ll have fewer plate appearances to work with than if he hit higher in the lineup. All else equal, getting fewer PAs would certainly hurt his chances of reaching the 61-homer milestone. The Book says the five-spot in the American League gets 4.39 PA per game. That could use a bit of updating, though, since it was based on numbers from the 1999 to 2002 seasons (basically the most offense-heavy era in modern baseball). The 1999 to 2002 AL teams averaged a .340 OBP; the 2013 Orioles have a .316 OBP. Clearly, there will be fewer PAs to go around when players aren’t getting on base.

Some comparisons, which include the third lineup spot that Davis could hypothetically be moved up to:

                 OBP     PA per Game
                         3rd Spot    5th Spot
2013 Orioles     .316    4.39        4.16
2012 Orioles     .311    4.45        4.25
1999-2002 AL     .340    4.61        4.39

You may be wondering why the 2012 Orioles had more PAs despite a lower OBP than this year’s team. I think the most obvious explanation is that last year’s Orioles got into a ridiculous number of extra-inning games — 11.1% of their total games, compared to only 8.3% this year. Last year, the Orioles pitched 9.15 innings per game, compared to 8.91 this year. I looked at Baltimore’s double play and caught stealing numbers on offense to see if they also contributed, but it turns out they’ve actually improved substantially in the GIDP department this year: 1.58% of PAs this year vs. 2.47% last year.

Anyway, the weighted average of my assumptions comes out to 253 remaining PAs for Davis this season, which means I expect him to average around 3.83 PAs per game over the team’s remaining 66 games. That’s the result of me unscientifically factoring in the chance that he’ll get some days off or get injured. Steamer and ZiPS are more pessimistic, figuring him for 199 and 241 PAs, respectively.

Results

As you see in the web app above, my assumptions predict an 8.56% chance of Davis reaching or surpassing that 61-homer mark this season. Here’s a more complete list of some of the other HR levels it expects for him:

Davis’ Final 2013 HR Minimum    Estimated Probability
51                              67.5%
56                              30.2%
60                              11.4%
61                              8.6%
62                              6.4%
66                              1.7%
71                              0.2%

Thoughts

I hope you’ll be able to get some use out of the web app here, since I think it has a lot of uses in baseball stats (and other things). Maybe I didn’t look hard enough, but the ability to account for uncertainty in your estimates seems like something other online binomial calculators don’t have. If you’re wondering: yes, entering the whole distribution can make a big difference over simply using the weighted averages. Using only the weighted averages of my assumptions would produce just a 3.43% chance of Davis hitting at least 61.
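To see why the spread matters, here is a rough Python sketch comparing the two approaches. The point estimate uses 253 PAs and a .0637 rate (the more precise version of the “about .064” average, as quoted in the comments). The spread inputs are hypothetical, picked only so that they average out to the same 253 and .0637; they are not the assumptions used in the article.

```python
from math import comb


def binom_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Weighted averages only: one binomial with n = 253, p = .0637.
point = binom_tail(24, 253, 0.0637)

# Hypothetical spread with the same averages: each guess gets equal weight,
# and the PA and rate guesses are assumed to be independent.
pa_guesses = [203, 303]          # averages to 253
rate_guesses = [0.0537, 0.0737]  # averages to .0637
spread = sum(
    binom_tail(24, n, p) for n in pa_guesses for p in rate_guesses
) / (len(pa_guesses) * len(rate_guesses))

print(f"weighted averages only: {point:.2%}")
print(f"with the spread:        {spread:.2%}")
```

The spread version comes out higher because the optimistic scenario (lots of PAs and a high rate) adds far more to the chance of 24 homers than the pessimistic scenario takes away.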

Of course, there’s uncertainty within the uncertainty, especially with the assumptions I made in this exercise. These were just semi-educated guesses, without a ton of research put into them.

As always, thoughts are welcome. I’ll be hanging around the comments section where I’ll attempt to answer questions.

Steve is a robot created for the purpose of writing about baseball statistics. One day, he may become self-aware, and...attempt to make money or something?


Chcago Mark
Guest
3 years 5 days ago

Nice article Robot Steve. My head just exploded. Anyway, why don’t you give YOUR best wag as to how many Davis ends up with. And please, no reference to Steamer or ZiPS. We already have that. Even if it might be the most accurate in your opinion.

Baltar
Guest
3 years 5 days ago

He did. 8.56%. See above.

Hamba
Member
3 years 5 days ago

That wasn’t the question.

Kogoruhn
Member
3 years 5 days ago

It was answered in the article. He projected .064 HR/PA with 253 PA’s. That projects Davis to hit 16.192 HR’s the rest of the season.

Using this I would say Steve is projecting Davis to end up with 53 HR’s

Jason B
Guest
3 years 5 days ago

This is neat – thanks for sharing the Excel spreadsheet and giving us the opportunity to fiddle around with the parameters.

VeveJones007
Guest
3 years 5 days ago

Very cool stuff. I wanted to see how much Davis improved his odds with his four-straight games with HRs before the break, so I input 28 successes and 260 PAs. That put his odds at 2.4%.

That’s a pretty huge improvement over four days. If he has another binge or two like that the rest of the way, these odds could go up significantly.

filihok
Guest
3 years 5 days ago

Those 4 home runs are almost 7% of 61 home runs. That it increased his probability of getting to 61 by 6% isn’t surprising.

Hamba
Member
3 years 5 days ago

Wait, I’m going to have to disagree on the grounds of logic. The All-Star break has always been the ceremonious “halfway point” of the season. Therefore, by simply taking 37 (current HR) x 2 (halfway point of season), we can all safely say there’s a good chance he will hit at least 74 home runs (37×2). This is all assuming he stays healthy and maintains his will to win.

Cuck city
Guest
3 years 5 days ago

yeah half of 162 is 90 something

good one math major

TKDC
Guest
3 years 5 days ago

Very nice subtle trolling there champ.

Jeremy T
Guest
3 years 5 days ago

Including that “will to win” bit at the end there makes me think that this comment was posted with tongue planted firmly in cheek. Or at least, I certainly hope so.

Hamba
Member
3 years 5 days ago

Thank you, I was hoping it was pretty obvious.

Felix Hernandez
Guest
3 years 5 days ago

Your implicit assumption that all FanGraphs’ readers have a sense of humor is not supported by objective analysis. Please factor that in to your calculations next time.

Pedantic Grammar Douche
Guest
3 years 5 days ago

You mean “Fangraphs’ readers have senses of humor.”

*Haughtily pushes taped-up glasses up the bridge of the nose*

bobabaloo
Guest
3 years 5 days ago

just curious, but considering his little hr streak before the break artificially inflated his hr/pa numbers, what sort of projection would he have if we were to lay his hr and pa on a graph and get a best fit line to better estimate his hr/pa? its fun to say, look davis is on pace for 62 hrs, but really, he isnt

Anon
Guest
3 years 5 days ago

artificially inflated

How is hitting actual HRs considered artificial?

Jason B
Guest
3 years 5 days ago

^^this. Can’t take away the ones already banked.

Unless you also take away, say 20 or 25 AB HR-less strings, since they’re “artificially low”.

(But I wouldn’t recommend doing either.)

VeveJones007
Guest
3 years 5 days ago

I don’t follow your logic. His HR/PA is what it is. Are you just saying you expect that figure to regress? That’s accounted for in the article.

bobabaloo
Guest
3 years 5 days ago

ya i did a crappy job explaining what i meant…yes his total hr/pa is what it is. all im saying is that he is coming off a 4 game hot streak, so his hr/pa is at a high right now. maybe the question should be; how many hrs is he trending to?

VeveJones007
Guest
3 years 5 days ago

That’s perfectly fair and I feel like the article takes a good stab at estimating what his regression will be in HR/PA. The writer has it dropping more than 33%.

Bill
Guest
3 years 5 days ago

Wait what hot streak? He broke out of his season long slump the last three games before the All Star break. As long as he doesn’t fall into another slump, he’s going to hit 72 more giving him a total of 111, if my math is right.

Matt
Guest
3 years 5 days ago

As a statistician, I appreciate the level of analysis that went into this post. Good work.

TKDC
Guest
3 years 5 days ago

This is awesome, my favorite FG article in a long time.

hmk
Guest
3 years 5 days ago

you clearly haven’t spent enough time on this website.

TKDC
Guest
3 years 4 days ago

I read virtually everything that is posted on this website. What do you suggest, reading slower?

Mike Green
Guest
3 years 5 days ago

Hit Tracker provides some evidence respecting projected HR rates for the rest of the season. It shows 8 no-doubters and 9 just-enoughs for 2013. That, combined with Davis’ historical record, suggests some regression, as this article and ZIPS suggest.

I do think that the regression level suggested is awfully high, subjectively. Davis has a different approach at the plate in 2013, which manifests itself in several ways. His GB rate is down, his pop-up rate is down, his line drive rate is steady, his doubles rate is way up and his opposite-field home run rate is way up. Personally, I would use a figure something in the .07 to .075 range for his projected HR rate.

filihok
Guest
3 years 5 days ago

This tool needs more levels.

We need probabilities for PA’s, Fly ball%, and HR/FB %

Johnhavok
Guest
3 years 5 days ago

Davis’ opinion is pretty silly; MLB still recognizes 73 as the record, whether it’s got an asterisk or not.

Just because individuals have an opinion that 61 is still a true or untainted record, doesn’t mean it’s still a record. Until MLB takes the 73 out of the record books, 73 is the number to beat. 62 could still be celebrated as an amazing feat, and absolutely should be, but to celebrate it as a record would be false.

Jeremy T
Guest
3 years 5 days ago

Well, it would be a record. Just an AL record, not an MLB record.

Roger Maris' ghost
Guest
3 years 5 days ago

Actually, celebrating 73 homers as a record is false. Up here in Heaven we don’t count anything that Sosa, McGwire, Bonds, Lance Armstrong, or Marion Jones ever did — before or after they started drugcheating. That’s how Heaven rolls. My single-season record, Bi-yatches!

Tom B
Guest
3 years 5 days ago

Did your own asterisk not follow you to heaven Roger?

Roger Maris
Guest
3 years 5 days ago

I’m too drunk and on too many uppers to see something as small as an asterisk.

mickey mantle
Guest
3 years 5 days ago

stop hogging all the good stuff, roger.

Literalist
Guest
3 years 5 days ago

“If he were a coin, his true rate would probably be very close to zero, since coins are very bad at hitting home runs.”

What is the factual basis for this assertion? Do you have any data or first-hand observations of coins attempting to bat?

Jason B
Guest
3 years 5 days ago

Go here:

http://www.fangraphs.com/leaders.aspx?pos=all&stats=bat&lg=all&qual=y&type=8&season=2013&month=0&season1=1871&ind=0&team=0&rost=0&players=0&coin=y

Only seven coins show up in the list of those with over 50 career HR, with the all-time leader being a 1972 Kennedy half-dollar from the Denver mint, which collected 86 career HR from 1983-1992. Probably could have gone on to reach 100 but it was pulled out from behind a kid’s ear by his grandpa and has been in a piggy bank for the last 20+ years.

Sure, it doesn’t sound like that many, but when it started hitting homers the Kennedy half was only eleven. How many major league homers did YOU hit at eleven? Exactly.

CHRIS DAVIS UNBOUND
Guest
3 years 5 days ago

67 hrs. Bank it, Dano.

GottaIch116
Guest
3 years 5 days ago

Great Article Steve! Thanks!

Can someone explain to me why it is incorrect to just take the weighted averages of the # of trials and rate and just plug it into the Binomial Distribution? Isn’t this probability just as legit? I know the outputs are different, but why is one more correct than the other?

GottaIch116
Guest
3 years 5 days ago

“If you’re wondering — yes, entering the whole distribution can make a big difference over simply using the weighted averages — using only the weighted averages of my assumptions would produce just a 3.43% chance of Davis hitting at least 61.”

Just to clarify, why do we prefer entering several distributions with all the possible pairs of parameters rather than just entering the weighted averages of the parameters into one distribution?

dustygator
Guest
3 years 5 days ago

Because f(g(x)) =/= g(f(x)). Where f() = expected value function (average) and g() = multiple of HR rate * PA.

ex.
f = 2x
g = x^2

f(g(x))= 2x^2
g(f(x))= (2x)^2 = 4x^2

GottaIch116
Guest
3 years 5 days ago

Hi Dusty Gator, thanks for your response. My question was not clear enough.

I understand that those two are not equal. (As I stated above when I said that I know the outputs are different).

What I’m asking is this: Why not use the weighted average of PA’s and home run rate as the two parameters in your binomial distribution?

Above, Steve had the weighted averages as PA’s = n = 253 and rate = p = .0637.

Why not use f(x|n,p). Where f is the binomial dist. with parameters n, p, as a function of x successes (homeruns)?

Steve did this (I think):

w_11*f(x|n_1,p_1) + w_12*f(x|n_1,p_2) + etc…

Both clearly give different outputs. Why is the second method preferred to the first?

Ben
Guest
3 years 5 days ago

What happens to the probability if you just use league average HR/PA?

Jason B
Guest
3 years 5 days ago

Why on earth would you do that?

Brandon T
Guest
3 years 5 days ago

I disagree with one part of this analysis. Strikeout and walk rates are generally among the first values to stabilize for hitters, and Davis is making MUCH better contact this year than last year. It might be interesting to look at Davis’ HR/FB instead of his HR/PA, which includes his insanely high strikeout rates from earlier seasons.

paul
Guest
3 years 5 days ago

The most absurd part of this article was:

“Chris Davis has been regularly hitting in the fifth spot in the Orioles’ lineup. If that continues, he’ll have fewer plate appearances to work with than if he’d hit higher in the lineup.”

He may become the first player ever to hit over .300 and 50 HRs from the 5 hole.

Robbie G.
Guest
3 years 5 days ago

I’d love to see a complete list of players with a minimum of, say, 35 home runs at the All-Star Break from 1962 (i.e., the year after Roger Maris’ 61 home run season) to the present, along with what each player’s final tally wound up being. Also! I’d love to see each of these players’ career WAR. There seem to be a fair number of rather unlikely “Will he catch Roger Maris?” candidates over the years. I’d do it myself but I’m not sure how to access this information.

elgato7664
Guest
3 years 5 days ago

Barry Bonds / 2001 / 39 / 73 / 164 fWAR
Mark McGwire / 1998 / 37 / 70 / 66 fWAR
Chris Davis / 2013 / 37 / ?? / 6 fWAR
Reggie Jackson / 1969 / 37 / 47 / 73 fWAR
Ken Griffey Jr / 1998 / 35 / 56 / 77 fWAR
Luis Gonzalez / 2001 / 35 / 57 / 55 fWAR
Frank Howard / 1969 / 34 / 48 / 39 fWAR

Jason B
Guest
3 years 5 days ago

It may be more instructive to see how these players compared after, say, 90 games and then at year-end, since the All-Star break bounces around the calendar a bit (sometimes after 80 games, sometimes after 90, so not all pre-break totals are created equal).

Baseball Fanboy
Guest
3 years 5 days ago

If we’re saying that Bonds doesn’t count because of the roids, then Maris doesn’t count because of the 8 extra games. The real record is Babe with 63, since we want fair comparisons and whatnot.

Hamba
Member
3 years 5 days ago

Is that a typo or are you assuming the Babe would have hit 3 more home runs had he played 8 additional games? Either way, Ruth played prior to the breaking of the color barrier and thus his record can’t be taken seriously. The true home run champion lies somewhere between 1947 and 1960 (American League) and 1961 (National League). My god I just put in a lot of effort to make a terrible joke. As you were…

Baseball Fanboy
Guest
3 years 5 days ago

Those extra 3 HR are based on his HR/PA extrapolated to a 162-game season. If you are really going to bring era into the discussion, Bonds faced juicing pitchers and had far fewer at-bats with actual pitches to hit as pitchers pitched around him. He had a 26.1% walk rate versus Maris’ 13.5% walk rate, causing a disparity of 476 ABs to 590 ABs. There really isn’t a way to claim the record is 61; it’s either 63 or 73, which means that Davis is looking at somewhere between a 3% and 5% chance or less than a .1% chance.

Ruki Motomiya
Guest
3 years 5 days ago

Don’t worry, Ben Revere will shatter this HR record any day now.

Word
Guest
3 years 5 days ago

This is very nice work, Steve. If Davis stays on pace to challenge Maris, I’d love to see you revisit this around the 3/4 point of the season.

MLB Booster
Guest
3 years 5 days ago

I’m perfectly happy in the case of Chris Davis to suspend my disbelief and impatiently wait for the unlikely and improbable.

pft
Guest
3 years 5 days ago

The curse of the HR Derby ended any chance of that; he will not hit 50 HR this year, let alone more than 61.

chief00
Guest
3 years 5 days ago

Will he chase Maris or Reggie? Maris had The Mick driving him ’til he got hurt; Reggie hit 10 after the AS break. As a Jays’ fan, I say ‘a pox on you, Davis!’ As a baseball fan, I want him to hit 62, 67, or 74.

A Braves Fan
Guest
3 years 5 days ago

An extreme example to show why the second method might be a better choice than the method of using just the weighted averages as the parameters:

Let’s say the number of trials for flipping a coin is determined first by flipping a coin. You’re going to do 50 trials if you get a heads or 50 if you get a tails. Therefore, you have a 50% chance of it being 150 and a 50% chance of it being 50. The weighted average for the number of trials would then be 100 (.5*150 + .5*50).

If you wanted to know the probability of getting 110 heads, if you used the weighted average the probability would be zero. Prob of 110 heads in 100 flips = 0.

If you used the method that Steve used, it would be .5*Prob of 110 heads in 150 flips + .5*Prob of 110 heads in 50 flips = .5(non-zero value) + .5(0)

A Braves Fan
Guest
3 years 5 days ago

^^ meant to say you either do 150 trials or 50 trials, sorry for the confusion

GottaIch116
Guest
3 years 4 days ago

Thanks Braves Fan. That’s exactly what I was asking about.

astrostl
Member
3 years 4 days ago

” If he were a coin, his true rate would probably be very close to zero, since coins are very bad at hitting home runs.”

<3
