Could Chris Davis Match Roger Maris?

Chris Davis, with 37 home runs so far this season, has been generating a lot of buzz lately — both on the field and more recently with some comments he made during the All-Star break. When he was asked about the all-time home run record, Davis said:

“In my opinion, 61 is the record, and I think most fans agree with me on that.”

I have no idea if most fans agree with him, but it probably shouldn’t be surprising that a guy within spitting distance of a 61-home-run season would view that as the mark to beat, rather than 73 home runs, which is essentially out of range. So, just for fun, let’s figure out what Davis’ chances are of matching Roger Maris.

At Tom Tango’s website, there was a discussion that tried to put a number on Davis’ chances of reaching that mark. Tango performed a “quick back-of-envelope calculation” to do so, but today, I’ll be providing you with an interactive tool that might make it easy for you to perform a more sophisticated calculation for situations like this (and many other types of situations).

Retracing my footsteps: Davis needs at least 24 home runs to reach or surpass 61. That’s the first entry that needs to be made, in the “number of successes” box. The next step is to enter the number of trials — in this case, plate appearances (PA).  Next to that box, you’ll enter your best guess as to the probability of the player getting that number of PAs. You can enter as many as 10 possibilities, or as few as one; just make sure the probabilities you enter add up to 100%. Repeat the same procedure for the “true rate estimate,” which for this exercise is home runs per plate appearance.

For the assumptions I used, the calculator figures there’s only an 8.56% chance of Davis hitting at least 61 homers this season. But feel free to change the assumptions, either on this page or by downloading the spreadsheet.
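For readers who would rather script it than use the web app or spreadsheet, here is a minimal Python sketch of the kind of calculation described above. It assumes the PA guesses and the true-rate guesses are independent, so every (PA, rate) pair is weighted by the product of its two probabilities. The function names and the example inputs are hypothetical placeholders, not the assumptions behind the 8.56% figure.

```python
from math import comb


def binom_at_least(k, n, p):
    """Chance of k or more successes in n independent trials with success rate p."""
    if k <= 0:
        return 1.0
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def chance_of_reaching(successes_needed, pa_dist, rate_dist):
    """Weight the binomial tail over every (PA, rate) pair.

    pa_dist and rate_dist map each guess to its probability; each must sum to 1.
    Treating the two as independent is an assumption of this sketch.
    """
    for dist in (pa_dist, rate_dist):
        if abs(sum(dist.values()) - 1.0) > 1e-9:
            raise ValueError("probabilities must add up to 100%")
    return sum(
        w_n * w_p * binom_at_least(successes_needed, n, p)
        for n, w_n in pa_dist.items()
        for p, w_p in rate_dist.items()
    )


# Illustrative inputs only (NOT the article's actual assumptions).
pa_dist = {200: 0.25, 250: 0.50, 300: 0.25}
rate_dist = {0.055: 0.25, 0.065: 0.50, 0.075: 0.25}
result = chance_of_reaching(24, pa_dist, rate_dist)
print(f"Chance of at least 24 more homers: {result:.2%}")
```

Swapping in your own guesses for `pa_dist` and `rate_dist` mirrors filling in the two columns of boxes in the calculator.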

What is Davis’ True Home Run Rate?

First of all, what is a “true rate?”  Well, for a fair flip of a fair coin, we intuitively know the true rate of heads is 50%. That doesn’t mean that if you flip a coin 100 times, it will come up heads 50 times; in fact, the binomial distribution that my calculator is based on says there’s only about an 8% chance of exactly 50 heads coming up over 100 flips. It also says there’s almost a 37% chance of the number of heads being at least five away from 50, after 100 flips.

You can try that yourself in the calculator by entering “100” as the number of trials, giving that a 100% probability, and setting “.5” as the true rate (also at 100% probability), with “55” as the number of successes (heads, in this instance). The answer box will say that there’s an 18.41% chance of getting at least 55 successes. You can just double that number to get the chances that it will be at least five away from 50 in either direction.
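The coin-flip figures above can also be checked in a few lines of Python. This is a minimal sketch using only the standard library; the helper names are illustrative and are not taken from the calculator or the spreadsheet.

```python
from math import comb


def binom_pmf(k, n, p):
    """Chance of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_at_least(k, n, p):
    """Chance of k or more successes in n independent trials."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))


exactly_50 = binom_pmf(50, 100, 0.5)        # exactly 50 heads in 100 flips
at_least_55 = binom_at_least(55, 100, 0.5)  # 55 or more heads
five_or_more_away = 2 * at_least_55         # doubling works because p = 0.5 is symmetric

print(f"exactly 50 heads:        {exactly_50:.2%}")
print(f"at least 55 heads:       {at_least_55:.2%}")
print(f"five or more away of 50: {five_or_more_away:.2%}")
```

The doubling shortcut only holds for a fair coin; for a lopsided rate like a home run rate, the two tails would have to be added up separately.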

Davis is not a coin, unfortunately, which makes it a lot harder to intuitively pinpoint his true home run rate. If he were a coin, his true rate would probably be very close to zero, since coins are very bad at hitting home runs. Anyway, here are some of his particulars to consider:

                  HR/PA
2013 to date      .094
2012              .059
Steamer RoS       .060
ZiPS RoS          .062
Career            .056

“RoS,” by the way, is the updated projection for the rest of the season for each projection system. Clearly, ZiPS and Steamer aren’t buying that he can  keep up this pace. For further context, the MLB leader in HR/PA last season was Josh Hamilton, at 0.068. Davis’ 0.059 was good enough for seventh in the majors. For my assumptions, I stuck pretty close to the ZiPS and Steamer RoS numbers that I believe to be the best guesses of his true ability — though I tried to err slightly on the side of, “He may have legitimately made a big improvement,” so the probability weighted average of my assumptions comes out to about .064.

One factor you have to worry about is that pitchers may start pitching around Davis or intentionally walking him more often as his reputation grows, which will of course hurt his HR/PA. I haven’t done the research to project how that might affect him, but yeah, it matters.

Projecting PAs

Chris Davis has been regularly hitting in the fifth spot in the Orioles’ lineup. If that continues, he’ll have fewer plate appearances to work with than if he’d hit higher in the lineup. All else equal, getting fewer PAs would certainly hurt his chances at reaching the 61-homer milestone. The Book says the five-spot in the American League gets 4.39 PA per game. That could use a bit of updating, though, since it was based on numbers from the 1999 to 2002 seasons (basically the most offense-heavy era in modern baseball). The 1999 to 2002 AL teams averaged .340 OBP; the 2013 Orioles have a .316 OBP. Clearly, there will be fewer PAs to go around when players aren’t getting on base.

Some comparisons, which include the third lineup spot that Davis could hypothetically be moved up to:

                 OBP     PA per Game, 3rd Spot    PA per Game, 5th Spot
2013 Orioles     .316    4.39                     4.16
2012 Orioles     .311    4.45                     4.25
1999-2002 AL     .340    4.61                     4.39

You may be wondering why the 2012 Orioles had more PAs despite a lower OBP than this year’s team. I think the most obvious explanation is that last year’s Orioles got into a ridiculous number of extra-inning games — 11.1% of their total games, compared to only 8.3% this year. Last year, the Orioles pitched 9.15 innings per game, compared to 8.91 this year. I looked at Baltimore’s double play and caught stealing numbers on offense to see if they also contributed, but it turns out they’ve actually improved substantially in the GIDP department this year: 1.58% of PAs this year vs. 2.47% last year.

Anyway, the weighted average of my assumptions comes out to 253 remaining PAs for Davis this season, which means I expect him to average around 3.83 PAs per game over the team’s remaining 66 games. That’s the result of me unscientifically factoring in the chance that he’ll get some days off or get injured. Steamer and ZiPS are more pessimistic, figuring him for 199 and 241 PAs, respectively.

Results

As you see in the web app above, my assumptions predict an 8.56% chance of Davis reaching or surpassing that 61-homer mark this season. Here’s a more complete list of some of the other HR levels it expects for him:

Davis’ Final 2013 HR (Minimum)    Estimated Probability
51                                67.5%
56                                30.2%
60                                11.4%
61                                8.6%
62                                6.4%
66                                1.7%
71                                0.2%

Thoughts

I hope you’ll be able to get some use out of the web app here, since I think it has a lot of uses in baseball stats (and other things). Maybe I didn’t look hard enough, but the ability to account for uncertainty in your estimates seems like something other binomial calculators you might come across online don’t have. If you’re wondering: yes, entering the whole distribution can make a big difference over simply using the weighted averages. Using only the weighted averages of my assumptions would produce just a 3.43% chance of Davis hitting at least 61.
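To see why the whole distribution matters, here is a toy Python sketch (these are not Davis’ numbers). It holds the true rate fixed at 0.05 and asks for the chance of at least 20 successes, once with exactly 250 trials and once with the trials spread over 200, 250 and 300 at the same weighted average of 250. All names and inputs are illustrative assumptions.

```python
from math import comb


def binom_at_least(k, n, p):
    """Chance of k or more successes in n independent trials with success rate p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


RATE = 0.05   # toy true rate, held constant
TARGET = 20   # looking for 20 or more successes

# Method 1: plug the weighted-average number of trials straight in.
point = binom_at_least(TARGET, 250, RATE)

# Method 2: weight the answer for each possible number of trials.
trials_dist = {300: 0.25, 250: 0.50, 200: 0.25}  # weighted mean is still 250
mixed = sum(w * binom_at_least(TARGET, n, RATE) for n, w in trials_dist.items())

print(f"weighted-average inputs: {point:.2%}")
print(f"whole distribution:      {mixed:.2%}")
```

For a long shot, the extra chance gained in the 300-trial case is bigger than the chance lost in the 200-trial case, so spreading out the inputs raises the answer. For a target that is already likely, the same spread works against you.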

Of course, there’s uncertainty within the uncertainty, especially with the assumptions I made in this exercise. These were just semi-educated guesses, without a ton of research put into them.

As always, thoughts are welcome. I’ll be hanging around the comments section where I’ll attempt to answer questions.







Steve is a robot created for the purpose of writing about baseball statistics. One day, he may become self-aware, and...attempt to make money or something?


79 Responses to “Could Chris Davis Match Roger Maris?”

  1. Chcago Mark says:

    Nice article Robot Steve. My head just exploded. Anyway, why don’t you give YOUR best wag as to how many Davis ends up with. And please, no reference to Steamer or ZiPS. We already have that. Even if it might be the most accurate in your opinion.

    • Baltar says:

      He did. 8.56%. See above.

    • Kogoruhn says:

      It was answered in the article. He projected .064 HR/PA with 253 PA’s. That projects Davis to hit 16.192 HR’s the rest of the season.

      Using this I would say Steve is projecting Davis to end up with 53 HR’s

    • Thanks! So people actually do read those bio lines! Maybe I’ll have to think of a real one now…

      Yeah, like Kogoruhn says, a little over 53 is the average for my assumptions. Since I tend very slightly towards optimism, I’ll round up and say 54, though.

      • Commanding Ramorda says:

        Also, no major leaguer has ever hit exactly 53 home runs in a season, while there are 7 seasons with exactly 54.

      • Chicago Mark says:

        Can a head explode twice? I was going to ask why the totals of the probabilities added to >100%. But after reading this it seems pretty clear I wouldn’t understand anyway. Ha! However, isn’t 8.56% the probability he hits 24 hr’s? And now looking a little closer I see the 0.0637*253=16…
        Although I wasn’t 100% certain Steve would agree with his own math and might simply have his own opinion. That sounds a little insulting. It isn’t meant to be.
        Do we re-visit after the season to maybe explain any outlier that may have occurred?
        Good stuff and thanks

        • 8.56% is supposed to be the probability that he hits 24 or more homers the rest of the season, if that’s unclear. It adds up the probabilities that he hits exactly 24, or 25, or 26, etc. That’s called the cumulative binomial distribution.

          The main thing I did in that response to Ramorda, FYI, was to change the “TRUE”s in my BINOMDIST formulas to “FALSE”s (by find and replace) because “TRUE” makes it use the cumulative binomial distribution. I added 1 to the number of successes, then looked at cell S17 for the answer (long story). That method tells me there’s a 2.2% chance he hits exactly 24 the rest of the way.

          Which probabilities are you talking about that add to >100%?

        • Chicago Mark says:

          In the “results” sheet. But now I think those are the % he hits at least 51 or 56…So I’m catching on.
          Thanks again

  2. Jason B says:

    This is neat – thanks for sharing the Excel spreadsheet and giving us the opportunity to fiddle around with the parameters.

  3. VeveJones007 says:

    Very cool stuff. I wanted to see how much Davis improved his odds with his four-straight games with HRs before the break, so I input 28 successes and 260 PAs. That put his odds at 2.4%.

    That’s a pretty huge improvement over four days. If he has another binge or two like that the rest of the way, these odds could go up significantly.

  4. Hamba says:

    Wait, I’m going to have to disagree on the grounds of logic. The All-Star break has always been the ceremonious “halfway point” of the season. Therefore, by simply taking 37 (current HR) x 2 (halfway point of season), we can all safely say there’s a good chance he will hit at least 74 home runs (37×2). This is all assuming he stays healthy and maintains his will to win.

  5. bobabaloo says:

    just curious, but considering his little hr streak before the break artificially inflated his hr/pa numbers, what sort of projection would he have if we were to lay his hr and pa on a graph and get a best fit line to better estimate his hr/pa? its fun to say, look davis is on pace for 62 hrs, but really, he isnt

    • Anon says:

      artificially inflated

      How is hitting actual HRs considered artificial?

      • Jason B says:

        ^^this. Can’t take away the ones already banked.

        Unless you also take away, say 20 or 25 AB HR-less strings, since they’re “artificially low”.

        (But I wouldn’t recommend doing either.)

    • VeveJones007 says:

      I don’t follow your logic. His HR/PA is what it is. Are you just saying you expect that figure to regress? That’s accounted for in the article.

    • bobabaloo says:

      ya i did a crappy job explaining what i meant…yes his total hr/pa is what it is. all im saying is that he is coming off a 4 game hot streak, so his hr/pa is at a high right now. maybe the question should be; how many hrs is he trending to?

      • VeveJones007 says:

        That’s perfectly fair and I feel like the article takes a good stab at estimating what his regression will be in HR/PA. The writer has it dropping more than 33%.

    • Bill says:

      Wait what hot streak? He broke out of his season long slump the last three games before the All Star break. As long as he doesn’t fall into another slump, he’s going to hit 72 more giving him a total of 111, if my math is right.

  6. Matt says:

    As a statistician, I appreciate the level of analysis that went into this post. Good work.

  7. TKDC says:

    This is awesome, my favorite FG article in a long time.

  8. Mike Green says:

    Hit Tracker provides some evidence respecting projected HR rates for the rest of the season. It shows 8 no-doubters and 9 just-enoughs for 2013. That, combined with Davis’ historical record, suggests some regression, as this article and ZIPS suggest.

    I do think that the regression level suggested is awfully high, subjectively. Davis has a different approach at the plate in 2013, which manifests itself in several ways. His GB rate is down, his pop-up rate is down, his line drive rate is steady, his doubles rate is way up and his opposite-field home run rate is way up. Personally, I would use a figure something in the .07 to .075 range for his projected HR rate.

  9. Johnhavok says:

    Davis’ opinion is pretty silly; MLB still recognizes 73 as the record, whether it’s got an asterisk or not.

    Just because individuals have an opinion that 61 is still a true or untainted record, doesn’t mean it’s still a record. Until MLB takes the 73 out of the record books, 73 is the number to beat. 62 could still be celebrated as an amazing feat, and absolutely should be, but to celebrate it as a record would be false.

    • Jeremy T says:

      Well, it would be a record. Just an AL record, not an MLB record.

    • Roger Maris' ghost says:

      Actually, celebrating 73 homers as a record is false. Up here in Heaven we don’t count anything that Sosa, McGwire, Bonds, Lance Armstrong, or Marion Jones ever did — before or after they started drugcheating. That’s how Heaven rolls. My single-season record, Bi-yatches!

  10. Literalist says:

    “If he were a coin, his true rate would probably be very close to zero, since coins are very bad at hitting home runs.”

    What is the factual basis for this assertion? Do you have any data or first-hand observations of coins attempting to bat?

  11. CHRIS DAVIS UNBOUND says:

    67 hrs. Bank it, Dano.

  12. GottaIch116 says:

    Great Article Steve! Thanks!

    Can someone explain to me why it is incorrect to just take the weighted averages of the # of trials and rate and just plug it into the Binomial Distribution? Isn’t this probability just as legit? I know the outputs are different, but why is one more correct than the other?

    • GottaIch116 says:

      “If you’re wondering — yes, entering the whole distribution can make a big difference over simply using the weighted averages — using only the weighted averages of my assumptions would produce just a 3.43% chance of Davis hitting at least 61.”

      Just to clarify, why do we prefer entering several distributions with all the possible pairs of parameters rather than just entering the weighted averages of the parameters into one distribution?

      • dustygator says:

        Because f(g(x)) =/= g(f(x)). Where f() = expected value function (average) and g() = multiple of HR rate * PA.

        ex.
        f = 2x
        g = x^2

        f(g(x)) = 2(x^2) = 2x^2
        g(f(x)) = (2x)^2 = 4x^2

        • GottaIch116 says:

          Hi Dusty Gator, thanks for your response. My question was not clear enough.

          I understand that those two are not equal. (As I stated above when I said that I know the outputs are different).

          What I’m asking is this: Why not use the weighted average of PA’s and home run rate as the two parameters in your binomial distribution?

          Above, Steve had the weighted averages as PA’s = n = 253 and rate = p = .0637.

          Why not use f(x|n,p). Where f is the binomial dist. with parameters n, p, as a function of x successes (homeruns)?

          Steve did this (I think):

          w_11*f(x|n_1,p_1) + w_12*f(x|n_1,p_2) + etc…

          Both clearly give different outputs. Why is the second method preferred to the first?

    • Yeah, that’s an interesting one. Well, the only way I know how to explain that is by showing an example.

      I’ll hold the true rate constant at 0.05, and will look for the chance of 20 successes over a weighted mean of 250 trials:

      Trial 1: 100% probability of 250 trials –> 2.71% chance of 20+ successes

      Trial 2: 25% chance of 300 trials, 50% chance of 250 trials, 25% chance of 200 trials –> 4.40% chance of 20+ successes.

      What’s happening in Trial 2 is this:
      300 trials = 11.90% chance of 20+ successes
      250 trials = 2.71% chance of 20+ successes
      200 trials = 0.27% chance of 20+ successes

      Now, weight those by the chances of getting that many trials: 0.25*11.9% + 0.5*2.71% + 0.25*0.27% = 4.40%

      So, it’s the asymmetry of 11.9 vs. 2.71 vs. 0.27 that’s doing that.

      The reverse is true if you’re talking about a high-percentage thing — e.g. same circumstance, but you’re looking for the chance of 8+ successes: Trial 1 would give 93.5%, but Trial 2 gives 91.02%. Now the uncertainty works against you, as the downside of getting only 200 trials is considerably worse than the upside of getting 300 is helpful.

      • GottaIch116 says:

        Hi Steve, thanks for the example.

        So are you weighting the different binomial distributions rather than using the average of the parameters so that the extreme parameters will have more say in the final probability?

        To me, the 4.40% chance in trial 2 is no more correct than the 2.71% chance in trial 1.

      • GottaIch116 says:

        Hi Steve,

        Sheesh, I think I finally figured out why you picked method 2. Thanks for your response (and yours too gator). Again great article. Sorry that I kept asking about this.

  13. Ben says:

    What happens to the probability if you just use league average HR/PA?

  14. Brandon T says:

    I disagree with one part of this analysis. Strikeout and walk rates are generally among the first values to stabilize for hitters, and Davis is making MUCH better contact this year than last year. It might be interesting to look at Davis’ HR/FB instead of his HR/PA, which includes his insanely high strikeout rates from earlier seasons.

    • Well, he’s batting the ball in 60.3% of his PAs this year, compared to 62.1% last year. He is, however, hitting fly balls a lot more often — 26.5% of his PAs, vs. 23.3% last year, so good catch there. His HR/FB is way up as well. Will he keep that up, though? Who knows.

  15. paul says:

    The most absurd part of this article was:

    “Chris Davis has been regularly hitting in the fifth spot in the Orioles’ lineup. If that continues, he’ll have fewer plate appearance to work with than if he’d hit higher in the lineup.”

    He may become the first player ever to hit over .300 and 50 HRs from the 5 hole.

  16. Robbie G. says:

    I’d love to see a complete list of players with a minimum of, say, 35 home runs at the All-Star Break from 1962 (i.e., the year after Roger Maris’ 61 home run season) to the present, along with what each player’s final tally wound up being. Also! I’d love to see each of these players’ career WAR. There seem to be a fair number of rather unlikely “Will he catch Roger Maris?” candidates over the years. I’d do it myself but I’m not sure how to access this information.

    • elgato7664 says:

      Barry Bonds / 2001 / 39 / 73 / 164 fWAR
      Mark McGwire / 1998 / 37 / 70 / 66 fWAR
      Chris Davis / 2013 / 37 / ?? / 6 fWAR
      Reggie Jackson / 1969 / 37 / 47 / 73 fWAR
      Ken Griffey Jr / 1998 / 35 / 56 / 77 fWAR
      Luis Gonzalez / 2001 / 35 / 57 / 55 fWAR
      Frank Howard / 1969 / 34 / 48 / 39 fWAR

    • Jason B says:

      It may be more instructive to see how these players compared after, say, 90 games and then at year-end, since the All-Star break bounces around the calendar a bit (sometimes after 80 games, sometimes after 90, so not all pre-break totals are created equal).

  17. Baseball Fanboy says:

    If we’re saying that Bonds doesn’t count because of the roids, then Maris doesn’t count because of the 8 extra games. The real record is Babe with 63, since we want fair comparisons and whatnot.

    • Hamba says:

      Is that a typo or are you assuming the Babe would have hit 3 more home runs had he played 8 additional games? Either way, Ruth played prior to the breaking of the color barrier and thus his record can’t be taken seriously. The true home run champion lies somewhere between 1947 and 1960 (American League) and 1961 (National League). My god I just put in a lot of effort to make a terrible joke. As you were…

      • Baseball Fanboy says:

        Those extra 3 HR are based on his HR/PA extrapolated to a 162 game season. If you are really going to bring era into the discussion, Bonds faced juicing pitchers and had far less at bats with actual pitches to hit as pitchers pitched around him. He had a 26.1 % walk rate versus Maris’ 13.5 % walk rate causing a disparity of 476 ABs to 590 ABs. There really isn’t a way to claim the record is 61, it’s either 63 or 73, which means that Davis is looking at somewhere between a 3 and 5 % chance or less than a .1 % chance.

    • Ruki Motomiya says:

      Don’t worry, Ben Revere will shatter this HR record any day now.

  18. Word says:

    This is very nice work, Steve. If Davis stays on pace to challenge Maris, I’d love to see you revisit this around the 3/4 point of the season.

  19. MLB Booster says:

    I’m perfectly happy in the case of Chris Davis to suspend my disbelief and impatiently wait for the unlikely and improbable.

  20. pft says:

    The curse of the HR Derby ended any chance of that. He will not hit 50 HR this year, let alone more than 61.

  21. chief00 says:

    Will he chase Maris or Reggie? Maris had The Mick driving him ’til he got hurt; Reggie hit 10 after the AS break. As a Jays’ fan, I say ‘a pox on you, Davis!’ As a baseball fan, I want him to hit 62, 67, or 74.

  22. A Braves Fan says:

    An extreme example to show why the second method might be a better choice than the method of using just the weighted averages as the parameters:

    Let’s say the number of trials for flipping a coin is determined first by flipping a coin. You’re going to do 150 trials if you get a heads or 50 if you get a tails. Therefore, you have a 50% chance of it being 150 and a 50% chance of it being 50. The weighted average for the number of trials would then be 100 (.5*150 + .5*50).

    If you wanted to know the probability of getting 110 heads, if you used the weighted average the probability would be zero. Prob of 110 heads in 100 flips = 0.

    If you used the method that Steve used, it would be .5*Prob of 110 heads in 150 flips + .5*Prob of 110 heads in 50 flips = .5(non-zero value) + .5(0)

  23. astrostl says:

    “If he were a coin, his true rate would probably be very close to zero, since coins are very bad at hitting home runs.”

    <3
