Nate Silver and Imperfect Modeling

If you're reading FanGraphs, you're probably familiar with Nate Silver. He's known nationally now for his political projections at FiveThirtyEight, but of course he made his name on the internet writing about baseball, creating the PECOTA projections, and penning some of the best articles about the economics of baseball written over the last decade.

Even if you're not a political junkie, it was hard to get away from discussions about Nate Silver over the last few weeks. The home stretch of the election saw a Nate vs. Pundits fight that looked like something straight out of Moneyball. Last night, my Twitter feed probably had more references to Nate Silver than to either Barack Obama or Mitt Romney. Needless to say, the performance of his model was a major storyline during last night's election, especially if you were following it through the eyes of people who write about baseball for a living.

If you haven't already heard, Nate's model did pretty darn well. As in, he got every projection right, going 49 for 49 in the states that had projected winners and nailing the fact that Florida was basically a coin flip. But I'm not writing this post to talk about how Nate Silver is a witch, or to bleed political discussion over into yet another area of your life. Instead, I think there's an important takeaway here that applies to baseball and what we do at FanGraphs: imperfect models with questionable inputs can still be quite useful.

Nate’s model was similar in structure to many other polling aggregators, including one from Princeton that was even more aggressive with its conclusions. In general, the argument against these models is that the inputs they were using — the polls themselves — were of questionable value and that they were essentially guessing at things like voter turnout based on assumptions that might not hold true anymore.
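
To make the idea of a polling aggregator a bit more concrete, here is a minimal sketch of the core mechanic: weight each poll and average the weighted margins. The polls, sample sizes, and weights below are invented for illustration; Nate's actual model layers many more adjustments (house effects, state fundamentals, simulations) on top of this, so treat it as a toy version of the concept, not his method.

```python
# Toy sketch of a polling aggregator: weight each poll, then average.
# All polls, sample sizes, and weights here are invented for illustration.

def aggregate_margin(polls):
    """Return a weighted average margin, weighting by sample size and recency."""
    total_weight = sum(p["sample"] * p["recency"] for p in polls)
    weighted_margin = sum(p["margin"] * p["sample"] * p["recency"] for p in polls)
    return weighted_margin / total_weight

# Hypothetical state polls: positive margin = Obama lead, negative = Romney lead.
hypothetical_polls = [
    {"margin": +2.0, "sample": 800,  "recency": 1.0},
    {"margin": -1.0, "sample": 500,  "recency": 0.6},
    {"margin": +3.0, "sample": 1200, "recency": 0.9},
]

print(f"Aggregated margin: {aggregate_margin(hypothetical_polls):+.1f} points")
```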

Even Nate acknowledged the truth in some of these criticisms: polling data can be problematic, and the people collecting it can have biases that skew the results one way or another. No one thinks polling data is perfect, nor should the results of last night's voting convince us that Nate's model perfectly corrects for those biases. The "it's a projection, not a prediction" line cuts both ways; we can't argue that the model could still have been right had the results been different while also believing that last night's results prove the model was clearly right to begin with. The critiques of the model that were true a few days ago are still true today, and a perfect result in one election does not prove that the model is without flaw.

But, hopefully, we can note that a model does not have to be perfect to be useful, and perhaps we can move away from the idea that imperfect — and even biased — data should be discarded until it can be perfected. In baseball, we deal with a lot of biased data and imperfect models. Colorado is a perfect example. The raw numbers from games a mile high can’t be taken at face value because of the atmosphere, and changes to the environment — such as the introduction of the humidor — make applying park factors to that data a bit of a guessing game. We’ve seen offensive levels in Denver shift back and forth over the years, and we certainly don’t have a perfect way of explaining or accounting for those shifts. If we were to project the 2013 run environment in Coors Field, we’d have to deal with a lot of moving parts, many of which require assumptions that we can’t test, and there’s a decent amount of uncertainty that would surround that projection. But that doesn’t mean we shouldn’t try.
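
As a back-of-the-envelope illustration of why this turns into a guessing game, here is a minimal sketch of applying a park factor to raw run totals. The run totals and factors are hypothetical, and real park factors are regressed over multiple seasons; deciding how much to trust a factor after an environment change like the humidor is exactly the judgment call described above.

```python
# Minimal sketch of a park adjustment. The raw runs and park factors below are
# hypothetical; real factors are built from multi-year, regressed samples.

def park_adjusted_runs(raw_runs, park_factor):
    """Translate raw run totals into a neutral-park environment."""
    return raw_runs / park_factor

raw_coors_runs = 850                 # hypothetical runs scored in Coors home games
park_factor_pre_humidor = 1.20       # hypothetical run inflation before the humidor
park_factor_post_humidor = 1.10      # hypothetical factor after the environment changed

print(round(park_adjusted_runs(raw_coors_runs, park_factor_pre_humidor)))   # ~708
print(round(park_adjusted_runs(raw_coors_runs, park_factor_post_humidor)))  # ~773
```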

Whether it's FIP, UZR, ZiPS, the Fans Scouting Report, or especially WAR, pretty much every statistical model that we host here on FanGraphs contains some inputs that can be legitimately questioned and requires some assumptions that don't always hold. These models are imperfect, and the data that goes into them can be biased. But that doesn't mean that the alternative of discarding them and just accepting any conclusion as equally valid is an improvement.

That’s essentially where the pundits went wrong with Nate’s model. They didn’t like the conclusions, and some of them raised valid concerns about polling data and whether Nate’s adjustments added or subtracted from simpler, more transparent techniques. But to discard the model entirely was silly, and to pretend like the race was a toss-up was simply wrong. Throwing out the imperfect model with biased data was worse than taking it at face value.

In reality, we shouldn’t do either. The models showed their usefulness last night, but they’re still not perfect, and we shouldn’t just blindly accept every conclusion they spit out in the future. But, we don’t need to discard these models simply because we’ve figured out where their weak points are either. It’s not an either/or situation. We can be informed by imperfect models without being slaves to them.

WAR can inform our opinion of Mike Trout's value relative to Miguel Cabrera without us turning the MVP Award into the Whoever Has The Highest WAR Award. We can acknowledge the shortcomings of defensive metrics and park factors while also applying the lessons they can teach us in an intelligent way. We can note that FIP doesn't work very well for Jim Palmer without using that as a reason to keep evaluating pitchers by ERA instead.

Last night was undoubtedly a win for data-based analysis, but let’s be honest, the results don’t always turn out that well. Just as we shouldn’t have discarded Nate’s model had the results been different, we shouldn’t believe his model is perfect because the results did line up with what he projected. His model is still imperfect, but it’s also still useful.

Let’s not let the perfect be the enemy of the good. If we want a takeaway from the Nate Silver vs Pundits argument, let’s note that the pundits went wrong when they discarded his insights because they didn’t like the results and because they assumed the data was too biased to be useful. If a model doesn’t occasionally challenge our preconceived notions of what’s true, it’s not helpful to begin with, and even a model with problematic datasets can still provide useful information that can help inform our decisions.

The takeaway from last night shouldn’t be “always trust Nate Silver” or “always trust the data”. The takeaway should be that even mediocre data is often better than no data, and when you put mediocre data in the hands of smart people who understand its limitations and adjust accordingly, it can become quite useful indeed.






Dave is the Managing Editor of FanGraphs.


Steve 1

Too bad the mainstream ‘baseball pundits’ won’t read a word of this.

Jack Weiland

The correct term is “lamestream.”

Gary York

Also it’s too bad that the mainstream “political pundits” won’t read a word of this.

jason B

I think the mainstream punditry wasn't questioning the model or the results – they were pretty firmly entrenched in the Obama camp (this isn't meant to be controversial or inflammatory; most members of the media looked favorably on Obama). But you're correct in that the right-leaning punditry who should read this and glean some lessons from this whole episode likely won't (read, or learn).

David

Of ALL the asinine lazy language that’s made it into political discourse over the last decade (and that’s obviously a tremendously large pool in which to fish), “mainstream media” has got to be the stupidest.
In the last week alone, I have read columns by people who are paid to write about politics by the five largest-circulation newspapers in the country, as well as by columnists/opinion writers/foisters of drivel from papers in three other of the 10 largest cities in the nation, asserting in various pieces that the very topic they are addressing will not be addressed in the "mainstream media."
Newsflash… the fact that your sentence was published in a major newspaper of record by definition means you are wrong. And whiny.
[/offtopicrant]

channelclemente

The Least Coast media's motto: beware of geeks bearing gifts.

the flu

Given that this is FanGraphs, shouldn't that be "bearing GIFs"?

Average_Casey

I think it’s funny that Geoff Baker on twitter last night was touting Nate Silver despite the fact that he has shown routinely that he doesn’t value sabermetrics.

joser

Geoff Baker doesn’t value intellectual consistency either, and that is neither funny nor new.

Seattle Times Reader

Fuck Geoff Baker.

nllspc

"…a perfect result in one election does not prove that the model is perfect." This is misleading, since his model has been highly accurate not just in this election but also in 2008 (Dem win) and 2010 (GOP win).

Not sure how many people are claiming his model is “perfect” and that his or any model should be completely “trusted.” I think he deserves a little more credit than this article gives him but I agree god-like status isn’t warranted.

The Real Neal

He predicted the Democrats would win in 2008… you're right, he's a genius.

All the models depended on voter turnout. He did a pretty good job of accurately nailing the voter turnout (for whatever reason), but that doesn't mean that the numbers will stay the same, in particular when you move away from the Senate, and if the conservatives actually start targeting non-whites.

John Thacker

From what I understand, he didn’t predict the voter turnout, the state polls did and he decided how trustworthy each one was. Though his model is opaque so it’s hard to know.

One example where his model did poorly was the recent UK general election, where he had the right vote totals but was way off on the Lib Dems (and other forecasters got it much better.) But, hey, he doesn’t know the UK.

Bob

As the saying goes, "All models are wrong, but some are useful." And I'd say Nate's models have certainly been useful, both in their predictive value and their ability to point out where the flaws in the data and data collection may show up. I actually wish people would pay a little more attention to the latter point; the places where the model fails can often be even more informative than the correct predictions it gives.

Klatz

The biggest takeaway for me is that you need to avoid getting attached to a specific outcome. Those who disagreed with Nate Silver could not accept an Obama victory, and therefore any data that reached that conclusion must be flawed.

The best parts of Nate's analysis are the transparency and the foundation in solid statistical methods. In reality his conclusions weren't the most accurate, but he's got the best track record so far.

salo

Yep, when discussing the AL MVP with friends, they dismissed the WAR argument not because they disagree with the model; they dismissed it because the outcome does not favor Cabrera. Sad

brian

Confirmation bias is the biggest flaw of the human condition.

Viliphied

I dunno, gambler’s fallacy and other statistical ignorance is up there.

Paul

War, murder, rape up there maybe?

Justin Bailey

Let’s not forget the ability to deliberately deceive. That’s a pretty big problem for us.

Erm

Paul, I thought we liked WAR on this site…

B N

I’m putting my money on myopia, but confirmation bias is a close second.

John Thacker

No real problem with Nate’s model’s result, but I wish he’d be more open and less of a black box, just like I feel with PECOTA and with Jeff Sagarin’s models. I don’t really see a lot of value-add for his model over far more open polling aggregations like Prof. Wang’s model at Princeton, and it’s significantly less open.

zach

More open? The guy went out of his way over and over on his blog to explain his methodology and address criticisms.

John Thacker

He explained his methodology, yes, but he didn’t explain exactly how he arrived at some of his weights and numbers, such as how certain polls got various confidence rankings. I didn’t say he was totally opaque, just that there are more open competitors with the same results.

Gomez

I think Thacker’s arguing that he was not transparent with the raw data and methodologies used to determine his projections.

ns978

News pundits want us to believe it's a tight race so we tune in to the results. Also, they're selling a product. That product is the confirmation of one's predispositions, so if Silver's model went against those predispositions it's their job to say the model is wrong.

MikeS

Silver actually said something about this a few days ago. Something along the lines of “if you ignore the data then you must question whether your mission is to inform or entertain.”

Brian

Unlikely that many adamant politico-types get much exposure to advanced baseball stats, but the inverse is probably true in the case of the election – just about everyone involved in baseball has been exposed to "political advanced statistics", if you will, after Silver. And maybe the traditional baseball community will find some perspective and begin / continue to make the connections that seem so obvious if you're on this side of the debate – "if traditional politicos appear pretty out-of-date / foolish in comparison with this data-driven analysis, maybe my baseball thoughts are a little out-of-date as well?" Dave's cautions are well-taken (modeling isn't perfect, find your middle ground) – but is a close parallel between baseball's SABR/traditional debate and something as widespread as the US presidential election what it takes to ring a bell in some baseball minds?

joser

I doubt it. Baseball (and sports in general) is where the "math is hard" types go to hide from stuff like this, so they're just going to stick their fingers in their ears and make "I can't hear you" noises even louder to avoid having their assumptions questioned and their musty little refuge aired out. Fortunately, they're getting older and less relevant by the day, so increasingly we can't hear them either.

Joe

I knew we'd get a piece on Nate. Thanks, Dave.

Danny

Excellent piece: you touched on the issue, which is that stats are valuable even if imperfect.

oolander

That's not the point at all. The issue is how to maximize the utility of limited, imperfect data, and in what frame the resulting stats can best be seen. The conclusion is not that all imperfect stats have value…

Anon

I’m interested in (and closely follow) politics, but this is a baseball site.

Please don’t bring politics here.

TKDC

If you are not interested, you are free to not click on the article, which from the title was obviously going to involve politics. This is very much related to baseball statistics, and is worth discussing.

Anon

This is related to statistics. It has no relation to baseball.

Nick44

And we all know that statistics have no place in baseball discussions – especially here at Fangraphs.

Steve

This article wasn’t about politics. It was about statistical modeling.

jim

reading: it’s hard

Scott Ferris

I only remember this bc I had to comment on it when I saw it.

http://www.fangraphs.com/blogs/index.php/dylan-bundy-versus-taijuan-walker-best-baseball-pitching-prospects/

Anon – “Is he from a rural area? Did he grow up on a farm? Being from Oklahoma isn’t specific enough to classify someone as ‘country strong’. That would be like saying someone from San Francisco is ‘homosexually fashionable’.”

And later in response to Newman using a decent political analogy

Anon – “No, that is obvious from the tone of the political comments in the article.”

I promise they both sound far more politically charged in context but regardless, it seems like you get off on dealing with non-baseball issues. I’m not sure if you’re just winning hard at trolling and I’m adding fuel to the fire or if you’re genuinely just a downer of a person but either way, you’ve managed to rack up down votes at breakneck speeds so, ya know, at least you’ve got that.

bcp33bosox

Lol, noice, dude, noice…”downer of a person” …I am in stitches.

TKDC

I’ve always thought that advanced statistics made baseball more fun. It’s still a game and is played on the field, and all the stats do is help people make more sound decisions. Thinking about these decisions has really changed how I watch baseball, and has made it more interesting to me than it was before I started reading about SABR.

I’m not positive the same applies in politics. If this is taken to the extreme, it could mean political parties making poor decisions for the country overall in order to shift the model in their favor. It might be fun to look at, it might tell us a lot, but national policy is not a game.

zach

“political parties making poor decisions for the country overall in order to shift the model in their favor” Are you suggesting this doesn’t happen? I think it’s been happening for a while and to great extent.

TKDC

I’d say that the political parties in the past have mostly made decisions that appealed to their base and independent voters, without thinking specifically about white, middle-class, single mothers in Ohio. Of course, for many, those decisions are still viewed as bad for the country overall, but at least the pandering was to a substantially sized group.

Paul

Well, in general, members of the House “pander” to the constituents of their district while members of the Senate “pander” to the people of their state. If they don’t, they likely won’t get re-elected. Appropriations and “pork” are what we send our representatives to Congress for (ostensibly)- to pass laws that benefit our district/state, bringing money to us for infrastructure or businesses or schools. Of course, we want them to pass laws that benefit the country as a whole too, but the whole idea of voting for a representative is to have someone who represents YOUR and your area’s interests. So, using advanced statistics, especially census data, but also data mined from other sources, can perhaps help a savvy representative’s office in figuring out who s/he is representing and how best to curry their favor, bringing in projects and money that targets THEM. And even in the past, they’ve known what the makeup of their constituencies are, and have worked in a targeted manner to work for them. How else does a Congressman from MI get elected unless he targets the middle-class workers in the auto industry? How does a Congressman get elected from SoCal except by pandering to the Hollywood players? And, yes, how else does a Congressman get elected in the suburbs of Ohio except by pandering to the white, middle-class, single mothers in his/her district?

Justin

TKDC,

Silver has specifically addressed this in interviews and made the point that the internal models of the campaign are far more sophisticated than what he does. He mentioned that they actually have individual voter data they use to guide their campaigns.

TKDC

I know. This was in no way an indictment and especially not of Silver. And obviously this is something that cannot be put back in the bottle. If advanced statistics are used to win elections (especially elections with an electoral college), poor outcomes for the majority of the country could be the result of beneficial outcomes for those that “matter.”

And of course in different ways you could say this has happened for years, i.e. the super rich, if you think that.

But essentially what I’m saying is while I’d be happy if advanced statistics were used more in baseball, I’m not sold on their benefit to our country when they are used to win elections.

Jay Stevens

There are significant differences between baseball statistics and polling data, but I don’t think it’s because one’s a game and the other’s real life. I think the big difference is what is being measured, and what effect they have on the game.

In baseball, we use advanced stats primarily to discern luck from ability, IMHO. That is, we’re measuring value. How much a player is worth, baseball-wise.

Polls don't measure value. They measure how people are likely to vote at a given time; that is, they measure probable outcomes. It's more like measuring win probability than WAR. More accurately, it's as if the election were a game, but the players don't know what the score is until the game is over; polls are estimates of what the score is at any given time during the game. Polls are crucial to help plan strategy, but they aren't as directly related to strategy or game management as baseball statistics are.

Here's another thing. What Silver is doing could possibly affect the outcome of the election. There are studies that show voters both pay attention to the polls and let the polls influence their vote, and whether they vote. I guess you could argue baseball statistics could affect the games; managers, say, might base their lineups on what statistics say about their players' value. But baseball stats don't change player ability. Pointing out a batter has a low strikeout rate doesn't mean his plate discipline will improve or degrade because of the measurement….

/ramble

Viliphied

More accurately, it's as if the election were a game, but the players don't know what the score is until the game is over; polls are estimates of what the score is at any given time during the game.

I am suddenly overcome with a desire to design a game like this.

Jay29

“But baseball stats don’t change player ability. Pointing out a batter has a low strikeout rate doesn’t mean his plate discipline will improve or degrade because of the measurement….”

Usually, no, but some players do react to their statistics and try to tweak their game for the better (like Brandon McCarthy and his groundball rate).

chris_d

Great piece Dave.

Dan

My issue with Nate Silver's predictions is not whether they are accurate or who he predicts will win. I have an issue with any predictions/projections because of how they might affect voter turnout or the way people vote.

For example, say Nate Silver's predictions have proven to be extremely accurate and are predicting a win for candidate X, and I support candidate Y. Why should I bother spending time waiting in line to vote, when I could be spending time with my family or at work earning a living, when it's nearly certain my candidate will lose? Or say I'm an undecided voter and Nate's extremely accurate predictions are predicting a victory for candidate X; I like the idea of voting and would rather feel like a winner, so I will also vote for candidate X (essentially the bandwagon effect). Everyone wants to feel like their vote "counted."

Obviously, Nate has every right to make predictions and he is a really smart guy so they are going to be accurate. I’m just not a fan of them because I think they can potentially become somewhat self-fulfilling.

TKDC

I think it has been clear to all but the delusional for years that their specific one vote out of millions was not going to be a difference-maker.

As for the bandwagon effect, it can also go in reverse – as in, “that other guy isn’t so bad as this model says (or in the past, this poll), so I’ll vote for him.”

This has sometimes cropped up in voting for baseball awards and the Hall of Fame. For instance, many writers didn't vote for Robby Alomar his first year on the ballot but did vote for clearly inferior players, players they almost had to know were clearly inferior. If there were a predictive model that showed Alomar was going to get 73.7% of the vote (or whatever the exact number he got was), would that sway voters to bring him over the top? Would it sway voters to make sure he didn't get it?

rusty

I agree completely. Does projecting outcomes have an effect on turnout and/or voter preferences? I’m sure it does. Can the magnitude and direction of that effect be determined? No.

I don’t think the response to this potential introduction of “projection knowledge” bias should be to stop doing predictive modeling, though. When the media climate is so heavily tilted toward predictions based on gut feelings or opaque models (e.g. the use of exit polling, especially of early voters, in “calling” a state for a candidate), I don’t see anything wrong with fivethirtyeight adding some additional transparency and statistical assessment of certainty to the chatter.

Buck Turgidson

First, Nate is not making predictions; he is just using readily available data to produce rational analysis. OMG.

If you really want to improve turnout, look into the electoral college, which allows suburban Cincinnati to determine results before 2/3 of the country is done voting. And you could make Election Day a holiday, like in a normal democracy or Puerto Rico. And while you're at it, make registration compulsory, again like a real democracy.

What Nate started is the demystification of our system, one in which television personalities and billionaires can conduct a high-stakes horse race which only vaguely resembles democracy. The pundits don't appreciate that it undermines their psychodrama. A lot of other haters seem like they wish they'd thought of it first.

Paul

Well, a prediction saying “my candidate is going to lose” might discourage me from voting, or it might energize me into trying to be that “one extra vote” that pushes my candidate into the win column. There’s a whole bunch of political science studies about that kind of thing, and you see campaigns using that tactic to try to scare/encourage their party to vote. It’s all about perception and human game-theory on some level, and if you’re worried about predictions swaying the electorate, you’re probably even more worried about campaigning swaying the electorate.

George Resor

First I'd like to say that this is a great article and that Dave makes a great point. In response to Dan's comment, I doubt that Nate Silver's projections were much of a self-fulfilling prophecy this election cycle, because of the amount of Nate Silver bashing going on before the election, but it could definitely be a problem in the future. For this election they quarantined exit polling data until after the polls closed because exit polls in the past had begun to suppress turnout and become self-fulfilling, so there is already a precedent for well-accepted models having the effects that Dan is talking about.
And on a personal note, I like to feel that my vote counted, even if it would not decide an election, because by voting I have in some way impacted models of likely voters and will get pandered to slightly more by politicians in the future.

TKDC

At least as of now, Nate Silver is still an inside the bubble guy. I live in DC and most people I know that are not really into politics don’t know who he is. I’d guess that elections have never turned on the predictions of pundits, and that is not likely to change just because the predictions are now more sound. Commercials paid for secretly by large special interests have a much greater effect on elections. But that is a different story.

Alan Nathan

Excellent article, Dave. I agree with your key point:

Learning how to deal with imperfect data (i.e., extracting the signal from the noise, to quote the title of Nate's book) is not always easy and requires smart people to figure out how to do it. In particular, I have a lot of confidence in Nate's ability to do that. I would also emphasize that it is important to keep a steady hand when doing forecasting. Nate created his model earlier in the year and did not change it. Of course, the input data to the model was constantly updated, but the algorithms for weighting, adjusting, etc. were not.

As an aside, it was amusing last night to see Karl Rove openly challenge the professional forecasters at Fox News when Ohio was called for Obama. Rove demonstrated that he understands very little about forecasting and ended up looking quite foolish.

jcxy

This is actually an important point, given the theme of the article. If you were following Florida county results in realtime on Politico’s county map, you could immediately tell that Rove was not using correct or up-to-date data starting around 9pm. I’m not sure what data he was using (Yahoo and HuffPo’s reporting were both pretty far behind Politico) but his results for Hillsborough, Ocean, Orange, and I believe Polk counties were simply wrong. If that data were correct, his conclusion would have been completely defensible.

Rove’s bewilderment and, in turn, your conclusion about it are instructive illustrations of what can happen when data inputs are poor. Of course, yes, it did make for entertaining TV.

DMZ

“The takeaway should be that even mediocre data is often better than no data, and when you put mediocre data in the hands of smart people who understand its limitations and adjust accordingly, it can become quite useful indeed.”

Wouldn't it be neat to see this happen at FanGraphs? A fella can dream…

DMZ

In all seriousness, you are no Nate Silver.

jason B

I like that you thought your first criticism was too opaque to be distilled by the dullards around here, so you decided to come back and just stick out your tongue and call him stupid for good measure. Boy, you sure showed Dave!

BenH

This made me think of weather modeling and how people get upset when it rains when there was a 20% chance that day.

Nostradamus

In 2008, Silver correctly called 49 of 50 states in the Presidential election, and nailed every one of the 35 Senate races. Then, last night, either 49-for-50, or a perfect record, depending on how Florida goes.

I’m jealous.

Matt Hunter

He correctly called 50 out of 50 because he predicted that Florida would be almost literally a tie. Just because he had Florida as 50.3% Obama doesn’t mean he predicted Florida would go to Obama. It doesn’t matter what happens with Florida – Nate was dead on.

Cara

Article needs a .gif of a cat licking a lollipop.

Grant Brisbee

I was just thinking the exact same thing.

MikeS

People kept asking me who I thought would win the election and why. I kept telling them “Obama, because Nate Silver says he has an 80 or 90% chance and he understands the numbers better than all the pundits who wouldn’t know a standard deviation from a standard poodle.”

Otis

Let’s be honest about this. Most people thought Obama was the favorite. The betting markets had him around 80%. The people loudly criticizing Silver were Republican pundits whose JOB demands them to criticize the suggestion that Romney was a big underdog.

And touting 49 out of 50 is annoying, and it's similar to the people that tout Lunardi getting 63 out of 65. The overwhelming majority of these predictions are gimmes. You wanna tout that he nailed Ohio or the 7th best team in the Big Ten, that's fine, but don't brag about calling California for Obama.

Dave

But calling Virginia and Florida for Obama was pretty impressive, yeah?

Butters

You start with “Let’s be honest about this,” and then you immediately fib.

You claim that “the betting markets had Obama around 80%.” Nobody anywhere had Romney a 4-1 underdog. In Britain, he was less than 2-1 in late October.

Also, you gripe about Silver getting credit for 49-of-50. It “annoys” you, since it’s really not all that difficult, huh? And yet, somehow, you couldn’t come up with a *single* other name of anyone who’d managed the same achievement.

Or maybe you yourself got all 50 correct, and I’m being unfair to you. If so, my apologies.

Paul

Actually, the RCP averages on the day of the election for all the swing states were for Obama, so that simple poll-aggregate method got it 50/50 also: http://www.realclearpolitics.com/epolls/2012/president/2012_elections_electoral_college_map.html

And there was a prediction market, in Ireland I believe, that was up around 80% the day before the election. It was of course immediately dismissed. What do the Irish know about politics!?

The criticism of Silver that I found most compelling was just that his method, like PECOTA, is overly complicated for what it does. PECOTA is not any better than Marcel, and Silver's modeling did not predict any better than a simple aggregation of polls.

If people learn something about either politics or statistics from reading his blog, fine. But using Silver to peddle the notion that biased data is useful is not being intellectually honest. Aggregating many polls, each using different methodologies and weights, conducted at different times, and adding up to a very large sample, dramatically reduces the effects of any bias.

In other words, Silver’s “biased” data turning out to be highly predictive is not the same thing as accepting UZR or FIP as anything more than interesting trivia.

Otis

Pinnacle, the best sports betting website on the planet, was a tad over 80% the entire day. I was checking it often.

I didn’t get all 50. I went 0-0. I could have gone 38-38 with 2 mins of work if I wanted. That’s my point.

Michael Lewis

Nate,

I’m writing my next book about you. Call me.

xxoo,
MLew

GMH

Now if only baseball players were as predictable as voters. Nevertheless, to Rush Limbaugh, David Brooks, Dean Chambers, Dan Shaughnessy, Murray Chass, Josh Jordan, Jennifer Rubin, Joe Scarborough, and all the other knuckle-dragging, inbred, semi-literate halfwits who are paid ridiculous sums of money to be terminally wrong: FUCK YOU.

umm

why so serious?

Ben Hall

Really odd that you put David Brooks in with the rest of that group.

Jason H.

The problem isn’t data versus no data, or what kind of data, the problem is how you treat data.

In science, we decide how we think the world works (hypothesis) and then try our best to *disprove* it. That is good process.

In pseudo-science, you decide how you think the world works (ideology) and then try your best to *prove* it. The problem with Dick Morris, et al., was not that they didn't use data; it was that their process was backwards.

Unfortunately, lots of sports analyses suffer from this same problem. So the analogy between Nate Silver and the pundits should not be to SABR and the "old school"; it should be to good SABR and bad SABR, in my opinion.

Paul

I tried badly to say this above. Agree.

Tim_the_Beaver

An article I really enjoyed that gives credit where it’s due, but also tempers all the man-love:
http://www.economist.com/blogs/democracyinamerica/2012/11/politics-and-statistics
It's also a great shout-out to this community that we're a part of.

Math Nerd

Everyone’s ignoring the obvious. Conservatives don’t like Silver because he’s gay.

@steveh603

#6org

Sam

By far my favorite thing on fivethirtyeight this year was when Silver used the phrase “Replacement-level Republican Candidate” in the primaries
