This article originally appeared on FiveThirtyEight and is reprinted with permission.
How do you predict a general election with Donald Trump?
We can think of a few basic approaches. One of them is to assert that precedent doesn't apply to this election and that Trump's case is sui generis. It's not clear where that leads you, however.
If Trump is "unpredictable," a word we heard used to describe him so often during the primaries, does that mean his chances of defeating Hillary Clinton are 50/50? If that's what you think, you have the opportunity to make a highly profitable wager: Betting markets put Trump's chances at only 20 percent to 25 percent.
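To put a rough number on that wager, here is a back-of-the-envelope sketch in Python. It assumes an idealized market in which a contract paying $1 if Trump wins costs exactly the market-implied probability, with no fees or spread. The function name and the $1 stake are ours, for illustration only; real betting markets are messier.

```python
def expected_profit_per_dollar(your_probability, market_probability):
    """Expected profit per $1 staked on a contract that pays $1 if the event occurs.

    Assumption (ours): each contract costs `market_probability` dollars and
    there are no fees or bid-ask spread.
    """
    if not 0 < market_probability < 1:
        raise ValueError("market_probability must be strictly between 0 and 1")
    contracts_per_dollar = 1.0 / market_probability
    expected_payout = your_probability * contracts_per_dollar
    return expected_payout - 1.0


# Someone who truly believes the race is 50/50, facing the market range cited above.
for market in (0.20, 0.25):
    edge = expected_profit_per_dollar(0.50, market)
    print(f"Market at {market:.0%}: expected profit of ${edge:.2f} per $1 staked")
```

Under those assumptions, a bettor who sincerely believed in a coin flip would expect to at least double the stake, which is why a 50/50 claim is hard to square with the market price.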
In fact, despite (or perhaps because of) the unusual nature of his candidacy, the conventional wisdom holds that Trump is a fairly substantial underdog. In contrast to 2012, when there were frequent arguments over how solid President Obama's lead in the polls was, there hasn't been much of a conflict between "data journalists" and "traditional journalists" on this question of Trump's chances. Nor has there been one between professionals who cover the campaign and the public; most experts expect Trump to lose, and so do most voters.
But should this seeming consensus give us more confidence -- or make us nervous that we're underestimating Trump again?
Giving Trump a 20 percent or 25 percent chance of becoming president means that Clinton has a 75 percent to 80 percent chance. That might seem generous given that, under ordinary circumstances, the background conditions of this election (no incumbent running and a mediocre economy) would seem to suggest a tossup. Are Clinton's high odds justified on the basis of the polls? Or do they require making heroic assumptions about Trump, the same ones that got everyone, emphatically including yours truly, in trouble during the primaries?
The short answer is that 20 percent or 25 percent is a pretty reasonable estimate of Trump's chances based on the polls and other empirical evidence. In fact, that's quite close to where FiveThirtyEight's statistical models, which are launching today, have the race. Our polls-only model has Trump with a 19 percent chance of beating Clinton as of early Wednesday afternoon. (The forecasts will continually update as new polls are added.) Our polls-plus model, which considers economic conditions along with the polls, is more optimistic about Trump, giving him a 26 percent chance.
Still, Trump faces longer odds and a bigger polling deficit than John McCain and Mitt Romney did at the same point in their respective races. He needs to look back to 1988 for comfort, when George H.W. Bush overcame a similar deficit against Michael Dukakis to win. Our models are built from data since 1972, so the probabilities we list account for elections such as 1980, 1988 and 1992, when the polls swung fairly wildly, along with others, such as 2004 and 2012, where the polls were quite stable.
Two empirical approaches to forecasting Trump
For me, the lesson of the primaries is that one needs to be more rigorous, not less so, when forecasting elections. That means building a model instead of winging it. In contrast to our early, back-of-the-envelope skepticism about Trump, FiveThirtyEight's forecast models were largely accurate in the primaries, with our polls-only model calling 53 of 58 races correctly and our polls-plus model calling 52 of 58. The polls were a long way from being perfect, but they were wrong within normal parameters: Upsets happened about as often as they were supposed to happen, according to our models.
Nor did the polls underestimate Trump. National polls had him leading the field all along. State-by-state polls overestimated him slightly -- Trump lost states where the polls had him favored, such as Iowa and Oklahoma -- although not by enough to cost him the nomination.
So what's our approach this time around? Actually, we'll take two approaches: polls-plus and polls-only.
The model we call polls-plus abides by the principle of "stick with what works." It's pretty much the same model that we used to successfully forecast the 2012 election, blending polls with an index of economic performance. As the election approaches, the weight assigned to the economic index will fade to zero, meaning that polls-plus and polls-only will converge. (Although they won't be identical; there are some other, more subtle differences, as I describe in the guide to the forecast's methodology.) For the time being, however, polls-plus assumes the race will probably tighten somewhat. That's because it shows the economy as being almost exactly average and assumes there's neither an advantage nor a penalty to the incumbent party in a year like this one, when an incumbent president is retiring. In other words, it sees the fundamentals of the race as showing a tossup, and reduces Clinton's current lead of about 7 percentage points to a projected Election Day win of about 4 percentage points as a result.
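One way to picture that blend is as a weighted average of the poll lead and the lead the fundamentals would predict. The sketch below is only an illustration under that assumption: the linear form, the function names and the sample weights are ours, not the model's actual formula, which is described in the methodology guide.

```python
def blended_margin(poll_margin, fundamentals_margin, fundamentals_weight):
    """Weighted average of the current poll lead and a fundamentals-based lead.

    Assumption (ours): a simple linear blend, for illustration only.
    """
    if not 0.0 <= fundamentals_weight <= 1.0:
        raise ValueError("fundamentals_weight must be between 0 and 1")
    return ((1.0 - fundamentals_weight) * poll_margin
            + fundamentals_weight * fundamentals_margin)


def implied_fundamentals_weight(poll_margin, fundamentals_margin, projected_margin):
    """Weight on the fundamentals that a linear blend would need to hit a projection."""
    if poll_margin == fundamentals_margin:
        raise ValueError("weight is undefined when polls and fundamentals agree")
    return (poll_margin - projected_margin) / (poll_margin - fundamentals_margin)


# Figures from the text: a lead of about 7 points, fundamentals showing a
# tossup (0 points), and a projected Election Day win of about 4 points.
weight = implied_fundamentals_weight(7.0, 0.0, 4.0)
print(f"Implied weight on fundamentals: {weight:.2f}")

# As the weight fades toward zero, the blend converges on the polls alone.
# The intermediate weight of 0.2 is a made-up value to show the trend.
for w in (weight, 0.2, 0.0):
    print(f"weight={w:.2f} -> projected margin {blended_margin(7.0, 0.0, w):.1f}")
```

Read this way, moving from a 7-point lead to a 4-point projection corresponds to putting a bit less than half the weight on a tossup prior, and that weight shrinks as Election Day nears.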
Polls-only's maxim is "keep it simple, stupid." This is often a good strategy when faced with novel situations; instead of adding new assumptions, you should remove questionable assumptions. Historically in presidential elections, for instance, polls tend to converge toward the fundamentals down the stretch run. Usually that means they tend to tighten as Election Day approaches.
That tightening, however, may occur because the major-party presidential candidates are usually fairly equal in terms of factors such as fundraising, messaging and other basic aspects of running a campaign. That might not be true for Trump and Clinton this year. Trump is woefully behind Clinton in fundraising and campaign infrastructure. He's also a less experienced politician, and faces more intraparty opposition, than any Republican nominee in a long time.
The polls-only model doesn't make any assumptions about this. It doesn't use fundraising or political experience as a factor in making its forecasts; it just uses polls and demographic data. It doesn't treat Trump any differently than it would treat Marco Rubio or Mitt Romney. Instead, polls-only assumes the candidates' current standing in the polls -- meaning, Clinton ahead of Trump by 7.3 percentage points -- is the best estimate of the Nov. 8 result. It also accounts for more uncertainty than polls-plus. Even so, it has Trump as a heavier underdog than polls-plus does.
I know some readers will be frustrated by our having two versions of the model; it seems like we're trying to have it both ways. My view is this: The choice of assumptions can matter quite a bit when building models, especially in cases such as presidential elections where the historical sample sizes are small. We want to be more transparent about that. If polls-plus and polls-only show radically different results, that suggests the choice of assumptions matters a lot -- something you deserve to know. And if they show highly similar results, that's useful information also; it might give you more confidence in the robustness of our approach.
Which model is the official version of FiveThirtyEight's forecast? They both are, and keep in mind that this will become something of a moot point because the models will converge toward each other. However, when you click on our forecast home page, you'll see that polls-only is the default; we think it's a better starting point in an election such as this one.
In addition to the polls-only and polls-plus forecasts, we're also publishing something called a now-cast. The now-cast isn't a forecast; it's a hypothetical projection of what would happen if the election were held today. The now-cast is designed to be extremely aggressive about identifying trends in the polls, more aggressive than is optimal when forecasting an election several months out. (One of the big lessons of our model, in fact, is that you want to be fairly conservative about declaring turning points early in the race. Apparent shifts in the polls often reverse themselves.) As a result, the now-cast is subject to some fairly violent swings and will sometimes pick up more noise than signal. Still, it can provide an interesting diagnostic of which candidate has momentum, however fleeting.
An Electoral College challenge for ... Clinton?
If the middling economy is one silver lining for Trump, another is his swing state polls, which don't seem to be as bad for him as his national polls. They aren't good by any means, either, but whereas Trump trails Clinton by 6.7 percentage points in our average of national polls, according to our polls-only model, he's down 4.8 points in our adjusted polling average of Ohio, 5.7 points in Florida, 3.9 points in Iowa, and 2.0 points in Colorado, for instance.
Again, we don't mean to suggest that these are great numbers for Trump; the Florida result, for example, would represent the worst loss by a Republican there since 1948. Nonetheless, and somewhat in contrast to the conventional wisdom, our model suggests that Trump is more likely to win the Electoral College while losing the popular vote than the other way around. (Though the chances of either scenario are small.)
Some of this may be because we just haven't had all that much swing state polling; it's possible we'll see leads for Clinton in the mid- to high single digits as these states are polled more often. Just this morning, for example, the firm Evolving Strategies published a set of polls in swing states showing Clinton leading Trump by 10 percentage points, on average. If there are more numbers like those, the model will adjust accordingly.
But there's another potential explanation, which is that Trump is badly underperforming in red states, presumably as a result of having failed to consolidate the Republican base. That may put some traditionally red states into play for Clinton. For instance, Arizona, Missouri, North Carolina and the 2nd Congressional District of Nebraska are all tossups, according to the polls-only model. (Polls-plus has Trump narrowly favored in these places.)
Some of these states could be useful to Clinton. Arizona, in particular, could help Clinton put together some winning maps based on Western or heavily Hispanic states, even if she loses much of the industrial Midwest. Others, such as Missouri, are probably more superfluous. They could potentially add to Clinton's Electoral College margin, but they aren't likely to be tipping-point states that make the difference between her winning and losing.
That goes doubly for states such as Texas, Utah, Kansas and Alaska, where polls have often shown a single-digit margin for Trump and have occasionally even had Clinton winning. Republicans are used to racking up huge numbers of votes in these states, bolstering their standing in the national popular vote. If Trump wins Texas by only 6 percentage points instead of 16, that will hurt his popular-vote margin without affecting his Electoral College odds much.
Is the reverse also true? Is Trump overperforming in blue states, relative to how a Republican usually does? It depends on where you look. The Northeast was Trump's strongest region in the primaries, and he's gotten relatively good numbers -- although he still trails Clinton -- in polls of New Jersey, Connecticut and Maine. (He also leads Clinton in one poll of Maine's 2nd Congressional District, which would be worth one electoral vote.) However, he's losing by typical margins in New York and California, where he has vowed to compete.
Overall, the polls so far suggest a slightly less polarized electorate and a somewhat wider playing field than we've gotten used to in recent years. That's a potentially refreshing change, although it may prove to be ephemeral as both Clinton and Trump have room to grow with their party bases and could gain ground in traditionally blue and red states as a result.
Undecideds are abundant, so uncertainty is high
Giving Clinton a 75 percent or 80 percent chance of winning might seem bold. It's actually fairly cautious, however, compared with what the model would normally say about a candidate with a 7-point lead.
That's because Trump is at only 36 percent in our national polling average, while Clinton is at only 43 percent. Gary Johnson, the Libertarian Party candidate whom our model explicitly includes in the forecast, is polling in the double digits in some polls, while we're seeing a significant undecided vote and some votes for other candidates, such as Jill Stein of the Green Party.
Historically, high numbers of undecided voters contribute to uncertainty and volatility. So do third-party candidates, whose numbers sometimes fade down the stretch run. With Clinton at only 43 percent nationally, Trump doesn't need to take away any of her voters to win. He just needs to consolidate most of the voters who haven't committed to a candidate yet.
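How much is "most"? The arithmetic can be sketched in a few lines of Python. This is a simplification of ours, not part of the model: it treats the race as a national popular-vote contest, takes the 36 percent and 43 percent figures above at face value, and lets you set aside a hypothetical share of voters who end up with third parties or stay home.

```python
def share_needed(trailing, leading, staying_out=0.0):
    """Share of uncommitted voters the trailing candidate must win to pull even.

    All arguments are percentages of the electorate. `staying_out` is a
    hypothetical share that never moves to either major candidate.
    Solves: trailing + x * pool = leading + (1 - x) * pool for x.
    """
    pool = 100.0 - trailing - leading - staying_out
    if pool <= 0:
        raise ValueError("no uncommitted voters left to win over")
    return (leading - trailing + pool) / (2.0 * pool)


# Trump at 36, Clinton at 43, everyone else eventually picks one of the two.
print(f"{share_needed(36, 43):.0%} of the uncommitted vote needed")

# Same race, but suppose 10 points stay with third parties (a made-up figure).
print(f"{share_needed(36, 43, staying_out=10):.0%} of the uncommitted vote needed")
```

With every uncommitted voter up for grabs, the trailing candidate needs roughly two-thirds of them; the bar rises quickly as more of those voters stick with third parties.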
By the same token, there's the possibility of a landslide against Trump, whose floor is unusually low given that he's getting only 36 percent of the vote now. Polls-only gives Clinton a 35 percent chance of winning by double digits nationally, which would make her the first candidate to do so since Ronald Reagan in 1984 (and the first Democrat since Lyndon B. Johnson in 1964).
A 20 percent or 25 percent chance of Trump winning is an awfully long way from 2 percent, or 0.02 percent. It's a real chance: about the same chance that the visiting team has when it trails by a run in the top of the eighth inning in a Major League Baseball game. If you've been following politics or sports over the past couple of years, I hope it's been imprinted onto your brain that those purported long shots -- some of them much longer shots than Trump -- sometimes come through.
But the polls establish Clinton as a fairly clear favorite. And in contrast to almost everything else this election cycle, the polls have mostly been right so far.
Donald Trump has a 20 percent chance of becoming president