6.5/7 US presidential elections predicted from language use

I couldn’t do a formal analysis of Trump/Clinton language because Trump didn’t put his speeches online — indeed many of them weren’t scripted. But, as I posted recently, his language was clearly closer to our model of how to win elections than Clinton’s was.

So since 1992, the language model has correctly predicted the outcome of every election except 2000, when the model predicted a very slight advantage for Gore over Bush (which is sort of what happened).

People judge candidates on who they seem to be as a person, a large part of which is transmitted by the language they use. Negative and demeaning statements obviously affect this, but so do positivity and optimism.

Voting is not rational choice

Pundits and the media continue to be puzzled by the popularity of Donald Trump. They point out that much of what he says isn’t true, that his plans lack content, that his comments about various subgroups are demeaning, and so on, and so on.

Underlying these plaintive comments is a fundamental misconception about how voters choose the candidate they will vote for. This has much more to do with the standard human judgements of character and personality, made in the first few seconds, than it does with calm, reasoned decision making.

Our analysis of previous presidential campaigns (about which I’ve posted earlier) makes it clear that this campaign is not fundamentally different in this respect. It’s always been the case that voters decide based on the person who appeals to them most on a deeper than rational level. As we discovered, the successful formula for winning is to be positive (Trump is good at this), not to be negative (Trump is poor at this), not to talk about policy (Trump is good at this), and not to talk about the opponent (Trump is poor at this). On the other hand, Hillary Clinton is poor at all four — she really, really believes in the rational voter.
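To make the four-part formula concrete, here is a toy sketch of how a speech might be scored along those four dimensions. The word lists, the equal weighting, and the function names are all invented for illustration; they are not the lexicons or weights behind the analysis described here.

```python
import re

# Hypothetical, tiny word lists for illustration only; a real model would
# need much larger, validated lexicons.
POSITIVE = {"great", "win", "best", "strong", "hope"}
NEGATIVE = {"bad", "fail", "worst", "weak", "disaster"}
POLICY = {"tax", "budget", "legislation", "regulation", "plan"}
OPPONENT = {"opponent", "rival"}


def rate(words, lexicon):
    """Fraction of the words that appear in the lexicon."""
    return sum(w in lexicon for w in words) / len(words) if words else 0.0


def language_score(text):
    """Higher is better: reward positivity, penalise the other three."""
    words = re.findall(r"[a-z']+", text.lower())
    return (rate(words, POSITIVE) - rate(words, NEGATIVE)
            - rate(words, POLICY) - rate(words, OPPONENT))
```

Under these assumptions, a sentence such as "We will win and it will be great" scores higher than "My opponent has a bad tax plan", which is the pattern the formula rewards.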

We’ll see what happens in the election this week. But apart from the unusual facts of this presidential election, the way the two candidates approach voters makes it easy to understand why Trump isn’t doing worse and Hillary Clinton isn’t doing better.

It’s not classified emails that are the problem

There’s been reporting that the email trove, belonging to Huma Abedin but found on the laptop of her ex-husband, got there as the result of automatic backups from her phone. This seems plausible; if it is true then it raises issues that go beyond whether any of the emails contain classified information or not.

First, it shows how difficult it is for ordinary people to understand, and realise, the consequences of their choices about configuring their life-containing devices. Backing up emails is good, but every user needs to understand what that means, and how potentially invasive it is.

Second, to work as a backup site, this laptop must have been Internet-facing and (apparently) unencrypted. That means that more than half a million email messages were readily accessible to any reasonably adept cybercriminal or nation-state. If there are indeed classified emails among them, then that’s a big problem.

But even if there are not, access to someone’s emails, given the existence of textual analytics tools, means that a rich picture can be built up of that individual: what they are thinking about, who they are communicating with (their ego network in the jargon), what the rhythm of their day is, where they are located physically, what their emotional state is like, and even how healthy they are.
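As a rough illustration of how little effort the first step takes, the sketch below builds an ego network from nothing but message headers, using Python's standard email parser. The function name and the idea of counting co-occurrences are my assumptions for the example; real analytics tools go much further.

```python
from collections import Counter
from email import message_from_string
from email.utils import getaddresses


def ego_network(raw_messages, owner):
    """Count how often `owner` appears on a message with each correspondent."""
    contacts = Counter()
    for raw in raw_messages:
        msg = message_from_string(raw)
        # Gather every address in the From, To and Cc headers.
        fields = (msg.get_all("From", []) + msg.get_all("To", [])
                  + msg.get_all("Cc", []))
        people = {addr.lower() for _, addr in getaddresses(fields) if addr}
        if owner in people:
            contacts.update(people - {owner})
    return contacts
```

Calling `ego_network(messages, "someone@example.com").most_common(10)` on a large trove would list that person's ten most frequent correspondents, without reading a single message body.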

For any of us, that kind of analysis would be quite invasive. But when the individual is a close confidante of the U.S. Secretary of State, and when many of the emails are from that same Secretary, a picture of them at this level of detail is valuable, and could be exploited by an adversary.

Lawyers and the media gravitate to the classified information issue. This is a 20th Century view of the problems that revealing large amounts of personal text causes. The real issue is an order of magnitude more subtle, but also an order of magnitude more dangerous.

Biometrics are not the answer for authentication

I’ve pointed out before that biometrics are not a good path to follow to avoid the obvious and growing issues with authentication using passwords.

Many biometrics suffer from being easy to spoof: pictures of someone’s iris, appropriately embedded in a background, can fool iris readers, a sheet of clingfilm can often cause a fingerprint reader to ‘see’ the last real fingerprint used on it, and so on.

But there’s a more pervasive problem with biometrics. The fact that a biometric is something you are is, on the one hand, a positive because you don’t have to remember anything, and wherever you go, there you are.

But, on the other hand, a biometric cannot be changed, and this turns out to be a huge problem.

Suppose you go to authenticate using a biometric. The device that captures your biometric must convert it to something digital, and then compare that digital value to a previously recorded value associated with you.

There are two problems:

  1. For a while, the device has your biometric data as plaintext. It may be encrypted very close to the place where it is captured, but there is a gap, and the unencrypted version can potentially be grabbed in the gap. There is always a temptation/pressure to use low-power sensors for capture, and they may not be able to handle the encryption.
  2. The previously recorded values must be kept somewhere. If this location can be hacked, then the encrypted versions of the biometric can be copied. These encrypted versions can then be used for replay attacks.

Of course, there are defences. But, for example, if e-passports are to be used to enter multiple countries, then they must use the same repertoire of encryption techniques so that passports from multiple countries can be read by the same system. So it’s not enough to say that encrypting the biometric plaintext differently in each system will prevent these issues.

And if one person’s encrypted biometric is stolen, there’s no practical way to update the systems that rely on it (since they must continue to use the same mapping so that everyone else’s biometrics will still work). More importantly, there’s no way to issue a fresh identity for the person whose data was stolen (“Go and have plastic surgery so that we can restore your use of facial recognition”).
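The replay and re-issue problems can be shown with a deliberately simplified model. Everything here is an assumption made for illustration: the `protect` function stands in for whatever system-wide mapping turns a captured biometric into its stored form, the key is made up, and real biometric matching is fuzzy rather than an exact comparison.

```python
import hashlib
import hmac

SYSTEM_KEY = b"shared-by-all-readers"  # hypothetical system-wide secret


def protect(template: bytes) -> bytes:
    """Stand-in for the fixed mapping from raw biometric to stored form."""
    return hmac.new(SYSTEM_KEY, template, hashlib.sha256).digest()


class Verifier:
    def __init__(self):
        self.enrolled = {}

    def enroll(self, user, template):
        self.enrolled[user] = protect(template)

    def check(self, user, protected_value):
        """Accept if the submitted protected value matches the stored one."""
        stored = self.enrolled.get(user)
        return stored is not None and hmac.compare_digest(stored, protected_value)


verifier = Verifier()
verifier.enroll("alice", b"alice-fingerprint")

# An attacker copies the stored value and replays it, with no finger needed.
stolen = verifier.enrolled["alice"]
replay_accepted = verifier.check("alice", stolen)

# Re-enrolling does not help: same finger, same mapping, same stored value.
verifier.enroll("alice", b"alice-fingerprint")
still_accepted = verifier.check("alice", stolen)
```

With a password, the second step would invalidate the stolen value because the user would pick a new secret. Here both `replay_accepted` and `still_accepted` are true, because the only thing that could change is the finger.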

The real problem with the Clinton email server

Every intelligence person I’ve talked to has told me that the probability that the Russians and Chinese (at least) hacked Hillary Clinton’s email server is 100%.

While the question of whether any of the emails were classified, about to be classified, or should have been classified is interesting, the real risk created by the use of this server is that it provided a real-time look at the communications of the Secretary of State (and the people she was talking to).

Even the unclassified emails provided insight into the Secretary’s state of mind, plans, location, and intentions. Some of these might have been obvious; others would follow from examining email headers; and others by carrying out textual analysis (which is getting quite good at reverse engineering mental state, as regular readers will know).
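Here is a small sketch of what examining headers alone can give away: the Date header carries both a time and a UTC offset, so a histogram of sending hours hints at daily rhythm, and the offsets hint at location. The function name is hypothetical and the approach is only one simple possibility.

```python
from collections import Counter
from email import message_from_string
from email.utils import parsedate_to_datetime


def activity_by_hour(raw_messages):
    """Histogram of sending hour, in the local time stated by each Date header."""
    hours = Counter()
    for raw in raw_messages:
        date_header = message_from_string(raw)["Date"]
        if date_header is None:
            continue  # no Date header on this message
        try:
            sent = parsedate_to_datetime(date_header)
        except (TypeError, ValueError):
            continue  # skip malformed dates
        hours[sent.hour] += 1
    return hours
```

Run over a few thousand messages, a table like this shows when someone usually wakes, works, and sleeps, and a sudden change in UTC offset shows when they travel.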

Access to your entire email stream + some analytic capacity = fairly complete understanding of your life.

(Note that Google already does this for everyone who has a Gmail account, and also for anyone who sends or receives email from anyone with a Gmail account.)

Added 2016/05/06: A new problem now arises: control of the presidential election is in the hands of any country that can claim to have hacked the server. While hacking by a foreign power remains a (virtually certain) hypothetical, it is clearly having no impact on the election. But if a foreign power were to leak that they had hacked the server and exploited that somehow, the impact would surely be catastrophic. And I can imagine several of America’s enemies who might prefer a President Trump to a President Clinton II.

Come back King Canute, all is forgiven

You will remember that King Canute held a demonstration in which he showed his courtiers that he did not have the power to hold back the tide.

Senior officials in Washington desperately need courtiers who will show them, with equal force, that encryption has the same sort of property. If it’s done right, encrypted material can’t be decrypted by fiat. And any backdoor to the encryption process can’t be made available only to the good guys.

The current story about Apple and the encrypted phone used by one of the San Bernardino terrorists is not helping to make this issue any clearer to government, largely because the media coverage is so muddled that nobody could be blamed for missing the point.

The basic facts seem to be these: the phone is encrypted, the FBI have been trying to get into it for some time, and there’s no way for anyone, Apple included, to burn through the encryption without the password. This is all as it was designed to be.

The FBI is now asking Apple to alter the access control software so that, for example, the ten-try limit on password guesses is disabled. Apple is refusing on two grounds. First, this amounts to the government compelling them to construct something, a form of conscription that is illegal (presumably the FBI could contract with Apple to build the required software but presumably Apple has no appetite for this).

Second, Apple argues that the existence proof of such a construct would make it impossible for them to resist the same request from other governments, where the intent might be less benign. This is an interesting argument. On the one hand, if they can build it now, they can build it then, and nobody’s claiming that the required construct is impossible. On the other hand, there’s no question that being able to do something in the abstract is psychologically quite different from having done it.
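To see why the ten-try limit is the thing the FBI wants removed, it helps to do the arithmetic. The sketch below assumes a purely numeric passcode and an attacker who never repeats a guess; the passcode length is my assumption, since the reporting doesn't say what was set on this phone.

```python
from typing import Optional


def guess_success_probability(passcode_digits: int,
                              max_tries: Optional[int]) -> float:
    """Chance that guessing distinct numeric passcodes succeeds within the limit."""
    keyspace = 10 ** passcode_digits
    if max_tries is None:
        # Limit disabled: every possible code can eventually be tried.
        return 1.0
    return min(max_tries, keyspace) / keyspace
```

For a hypothetical four-digit passcode, `guess_success_probability(4, 10)` is one in a thousand, while `guess_success_probability(4, None)` is a certainty. The encryption itself is untouched in both cases; only the access control around it changes.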

But it does seem as if Apple is using its refusal as a marketing tool for its high-mindedness and pro-privacy stance. Public opinion might have an effect if only the public could work out what the issues are — but the media have such a tenuous grasp that every story I saw today guaranteed greater levels of confusion.

“It’s going to be really great”

Donald Trump continues to be the poster child for our election-winning-language model: high positive language, as little negative language as possible, and appeals to policy goals without getting into details. The media and pundits are tearing their hair out because he refuses to talk about specifics but, as we predict, it’s working! (Interestingly, I went back and looked at Perot’s language in the 1992 election, and he had more or less the same patterns, and he led the party contenders in national polls for a period in 1992.)

What the media and pundits don’t realise is that incumbent presidents running for a second term use language very similar to Trump’s. It’s just that, with a first-term track record, it’s not as glaringly obvious, and they don’t notice.