Posts Tagged 'computing in compromised environments'

Secrets and authentication: lessons from the Yahoo hack

Authentication (I’m allowed to access something or do something) is based on some kind of secret. The standard framing of this is that there are three kinds of secrets:

  1. Something I have (like a device that generates 1-time keys)
  2. Something I am (like a voiceprint or a fingerprint), or
  3. Something I know (like a password).

There are problems with the first two mechanisms. Having something (a front door key) is the way we authenticate getting into our houses and offices, but it doesn’t transfer well to the digital space. Being something looks like it works better but suffers from the problem that, if the secret becomes widely known, there’s often no way to change the something (“we’ll be operating on your vocal cords to change your voice, Mr. Smith”). Which is why passwords tend to be the default authentication mechanism.

At first glance, passwords look pretty good. I have a secret, the password, and the system I’m authenticating with has another secret, the encrypted version of the password. Unfortunately, the system’s secret isn’t very secret because the encrypted version of my password is almost always transmitted in clear because of the prevalence of wifi. Getting from the system’s secret to mine is hard, which is supposed to prevent reverse engineering my secret from the system’s.

The problem is that the space of possible passwords is small enough that the easy mapping, from my secret to the system’s, can be tried for all strings of reasonable length. So brute force enables the reverse engineering that was supposed to be hard. Making passwords longer and more random helps, but only at the margin.

We could instead make the secret a function instead of a string. As the very simplest example, the system could present me with a few small integers, and my authentication would be based on knowing that I’m supposed to add the first two and subtract the third. My response to the system is the resulting value. Your secret might be to add the first and the third and ignore the second.

But limitations on what humans can compute on the fly means that the space of functions can’t actually be very large, so this doesn’t lead to a practical solution.

Some progress can be made by insisting that both I and the system must have different secrets. Then a hack of either the system or of me by phishing isn’t enough to gain access to the system. There are a huge number of secret sharing schemes of varying complexity. But for the simplest example, my secret is a binary string of length n, and the system’s secret is another binary string of length n. We exchange encrypted versions of our strings, and the system authenticates me if the exclusive-or of its string and mine has a particular pattern. Usefully, I can also find out if the system is genuine by carrying out my own check. This particular pattern is (sort of) a third secret, but one that neither of us have to communicate and so is easier to protect.

This system can be broken, but it requires a brute force attack on the encrypted version of my secret, the encrypted version of the system’s secret, and then working out what function is applied to merge the two secrets (xor here, but it could be something much more complex). And that still doesn’t get access to the third secret.

Passwords are the dinosaurs of the internet age; secret sharing is a reasonable approach for the short to medium term, but (as I’ve argued here before) computing in compromised environments is still the best hope for the longer term.