Posts Tagged 'cybersecurity'

Lessons from WannaCrypt and its cousins

Now that the dust has settled a bit, we can look at the WannaCrypt ransomware, and the other malware exploiting the same vulnerability, more objectively.

First, this attack vector existed because Microsoft, a long time ago, made a mistake in the SMB file sharing protocol. It was (apparently) exploited by the NSA, and then by others with less good intentions, but the vulnerability is all down to Microsoft.

There are three pools of vulnerable computers that played a role in spreading the Wannacrypt worm, as well as falling victim to it.

  1. Enterprise computers which were not being updated in a timely way because it was too complicated to maintain all of their other software systems at the same time. When Microsoft issues a patch, bad actors immediately try to reverse engineer it to work out what vulnerability it addresses. The last time I heard someone from Microsoft Security talk about this, they estimated it took about 3 days for this to happen. If you hadn’t updated in that time, you were vulnerable to an attack that the patch would have prevented. Many businesses evaluated the risk of disruption, from an interaction of the patch with their running systems, as greater than the risk of delaying the update; they may now have to re-evaluate that calculus!
  2. Computers running XP for perfectly rational reasons. Microsoft stopped supporting XP because they wanted people to buy new versions of their operating system (and often new hardware to be able to run it), but there are many, many people in the world for whom a computer running XP was a perfectly serviceable product, and who will continue to run it as long as their hardware keeps working. The software industry continues to get away with failing to warrant its products as fit for purpose, an evasion that wouldn’t work in other industries. Imagine discovering that the locks on a car stopped working after 5 years: could the manufacturer get away with claiming that the car was no longer supported? (Microsoft did, in this instance, release a patch for XP, but well after the fact.)
  3. Computers running unregistered versions of Microsoft operating systems (which therefore do not get updates). Here Microsoft is culpable for the opposite reason: people can run an unregistered version for years and years, provided they’re willing to re-install it periodically, and it’s technically possible to prevent this kind of serial illegality, or at least to make it much more difficult.

The analogy is with public health. When there’s a large pool of unvaccinated people, the risk to everyone increases. Microsoft’s business decisions make the pool of ‘unvaccinated’ computers much larger than it needs to be. And while this pool is out there, there will always be bad actors who can find a use for the computers it contains.


Secrets and authentication: lessons from the Yahoo hack

Authentication (I’m allowed to access something or do something) is based on some kind of secret. The standard framing of this is that there are three kinds of secrets:

  1. Something I have (like a device that generates 1-time keys)
  2. Something I am (like a voiceprint or a fingerprint), or
  3. Something I know (like a password).

There are problems with the first two mechanisms. Having something (a front door key) is the way we authenticate getting into our houses and offices, but it doesn’t transfer well to the digital space. Being something looks like it works better but suffers from the problem that, if the secret becomes widely known, there’s often no way to change the something (“we’ll be operating on your vocal cords to change your voice, Mr. Smith”). Which is why passwords tend to be the default authentication mechanism.

At first glance, passwords look pretty good. I have a secret, the password, and the system I’m authenticating with has another secret, the encrypted version of the password. Unfortunately, the system’s secret isn’t very secret: with the prevalence of wifi, the encrypted version of my password is almost always transmitted in the clear. Getting from the system’s secret to mine is hard, which is supposed to prevent reverse engineering my secret from the system’s.

The problem is that the space of possible passwords is small enough that the easy mapping, from my secret to the system’s, can be tried for all strings of reasonable length. So brute force enables the reverse engineering that was supposed to be hard. Making passwords longer and more random helps, but only at the margin.
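To make the brute-force point concrete, here is a minimal Python sketch. The password, the alphabet, and the use of plain SHA-256 are all illustrative assumptions; real systems add salts and deliberately slow hash functions, but those only raise the cost of the same attack.

```python
# Hypothetical sketch: recovering a short password from its hash by
# exhaustive search. SHA-256 stands in for whatever a real system uses.
import hashlib
import itertools
import string

def hash_password(pw: str) -> str:
    return hashlib.sha256(pw.encode()).hexdigest()

# The "system's secret": the stored hash of a (hypothetical) password.
stored_hash = hash_password("cat7")

# The easy mapping, from my secret to the system's, tried for all
# strings of reasonable length.
alphabet = string.ascii_lowercase + string.digits
for length in range(1, 5):
    for candidate in itertools.product(alphabet, repeat=length):
        guess = "".join(candidate)
        if hash_password(guess) == stored_hash:
            print("recovered password:", guess)
            raise SystemExit
```

A four-character password over this alphabet falls in under two million guesses; each extra character multiplies the work by 36, which is why longer, more random passwords help, but only at the margin, against attackers who can try billions of hashes per second.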

We could make the secret a function rather than a string. As the very simplest example, the system could present me with a few small integers, and my authentication would be based on knowing that I’m supposed to add the first two and subtract the third. My response to the system is the resulting value. Your secret might be to add the first and the third and ignore the second.
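A minimal sketch of this idea, assuming a toy challenge of three small integers (the function and the challenge format are purely illustrative):

```python
# Hypothetical sketch: the secret is a function, not a string.
import random

def my_secret_function(a: int, b: int, c: int) -> int:
    # My secret: add the first two challenge values, subtract the third.
    return a + b - c

def authenticate(respond) -> bool:
    # The system knows the same function, issues a fresh challenge,
    # and checks the response.
    a, b, c = (random.randint(1, 99) for _ in range(3))
    return respond(a, b, c) == a + b - c

print(authenticate(my_secret_function))  # True
```

Note that an eavesdropper who sees even a few challenge-response pairs can usually solve for the function, precisely because the space of humanly computable functions is so small.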

But limitations on what humans can compute on the fly mean that the space of functions can’t actually be very large, so this doesn’t lead to a practical solution.

Some progress can be made by insisting that the system and I have different secrets. Then a hack of the system, or of me by phishing, isn’t enough to gain access. There are a huge number of secret sharing schemes of varying complexity. But for the simplest example, my secret is a binary string of length n, and the system’s secret is another binary string of length n. We exchange encrypted versions of our strings, and the system authenticates me if the exclusive-or of its string and mine has a particular pattern. Usefully, I can also find out if the system is genuine by carrying out my own check. This particular pattern is (sort of) a third secret, but one that neither of us has to communicate, and so is easier to protect.
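A minimal sketch of this simplest scheme, with the encryption of the exchanged strings (and any protection against replay) elided:

```python
# Hypothetical sketch: XOR-based secret sharing for mutual authentication.
import secrets

n = 16  # length of the secrets, in bytes

def xor(x: bytes, y: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(x, y))

# Setup: the pattern is the "third secret" that is never transmitted;
# the two strings are chosen so that their xor equals it.
pattern = secrets.token_bytes(n)
my_secret = secrets.token_bytes(n)
system_secret = xor(my_secret, pattern)

# After the (encrypted) exchange, each side checks the same condition
# independently, so each authenticates the other.
assert xor(system_secret, my_secret) == pattern  # the system checks me
assert xor(my_secret, system_secret) == pattern  # I check the system
print("mutual authentication succeeded")
```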

This system can be broken, but it requires a brute force attack on the encrypted version of my secret, the encrypted version of the system’s secret, and then working out what function is applied to merge the two secrets (xor here, but it could be something much more complex). And that still doesn’t get access to the third secret.

Passwords are the dinosaurs of the internet age; secret sharing is a reasonable approach for the short to medium term, but (as I’ve argued here before) computing in compromised environments is still the best hope for the longer term.

The growing role of data curation

My view of Data Science, or Big Data if you prefer, is that it divides naturally into three different subfields:

  1. Data curation, which focuses on the issues of managing large amounts of heterogeneous data, but is primarily concerned with provenance, that is, tracking the metadata about the data.
  2. Computational science, which builds models of the real-world inside computer systems to study their properties.
  3. Analytics, which infers the properties of systems based on data about them.

I’ve posted about these ideas previously (https://skillicorn.wordpress.com/2015/05/09/why-data-science/).

Data curation might seem like the poor cousin of the three; it certainly gets the least funding and attention.

But issues of provenance have suddenly become mainstream as everyone on the web struggles to figure out what to do about fake news stories. So far, the Internet has not really addressed the issues of metadata. Most of the big content providers know who generated the content they distribute, but they don’t necessarily make this information available for those who read the content to leverage. It’s time for the data curation experts, who tend to come from information systems and library science, to step up.

Data curation is also about to become the front line in cyberattack. As I’ve suggested (Skillicorn, D.B., Leuprecht, C., and Tait, V. 2016. Beyond the Castle Model of Cybersecurity. Government Information Quarterly.), a natural cyberdefence strategy is replication. Data exfiltration is made much more difficult if there are many, superficially similar, versions of any document or data that might be a target. However, progress in assigning provenance becomes the cyberattack that matches this cyberdefence.
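As an illustration of the replication idea, here is a hypothetical sketch that generates superficially similar versions of a document by small, meaning-preserving substitutions; an exfiltrated copy is then hard to distinguish from the decoys without provenance metadata:

```python
# Hypothetical sketch: near-replicates as a cyberdefence by replication.
import hashlib
import random

SYNONYMS = {"large": "big", "begin": "start", "assist": "help"}

def make_replicate(text: str, seed: int) -> str:
    rng = random.Random(seed)
    return " ".join(
        SYNONYMS.get(w, w) if rng.random() < 0.5 else w
        for w in text.split()
    )

original = "we will begin to assist with the large migration"
for seed in range(4):
    r = make_replicate(original, seed)
    print(hashlib.sha256(r.encode()).hexdigest()[:12], r)
```

Versions that differ by even a single word hash completely differently, so simple fingerprinting can’t pick out the original; that is exactly the problem that better provenance-assignment techniques would solve for the attacker.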

So here’s the research question for data curation: how can I tell, from internal evidence and partial external evidence, whether a particular document is legitimate (or is the legitimate version of a set of almost-replicates)?

It’s not classified emails that are the problem

There’s been reporting that the email trove, belonging to Huma Abedin but found on the laptop of her ex-husband, got there as the result of automatic backups from her phone. This seems plausible; if it is true then it raises issues that go beyond whether any of the emails contain classified information or not.

First, it shows how difficult it is for ordinary people to understand the consequences of their choices about configuring their life-containing devices. Backing up emails is good, but every user needs to understand what that means, and how potentially invasive it is.

Second, to work as a backup site, this laptop must have been Internet-facing and (apparently) unencrypted. That means that more than half a million email messages were readily accessible to any reasonably adept cybercriminal or nation-state. If there are indeed classified emails among them, then that’s a big problem.

But even if there are not, access to someone’s emails, given the existence of textual analytics tools, means that a rich picture can be built up of that individual: what they are thinking about, who they are communicating with (their ego network in the jargon), what the rhythm of their day is, where they are located physically, what their emotional state is like, and even how healthy they are.
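As a small illustration of how little it takes, here is a hypothetical sketch that builds an ego network from nothing but the From and To headers of a mailbox (no body text needed):

```python
# Hypothetical sketch: an ego network from email headers alone.
from collections import Counter
from email import message_from_string

raw_messages = [  # stand-ins for a real mailbox
    "From: alice@example.org\nTo: bob@example.org\n\nhello",
    "From: alice@example.org\nTo: carol@example.org\n\nlunch?",
    "From: bob@example.org\nTo: alice@example.org\n\nre: hello",
]

edges = Counter()
for raw in raw_messages:
    msg = message_from_string(raw)
    edges[(msg["From"], msg["To"])] += 1

# The weighted edge list is the ego network: who talks to whom, and how often.
for (src, dst), weight in edges.most_common():
    print(f"{src} -> {dst}: {weight} message(s)")
```

Add timestamps and you get the rhythm of the day; add the bodies and off-the-shelf text analytics, and the rest of the picture follows.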

For any of us, that kind of analysis would be quite invasive. But when the individual is a close confidante of the U.S. Secretary of State, and when many of the emails are from that same Secretary, a picture of them at this level of detail is valuable, and could be exploited by an adversary.

Lawyers and the media gravitate to the classified information issue. This is a 20th Century view of the problems that revealing large amounts of personal text causes. The real issue is an order of magnitude more subtle, but also an order of magnitude more dangerous.

Come back King Canute, all is forgiven

You will remember that King Canute held a demonstration in which he showed his courtiers that he did not have the power to hold back the tide.

Senior officials in Washington desperately need courtiers who will show them, with equal force, that encryption has the same sort of property. If it’s done right, encrypted material can’t be decrypted by fiat. And any backdoor to the encryption process can’t be made available only to the good guys.

The current story about Apple and the encrypted phone used by one of the San Bernardino terrorists is not helping to make this issue any clearer to government, largely because the media coverage is so muddled that nobody could be blamed for missing the point.

The basic facts seem to be these: the phone is encrypted, the FBI have been trying to get into it for some time, and there’s no way for anyone, Apple included, to burn through the encryption without the password. This is all as it was designed to be.

The FBI is now asking Apple to alter the access control software so that, for example, the ten-try limit on password guesses is disabled. Apple is refusing on two grounds. First, this amounts to the government compelling them to construct something, a form of conscription that is illegal (the FBI could presumably contract with Apple to build the required software, but Apple has no appetite for this).

Second, Apple argues that the existence proof of such a construct would make it impossible for them to resist the same request from other governments, where the intent might be less benign. This is an interesting argument. On the one hand, if they can build it now, they can build it then, and nobody’s claiming that the required construct is impossible. On the other hand, there’s no question that being able to do something in the abstract is psychologically quite different from having done it.

But it does seem as if Apple is using its refusal as a marketing tool for its high-mindedness and pro-privacy stance. Public opinion might have an effect if only the public could work out what the issues are — but the media have such a tenuous grasp that every story I saw today guaranteed greater levels of confusion.

Cybersecurity training — the contrasts

I think there must be wide agreement that skills in the cybersecurity domain are highly valuable in the 21st century, but also in extremely short supply.

It’s interesting to compare the number of graduate-level programs focused on cybersecurity in the U.S. and in Canada. A quick search in the U.S. finds that more than 30 colleges (probably a lot more) offer at least a Master’s degree specialising in cybersecurity. There are also at least a handful in the U.K.

The identical search in Canada finds essentially no such programs (there is one, but it’s not open to civilians). In fact, there are almost no graduate programs in Canada that offer even a single course in cybersecurity.

Country         Population     Number of programs
U.S.            319,000,000    30+
Canada           30,000,000    0
U.K.             64,000,000    6+
Australia        23,000,000    3+
New Zealand       4,000,000    2+

Part of the problem is structural. The Canadian federal government has the greatest interest in a well-trained cybersecurity pool (to supply the Communications Security Establishment Canada and to provide a path to hardening infrastructure, finance, and high-tech businesses). But Canadian universities are provincially funded, and the provinces don’t have much interest in cybersecurity.

The differences between the U.S. and Canada are stark, and make it clear that Canada is going to have a hard time pulling its weight in the Five Eyes collaboration. And it’s a difficult problem to solve because of the need to bootstrap: there aren’t enough faculty to teach and do research in cybersecurity, because there aren’t enough opportunities to learn how.

Bridging airgaps for amateurs

I’ve pointed out before that air gapping (for example, keeping military networks physically separated from the internet) is a very weak mechanism in a world where most devices have microphones and speakers. Devices can communicate using audio at frequencies humans in the room can’t hear, so real air gapping requires keeping the two networks separated by distance, or by soundproofing good enough to prevent this kind of covert channel. The significance of this channel is underappreciated: it’s common even in secure environments to find internet-connected devices in the same room as secure devices.
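To see how low the bar is, here is a hypothetical sketch that encodes bits as near-ultrasonic tones, which commodity speakers can emit and microphones can pick up but most adults can’t hear (the frequencies, bit rate, and amplitude are illustrative choices):

```python
# Hypothetical sketch: a covert audio channel, one tone per bit.
import math
import struct
import wave

SAMPLE_RATE = 44100
BIT_DURATION = 0.1                    # seconds per bit
FREQ = {"0": 18500.0, "1": 19500.0}   # Hz, near the edge of human hearing

def encode(bits: str, filename: str) -> None:
    samples = []
    for bit in bits:
        f = FREQ[bit]
        for i in range(int(SAMPLE_RATE * BIT_DURATION)):
            samples.append(int(16000 * math.sin(2 * math.pi * f * i / SAMPLE_RATE)))
    with wave.open(filename, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)             # 16-bit samples
        w.setframerate(SAMPLE_RATE)
        w.writeframes(struct.pack("<%dh" % len(samples), *samples))

encode("10110010", "covert.wav")  # a receiver detects which tone is present
```

A matching receiver just watches the microphone’s spectrum for energy at the two frequencies; everything needed is already in a commodity laptop.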

The ante has been upped a bit by Google’s introduction of Tone, a Chrome add-on that communicates via the audio channel to allow sharing of URLs, in sort of the same way that Palm Pilots used to communicate using infrared. Adapting this app to communicate even more content is surely straightforward, so even amateurs will be able to use the audio channel. Quite apart from the threat to military and intelligence systems, there are many other nasty possibilities, including exfiltrating documents and infecting with malware that can exploit this new channel. And it doesn’t help that its use is invisible (inaudible).

The introduction of LiFi, which will bring many benefits, also introduces a similar side channel, since most devices have a camera and a screen.

A world in which cybersecurity is conceived of as a mechanism of walls and gates is looking increasingly obsolete when the network is everywhere, and every gate has holes in it.