Terrorist incidents come in only a few flavors

Terrorist attacks are different in many ways: they take place in different countries, with different motivations behind them, using different mechanisms, and with varying degrees of success. But are there any commonalities that could be used, for example, to categorize them and so to defend against them in more focused ways? The answer is yes, there are large-scale similarities.

To do this analysis, I started from the Global Terrorism Database developed by START, the National Consortium for the Study of Terrorism and Responses to Terrorism. The database contains details of all incidents that meet their coding standards since the beginning of 1970, and I used the version released at the end of 2012. There was one major discontinuity where new fields were added but overall the coding has been consistent over the entire 40+ year period.

The image below shows the clustering of all attacks over that time period:

attackslabelled

The large structure looks like a hinge with clusters A and B at the top, clusters C and D forming the hinge itself, and clusters E, F, G, and H at the bottom. There’s also a distinction between the clusters at the front (B, D, F, and H) and those at the back (A, C, E, and G). (You’ll have to expand the figure to see the labels clearly.)

The first thing to notice is that there are only 8 clusters and, with the exception of H which is quite diffuse, the clusters are fairly well defined. In other words, there are 8 distinctive kinds of terrorist attack (and only 8, over a very long time period).

Let’s dig into these clusters and see what they represent. The distinction between the front and the back is almost entirely related to issues of attribution: whether the attack was claimed, how clear that claim is (for example, are there multiple claims of responsibility for the same incident), and whether the incident should properly be classed as terrorism or something else (quasi-military, for example).

The structure of the hinge differentiates between incidents involving capturing people (hijackings or kidnappings in A and B) and incidents that are better characterized as attacks (C, D, E, F, G, H). The extremal ends of A and B (to the right) are incidents that lasted longer and/or involved a larger ransom.

The differences between C/D, E/F, and G/H arise from the number of targets (which seems to be highly correlated with the number of different nationalities involved). So C and D are attacks on a single target, E and F are attacks on two targets, and G and H are attacks on three targets. Part of the diffuse structure of H happens because claims are always murkier for more complex attacks and part because there is a small group of incidents involving 4 targets that appears, as you’d expect, even further down and to the right.

Here are some interesting figures which overlay the intensity of a property on the clustering, so that you can see how it’s associated with the clusters:

overlayclaimed

This figure shows whether the incident was claimed or not. The color coding runs from dark red to bright yellow; I’m not specifying the direction, because it’s complicated, but the contrast shows differences. In each case, the available color spectrum is mapped to the range of values.

overlaynhostkid

This figure shows the differences between incidents where people were taken hostage or kidnapped and those where they weren’t.

overlaycountry

This figure shows that the country in which the incident took place is mostly unrelated to other properties of the incident; in other words, attacks are similar no matter where they take place.

This analysis shows that, despite human variability, those designing terrorist incidents choose from a fairly small repertoire of possibilities. That’s not to say that there couldn’t be attacks in which some people are also taken hostage; rather that those doing the planning don’t seem to conceptualize incidents that way, so when it happens it’s more or less by accident. Perhaps some kind of Occam’s razor plays a role: planning an incident is already difficult so there isn’t a lot of brainpower to try for extra cleverness, and there’s probably also a perception that complexity increases risk.

Facial recognition

Most of the drama shows on television build on some kind of facial recognition, a set of faces flickering rapidly in the background, getting a match just as the main characters rendezvous in front of the screen.

I looked into the performance of facial recognition for my book “Knowledge Discovery for Counterterrorism and Law Enforcement” (Taylor and Francis, available from all good booksellers) but it’s been a while, so I thought I would go back and look at the current state of the art.

First, it probably doesn’t have to be said, but real systems don’t display all of the faces as they process them — if they did it would slow them up by a factor of more than a thousand.

Second, what is their performance? There are many variables: camera angle, lighting, amount of space (and so detail) available for image storage.

There are also different versions of the problem. One important one is deciding if this specific stored image matches this just-captured image or not. This is what is used with biometric data stored in passports; there’s a digitized version of the photo you submitted in the chip in your passport; when you cross a border, a photo is taken of you (again, under quite controlled conditions) and that new photo is matched to the old. Even for such a 1-to-1 match the error rate is not trivial — I’ve seen 25% quoted which seems high, but agrees with my own experience.

The more common problem (in tv shows) is that an image has been captured from, say, CCTV and the goal is to determine if the person in that image is in a large database of identified images. In the jargon, the database images are called enrolled, and the newly collected one is called a probe.

Performance is usually characterized by giving a False Match Rate (FMR), the rate of matching a probe to an enrolled image when they aren’t actually the same person. So, for an access control system, this is the rate at which the system would let an intruder in. At present, values of around 0.001 (1 in a thousand) are typical. For this value, then, the dependent variable is the False Non Match Rate (FNMR), which is the rate at which someone who does match gets missed. So, for an access control system, this means that a legitimate entrant gets locked out. These are typically in the range 0.03 (3 in a hundred) to about twice that.

You can see that these results are much, much weaker than those portrayed on tv. If the database contains a million images, then it’s not a case of exclaiming “We found a match” but “we found a thousand matches (and now we have to go through them and see if we think any of them is actually a match)”. Not finding a match would be much more surprising. Some systems seem to be much worse; you don’t have to look far to find stories of facial recognition systems that have never matched anyone, even when their images are known to have been captured as probes — part of the problem being that one person in a hoodie looks pretty much like any other person in a hoodie.
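The arithmetic behind the “thousand matches” figure can be sketched in a few lines. This is only an illustration: the function names are made up for this post, and it assumes that each comparison against an enrolled image is an independent trial with the same FMR, which real systems won’t satisfy exactly.

```python
def expected_false_matches(database_size: int, fmr: float) -> float:
    """Expected number of enrolled images wrongly matched to one probe.

    Assumes (hypothetically) that every comparison has the same,
    independent False Match Rate.
    """
    return database_size * fmr


def prob_no_false_match(database_size: int, fmr: float) -> float:
    """Chance that a single probe produces no false matches at all,
    under the same independence assumption."""
    return (1.0 - fmr) ** database_size


if __name__ == "__main__":
    n, fmr = 1_000_000, 0.001  # the figures used in the text
    print(expected_false_matches(n, fmr))  # about a thousand candidates
    print(prob_no_false_match(n, fmr))     # vanishingly small
```

With a million enrolled images the second number is so small that it underflows to zero in floating point, which is the sense in which not finding a match would be the surprising outcome.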

From an access control point of view, these rates mean that there’s a 1 in a thousand chance of an intruder getting in (which is probably acceptable for many situations), but 3% or more of the time, legitimate users will have to try again. I haven’t seen any data, but presumably the false non matches are not uniformly distributed, so that some people have to try again much more often than others (i.e. not all faces are equally recognisable).

Of course, these performance numbers are, if not in ideal conditions, then in reasonable conditions, whereas real images tend to be much more variable (weather, dust on the camera lens, shake on the mounting,…). And, of course, in real systems you can’t zoom in and miraculously produce more pixels as they seem to be able to do on tv. So there’s quite a long way to go. I think it’s fair to say that progress is being made — but facial recognition is a long way from production use.

Verbal mimicry isn’t verbal (well, not lexical anyway)

One of my students, Carolyn Lamb, has been looking at deception in interrogation settings.

The Pennebaker model of deception, as devoted readers will know, is robust only for freeform documents. Sadly, the settings in which deception is often most interesting tend to be dialogues (law enforcement, forensic) and it’s known that the model doesn’t extend in any straightforward way to such settings.

We started out with the idea that responses would be mixtures of language elicited by the words in a question and freeform language from the respondent, and developed a clever method to separate them. Sadly, it worked, but it didn’t help. When the effect of question language was removed from answers, the differences between deceptive and truthful responses decreased.

Digging a little deeper, we were able to show that the influence of words from the question must impact response language at a higher level (i.e. earlier in the answer construction process than simply the lexical). Those who are being deceptive respond in qualitatively different ways to prompting words than those being truthful. A paper about this has been accepted for the IEEE Intelligence and Security Informatics Conference in Seattle next month.

Part of the explanation seems to be mirror neurons. There’s a considerable body of work on language acquisition, and on responses to single words, that uses mirror neurons as a big part of the explanation; I haven’t seen anything at an intermediate level where these results fit.

There are some interesting practical applications for interrogators. One strategy would be to reduce the presence of prompting words (and do so consistently across all subjects) so that responses become closer to statements, and so closer to freeform. My impression, from those I’m acquainted with, is that smarter law enforcement personnel already know this and act on it.

But our results also suggest a new strategy: increase the number of prompting words because that tends to increase the separation between the deceptive and the truthful. This needs a good understanding of what kinds of response words to look for (and, for most, this has to be done offline because we as humans are terrible at estimating rates of words in real-time, especially function words). But it could be very powerful.

Radicalisation as infection

I’ve argued in previous posts that the process of radicalisation is one that depends largely on properties of the individual, rather than on grand social or moral drivers — personality rather than society — and that it depends on the presence of an actual person (already radicalised) who makes the potential ideas real.

There is an alternative. Woo, Son, and Chen (J. Woo, J. Son, and H. Chen. An SIR model for violent topic diffusion in social media. In Proceedings of 2011 IEEE International Conference on Intelligence and Security Informatics, ISI 2011, July 2011) show that radicalisation behaves a little bit like an infection (at least in the domain of ideas which they measure from forum postings). They show that the SIR (Susceptible-Infected-Recovered) model of disease transmission fits the data fairly well. In this model, members of a population begin in the susceptible state; they become infected with some probability A, and then recover with some probability B. After they’ve recovered they are no longer susceptible.

For the data they looked at, A was of the magnitude of 10^-4, so about 1 in 10,000 becomes infected. Once infected, B varied depending on the intensity of the topic from around 0.65 to 0.96. In other words, the probability of a ‘cure’ is well above a half, sometimes virtually certain.
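To get a feel for what these numbers imply, here is a minimal sketch that takes the description above literally: in each time step a susceptible person becomes infected with probability A, and an infected person recovers with probability B. The function name, the notion of a “time step”, and the absence of a contact term (in a full SIR model the infection rate also depends on how many people are currently infected) are my simplifying assumptions, not the formulation in the paper. It tracks expected counts rather than simulating individuals.

```python
def simulate_sir(population, a, b, steps):
    """Expected S, I, R counts per step for a simplified SIR chain.

    a: per-step probability that a susceptible person becomes infected
    b: per-step probability that an infected person recovers
    Recovered people never return to the susceptible pool.
    """
    s, i, r = float(population), 0.0, 0.0
    history = [(s, i, r)]
    for _ in range(steps):
        newly_infected = a * s
        newly_recovered = b * i
        s -= newly_infected
        i += newly_infected - newly_recovered
        r += newly_recovered
        history.append((s, i, r))
    return history


if __name__ == "__main__":
    # A and the lower value of B quoted above; the population size
    # is purely illustrative.
    for step, (s, i, r) in enumerate(simulate_sir(1_000_000, 1e-4, 0.65, 10)):
        print(step, round(s), round(i, 1), round(r, 1))
```

Under these assumptions the infected pool settles at roughly A/B times the susceptible pool, so with B well above a half only a small number of people are in the infected state at any one time, while the recovered pool grows steadily.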

This model suggests some interesting possibilities. First, it suggests that radicalisation is a state that can cure itself; in other words, we shouldn’t necessarily assume that once radicalised means always radicalised. Second, there may be a greater pool of people who pass through the stage of being radicalised but do not get it together to actually act on it before the fever breaks, perhaps because they don’t get the right training or the right opportunity at the time when they would exploit it if they could.

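The numbers work out about right. There are around a million Muslims in the U.S.; at an infection rate of about 1 in 10,000 that would suggest something like a hundred people passing through the radicalised state, and the number who have carried out (or attempted to carry out) attacks is a small number of dozens.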

Questions are data too

In the followup investigation of the Boston Marathon bombings, we see again the problem that data analytics has with questions.

Databases are built to store data. But, as Jeff Jonas has most vocally pointed out, simply keeping the data is not enough in adversarial settings. You also need to keep the questions, and treat them as part of the ongoing data. The reason is obvious once you think about it: intelligence analysts need not only to know the known facts; they also need to know that someone else has asked the same question they just asked. Questions are part of the mental model of analysts, part of their situational awareness, but current systems don’t capture this part and preserve it so that others can build on it. In other words, we don’t just need to connect the dots; we need to connect the edges!

Another part of this is that, once questions are kept, they can be re-asked automatically. This is immensely powerful. At present, an analyst can pose a question (“has X ever communicated with Y?”), get a negative answer, only for information about such a communication to arrive a microsecond later and not be noticed. In fast changing environments, this can happen frequently, but it’s implausible to expect analysts to remember and re-pose their questions at intervals, just in case.
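As a rough illustration of what keeping and re-asking questions might look like, here is a small sketch. Everything in it (the class names, the idea of a fact as a sender/recipient pair, the `communicated` helper) is hypothetical and invented for this post; it is not a description of any real analysis system.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Communication = Tuple[str, str]  # (sender, recipient); a stand-in for real data


@dataclass
class StandingQuestion:
    analyst: str
    text: str
    predicate: Callable[[Communication], bool]
    answered: bool = False


@dataclass
class QuestionStore:
    facts: List[Communication] = field(default_factory=list)
    questions: List[StandingQuestion] = field(default_factory=list)

    def ask(self, question: StandingQuestion) -> bool:
        """Answer against what is known now, and keep the question."""
        question.answered = any(question.predicate(f) for f in self.facts)
        self.questions.append(question)
        return question.answered

    def add_fact(self, fact: Communication) -> List[StandingQuestion]:
        """Store a new fact and re-ask every open question against it.

        Returns the questions that this fact has just answered, so the
        analysts who asked them can be told.
        """
        self.facts.append(fact)
        newly_answered = [
            q for q in self.questions if not q.answered and q.predicate(fact)
        ]
        for q in newly_answered:
            q.answered = True
        return newly_answered

    def who_asked(self, text: str) -> List[str]:
        """Analysts who have already posed a question with this wording."""
        return [q.analyst for q in self.questions if q.text == text]


def communicated(x: str, y: str) -> Callable[[Communication], bool]:
    """Predicate for 'has x ever communicated with y?' in either direction."""
    return lambda c: set(c) == {x, y}
```

An analyst who asks “has X ever communicated with Y?” gets a negative answer at first, but the question stays in the store; when a matching communication arrives later, `add_fact` returns that question so it can be flagged instead of going unnoticed.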

We still have some way to go with the tools and techniques available for intelligence analysis.

Language learning as a model of radicalisation

The Canadian Prime Minister said today, in response to the arrests for the planned Via Rail attacks, and perhaps to the Boston Marathon bombings as well, that these are not a reason to “commit sociology”. I think he’s exactly right. As I said in the previous post, I’m dubious that levels of dissatisfaction with societies, or even with religions, play a major role in radicalisation — it’s a much more individual-specific process. This is why only a tiny fraction of people in exactly the same social, religious, and even family setting become radicalised.

I’m also deeply skeptical that anyone becomes radicalised via the Internet. Our survey results indicated that variations in access to the Internet, or to mass media channels that have a frankly jihadist orientation, have no correlation with attitudes on radicalisation-relevant subjects or with dissatisfaction of any kind. I’m convinced that it always takes contact with a person, perhaps only one and perhaps only once, for radicalisation to happen.

Here’s where the analogy with language learning comes in. I learned French (in Australia) the same way I learned Latin (declensions, conjugations, agreement). I read French well and could speak it after a fashion. But the first time I heard French radio and then met people who actually spoke French, there was a kind of click in my brain and something changed about the way I used and learned French. I don’t think this is just autobiography; as I mentioned in the last post, learning languages via TV programs doesn’t work nearly as well as you might expect it to.

I’m fairly convinced something similar happens with radicalisation. An individual can watch the videos, talk the talk, fantasise the actions, but unless/until they make contact with someone who has actually done something, there isn’t any danger. Once this happens, of course, radicalisation can proceed very quickly indeed, which explains (I guess) the several cases where apparent changes have been very swift.

Radicalisation

(Yes, it is spelled like that.)

The events of the past ten days have revived interest in the process of radicalisation. Why and how do people move from apparently normal to wanting to blow other people up, especially for a goal that most of them would be hard put to explain coherently?

There are plenty of grand theories of radicalisation. Often they are derived from and supported by the narratives of those who have become radicalised, been arrested, and then interviewed. The trouble with building theories this way is that they fail to explain all of the apparently identical people in the same situation who didn’t become radicalised. From a population of 10,000, maybe 1,000 will turn out to demonstrate in favour of some apparent injustice; of these, maybe 100 will find out more and acquire ideas supporting violent extremism, and of these 10 will even consider actually doing something. Even then, only a couple of these will get very far down the path to planning and carrying out an attack. So what makes the difference? Which ones never get beyond vague feelings of support? The grand theories have little to say about what doesn’t cause radicalisation.

By the way, many of you probably filled in an Islamist back story for the previous paragraph. But it’s important to remember that violent extremism is also used by animal rights supporters, anti-globalisation protesters, anarchists, right-wing ideologues, and others. So explanations of radicalisation have to work for these other groups too.

What is becoming clear is that the explanations for radicalisation have a small ball component, that is, they derive from the personalities and immediate social settings of the people involved (family, rather than social group). Many teenagers and young adults feel out of place in society as they grow up. If you’re the child of an immigrant, especially one whose parents didn’t manage to settle cleanly in their new country, it’s easy to blame the “out of place” feeling on the society rather than on yourself. Even for those who aren’t children of immigrants, there’s a certain attraction to this extrinsic view. If there’s nothing wrong with you, there must be something wrong with the society around you. Narratives that play to this are naturally attractive, whether Islamist or anarchist.

Overwhelmingly, also, those who radicalise are men; and, if they have post-secondary education, it tends to be in engineering and the hard sciences rather than in the humanities and social sciences. Both of these suggest a lack of nuance in thinking socially.

Although it might be conceivable for an individual to self-radicalise, this seems extremely difficult. It’s not obvious, but it’s really hard to learn a foreign language by watching TV, and something similar seems to be true of visiting jihadist web sites. Those who have become radicalised seem to need to be involved with a small group (2 is enough, but 5 is better) to amplify each other’s thoughts. And there seems to be a need for some kind of mentor figure, preferably an actual person, although here it’s just possible that an online figure can serve.

We have a paper coming out (eventually; the Dec 2012 issue has still to appear) in the Canadian Journal of Political Science reporting on the results of a survey of attitudes in the Islamic community in Canada. These results make it fairly clear that dissatisfaction, broadly defined, has little to do with radicalisation. People don’t come to support violent extremism because they’re unhappy with government programs; nor, much, because they’re unhappy with the morality of society around them. Explanations have more to do with them as individuals than with society as a whole.


