Posts Tagged 'crime'

Looking for Bad Guys III: Using manipulation

Bad guys who are aware that knowledge-discovery tools will be used to look for them may also try to actively manipulate the process to their own advantage.

One way to do this is to get an insider working for them, someone who can alter the data or the results of the analysis to their benefit. This is probably the most common method: throughout history, probably more sieges have succeeded because someone opened the gates from the inside than because the walls were broken through. It’s easy to get caught up in the cleverness of technology and forget that sometimes suborning someone is the easiest attack.

However, the focus of this blog is knowledge discovery, so let me concentrate on that. Before we talk about how manipulation can be exploited as a discovery tool, we need to talk about what manipulation looks like; and before we can do that, we need to think about the structure of the knowledge-discovery process.

It’s helpful to divide up the stages of knowledge discovery into:

  1. Collecting the data (CCTV images, transaction logs);
  2. Analysing the data (the part that’s usually thought of as the heart of knowledge discovery);
  3. Deciding what to do with the results and taking action.

Although an adversary can only attack the process via the data that is collected (assuming they don’t have an insider), it is helpful to think of three different kinds of attack, one directed against each of the three stages. Each kind of attack requires understanding a different aspect of the knowledge-discovery system.

Manipulating the data collection stage is probably the easiest, because it’s often possible to see and understand how the data is being collected. For example, the fields of view of CCTV cameras can usually be inferred from their positions (even if they are enclosed in black plastic bubbles) and so ways to move around them without coming into view can be worked out. Alternatively, disguises can be used to conceal who is being seen, even though an image is captured. One of the reasons identity theft is a big business is that it provides a way to have data captured about you, but data that is useless because it doesn’t connect to the real you.

Manipulating the decision and action stage is done using social engineering. This means trying to create the impression in the minds of the people who are making the decisions and taking the actions that the analysis system has made an error.

Manipulating the analysis stage is easier than it should be. This is because most knowledge-discovery technology has been tuned to give good results on data with natural variation. That creates an opportunity to insert data that is the worst possible from the point of view of the algorithms, which lets bad guys hide their traces.
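
As a rough illustration of what "worst possible" data can do, here is a minimal sketch. It assumes a detector that flags any record more than three standard deviations from the mean; that detector, the transaction amounts, and the decoy records are all made up for the example and are not taken from any real system. A handful of extreme decoys inflates the standard deviation so much that the genuinely suspicious record no longer stands out.

```python
import statistics


def zscore_flags(values, threshold=3.0):
    """Flag each value lying more than `threshold` standard deviations from the mean."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    return [abs(v - mean) / stdev > threshold for v in values]


# Hypothetical transaction amounts showing only natural variation.
normal = [100, 102, 98, 101, 99, 103, 97, 100, 101, 99,
          100, 102, 98, 101, 99, 100, 103, 97, 100, 100]

# The record the adversary wants to hide (hypothetical).
suspicious = 160

# Without manipulation, the suspicious record is the last one in the data.
clean = normal + [suspicious]
caught_without_manipulation = zscore_flags(clean)[-1]

# With manipulation: the adversary also injects a few extreme decoy records,
# chosen to stretch the mean and standard deviation the detector relies on.
decoys = [400, 420, 380]
manipulated = normal + [suspicious] + decoys
caught_with_manipulation = zscore_flags(manipulated)[len(normal)]

print("caught without manipulation:", caught_without_manipulation)
print("caught with manipulation:   ", caught_with_manipulation)
```

The detector is only as good as the statistics it is built on, and here those statistics are computed from data the adversary partly controls.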

The technology used for knowledge discovery needs to be completely rethought to take manipulation into account. This is primarily why adversarial knowledge discovery is not just another application of knowledge discovery, but a completely different problem.

The good part about this is that attempts at manipulation also create an abnormal signature in the data; and the process can be tuned to look for this signature as well.
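
One possible way to look for such a signature, sketched under my own assumptions rather than as an established method for this problem, is to compare an ordinary estimate of spread (the standard deviation) with a robust one (the median absolute deviation, rescaled by the usual 1.4826 factor for normally distributed data). On naturally varying data the two roughly agree; injected extreme records pull them far apart. The data and the function name `spread_disagreement` are hypothetical.

```python
import statistics


def spread_disagreement(values):
    """Ratio of the standard deviation to a robust, MAD-based estimate of spread.

    A ratio near 1 suggests natural variation; a large ratio suggests that a
    few extreme records are dominating the ordinary statistics.
    """
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        raise ValueError("MAD is zero; the robust scale is undefined for this data")
    robust_scale = 1.4826 * mad  # consistency factor for normally distributed data
    return statistics.pstdev(values) / robust_scale


# Hypothetical transaction amounts showing only natural variation.
natural = [100, 102, 98, 101, 99, 103, 97, 100, 101, 99,
           100, 102, 98, 101, 99, 100, 103, 97, 100, 100]

# The same data after an adversary adds a suspicious record and three decoys.
manipulated = natural + [160, 400, 420, 380]

ratio_natural = spread_disagreement(natural)
ratio_manipulated = spread_disagreement(manipulated)

print(f"natural data:     ratio = {ratio_natural:.2f}")
print(f"manipulated data: ratio = {ratio_manipulated:.2f}")
```

A large ratio doesn't say who did the manipulating, only that the data no longer looks like natural variation, which is a reason to look more closely.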

What this blog is about

All of us leave traces in the data that we create, either intentionally or as a side-effect of the things we do in the world — walking in front of a CCTV camera, turning on a cell phone, or whatever.

Lots of this data is analyzed, for example by businesses that want to build a relationship with their customers.

I’m interested in the special case where some of the people about whom data is collected want to hide their existence, what they are like, and what they are doing, usually because they are up to no good.

In such situations, the way the data is collected, the way it is analyzed, and the decisions that are taken as a result all have to be rethought to take account of the adversarial nature of the situation.

I’m interested in how to do knowledge discovery in these adversarial situations, and this blog will talk about the issues, the technologies, and some of the known results.

Adversarial situations include:

  • crime;
  • fraud (medical, insurance);
  • money laundering;
  • organizational malfeasance;
  • industrial espionage;
  • national defence; and
  • counterterrorism.

What bad guys do in these situations has huge costs. The cost of terrorism is obvious, but it’s less well known that fraud costs an estimated 12% of GDP in developed economies.

Of course, the process of collecting and analyzing data is not necessarily benign, and many people have privacy concerns. We’ll talk about them too.