Posts Tagged 'interestingness'

Modelling expectations to help focus

I’ve argued for, and am struggling to build, knowledge-discovery systems that can inductively decide which parts of the available data, and which knowledge emerging from it, are likely to be most ‘interesting’, so that an intelligence analyst can be guided to focus his or her limited attention there.

One important way of approaching this, which has the added advantage of hardening the system against being systematically misled (a particular problem in adversarial settings), is to build in ways of considering what should happen. In other words, alongside the ‘main’ modelling process, there should also be auxiliary models that constantly project what the incoming data, the main model, and the results should look like.

So I was interested to see the discussion in New Scientist of work showing that the human brain appears to do exactly this — we predict what the scenes we are looking at should look like, presumably so that we can divert resources to aspects of the scene that don’t match this expectation. Now if only we could make this computational…

The article is here.

Estimating the significance of a factoid

You only have to mention Palantir to attract lots of traffic — oops, I did it again 🙂

Those of you who’ve been following along know that I’m interested in tools that help an analyst decide how to treat a new piece of information that arrives in an analysis system from the outside world. Many analysis tools provide exactly nothing to support analysts with this task — new data arrives and is stored away in the system, but analysts can only discover it by issuing a query whose results happen to include the new data.

The next level of tool allows persistent queries: an analyst asks about some topic, and the system remembers the query. If new data appears that would have matched it, the system notifies the analyst (think Google Alerts). This is a big step up from the analyst’s point of view. Jeff Jonas has argued that, in fact, queries and data should be treated symmetrically, so that queries themselves can be stored and queried. For example, it may be significant for an analyst that another analyst has made the same or a similar query.
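A minimal sketch of how such a persistent-query store might work. All the names and the keyword-overlap matching here are hypothetical; a real system would match far more richly:

```python
# Sketch of a persistent-query store: queries are saved, matched against
# newly arriving data, and (following Jonas) are themselves queryable data.
from dataclasses import dataclass, field


@dataclass
class SavedQuery:
    analyst: str
    terms: frozenset  # keywords the analyst asked about


@dataclass
class QueryStore:
    queries: list = field(default_factory=list)

    def register(self, analyst, terms):
        q = SavedQuery(analyst, frozenset(terms))
        self.queries.append(q)
        return q

    def on_new_data(self, record_terms):
        """Return the saved queries the new record would have matched."""
        return [q for q in self.queries if q.terms & set(record_terms)]

    def similar_queries(self, terms):
        """Queries are data too: find other analysts asking about a topic."""
        return [q for q in self.queries if q.terms & set(terms)]


store = QueryStore()
store.register("alice", {"acme", "shipping"})
store.register("bob", {"acme", "finance"})
matches = store.on_new_data({"acme", "routes"})  # both queries match on "acme"
```

Because queries are stored as plain records, `similar_queries` can answer Jonas’s question (who else has asked about this topic?) with the same machinery used to match new data.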

However, this still requires analysts to manage their set of interests quite explicitly, and over potentially long periods of time. We humans are not very good at managing the state of multiple mental projects, a fact that has made David Allen a lot of money. In the simplest case, if a system tells an analyst about a new result for a query made months ago, it may take a long time to reconstruct the situation that prompted the original query, and so a long time to estimate the significance of the new information.

I don’t have a silver bullet to solve this class of problems. But I do think that it’s essential that tools become more proactive so that some of the judgement of how significant a newly arrived fact is can be made automatically and computationally. Of course, this is deeply contextual and very difficult.

It does seem helpful, though, to consider the spectrum of significance that might be associated with a new factoid. Let me suggest the following spectrum:

  • Normal. Such a factoid is already fully accounted for by the analyst’s existing mental model or situational awareness. Its significance is presumptively low. Often this can be estimated fairly well using the equivalences common = commonplace = normal: if it resembles a large number of previous normal factoids, then it’s a normal factoid.
  • Anomalous. Such a factoid lies outside the normal but is ‘so close’ that it is best accounted for as a small deviation from normal. It’s the kind of factoid for which a plausible explanation is easy to come up with in a very short time frame.
  • Interesting. Such a factoid calls into question the accuracy or completeness of the existing model or situational awareness — something has been missed, or the structure of the model is not what it appeared to be.
  • Novel. Such a factoid does not resemble any of those used to build the model or situational awareness in the first place, so its significance cannot be assessed within the current framework. The model must be incomplete in some substantial way.
  • Random. Stuff happens and some factoids will be so unusual that they have nothing to say about the existing model.
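To make the spectrum concrete, here is a toy sketch of placing a factoid into one of the five categories, assuming we can score how well the existing model accounts for it and how much it resembles the data the model was built from. The scores and thresholds are entirely illustrative assumptions:

```python
def classify_factoid(model_likelihood, training_similarity,
                     normal_cut=0.7, anomaly_cut=0.3, novelty_cut=0.2):
    """Place a factoid on the normal..random spectrum.

    model_likelihood: how well the existing model accounts for it (0..1).
    training_similarity: how much it resembles the data the model was
    built from (0..1). Thresholds are illustrative, not principled.
    """
    if model_likelihood >= normal_cut:
        return "normal"            # fully accounted for by the model
    if model_likelihood >= anomaly_cut:
        return "anomalous"         # a small, easily explained deviation
    if training_similarity >= novelty_cut:
        return "interesting"       # the model mis-handles familiar-looking data
    if training_similarity > 0.05:
        return "novel"             # unlike anything the model was built from
    return "random"                # so unusual it says nothing about the model
```

The cascade reflects the spectrum: the categories shade into one another, and only the placement of the (arbitrary) cut points separates them.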

This is a spectrum, so there are no natural boundaries between these categories — and yet the actions that follow do depend on which of these five categories a factoid is placed in.

What makes estimating the significance of a new factoid difficult is that significance is greatest for the middle categories and lowest for the extremal ones. Both normal and random factoids are insignificant, while interesting and novel ones are the most significant. Many of the obvious technologies, intrusion detection systems for example, take a more monotonic view: the more unusual, the more alarming. But we know several techniques for measuring significance that have the right qualitative properties, and these make it plausible that we can build systems that present analysts with new factoids along with an indication of their presumptive significance.
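One simple way to get the right qualitative shape is to make significance a bump, rather than a monotone function, of a surprise score. This is only an illustration of the shape; the centre and width below are arbitrary assumptions, not a principled measure:

```python
import math


def significance(surprise):
    """Map a surprise score in [0, 1] to presumptive significance.

    0 ~ normal, 1 ~ random noise. Significance peaks in between,
    unlike monotonic alerting (more surprise = more alarm).
    """
    # A Gaussian bump centred at 0.6: interesting/novel factoids score
    # highest, while the extremes (normal, random) score near zero.
    return math.exp(-((surprise - 0.6) ** 2) / (2 * 0.15 ** 2))
```

Any function with this rise-and-fall profile would do; the point is that the extremes of the spectrum are mapped to low significance and the middle to high.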

Low Hanging Fruit in Cybersecurity III

Any attempt to decide whether a particular action is “bad” or “good” requires some model of what “good” actually means. The only basis for intelligent action in almost any setting is to be able to have a plan for the expected, but also a mechanism for noticing the unexpected — to which some kind of meta-planning can be attached. This is, of course, a crucial part of how we function as humans; we don’t hang as software often does, because if we encounter the unexpected, we do something about it. (Indeed, an argument along this line has been used by J.R. Lucas to argue that the human mind is not a Turing machine.)

But most cybersecurity applications do not try (much) to build a model of what “good” or “expected” or “normal” should be like. Granted, this can be difficult; but I can’t help but think that often it’s not as difficult as it looks at first. Partly this is because of the statistical distribution that I discussed in my last post — although, on the internet, lots of things could happen, most of them are extremely unlikely. It may be too draconian to disallow them, but it seems right to be suspicious of them.

Actually, three different kinds of models of what should happen are needed. These are:

  1. A model of what “normal” input should look like. For example, for an intrusion detection system, this might be IP addresses and port numbers; for a user-behavioral system, this might be executables and times of day.
  2. A model of what “normal” transformations look like. Inputs arriving in the system lead to consequent actions. There should be a model of how these downstream actions depend on the system inputs.
  3. A model of what “normal” rates of change look like. For example, I may go to a web site in a domain I’ve never visited before; but over the course of different time periods (minutes, hours, days) the rate at which I encounter brand new web sites exhibits characteristic patterns.
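The third kind of model can be sketched concretely: track how often brand-new items appear, and flag windows whose novelty rate departs from the usual range. The window size and threshold below are illustrative assumptions:

```python
from collections import deque


class NoveltyRateMonitor:
    """Track the rate at which never-before-seen items (e.g. new web
    domains) appear, and flag unusual bursts of novelty.

    Window size and threshold are illustrative, not calibrated.
    """

    def __init__(self, window=100, max_rate=0.2):
        self.seen = set()
        self.recent = deque(maxlen=window)  # 1 = new item, 0 = familiar
        self.max_rate = max_rate

    def observe(self, item):
        """Record an item; return True if the recent novelty rate is unusual."""
        is_new = item not in self.seen
        self.seen.add(item)
        self.recent.append(1 if is_new else 0)
        rate = sum(self.recent) / len(self.recent)
        return rate > self.max_rate


monitor = NoveltyRateMonitor(window=10, max_rate=0.5)
```

Note that this flags a departure in the rate of change, not any individual new item: a single new site is normal, while a sudden run of them is not.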

An exception to the first model shows that something new is happening in the “outside” world — it’s a signal of novelty. An exception to the second model shows that the system’s model of activity is not rich enough — it’s a signal of interestingness. An exception to the third model shows that the environment is changing.

Activity that does not fit any one of these models should not necessarily be refused or trigger alarms — but it does provide a hook to which a meta-level of analysis can be attached, using more sophisticated models, with new possibilities, that are practical only because they are invoked rarely.

Again, think of the human analogy. We spend a great deal of our time running on autopilot or habit, which saves cognitive effort for situations that don’t need much. But when anything unusual happens, we can quickly snap into a new mode where we can make different kinds of decisions as needed. This isn’t a single two-level hierarchy — in driving, for example, we typically have quite a sophisticated set of layers of attention, and move quickly to more attentive states as conditions require.

Cybersecurity systems would, it seems to me, work much more effectively if they used the combination of models of expected/normal behavior, organized in hierarchies, as their building blocks.