Questionoids as well as factoids

In the previous post I talked about the problem of “connecting the dots” and how this innocuous-sounding phrase conceals problems that we don’t yet know how to solve — some we don’t even know how to attack.

There’s another side to the story: the questions that are asked of the collection of factoids. These matter for two reasons.

1.  Asking a question for which a particular factoid is the answer should perhaps have some impact on the importance/interestingness of that factoid. This isn’t a magic bullet (because unknown unknowns might also be important, but won’t be looked for). But it’s a start.

(Google presumably uses some variant of this idea to weight the importance of web pages. PageRank treats explicitly created links as indicators of importance, but few of us create explicit links any more because it’s easier to go via Google, so some other indicators must surely come into play. I haven’t seen anything public about this.)

2.  Asking a question for which there is no matching factoid does not mean that the question should be discarded (as it is in, e.g., database systems). Rather, such unanswered questions should become data themselves (and so should the answered ones). New factoids should be checked against the aggregate of these questions to see if they match; in other words, all queries should be persistent. That way, if someone asks about X and nothing about X is known, the later appearance of a factoid about X should cause a response to be generated, even long after the analyst originally posed the question. And if there was already a factoid about X, and so an immediate response to the query, a new factoid about X will automatically generate a supplementary response.
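To make the idea of persistent queries concrete, here is a minimal sketch in Python. It is only an illustration under strong assumptions: the `Store` class, its method names, and the notion that a question "matches" a factoid when they share an exact topic string are all invented for this example, and real matching would be far harder.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    """Toy store in which both factoids and questions persist (hypothetical design)."""
    factoids: dict = field(default_factory=dict)   # topic -> list of factoid texts
    questions: dict = field(default_factory=dict)  # topic -> list of analysts who asked
    responses: list = field(default_factory=list)  # (analyst, topic, factoid) tuples

    def ask(self, analyst, topic):
        """Keep the question as data, then answer from whatever is already known."""
        self.questions.setdefault(topic, []).append(analyst)
        for fact in self.factoids.get(topic, []):
            self.responses.append((analyst, topic, fact))

    def add_factoid(self, topic, fact):
        """Keep the factoid, then respond to everyone who ever asked about the topic."""
        self.factoids.setdefault(topic, []).append(fact)
        for analyst in self.questions.get(topic, []):
            self.responses.append((analyst, topic, fact))


store = Store()
store.ask("alice", "X")                    # nothing known yet, so no response
store.add_factoid("X", "X was seen in Y")  # alice gets a delayed response
store.add_factoid("X", "X owns Z")         # and a supplementary one later
```

The point of the design is the symmetry: neither kind of data is thrown away after use, so the order in which a question and its matching factoid arrive does not matter.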

In this view, which was first and most clearly enunciated by Jeff Jonas as part of the NORA and EAS systems, there are two forms of data: factoids and questionoids. When one of each kind “match”, the pair causes a response to the outside world. But both kinds are worthy of meta-analysis, and the results of analysing one kind can be used to change the way the other kind is weighted.
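As one hedged illustration of how analysis of the questions might feed back into the weighting of factoids, the sketch below simply counts how often each topic has been asked about. The function name `weight_factoids`, the exact-topic matching, and the "one plus the number of questions" rule are all assumptions made for this example, not a description of how NORA, EAS, or any other system works.

```python
from collections import Counter


def weight_factoids(factoid_topics, question_topics):
    """Give each factoid topic a weight of 1 plus the number of questions about it."""
    asked = Counter(question_topics)
    # Counter returns 0 for topics nobody asked about, so those keep weight 1.
    return {topic: 1 + asked[topic] for topic in factoid_topics}


weights = weight_factoids(["X", "Y"], ["X", "X", "Z"])
```

Note that the question about Z, which matches no factoid, does not affect any weight here; in the persistent-query view it would still be kept as data in its own right.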

The question data is also interesting in its own right. For example, an analyst may be interested to know that someone else has asked the same question as they did, even if it was asked months ago.


