Democratic debates strategy

In an analysis of the language used by US presidential candidates in the last seven elections, Christian Leuprecht and I showed that there’s a language pattern that predicts the winner, and even the margin. The pattern is this: use lots of positive language, use no negative language at all (even words like ‘don’t’ and ‘won’t’), talk about abstractions, not policy, and don’t talk about your opponent(s). (For example, Trump failed on the fourth point but was good on the others, while Hillary Clinton did poorly on all four.)
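
To make the pattern concrete, here is a minimal sketch of how a speech might be scored along these four dimensions. The word lists are tiny, hypothetical stand-ins for real lexicons, and this is not the method used in the paper; it is just an illustration of the idea.

    # Toy scorer for the four-part pattern (illustrative only; the published
    # analysis used different, more sophisticated measures).
    POSITIVE = {"great", "strong", "best", "win", "hope", "proud", "together"}
    NEGATIVE = {"no", "not", "never", "don't", "won't", "can't", "wrong", "fail"}
    POLICY = {"tax", "healthcare", "tariff", "deficit", "regulation", "bill"}

    def pattern_score(speech, opponents):
        """Rate a speech on the four dimensions; the winning pattern is a high
        positive rate and near-zero rates on the other three."""
        words = [w.strip(".,!?").lower() for w in speech.split()]
        n = len(words) or 1
        return {
            "positive": sum(w in POSITIVE for w in words) / n,
            "negative": sum(w in NEGATIVE for w in words) / n,
            "policy": sum(w in POLICY for w in words) / n,
            "opponent": sum(w in opponents for w in words) / n,
        }

    print(pattern_score("We will win, and this country will be great again", {"clinton"}))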

In some ways, this pattern is intuitive: voters don’t make rational choices of the most qualified candidate — they vote for someone they relate to.

Why don’t candidates use this pattern? Because the media hates it! Candidates (except Trump) fear being labelled as shallow by the media, even though using the pattern helps them with voters. You can see this at work in the way opinion pieces decide who ‘won’ the debates.

The Democratic debates show candidates using the opposite strategy: lots of detailed policy, lots of negativity (what’s wrong that I will fix), and lots of putting each other down.

Now it’s possible that the strategy needed to win a primary is different from the one that wins a general election. But if you want to assess the chances of those who might make it through, this pattern will help to gauge their prospects against Trump in 2020.

Incumbency effects in U.S. presidential campaigns: Language patterns matter, Electoral Studies, Vol. 43, pp. 95–103. https://www.sciencedirect.com/science/article/pii/S0261379416302062


Tips for adversarial analytics

I put together this compendium of things that are useful to know for those starting out in analytics for policing, signals intelligence, counterterrorism, anti-money-laundering, cybersecurity, and customs; and which might be useful to those using analytics when organisational priorities come into conflict with customers’ (as they almost always do).

Most of the content is either tucked away in academic publications, not publishable by itself, or common knowledge among practitioners but not written down.

I hope you find it helpful (pdf): Tips for Adversarial Analytics

Unexplained Wealth Orders

Money-laundering detection conventionally focuses on finding the proceeds of crime. It has two deterrent effects: the proceeds are confiscated, so that ‘crime doesn’t pay’; and discovering the proceeds can be used to track back to find the crime, and the criminals who produced it.

Since criminals prefer not to leave traces, the proceeds of crime used to be primarily in cash — think drug dealers. As a result, criminals tended to accumulate large amounts of cash. To get any advantage from it, they had three options: spend it in a black economy, insert it into the financial system, or ship it to another country where its origin would be obscured.

Money-laundering detection used to concentrate on these mechanisms. Many countries have made an effort to stamp out the cash economy for large-scale purchases (jewels, cars, houses, art) by requiring cash transactions above a certain size to be reported, and by removing large-denomination currency from circulation (so that moving cash requires a larger, more obtrusive volume). Most countries also require large cash deposits to banks to be reported. Preventing the transport of cash across borders is more difficult — many countries have exit and entry controls on cash carried by travellers, but do much less well at interdicting containers full of cash.

One reason why much of current money-laundering detection is ineffective is that there are now wholesale businesses that provide money laundering as a service: give them your illicit money, and they’ll give you back some fraction of it in a way that makes it seem legitimate. These businesses break the link between the money and the crime, making prosecution almost impossible since there’s no way to draw a line from the crime to the money.

Unexplained wealth orders target the back end of the process instead. They require people who have and spend money in quantity to explain how they came by it, even if the money is in the financial system and its provenance appears plausible. This is extremely effective, because it means that criminals cannot easily spend their ill-gotten gains without risking their confiscation.
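
As a rough sketch of the back-end logic, imagine flagging anyone whose observed spending is far out of line with their declared income. The data fields and the threshold here are hypothetical; real programmes rest on much richer evidence and legal process.

    from dataclasses import dataclass

    @dataclass
    class Subject:
        name: str
        declared_income: float    # e.g. from tax filings
        observed_spending: float  # e.g. from asset and purchase records

    def flag_for_uwo(subjects, ratio=3.0):
        """Flag subjects whose spending dwarfs their declared income.
        The ratio is a hypothetical tuning parameter: 3.0 means spending three
        times one's declared income triggers a request for an explanation."""
        return [s for s in subjects if s.observed_spending > ratio * s.declared_income]

    flagged = flag_for_uwo([
        Subject("A", declared_income=40_000, observed_spending=500_000),
        Subject("B", declared_income=90_000, observed_spending=95_000),
    ])
    print([s.name for s in flagged])  # ['A']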

Of course, this is not a new idea. Police have always kept a lookout for people who seemed to have more money than they should when trying to figure out who had committed a bank robbery or something similar.

The new factor in unexplained wealth orders is that the burden of proof shifts to the person spending the money to show that they came by it legitimately, rather than being on law enforcement to show that the money is the proceeds of crime (which no longer works, because of the middlemen mentioned above). This creates a new problem for criminals.

Of course, the development and use of unexplained wealth orders raises questions of civil liberties, especially when the burden of proof shifts from one side to the other. However, unexplained wealth has always attracted the attention of taxation authorities, so these orders aren’t perhaps as new as they seem. Remember, Al Capone was charged with tax evasion, not racketeering.

Unexplained wealth orders seem like an effective new tool in the arsenal of money-laundering detection. They deserve to be considered carefully.

What causes extremist violence?

This question has been the subject of active research for more than four decades. There have been many answers that don’t stand up to empirical scrutiny, partly because the number of those who participate in extremist violence is so small, and partly because researchers tend to interview those who committed violence but fail to interview all of the otherwise identical people who didn’t.

Here’s a list of the properties that we now know don’t lead to extremist violence:

  • ideology or religion
  • deprivation or unhappiness
  • political/social alienation
  • discrimination
  • moral outrage
  • activism or illegal non-violent political action
  • attitudes/belief

How do we know this? Mostly because, if you take a population that exhibits any of these properties (typically many hundreds of thousands of people), you find that one or two have committed violence, but the others haven’t. So properties such as these have absolutely no predictive power.
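
The base-rate arithmetic is worth making explicit, with hypothetical but realistic numbers:

    # Hypothetical numbers: a property shared by 500,000 people,
    # two of whom have committed violence.
    with_property = 500_000
    violent = 2

    p_violence_given_property = violent / with_property
    print(p_violence_given_property)  # 0.000004, i.e. 4 in a million

    # A 'predictor' that flags everyone with the property produces
    # 499,998 false positives for every 2 true positives.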

On the other hand, there are a few properties that do lead to extremist violence:

  • being the child of immigrants
  • having access to a local charismatic figure
  • travelling to a location where one’s internal narrative is reinforced
  • participation in a small group echo chamber with those who have similar patterns of thought
  • having a disconnected-disordered or hypercaring-compelled personality

These don’t form a diagnostic set, because there are still many people who have one or more of them, and do not commit violence. But they are a set of danger signals, and the more of them an individual has, the more attention should be paid to them (on the evidence of the past 15 years).

You can find a full discussion of these issues, and the evidence behind them, in “Terrorists, Radicals, and Activists: Distinguishing Between Countering Violent Extremism and Preventing Extremist Violence, and Why It Matters” in Violent Extremism and Terrorism, Queen’s University Press, 2019.

Detecting abusive language online

My student, Hannah Leblanc, has just defended her thesis on predicting abusive language. The document is at:

https://qspace.library.queensu.ca/handle/1974/26252

Rather than treat this as an empirical problem — gather all the signal you can, select attributes using training data, and then build a predictor using those attributes — she started with models of what might drive abusive language. In particular, abuse may be associated with subjectivity (objective language is less likely to be abusive, even if it contains individual words that might look abusive) and with otherness (abuse often results from one group targeting another). She also looked at emotion and mood signals and their association with abuse.
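
Here is a minimal sketch of what model-driven features like these might look like. The mini-lexicons are hypothetical stand-ins, not the resources used in the thesis:

    # Hypothetical mini-lexicons; the thesis used much larger, established resources.
    SUBJECTIVE = {"awful", "amazing", "hate", "love", "disgusting", "stupid"}
    OTHERING = {"they", "them", "those", "you", "your"}  # out-group pronouns
    IN_GROUP = {"we", "us", "our"}

    def model_features(text):
        """Features driven by models of abuse: subjectivity and otherness."""
        words = [w.strip(".,!?").lower() for w in text.split()]
        n = len(words) or 1
        return {
            "subjectivity": sum(w in SUBJECTIVE for w in words) / n,
            "otherness": (sum(w in OTHERING for w in words)
                          - sum(w in IN_GROUP for w in words)) / n,
        }

    print(model_features("You people are stupid"))           # high on both signals
    print(model_features("The report appeared on Tuesday"))  # low on both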

All of the models perform almost perfectly at detecting non-abuse; they struggle more with detecting abuse. Some of this comes from mislabelling (documents marked as abusive that really aren’t), but much of the rest comes from missing signal: abusive words disguised so that they don’t match the words of a lexicon.

Overall, the model achieves an accuracy of 95% and an F-score of 0.91.

Software quality in another valley

In the mid-90s, there was a notable drop in the quality of software, more or less across the board. The thinking at the time was that this was a result of software development businesses (*cough* Microsoft) deciding to hire physicists and mathematicians because they were smarter (maybe) than computer scientists and, after all, building software was a straightforward process as long as you were smart enough. This didn’t work out so well!

But I think we’re well into a second version of this pattern, driven by a similar misconception — that coding is the same thing as software design and engineering — and a woeful ignorance of user interface design principles.

Here are some recent examples that have crossed my path:

  • Constant redesigns of web interfaces that don’t change the functionality, but move all of the relevant parts somewhere else on the screen, forcing users to relearn how to navigate.
  • A complete disregard of Fitts’s Law (the bigger the target, the faster you can hit it with mouse or finger; see the sketch after this list), especially, and perhaps deliberately, for the targets that dismiss a popup. Example: CNN’s ‘breaking news’ banner across the top.
  • My Android DuckDuckGo app has a 50:50 chance, when you hit the back button, of going back a page or dropping you out of the app.
  • The Fast Company web page pops up two banners on every page (one inviting you to sign up for their email newsletter and one saying that they use cookies) which together consume more than half of the real estate. (The cookies popup fills the screen in many environments; thanks, GDPR!)
  • If you ask Alexa for a news bulletin, it starts at the moment you stopped listening last time — except that that was typically yesterday, so it essentially starts at a random point. (Yes, it tells you that you can listen to the latest episode, but the spell it requires to make this happen is so unclear I haven’t worked it out.)
  • And then there are all the little mysteries: Firefox add-ins that seem to lose some of their functionality; Amazon’s Kindle deal-of-the-day site that doesn’t list the deals of the day.
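
For the record, Fitts’s Law can be stated precisely: movement time grows with the distance to a target and shrinks with its width, roughly T = a + b·log2(D/W + 1). A quick sketch (with illustrative pixel sizes) shows why shrinking the button that kills a popup hurts:

    import math

    def index_of_difficulty(distance, width):
        """Fitts's index of difficulty (Shannon formulation), in bits.
        Movement time is roughly a + b * this value, for device constants a, b."""
        return math.log2(distance / width + 1)

    # Same travel distance, two target sizes (pixel values are illustrative):
    print(index_of_difficulty(800, 200))  # wide banner: ~2.3 bits
    print(index_of_difficulty(800, 16))   # tiny close button: ~5.7 bits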

There are several possible explanations for this wave of quality loss. The first is the one I suggested above: that there’s an increase in unskilled software builders who are just not able to build robust products, especially apps. About a third of our undergraduates seem to have side hustles building apps, and the quality of the academic work they submit doesn’t suggest that these apps represent value for money, even if free.

Second, it may be that the environments in which new software is deployed have reached a limit where robustness is no longer plausible. This could be true at the OS level (e.g. Windows), or phone systems (e.g. Android) or web browsers. In all of these environments the design goal has been to make them infinitely extensible but also (mostly) backwards compatible — and maybe this isn’t really possible. Certainly, it’s easy to get the impression that the developers never tried their tools — “How could they not notice that?” is a standard refrain.

Third, it may be that there’s a mindset among the developers of free-to-the-user software (where the payment comes via monetising user behaviour) that free software doesn’t have to be good software — because the punters will continue to use it, and how can they complain?

Whichever of these explanations (or some other one) is true, it looks like we’re in for a period in which our computational lives are going to get more irritating and expensive.

‘AI’ performance not what it seems

As I’ve written about before, ‘AI’ tends to be misused to refer to almost any kind of data analytics or derived tool — but let’s, for the time being, go along with this definition.

When you look at the performance of these tools and systems, it’s often quite poor, but I claim we’re getting fooled by our own cognitive biases into thinking that it’s much better than it is.

Here are some examples:

  • Netflix’s recommendations for any individual user seem to overlap 90% with the ‘What’s trending’ and ‘What’s new’ categories. In other words, Netflix is recommending to you more or less what it’s recommending to everyone else. Other recommendation systems don’t do much better (see my earlier post on ‘The Sound of Music Problem’ for part of the explanation).
  • Google search results are quite good at returning, in the first few links, something relevant to the search query, but we don’t ever get to see what was missed and might have been much more relevant.
  • Google News produces what, at first glance, appear to be quite reasonable summaries of recent relevant news, but when you use it for a while you start to see how shallow its selection algorithm is — putting stale stories front and centre, and occasionally producing real howlers: weird stories from some tiny venue treated as if they were breaking and critical news.
  • Self-driving cars that perform well, but fail completely when they see certain patches on the road surface. Similarly, facial-recognition systems that fail when the human is wearing a t-shirt with a particular patch.

The commonality between these examples, and many others, is that the assessment from use is, necessarily, one-sided — we get to see only the successes and not the failures. In other words (HT Donald Rumsfeld), we don’t see the unknown unknowns. As a result, we don’t really know how well these ‘AI’ systems really do, and whether it’s actually safe to deploy them.

Some systems are ‘best efforts’ (Google News) and that’s fair enough.

But many of these systems are beginning to be used in consequential ways and, for that, real testing and real public test results are needed. And not just true positives, but false positives and false negatives as well. There are two main flashpoints where this matters: (1) systems that are starting to do away with the human in the loop (self-driving cars, 737 Maxs); and (2) systems where humans are likely to say or think ‘The computer (or worse, the AI) can’t be wrong’; and these are starting to include policing and security tools. Consider, for example, China’s social credit system. The fact that it gives low scores to some identified ‘troublemakers’ does not imply that everyone who gets a low score is a troublemaker — but this false implication lies behind this and almost all discussion of ‘AI’ systems.
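
A sketch with made-up numbers shows why. Suppose a screening system has 99% sensitivity and a 1% false-positive rate, applied to a population of a million in which 1,000 people are genuine troublemakers:

    # Hypothetical numbers for a rare behaviour:
    population, actual = 1_000_000, 1_000

    tp = round(0.99 * actual)                 # troublemakers correctly flagged: 990
    fp = round(0.01 * (population - actual))  # innocents flagged anyway: 9,990

    precision = tp / (tp + fp)
    print(f"flagged: {tp + fp}, genuine among them: {tp}")
    print(f"P(troublemaker | flagged) = {precision:.3f}")  # ~0.090

Fewer than one in ten of those flagged is a genuine troublemaker, even though the system sounds 99% accurate. That is why false positive and false negative rates, not just successes, need to be published.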

