Latest Report from the UK on Huawei Quality

You may recall that the UK set up, now 10 years ago, a joint centre between its National Cyber Security Centre and Huawei to build confidence in the quality of Huawei’s products in secure applications. I’ve posted on the contents of previous annual reports, which commented on some serious weaknesses in Huawei’s products.

This year’s report is now out: https://www.gov.uk/government/publications/huawei-cyber-security-evaluation-centre-hcsec-oversight-board-annual-report-2021

It reports that the issues described in previous years have been addressed, but then concludes, curiously: “there has been no overall improvement over the course of 2020 to meet the product software engineering and cyber security quality expected by the NCSC”. I can only conclude that Huawei addressed the symptoms (the specific problems identified in previous reports) without making the wider process changes that would have produced better quality in the first place. One step forward, one step back.

Lessons from the recent large-scale cyber attacks

There have been three major cyber attacks in the past year: SolarWinds, the Microsoft Exchange attack, and the Pulse Secure VPN attack. These attacks all appear to have been (a) state sponsored, (b) long lasting, (c) high impact, and (d) carefully engineered.

What can we learn from what seems to be a new level of active cyber operations? Here are some of the lessons:

  • The fact that such consequential targets exist (and are vulnerable) is the result of natural but dangerous incentives to develop monocultures. It’s a truism that convenience trumps security. And one consequence is that it’s always easier to piggyback on something that’s already been done than to develop a new solution. So we have only a few processor technologies, a few operating systems, a few transport protocols, a few cloud providers; and now only a few large-scale systems management environments. Choosing to use the existing systems is almost always cost effective, but the hidden cost is the loss of resilience that comes from putting all of our computational eggs in one basket.

    Those who build these pervasive systems have failed to appreciate that with great power comes great vulnerability, and so great requirements for security. Instead these systems rely on the same barely workable mechanisms that are used in the rest of cyberspace.
  • Government cybersecurity organisations have, with the best of intentions, worked themselves into an untenable situation with respect to protecting against such attacks. It was natural in, say, the Five Eyes countries to give the responsibility for cybersecurity to the signals intelligence organisations (NSA, GCHQ, ASD, CSE). After all, they had the expertise with digital communication of all kinds, and they used cyber tools for their own surveillance and espionage purposes, and so they were experts in many of the issues and technologies.

    However, these signals intelligence agencies are constrained to act only against those outside the countries they belong to (with some brief exceptions post 9/11). When they spun off cybersecurity centres, to protect their domestic environments against cyber attacks, they found themselves in a legal and procedural no-man’s-land where they, in general, didn’t have the ability to act in any meaningful way, and were reduced to an educational mandate. So we have the (interesting and novel) spectacle of remediation of systems affected by the Microsoft Exchange hack by the FBI, not the NSA (although surely the NSA was involved).

    Western countries need to (re)think the role that their cybersecurity centres will play in the face of serious, large-scale, state-driven cyber attacks — not just practically but legislatively.

Are beneficial ownership registries the solution to money laundering?

Prosecuting money laundering requires making a connection between a crime (the ‘predicate offence’) and somebody who owns the proceeds.

Criminals hide the connection between the origin of the proceeds and their availability for use by creating a chain (and often now a network) of movements, each step of which makes it harder to connect the outcome with the origin. Traditionally, this meant moving cash into the financial system, shuffling it around, possibly internationally, and using it to buy expensive objects (cars, houses, jewels, businesses) that seemed increasingly innocent as their distance from the original crime increased.

Financial intelligence units try to follow these chains, but it’s easy to see that the advantage lies with the criminals. Adding another link to the chain is easy, but finding that it exists is much harder.

Criminals are especially interested in mechanisms that completely obscure links in the chain. Changing ownership is one way to do this. A simple way to change ownership is to put the value in someone else’s name, perhaps a spouse or relative. But the mechanism of corporate entities makes this much easier. Networks of businesses can be constructed, and own one another, making it extremely difficult to work out who is the ‘real’ owner — the so-called beneficial owner. (Concealing who really owns value is also a common tax evasion strategy.)
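To make the difficulty concrete, here is a toy sketch in Python (with entirely invented entities) of how an effective stake can be computed once an ownership network is known. The hard part in practice is discovering the network at all, and real networks also contain ownership cycles, which this simple recursion ignores.

```python
# Effective ownership: sum, over all ownership paths, of the product of
# the stakes along each path. All names and numbers here are made up.

# ownership[entity] = list of (owner, fraction) pairs
ownership = {
    "TargetCo": [("ShellA", 0.60), ("ShellB", 0.40)],
    "ShellA":   [("Alice", 0.50), ("ShellB", 0.50)],
    "ShellB":   [("Alice", 0.30), ("Bob", 0.70)],
}

def effective_stake(person, entity):
    """Recursively accumulate the person's stake through intermediaries."""
    total = 0.0
    for owner, fraction in ownership.get(entity, []):
        if owner == person:
            total += fraction
        else:
            total += fraction * effective_stake(person, owner)
    return total

for person in ("Alice", "Bob"):
    print(person, round(effective_stake(person, "TargetCo"), 2))
# Alice 0.51, Bob 0.49: Alice effectively controls TargetCo even though
# her largest direct stake in any single entity is 50%.
```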

A proposal which is becoming increasingly popular as a way to make connections more visible is to create beneficial ownership registries. These are managed by governments, and require any corporate entity to declare who its beneficial owners are; usually with a threshold requirement that this applies to any beneficial owner who owns more than 25% of the entity.

There are some practical problems with this idea. The beneficial owners have to be identifiable in a world of 7 billion people, which means that quite a lot of detail must be provided about each owner. How much of this detail should be public? The idea works better if people in other countries can see and identify beneficial owners, because moving value to another country is a good way to break the chain. But there are countervailing privacy issues, especially as most beneficial owners are not doing anything nefarious.

Each country must build and manage its own beneficial ownership registry, and how can all governments be convinced to participate? Any country that does not participate becomes a magnet for money laundering, which may be morally objectionable but is also very profitable for its financial institutions.

The threshold also creates issues. If it is 25%, which is typical today, then only four people have to get together to create an entity whose ownership can be legitimately concealed. If it is 10%, then ten participants are required to conceal ownership. This might seem cumbersome for criminals, but sadly “money laundering as a service” now exists, and the organisations that provide this service have the resources to aggregate value from many different criminals and mix and match it.

So beneficial ownership registries may help stamp out a good deal of shady practice, but they may not help much with stopping money laundering.

The other way to break chains is to use cryptocurrencies. All transactions within a blockchain are visible, but what is hidden is the identity of those making the transactions. So a criminal can put value into a cryptocurrency using one identity (really a public-private key pair) and take it out again using another identity, and the chain has been completely broken (as long as the key pairs, the identities, are kept separate).
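Here is a toy sketch of why this works, with random hex strings standing in for real key pairs. Everything on the ledger is public, but nothing on the ledger itself connects the deposit identity to the withdrawal identity.

```python
# Toy pseudonymous ledger (illustration only; real chain analysis can
# sometimes re-link identities using timing, amounts, and exchange records).
import secrets

def new_identity():
    # Stand-in for generating a fresh public/private key pair.
    return secrets.token_hex(16)

deposit_id    = new_identity()  # identity used to put value in
withdrawal_id = new_identity()  # unrelated-looking identity used to take it out
mixer_id      = new_identity()  # an intermediary, e.g. a mixing service

ledger = [
    (deposit_id, mixer_id, 100.0),     # visible to everyone
    (mixer_id, withdrawal_id, 99.0),   # visible, minus the mixer's fee
]

for sender, receiver, amount in ledger:
    print(sender[:8], "->", receiver[:8], amount)
# Both transactions are public, yet the ledger alone cannot show that
# deposit_id and withdrawal_id belong to the same person.
```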

Fortunately, cryptocurrencies are not ideal as places to hold value, even briefly, because the exchange rates between them and the financial system tend to fluctuate wildly and unpredictably. Inserting and removing value is also a relatively slow process.

The bottom line is that it is easy for criminals to disconnect the crimes they commit from the value that they produce, and so the current legal basis for prosecuting money laundering is almost unusable — which the prosecution statistics bear out.

The other approach to limiting money laundering is to regulate the financial system more carefully. This includes things like limiting the ability to move cash in quantity, requiring an increasing array of agents who handle value to report transactions above a certain size or that seem suspicious, and requiring banks and financial institutions to keep track of who they’re dealing with. The problem is that the entities that are required to make reports have considerable discretion about when to do so, and considerable incentives not to because they make money from the transactions involved. The recent spate of breathtaking fines for banks that have violated money laundering rules strongly suggests that they have decided that transactions that are probably illicit but have a fig leaf of deniability are worth doing; and the fines, if they are caught, are simply a cost of doing business.

In the end, the only mechanism that can actually prevent money laundering is unexplained wealth orders, which I’ve written about before. These target the end of the chain, where criminals want to take the value produced by their crimes and use it for their lifestyle. UWOs force the recipients of value to account for where it came from, so the size and complexity of the chain doesn’t matter.

Getting election winning right

In the previous post I reviewed our model for how to win a U.S. presidential election:

  1. Use high levels of positive language;
  2. Avoid negative language completely;
  3. Stay away from policy;
  4. Don’t mention your opponent.

Joe Biden’s speech at Gettysburg was a textbook example of how to do this (and it’s no easy feat avoiding mentioning your opponent when it’s Trump).

He should have stopped after the first five minutes (HT Bob Newhart “On the backs of envelopes, Abe”, also Lincoln himself, 271 words).

After the first five minutes it got rambling and repetitive. The media hates speeches that fit our model, and so the only sound bites came from the second half, which was much less well-written.

How to win a US presidential election — reminder

As the US presidential election ramps up, let me remind you of our conclusions about the language patterns used by winners. Since 1992, the winner is the candidate who:

  1. uses high levels of positive language;
  2. avoids all negative language;
  3. stays away from policy and talks in generalities;
  4. doesn’t talk about the opposing candidate.

https://www.sciencedirect.com/science/article/pii/S0261379416302062

The reason this works is that the choices made by voters are not driven by rational choice but by a more immediate appeal of the candidate as a person. The media doesn’t believe in these rules, and constantly tries to drive candidates to do the opposite. For first time candidates this pressure often works, which is partly why incumbents tend to do well in presidential elections.

But wait, you say. How did Trump win last time? The answer is that, although he doesn’t do well on 2 and 4, Hillary Clinton did very poorly on all four. So it wasn’t that Trump won, so much as that Hillary Clinton lost.

Based on this model, and its historical success, Biden is doing pretty much exactly what he needs to do.
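For the curious, a toy version of the kind of scoring involved might look like the sketch below. This is an illustration only: the word lists are invented, and the published model is considerably more sophisticated.

```python
# Toy scorer for the four-part pattern (invented word lists; not the
# methodology of the Electoral Studies paper).
POSITIVE = {"great", "hope", "together", "future", "best"}
NEGATIVE = {"don't", "won't", "never", "wrong", "fail"}
POLICY   = {"tax", "healthcare", "tariff", "budget"}

def pattern_score(speech, opponent_name):
    words = speech.lower().split()
    n = max(len(words), 1)
    return {
        "positive_rate": sum(w in POSITIVE for w in words) / n,
        "negative_rate": sum(w in NEGATIVE for w in words) / n,
        "policy_rate":   sum(w in POLICY for w in words) / n,
        "mentions_opponent": opponent_name.lower() in words,
    }

# A winning pattern: high positive rate, zeros everywhere else.
print(pattern_score("We hope for a great future together", "Rival"))
```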

Huawei: Malice or incompetence?

I’ve written before about the reports from the UK’s centre set up to vet Huawei products (the most recent one here: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/790270/HCSEC_OversightBoardReport-2019.pdf).

Their conclusion was that, although they had become suspicious of attempts to include malicious code in switches and other products, they couldn’t actually conclude that there had been such attempts because the code was so poorly constructed.

Now a different case has come to light. Huawei was contracted to build a repository for the Papua New Guinea government’s data and operations. It opened in 2018.

A report was commissioned by the PNG government, and carried out by the Australian Strategic Policy Institute (paid for by Australia’s DFAT). Those who’ve seen the report say that it points out that:

  • Core switches were not behind firewalls;
  • The encryption used an algorithm known to be broken two years earlier;
  • The firewalls had also reached the end of their lives two years earlier.

In other words, the installation was not fit for service.

The article (below) takes the view that this was malice. But Huawei’s track record again makes it impossible to tell.

As well as making it easy for Huawei to access the system illicitly, the poor level of security also made it possible for any other country to gain access. This is one of the major undiscussed issues around Huawei — maybe they are beholden to the Chinese government and might have to share data with them, but the quality of their security means that the threat surface of their equipment is large. So using Huawei equipment risks giving access to Russia, Iran, and North Korea, as well as China.

The PNG project was paid for by a loan from a Chinese bank. Sadly there was no budget for maintenance so the entire system degraded into uselessness before it could even get seriously started. But the PNG government still owes China $53 million for building it (Belt and Road = Bait and Switch?).

https://www.afr.com/companies/telecommunications/huawei-data-centre-built-to-spy-on-png-20200810-p55k7w

(behind a paywall, but there are other versions).

Huawei vs Nortel

There’s a long summary of the evolution of Huawei and the early hacks against Nortel here:

https://www.bnnbloomberg.ca/did-a-chinese-hack-kill-canada-s-greatest-tech-company-1.1459269

Most headlines that contain a question can be answered “No” without reading the article, but this one is an exception.

Provenance — a neglected aspect of data analytics

Provenance is defined by Merriam-Webster as “the history of ownership of a valued object or work of art or literature”, but the idea has much wider applicability.

There are three kinds of provenance:

  1. Where did an object come from. This kind of provenance is often associated with food and drink: country of origin for fresh produce, brand for other kinds of food, appellation d’origine contrôlée for French wines, and many other examples. This kind of provenance is usually signalled by something that is attached to the object.
  2. Where did an object go on its way from source to destination. This is actually the most common form of provenance historically — the way that you know that a chair really is a Chippendale is to be able to trace its ownership all the way back to the maker. A chair without provenance is probably much less valuable, even though it may look like a Chippendale, and the wood seems the right age. This kind of provenance is beginning to be associated with food. For example, some shipments now have temperature sensors attached to them that record the maximum temperature they ever encountered between source and destination. Many kinds of shipments have had details about their pathway and progress available to shippers, but this is now being exposed to customers as well. So if you buy something from Amazon you can follow its progress (roughly) from warehouse to you.
  3. The third kind of provenance is still in its infancy — what else did the object encounter on its way from source to destination? This comes in two forms. First, what other objects was it close to? This is the essence of Covid19 contact tracing apps, but it applies to any situation where closeness could be associated with poor outcomes. Second, were the objects that it was close to ones that were expected or made sense?

The first and second forms of provenance don’t lead to interesting data-analytics problems. They can be solved by recording technologies with, of course, issues of reliability, unforgeability, and non-repudiation.

But the third case raises many interesting problems. Public health models of the spread of infection usually assume some kind of random particle model of how people interact (with various refinements such as compartments). These models would be much more accurate if they could be based on actual physical encounter networks — but privacy quickly becomes an issue. Nevertheless, there are situations where encounter networks are already collected for other reasons: bus and train driver handovers, shift changes of other kinds, police-present incidents; and such data provides natural encounter networks. [One reason why Covid19 contact tracing apps work so poorly is that Bluetooth proximity is a poor surrogate for potentially infectious physical encounter.]

Customs also has a natural interest in provenance: when someone or something presents at the border, the reason they’re allowed to pass or not is all about provenance: hard coded in a passport, pre-approved by the issue of a visa, or with real-time information derived from, say, a vehicle licence plate.

Some clearly suspicious, but hard to detect, situations arise from mismatched provenance. For example, if a couple arrive on the same flight, then they will usually have been seated together; if two people booked their tickets or got visas using the same travel agency at the same time, then they will either arrive on different flights (they don’t know each other), or they will arrive on the same flight and sit together (they do know each other). In other words, the similarity of provenance chains should match the similarity of relationships, and mismatches between the two signal suspicious behaviour. Customs data analytics is just beginning to explore leveraging this kind of data.
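As a toy illustration (hypothetical travellers and deliberately crude matching rules), the flagging logic might look something like this:

```python
# Flag pairs whose provenance chains are similar (same agency, same booking
# day) but whose relationship looks dissimilar (same flight, seated apart).
from itertools import combinations

travellers = {
    "P1": {"agency": "AgencyX", "booked": "2021-03-01", "flight": "QF7", "row": 12},
    "P2": {"agency": "AgencyX", "booked": "2021-03-01", "flight": "QF7", "row": 41},
    "P3": {"agency": "AgencyY", "booked": "2021-03-02", "flight": "QF7", "row": 13},
}

def shared_booking(a, b):
    return a["agency"] == b["agency"] and a["booked"] == b["booked"]

def seated_together(a, b):
    return a["flight"] == b["flight"] and abs(a["row"] - b["row"]) <= 1

for (na, a), (nb, b) in combinations(travellers.items(), 2):
    if shared_booking(a, b) and a["flight"] == b["flight"] and not seated_together(a, b):
        print("flag", na, nb, ": shared provenance but seated apart")
# Flags P1 and P2: they booked together yet travel as strangers.
```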

Pandemic theatre

Canada is the latest country to announce that they are rolling out a contact tracing app for covid-19. There are so many issues with this idea, and its timing, that we have to consider it as pandemic theatre.

These contact tracing apps work as follows: each phone is given a random identifier. Whenever your phone and somebody else’s phone get close enough, they exchange these identifiers. If anyone is diagnosed with Covid, their identifier is flagged and all of the phones that have been close to the flagged phone in the past 2 weeks are notified so that users know that they have been close to someone who subsequently got the disease.
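The core of the scheme is small enough to sketch in a few lines of Python. This is heavily simplified: real apps rotate identifiers frequently and discard encounters after about two weeks.

```python
# Minimal sketch of exchange-and-notify contact tracing.
import secrets

class Phone:
    def __init__(self):
        self.ident = secrets.token_hex(8)  # this phone's random identifier
        self.seen = set()                  # identifiers encountered recently

    def encounter(self, other):
        # Two phones within Bluetooth range swap identifiers.
        self.seen.add(other.ident)
        other.seen.add(self.ident)

alice, bob, carol = Phone(), Phone(), Phone()
alice.encounter(bob)

flagged = {bob.ident}  # Bob is diagnosed; his identifier is flagged

for name, phone in (("alice", alice), ("carol", carol)):
    if phone.seen & flagged:
        print(name, "is notified of a possible exposure")
# Only alice is notified.
```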

First, Canada is very late to the party. This style of contact tracing app was first designed by Singapore, Australia rolled its version out at the end of April, and many other countries have also had one available for a while. Rather than using one of the existing apps (which require very little centralised, and hence specialised, infrastructure), Canada is developing its own — sometime soon, maybe.

Second, these apps have serious drawbacks, and might not be effective, even in principle. Bluetooth, which is used to detect a nearby phone, is a wireless system and so detects any other phone within a few metres. But it can’t tell that the other phone is behind a wall, or behind a screen, or even in a car driving by with the windows closed. So it’s going to detect many ‘contacts’ that can’t possibly have spread covid, especially in cities. Are people really going to isolate based on such a notification?

Third, these apps, collectively, have to capture a large number of contacts to actually help with the public health issue. It’s been estimated that around 80% of users need to download and use the app to get reasonable effectiveness. Take up in practice has been much, much less than this, often around 20%. Although these apps have been in use for, let’s say, 45 days in countries that have them, I cannot find a single report of an exposure notification anywhere.

Governments are inclined to say things like “Well, contact tracing apps aren’t doing anything useful now, but in the later stages they’ll be incredibly useful” (and so, presumably, we don’t have to rush to build them). But it’s mostly about being seen to do something rather than actually doing something helpful.

Embarrassware

There’s a new wrinkle on ransomware.

Smarter criminals are now exfiltrating files that they find which might be embarrassing to the organisation whose site they’ve hacked. Almost any organisation will have some dirty laundry it would rather not have publicised: demonstrations of incompetence, inappropriate emails, strategic directions, tactical decisions, ….

The criminals threaten to publish these documents within a short period of time as a way to increase the pressure to pay the ransom. Now even an organisation that has good backups may want to pay the ransom.

Actually finding content that the organisation might not want made public is a challenging natural language problem (although there is probably low-hanging fruit such as pornographic images). But, like the man (allegedly Arthur Conan Doyle) who sent a telegram to his friend saying “Fly, all is discovered” (The Strand, George Newnes, September 18, 1897, No. 831 – Vol. XXXII) and saw him leave town, it might not be necessary to specify which actual documents will be published.
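As a toy illustration of how little sophistication the low-hanging fruit requires, a naive keyword count already separates plausible dirty laundry from routine documents. The terms and files below are, of course, invented.

```python
# Naive keyword scoring over exfiltrated files (toy example only; real
# embarrassment detection is a genuinely hard NLP problem).
SENSITIVE = ("confidential", "do not distribute", "lawsuit", "cover up")

def embarrassment_score(text):
    t = text.lower()
    return sum(t.count(term) for term in SENSITIVE)

docs = {
    "board_minutes.txt": "CONFIDENTIAL. Do not distribute. Potential lawsuit ahead.",
    "newsletter.txt": "Our quarterly update for customers.",
}
for name, text in docs.items():
    print(name, embarrassment_score(text))
# board_minutes.txt scores 3; newsletter.txt scores 0.
```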

Understanding risk at the disaster end of the spectrum

In conventional risk analysis, risk is often expressed as

risk = threat probability × potential loss

When the values of the terms on the right hand side are in the middle of their ranges, then our intuition seems to understand this equation quite well.

But when the values are near their extremes, our intuition goes out the window, as the world’s coronavirus experience shows. The pandemic is what Taleb calls a black swan, an event where the threat probability is extremely low, but the potential loss is extremely high. For example, if the potential loss is of the order of 10^9 (a billion) then a threat probability of 1 in a thousand still has a risk of magnitude a million.
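The arithmetic is trivial, which makes the failure of intuition all the more striking:

```python
# The risk equation at mid-range versus at the extremes.
def risk(threat_probability, potential_loss):
    return threat_probability * potential_loss

print(risk(0.5, 10_000))  # mid-range: 5,000, and intuition copes
print(risk(1e-3, 1e9))    # black swan: 1,000,000, and intuition fails
```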

I came across another disaster waiting to happen, with the same kind of characteristics as the coronavirus pandemic — cyber attacks on water treatment facilities.

https://www.csoonline.com/article/3541837/attempted-cyberattack-highlights-vulnerability-of-global-water-infrastructure.html

In the U.S., water treatment facilities are small organizations that don’t have specialized IT staff who can protect their systems. But cyber attacks on such facilities could cause mass casualties. While electricity grids, Internet infrastructure, and financial systems have received some protective attention, water treatment is the forgotten sibling. A classic example of a small (but growing) threat probability but a huge potential loss.

The threat isn’t even theoretical. Attacks have already been attempted.

Using technology for contact tracing done right

There has understandably been a lot of interest in using technology, especially cell phones, to help with tracking the spread of covid-19.

This raises substantial privacy issues, especially as we know that government powers grabbed in an emergency tend not to be rolled back when the emergency is over.

One of the difficulties is that not everybody with a cell phone carries it at all times (believe it or not), and not everybody leaves their location sensor turned on. So many of the proposals founder on issues such as these; all the more so as those who don’t want to be tracked are more likely to be evasive.

One of the cleverer ideas is an app used in Singapore, TraceTogether. If you install the app, and have Bluetooth turned on, then the app swaps identities with any phone with the app that comes close enough to detect.

Using public key infrastructure, the identity of the other phones you’ve encountered is stored, encrypted, on your phone (and vice versa on theirs).

If you get sick, the app sends your encounter list to the government, which can use its key to decrypt it. The government can then notify everyone on the list and contact trace them in minutes.

Note that the app doesn’t record where you crossed paths with others, just that you did. This, together with the fact that nobody but the government can decrypt your contacts, gives you a substantial amount of privacy, probably the best you can hope for given the public health need.
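A minimal sketch of this design is below, assuming, purely for illustration, RSA-OAEP from Python’s cryptography package as the encryption scheme; the real app’s scheme is not claimed to work exactly this way.

```python
# Encounters are stored encrypted under the health authority's public key,
# so only the authority can ever read the contact log.
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives import hashes

# Key pair held by the health authority; phones get only the public key.
gov_private = rsa.generate_private_key(public_exponent=65537, key_size=2048)
gov_public = gov_private.public_key()

oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)

# On the phone: each encountered identity is stored encrypted.
stored_contacts = [gov_public.encrypt(b"phone-id-4711", oaep)]

# On diagnosis, the encrypted log is uploaded; only the authority's
# private key recovers the identities for contact tracing.
for blob in stored_contacts:
    print(gov_private.decrypt(blob, oaep))
```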

The epidemiology of spam

As someone who’s had the same email address for nearly 40 years, I get a lot of spam. (Of course, almost all of it is automatically filtered away.)

It’s been noticeable that spam has been way down since January this year, and it became vanishingly rare once India was put on lockdown last week.

But this week it’s come roaring back as China once again opens for business. I guess we know where most of it comes from (and maybe spam has a role to play as a covid-19 detector — perhaps we can find out how many infections there are really in Iran, for example).

Detecting intent and abuse in natural language

One of my students has developed a system for detecting intent and abuse in natural language. As part of the validation, he has designed a short survey to get human assessments of how the system performs.

If you’d like to participate, the url is

aquarius.cs.queensu.ca

Thanks in advance!

Towards a cashless economy

Australia is close to passing laws that would make it impossible to pay cash for anything above $A10,000.

What’s interesting is who’s objecting: the Housing Association (cash for property innocently, really?); farmers (maybe this is about barter and/or tax avoidance), dentists (??), and big retail (hmm, sounds like high end products such as jewelry might be the issue here). Retailers are quoted as saying “customers like to save cash to make big payments” which sounds rather implausible.

One of the things that works against stamping out money laundering is that it means stamping out the black, and most of the grey, economy. The pushback from these parts of the economy is presumably something between loss of perks and a feeling that the tax bite is too big.

Update to “A Gentle Guide to Money Laundering”

I’ve updated my guide to money laundering, mostly to include a discussion of Unexplained Wealth Orders, which seem likely to become a major part of the solution.

money laundering version 2 (Feb 2020)

More thoughts on Huawei

“5G” is marketing speak for whatever is coming next in computer networks. It promises 100 times greater speed and the ability to connect many more devices in a small space. However, “5G” is unlikely to exist as a real thing until two serious problems are addressed. First, there is no killer app that demands this increase in performance. Examples mentioned breathlessly by the media include being able to download an entire movie in seconds (which doesn’t seem to motivate many people), the ability for vehicles to communicate with one another (still years away), and the ability for Internet of Things devices to communicate widely (the whole communicating lightbulbs phenomenon seems to have put consumers off rather than motivated them). Second, “5G” will require a much denser network of cell towers and it’s far from clear how they will be paid for and powered. The 5G networks touted in the media today require specialized handsets that are incompatible with existing networks and exist only in the downtown cores of a handful of cities. So “5G” per se is hardly a pressing issue.

Nevertheless, it does matter who provides the next generation of network infrastructure because networks have become indispensable to ordinary life – not just entertainment, but communication and business. And that’s why several countries have been so vocal against Huawei’s attempts to become a key player.

There are two significant issues. First, a network switch provider can see, block, or divert all the traffic passing through its switches. Even encrypting the traffic content doesn’t help much; it’s still possible to see who’s communicating with whom and how often. Huawei, however much it claims to the contrary, is subject to Chinese law that requires it to cooperate with the Chinese government and so can never provide neutral services. It doesn’t help to say, as Huawei does, that because it never has acted at the behest of the Chinese government, it never will in the future. Nor does it help to say that no backdoor has ever been found in its software. All network switches have the capability to be updated over the Internet, so the software it is running today need not be the software it is running tomorrow. It is not surprising that many governments, including the US and Australia, have reservations about allowing Huawei to provide network infrastructure.
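A toy example of how much the metadata alone reveals, using made-up flow records and assuming the payloads are all encrypted:

```python
# Whoever operates the switch can tally who talks to whom, and how often,
# without ever reading a byte of content.
from collections import Counter

flows = [  # (source, destination) pairs observed at the switch
    ("alice.example", "bank.example"),
    ("alice.example", "bank.example"),
    ("alice.example", "clinic.example"),
    ("bob.example",   "bank.example"),
]

for (src, dst), n in Counter(flows).most_common():
    print(src, "->", dst, ":", n, "flows")
# The communication pattern, often the sensitive part, is fully visible.
```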

Second, the next generation of network infrastructure will have to be more complex than what exists now. A long-standing collaboration between the UK and Huawei tried to improve confidence in Huawei products by disassembling and testing them. Their concern, for a number of years, was that supposedly identical software built in China and built in the UK turned out to be of different sizes. This is a bad sign, because it suggests that the software pays attention to where it is being built and modifies itself accordingly (much as VW emissions testing software checked whether the vehicle was undergoing an emissions test and modified its behaviour). However, their 2019 report concluded that the issue stemmed from Huawei’s software construction processes, which were so flawed that they were unable to build software consistently anywhere. The software being studied is for today’s 4G network infrastructure, and the joint GCHQ-Huawei Centre concluded that it would take Huawei several years even to reach today’s software engineering state-of-the-art. It seems inconceivable that Huawei will be able to produce usable network infrastructure for an environment that will be many times more complex.

These two problems, in a way, cancel each other out – if the network infrastructure is of poor quality it probably can’t be manipulated explicitly by Huawei. But its poor quality increases the opportunity for attacks on networks by China (without involving Huawei), Russia, Iran, or even terrorist groups.

Huawei systems are cheaper than their competitors, and it’s a truism that convenience trumps security. But the long-term costs of a Huawei connected world may be more than we want to pay.

The difference between kinetic and cyber attacks

It’s striking — and worrying — that missile launches by North Korea, no matter how unimportant in the big picture, get worldwide news coverage every time they happen.

But North Korea’s ongoing cyberattacks, which are having serious effects and are raising startlingly large amounts of money for the regime, are mentioned only on technical sites, and only occasionally.

We have to hope that military and government have a more balanced view of the relative threat — but it seems clear that politicians don’t.

Backdoors to encryption — 100 years of experience

The question of whether those who encrypt data, at rest or in flight, should be required to provide a master decryption key to government or law enforcement is back in the news, as it is periodically.

Many have made the obvious arguments about why this is a bad idea, and I won’t repeat them.

But let me point out that we’ve been here before, in a slightly different context. A hundred years ago, law enforcement came up against the fact that criminals knew things that could (a) be used to identify other criminals, and (b) prevent other crimes. This knowledge was inside their heads, rather than inside their cell phones.

Then, as now, it seemed obvious that law enforcement and government should be able to extract that knowledge, and interrogation with violence or torture was the result.

Eventually we reached (in Western countries, at least) an agreement that, although there could be a benefit to the knowledge in criminals’ heads, there was a point beyond which we weren’t going to go to extract it, despite its potential value.

The same principle surely applies when the knowledge is on a device rather than in a head. At some point, law enforcement must realise that not all knowledge is extractable.

(Incidentally, one of the arguments made about the use of violence and torture is that the knowledge extracted is often valueless, since the target will say anything to get it to stop. It isn’t hard to see that devices can be made to use a similar strategy. They would have a PIN code or password that could be used under coercion and that would appear to unlock the device, but would in fact provide access only to a virtual subdevice which seemed innocuous. Especially as Customs in several countries now demand PINs and passwords as a condition of entry, such devices would be useful for innocent travellers as well as guilty — to protect commercial and diplomatic secrets for a start.)
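A minimal sketch of the duress-code idea, entirely hypothetical (no shipping device is claimed to work this way):

```python
# One PIN unlocks the real profile; a second "coercion" PIN unlocks an
# innocuous decoy that looks like a normal unlock.
import hmac, hashlib

def pin_hash(pin, salt=b"device-salt"):
    return hashlib.pbkdf2_hmac("sha256", pin.encode(), salt, 100_000)

REAL_PIN_HASH   = pin_hash("493817")
DURESS_PIN_HASH = pin_hash("111111")

def unlock(pin):
    h = pin_hash(pin)
    if hmac.compare_digest(h, REAL_PIN_HASH):
        return "real profile: full data"
    if hmac.compare_digest(h, DURESS_PIN_HASH):
        return "decoy profile: innocuous data only"
    return "locked"

print(unlock("111111"))  # under coercion, the decoy appears to unlock normally
```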

Democratic debates strategy

In an analysis of the language used by US presidential candidates in the last 7 elections, Christian Leuprecht and I showed that there’s a language pattern that predicts the winner, and even the margin. The pattern is this: use lots of positive language, use no negative language at all (even words like ‘don’t’ and ‘won’t’), talk about abstractions not policy, and don’t talk about your opponent(s). (For example, Trump failed on the fourth point, but was good on the others, while Hillary Clinton did poorly on all four.)

In some ways, this pattern is intuitive: voters don’t make rational choices of the most qualified candidate — they vote for someone they relate to.

Why don’t candidates use this pattern? Because the media hates it! Candidates (except Trump) fear being labelled as shallow by the media, even though using the pattern helps them with voters. You can see this at work in the way the opinion pieces decide who ‘won’ the debates.

The Democratic debates show candidates using the opposite strategy: lots of detailed policy, lots of negativity (what’s wrong that I will fix), and lots of putting each other down.

Now it’s possible that the strategy needed to win a primary is different to that which wins a general election. But if you want to assess the chances of those who might make it through, then this pattern will help to see what their chances are against Trump in 2020.

Incumbency effects in U.S. presidential campaigns: Language patterns matter, Electoral Studies, Vol. 43, 95–103.
https://www.sciencedirect.com/science/article/pii/S0261379416302062