And so it begins

Stories are out today saying that Google can now connect the purchasing habits of anyone it has a model for (i.e. almost everybody who’s ever been online) with its own data on online activity.

For example, this story:

Google says that this enables it to draw a line between the ads that users have been shown and the products that they buy. There’s a discrepancy in this story, though, because Google also claims that it doesn’t get the list of products purchased using a credit card, only the total amount. So a big hmmmmm.

(And if I were Google, I’d be concerned that there isn’t much of a link! Consumers might be less resentful if Google did indeed serve ads for things they wanted to buy, but everyone I’ve ever heard talk about online ads says the same thing: the ads either have nothing to do with their interests, or they are ads for things that they just bought.)

But connecting users to purchases (rather than ads to purchases) is the critical step in building a model of how much users are willing to pay, and this is the real risk of multinational data collection and analytics (as I’ve discussed in earlier posts).

Businesses processing emails

The Daily Mail reports an experiment by the company High-Tech Bridge, in which they sent private emails or uploaded documents containing unique URLs to 50 different platforms, and then waited to see whether anyone visited these URLs and, if so, who.

Sure enough, several of them were visited by the businesses that had handled the matching document, including Facebook, Twitter, and Google. This won’t come as a surprise to readers of this blog, but once again points out the extent to which businesses like these are processing any documents they see to extract models of the sender/receiver.
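For readers wondering how such an experiment can tell which business followed which link, here is a minimal sketch in Python. It is not High-Tech Bridge's actual method, which the report doesn't spell out. The domain `canary.example.org`, the `/t/` path prefix, the function names and the two-field log format are all my own assumptions, chosen just to illustrate the idea of one unguessable URL per platform.

```python
import secrets

def make_canary_urls(platforms, base="https://canary.example.org/t/"):
    """Give each platform its own unguessable URL (base is a hypothetical domain)."""
    token_to_platform = {}
    platform_to_url = {}
    for name in platforms:
        token = secrets.token_urlsafe(16)  # random, so nobody can stumble on it
        token_to_platform[token] = name
        platform_to_url[name] = base + token
    return token_to_platform, platform_to_url

def attribute_visits(log_lines, token_to_platform, prefix="/t/"):
    """Match requested paths back to the platform that was handed each URL.

    Assumed log format (hypothetical): "<client_ip> <path>" per line.
    """
    hits = []
    for line in log_lines:
        client_ip, path = line.split()
        if path.startswith(prefix):
            platform = token_to_platform.get(path[len(prefix):])
            if platform is not None:
                hits.append((platform, client_ip))
    return hits
```

Because each URL appears in exactly one document on exactly one platform, any request for it can only have come from something that saw that document. The client address in the log then hints at who did the visiting.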

There has been some confusion in the media about how this process might work. Evidently it’s not obvious to many that such a process is automated: nobody is ‘reading’ these documents. They are being processed by software that is capable of following the links they contain and processing the contents of those pages as well. It would help if we agreed on verbs that distinguish ‘read by a human’ from ‘processed by software’, and that were simple enough for the wider public to understand the difference.
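To make ‘processed by software’ concrete, here is a rough sketch of what such a pipeline might do with a document. I don't know how any of these businesses actually implement it. The regular expression, the function names and the `fetch` callback are illustrative assumptions, and the fetcher is passed in so the sketch can run without touching the network.

```python
import re

# Deliberately simple pattern: a scheme, then everything up to whitespace or a quote/bracket.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')]+")

def extract_urls(text):
    """Return the URLs found in a document, minus trailing sentence punctuation."""
    return [u.rstrip(".,;:!?") for u in URL_PATTERN.findall(text)]

def process_document(text, fetch):
    """Follow every link in the document and collect the linked pages.

    `fetch` is a hypothetical callable taking a URL and returning page content;
    a real system would make an HTTP request at this point.
    """
    pages = {}
    for url in extract_urls(text):
        pages[url] = fetch(url)  # this request is the 'visit' the experiment observes
    return pages

# Usage with a stand-in fetcher, so no network access is needed:
pages = process_document(
    "Minutes are at https://example.org/minutes, slides at http://example.org/slides.",
    fetch=lambda url: "contents of " + url,
)
```

No human is involved at any step: the link is found by pattern matching, and the page behind it is pulled in and handed on to whatever analysis comes next.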