
What is social media?

I was at a meeting last week whose focus was on social media. It quickly became clear that there were two kinds of interests. One group wanted to build high-level systems that would revolutionize business and government (somehow) leveraging social media; another group were building or wanted to build tools that would provide some kind of meta-view of social media content and activity.

The topic that was missing from all of the discussion was what social media is, and why it is the way it is; and so I came away feeling that the entire discussion, and quite a lot of the work, was dancing on clouds. There seem to be a number of things that “everybody knows” about social media, but for which there is little or no evidence. The Arab Spring was driven by social media! Well, maybe, but (a) was it, and if so how much, and (b) which parts were important and which were irrelevant?

It seems helpful to divide social media into three categories:

1. Media that are essentially public access publishing or public access (micro)blogging. Although sites that provide this kind of functionality are often considered “social”, there is almost nothing social about them. Yes, the audience for posts can be restricted to a particular group, but that’s always been true of any publication. There is an interesting question lurking here though: what are the reasons why individuals read such posts? What kind of bond does it imply between the reader and the author? (Cynically, why would I care what even my closest friend had for breakfast?)

2. Media that start as public access publishing, but where the conversation built on an initial post is more important or interesting than the initial post itself — in other words, there’s something emergent in the conversation that transcends what any of the participants would have said ab initio. This is a kind of social knowledge or opinion construction, and there are lots of interesting questions about who participates, what their roles are, and how the content and tone are affected by the interactions. This is, of course, not a new phenomenon but what’s new is the scope and the detail of what’s recorded, allowing answers to be worked out in ways that were impractical or too expensive before.

3. Media in which explicit relational links are created between one person and another. This is the real heart of social media. Relational links between a pair of people have, of course, always existed, but they could only be constructed in a small number of ways and were (almost always) limited by geography.

The emergent structure of these links is a really interesting artifact that deserves study and from which we will probably learn a lot about what it means to be human in a global society. What does it mean when one person “friends” another? This is one question for which simple answers tend to be assumed, but even a brief consideration of A’s Facebook friends and the rest of A’s relationships in the real world quickly shows that there’s a complex connection between the two sets (and it depends heavily on characteristics of A).

One thing that quickly becomes clear when these questions are addressed computationally is that we aren’t going to get far until relationship links are typed. It’s fairly easy to look at each relationship and give it a numerical weight that reflects (say) closeness — but it’s still true that different kinds of relationships behave differently, and need to be modelled differently to understand them. (Social media sites should also implement this typing — not every piece of data should flow down every link of A’s social network.)
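As a minimal sketch of what typed links might look like in code: the names below (`EdgeType`, `Edge`, `audience`) and the three relationship categories are my own illustrative assumptions, not anything a real site exposes. Each link carries both a type and a weight, and a post is delivered only along links whose type has been allowed for it.

```python
from dataclasses import dataclass
from enum import Enum


class EdgeType(Enum):
    # Hypothetical relationship categories, chosen only for illustration.
    PERSONAL = "personal"
    WORK = "work"
    FAMILY = "family"


@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    kind: EdgeType
    weight: float  # closeness estimate in [0, 1]; how to estimate it is left open


def audience(edges, author, allowed_kinds, min_weight=0.0):
    """Return the contacts of `author` reachable over links of an allowed type."""
    return sorted(
        {
            e.target
            for e in edges
            if e.source == author
            and e.kind in allowed_kinds
            and e.weight >= min_weight
        }
    )


edges = [
    Edge("A", "B", EdgeType.PERSONAL, 0.9),
    Edge("A", "C", EdgeType.WORK, 0.4),
    Edge("A", "D", EdgeType.FAMILY, 0.7),
    Edge("A", "E", EdgeType.WORK, 0.8),
]

# Holiday photos go to personal and family contacts only, not to colleagues.
holiday_audience = audience(edges, "A", {EdgeType.PERSONAL, EdgeType.FAMILY})
```

The weight alone could not make this distinction: E is a closer contact than D by weight, but E is a colleague and so is left out of `holiday_audience`.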

The fundamental question, in a world where one person can create a visible relationship, is what this means: for the person creating it, for the person at the other end of the relationship, and for the emergent graph structure that a collection of these individual relationships creates. Good, solid answers to this question would be a foundation on which much more useful applications could be built.

Edge typing to make transitivity useful

There have been several studies over the past year that have shown that we are influenced by properties of people to whom we are not directly connected. There seems to be a pattern: if I have property X, then my immediate friends tend to be more Xy than would otherwise be expected (not surprising), but their friends whom I don’t know are also affected by my Xiness, and sometimes even their friends (which does seem surprising).
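To make the pattern concrete, here is a rough sketch of the kind of measurement involved: the fraction of people with property X at distance 1, 2 and 3 from a given person. The function names and the toy network are invented for illustration. To say anything meaningful one would also have to compare these fractions with the background rate of X, which this sketch doesn't attempt.

```python
from collections import deque


def hop_distances(adjacency, start):
    """Breadth-first search: map each reachable person to their distance from `start`."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        for neighbour in adjacency.get(person, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[person] + 1
                queue.append(neighbour)
    return dist


def prevalence_by_distance(adjacency, has_x, start, max_hops=3):
    """Fraction of people with property X at each distance 1..max_hops from `start`."""
    counts = {}  # distance -> (people at that distance, people with X)
    for person, d in hop_distances(adjacency, start).items():
        if 1 <= d <= max_hops:
            total, with_x = counts.get(d, (0, 0))
            counts[d] = (total + 1, with_x + (1 if person in has_x else 0))
    return {d: with_x / total for d, (total, with_x) in sorted(counts.items())}


# Toy undirected network (made-up data): me, two friends, two friends-of-friends,
# and one person at distance 3.
adjacency = {
    "me": ["f1", "f2"],
    "f1": ["me", "g1", "g2"],
    "f2": ["me", "g2"],
    "g1": ["f1", "h1"],
    "g2": ["f1", "f2"],
    "h1": ["g1"],
}
has_x = {"me", "f1", "g1"}

prevalence = prevalence_by_distance(adjacency, has_x, "me")
```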

All of which is to say that transitivity in social networks is more interesting and important than it might seem intuitively. Social network sites have tried, in various ways, to exploit transitivity, usually in some form of spreading: recommending things that I like or am doing to my immediate neighbours, and suggesting that people at distance 2 might usefully become friends at distance 1.

These attempts have, I think it is fair to say, been less than successful. A big part of the reason is the failure to model links (edges) as being of different kinds, as well as different intensities. Such sites do have access to intensity data, so they can estimate a weight for edges linking people (although this is probably also a bit shaky, since many forms of contact are automated, so it’s not clear how much bonding each actually represents). In particular, connections that derive from work and those that derive from leisure seem like they should be treated differently, and some of the embarrassing faux pas have resulted from e.g. trying to get people to friend their boss’s boss. But, in general, people live in many different communities, and transitivity doesn’t work well across communities. It seems hard to tell when transitivity is and is not a good thing without distinguishing different kinds of connections, and so different kinds of edges. For example, a personal relationship could be represented by a red edge, a work relationship by a blue edge, and a family relationship by a green edge. Now transitivity along paths of the same colour becomes a much more powerful, and less treacherous, idea.
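The coloured-edge idea can be sketched in a few lines. This is only an illustration under my own assumptions: edges are undirected `(a, b, colour)` triples, the function name `same_colour_suggestions` is made up, and the optional `transitive_colours` argument is an extra knob I've added so that a site could decide which colours it treats as transitive at all.

```python
from collections import defaultdict


def same_colour_suggestions(edges, person, transitive_colours=None):
    """Suggest people at distance 2 reached by two edges of the same colour.

    `edges` is an iterable of (a, b, colour) triples for undirected links.
    If `transitive_colours` is given, only those colours are followed.
    Returns {candidate: set of colours along which they were reached}.
    """
    neighbours = defaultdict(set)  # person -> {(other person, colour)}
    for a, b, colour in edges:
        neighbours[a].add((b, colour))
        neighbours[b].add((a, colour))

    direct = {other for other, _ in neighbours[person]}
    suggestions = defaultdict(set)
    for friend, colour in neighbours[person]:
        if transitive_colours is not None and colour not in transitive_colours:
            continue
        for candidate, second_colour in neighbours[friend]:
            # Keep only paths whose two edges share a colour, and skip
            # the person themselves and anyone they already know.
            if (
                second_colour == colour
                and candidate != person
                and candidate not in direct
            ):
                suggestions[candidate].add(colour)
    return dict(suggestions)


edges = [
    ("A", "B", "red"),    # A and B are friends
    ("B", "C", "red"),    # C is a friend of B
    ("A", "M", "blue"),   # M is A's boss
    ("M", "N", "blue"),   # N is M's boss
    ("M", "P", "red"),    # P is a personal friend of M
    ("A", "S", "green"),  # S is A's sibling
]

all_colours = same_colour_suggestions(edges, "A")
personal_only = same_colour_suggestions(edges, "A", {"red", "green"})
```

In this toy graph P is never suggested to A, because the path through M mixes a blue edge with a red one. The boss's boss N is still reachable along an all-blue path, so colouring by itself doesn't remove that faux pas; what it does is let the site choose, as in `personal_only`, not to treat work edges as transitive.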

From the perspective of data analysis, there are two challenges: how to acquire the information about what kind of edge a relationship is, and how to modify analysis techniques to take edges of different kinds into account.

The colour of an edge is not easy to induce from observing the activity on that edge. For example, suppose that you have access to someone’s email and you want to work out who are their friends, and who are professional contacts. The structure of email addresses doesn’t help much because a friend’s work email might be used, and because email addresses tend to be surrogates anyway. Time of day doesn’t help much because many people send personal email at work, and many send work emails out of working hours. The content of emails might help, but many organisations have extensive in-house non-work emails (for example, Enron had many emails about fantasy football that circulated only within the company). Social network sites have an advantage because they can ask users to explain which category of “friending” a particular contact is (this could be a big win: a category of “annoying person I don’t want to offend by removing the contact” could easily become the most popular edge type). In an intelligence or law enforcement setting, where the existence of the contact is acquired by observation or interception and nobody can be asked, the problem of categorising the contact is just as difficult as in the email example.

Even if the edges can be labelled to indicate their type, using this information to improve the analysis of the resulting graph is difficult, and largely unstudied (AFAIK). Most existing methods use some kind of iterative approach (see here for a recent example and some references). Integrating edge types into spectral approaches would be particularly useful. Volunteers, anyone?
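To be clear about what the question involves, here is one naive starting point, offered as a sketch and not as an answer: build a separate graph Laplacian for each edge colour and add them up with a chosen weight per colour. The function names and the `colour_weights` parameter are my own assumptions. The eigenvectors of the combined matrix could then be handed to an ordinary spectral embedding (for example via `numpy.linalg.eigh`), which is omitted here.

```python
def laplacian(nodes, edges):
    """Weighted graph Laplacian L = D - W, as a list of lists.

    `edges` is an iterable of (a, b, weight) triples for undirected links.
    """
    index = {n: i for i, n in enumerate(nodes)}
    size = len(nodes)
    L = [[0.0] * size for _ in range(size)]
    for a, b, weight in edges:
        i, j = index[a], index[b]
        L[i][j] -= weight
        L[j][i] -= weight
        L[i][i] += weight
        L[j][j] += weight
    return L


def combined_laplacian(nodes, typed_edges, colour_weights):
    """Sum the per-colour Laplacians, each scaled by its colour weight.

    `typed_edges` is an iterable of (a, b, weight, colour) tuples.
    Colours missing from `colour_weights` are ignored.
    """
    size = len(nodes)
    total = [[0.0] * size for _ in range(size)]
    for colour, alpha in colour_weights.items():
        edges = [(a, b, w) for a, b, w, c in typed_edges if c == colour]
        L = laplacian(nodes, edges)
        for i in range(size):
            for j in range(size):
                total[i][j] += alpha * L[i][j]
    return total


nodes = ["A", "B", "C"]
typed_edges = [
    ("A", "B", 1.0, "red"),   # personal link
    ("B", "C", 1.0, "blue"),  # work link
]
# Hypothetical choice: work links count half as much as personal ones.
combined = combined_laplacian(nodes, typed_edges, {"red": 1.0, "blue": 0.5})
```

The obvious weakness is that this collapses type back into weight: once the matrices are summed, a blue edge is just a weaker edge, which is exactly the modelling shortcut the post argues against. A real solution would have to keep the colours distinct inside the analysis.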