### Small world is incompatible with communities

Most human-generated graphs, especially social networks, have the small world property: their diameter is much smaller than for a random graph with the same number of nodes or, more intuitively, there are short paths linking every pair of nodes. For example, in the real world one plausible path that could be used to transfer something between A and B would be:

A — local political representative — president of A’s country — president of B’s country — local political representative — B

taking 5 steps. The famous “6 degrees of separation” from Milgram’s experiment reflects the need for willingness of the people along the path, and also shows that there must be many redundant paths connecting A and B.

The example shows that the small world property sort of implies that there’s a tree contained within the topology of the network, so that one way to find short paths is to move up the tree and down again. Of course, real networks will tend to be much richer. But a binary tree is the canonical example of a small world graph.

Now what about communities within such networks. It’s a little difficult to come up with a reasonable definition for a community, although it’s kind of intuitive. A community is a set of nodes that’s more richly connected to one another than they are to the rest of the graph.

The trouble is that the property that this definition is trying to capture is actually the opposite of what needs to happen in a small world graph. Suppose that I’m a node in a community and I want to reach some distant node via a short path. Once I get to the long-distance connection structure everything is easy because it gets me right across the network in a few steps. The problem is getting to this structure in the first place. If I have to follow a number of steps within my community (and have to do the same within the community at the other end) then the small world property breaks.

The only way to preserve the small world property and also have communities of non-trivial size is for the graph to be a tree with cliques at each leaf — then every node has immediate access to the node that is connected to the  long-distances edges of the tree. But this is a very artificial graph. In real networks, the nodes are not specialized for long-distance connection; they are simply ordinary nodes (and so could be inside communities) that happen to have an incident edge that’s connected “far away”.

In other words, a small world graph can’t really have subgraphs that are connected mostly internally with only an occasional edge going to other subgraphs. Yet it’s surprising how often the example figures in research papers in this area show such a graph. Leskovec, Lang et al. did a series of experiments looking at the community structure discovered by a large variety of algorithms. They showed what you’d expect — if you look for communities of size k, you’ll find them because you’re carving a structure that has communities all the way down into chunks of size k. The really interesting part of their results is that there is an inhomogeneity: there are better (more easily selected) communities of size around 125-150, right at the Dunbar size that seems to be hardwired into humans. And this seems to be true for several different kinds of networks, suggesting that it this size is cognitively mediated rather than directly socially mediated.

J. Leskovec, K. Lang, A. Dasgupta, and M. Mahoney. Community Structure in Large Networks: Natural Cluster Sizes and the Absence of Large Well-Defined Clusters. Internet Mathematics 6(1), 2009.

The bottom line: there are communities in a network exactly to the extent that it fails to be small world (except for the trivial tree-connected cliques case). We think that there are communities in the networks to which we belong but it may (must) be, to some extent, an illusion that we impose by using other, non-graph criteria for who’s in and who’s out. Useful community detection probably can’t be done on the basis of graph structure alone.