<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Finding Bad Guys in Data</title>
	<atom:link href="http://skillicorn.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://skillicorn.wordpress.com</link>
	<description>Knowledge Discovery in Adversarial Situations</description>
	<lastBuildDate>Tue, 10 Jan 2012 01:44:06 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='skillicorn.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://s2.wp.com/i/buttonw-com.png</url>
		<title>Finding Bad Guys in Data</title>
		<link>http://skillicorn.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://skillicorn.wordpress.com/osd.xml" title="Finding Bad Guys in Data" />
	<atom:link rel='hub' href='http://skillicorn.wordpress.com/?pushpress=hub'/>
		<item>
		<title>The Analysis Chasm</title>
		<link>http://skillicorn.wordpress.com/2012/01/09/the-analysis-chasm/</link>
		<comments>http://skillicorn.wordpress.com/2012/01/09/the-analysis-chasm/#comments</comments>
		<pubDate>Tue, 10 Jan 2012 01:44:06 +0000</pubDate>
		<dc:creator>skillicorn</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[counterterrorism]]></category>
		<category><![CDATA[cybersecurity]]></category>
		<category><![CDATA[data mining]]></category>
		<category><![CDATA[inductive modeling]]></category>
		<category><![CDATA[inductive modelling]]></category>
		<category><![CDATA[intelligence analysis]]></category>
		<category><![CDATA[knowledge discovery]]></category>
		<category><![CDATA[sensemaking]]></category>

		<guid isPermaLink="false">http://skillicorn.wordpress.com/?p=375</guid>
		<description><![CDATA[I&#8217;ve recently heard a couple of government people (in different countries) complain about the way in which intelligence analysis is conceptualized, and so how intelligence organizations are constructed. There are two big problems: 1.  &#8220;Intelligence analysts&#8221; don&#8217;t usually interact with datasets directly, but rather via &#8220;data analysts&#8221;, who aren&#8217;t considered &#8220;real&#8221; analysts. I&#8217;m told that, [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=skillicorn.wordpress.com&amp;blog=2947850&amp;post=375&amp;subd=skillicorn&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve recently heard a couple of government people (in different countries) complain about the way in which intelligence analysis is conceptualized, and so how intelligence organizations are constructed. There are two big problems:</p>
<p>1. &#8220;Intelligence analysts&#8221; don&#8217;t usually interact with datasets directly, but rather via &#8220;data analysts&#8221;, who aren&#8217;t considered &#8220;real&#8221; analysts. I&#8217;m told that, at least in Canada, you have to have a social science degree to be an intelligence analyst. Unsurprisingly (at least for now) people with this background don&#8217;t have much feel for big data and for what can be learned from it. Intelligence analysts tend to treat the aggregate of the datasets and the data analysts as a large black box, and use it as a form of Go Fish. In other words, intelligence analysts ask data analysts &#8220;Have we seen one of these?&#8221;; the data analysts search the datasets and the models built from them, and write a report giving the answer. The data analyst doesn&#8217;t know why the question was asked and so cannot write the more helpful report that would be possible given some knowledge of the context. Neither side is getting as much benefit from the data as it could, mostly because of a separation of roles that developed historically but makes little sense.</p>
<p>2. Intelligence analysts, and many data analysts, don&#8217;t understand inductive modelling from data. It&#8217;s not that they don&#8217;t have the technical knowledge (although they usually don&#8217;t) but they don&#8217;t have the conceptual mindset to understand that data can push models to analysts: &#8220;Here&#8217;s something that&#8217;s anomalous and may be important&#8221;; &#8220;Here&#8217;s something that only occurs a few times in a dataset where all behavior should be typical and so highly repetitive&#8221;; &#8220;Here&#8217;s something that has changed since yesterday in a way that nothing else has&#8221;. Data systems that do inductive modelling don&#8217;t have to wait for an analyst to think &#8220;Maybe this is happening&#8221;. The role of an analyst changes from being the person who has to think up hypotheses, to the person who has to judge hypotheses for plausibility. The first task is something humans aren&#8217;t especially good at, and it&#8217;s something that requires imagination, which tends to disappear in a crisis or under pressure. The second task is easier, although not something we&#8217;re necessarily perfect at.</p>
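<p>As a rough illustration of data &#8220;pushing&#8221; candidate hypotheses to an analyst, here is a minimal Python sketch. The function names, the event log, and the threshold are hypothetical and chosen only for illustration; real inductive modelling would use far richer models than simple counts.</p>

```python
from collections import Counter


def rare_values(records, max_count=2):
    """Return (value, count) pairs that occur at most max_count times, rarest first."""
    counts = Counter(records)
    rare = [(value, n) for value, n in counts.items() if max_count >= n]
    return sorted(rare, key=lambda pair: (pair[1], str(pair[0])))


def changed_since(yesterday, today):
    """Return {value: (count_yesterday, count_today)} for values whose count changed."""
    before, after = Counter(yesterday), Counter(today)
    keys = sorted(set(before) | set(after))
    return {k: (before[k], after[k]) for k in keys if before[k] != after[k]}


# Hypothetical event logs in which almost all behaviour is repetitive.
yesterday = ["login"] * 50 + ["logout"] * 48 + ["export"] * 1
today = ["login"] * 50 + ["logout"] * 48 + ["export"] * 9

print(rare_values(yesterday))           # something that only occurs a few times
print(changed_since(yesterday, today))  # something that changed when nothing else did
```

<p>In this sketch nobody has to ask &#8220;Have we seen an export?&#8221;; the rare event and the overnight change are surfaced unprompted, and the analyst only has to judge whether they matter.</p>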
<p>There simply is no path for inductive models from data to get to intelligence analysts in most organizations today. It&#8217;s difficult enough to get data analysts to appreciate the possibilities; getting models across the chasm, unsolicited, to intelligence analysts is (to coin a phrase) a bridge too far.</p>
<p>Addressing both of these problems requires a fairly revolutionary redesign of the way intelligence analysis is done, and an equally large change in the kind of education that analysts receive. And it really is a different kind of education, not just a kind of training, because inductive modelling from data seems to require a mindset change, not the supply of some missing mental information. Until such changes are made, most intelligence organizations are fighting with one and a half arms tied behind their collective backs.</p>
<br /> Tagged: <a href='http://skillicorn.wordpress.com/tag/counterterrorism/'>counterterrorism</a>, <a href='http://skillicorn.wordpress.com/tag/cybersecurity/'>cybersecurity</a>, <a href='http://skillicorn.wordpress.com/tag/data-mining/'>data mining</a>, <a href='http://skillicorn.wordpress.com/tag/inductive-modeling/'>inductive modeling</a>, <a href='http://skillicorn.wordpress.com/tag/inductive-modelling/'>inductive modelling</a>, <a href='http://skillicorn.wordpress.com/tag/intelligence-analysis/'>intelligence analysis</a>, <a href='http://skillicorn.wordpress.com/tag/knowledge-discovery/'>knowledge discovery</a>, <a href='http://skillicorn.wordpress.com/tag/sensemaking/'>sensemaking</a>]]></content:encoded>
			<wfw:commentRss>http://skillicorn.wordpress.com/2012/01/09/the-analysis-chasm/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c65eddaeb6dee8b498863c39fc078c41?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">skillicorn</media:title>
		</media:content>
	</item>
		<item>
		<title>Spam Reporting Centre</title>
		<link>http://skillicorn.wordpress.com/2012/01/06/spam-reporting-centre/</link>
		<comments>http://skillicorn.wordpress.com/2012/01/06/spam-reporting-centre/#comments</comments>
		<pubDate>Fri, 06 Jan 2012 17:08:36 +0000</pubDate>
		<dc:creator>skillicorn</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[attribution]]></category>
		<category><![CDATA[Canada]]></category>
		<category><![CDATA[cybersecurity]]></category>
		<category><![CDATA[honeytrap]]></category>
		<category><![CDATA[malware]]></category>
		<category><![CDATA[spam]]></category>

		<guid isPermaLink="false">http://skillicorn.wordpress.com/?p=368</guid>
		<description><![CDATA[The Canadian government has decided to create a spam reporting centre (aka &#8216;The Freezer&#8217;) to address issues arising from cybercrime and communications fraud and annoyances of various kinds. The idea cannot possibly work on technical grounds. More worryingly, it displays a lack of awareness of the realities of cybersecurity that is astounding. The first peculiarity [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=skillicorn.wordpress.com&amp;blog=2947850&amp;post=368&amp;subd=skillicorn&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>The Canadian government has decided to create a spam reporting centre (aka &#8216;The Freezer&#8217;) to address issues arising from cybercrime and communications fraud and annoyances of various kinds.</p>
<p>The idea cannot possibly work on technical grounds. More worryingly, it displays a lack of awareness of the realities of cybersecurity that is astounding.</p>
<p>The first peculiarity is that the Centre is supposed to address four problems: email spam, unsolicited phone calls, fake communications a la Facebook, and malware. Although these have a certain superficial similarity &#8212; they all annoy individuals &#8212; they do not raise the same kinds of technical issues underneath, and no one person could be an expert in detecting, let alone prosecuting, all of them. It&#8217;s a bit like trying to amalgamate the Salvation Army and the police force because they both wear uniforms and help people!</p>
<p>The Centre will rely on reports from individuals: get a spam email and forward it to the Centre, for example. One of the troubles with this idea is that individuals don&#8217;t usually have enough information to report such things in a useful way, and they don&#8217;t make good starting points for an eventual prosecution. Canada already has a way to report unsolicited phone calls but it only works for people who <em>almost</em> keep the law by announcing who they are at the beginning. The annoying (and illegal) robocalls can&#8217;t be reported because the person who gets them doesn&#8217;t know where they are coming from and who&#8217;s making them. And where there are prosecutions, each person who reports such a call has to sign an affidavit that the purported call did actually happen to provide the legal basis for the incident.</p>
<p>The second, huge, problem with this idea is that, if individuals can report bad incidents, then spammers can also report fake bad incidents! And they can do it in such volume that investigators will have no way to distinguish the real from the fake. Creating fake spam emails, and evading mechanisms such as captchas that are meant to prevent wholesale reporting, is very easy.</p>
<p>There is also the deeper problem that besets all cybersecurity &#8212; attribution. It is always hard to trace cyberexploits back to their origins, and these origins are overwhelmingly likely to be computers taken over by botnets anyway. Working back along such chains to find someone to prosecute is tedious and expert work that depends on starting from as much information as possible.</p>
<p>The right way to address this problem is to set up honeytraps &#8212; machines and phones that seem to be ordinary but are instrumented so that, when an exploit happens, as much information as possible is collected at the time. Now there is a foundation for deciding which incidents are worth pursuing and starting out in pursuit with the best possible information. And, who knows, the knowledge that such systems are out there might dampen some of the enthusiasm on the part of the bad guys.</p>
<br /> Tagged: <a href='http://skillicorn.wordpress.com/tag/attribution/'>attribution</a>, <a href='http://skillicorn.wordpress.com/tag/canada/'>Canada</a>, <a href='http://skillicorn.wordpress.com/tag/cybersecurity/'>cybersecurity</a>, <a href='http://skillicorn.wordpress.com/tag/honeytrap/'>honeytrap</a>, <a href='http://skillicorn.wordpress.com/tag/malware/'>malware</a>, <a href='http://skillicorn.wordpress.com/tag/spam/'>spam</a>]]></content:encoded>
			<wfw:commentRss>http://skillicorn.wordpress.com/2012/01/06/spam-reporting-centre/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c65eddaeb6dee8b498863c39fc078c41?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">skillicorn</media:title>
		</media:content>
	</item>
		<item>
		<title>Relentlessly re-explaining</title>
		<link>http://skillicorn.wordpress.com/2011/12/16/relentlessly-re-explaining/</link>
		<comments>http://skillicorn.wordpress.com/2011/12/16/relentlessly-re-explaining/#comments</comments>
		<pubDate>Fri, 16 Dec 2011 15:07:27 +0000</pubDate>
		<dc:creator>skillicorn</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[education]]></category>
		<category><![CDATA[research intensive]]></category>
		<category><![CDATA[scholarship]]></category>

		<guid isPermaLink="false">http://skillicorn.wordpress.com/?p=364</guid>
		<description><![CDATA[I was at a workshop a few months ago where the people were almost evenly divided: one-third government, one-third industry; and one-third academia. It struck me, as it hadn&#8217;t before, how much better the academics were at explaining things. Of course, we all know academics whose presentations are dreadful: both dull and incomprehensible but, on [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=skillicorn.wordpress.com&amp;blog=2947850&amp;post=364&amp;subd=skillicorn&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>I was at a workshop a few months ago where the people were almost evenly divided: one-third government, one-third industry, and one-third academia. It struck me, as it hadn&#8217;t before, how much better the academics were at explaining things. Of course, we all know academics whose presentations are dreadful, both dull and incomprehensible, but, on average, the quality of the academics&#8217; presentations was much, much better than that of the other two groups.</p>
<p>Thinking about this, I realized that it&#8217;s another facet of scholarship. One of the (mostly invisible) things that research-active academics do is to take ideas and results from the cutting edge of knowledge and digest, rework, and refactor them to extract the key aspects and get rid of the unavoidable autobiography that goes along with research results. This is mostly why an undergraduate education is better at a research-intensive university. And why undergraduate degrees don&#8217;t keep getting longer, although the quantity of stuff we know is growing rapidly.</p>
<p>But there&#8217;s another aspect to this which I, at least, had under-rated. That is the relentless, year after year presentation of &#8220;the same&#8221; ideas to a new crop of students who don&#8217;t already understand them. Of course, the ideas themselves are not the same from year to year &#8212; I taught the same course for twenty years, but I was still improving it and seeing implications that had previously escaped me. But the work of presenting the ideas over and over again, by itself, forces me to rethink and say things in different (and better) ways each time. This output side of scholarship was something I had not really appreciated enough. And I think it&#8217;s the explanation for why academics are better at communicating to a mixed audience than those who only ever talk to people from the same backgrounds.</p>
<p>I had always felt that working in a research lab rather than a university would be a more sterile experience, and now I think that I understand why. So this post is really a big thank you to all of my students, undergraduates and graduates, for making it possible for me to explain to you, and in the process understand better myself. And to those who got the earlier versions, my apologies, but it simply isn&#8217;t possible to polish first and then teach.</p>
<br /> Tagged: <a href='http://skillicorn.wordpress.com/tag/education/'>education</a>, <a href='http://skillicorn.wordpress.com/tag/research-intensive/'>research intensive</a>, <a href='http://skillicorn.wordpress.com/tag/scholarship/'>scholarship</a>]]></content:encoded>
			<wfw:commentRss>http://skillicorn.wordpress.com/2011/12/16/relentlessly-re-explaining/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c65eddaeb6dee8b498863c39fc078c41?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">skillicorn</media:title>
		</media:content>
	</item>
		<item>
		<title>Adversarial knowledge discovery is not just knowledge discovery with classified data</title>
		<link>http://skillicorn.wordpress.com/2011/12/01/adversarial-knowledge-discovery-is-not-just-knowledge-discovery-with-classified-data/</link>
		<comments>http://skillicorn.wordpress.com/2011/12/01/adversarial-knowledge-discovery-is-not-just-knowledge-discovery-with-classified-data/#comments</comments>
		<pubDate>Thu, 01 Dec 2011 21:33:16 +0000</pubDate>
		<dc:creator>skillicorn</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[adversarial]]></category>
		<category><![CDATA[data mining]]></category>
		<category><![CDATA[inductive modelling]]></category>
		<category><![CDATA[knowledge discovery]]></category>
		<category><![CDATA[prediction]]></category>
		<category><![CDATA[random forest]]></category>
		<category><![CDATA[support vector machine]]></category>

		<guid isPermaLink="false">http://skillicorn.wordpress.com/?p=360</guid>
		<description><![CDATA[Someone made a comment to me this week implying that data mining/knowledge discovery in classified settings was a straightforward problem because the algorithms didn&#8217;t have to be classified, just the datasets. This view isn&#8217;t accurate, for the following reason: mainstream/ordinary knowledge discovery builds models inductively from data by maximizing the fit between the model and [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=skillicorn.wordpress.com&amp;blog=2947850&amp;post=360&amp;subd=skillicorn&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>Someone made a comment to me this week implying that data mining/knowledge discovery in classified settings was a straightforward problem because the algorithms didn&#8217;t have to be classified, just the datasets.</p>
<p>This view isn&#8217;t accurate, for the following reason: mainstream/ordinary knowledge discovery builds models inductively from data by maximizing the fit between the model and the available data. In an adversarial setting, this creates problems: adversaries can make a good guess about what your model will look like, and so can do better at hiding their records (making them less obvious or detectable); and they can also tell exactly what kinds of data they should try and force you to use if/when you retrain your model.</p>
<p>A simple example. The two leading-edge predictors today are support vector machines and random forests and, in mainstream settings, there&#8217;s often little to choose between them in prediction performance. However, in an adversarial setting, the difference is huge: support vector machines are relatively easy to game, because even one record that&#8217;s in the wrong place or mislabelled can change the entire orientation of the decision surface. To make it even worse, the way in which the boundary changes is roughly predictable from the kind of manipulation made. (You can see how this works in: J. G. Dutrisac, David B. Skillicorn: Subverting prediction in adversarial settings. ISI 2008: 19-24.) Random forests, on the other hand, are much more robust predictors, partly because their ensemble structure, together with the inherent randomness &#8216;inside&#8217; the algorithm, makes it hard to force particular behaviour.</p>
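<p>To make the fragility concrete, here is a toy Python sketch of a hard-margin separator in one dimension. It is an illustrative assumption, not the method of the paper cited above and not a full support vector machine: the function name and the data points are hypothetical, and the classes are assumed to be separable with the -1 class on the left.</p>

```python
def max_margin_threshold(points):
    """Maximum-margin boundary for separable 1-D data.

    points: list of (x, label) pairs with label -1 or +1, where every
    -1 point lies to the left of every +1 point. The boundary is the
    midpoint between the closest pair of opposite-class points, so only
    those two records (the support vectors) determine it.
    """
    left = max(x for x, label in points if label == -1)
    right = min(x for x, label in points if label == +1)
    if left >= right:
        raise ValueError("classes are not separable in this order")
    return (left + right) / 2


# Hypothetical training data: two well-separated groups.
clean = [(0.0, -1), (1.0, -1), (2.0, -1), (8.0, +1), (9.0, +1), (10.0, +1)]

# An adversary adds a single record labelled +1 close to the -1 group.
poisoned = clean + [(3.0, +1)]

print(max_margin_threshold(clean))     # 5.0
print(max_margin_threshold(poisoned))  # 2.5
```

<p>One inserted record moves the boundary from 5.0 to 2.5, and the direction and size of the shift follow directly from where the record was placed, which is the sense in which the change is predictable to whoever planted it.</p>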
<p>The same thing happens with clustering algorithms. Algorithms such as Expectation-Maximization are easily misled by a few carefully chosen records; and no mainstream clustering technique does a good job of finding the kinds of &#8216;fringe&#8217; clusters that occur when something unusual is present in the data, but adversaries have tried hard to make it as usual as possible.</p>
<p>In fact, adversarial knowledge discovery requires building the entire framework of knowledge discovery over again, taking into account the adversarial nature of the problem from the very beginning. Some parts of mainstream knowledge discovery can be used directly; others with some adaptation; and others can&#8217;t safely be used. There are also some areas where we don&#8217;t know very much about how to solve problems that aren&#8217;t interesting in the mainstream, but are critical in the adversarial domain.</p>
<br /> Tagged: <a href='http://skillicorn.wordpress.com/tag/adversarial/'>adversarial</a>, <a href='http://skillicorn.wordpress.com/tag/data-mining/'>data mining</a>, <a href='http://skillicorn.wordpress.com/tag/inductive-modelling/'>inductive modelling</a>, <a href='http://skillicorn.wordpress.com/tag/knowledge-discovery/'>knowledge discovery</a>, <a href='http://skillicorn.wordpress.com/tag/prediction/'>prediction</a>, <a href='http://skillicorn.wordpress.com/tag/random-forest/'>random forest</a>, <a href='http://skillicorn.wordpress.com/tag/support-vector-machine/'>support vector machine</a>]]></content:encoded>
			<wfw:commentRss>http://skillicorn.wordpress.com/2011/12/01/adversarial-knowledge-discovery-is-not-just-knowledge-discovery-with-classified-data/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c65eddaeb6dee8b498863c39fc078c41?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">skillicorn</media:title>
		</media:content>
	</item>
		<item>
		<title>Language in Presidential Elections &#8212; 2012 Season Opener</title>
		<link>http://skillicorn.wordpress.com/2011/08/12/language-in-presidential-elections-2012-season-opener/</link>
		<comments>http://skillicorn.wordpress.com/2011/08/12/language-in-presidential-elections-2012-season-opener/#comments</comments>
		<pubDate>Fri, 12 Aug 2011 14:43:20 +0000</pubDate>
		<dc:creator>skillicorn</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[debate]]></category>
		<category><![CDATA[deception]]></category>
		<category><![CDATA[election]]></category>
		<category><![CDATA[language]]></category>
		<category><![CDATA[persona]]></category>
		<category><![CDATA[politics]]></category>
		<category><![CDATA[Republican]]></category>
		<category><![CDATA[spin]]></category>
		<category><![CDATA[textual analysis]]></category>
		<category><![CDATA[US election]]></category>

		<guid isPermaLink="false">http://skillicorn.wordpress.com/?p=356</guid>
		<description><![CDATA[Readers of this blog will know that we spent a lot of time analyzing the speeches of the U.S. presidential candidates in the 2008 election. Our primary interest was in the use of the deception model, a linguistic/textual model of how freeform language changes when the speaker/writer is being deceptive. In the political arena, factual [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=skillicorn.wordpress.com&amp;blog=2947850&amp;post=356&amp;subd=skillicorn&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>Readers of this blog will know that we spent a lot of time analyzing the speeches of the U.S. presidential candidates in the 2008 election. Our primary interest was in the use of the deception model, a linguistic/textual model of how freeform language changes when the speaker/writer is being deceptive.</p>
<p>In the political arena, factual deception, saying things that just ain&#8217;t so, plays very little role, perhaps because voters have very low expectations of politicians in this area. What we call persona deception, presenting oneself as a better, wiser, more powerful, more able, more knowledgeable person than one really is, is the heart of successful campaigning. It turns out that the deception model captures deception across the whole range from factual to persona deception, so it gives us a lens to look at candidates and campaigns. What&#8217;s more, because language generation is almost entirely subconscious, this lens is hard to fool.</p>
<p>The most important skill candidates and their campaigns have is the ability to reach out to potential voters to convince them that they are better than the other possibilities. The language that they use is an important channel, especially in settings where everyone is conservatively dressed, and standing behind a podium that conceals most of their body language, as the Republican presidential field was in Iowa yesterday.</p>
<p>Strong candidates understand, at least instinctively, that they are not making arguments to convince voters, but presenting themselves as more compelling human beings. Our analysis of the speeches of candidates in the 2008 U.S. presidential election showed that candidates use three different kinds of speeches: blue skies speeches that promise generically good things and could be delivered interchangeably by any candidate – they are aimed at a wide audience; track record speeches that use past achievements to imply special qualifications for future achievements – they are aimed at swing voters; and manifesto speeches that describe a candidate’s personal qualities directly – they are aimed at a candidate’s base and reinforce common identity. But in all three cases, it’s not the content of the speech that matters, but what it implies about the speaker.</p>
<p>Our analysis in the last election cycle showed that Obama was by far the best at presenting himself as a wonderful person, and many voters, and certainly many in the media, projected onto the persona positive qualities that were perhaps not there. Interestingly, yesterday was the first time I have seen open Democratic buyer&#8217;s remorse about electing Obama, something I predicted would happen from the analysis we did.</p>
<p>The Republican candidates’ debate in Ames showed what a shaky grasp many of the candidates have on how to be a convincing candidate. Of course, this venue was a difficult one. Its overt purpose was for candidates to explain themselves to the local Republican base ahead of the Ames Straw Poll, which would have required largely manifesto content; but national television coverage made it an unmissable opportunity to reach out to a wider but much more diverse audience, suggesting track record content. Blue skies content is always dangerous in the early stages of a campaign because grand but potentially unwise statements can come back to haunt a candidate.</p>
<p>Manifesto content was indeed popular – for example, we learned how many children almost every candidate has – typical content aimed at the base (&#8220;I&#8217;m a parent just like you&#8221;). Several candidates also tried for track record content, but got it quite wrong. The purpose of a track record speech is not for candidates to read their resumes to the audience; it’s to make the argument “I was able to do A, so you can trust me to be able to do similar-but-larger B” and this second part was notably absent.</p>
<p>Voters also want candidates to be sincere &#8212; recall the famous quotation &#8220;The secret of success is sincerity. Once you can fake that you&#8217;ve got it made&#8221; (Jean Giraudoux). This is not just a cute quotation; this is what good politicians are able to do. In Iowa, this was another area where almost everyone stumbled. It was clear that most of the candidates had not only prepared talking point responses to probable questions, but had also rehearsed actual answers. Delivering from a prepared and memorized script and seeming sincere is a difficult business, and actors who can do it reliably command high rewards. Most of the candidates failed at seeming sincere. Several managed the worst of both worlds by trying to combine their prepared scripts with some ad libbing and came across as quite incoherent. One of the reasons for Gingrich’s strong showing is that he stayed away from scripts and delivered his answers as if he had just thought of them. Huntsman and Romney, in contrast, were especially wooden.</p>
<p>When humans listen to humans, the content matters. But when character is the issue, other aspects of language matter more. Much language generation is subconscious, and therefore beyond a candidate’s control. This is good for voters because it means we can sometimes see through to the real person no matter how sophisticated their speech writers and spin doctors.</p>
<br /> Tagged: <a href='http://skillicorn.wordpress.com/tag/debate/'>debate</a>, <a href='http://skillicorn.wordpress.com/tag/deception/'>deception</a>, <a href='http://skillicorn.wordpress.com/tag/election/'>election</a>, <a href='http://skillicorn.wordpress.com/tag/language/'>language</a>, <a href='http://skillicorn.wordpress.com/tag/persona/'>persona</a>, <a href='http://skillicorn.wordpress.com/tag/politics/'>politics</a>, <a href='http://skillicorn.wordpress.com/tag/republican/'>Republican</a>, <a href='http://skillicorn.wordpress.com/tag/spin/'>spin</a>, <a href='http://skillicorn.wordpress.com/tag/textual-analysis/'>textual analysis</a>, <a href='http://skillicorn.wordpress.com/tag/us-election/'>US election</a>]]></content:encoded>
			<wfw:commentRss>http://skillicorn.wordpress.com/2011/08/12/language-in-presidential-elections-2012-season-opener/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c65eddaeb6dee8b498863c39fc078c41?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">skillicorn</media:title>
		</media:content>
	</item>
		<item>
		<title>Persistent Malware Attacks</title>
		<link>http://skillicorn.wordpress.com/2011/08/08/persistent-malware-attacks/</link>
		<comments>http://skillicorn.wordpress.com/2011/08/08/persistent-malware-attacks/#comments</comments>
		<pubDate>Mon, 08 Aug 2011 16:49:18 +0000</pubDate>
		<dc:creator>skillicorn</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[cybersecurity]]></category>
		<category><![CDATA[malware]]></category>
		<category><![CDATA[Mcafee]]></category>
		<category><![CDATA[spear phishing]]></category>

		<guid isPermaLink="false">http://skillicorn.wordpress.com/?p=351</guid>
		<description><![CDATA[The revelation by McAfee last week has created some waves. Here are a few thoughts from an adversarial analysis perspective. The thing that has gotten attention about this report is that it describes attacks by a single attacker and single vector that have lasted over a long period of time (more than 5 years) and [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=skillicorn.wordpress.com&amp;blog=2947850&amp;post=351&amp;subd=skillicorn&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>The revelation by McAfee last week has created some waves. Here are a few thoughts from an adversarial analysis perspective.</p>
<p>The thing that has gotten attention about this report is that it describes attacks by a single attacker and single vector that have lasted over a long period of time (more than 5 years) and have targeted governments, quasi-government organizations, and businesses in a sophisticated way. The attacks are being attributed to a particular state actor, for obvious reasons, but attribution is always murky in cyberspace so it&#8217;s (just) conceivable that someone else is responsible and covering themselves.</p>
<p>It was helpful to get this kind of information out into the public awareness. People on the inside (for several different values of inside) have known about these kinds of attacks, their frequency, and their huge impact for some time; but either haven&#8217;t wanted to or haven&#8217;t been allowed to reveal them.</p>
<p>The attacks themselves seem to have begun with a spear phishing attack on some mid-level person at each organization, so relatively unsophisticated but requiring substantial preliminary research. I&#8217;m not aware of any attempt to measure how often spear phishing attacks succeed, but presumably, given the patience to space attempts far enough apart that nobody mentions them, not very much personalization is required. Once in, the attacks seem to have been quite sophisticated and long-lasting. Even after the report came out, several of the organizations were denying that they had been hacked. &#8220;Hacked and unaware&#8221; seems more likely than &#8220;not hacked&#8221; given that McAfee could see the IP logs.</p>
<p>Of course, since this is only one vector, it would be naive not to suppose that a number of other, broadly similar attacks are going on with other sources and vectors.</p>
<p>I did a fairly large number of media interviews about this report, and the obvious and common question that came up was: what did these organizations do wrong and what can be done to protect against these kinds of attacks? That&#8217;s a hard question to answer. Malware detection tools are in their infancy so, while running them is a good idea, they may not protect against sophisticated attacks very well. It doesn&#8217;t seem possible to protect completely against spear phishing, given the convenience of attachments. I received a number of emails from companies who claim that their approach/tools would have protected these organizations, but I didn&#8217;t see anything that looked like a substantial advance in the state of the art.</p>
<p>It may be that the time has come to do what the military and intelligence organizations do &#8212; to run separate networks that do not connect to the internet for anything that needs to be protected. This is, of course, relatively painful; and still not necessarily secure since data and software still need to be walked from one network to the other. But many organizations may need to take partial steps towards this kind of robust physical separation, since virtual separation is not working. In other words, firewalls don&#8217;t get the job done.</p>
<br /> Tagged: <a href='http://skillicorn.wordpress.com/tag/cybersecurity/'>cybersecurity</a>, <a href='http://skillicorn.wordpress.com/tag/malware/'>malware</a>, <a href='http://skillicorn.wordpress.com/tag/mcafee/'>Mcafee</a>, <a href='http://skillicorn.wordpress.com/tag/spear-phishing/'>spear phishing</a>]]></content:encoded>
			<wfw:commentRss>http://skillicorn.wordpress.com/2011/08/08/persistent-malware-attacks/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c65eddaeb6dee8b498863c39fc078c41?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">skillicorn</media:title>
		</media:content>
	</item>
		<item>
		<title>Google Ideas and Extremism</title>
		<link>http://skillicorn.wordpress.com/2011/07/04/google-ideas-and-extremism/</link>
		<comments>http://skillicorn.wordpress.com/2011/07/04/google-ideas-and-extremism/#comments</comments>
		<pubDate>Mon, 04 Jul 2011 13:44:20 +0000</pubDate>
		<dc:creator>skillicorn</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[extremism]]></category>
		<category><![CDATA[Google Ideas]]></category>
		<category><![CDATA[radicalization]]></category>
		<category><![CDATA[terrorism]]></category>

		<guid isPermaLink="false">http://skillicorn.wordpress.com/?p=345</guid>
		<description><![CDATA[Google&#8217;s think/do tank (!!) is sponsoring a summit on extremism. See the post by Jared Cohen, its director, here. The problem is that, like many such discussions, it&#8217;s based on the autobiographies of a number of people who became extremists &#8212; the idea is to look for commonalities in such biographies as hints about the [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=skillicorn.wordpress.com&amp;blog=2947850&amp;post=345&amp;subd=skillicorn&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>Google&#8217;s think/do tank (!!) is sponsoring a summit on extremism. See the post by Jared Cohen, its director, <a href="http://googlepublicpolicy.blogspot.com/2011/06/google-ideas-launches-summit-against.html">here</a>.</p>
<p>The problem is that, like many such discussions, it&#8217;s based on the autobiographies of a number of people who became extremists &#8212; the idea is to look for commonalities in such biographies as hints about the process and/or drivers of extremism.</p>
<p>BUT it ignores the very large number of people from apparently identical backgrounds who didn&#8217;t join gangs, or the IRA, or jihadist groups! Such people are counterexamples to almost all explanations of what happens with radicalization, and yet they are often/usually ignored in the discussion.</p>
<p>So Google asks:</p>
<p>&#8220;Why does a 13-year old boy in a tough neighborhood in South Central LA join a gang? Why does a high school student in a quiet, Midwestern American town sign on neo-Nazis who preach white supremacy? Why does a young woman in the Middle East abandon her family and future and become a suicide bomber?&#8221;</p>
<p>But just as important are questions like: why did the 13-year old boy&#8217;s best friend and classmate NOT join a gang, etc.</p>
<p>This summit&#8217;s approach is called, in the research community, &#8220;sampling on the dependent variable&#8221;. Google should know better.</p>
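<p>A minimal sketch of why sampling on the dependent variable misleads. The records and counts below are invented purely for illustration; they are not data from the summit or from any study.</p>

```python
# Hypothetical records: (tough_neighbourhood, joined_gang).
# All counts are made up to illustrate the sampling problem.
population = (
    [(True, True)] * 5         # tough background, joined
    + [(True, False)] * 95     # same background, did not join
    + [(False, False)] * 100   # different background, did not join
)


def share_with_background(records):
    """Fraction of records whose background flag is set."""
    return sum(1 for background, _ in records if background) / len(records)


def join_rate(records, background):
    """Fraction who joined among people with the given background value."""
    group = [joined for b, joined in records if b == background]
    return sum(group) / len(group)


# Looking only at people who joined (sampling on the outcome):
joiners_only = [record for record in population if record[1]]
print(share_with_background(joiners_only))

# Looking at everyone with that background, joiners or not:
print(join_rate(population, True))
```

<p>In this toy population every joiner shares the background, so the biographies alone make it look decisive; once the people who did not join are counted, only a small fraction of those with the same background joined.</p>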
<br /> Tagged: <a href='http://skillicorn.wordpress.com/tag/extremism/'>extremism</a>, <a href='http://skillicorn.wordpress.com/tag/google-ideas/'>Google Ideas</a>, <a href='http://skillicorn.wordpress.com/tag/radicalization/'>radicalization</a>, <a href='http://skillicorn.wordpress.com/tag/terrorism/'>terrorism</a>]]></content:encoded>
			<wfw:commentRss>http://skillicorn.wordpress.com/2011/07/04/google-ideas-and-extremism/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c65eddaeb6dee8b498863c39fc078c41?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">skillicorn</media:title>
		</media:content>
	</item>
		<item>
		<title>The power is in the edges</title>
		<link>http://skillicorn.wordpress.com/2011/06/24/the-power-is-in-the-edges/</link>
		<comments>http://skillicorn.wordpress.com/2011/06/24/the-power-is-in-the-edges/#comments</comments>
		<pubDate>Fri, 24 Jun 2011 05:13:22 +0000</pubDate>
		<dc:creator>skillicorn</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[connecting the dots]]></category>
		<category><![CDATA[edge prediction]]></category>
		<category><![CDATA[link analysis]]></category>
		<category><![CDATA[LinkedIn]]></category>
		<category><![CDATA[social network analysis]]></category>

		<guid isPermaLink="false">http://skillicorn.wordpress.com/?p=341</guid>
		<description><![CDATA[I&#8217;ve argued that it isn&#8217;t social media unless there are relational edges between individuals or individual objects. These edges are the drivers of power because the graph structure that emerges from them reveals a lot more than the individual nodes and edges do. The number of LinkedIn contacts I have is now large enough that [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=skillicorn.wordpress.com&amp;blog=2947850&amp;post=341&amp;subd=skillicorn&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve argued that it isn&#8217;t social media unless there are relational edges between individuals or individual objects. These edges are the drivers of power because the graph structure that emerges from them reveals a lot more than the individual nodes and edges do.</p>
<p>The number of LinkedIn contacts I have is now large enough that I can tell this story. I know someone from one of the more secretive US government organizations. His (or it might be her) public web presence, of course, has nothing at all to do with his day job, and we&#8217;ve never exchanged emails using his public email. Yet LinkedIn suggests him as someone I might possibly know.</p>
<p>The reason must be that we have enough mutual connections that the software LinkedIn uses sees that there &#8220;should&#8221; be some connection between us &#8212; it is doing edge prediction. This is exactly the kind of analysis that good intelligence tools can do on relational/graph data. The knowledge is in the links, collectively; in other words, noticing the potential link between us requires knowing both the presence of some links and the absence of others (because the system doesn&#8217;t recommend other people whose web presence is as dissimilar from mine as his is).</p>
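<p>A rough sketch of edge prediction by counting mutual connections. I have no knowledge of what LinkedIn actually runs; the common-neighbour score, the function name and the toy graph are all assumptions chosen for illustration.</p>

```python
from itertools import combinations


def common_neighbor_scores(edges):
    """Score each unlinked pair by how many neighbours the two nodes share."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    scores = {}
    for u, v in combinations(sorted(neighbors), 2):
        if v in neighbors[u]:
            continue  # already linked, nothing to predict
        shared = neighbors[u].intersection(neighbors[v])
        if shared:
            scores[(u, v)] = len(shared)
    return scores


# Hypothetical graph: "me" and "colleague" have no direct link,
# but they share three contacts.
edges = [
    ("me", "ann"), ("me", "bob"), ("me", "cat"),
    ("colleague", "ann"), ("colleague", "bob"), ("colleague", "cat"),
    ("stranger", "dan"), ("dan", "ann"),
]

scores = common_neighbor_scores(edges)
best_pair = max(scores, key=scores.get)
print(best_pair, scores[best_pair])
```

<p>The suggestion comes entirely from the pattern of links: the pair with the most shared contacts is ranked first, and a pair with no shared contacts is never proposed, whatever their public profiles say.</p>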
<p>So, well done LinkedIn, but a cautionary tale for security folks generally, and especially those who believe in anonymization &#8212; it can&#8217;t be done!</p>
<br /> Tagged: <a href='http://skillicorn.wordpress.com/tag/connecting-the-dots/'>connecting the dots</a>, <a href='http://skillicorn.wordpress.com/tag/edge-prediction/'>edge prediction</a>, <a href='http://skillicorn.wordpress.com/tag/link-analysis/'>link analysis</a>, <a href='http://skillicorn.wordpress.com/tag/linkedin/'>LinkedIn</a>, <a href='http://skillicorn.wordpress.com/tag/social-network-analysis/'>social network analysis</a>]]></content:encoded>
			<wfw:commentRss>http://skillicorn.wordpress.com/2011/06/24/the-power-is-in-the-edges/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c65eddaeb6dee8b498863c39fc078c41?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">skillicorn</media:title>
		</media:content>
	</item>
		<item>
		<title>European Intelligence and Security Informatics conference</title>
		<link>http://skillicorn.wordpress.com/2011/06/23/european-intelligence-and-security-informatics-conference/</link>
		<comments>http://skillicorn.wordpress.com/2011/06/23/european-intelligence-and-security-informatics-conference/#comments</comments>
		<pubDate>Thu, 23 Jun 2011 14:15:00 +0000</pubDate>
		<dc:creator>skillicorn</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[intelligence]]></category>
		<category><![CDATA[intertestingness]]></category>
		<category><![CDATA[knowledge discovery]]></category>
		<category><![CDATA[open source]]></category>
		<category><![CDATA[security]]></category>
		<category><![CDATA[terrorism]]></category>

		<guid isPermaLink="false">http://skillicorn.wordpress.com/?p=338</guid>
		<description><![CDATA[The program is now available here and looks impressive (note also the associated Open Source Intelligence workshop in which one of my students has a paper about our work on interestingness). Tagged: intelligence, intertestingness, knowledge discovery, open source, security, terrorism<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=skillicorn.wordpress.com&amp;blog=2947850&amp;post=338&amp;subd=skillicorn&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>The program is now available <a href="http://www.eisic.eu/">here</a> and looks impressive (note also the associated Open Source Intelligence workshop in which one of my students has a paper about our work on interestingness).</p>
<br /> Tagged: <a href='http://skillicorn.wordpress.com/tag/intelligence/'>intelligence</a>, <a href='http://skillicorn.wordpress.com/tag/intertestingness/'>intertestingness</a>, <a href='http://skillicorn.wordpress.com/tag/knowledge-discovery/'>knowledge discovery</a>, <a href='http://skillicorn.wordpress.com/tag/open-source/'>open source</a>, <a href='http://skillicorn.wordpress.com/tag/security/'>security</a>, <a href='http://skillicorn.wordpress.com/tag/terrorism/'>terrorism</a>]]></content:encoded>
			<wfw:commentRss>http://skillicorn.wordpress.com/2011/06/23/european-intelligence-and-security-informatics-conference/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c65eddaeb6dee8b498863c39fc078c41?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">skillicorn</media:title>
		</media:content>
	</item>
		<item>
		<title>What is social media?</title>
		<link>http://skillicorn.wordpress.com/2011/06/22/what-is-social-media/</link>
		<comments>http://skillicorn.wordpress.com/2011/06/22/what-is-social-media/#comments</comments>
		<pubDate>Wed, 22 Jun 2011 14:01:10 +0000</pubDate>
		<dc:creator>skillicorn</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Facebook]]></category>
		<category><![CDATA[social media]]></category>
		<category><![CDATA[social network]]></category>
		<category><![CDATA[social network analysis]]></category>
		<category><![CDATA[Twitter]]></category>

		<guid isPermaLink="false">http://skillicorn.wordpress.com/?p=334</guid>
		<description><![CDATA[I was at a meeting last week whose focus was on social media. It quickly became clear that there were two kinds of interests. One group wanted to build high-level systems that would revolutionize business and government (somehow) leveraging social media; another group were building or wanted to build tools that would provide some kind [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=skillicorn.wordpress.com&amp;blog=2947850&amp;post=334&amp;subd=skillicorn&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>I was at a meeting last week whose focus was on social media. It quickly became clear that there were two kinds of interests. One group wanted to build high-level systems that would revolutionize business and government (somehow) leveraging social media; another group were building or wanted to build tools that would provide some kind of meta-view of social media content and activity.</p>
<p>The topic that was missing from all of the discussion was what social media is, and why it is the way it is; and so I came away feeling that the entire discussion, and quite a lot of work, was dancing on clouds. There seem to be a number of things that &#8220;everybody knows&#8221; about social media, but for which there seems to be little or no evidence. The Arab Spring was driven by social media! Well, maybe, but (a) was it, and how much, and (b) which parts were important and which were irrelevant?</p>
<p>It seems helpful to divide social media into three categories:</p>
<p>1. Media that are essentially public access publishing or public access (micro)blogging. Although sites that provide this kind of functionality are often considered &#8220;social&#8221; there is almost nothing social about them &#8212; yes, the audience for posts can be restricted to a particular group, but that&#8217;s always been true of any publication. There is an interesting question lurking here though: what are the reasons why individuals read such posts? What kind of bond does it imply between the reader and the author? (Cynically, why would I care what even my closest friend had for breakfast?)</p>
<p>2. Media that start as public access publishing, but where the conversation built on an initial post is more important or interesting than the initial post itself &#8212; in other words, there&#8217;s something emergent in the conversation that transcends what any of the participants would have said ab initio. This is a kind of social knowledge or opinion construction, and there are lots of interesting questions about who participates, what their roles are, and how the content and tone are affected by the interactions. This is, of course, not a new phenomenon but what&#8217;s new is the scope and the detail of what&#8217;s recorded, allowing answers to be worked out in ways that were impractical or too expensive before.</p>
<p>3. Media in which explicit relational links are created between one person and another. This is the real heart of social media. Relational links between a pair of people have, of course, always existed, but they could only be constructed in a small number of ways and were (almost always) limited by geography.</p>
<p>The emergent structure of these links is a really interesting artifact that deserves study and from which we will probably learn a lot about what it means to be human in a global society. What does it mean when one person &#8220;friends&#8221; another? This is one question for which simple answers tend to be assumed, but even a brief consideration of A&#8217;s Facebook friends and the rest of A&#8217;s relationships in the real world quickly shows that there&#8217;s a complex connection between the two sets (and it depends heavily on characteristics of A).</p>
<p>One thing that quickly becomes clear when these questions are addressed computationally is that we aren&#8217;t going to get far until relationship links are <em>typed</em>. It&#8217;s fairly easy to look at each relationship and give it a numerical weight that reflects (say) closeness &#8212; but it&#8217;s still true that different kinds of relationships behave differently, and need to be modelled differently to understand them. (Social media sites should also implement this typing &#8212; not every piece of data should flow down every link of A&#8217;s social network.)</p>
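<p>As a small sketch of what typed links might look like in practice: each link carries a kind as well as a weight, and a post travels only along links of the kinds its author allows. The class, the field names, the kinds and the weights are all hypothetical, not taken from any real site.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    source: str
    target: str
    kind: str      # hypothetical relationship type, e.g. "family" or "work"
    weight: float  # closeness on an assumed scale from 0 to 1


def audience(links, author, allowed_kinds):
    """Return the targets of the author's links whose kind is allowed."""
    return sorted(
        link.target
        for link in links
        if link.source == author and link.kind in allowed_kinds
    )


# Invented example network for person A.
links = [
    Link("A", "mum", "family", 0.9),
    Link("A", "boss", "work", 0.4),
    Link("A", "flatmate", "friend", 0.7),
]

print(audience(links, "A", {"family", "friend"}))
```

<p>The weight alone could not make this distinction: a work contact and a friend might be equally close, yet the same post should reach only one of them.</p>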
<p>The fundamental question, in a world where one person can create a visible relationship, is what this means &#8212; for the person creating it, for the person at the other end of the relationship, and for the emergent graph structure that a collection of these individual relationships creates. Good, solid answers to this question would be a foundation on which much more useful applications could be built.</p>
<br /> Tagged: <a href='http://skillicorn.wordpress.com/tag/facebook/'>Facebook</a>, <a href='http://skillicorn.wordpress.com/tag/social-media/'>social media</a>, <a href='http://skillicorn.wordpress.com/tag/social-network/'>social network</a>, <a href='http://skillicorn.wordpress.com/tag/social-network-analysis/'>social network analysis</a>, <a href='http://skillicorn.wordpress.com/tag/twitter/'>Twitter</a>]]></content:encoded>
			<wfw:commentRss>http://skillicorn.wordpress.com/2011/06/22/what-is-social-media/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/c65eddaeb6dee8b498863c39fc078c41?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">skillicorn</media:title>
		</media:content>
	</item>
	</channel>
</rss>
