<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>20bits &#187; facebook</title>
	<atom:link href="http://20bits.com/tag/facebook/feed/" rel="self" type="application/rss+xml" />
	<link>http://20bits.com</link>
	<description>Driven by Data</description>
	<lastBuildDate>Wed, 07 Oct 2009 06:07:48 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Building a Social Network, Island by Island</title>
		<link>http://20bits.com/articles/building-a-social-network-island-by-island/</link>
		<comments>http://20bits.com/articles/building-a-social-network-island-by-island/#comments</comments>
		<pubDate>Mon, 11 May 2009 15:30:36 +0000</pubDate>
		<dc:creator>Jesse</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[acquisition]]></category>
		<category><![CDATA[density]]></category>
		<category><![CDATA[facebook]]></category>
		<category><![CDATA[hi5]]></category>
		<category><![CDATA[myspace]]></category>
		<category><![CDATA[product development]]></category>
		<category><![CDATA[social network analysis]]></category>
		<category><![CDATA[strategy]]></category>
		<category><![CDATA[xfire]]></category>

		<guid isPermaLink="false">http://20bits.com/?p=649</guid>
		<description><![CDATA[
A necessary condition for building a self-sustaining social network is density.  We understand this intuitively.  After all, a network of one person is hardly a &#8220;network&#8221; at all.



Metcalfe&#8217;s Law, which states that the value of a network grows in proportion to the square of the number of users of that network, expresses this [...]]]></description>
			<content:encoded><![CDATA[<p>
A necessary condition for building a self-sustaining social network is density.  We understand this intuitively.  After all, a network of one person is hardly a &#8220;network&#8221; at all.
</p>

<p>
Metcalfe&#8217;s Law, which states that the value of a network grows in proportion to the square of the number of users of that network, expresses this idea formally.  The value of a social network rests in its ability to foster communication, in its connections.
</p>
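
<p>
A rough sketch of where the square comes from, assuming value is proportional to the number of possible pairwise connections (a common reading of Metcalfe&#8217;s Law, not the only one).  The function name is made up for illustration.
</p>

```python
def possible_connections(n):
    """Number of distinct pairs among n users: n * (n - 1) / 2."""
    return n * (n - 1) // 2

# Doubling the user count roughly quadruples the number of possible connections.
for users in (10, 20, 40, 80):
    print(users, possible_connections(users))
```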

<p>	
If you&#8217;re building a social network, whether it&#8217;s a destination website or an application that exists on another social network, density must figure into key strategic decisions.  Let&#8217;s see how.
</p>

<h3>Islands</h3>
<p>
One way to think about social networks is as a network of networks.  On Facebook, for example, if I graph the connections between all my friends I see distinct groups: my high school friends, my college friends, my current circle of friends, and my professional network.
</p>

<p>
Each group is more or less isolated from the others.  Density exists within each of these islands, but not between them.  I&#8217;m fairly certain this is a topological property of any social network.
</p>
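
<p>
One way to make &#8220;density within islands, but not between them&#8221; concrete is to measure the fraction of possible friendships that actually exist.  The sketch below does that for two made-up friend groups; the names, groups, and the <em>density</em> helper are all hypothetical.
</p>

```python
from itertools import combinations

def density(group_a, group_b, edges):
    """Fraction of possible links that actually exist between two groups.

    Pass the same group twice to get the density inside one island.
    """
    if set(group_a) == set(group_b):
        pairs = list(combinations(sorted(group_a), 2))
    else:
        pairs = [(a, b) for a in group_a for b in group_b]
    if not pairs:
        return 0.0
    linked = sum(1 for a, b in pairs if frozenset((a, b)) in edges)
    return linked / len(pairs)

# Hypothetical friend groups and friendships, made up for illustration.
high_school = ["ann", "bob", "cat"]
college = ["dan", "eve", "fay"]
friendships = {
    frozenset(pair)
    for pair in [
        ("ann", "bob"), ("ann", "cat"), ("bob", "cat"),  # high school island
        ("dan", "eve"), ("dan", "fay"), ("eve", "fay"),  # college island
        ("cat", "dan"),                                  # one bridge between islands
    ]
}

print(density(high_school, high_school, friendships))  # every pair is linked
print(density(college, college, friendships))          # every pair is linked
print(density(high_school, college, friendships))      # one link out of nine
```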

<h3>Case Study: Facebook</h3>
<p>
Facebook started at Harvard and was initially college-only.  Their growth strategy was explicit from day one: move from school to school as demand warranted.  I&#8217;ve been told that Sean Parker wouldn&#8217;t consider opening up access to a new school until at least several dozen students from that school requested an account.
</p>

<p>
Each school was an island.  Once Facebook saturated a specific set of colleges it moved on to the next round.  Eventually there was enough anticipatory buzz that they could launch at large state schools without risk of fading away.
</p>

<p>
They still pursue this strategy today.  After establishing a critical density among colleges they opened up access to high schools and then to everyone with an email address.  From there they started moving country-to-country.
</p>

<p>
The countries where Facebook is having the most difficulty gaining traction are the ones with already-established social networks, like Germany with <a href="http://en.wikipedia.org/wiki/StudiVZ">StudiVZ</a>.  In fact, if you look at <a href="http://gawker.com/tech/data-junkie/the-world-map-of-social-networks-273201.php">this old map</a> of the most popular social network in each country, you get an idea of how isolated this country-by-country growth really is.
</p>

<h3>Other Networks</h3>
<p>
Facebook isn&#8217;t the only example of a social network that grew this way.  hi5 has a similar story, starting with smaller markets overseas and spreading from country to country.  Craigslist did the same, starting small in San Francisco and eventually becoming a presence in most major US cities.
</p>

<p>
The MySpace team had a background in direct marketing, which is all about targeting specific offers at the people who are most likely to respond.  They started with the club scene in LA and grew from there.
</p>

<p>
The key to all these strategies was density.  
</p>

<p>
If you&#8217;re launching a new social service, even if your end goal is to have everyone and their mother using it, it&#8217;s important to understand the impact density has on growth.
</p>

<h3>Multiple Dimensions of Density</h3>
<p>
So far the only kind of density we&#8217;ve talked about is network density, i.e., multiple people connected through their shared use of a service.  You could call this &#8220;product density.&#8221;
</p>

<p>
Sometimes product density isn&#8217;t enough.  Take IM, for example, or any network that requires synchronous communication.  Not only do two people have to be using the same product but they have to be using it at the same time.  What good is your friend being on IM if you&#8217;re never awake at the same time?
</p>

<p>
<a href="http://en.wikipedia.org/wiki/Xfire">Xfire</a>, an IM client for gamers, is an example of a product that innovated in this space by tackling a segment of customers who were already interacting synchronously.
</p>

<p>
Mobile social networks take this to an even greater extreme.  To connect with people on Loopt or Google Latitude not only do we have to be using the same product at the same time, but we have to be in the same place!
</p>

<p>
This isn&#8217;t to say building these networks is impossible.  Rather, they come with an extra handicap in the form of reduced density.  Overcoming that problem has to be a key part of the product strategy.
</p>

<h3>Conclusion and Counterexamples?</h3>
<p>
Most product strategy discussions, in my experience, are focused on acquisition or other topline metrics that go &#8220;up and to the right.&#8221;  Instead, if you&#8217;re building social software, I believe density is a necessary condition for long-term success and needs to be a part of the strategy discussion from day one.
</p>

<p>
First, understand the density requirements for your product.  Do customers need to sign up for the same service?  Do they need to be using it at the same time?  Do they need to be in the same place?  Is there anything you can do to lower the density requirement?
</p>

<p>
Second, build a &#8220;depth first&#8221; strategy.  Are there any naturally dense customer segments that might fit your product?  Do you have the ability to target specific demographics or segments for acquisition?  Which ones respond positively and is it possible to build density there?
</p>

<p>
Once you&#8217;ve achieved sufficient density on one island, hop to the next and repeat.
</p>

<p>
And if anyone out there can think of any counterexamples &mdash; social networks or services that got big &#8220;all at once&#8221; &mdash; leave a comment and let me know!  I honestly can&#8217;t think of any.
</p>]]></content:encoded>
			<wfw:commentRss>http://20bits.com/articles/building-a-social-network-island-by-island/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Notification Strategies for Social Networks</title>
		<link>http://20bits.com/articles/notification-strategies-for-social-networks/</link>
		<comments>http://20bits.com/articles/notification-strategies-for-social-networks/#comments</comments>
		<pubDate>Tue, 05 May 2009 16:00:02 +0000</pubDate>
		<dc:creator>Jesse</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[facebook]]></category>
		<category><![CDATA[notifications]]></category>
		<category><![CDATA[retention]]></category>
		<category><![CDATA[social network analysis]]></category>
		<category><![CDATA[statistics]]></category>

		<guid isPermaLink="false">http://20bits.com/?p=599</guid>
		<description><![CDATA[
You&#8217;ve built a social application and launched a new feature.  The number of notifications you can send out is constrained.  Which set of users should you notify to guarantee the most people start using this new feature?



This problem might seem artificial.  Why not put up an ad on your product, or send [...]]]></description>
			<content:encoded><![CDATA[<p>
You&#8217;ve built a social application and launched a new feature.  The number of notifications you can send out is constrained.  Which set of users should you notify to guarantee the most people start using this new feature?
</p>

<p>
This problem might seem artificial.  Why not put up an ad on your product, or send a notification to every single person who might be interested?  There are several reasons the number of people you can notify might be constrained.
</p>
<ol>
<li>There is a technical constraint, e.g., Facebook limits the number of application-to-user notifications at the API level.</li>
<li>There is a financial constraint, e.g., you&#8217;re sending notifications over SMS and every message costs you money.</li>
<li>There is a strategic constraint, e.g., sending notifications too frequently causes fatigue and reduces the effectiveness of future notifications.</li>
</ol>

<p>
So, the situation is not too far fetched.  Let&#8217;s investigate the issue.
</p>

<p>
For the rest of the article the &#8220;application&#8221; is going to be a Facebook application and it can only send 100 application-to-user notifications per week.  Which 100 users should we notify?
</p>

<h3>The Basic Considerations</h3>
<p>
In the <a href="http://20bits.com/articles/behavior-adoption-on-social-networks/">linear cascade model</a> when a user in a social network adopts a new behavior there is a probability that each neighbor in the network will adopt it.
</p>

<p>
Under this model we probably wouldn&#8217;t want to notify two people who are friends, and especially not a cluster of friends or a <a href="http://en.wikipedia.org/wiki/Clique_(graph_theory)">clique</a>.  The new feature wouldn&#8217;t spread very far beyond this group.
</p>

<p>
Likewise, we wouldn&#8217;t want to notify people who are very far apart on the social network because a user is more likely to adopt a behavior if more than one of his friends has also adopted it.  So there is a balancing act between notifying users who are close together, to achieve density, and notifying users who are far apart, to achieve breadth.
</p>

<h3>Heuristics and Centrality Measures</h3>
<p>
The easiest solution is to pick 100 random users to notify, but this is also the most naive since it takes into account neither the structure of the network nor the likelihood that a person will influence their neighbors.
</p>

<p>
A better<sup>1</sup> solution to this problem is to develop a heuristic that ranks every user in the network according to some metric.  If we can only send 100 notifications then they are sent to the first 100 people on this ranked list.
</p>

<p>
The idea here is to use <a href="http://en.wikipedia.org/wiki/Centrality">centrality measures</a> to come up with heuristics.  In graph theory &#8220;centrality&#8221; is a measure of how important an individual node is.
</p>

<p>
The simplest measure is called &#8220;degree centrality&#8221; and is equal to the number of neighbors of a node.  On a social network this is the number of friends of a given user.  So, if we wanted to send out 100 notifications using this heuristic we&#8217;d send them to the 100 users with the most friends.  In effect, this heuristic means convincing celebrities to use the new feature.
</p>
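
<p>
Here&#8217;s a minimal sketch of the degree centrality heuristic, assuming the friendship graph fits in memory as a dictionary mapping each user to a set of friends.  The graph and the function name are hypothetical.
</p>

```python
def top_by_degree(friends, k):
    """Return the k users with the most friends (degree centrality)."""
    # Sort by friend count, highest first; break ties by name for a stable result.
    ranked = sorted(friends, key=lambda user: (-len(friends[user]), user))
    return ranked[:k]

# Hypothetical friendship graph: user -> set of friends.
friends = {
    "ann": {"bob", "cat", "dan"},
    "bob": {"ann"},
    "cat": {"ann", "dan"},
    "dan": {"ann", "cat"},
}

print(top_by_degree(friends, 2))  # ['ann', 'cat']
```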

<p>
There are other, more complex heuristics.  The Wikipedia article linked above has a list of other centrality measures, and I wrote an article about calculating <a href="http://20bits.com/articles/graph-theory-part-iii-facebook/">eigenvector centrality</a>, which is similar to PageRank.  Each of these admits a heuristic which can tell us which users to notify.
</p>
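
<p>
As a rough illustration of a centrality measure that looks beyond immediate friends, the sketch below approximates eigenvector centrality by power iteration.  It assumes an undirected, connected friendship graph stored as a dictionary; adding each user&#8217;s own score on every step is a choice made here to keep the iteration from oscillating, not something taken from the linked article.
</p>

```python
def eigenvector_centrality(friends, iterations=100):
    """Approximate eigenvector centrality by power iteration.

    friends maps each user to the set of users they are friends with.
    Scores are scaled so that the most central user has a score of 1.0.
    """
    score = {user: 1.0 for user in friends}
    for _ in range(iterations):
        # A user's new score is their own score plus the scores of their friends.
        new = {u: score[u] + sum(score[v] for v in friends[u]) for u in friends}
        norm = max(new.values())
        score = {u: value / norm for u, value in new.items()}
    return score

# Hypothetical friendship graph: user -> set of friends.
friends = {
    "ann": {"bob", "cat", "dan"},
    "bob": {"ann"},
    "cat": {"ann", "dan"},
    "dan": {"ann", "cat"},
}

scores = eigenvector_centrality(friends)
for user in sorted(scores, key=scores.get, reverse=True):
    print(user, round(scores[user], 3))
```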

<p>
Of course, which strategy works best is hard to know beforehand, as it varies with respect to both time and the underlying notification.  A/B testing this is difficult because the effects are intentionally dependent.  If anyone has a good solution to this that doesn&#8217;t involve collecting massive amounts of data about user behavior I&#8217;d be interested in hearing it!
</p>

<p>
It should be noted that each of these heuristics only takes into account the underlying structure of the graph and not the probability of &#8220;infection.&#8221;  By including the latter we can come up with a nearly exact model of the optimal subset of users to notify.
</p>

<h3>A Global Solution</h3>
<p>
<a href="http://www.cs.cmu.edu/~bmeeder/">Brendan Meeder</a> at CMU pointed me to a great paper that discusses this very topic, <a href="http://www.cs.cornell.edu/home/kleinber/kdd03-inf.pdf">Maximizing the Spread of Influence through a Social Network</a> by Kempe, et al.
</p>

<p>
Rather than take a localized view of the problem by ranking each node individually, we create a statistical model of how the new feature propagates through the network.
</p>

<p>
First, we start with a finite seed set, A.  In our case A is a set of 100 users.  Say we convert each of these 100 users.
</p>

<p>
In our model if a user <em>u</em> is converted then for each neighbor <em>v</em> there is some probability
</p>
<div class="math">
<img src='http://s.wordpress.com/latex.php?latex=p_%7Bu%2Cv%7D&#038;bg=ffffff&#038;fg=000000&#038;s=2' alt='p_{u,v}' title='p_{u,v}' class='latex' />
</div>
<p>
that v will also be converted.
</p>

<p>
After the process has run its course some set of users has adopted the new feature.  Because adoption is probabilistic the size of this final configuration is a random variable.  Using the notation from Kempe, et al., for a given seed set A the size of the final set of adopters is a random variable denoted by
</p>
<div class="math">
<img src='http://s.wordpress.com/latex.php?latex=%5Cdisplaystyle%7B%5Cvarphi%5Cleft%28A%5Cright%29%7D&#038;bg=ffffff&#038;fg=000000&#038;s=2' alt='\displaystyle{\varphi\left(A\right)}' title='\displaystyle{\varphi\left(A\right)}' class='latex' />
</div>

<p>
Our goal is to pick the set A which maximizes the expected value of this random variable.
</p>

<p>
Formally, we want to find the subset A such that
</p>
<div class="math">
<img src='http://s.wordpress.com/latex.php?latex=%5Cdisplaystyle%7B%5Csigma%5Cleft%28A%5Cright%29%20%3D%20E%5Cleft%5B%5Cvarphi%5Cleft%28A%5Cright%29%5Cright%5D%7D&#038;bg=ffffff&#038;fg=000000&#038;s=2' alt='\displaystyle{\sigma\left(A\right) = E\left[\varphi\left(A\right)\right]}' title='\displaystyle{\sigma\left(A\right) = E\left[\varphi\left(A\right)\right]}' class='latex' />
</div>
<p>
is maximized, where
</p>
<div class="math">
<img src='http://s.wordpress.com/latex.php?latex=%5Cdisplaystyle%7B%5Csigma%5Cleft%28%5Ccdot%5Cright%29%7D&#038;bg=ffffff&#038;fg=000000&#038;s=2' alt='\displaystyle{\sigma\left(\cdot\right)}' title='\displaystyle{\sigma\left(\cdot\right)}' class='latex' />
</div>
<p>
is called the <em>influence function</em>.
</p>
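
<p>
Since the final number of adopters is random, one simple way to get a feel for the influence function is to simulate the process many times and average.  The sketch below does that under the model described above; the probabilities, user names, and function names are invented for illustration and this is not the code used by Kempe, et al.
</p>

```python
import random

def run_cascade(prob, seeds, rng):
    """One run of the cascade; returns the set of converted users.

    prob[u][v] is the chance that u, once converted, also converts neighbor v.
    """
    converted = set(seeds)
    frontier = list(seeds)
    while frontier:
        u = frontier.pop()
        for v, p in prob.get(u, {}).items():
            # Each converted user gets exactly one attempt on each neighbor.
            if v not in converted and p > rng.random():
                converted.add(v)
                frontier.append(v)
    return converted

def estimate_influence(prob, seeds, runs=1000, seed=0):
    """Monte Carlo estimate of the expected number of adopters for a seed set."""
    rng = random.Random(seed)
    total = sum(len(run_cascade(prob, seeds, rng)) for _ in range(runs))
    return total / runs

# Hypothetical transmission probabilities.
prob = {
    "ann": {"bob": 0.5, "cat": 0.2},
    "bob": {"dan": 0.3},
    "cat": {"dan": 0.3},
}

print(estimate_influence(prob, {"ann"}))
```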

<h3>The Algorithm and The Results</h3>
<p>
It turns out that finding the seed set that maximizes the influence function is <a href="http://en.wikipedia.org/wiki/NP-hard">NP-hard</a>, but there is a <a href="http://en.wikipedia.org/wiki/Greedy_algorithm">greedy algorithm</a> which approximates the optimal value to within a provable factor under certain (unrestrictive) conditions.
</p>

<p>
If you want more details read the paper linked above or the related <a href="http://www.cs.cornell.edu/home/kleinber/icalp05-inf.pdf">Influential Nodes in a Diffusion Model for Social Networks</a> by Kempe, et al.
</p>
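
<p>
The greedy idea can be sketched in a few lines: start with an empty seed set and repeatedly add whichever user raises the estimated spread the most.  This is a simplified illustration that estimates spread by simulation, with hypothetical names and probabilities; it is not the authors&#8217; implementation and makes no attempt at efficiency.
</p>

```python
import random

def cascade_size(prob, seeds, rng):
    """Size of the final adopter set for one simulated cascade."""
    converted = set(seeds)
    frontier = list(seeds)
    while frontier:
        u = frontier.pop()
        for v, p in prob.get(u, {}).items():
            if v not in converted and p > rng.random():
                converted.add(v)
                frontier.append(v)
    return len(converted)

def estimated_spread(prob, seeds, runs, rng):
    """Average cascade size over several simulated runs."""
    return sum(cascade_size(prob, seeds, rng) for _ in range(runs)) / runs

def greedy_seed_set(prob, users, k, runs=200, seed=0):
    """Pick up to k seed users, one at a time, by largest estimated spread."""
    rng = random.Random(seed)
    chosen = []
    for _ in range(min(k, len(users))):
        best_user, best_value = None, -1.0
        for user in users:
            if user in chosen:
                continue
            value = estimated_spread(prob, chosen + [user], runs, rng)
            if value > best_value:
                best_user, best_value = user, value
        chosen.append(best_user)
    return chosen

# Hypothetical transmission probabilities between users.
prob = {
    "ann": {"bob": 0.6, "cat": 0.6},
    "dan": {"eve": 0.4},
}
users = ["ann", "bob", "cat", "dan", "eve"]

print(greedy_seed_set(prob, users, 2))
```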

<p>
Using Monte Carlo methods Kempe, et al. simulated the diffusion process using this algorithm versus several of the heuristics I described above.  The results are fairly striking: their algorithm performs at least 18% better than the best-performing heuristic (degree centrality) and 48% better than if the seed set were randomly selected.  I&#8217;ve embedded a graph of their results below.
</p>
<img src="http://20bits.com/wp-content/uploads/2009/05/kempe-graph.png" alt="kempe-graph" title="kempe-graph" width="429" height="334" class="alignnone math size-full wp-image-613" />
<p>
The &#8220;target set&#8221; is the initial seed set of users to notify, and the &#8220;active set&#8221; is the final set of users who actually adopted the new feature or product.  The more users who adopt the feature the better the strategy.
</p>

<h3>Feasibility</h3>
<p>
Kempe&#8217;s algorithm is more feasible than many of the heuristics discussed above, although the best-performing heuristic, degree centrality, is also the easiest to calculate.  He also doesn&#8217;t include eigenvector centrality in his analysis, which I&#8217;d be interested in comparing.
</p>

<p>
The biggest downside to his algorithm is that it requires both full knowledge of the underlying graph and an accounting of all the user-to-user transmission probabilities.  Modeling these probabilities would require a lot of data about users over an extended period of time.
</p>

<p>
Whether the additional 18% is worth the extra computation and data collection depends a lot on specific circumstances, but personally I&#8217;m going to try to implement it in my projects and see how the performance compares first-hand.
</p><ol class="footnotes"><li id="footnote_0_599" class="footnote">&#8220;Better&#8221; according to what?  As we&#8217;ll see, randomly selecting seed users performs worse than all the other heuristics.</li></ol>]]></content:encoded>
			<wfw:commentRss>http://20bits.com/articles/notification-strategies-for-social-networks/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
		<item>
		<title>Why hi5 Might Have an Edge on Facebook</title>
		<link>http://20bits.com/articles/why-hi5-might-have-an-edge-on-facebook/</link>
		<comments>http://20bits.com/articles/why-hi5-might-have-an-edge-on-facebook/#comments</comments>
		<pubDate>Tue, 28 Apr 2009 17:50:26 +0000</pubDate>
		<dc:creator>Jesse</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[facebook]]></category>
		<category><![CDATA[hi5]]></category>
		<category><![CDATA[marketing]]></category>
		<category><![CDATA[myspace]]></category>
		<category><![CDATA[opinion]]></category>
		<category><![CDATA[social-networking]]></category>
		<category><![CDATA[strategy]]></category>
		<category><![CDATA[virtual goods]]></category>

		<guid isPermaLink="false">http://20bits.com/?p=579</guid>
		<description><![CDATA[
Facebook has been trying hard to find a business model.  Their Beacon advertising product is probably the most infamous example.  So far they&#8217;ve been left empty handed and have been forced to look outside the company for money, first from Microsoft and then from foreign investors.



If Facebook wants to be the internet&#8217;s cable [...]]]></description>
			<content:encoded><![CDATA[<p>
Facebook has been trying hard to find a business model.  Their <a href="http://en.wikipedia.org/wiki/Beacon_(Facebook)">Beacon</a> advertising product is probably the most infamous example.  So far they&#8217;ve been left empty handed and have been forced to look outside the company for money, first from Microsoft<sup>1</sup> and then from foreign investors.<sup>2</sup>
</p>

<p>
If Facebook wants to be <a href="http://news.cnet.com/8301-10784_3-9946606-7.html">the internet&#8217;s cable company</a> what are they going to have to do to turn themselves into a <a href="http://www.google.com/finance?q=NASDAQ%3ACMCSA">$40Bn</a> company?
</p>

<h3>Are Virtual Goods the Key?</h3>
<p>
Not all social networks are struggling to find a great business model.  Tencent, a Chinese social networking company, pulled in over $1Bn in revenue last year, primarily through its use of virtual goods.<sup>3</sup>
</p>

<p>
But Facebook doesn&#8217;t need to look overseas to see that virtual goods could work for them.  Most of the top Facebook games use virtual currency to make money, powered by leadgen-based ad networks like <a href="http://offerpal.com">Offerpal</a> and <a href="http://getgambit.com">Gambit</a>.  There are reports that some of these apps are pulling in eight figures per year.<sup>4</sup>
</p>

<p>
And of course there&#8217;s Facebook&#8217;s own gifting service, which has recently moved to a virtual currency system, pricing gifts in &#8220;points&#8221; that can be bought with real money.<sup>5</sup>
</p>

<p>
All of this is to say that it appears that virtual goods are a natural business model for social networks and Facebook has enough data to see that.  Why isn&#8217;t Facebook pursuing this strategy more aggressively?  Why do they seem dead-set on building advertising technologies like Social Ads and Beacon?
</p>

<h3>The US Advertising Crutch</h3>
<p>
In the world of advertising not all countries are equal.  US traffic is generally valued the highest, followed by other English-speaking countries, the <a href="http://en.wikipedia.org/wiki/G20_industrial_nations">G20</a>, and finally the rest of the world.
</p>

<p>
Until recently Facebook was concentrated in the English-speaking world.  It&#8217;s the second largest social network in the US, after MySpace, and the largest in both Canada and the UK.
</p>

<p>
Unlike other social networks which don&#8217;t have a significant presence in the English-speaking world, Facebook can support itself through advertising.  This is a crutch that prevents Facebook from making bold decisions with their business model.  I believe Facebook sees themselves as the next Google, one piece of technology away from <a href="http://www.roughtype.com/archives/2007/11/the_social_graf_1.php">changing the world of advertising</a>.
</p>

<h3>The Demographic Crunch</h3>
<p>
Not all social networks have Facebook&#8217;s demographics, of course. hi5, the world&#8217;s third largest social network after MySpace and Facebook, has an extensive presence throughout Latin America and other countries which advertisers and publishers typically ignore.  The same can be said of the advertising market in China, but recall that Tencent pulled in $1Bn last year through virtual goods.
</p>

<p>
It&#8217;s little wonder, then, that hi5 is aggressively pursuing a virtual goods strategy.<sup>6</sup>  Their demographics make this strategy much more appealing.  Facebook has the money and the audience to waste pursuing a pure-advertising strategy for social networks.
</p>

<p>
What once seemed like a demographic disadvantage might turn out to be a demographic advantage for hi5.  Will they beat Facebook to the business model punch?
</p>
<p>
And a year from now will we be reading articles about how Facebook&#8217;s virtual goods strategy compares to hi5&#8217;s, as opposed to articles about how Facebook&#8217;s new homepage compares to Twitter?
</p>

<h3> You&#8217;re crazy.  You know that, right?</h3>

<p>
Obviously hi5 has an uphill battle.  Facebook is growing on the order of 500,000 new users <em>per day</em> and shows no signs of slowing.  But the same was said of MySpace and Friendster when Facebook launched.  I think we still have a few more twists in the story of social networking on the web, and this is just one possible twist among many.
</p><ol class="footnotes"><li id="footnote_0_579" class="footnote"><a href="http://news.cnet.com/8301-13577_3-9803872-36.html">Microsoft acquires equity stake in Facebook, expands ad partnership</a> (cnet)</li><li id="footnote_1_579" class="footnote"><a href="http://www.businessinsider.com/2008/11/update-on-facebook-s-dubai-fundraising-trip">Update On Facebook&#8217;s Dubai Fundraising Trip</a> (Business Insider)</li><li id="footnote_2_579" class="footnote"><a href="http://venturebeat.com/2009/03/19/the-worlds-most-lucrative-social-network-chinas-tencent-beats-1-billion-revenue-mark/">The world’s most lucrative social network? China’s Tencent beats $1 billion revenue mark.</a> (VentureBeat)</li><li id="footnote_3_579" class="footnote"><a href="http://venturebeat.com/2008/08/25/developer-analytics-facebook-game-mob-wars-making-22000-a-day/">Developer Analytics: Facebook game Mob Wars making $22,000 a day</a> (VentureBeat)</li><li id="footnote_4_579" class="footnote"><a href="http://blog.facebook.com/blog.php?post=36577782130">Gift Shop Credits Have Arrived</a> (Facebook)</li><li id="footnote_5_579" class="footnote"><a href="http://venturebeat.com/2009/01/22/hi5s-virtual-entertainment-plans-could-hit-a-virtual-jackpot/">Hi5’s virtual entertainment plans could hit a virtual jackpot</a> (VentureBeat)</li></ol>]]></content:encoded>
			<wfw:commentRss>http://20bits.com/articles/why-hi5-might-have-an-edge-on-facebook/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Behavior Adoption on Social Networks</title>
		<link>http://20bits.com/articles/behavior-adoption-on-social-networks/</link>
		<comments>http://20bits.com/articles/behavior-adoption-on-social-networks/#comments</comments>
		<pubDate>Fri, 24 Apr 2009 16:20:12 +0000</pubDate>
		<dc:creator>Jesse</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[analysis]]></category>
		<category><![CDATA[facebook]]></category>
		<category><![CDATA[graph-theory]]></category>
		<category><![CDATA[mathematics]]></category>
		<category><![CDATA[obama]]></category>
		<category><![CDATA[social-networking]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[twitter]]></category>
		<category><![CDATA[viral growth]]></category>

		<guid isPermaLink="false">http://20bits.com/?p=548</guid>
		<description><![CDATA[
Why and how do people adopt new behaviors?  Why do they start using new products?  Did you sign up for Facebook because all of your friends were on it, or because a specific friend recommended it to you?  Or do you refuse to sign up at all?



In this article I&#8217;m going to [...]]]></description>
			<content:encoded><![CDATA[<p>
Why and how do people adopt new behaviors?  Why do they start using new products?  Did you sign up for Facebook because all of your friends were on it, or because a specific friend recommended it to you?  Or do you refuse to sign up at all?
</p>

<p>
In this article I&#8217;m going to outline two models that describe how new behaviors, ideas, and messages propagate through social networks.
</p>

<h3>The Threshold Model</h3>
<p>
The first model is called the Threshold Model.<sup>1</sup>  It says that people adopt a new behavior because a sufficiently large proportion of their friends have adopted that behavior.  Early adopters have a very low threshold, say 5% or 10%, while late adopters would have a much higher threshold. Every person, however, has their own individual threshold.
</p>

<p>
For example, my girlfriend&#8217;s stated reason for signing up for Twitter was that &#8220;all my friends were using it.&#8221;  And during the 2008 US Presidential election, some Obama supporters would adopt Hussein as their middle name.<sup>2</sup>  When I saw that lots of my friends were doing it I was certainly tempted to do the same.
</p>

<p>
The underlying psychological principle is one of &#8220;missing out&#8221; or &#8220;when in Rome.&#8221;  The key variable here is the initial distribution of thresholds across a social network, which determines the final extent of the behavior.
</p>

<p>
It&#8217;s worth noting that this model says nothing about how people <em>initially</em> adopt behavior.  That is, it says nothing about innovators, only about the spread of innovation through a social network.
</p>

<h3>The Cascade Model</h3>
<p>
The second model is called the Cascade or Word-of-Mouth Model<sup>3</sup>, and is the method of &#8220;viral growth&#8221; that most <a href="http://20bits.com/articles/social-applications-are-social-networks/">social application developers</a> are familiar with.  It says that every person has a chance of adopting a new behavior whenever one of their neighbors adopts it.
</p>

<p>
This model describes phenomena like product recommendations or user-to-user notifications on Facebook.  The probability that a person adopts the new behavior is the conversion rate for the notification.<sup>4</sup>
</p>

<p>
This probability is a function of both the sender and the recipient, so more influential people are more likely to convince you to adopt a behavior (or purchase a product, or install an application).
</p>

<h3>Practical Implications</h3>
<p>
Both of these models describe facets of real-world interaction on social networks.  My take is that the cascade model is more accurate at the beginning of a social network&#8217;s life, where behavior is spreading through sparse areas, connected by influencers.  Later on, after a critical density has settled in, people start adopting the behavior because everyone else is adopting it and there&#8217;s a social cost to not doing the same.
</p>

<p>
We see this pattern in services like Facebook and MySpace, both of which got their start by harvesting emails and spreading through word-of-mouth (and spam) across a social network.<sup>5</sup> Eventually each network reached a point where a sufficient number of people were familiar with the product and new users adopted it not because their friends recommended it (the cascade model), but because there was a social expectation that they do (the threshold model).
</p>

<p>
Also, with respect to analytics and viral growth, the threshold model is more difficult to track.  In the cascade model we record who sent what to whom and which messages they responded to.  It&#8217;s clear who gets credit for a user&#8217;s conversion.  In the threshold model you have to track passive exposures, and there&#8217;s no clear causal relationship. 
</p>

<p>
If ten of my friends are doing something and I decide to start doing the same thing, who gets credit?  Most analytics packages will show this behavior as a direct visit, with no connection to other users&#8217; behavior, even though there is a viral process underlying it.
</p>

<p>
In short, the threshold model requires a certain level of behavioral density, while the cascade model doesn&#8217;t.  However, we see both models expressed in how people actually adopt new behaviors in social contexts.
</p>

<h3>Formalisms</h3>
<p>
In the threshold model every person <em>u</em> has a threshold
</p>
<div class="math"><img src='http://s.wordpress.com/latex.php?latex=T_u%20%5Cin%20%5B0%2C1%5D&#038;bg=ffffff&#038;fg=000000&#038;s=2' alt='T_u \in [0,1]' title='T_u \in [0,1]' class='latex' /></div>
<p>
and each of their neighbors <em>v</em> is weighted according to
</p>
<div class="math"><img src='http://s.wordpress.com/latex.php?latex=w_%7Bu%2Cv%7D&#038;bg=ffffff&#038;fg=000000&#038;s=2' alt='w_{u,v}' title='w_{u,v}' class='latex' /></div>
<p>
If
</p>
<div class="math"><img src='http://s.wordpress.com/latex.php?latex=%5Cdisplaystyle%7BT_u%20%3C%20%5Csum_%7Bv%20%5Cin%20%5Ctext%7Badopters%7D%7D%20w_%7Bu%2Cv%7D%7D&#038;bg=ffffff&#038;fg=000000&#038;s=2' alt='\displaystyle{T_u &lt; \sum_{v \in \text{adopters}} w_{u,v}}' title='\displaystyle{T_u &lt; \sum_{v \in \text{adopters}} w_{u,v}}' class='latex' /></div>
<p>
then the person <em>u</em> adopts the behavior.
</p>

<p>
The set of thresholds, weights, and initial adopters completely determines the extent of the behavior in the social network.
</p>
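<p>
To make the formalism concrete, here is a minimal sketch of the threshold model in Python.  The function name, the dictionary layout, and the example people are assumptions made for illustration; only the adoption rule (a person adopts once the summed weight of their adopting neighbors exceeds their threshold) comes from the definition above.
</p>

```python
def run_threshold_model(thresholds, weights, seeds):
    """Return the final set of adopters.

    thresholds: {u: T_u} with T_u in [0, 1]
    weights:    {u: {v: w_uv}} weight of neighbor v on person u
    seeds:      the initial adopters
    (All three structures are hypothetical layouts chosen for this sketch.)
    """
    adopters = set(seeds)
    changed = True
    while changed:  # repeat until a full pass adds nobody
        changed = False
        for u, t_u in thresholds.items():
            if u in adopters:
                continue
            # Sum the weights of u's neighbors who have already adopted.
            influence = sum(w for v, w in weights.get(u, {}).items() if v in adopters)
            if influence > t_u:  # same condition as T_u < sum of w_uv
                adopters.add(u)
                changed = True
    return adopters


thresholds = {"ann": 0.3, "bob": 0.5, "cat": 0.9}
weights = {
    "ann": {"seed": 0.4},
    "bob": {"ann": 0.3, "seed": 0.3},
    "cat": {"ann": 0.4, "bob": 0.4},
}
final_adopters = run_threshold_model(thresholds, weights, {"seed"})
```

<p>
There is no randomness anywhere in the sketch, which is the point of the remark above: once the thresholds, weights, and seeds are fixed, the outcome is fixed too.
</p>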

<p>
In the cascade model, for every person <em>u</em> and neighbor <em>v</em> there is a random variable
</p>
<div class="math"><img src='http://s.wordpress.com/latex.php?latex=X_%7Bu%2Cv%7D&#038;bg=ffffff&#038;fg=000000&#038;s=2' alt='X_{u,v}' title='X_{u,v}' class='latex' /></div>
<p>
which describes the likelihood of <em>u</em> adopting the behavior if <em>v</em> has adopted it.
</p>
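<p>
One common way to simulate this is the &#8220;independent cascade&#8221; variant, sketched below in Python.  It assumes each random variable is a single coin flip with a fixed probability, and that every new adopter gets exactly one chance to convert each neighbor.  Those assumptions, along with the function name and data layout, are mine and are not part of the definition above.
</p>

```python
import random


def run_cascade_model(probabilities, seeds, rng=None):
    """Return the set of adopters after the cascade dies out.

    probabilities: {v: {u: p_uv}} chance that adopter v converts neighbor u
    seeds:         the initial adopters
    rng:           optional random.Random instance, for repeatable runs
    (Hypothetical layout chosen for this sketch.)
    """
    rng = rng or random.Random()
    adopters = set(seeds)
    frontier = list(seeds)  # adopters who have not yet tried their neighbors
    while frontier:
        v = frontier.pop()
        for u, p_uv in probabilities.get(v, {}).items():
            # Each (v, u) pair is tried once, when v is taken off the frontier.
            if u not in adopters and p_uv > rng.random():
                adopters.add(u)
                frontier.append(u)
    return adopters


# Probabilities of 1.0 and 0.0 make this particular example deterministic.
probabilities = {"seed": {"ann": 1.0, "bob": 0.0}, "ann": {"cat": 1.0}}
cascade_result = run_cascade_model(probabilities, {"seed"}, random.Random(0))
```

<p>
Unlike the threshold sketch, a single adopting friend is enough here, and with probabilities strictly between 0 and 1 two runs from the same seeds can end differently.
</p>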

<h3>Takeaways</h3>
<p>
I&#8217;ll try to boil all this down into a few practical takeaways.
</p>
<ol>
	<li>The Threshold and Cascade Models describe two mechanisms of behavior adoption in social networks.</li>
	<li>The Threshold Model says that people do something if enough of their friends are doing it.</li>
	<li>The Cascade Model says that people have a chance of doing something if one of their friends is doing it.</li>
	<li>Both models correspond to different real-life adoption patterns.</li>
	<li>The typical &#8220;viral loop&#8221; involves the cascade model, but most successful social networks rely on the mechanics of the threshold model in the long run, i.e., density is important for long-term success.</li>
	<li>The cascade model is a good tool for analyzing acquisition scenarios, but the threshold model is probably more helpful for understanding retention and engagement &mdash; it at least implies that <em>density</em> is a key factor in social network growth, a metric that&#8217;s not often discussed publicly.</li>
</ol> 

<p>
Agree?  Disagree?  Leave a comment, send me an email, or <a href="http://twitter.com/jessefarmer">follow me on Twitter</a>!
</p><ol class="footnotes"><li id="footnote_0_548" class="footnote">See <a href="http://rumordynamics.awardspace.com/phfs/Threshold_Models_of_Collective_Behavior.pdf">Threshold Models of Collective Behavior</a> (1978) by the famous sociologist Mark Granovetter.</li><li id="footnote_1_548" class="footnote">See <a href="http://www.huffingtonpost.com/2008/06/28/obama-supporters-adopting_n_109788.html">Obama Supporters Adopting Middle Name &#8220;Hussein&#8221; As Their Own</a></li><li id="footnote_2_548" class="footnote">See <a href="http://pluto.huji.ac.il/~msgolden/home_page/pdf/TalkofNetworks.pdf">Talk of the Network: A Complex Systems Look at the Underlying Process of Word-of-Mouth</a> (2001) by Goldenberg, Libai, and Muller.</li><li id="footnote_3_548" class="footnote">More accurately, we&#8217;d model the &#8220;probability&#8221; as a random variable whose mean was the conversion rate.</li><li id="footnote_4_548" class="footnote">See <a href="http://www.amazon.com/Stealing-MySpace-Control-Popular-Website/dp/1400066948">Stealing MySpace: The Battle to Control the Most Popular Website in America</a> for details about the MySpace team&#8217;s background in direct marketing.  The ConnectU vs. Facebook court documents, which you can find via Google, tell a similar story for Facebook&#8217;s early years.</li></ol>]]></content:encoded>
			<wfw:commentRss>http://20bits.com/articles/behavior-adoption-on-social-networks/feed/</wfw:commentRss>
		<slash:comments>11</slash:comments>
		</item>
		<item>
		<title>Social Applications are Social Networks</title>
		<link>http://20bits.com/articles/social-applications-are-social-networks/</link>
		<comments>http://20bits.com/articles/social-applications-are-social-networks/#comments</comments>
		<pubDate>Thu, 09 Apr 2009 15:00:51 +0000</pubDate>
		<dc:creator>Jesse</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[analysis]]></category>
		<category><![CDATA[engagement]]></category>
		<category><![CDATA[facebook]]></category>
		<category><![CDATA[mathematics]]></category>
		<category><![CDATA[monetization]]></category>
		<category><![CDATA[retention]]></category>
		<category><![CDATA[social network analysis]]></category>
		<category><![CDATA[social-networking]]></category>
		<category><![CDATA[top friends]]></category>
		<category><![CDATA[virality]]></category>

		<guid isPermaLink="false">http://20bits.com/?p=485</guid>
		<description><![CDATA[
Are all social applications also social networks?  Dave McClure made a passing reference to this a little over a year ago, saying &#8220;RockYou &#038; Slide [are] arguably social networks of their own.&#8221;  I want to make the stronger claim: social applications are always social networks.



It doesn&#8217;t matter how large you are, it doesn&#8217;t [...]]]></description>
			<content:encoded><![CDATA[<p>
Are all social applications also social networks?  Dave McClure made a passing reference to this a little over a year ago, saying &#8220;RockYou &#038; Slide [are] arguably social networks of their own.&#8221;<sup>1</sup>  I want to make the stronger claim: social applications are always social networks.
</p>

<p>
It doesn&#8217;t matter how large you are, it doesn&#8217;t matter what your goals are, and it doesn&#8217;t matter what your product is.  I think if you&#8217;re building a social application then you&#8217;re trying to build a new social network.  As we&#8217;ll see, this has both strategic and technical implications.
</p>

<h3>What is a Social Network?</h3>
<p>
First, if I&#8217;m going to convince you that something is a social network we should understand what a social network is. If you ask a person to name a few social networks, they will probably list services like Facebook, MySpace, and Twitter.  And if an investor tells you they&#8217;re &#8220;not investing in social networks,&#8221; they mean it in this concrete, social-network-as-a-product sense.
</p>

<p>
Others, like Brad Fitzpatrick and Mark Zuckerberg, use the term <em>social graph</em><sup>2</sup> to distinguish between the underlying social relations between people and the services, called social networks, that are built on top of them.
</p>
<p>
But if there&#8217;s one thing I learned from my mathematics education it&#8217;s this: we&#8217;re free to define things however we want so long as they&#8217;re consistent.  Therefore we ought to choose the definition that helps us get our job done.
</p>

<p>
So, here is my first, and most abstract definition: <blockquote>A social network is a collection of people bound together through a specific set of social relations.</blockquote>
</p>

<!-- Let's see if anyone makes me define social relation! -->

<p>
By &#8220;social relation&#8221; I mean a connection between people that permits the exchange of information.  This rules out artificial relations like &#8220;Alex and James are connected if they have the same hair color.&#8221;
</p>

<p>
When I say &#8220;social network&#8221; I always mean the actual collection of people.  Facebook is a social network.  There are actual people engaged with the site, creating relationships, sharing information, and doing all the things they&#8217;d do in &#8220;real life.&#8221;  Or, put another way: a family is a social network, a family tree is not.<sup>3</sup>
</p>



<p>
If you don&#8217;t like the above definition I can give you a functional one which I believe is equivalent. <blockquote>A collection of people is a social network if and only if it is possible for something to spread virally through that collection.</blockquote>
</p>

<p>
In Web 2.0 speak, a &#8220;social network&#8221; is a collection of people over which you can &#8220;go viral.&#8221;  I believe that virality and social networks are fundamentally linked, and that the two definitions above are equivalent.
</p>

<h3>Social Applications are Social Networks</h3>
<p>
Accepting the above definitions, even if for the sake of argument, I don&#8217;t think it&#8217;s too hard to see why social applications are social networks. Let&#8217;s take Slide&#8217;s <a href="http://www.facebook.com/apps/application.php?id=2425101550">Top Friends</a> as an example.  Is Top Friends a social network in its own right?
</p>

<p>
I think it&#8217;s easier to see that Top Friends meets the first definition.  It is certainly a collection of people: the set of Facebook users who have installed the application.  Are those people bound by specific social relations?  Yes, and those relations are distinct from the ones represented in Facebook.  For example, Alex adding James as a top friend is a social signal distinct from Facebook.
</p>

<p>
What about the second definition?  Top Friends doesn&#8217;t have an external API so it&#8217;s impossible to build apps or plugins for Top Friends.<sup>4</sup>  So, what &#8220;goes viral&#8221; over Top Friends?  New features and patterns of usage do.<sup>5</sup>
</p>

<p>
I&#8217;d also argue that the converse is true: social networks are all social applications.  YouTube spread through MySpace, Facebook spread through email, email spread through the real-life &#8220;social graph&#8221;, and PayPal spread through eBay.<sup>6</sup> All social networks are social applications built off of pre-existing social networks.
</p>

<h3>Strategic Implications</h3>
<p>
If Top Friends is a social network in its own right then there are strategic implications for Facebook. <em>Prima facie</em>, Top Friends is competing with Facebook for users&#8217; attention on its own platform.  Before Facebook launched the Platform it was the Eye of Providence, collecting, collating, and analyzing every bit of activity that occurred on its network.
</p>

<p>
After the Platform launched, third parties were able to infect portions of Facebook&#8217;s network.  In some cases, e.g., the Causes application, the relationship was symbiotic.  In others, e.g., Top Friends, the relationship was antagonistic, with Facebook actually shutting down Top Friends at one point.<sup>7</sup>
</p>

<p>
What does Facebook gain by having Top Friends on its Platform?  Nothing substantial, as far as I can tell.  What does it lose?  Control and insight over the activities of its userbase.<sup>8</sup>
</p>

<p>
In effect, Top Friends is a social network bootstrapped off of Facebook, with its own set of communication channels over which Facebook has no authority or insight.  This tension is present everywhere in the Platform because application developers&#8217; interests are not wholly aligned with Facebook&#8217;s and will probably never be.
</p>

<h3>Technical Implications</h3>
<p>
I&#8217;m going to save the technical implications for another article, but it boils down to this: social networks in the sense that I defined above are fairly well understood.  I believe the techniques used on the web today to grow &#8220;viral&#8221; applications are covered by research from fields like social network analysis and epidemiology.
</p>

<p>
Since I believe social applications and social networks are synonymous, we can better understand how these applications grow by understanding how social networks grow.
</p>

<p>
In the meantime, I recommend reading <em><a href="http://www3.interscience.wiley.com/journal/118986267/abstract">The Statistical Evaluation of Social Network Dynamics</a></em> by Tom A. B. Snijders from the University of Groningen if you&#8217;re interested in the technical aspects of social networks and social applications.
</p>

<p>
And please, leave a comment if you have any thoughts about the above!
</p><ol class="footnotes"><li id="footnote_0_485" class="footnote"><a href="http://500hats.typepad.com/500blogs/2007/11/google-open-soc.html">Google Open Social + Friends vs. Facebook Platform</a></li><li id="footnote_1_485" class="footnote">See, e.g., <a href="http://bradfitz.com/social-graph-problem/">Thoughts on the Social Graph</a></li><li id="footnote_2_485" class="footnote"><a href="http://www.artinthepicture.com/artists/Rene_Magritte/pipe.jpeg">Ceci n&#8217;est pas un Social Network</a></li><li id="footnote_3_485" class="footnote">For all I know Slide has an internal Top Friends API that lets them build new services that ride on Top Friends&#8217; success, but that&#8217;s only <a href="http://api.topfriends.com/">speculation</a>.</li><li id="footnote_4_485" class="footnote">This is the essence of <a href="http://startuplessonslearned.blogspot.com/2008/12/engagement-loops-beyond-viral.html">engagement loops</a>.  Eric Ries talks about going &#8220;beyond viral.&#8221;  There is no &#8220;beyond viral.&#8221;  Rather, on social networks viral processes govern the whole stack: acquisition, retention, engagement, and monetization.</li><li id="footnote_5_485" class="footnote">Slide is to Facebook as PayPal was to eBay.  Anyone buy it?</li><li id="footnote_6_485" class="footnote">See <a href="http://www.techcrunch.com/2008/06/26/did-facebook-shut-down-slides-top-friends-how-very-myspace-of-them/">this TechCrunch article</a>.</li><li id="footnote_7_485" class="footnote">There&#8217;s a broader argument that ceding control in this way is the right strategic move, but Facebook is not there yet &mdash; the limit of that argument is something like OpenSocial.</li></ol>]]></content:encoded>
			<wfw:commentRss>http://20bits.com/articles/social-applications-are-social-networks/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>When in Rome: Newcomers on Facebook</title>
		<link>http://20bits.com/articles/when-in-rome-newcomers-on-facebook/</link>
		<comments>http://20bits.com/articles/when-in-rome-newcomers-on-facebook/#comments</comments>
		<pubDate>Mon, 19 Jan 2009 10:23:04 +0000</pubDate>
		<dc:creator>Jesse</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[analysis]]></category>
		<category><![CDATA[data-driven-development]]></category>
		<category><![CDATA[facebook]]></category>
		<category><![CDATA[statistics]]></category>

		<guid isPermaLink="false">http://20bits.com/?p=449</guid>
		<description><![CDATA[<img src="/downloads/Rome_01.jpg" align="left" style="float: left" /><p>"<a href="http://www.thoughtcrumbs.com/publications/paper0778-burke.pdf">Feed Me: Motivating Newcomer Contribution in Social Network Sites</a>" is a recent paper about newcomer behavior on Facebook.</p>
<p>
I give a brief overview and discuss the paper's hypotheses, methodology, and conclusions.  Sneak peek: it turns out newcomers' behavior is strongly influenced by their friends' behavior.  The paper is a great example of applying the scientific method to a web product &#8212; there's lots to learn, so read on!
</p>]]></description>
			<content:encoded><![CDATA[<p>
A <a href="http://zellunit.com">teammate</a> of mine recently sent me a link to a paper called &#8220;<a href="http://www.thoughtcrumbs.com/publications/paper0778-burke.pdf">Feed Me: Motivating Newcomer Contribution in Social Network Sites</a>&#8221; and I thought it was worth discussing.  The paper was jointly authored by <a href="http://www.cs.cmu.edu/~mkburke/">Moira Burke</a>, a PhD student at Carnegie Mellon, and <a href="http://overstated.net/">Cameron Marlow</a> and Thomas Lento, two research scientists at Facebook.
</p>

<h3>The Chicken and the Egg</h3>
<p>
The root question addressed in the paper is this: <em>what motivates newcomers to contribute to social networks?</em>  For social networking sites, getting users to contribute is one of the primary problems, second only to acquiring new users.
</p>

<p>
Let&#8217;s dive right in and look at their hypotheses, methodology, and conclusions.
</p>

<h3>Hypotheses</h3>

<p>
The authors took all users who joined on a random weekday in March 2008 &mdash; amounting to about 140,000 users &mdash; and tried to predict their long-term sharing habits based on the experiences they had in their first two weeks.  Specifically, they looked at how users interacted with photos.
</p>

<p>
The paper outlines four hypotheses:
<ol>
<li>Social learning: Newcomers whose friends share more content will go on to contribute more content themselves.</li>
<li>Singling out: Newcomers who are singled out in content will contribute more content.</li>
<li>Feedback: Newcomers receiving more feedback on their initial content will go on to contribute more content.</li>
<li>Distribution: Newcomers whose initial content is distributed widely will go on to contribute more content.</li>
</ol>
</p>

<h3>Conclusion: When in Rome&#8230;</h3>
<p>
The authors also broke down the newcomers into two categories, early uploaders and non-early uploaders, depending on whether or not they uploaded more than one photo in the first two weeks.
</p>

<p>
The two factors that correlated with long-term photo sharing for early uploaders were whether your friends were also sharing photos in your first two weeks, and whether people commented on your photos.  Surprisingly, &#8220;singling out,&#8221; i.e., getting tagged in photos, had no <a href="http://20bits.com/articles/hypothesis-testing-the-basics/">statistically significant</a> effect.
</p>

<p>
Singling out, however, did work for non-early uploaders, suggesting that people can be cajoled into uploading photos by tagging them, but that people already uploading photos to Facebook won&#8217;t upload any more than they were before.
</p>

<p>
In short, newcomers are susceptible to peer pressure.
</p>

<h3>How is this useful?</h3>
<p>
The upside to this paper is that it gives a clear picture of what is worth measuring.  Getting a user to upload a photo doesn&#8217;t just mean one more photo on the site &mdash; some percentage of their friends will upload a photo, too.
</p>

<p>
What&#8217;s more, you can enter into a sort of feedback loop.  The paper didn&#8217;t address whether &#8220;social learning&#8221; also correlated with increasing auxiliary activities like feedback, but imagine this: more photos uploaded means more comments, which in turn means more photos.  Is it possible to make this cycle self-sustaining?
</p>

<p>
The downside is that this doesn&#8217;t help with the chicken-and-egg problem.  What happens when a user comes to the site and they have no friends?  There are some public spaces on Facebook, but most social networking sites in that vein are dominated by interactions among friends.
</p>

<p>
Overall, this is one of the most detailed papers analyzing data from a huge social network.  Leave a comment and let me know your thoughts, especially if you know any other papers of this kind!
</p>]]></content:encoded>
			<wfw:commentRss>http://20bits.com/articles/when-in-rome-newcomers-on-facebook/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Data Management, Facebook-style</title>
		<link>http://20bits.com/articles/data-management-facebook-style/</link>
		<comments>http://20bits.com/articles/data-management-facebook-style/#comments</comments>
		<pubDate>Mon, 10 Nov 2008 13:00:55 +0000</pubDate>
		<dc:creator>Jesse</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[cassandra]]></category>
		<category><![CDATA[cloudera]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[distributed storage]]></category>
		<category><![CDATA[facebook]]></category>
		<category><![CDATA[hadoop]]></category>
		<category><![CDATA[hive]]></category>

		<guid isPermaLink="false">http://20bits.com/?p=367</guid>
		<description><![CDATA[
Jeff Hammerbacher, the former lead of the Data Team at Facebook and now VP of Product at Cloudera, put up some great slides on the evolution of Facebook&#8217;s data management strategy.



They&#8217;re very interesting from many perspectives, so take a look and then stay tuned for my two cents.






Growing With Data

Jeff was at Facebook for about [...]]]></description>
			<content:encoded><![CDATA[<p>
<a href="http://jeffhammerbacher.com/">Jeff Hammerbacher</a>, the former lead of the Data Team at Facebook and now VP of Product at <a href="http://www.cloudera.com/">Cloudera</a>, put up some great slides on the evolution of Facebook&#8217;s <a href="http://www.cloudera.com/blog/2008/10/24/thrift-scribe-hive-and-cassandra-open-source-data-management-software/">data management strategy</a>.
</p>

<p>
They&#8217;re very interesting from many perspectives, so take a look and then stay tuned for my two cents.
</p>

<p>
<div class="math" id="__ss_689126"><object style="margin:0px" width="425" height="355"><param name="movie" value="http://static.slideshare.net/swf/ssplayer2.swf?doc=20081022cca-1224867567253598-9&#038;stripped_title=20081022cca-presentation" /><param name="allowFullScreen" value="true"/><param name="allowScriptAccess" value="always"/><embed src="http://static.slideshare.net/swf/ssplayer2.swf?doc=20081022cca-1224867567253598-9&#038;stripped_title=20081022cca-presentation" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="355"></embed></object></div>
</p>

<h3>Growing With Data</h3>
<p>
Jeff was at Facebook for about two and a half years and saw Facebook grow from a company dealing with gigabytes of data per day to a company dealing with terabytes of data per day.  It was his job to guide the process of making sense of this pile of semi-structured data.
</p>

<p>
The technical aspects are interesting, but what&#8217;s more interesting to me is the story.  A good title for the presentation might be &#8220;Growing With Data.&#8221;
</p>

<h3>The Three Stages</h3>
<p>
As I said, the most interesting part to me was how Facebook&#8217;s data initiatives evolved over time to meet their growing needs.
</p>

<p>
At first they did what everyone does &mdash; periodic offline batch processing.  But we all know this doesn&#8217;t scale forever, especially if your data is growing at an exponential rate.
</p>

<p>
Eventually you wind up in a situation where you produce more data in an hour than you can process in an hour.  You can try to scale vertically, getting more bandwidth, more processing power, faster disks, etc., but the exponential nature of the situation will win in the end.
</p>

<p>
Once the ad hoc ETL system no longer met their needs they built a system for distributed logging.  Unfortunately it didn&#8217;t provide the flexibility they needed.  Analysts couldn&#8217;t run SQL and maintaining the system was difficult.
</p>

<p>
Eventually they hit upon <a href="http://hadoop.apache.org/core/">Hadoop</a>, an open source implementation of Google&#8217;s MapReduce.  They built <a href="http://wiki.apache.org/hadoop/Hive">Hive</a>, a system for querying datasets stored in Hadoop files.  This means you get the scalability of Hadoop and the flexibility of a SQL-like language.  It&#8217;s very slick.
</p>
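<p>
For readers who haven&#8217;t seen the pattern, here is a toy, single-machine sketch in Python of the map and reduce steps that Hadoop distributes across a cluster.  It does not use the Hadoop or Hive APIs, and the log format (&#8220;user action&#8221; per line) is a made-up assumption.  In a SQL-like language the same question would read roughly as a count of rows grouped by action.
</p>

```python
from collections import defaultdict


def map_phase(log_lines):
    """Emit an (action, 1) pair for every log line (hypothetical 'user action' format)."""
    for line in log_lines:
        _user, action = line.split()
        yield action, 1


def reduce_phase(pairs):
    """Sum the emitted values for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


logs = ["alice upload", "bob comment", "alice comment"]
action_counts = reduce_phase(map_phase(logs))
```

<p>
The appeal of a layer like Hive is that an analyst writes only the one-line query, and the system takes care of producing and running the map and reduce steps.
</p>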

<p>
They also built <a href="http://code.google.com/p/the-cassandra-project/">Cassandra</a>, which provides a <a href="http://en.wikipedia.org/wiki/BigTable">BigTable</a>-like system for storing massive amounts of structured data.
</p>

<h3>Evolution, not Revolution</h3>
<p>
As I said, I like the story.  They didn&#8217;t start by building these complex tools; rather, the tools evolved to fit a growing need within the company.  Beyond that I like that their approach to Hive was so customer-centric.  The analysts wanted SQL so they built a SQL-like language on top of their fancy distributed technology.  Very cool.
</p>

<p>
There&#8217;s a lot more where that came from over at the <a href="http://www.cloudera.com/blog/">Cloudera blog</a>, so check it out.  The future is data.
</p>]]></content:encoded>
			<wfw:commentRss>http://20bits.com/articles/data-management-facebook-style/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Facebook Users Just Want Entertainment</title>
		<link>http://20bits.com/articles/facebook-users-just-want-entertainment/</link>
		<comments>http://20bits.com/articles/facebook-users-just-want-entertainment/#comments</comments>
		<pubDate>Mon, 19 May 2008 22:40:32 +0000</pubDate>
		<dc:creator>Jesse</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[facebook]]></category>
		<category><![CDATA[marketing]]></category>
		<category><![CDATA[myspace]]></category>
		<category><![CDATA[opinion]]></category>
		<category><![CDATA[pages]]></category>

		<guid isPermaLink="false">http://20bits.com/?p=139</guid>
		<description><![CDATA[
Starting in late 2007 Facebook began instituting its media strategy in earnest with Facebook Pages.  According to Facebook, Pages offer &#8220;a unique experience where users can become more deeply connected with your business or brand.&#8221;



It&#8217;s good to see Facebook becoming conscious about how they can help shape the future of branding, since this is [...]]]></description>
			<content:encoded><![CDATA[<p>
Starting in late 2007 Facebook began instituting its media strategy in earnest with <a href="http://blog.facebook.com/blog.php?post=6972252130">Facebook Pages</a>.  <a href="http://www.facebook.com/business/?pages">According to Facebook</a>, Pages offer &#8220;a unique experience where users can become more deeply connected with your business or brand.&#8221;
</p>

<p>
It&#8217;s good to see Facebook becoming conscious about how they can help shape the future of branding, since this is where the real money is for social networks.  Let&#8217;s see how Facebook Pages has evolved since last November.
</p>

<h3>Factoids</h3>
<p>
I went into this project without any pre-conceptions of what I would find.  I never really used Facebook pages and wasn&#8217;t sure if there were any definite conclusions to be found in the data.  At best I thought the numbers might be useful for third parties.  Here are some interesting facts:
</p>

<ul>
	<li>As of May 18th, 2008 there are <strong>190,365</strong> pages and <strong>50,800,399</strong> fans across all pages.</li>
	<li><strong>One third</strong> of all pages are dedicated to <strong>musicians</strong>, but this category represents <strong>37%</strong> of all fans.</li>
	<li>After musicians, the category with the most fans is <strong>TV Shows</strong>, though it has <strong>3.8 times</strong> fewer fans than musicians.</li>
	<li><strong>7.9%</strong> of pages are in the &#8220;other business&#8221; category, the largest business category, but it accounts for only <strong>3.6%</strong> of fans.</li>
</ul>

<p>
<strong>NOTE</strong>: Facebook changes the copy from &#8220;fans&#8221; to something type-specific.  For example, politicians don&#8217;t have &#8220;fans&#8221; they have &#8220;supporters.&#8221;  I&#8217;m going to use fan in the general sense.
</p>

<p>
It turns out that sports, entertainment, and politics are the three broad categories that perform the best on Facebook, as we&#8217;ll see below.
</p>

<h3>Trends</h3>
<p>
There are two ways to measure the size of a category: one, by the number of pages in that category; two, by the number of fans in that category.  Let&#8217;s look at both.
</p>

<a href='http://20bits.com/wp-content/uploads/2008/05/pct-pages.png'><img src="http://20bits.com/wp-content/uploads/2008/05/pct-pages.png" alt="" title="pct-pages" width="274" height="300" class="math size-medium wp-image-141" /></a>

<p>
The graph includes the ten largest categories by the number of pages in each, with the remaining 55 categories grouped into one.  The interesting thing is that the graph is divided almost evenly into thirds, consisting of one single category (musicians), the next nine most common categories, and the remaining 55 categories.
</p>

<p>
This doesn&#8217;t really tell us how Facebook Pages are performing by category, only how people are investing in Facebook Pages.  Let&#8217;s look at the users&#8217; side of things.
</p>

<a href='http://20bits.com/wp-content/uploads/2008/05/pct-fans.png'><img src="http://20bits.com/wp-content/uploads/2008/05/pct-fans.png" alt="" title="pct-fans" width="293" height="300" class="math size-medium wp-image-140" /></a>
<p>
There are two things worth noting in this graph.  One, the categories make a qualitative shift towards entertainment and politics and away from general businesses.  Two, the graph becomes even more lop-sided, with musicians taking up almost 40% of the graph and &#8220;other&#8221; falling to around 20%.
</p>


<h3>Usage</h3>
<p>
So, here&#8217;s a question: what categories fare best?  Let&#8217;s look at the 100 largest pages by number of fans and see how they break down by category.
</p>

<a href='http://20bits.com/wp-content/uploads/2008/05/top100.png'><img src="http://20bits.com/wp-content/uploads/2008/05/top100.png" alt="" title="top100" width="300" height="219" class="math size-medium wp-image-142" /></a>

<p>
The difference here is even more stark.  <strong>48%</strong> of the top 100 pages are musician pages and <strong>17%</strong> are for TV shows.  No other category has more than 10% of the fans.
</p>

<p>
Let&#8217;s take a look at how pages are paying off by taking the difference between the percentage of fans in a category and the percentage of pages.  All else being equal we&#8217;d expect pages in different categories to have a similar &#8220;return on investment.&#8221;  Anything beyond that can only be explained by how Facebook users interact with pages.
</p>

<p>
That is, if 10% of all pages are in a category but 15% of all fans are in that same category, we say that category has a 5% &#8220;ROI.&#8221;  This metric allows us to see which categories are most likely to pay off.
</p>

<center>
<table class="monthly-data">
	<tr class="top">
		<th colspan="4">Facebook Page Categories by ROI</th>
	</tr>
	<tr class="odd">
		<th style="width: 15ex;">Category</th>
		<th style="width: 9ex;" class="date">% of Pages</th>
		<th style="width: 9ex;" class="date">% of Fans</th>
		<th style="width: 8ex;" class="date">ROI</th>
	</tr>
	<tr><td class="statistic">TV Show                                   </td><td>1.20%</td><td>9.65%</td><td>8.45%</td></tr>
	<tr class="odd"><td class="statistic">Film                                      </td><td>1.44%</td><td>5.75%</td><td>4.31%</td></tr>
	<tr><td class="statistic">Musician                                  </td><td>34.11%</td><td>37.03%</td><td>2.93%</td></tr>
	<tr class="odd"><td class="statistic">Politician                                </td><td>1.85%</td><td>4.43%</td><td>2.57%</td></tr>
	<tr><td class="statistic">Actor                                     </td><td>1.48%</td><td>3.20%</td><td>1.72%</td></tr>
	<tr class="odd"><td class="statistic">Comedian                                  </td><td>0.94%</td><td>2.02%</td><td>1.09%</td></tr>
	<tr><td class="statistic">Game                                      </td><td>0.62%</td><td>1.41%</td><td>0.78%</td></tr>
	<tr class="odd"><td class="statistic">Food and Beverage                         </td><td>0.91%</td><td>1.67%</td><td>0.77%</td></tr>
	<tr><td class="statistic">Athlete                                   </td><td>0.92%</td><td>1.65%</td><td>0.72%</td></tr>
	<tr class="odd"><td class="statistic">Sports Team                               </td><td>1.37%</td><td>1.96%</td><td>0.59%</td></tr>
	<tr><td class="statistic">Sports / Athletics                        </td><td>1.16%</td><td>1.72%</td><td>0.57%</td></tr>
</table>
</center>
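<p>
As a quick check on the metric, here is a small Python sketch that recomputes the &#8220;ROI&#8221; column from the two percentage columns for the first four rows of the table.  The function and variable names are mine.  A few results differ from the table by 0.01, presumably because the table was computed from unrounded figures; that explanation is an assumption.
</p>

```python
def roi(pct_pages, pct_fans):
    """'ROI' as defined above: share of fans minus share of pages, in percentage points."""
    return round(pct_fans - pct_pages, 2)


# (category, % of pages, % of fans), copied from the table above.
rows = [
    ("TV Show", 1.20, 9.65),
    ("Film", 1.44, 5.75),
    ("Musician", 34.11, 37.03),
    ("Politician", 1.85, 4.43),
]

# Rank categories from highest to lowest ROI.
ranked = sorted(rows, key=lambda row: roi(row[1], row[2]), reverse=True)
ranked_names = [name for name, _pages, _fans in ranked]
```

<p>
Note that the metric is a difference in percentage points, not a ratio, so a small category like TV Show can outrank musicians even though musicians have far more fans in absolute terms.
</p>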

<p>
The most striking thing, for me, is how conventional these categories are: entertainment, politics, and sports.
</p>

<h3>Conclusions</h3>
<p>
Facebook&#8217;s future rests in the branding of traditional media verticals.  They have captured a powerful demographic and have nearly perfected a distribution mechanism.  They deny they&#8217;re a media company even as their VP of Product Marketing <a href="http://www.news.com/8301-10784_3-9946606-7.html">says</a> they&#8217;re the &#8220;net&#8217;s cable company.&#8221;
</p>

<p>
Entertainment and sports pages perform above expectations, while generic &#8220;business&#8221; pages perform below expectations.  The popularity of the political categories is to be expected since Facebook&#8217;s largest userbase is in the US and 2008 is a Presidential election year.
</p>

<p>
MySpace seems to understand that social networking and entertainment go hand in hand.  It&#8217;s about time Facebook embraced the same idea; their users already have.
</p>

<p>
<div class="download">Download <a href="http://20bits.com/downloads/facebook-pages-data.xls">the full dataset</a> as a Microsoft Excel spreadsheet.</div>
</p>]]></content:encoded>
			<wfw:commentRss>http://20bits.com/articles/facebook-users-just-want-entertainment/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Facebook Bans Google Friend Connect</title>
		<link>http://20bits.com/articles/facebook-bans-google-friend-connect/</link>
		<comments>http://20bits.com/articles/facebook-bans-google-friend-connect/#comments</comments>
		<pubDate>Thu, 15 May 2008 19:27:16 +0000</pubDate>
		<dc:creator>Jesse</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[facebook]]></category>
		<category><![CDATA[friend connect]]></category>
		<category><![CDATA[google]]></category>
		<category><![CDATA[opinion]]></category>
		<category><![CDATA[platform]]></category>

		<guid isPermaLink="false">http://20bits.com/?p=137</guid>
		<description><![CDATA[
Facebook announced today on their official developers&#8217; blog that they have banned Google Friend Connect, citing privacy concerns.



Google Friend Connect is a service that allows users to share their social data, such as personal information and friends, with websites that embed the Google-created widgets.  This data can come from many social networks, including Facebook, [...]]]></description>
			<content:encoded><![CDATA[<p>
Facebook announced today on their official developers&#8217; blog that they have <a href="http://developers.facebook.com/news.php?blog=1&#038;story=111">banned Google Friend Connect</a>, citing privacy concerns.
</p>

<p>
<a href="http://www.google.com/friendconnect/">Google Friend Connect</a> is a service that allows users to share their social data, such as personal information and friends, with websites that embed the Google-created widgets.  This data can come from many social networks, including Facebook, Hi5, Orkut, and Google Talk.
</p>

<p>
The key section is the second-to-last paragraph: <blockquote>Now that Google has launched Friend Connect, we’ve had a chance to evaluate the technology. We’ve found that it redistributes user information from Facebook to other developers without users’ knowledge, which doesn’t respect the privacy standards our users have come to expect and is a violation of our Terms of Service. Just as we’ve been forced to do for other applications that redistribute data in a way users might not expect or understand, we’ve had to suspend Friend Connect’s access to Facebook user information until it comes into compliance.</blockquote>
</p>

<p>
They claim that they have &#8220;reached out to Google several times about this issue,&#8221; but do not state what conversations, if any, took place.  Nor do they spell out exactly how Google Friend Connect violates the Terms of Service.
</p>

<p>
Facebook announced on May 9th, 2008 that they will be launching their own competitor to Google Friend Connect, <a href="http://developers.facebook.com/news.php?blog=1&#038;story=108">Facebook Connect</a>.  Both Google Friend Connect and Facebook Connect came on the heels of MySpace&#8217;s May 8th announcement of their <a href="http://www.news.com/8301-13577_3-9939286-36.html">Data Availability</a> project.
</p>

<h3>Analysis</h3>
<p>
First, it&#8217;s exciting to see competition in the data portability space.  What seemed like a fantasy just a year ago is now an inexorable trend: data will flow freely across all social networks.  Or, as <a href="http://blogs.forrester.com/charleneli/2008/03/the-future-of-s.html">Charlene Li said</a>, &#8220;Social networks will be like air.&#8221;
</p>

<p>
Google, MySpace, Yahoo!, and Facebook all have huge stakes in this game, each controlling a slice of the social networking pie.  Facebook and MySpace have &#8220;social networks&#8221; in their own right, but don&#8217;t forget that friend data can come from services like email and IM, too.
</p>


<p>
Second, Facebook is walking a fine, legalistic line.  They don&#8217;t claim they have a problem with Google Friend Connect taking data from Facebook.  Rather, their problem is that Google Friend Connect supposedly then shares this data with third parties.  Of course, the blog post announcing all this is rather opaque and gives no specifics.
</p>

<p>
But does anyone sincerely believe this isn&#8217;t just Facebook pressing its competitive advantages?  They&#8217;re about to launch their own version of Friend Connect, and crippling a competitor in anticipation is a play right out of the Microsoft platform handbook.
</p>

<p>
I think the folks at Facebook are just upset because Google, for once, got the drop on them.  The only way they know how to respond is with muscle rather than grace.
</p>

<p>
Facebook is a business, so I understand it has to operate out of self-interest, but I hope they&#8217;re not so self-deluded as to believe this move was motivated by privacy concerns.  The original launch of Facebook Beacon is enough to show that Facebook doesn&#8217;t have privacy on the mind all the time.
</p>

<p>
On a more general level, Facebook likes to play the world domination game, as <a href="http://discussionleader.hbsp.com/haque/2008/05/http20bitscom20080506thestateo.html">Umair Haque</a> has pointed out countless times.  Using privacy as a front, Facebook acts the paternalist.
</p>

<p>
Does Facebook know best?  Are they the best arbiters of my privacy? Thanks, Facebook, but no thanks.  I should be able to do with my data as I please.
</p>

<p>
<strong>Update</strong>: <a href="http://www.techcrunch.com/2008/05/15/he-said-she-said-in-google-v-facebook/#comment-2299812">TechCrunch</a> has more, including a follow-up from both Google and Facebook.
</p>

<p>
<strong>Update 2</strong>: John Furrier has an <a href="http://furrier.org/2008/05/15/facebook-just-pulled-a-netscape-hey-facebook-what-are-you-thinking/">interesting post</a> where he compares Facebook&#8217;s strategy to Netscape rather than Microsoft.
</p>
]]></content:encoded>
			<wfw:commentRss>http://20bits.com/articles/facebook-bans-google-friend-connect/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>The State of the Platform: Update</title>
		<link>http://20bits.com/articles/the-state-of-the-platform-update/</link>
		<comments>http://20bits.com/articles/the-state-of-the-platform-update/#comments</comments>
		<pubDate>Wed, 07 May 2008 20:12:44 +0000</pubDate>
		<dc:creator>Jesse</dc:creator>
				<category><![CDATA[Articles]]></category>
		<category><![CDATA[facebook]]></category>
		<category><![CDATA[opinion]]></category>
		<category><![CDATA[platform]]></category>

		<guid isPermaLink="false">http://20bits.com/?p=129</guid>
		<description><![CDATA[
My article about The State of the Facebook Platform has been spreading through the blogosphere like a game of telephone.  Lots of people have chimed in with their own opinions.



I wanted to write a follow-up post to clarify my opinion and address some of the responses.


What I&#8217;m Claiming

My claims are simple and uncontroversial.  [...]]]></description>
			<content:encoded><![CDATA[<p>
My article about <a href="http://20bits.com/2008/05/06/the-state-of-the-facebook-platform/">The State of the Facebook Platform</a> has been spreading through the blogosphere like a game of telephone.  <a href="http://andrewchen.typepad.com/andrew_chens_blog/2008/05/has-the-faceboo.html#comments">Lots</a> <a href="http://www.sarahlacy.com/sarahlacy/2008/05/facebook-platfo.html">of</a> <a href="http://blog.playfish.com/2008/05/07/facebooks-stricter-app-regulations-are-a-good-thing/">people</a> <a href="http://twitter.com/Scobleizer/statuses/805095480">have</a> <a href="http://venturebeat.com/2008/05/06/facebooks-platform-issues-fewer-developers-slower-app-growth/">chimed</a> in with their own opinions.
</p>

<p>
I wanted to write a follow-up post to clarify my opinion and address some of the responses.
</p>

<h3>What I&#8217;m Claiming</h3>
<p>
My claims are simple and uncontroversial.  I observed two things: one, the activity level in the Facebook forums is a fraction of what it was four months ago; two, Facebook apps launched today are much less likely to succeed.
</p>

<p>
The trends for these two observations are highly correlated and exhibit the same peak around February 2nd, 2008.  What happened around that time?  One, Facebook began instituting increasingly demanding and arbitrary developer policies.  Two, other networks began launching fully-featured competitors to Facebook&#8217;s platform.
</p>

<p>
From the high correlation, the timing of events, and comments from people working in the industry, I concluded that developers are less interested in Facebook today because there&#8217;s less return on their investment of labor.
</p>

<h3>What I&#8217;m NOT Claiming</h3>
<p>
I&#8217;m not claiming that the Facebook Platform is unhealthy.  Nor am I claiming that it was a bad idea for Facebook to implement the policy changes they did.
</p>

<p>
I&#8217;m certainly not claiming that any of the data implies either of the above.  Indeed, it&#8217;s still possible to find <a href="http://adonomics.com/about/10726707410">success</a> on the Facebook Platform.  It just requires more effort than it used to.
</p>

<p>
Also, most emphatically, <em>I&#8217;m not talking about Facebook users</em>.  The article was only about developers and their decision to create software for Facebook, not about Facebook as a whole, which is still seeing phenomenal success.
</p>

<h3>Other Hypotheses</h3>
<p>
The most common alternative hypothesis for these trends was summarized by <a href="http://blog.jeffreymcmanus.com/">Jeffrey McManus</a>: <blockquote>This is not a terrific metric for developer activity — it doesn’t measure what you purport to measure. Developers generally view and post to forums when they have problems; if fewer developers are posting to the forums, it may mean that there are more developers who are having less trouble.</blockquote>
</p>

<p>
I call this the &#8220;documentation hypothesis&#8221; and addressed it briefly in my original article.  I think it&#8217;s an unappealing explanation for a few reasons.
</p>

<p>
First, if it were true, we&#8217;d expect to see spikes in forum activity whenever a new issue arose on the Platform, especially since Facebook&#8217;s changes tend to be radical and out of nowhere.  The decline in activity is virtually monotonic, however, and the data shows no such spikes.
</p>

<p>
Second, even if it were true, it doesn&#8217;t explain the correlation between forum activity and application success, nor does it explain the sudden decline beginning around February 2nd.  As an explanation it just isn&#8217;t sufficient.
</p>

<h3>Is the Trend Good or Bad?</h3>
<p>
I understand the tone of the article was bearish, but I was writing it from the perspective of a developer deciding whether or not to commit to the Facebook Platform.  There are lots of perspectives, though.
</p>

<dl>
	<dt>Facebook&#8217;s Perspective</dt>
	<dd>
	<p>
	I believe Facebook is making these changes intentionally.  They have a love-hate relationship with companies like Slide.  Strategically speaking, these companies got in at the very beginning and quickly cordoned off sections of the social graph for themselves, largely out of Facebook&#8217;s reach.  Messages on FunWall don&#8217;t go through Facebook, for example.
	</p>
	<p>
	This is clearly not in Facebook&#8217;s strategic interest, but they can&#8217;t just boot these companies out because a significant number of Facebook users would throw a fit.  From Facebook&#8217;s perspective these trends in developer engagement are good because they allow Facebook to reassert control and improve its image as the &#8220;high-quality social network.&#8221;
	</p>
	</dd>
	<dt>Facebook Users&#8217; Perspective</dt>
	<dd>
	<p>
	Let&#8217;s face it, most Facebook users don&#8217;t like to be pestered by applications.  For them these changes are good.  And judging by Facebook&#8217;s <a href="http://www.alexa.com/data/details/traffic_details/facebook.com?site0=myspace.com&#038;site1=facebook.com&#038;y=r&#038;z=3&#038;h=300&#038;w=610&#038;c=1&#038;u%5B%5D=myspace.com&#038;u%5B%5D=facebook.com&#038;x=2008-05-07T20%3A35%3A12.000Z&#038;check=www.alexa.com&#038;signature=n49bq4%2B6Z5asVqN59LzvZZubXw8%3D&#038;range=max&#038;size=Medium">traffic stats</a> it isn&#8217;t hurting them one bit.
	</p>
	</dd>
	<dt>Advertisers&#8217; Perspective</dt>
	<dd>
	<p>
	For advertisers these developments are universally good.  If the bar for application development is higher it means the applications that succeed will be of a higher quality.  Nobody wants to advertise on &#8220;What color barf are you?&#8221; and Facebook doesn&#8217;t want that application to be front-and-center, either.  It just looks bad.
	</p>
	</dd>
	<dt>Developers&#8217; Perspective</dt>
	<dd>
	<p>
	For developers this is a mixed bag.  Facebook&#8217;s cavalier attitude about platform policy means that you&#8217;re playing on shifting ground.  On top of that, the changes they&#8217;ve already made mean it&#8217;s harder for applications to succeed, on average.
	</p>
	<p>
	Still, for companies like <a href="http://www.socialgn.com/">SGN</a> and <a href="http://www.playfish.com/">PlayFish</a>, who want to make quality applications, this means that they don&#8217;t have to worry about competing with win-at-all-cost, spammy applications.
	</p>
	<p>
	I just wouldn&#8217;t recommend developing <em>only</em> on Facebook, as they&#8217;ve shown they&#8217;re willing to change and bend the rules on a whim and for their own benefit.  You know that as soon as Facebook decides they don&#8217;t like what you&#8217;re doing they&#8217;ll do everything in their power to hinder you.  Hedge your bets.
	</p>
	</dd>
</dl>

<h3>Hype Cycle</h3>
<p>
Don&#8217;t forget about the <a href="http://en.wikipedia.org/wiki/Hype_cycle">hype cycle</a>, either.  All technologies go through a phase of inflated expectations followed by a trough of disillusionment.
</p>


<p>
I&#8217;d say we&#8217;re right in the middle of the trough of disillusionment.  Companies like Zynga, SocialMedia, Slide, RockYou, and SGN are going to slog up the slope of enlightenment.
</p>

<p>
Will we have a social operating system or a revolutionary social commerce system waiting at the end?  Probably not.  Will we have innovative casual gaming platforms?  I&#8217;d take that bet.
</p>

<p>
I&#8217;m interested in hearing other perspectives, too, particularly investors&#8217; perspectives.  Does anyone have any insight on that?
</p>

<p>
<strong>Update:</strong> <a href="http://runningwithfoxes.com/2008/05/07/facebook-platform-thinning-of-the-herd/">Nick Gonzalez</a>, formerly of TechCrunch and now of SocialMedia, makes a similar point about the hype cycle.   I also like the Darwinian nature of his post&#8217;s title: &#8220;The Thinning of the Herd.&#8221;  Heh.
</p>]]></content:encoded>
			<wfw:commentRss>http://20bits.com/articles/the-state-of-the-platform-update/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
	</channel>
</rss>
