This is a supplemental blog for a course which will cover how the social, technological, and natural worlds are connected, and how the study of networks sheds light on these connections.


Being Popular (for Programming Languages)

http://www.paulgraham.com/popular.html

As our notes state, popularity is the idea that while most things are known to an immediate circle, a few rise above and become widely accepted and/or admired by a more global community. Being a popular programming language is not different. Paul Graham writes a whole essay on what makes a programming language popular, but I will only be discussing the first two points: The Mechanics of Popularity and External Factors.

The Mechanics of Popularity discusses how a programming language must be accepted by a group of “expert hackers”, or, as Malcolm Gladwell would call them, “innovators” or possibly just “early adopters” (197). For most epidemics, these are the kinds of people who are more adventurous and willing to try something new. For programming languages, the experts “can tell a good language when they see one, and they’ll use it”. Graham notes that often these early adopters “tell other programmers what language to use,” creating the early majority that will hopefully propel the language into popularity. Obviously, one can see how network effects play a major role in programming languages. The more people and applications that use a language, the more popular it will become and hopefully stay.

The second point focuses on External Factors. Like network effects, these factors are not an intrinsic part of the programming language, but still highly influence the payoff to the users. Consequently, they greatly affect the language’s popularity. The major external factor that Graham discusses is the need for a use, specifically the need to become “the scripting language of a popular system”. As he notes, programming languages are often judged on what it is used for. Therefore, make it apply to something popular, and there’s a better chance that it will become popular.

Since programming languages have to be used by many different people, it is a much better choice to use a programming language that everyone else is using. Thus, the idea of the rich-getting-richer comes into play. If a programming language is very popular, it is probably pretty easy for it to stay popular or even increase its popularity. However, if no one uses the language, it takes a much longer time for it to gain acceptance on a large level. Thus, a new language may just need time, or as Graham puts it (in point 9), “it’s not when people notice you’re there that they pay attention; it’s when they notice you’re still there”.

Posted in Topics: Education

No Comments

The Tipping Point in the Political Underground

At the beginning of this month, a clever illustration of the power of information cascades was carried out in the even seedier underground of politics: partisan blogging. As Josiah Roe, Executive VP of Coptix, Inc., which provides backup DNS hosting for sites such as “georgewbush.com”, writes, his company decided decided to do a little test on April 1st. Starting with a photograph of Vice President Dick Cheney, a few minutes work produced a similar photo, but with a suspicious file-folder now located under his arm, labeled “COPTIX”. The doctored photo was then given to a right-wing blogger who was in on the plan, a number of links were posted to other blogging and social-networking sites, and the fun began. In short order, tens of thousands of people were viewing the image, and propagating the information on their own blogs. From there it then spread to big-name political blogs like Wonkette and Daily Kos — which combined have over a million unique visitors every day — and within a few short days from its creation, the falsified image had been seen by millions of people as evidence of Cheney’s illegal relocation of damaging emails.

This experiment reveals some of the power that information cascades have in all parts of the political arena. Once given a reasonably plausible (but not necessarily reliable) source, many people will accept the new information as fact, provided that it fits with their preconceived ideas. From there, the idea picks up speed in the community, and in a short time it will have enough attention to be picked up by one of the larger sites, and may even spread to more mainstream newspaper or TV news organizations if left unchecked. The most important factor seems to be the low level of information required to tip the first bloggers, due to their own biases. While these liberal blogs would be unlikely to pick up on something that contradicted their own beliefs without very strong, factual evidence, and perhaps not even then, a hoax of this fashion needs only the least shreds of credibility to be propagated at first, and once it reaches this tipping point, it builds enough momentum to convince even the more reputable blogs of its veracity. For this reason, it is perhaps not surprising that the story got much less press on the other side of the aisle, since both the idea and the myriad sources quoting it as truth were much less palatable to conservatives, resulting in insufficient impetus to start another cascade.

Posted in Topics: General, social studies

View Comment (1) »

Party Acquaintances

http://www.cut-the-knot.org/Curriculum/Combinatorics/ThreeOrThree.shtml

A famous theorem in combinatorics known as the Friendship Theorem states the following:

“In a party of six people, there are at least three mutual acquaintances, or there are at least three mutual strangers.”

It is clear that this statement is equivalent to saying that if we use a bicoloring of the edges in the complete graph on 6 vertices, where a red edge is drawn between a pair of acquaintances and a blue edge is drawn otherwise (where nodes represent people), then there exists at least one monochromatic triangle. 

The proof of this claim is a simple application of the Pigeonhole Principle.  Consider any vertex in K_6, call it x.  Clearly x has degree 5 and thus is incident with at least three edges of the same color (by Pigeonhole Principle), WLOG say red.  Let a, b, and c be 3 vertices connected to x by a red edge.  If any of the edges between a, b, or c is red, then there will be a monochromatic red triangle.  (If there is a red edge between a and b, then triangle xab will be a red triangle.)

If none of the edges between a, b, or c is red then clearly triangle abc  will be a monochromatic blue triangle.

In either case, we have shown that there will be a monochromatic triangle, which completes the proof.

In fact, A.W. Goodman has shown that any bicoloring of the edges in K_6 must have at least 2 monochromatic triangles.  His theorem gives the greatest lower bound on the number of monochromatic triangles that must appear in any bicoloring of the edges of K_n.  The bounds and the proof of the bounds is given here:  http://www.artofproblemsolving.com/Forum/viewtopic.php?t=5820

These theorems relate to what we have discussed in class because we are using graph theory to analyze social networks.    This topic is related to Ramsey Theory, which is a popular topic in graph theory/ combinatorics that further generalizes the the Friendship Theorem so that it may be applied to some larger social networks.   

Posted in Topics: Education

No Comments

Lying on your resume, information-cascade-style

Yesterday, MIT’s dean of admissions stepped down after admitting to lying about her academic credentials when she was hire, 28 years earlier. Marilee Jones had claimed to have degrees from three different New York colleges, when in fact she did not have a college degree at all. She had initially been hired for a position which did not require a college degree, which may explain why her credentials were never checked.

How does a high-school graduate become the dean of admissions for one of the world’s most prestigious universities? The answer is simple: credentials are an information cascade. Once one person has accepted Ms. Jones’s resume, that is a signal to the next person that there is no need to do any background checking. As more and more people accept her credentials, it becomes less and less likely that the next person will suspect anything. After enough people have accepted before you, checking will seem like a waste of time, because surely someone before you would have noticed any problems. The concept that the system could be duped so completely is too big a leap. By 1997, Ms. Jones had been working for MIT for 18 years, and calling Union, RPI, and Albany Medical College to verify her degrees would be unthinkable. Surely no one could work at MIT for 18 years with fake degrees!

This phenomenon is not restricted to MIT, of course. Lying on one’s resume is believed to be a fairly common practice, although estimates vary widely. But Ms. Jones is a particularly spectacular example of how a single mistake can turn into a big embarrassment three decades later, through the mechanism of information cascades.

Posted in Topics: Education

No Comments

The Movement to Mobilize

http://press.meetup.com/archives/000464.html

Last presidential election race saw a change in the way politicians informed America on their positions both domestic and foreign. Joe Trippi, Howard Dean’s manager for the 2004 presidential election, learned about a small Internet website called Meetup.com. The website connects people together with other people who share a similar interest. These like-minded individuals form groups on the website and post up when they will be having their next group meeting, or meet up. Mr. Trippi realized that political issues are no different than any other hobby or interest someone might have, and soon he began to form Howard Dean meet ups.

These groups became very powerful for a couple of reasons. First, it became a new source for political donations, reaching into the tens of millions of dollars for Mr. Dean alone. This also became a good forum both for spreading new political messages and gaining feedback from the voters. And lastly, these meet ups were able to do something ubiquitous television is not able to do; they mobilize the masses.

A true grass roots movement formed, mainly comprised of younger voters, who spread Howard Dean’s message by word of mouth to their friends. By using the Internet to bring supporters together, Howard Dean was able to garner a far larger and more active support base than he otherwise would have gained. Mr. Trippi discovered that if you can use the vast social network of the Internet correctly, it becomes incredibly powerful. This is not just because you can reach a large audience (television is much better at that), but because it can infiltrate into a larger friend-social network that can move people’s feet and their wallets.

Posted in Topics: Education

No Comments

Santa Claus, the Easter Bunny, and the Six Degrees of Separation

http://www.healthywealthynwise.com/article.asp?Article=5203

While browsing the internet I came across this article, and the title attracted my attention. Sadly, the article was not about Santa Claus nor the Easter Bunny, but it did analyze and attempt to debunk the six degrees idea. While the article does a good job exposing possible problems with the theory, its claims can be refuted by what we’ve learned in class.

In class we talked about the Small World Phenomenon and the Six Degrees of Separation. The concept of the Six Degrees of Separation originates from experiments conducted by Stanley Milgram in the 1960s. He set a target person somewhere in the United States, and told 100 other random people to try and get a letter to this person. The only catch is, these 100 people must send it through acquaintances whom are on a first name basis. The results of the experiment were startling, as Milgram found that the average number of steps passed was about six.

In the article, the author argues that though the it is true that the average number of steps taken is six, “the overwhelming majority of people in all of Milgram’s studies never got the material to the intended recipient at all!” He goes on to cite quantitative evidence that only 29% of the 217 starting chains were completed, and of those 64, the magic number of six was derived. He then argues that the other 71% of the population “were NOT CONNECTED AT ALL!” While the statistics he cites are legit, he neglected to take into consideration of the possibility of apathy in the participants. The fact that any two people in the United States may not be connected can only be one contributing factor in the 71%, many others may be too indolent or do not have the motivation to carry out this experiment.

In addition to the apathy experienced by the participants, the experiment is not designed to find the shortest possible path. The experiment that Milgram conducted utilizes a technique called tunneling, where each partcipant is only allowed to send one letter to his/her acquaintance. Even if each and every participant tried their best to carry on the message, there is no way for each individual to know the links connected to their acquaintance, hence the participant does not know the shortest path to the target.

The second half of his article seems to pertain more to the importance of networking skills in satisfying the six degrees of separation. Only through “reading, training and coaching can people develop their networking skills, and become part of the roughly 29% of people.” However, in class we proved that the existence of one random link connecting an individual to a network different from his could be the difference, and unless this link is non existent, there will always be a path connecting two individuals.

The last line of the article goes something like this: “As for the 71% of people who are not connected and yet still believe in the Six Degrees of Separation concept – keep the faith. You’ll always have Santa Claus.” and we will keep faith, as we are all connected to Santa Claus through that one single unique link, that allows us to be connected to everyone else.

Posted in Topics: Education, social studies

No Comments

Hard Cliques and Clustering in Graph Structure

An important aspect of graph structure is a clique or cluster: a highly connected component of nodes. For example, a clique within a social network might be a group of friends that all know each other. If we have a graph that represents countries that are trading partners, we could identify a cartel by searching for a clique. Finally, we might have a graph where nodes represent individuals in a company and edges represent e-mail conversations; if a clique appears in this graph - the individuals may be participating in some project, or even engaging in fraud [enron]. Indeed, the search for clique’s has been applicable to a wide range of networks - relevant to sociology, physical networks, and bioinformatics. Thus, a common question for computer scientists is to rigidly define what a clique is, and determine a way to find them.

One technique for finding clique’s is to partition a graph into k (integer) clusters. Each node must be in one and only one cluster (”hard clustering”). Finally, the objective is to assign the nodes to cluster’s in an optimal way. The first way is to maximize the minimum-distance between nodes in two separate clusters. That is, the technique tries to make any two clusters as far as possible from each other. Note that this problem is solvable in polynomial time using a minimum spanning tree. A second way is to minimize the maximum-distance between nodes within the same cluster. That is, the technique tries to make every clique as ’small’ and ‘dense’ as possible. Note that this problem is NP complete, with polynomial time approximations. [Details on the two techniques (Asano, Bhattacharya, Keil, Yao, 1988)] . Both techniques do not require the clusters to be strongly connected. Thus, given a graph, we can find partition the graph into clique’s.

Once a graph is partitioned into clusters, it may be interesting to determine how dense each cluster is. For example, if every person in a social clique is friends with each other, then the clique is dense - and fully connected. However, if there are half as many edges in the social clique as in the fully connected clique — then the clique is less dense. A common metric for representing the density of the neighbors around a node is the Clustering Coefficient [wiki]. Indeed, the clustering coefficient over a whole graph provides a good metric for whether a graph will exhibit small world phenomena. A graph with a high clustering coefficient will be less small-world than a similar graph with the same number of edges and small clustering coefficient. The clustering coefficient is a ratio: the number of edges between a node’s neighbors divided by the number of possible edges. This can be calculated trivially. Finally, the density of an entire clique can be determined by averaging the clustering coefficient of all its members.

Thus, we can use the clustering coefficient and k-clustering techniques to find clique’s within a network. Both are important ways to study the interconnectivity within a graph.

Posted in Topics: Education

No Comments

Information Cascades vs Efficient Market Hypothesis

According to a paper by Len Skerratt in 2000, there is substantial evidence that financial markets do not react to information exactly as suggest by the efficient market hypothesis as argued by many. How can this be? One of the main explanations contribute the discrepancies to exogenous imperfections such as transaction costs. However, Skerratt focuses on behavioral causes and how agents actually process information rather than on imperfections. The subject of the paper is on how agents respond not only to their own private information, but also to the information which they infer other agents have. As discussed in class, these inferences are made on the basis of the actions which other agents are seen to take.

The idea behind market efficiency is that agents have private information which influences trading. The actions from the information is immediately reflected in the price (good news will push prices up). One of the main points of the paper is that when investors make sequential decisions, they will tend to ignore their own information and be guided by the actions of others. This is an inescapable spiral where subsequent investors only observe the uninformed choices of previous investors, which then ultimately reinforce the original choice (regardless of right or wrong). The paper continues on to quantify this theory and the origins of a cascade.

Skerratt draws several conclusions from his theory: The more informed will tend to go first since the less informed will wait to see what others are doing, thus ultimately most decision will be correct; but! the less informed may also go first, this happens because the more informed will wait until the market is really out of wack and strike later with their private information. Skerratt concludes that cascading in financial markets with respect to accounting information makes the most sense. It is the most public and strongest information available to the investor.

For more on Skerratt’s theories refer to:

http://cascades.behaviouralfinance.net/Sker00.pdf

Posted in Topics: Education

No Comments

“Small World” Experiment Revivial

Duncan Watts, of Watts-Strogatz model fame, is in the process of conducting a modernized version of Stanley Milgram’s original 1967 “small world” experiment as mentioned in class. Watts, now at Columbia University is the principal investigator on the the Columbia Small World Project. In this version of the experiment, people can volunteer to start a chain from the website. The user is then given the name of a target person as well as some personal information what they do for a living and where they living. Much as in the Milgram experiment, the volunteers are then asked to send the message to a person they think is closer to the target with the eventual goal of reaching the target. This however is where the similarites end.

With the benefit of hindsight, Watts and his colleagues have made many improvements on the original experiment. First, and most notably is the use of email rather than snail mail to send the the messages provides many advantages over the original experiment. Other than the obvious fact that email is much faster (especially in the case of overseas targets), email also allows a single person to start multiple chains very easily. This in turn allowed the experimenters to examine the effectiveness of different strategies of connecting people (geographically vs. occupationally for example). Also the experimenters hosted the email client that was used to send all of the emails for the experiment. This allowed them to more easily keep track of who was sending what where as well as discourage any sort of cheating.

Watts’ team also did a better job of collecting information on the participants. All volunteers were asked to provide demographic information on themselves before starting. This information included address, age, race, gender, religion, and occupation. In addition every time an email was sent, the sender had to say how they knew the recipient, and how strong their tie to them was. The recipient then had to confirm this information. The results of the first round of the experiment, which involved over 60,000 people and 18 targets in 13 countries, can he found here.

Posted in Topics: General, social studies

No Comments

Symmetry Breaking in Cornell Housing

Recent media has criticized Cornell’s cultural living communities for facilitating and encouraging segregated housing. While the housing initiative was clearly made with good intentions, many feel that the implementation of ethnocentric housing actually discourages the diversity Cornell aims to support, or, at very least, limits the interchange and interaction of people from diverse racial backgrounds.

In our discussion of symmetry breaking, we spoke about Schelling’s neighborhood model wherein, dependent upon tolerance to being in the minority, people self-segregate to be surrounded by similar peers. The simple idea is that people have only a certain tolerance to being in the minority and if this level of tolerance is exceeded, that is if the fraction of proximate similar peers is sufficiently low, then the person in the vast minority will move to be closer to those who share more in common. Similarly, the now vacant house of the displaced will be more desirable to someone similar to the majority in that area (i.e. the same group the other person had fled from). People are not necessarily fleeing any specific group intentionally; they just do not want to be in the minority. One obvious application for the analysis of this thought is race within neighborhoods and, taken a step further, dorms.

By this reasoning, if Cornell actually wanted to encourage diversity they would not provide the alternative of ghettoized housing arrangements. By segregating dormitories, even though that segregation was entirely voluntary and based on shared culture and interests, Cornell allows for a very low tolerance to being a minority. The way to maximize diversity in interchange instead of just numbers is to increase this tolerance. Being forced to live in a dorm with a variety of people unlike oneself creates an increased acceptance of being the minority and the realization that it does not have to matter. As was noted in discussion of statistical discrimination, if people think that type does not matter, then, in fact, type does not matter.

Forcing students of all types to integrate, especially during the crucial developmental period of college, will increase the tolerance for being a minority in a given setting and, in turn, tolerance for living, working, and general life situations.

Posted in Topics: Education

No Comments