Cornell Info 2040 - Networks

This is a supplemental blog for a course which will cover how the social, technological, and natural worlds are connected, and how the study of networks sheds light on these connections.

The World’s Longest Tunnel: A Networks Perspective

Thursday, April 26th, 2007 1:26 pm

Contributed by: krf39

Russia Plans World’s Longest Tunnel, a Link to Alaska

Russia recently released plans to build the world’s longest tunnel. The proposed tunnel would be dug beneath the Bering Strait, linking Siberia to Alaska, and would contain a railway, a highway, pipelines, and power and fiber-optic cables. The tunnel would be more than twice as long as the Channel Tunnel that connects the U.K. and France, and it may seem quite absurd considering that it would connect two fairly remote and unpopulated areas.

This is an interesting development from a networks perspective, as it will join the two disconnected giant components of many different networks. If we consider that, in all likelihood, the railroad, highway, power, and similar networks are essentially connected within the Americas and within the rest of the world, this link is a huge development, as it will cause the two giant components to merge into one. If this project is actually completed, it will be interesting to see what changes occur on these various networks on a global scale.

One thing that comes to mind is Braess’ Paradox, which demonstrates that adding links to a network can sometimes cause unexpected changes, even in faraway (but connected) parts of the network. In class, we also talked about how fairly minimal failures in power networks have been known to cause power outages on a huge scale, another example of how small changes can have unexpected effects on a very large scale. This suggests that it might be unwise to connect the two giant components, since minor problems would then be able to cause problems in much of the world, as opposed to only about half of it without this network link.
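To make the Braess’ Paradox point concrete, here is a minimal sketch of the standard textbook instance. The numbers (4000 drivers, a 45-minute fixed road, a congestible road taking cars/100 minutes) and all function names are illustrative assumptions, not anything taken from the tunnel plans. It only checks that the stated traffic patterns are equilibria; it does not solve for them.

```python
# Textbook Braess network: drivers go from S to E via A or via B.
# All numbers and names here are illustrative assumptions.
DRIVERS = 4000
FIXED = 45  # minutes on a wide road whose travel time ignores traffic


def congestible(cars):
    """Travel time in minutes on a road that slows down with traffic."""
    return cars / 100


def times_without_shortcut(via_a):
    """Route times when `via_a` drivers take S-A-E and the rest take S-B-E."""
    via_b = DRIVERS - via_a
    route_a = congestible(via_a) + FIXED   # S-A congestible, A-E fixed
    route_b = FIXED + congestible(via_b)   # S-B fixed, B-E congestible
    return route_a, route_b


def times_with_shortcut(via_a, via_b, via_shortcut):
    """Route times once a zero-cost A-B link exists (new route S-A-B-E)."""
    sa = congestible(via_a + via_shortcut)  # all cars using road S-A
    be = congestible(via_b + via_shortcut)  # all cars using road B-E
    return sa + FIXED, FIXED + be, sa + 0 + be


# Before the new link: an even split is an equilibrium (both routes equal).
before = times_without_shortcut(2000)

# After the new link: everyone on the shortcut is an equilibrium, because
# either old route would take a deviating driver longer than the shortcut.
after = times_with_shortcut(0, 0, 4000)

print(before)  # (65.0, 65.0)
print(after)   # (85.0, 85.0, 80.0)
```

In this toy network every driver's trip goes from 65 to 80 minutes after the "free" link is added, which is the sense in which a new edge can make a connected network worse for everyone.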

Another connection to class was the discussion of what happens when two disconnected giant components connect to each other. We related this to the time when the Americas were connected to the rest of the world by transatlantic travel, and to the explosive effects that followed.

Will the addition of network edges connecting many of the currently disconnected physical networks cause explosive changes? It is difficult to predict but nonetheless, it is certainly something that should be considered before completing a project of this magnitude in order to understand what may happen besides the immediate changes caused by the project.

Posted in Topics: Education


False Information Cascade?

Thursday, April 26th, 2007 1:20 pm

Contributed by: cp2009

It is interesting to consider the phenomenon of information cascades when people have to choose between several options. This does not seem to be too big of a deal when deciding where to eat, how to dress, etc. based on what other people are doing. However, when an information cascade concerns how individuals feel about political or moral topics, it could affect outcomes of much greater significance. As we discussed in class, information cascades do not necessarily spread correct, relevant, or even favorable information (for example, the most popular restaurant may not be your preferred choice).

I recently found this story involving a post on Digg in which someone is accused of stealing a stylesheet from Digg. The story in itself is not a big deal, but the post was heavily upvoted as one of the best, most popular posts on the site. In a situation like this you have to wonder whether any of these upvotes came from readers who investigated the topic, or whether the readers were just persuaded by those before them to agree with the writer (the second seems far more likely). This shows that an information cascade can damage an individual’s reputation, among other harmful things (such as making one restaurant go out of business while another prospers).

This particular situation is almost like spreading a rumor online where everyone can see it. One person says it is true, so everyone else agrees that it must be too. This kind of statement is not in the same spirit as drawing marbles from two bags with different colors at differing probabilities. In a case such as that, subsequent individuals have some private information they can base their judgement on as well as the results of the previous individuals. In fact, each individual has some solidly factual information on which to base their judgment. In the Digg case, other individuals can either agree, disagree, or just abstain from partaking in a story without having any factual information for certain.
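The marble setting mentioned above can be sketched in a few lines. This is a simplified counting version of the sequential-guess model, under the assumptions that every private signal is equally reliable and that a person breaks ties in favor of their own signal; the function name `run_cascade` is made up for this illustration.

```python
def run_cascade(signals):
    """Return each person's public guess given private signals ('A' or 'B').

    Each person sees all earlier guesses, then their own signal. While
    earlier guesses still reveal the signals behind them, they count as
    evidence. Once the revealed evidence leads by two, a single private
    signal cannot change the majority, so everyone copies the crowd.
    """
    lead = 0       # (revealed A signals) minus (revealed B signals)
    guesses = []
    for signal in signals:
        if lead >= 2:
            guesses.append("A")    # cascade on A: own signal is ignored
        elif lead <= -2:
            guesses.append("B")    # cascade on B: own signal is ignored
        else:
            guesses.append(signal)  # guess follows own signal, revealing it
            lead += 1 if signal == "A" else -1
    return guesses


# Two early 'A' signals lock everyone in, although four of six signals say 'B'.
print(run_cascade(list("AABBBB")))  # ['A', 'A', 'A', 'A', 'A', 'A']
```

In the Digg case the situation is arguably worse than in this sketch, since later voters may have no private signal at all and only the public tally to go on.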

This also relates to the small world phenomenon that we discussed in class. For a normal rumor, if everyone spread it, then it would become popular knowledge in very little time. For a social site like Digg, people have even more of a chance to interact with others that they do not personally know in “real” life. This allows false information like this to spread rapidly amongst visitors of the site. As stated previously, they can vote that post positively so that others will see it, subsequently making it spread more rapidly. Needless to say, one should be very judicious when viewing the opinions of others. With the way that people interact and think, especially with the way information spreads on the internet, it seems sufficient to say that individuals should keep these ideas in mind when reading or discussing any sort of non-trivial information.

Posted in Topics: Education


Stochastic Diffusion Search

Thursday, April 26th, 2007 10:13 am

Contributed by: Tecla

Stochastic Diffusion Search Explanation (basic)

Algorithmic Explanation of SDS

Mathematical Proof

Stochastic Diffusion Search is a process for testing a “hypothesis” or acquiring information from a social network. Individual “agents” are used to break down the problem and gather what have been labeled “partial evaluations” of the “hypothesis.” Through the process, the “agents” and their “evaluations” converge on a result. We can use the concepts learned in class to develop a very basic understanding of this while analyzing the algorithm.

First we must assume that there are many possible behaviors that a node may adopt, and that there are N nodes. All nodes begin with a randomly adopted behavior and begin their test. After each test the nodes communicate via one-on-one interactions. If a node receives a favorable result, it will stay with that behavior. However, if a node does not receive a favorable result, then one of two things may happen: 1) the node will talk to one of its neighbors and will switch to the neighbor’s behavior if the neighbor received a favorable result, or 2) the node will begin its process over again with a different randomized behavior. As time progresses, the nodes will converge to a single favorable behavior, or what can be labeled the socially optimizing behavior.
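The test-and-communicate loop described above can be sketched as follows. This is a minimal illustration in a toy string-search setting chosen only for this example: a “behavior” is a guessed position of a pattern in a text, and a “favorable result” is one randomly chosen character matching. The function name, parameters, and example strings are all assumptions for the sketch, and each agent polls any other agent rather than a fixed set of neighbors.

```python
import random
from collections import Counter


def sds_search(text, pattern, n_agents=20, n_rounds=50, seed=0):
    """Return the position most active agents agree on, or None."""
    rng = random.Random(seed)
    positions = range(len(text) - len(pattern) + 1)
    # Every agent starts with a randomly adopted hypothesis (behavior).
    hyps = [rng.choice(positions) for _ in range(n_agents)]
    active = [False] * n_agents

    for _ in range(n_rounds):
        # Test phase: each agent makes one cheap partial evaluation.
        for i, h in enumerate(hyps):
            j = rng.randrange(len(pattern))
            active[i] = text[h + j] == pattern[j]

        # Communication phase: unhappy agents poll one random agent.
        for i in range(n_agents):
            if not active[i]:
                other = rng.randrange(n_agents)
                if active[other]:
                    hyps[i] = hyps[other]           # copy a happy agent
                else:
                    hyps[i] = rng.choice(positions)  # start over at random

    votes = Counter(h for h, ok in zip(hyps, active) if ok)
    return votes.most_common(1)[0][0] if votes else None


# 'abc' starts at index 8; no other position passes any single-character test.
print(sds_search("xxxxxxxxabcxxxxx", "abc"))
```

Because no agent ever evaluates the whole pattern, agreement among active agents is what signals the answer, which matches the convergence idea in the post.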

As the process continues, nodes begin to adopt behaviors based on the results of their neighbors, and fewer and fewer behaviors are left as options. As the nodes begin to tend toward a behavior, the threshold of the network is reached, since every node communicates with several if not all of its neighbors. As the nodes run through the algorithm, their behavioral options become limited, and soon each node is behaving based on the actions of its neighbors. Clusters can be applied to explain why the process does not unravel through some false cascade. As the acceptance of a behavior grows, the density of the network of nodes with this behavior increases, which decreases the likelihood of a cascade against this behavior. Once the density reaches the crucial value of 1-q, not only is it impossible for a cascade to break through and change the results thus far, but the behavior adopted by this cluster is also more likely than not the favorable or socially optimizing behavior.

It is important to keep in mind that in this very simple explanation behavior can be a Boolean, a piece of information, the result of a personal payoff to a node and many more. This is a very bare bones analysis. The mathematical article does a good job with the proof of convergence.

Posted in Topics: Education


Six Degrees of Separation on Facebook

Thursday, April 26th, 2007 9:14 am

Contributed by: Bistra Dilkina

Social networking websites would seem to be the perfect venue for conducting a “Six Degrees of Separation” experiment over a large population. The advantages are that the networks involved are very explicit, and that as long as a large, popular network was used, there would be tremendous amounts of data accessible to the experimenters.

Early on in Facebook’s career, there was a section on each person’s profile that indicated how many friend links it took to get from you to them. (Sadly, it appears to have been removed from the more recent versions.) From what I remember, this count was never more than three or four for people in the Cornell community (the only profiles I could look at back when Facebook was closed), which would indicate (although by no means prove) that Cornell has a lower separation degree internally than the oft-quoted six. This makes sense given the small size of the campus and the relatively close quarters in which all of us live and work.

Even now on Facebook, a search for groups involving Six Degrees of Separation turns up a number that propose to research this phenomenon. However, these groups vary in size. The only sizable group is here, and it has more than 500,000 members at the moment. The person who started the group has made two trial runs in which he tried to connect himself to several randomly selected members of the group, and he succeeded with average separation degrees of 4.67 the first time and 4 the second time (taken at group sizes of 350k and 400k respectively). However, he only tested himself connecting to 6 people each time, so the data is perhaps somewhat suspect.

All the other Facebook groups are significantly smaller (none over 1000 members), despite some of them having better experimental procedure. For instance, one group aims to collect the friend data of all the users who join and then to analyze that offline, which would be much more effective at checking large numbers of chains. However, the most telling experiment would be one conducted by Facebook itself, using all of the data stored on its servers. That would at least give an estimate of the separation over a very widely geographically distributed population (albeit one connected by a common interest). Because of this common interest, I would expect such a survey to find degrees of separation much lower than the original Milgram study did (perhaps in the neighborhood of 4 or 5, rather than the 6-7 that Milgram found).
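The offline analysis that one of the groups proposes could look roughly like the sketch below: a breadth-first search over collected friend lists, which gives the number of friend links from one person to everyone reachable. The tiny `friends` graph and its names are invented for illustration and stand in for real friend data.

```python
from collections import deque


def degrees_from(graph, source):
    """Breadth-first search: friend links from `source` to each person."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        person = queue.popleft()
        for friend in graph[person]:
            if friend not in dist:          # first visit is the shortest path
                dist[friend] = dist[person] + 1
                queue.append(friend)
    return dist


# Hypothetical friend lists (friendship is mutual, so links appear twice).
friends = {
    "ann": ["bob", "cat"],
    "bob": ["ann", "dan"],
    "cat": ["ann", "dan"],
    "dan": ["bob", "cat", "eve"],
    "eve": ["dan"],
}

dist = degrees_from(friends, "ann")
others = [d for person, d in dist.items() if person != "ann"]
average = sum(others) / len(others)

print(dist["eve"])  # 3
print(average)      # 1.75
```

Running the search from every member, rather than from one person to six targets, is what would make the average far less suspect than the group founder's trial runs.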

Sadly, none of these steps really addresses the question of whether this is true in the world as a whole. For that, we are likely to wait a long, long time.

Posted in Topics: Education


Box Office Hit or Miss?: Rolling the 80 Million Dollar Dice

Thursday, April 26th, 2007 12:47 am

Contributed by: lastimada

The following article discusses the random nature of movie success in the industry as it exists today: link to article. Academics have slowly come to realize that the heads of film studios who decide which films to “green light” do no better than an ape throwing darts at a board would in their position. Movie success is not determined by critical examination of scripts or by a studio head’s taste. Instead, a number of random factors from pre- to post-production are at work. That is why a movie like The Blair Witch Project, which had a production budget of 60,000 dollars, can do so well while films with bigger budgets, star actors, polished scripts, and well-known directors can flop. Big money can help make a film successful, and low-budget films can do well, but there are three obstacles:

1) Even spending a lot of money on actors, directors, writers, etc. does not guarantee chemistry or good performances.

2) A good film with a moderate budget won’t have the big-budget advantages or advertising.

3) Most importantly, the budget only affects the film’s initial success. Soon after release the information cascade effects take over.

People all have some information from sources like advertisements, promotion, previews, and headliners, all of which the studios can control. Then there are reviews from critics or possibly peer reviews. The real cascades come from what movies we see other people selecting. I believe there are two ways we see this: 1) our friends say “hey, that was a really good/bad movie” and we see/skip the film. Just think how much stock you put into what your friends think of a film, especially when there are up to 9 dollars per film at stake. Think of the typical dialog when you are deciding to view a film: “What about film x? Oh I heard that was bad.” 2) When a film has a huge or horrible opening weekend this can make a difference. Films can distinguish themselves from others by being number 1 in the box office. The idea is the individual reasons that if most people are investing their money in these movies, maybe I should see one of these movies.

Film studios have long sought to devise some formula for choosing which movies to produce; however, the answer is simple: make movies that people will like. No bad movie has ever done well. Once you figure out how to choose good movies, though, you have to accept that a movie’s success is still quite random. Suppose there are two “good” movies, with “good” meaning that the movie meets whatever criteria you have chosen. If you have a set of people and each of them chooses one movie, your “good” movie may or may not do well. Therefore, success is not determined by the quality of the movie, but by dumb luck.

Posted in Topics: Education


The Small World Project

Thursday, April 26th, 2007 12:02 am

Contributed by: mlh253

http://smallworld.columbia.edu/index.html

In class we discussed Stanley Milgram’s famous experiment that attempted to test the small-world hypothesis. Although his experiment led to the coining of the famous phrase “Six degrees of separation,” we also mentioned that his results were not particularly compelling for several reasons (e.g., targets were people of high or notable status, and many of the chains never reached the target). Attempts to replicate his work or to conduct similar experiments have proven difficult and/or have been restricted to a smaller network, but there is one that aims to discover whether or not the small-world property is indeed real.

The Small World Project, led by Duncan J. Watts and others at Columbia University, is a large-scale experiment that uses email instead of regular postal mail to try to find targets. The basic rules still apply: participants are given some information about their assigned target but are not allowed to email them directly unless they know them personally; they can only pass on the message via email to a mutual acquaintance in an effort to get the message “closer” to their target.

The project also expands upon Milgram’s experiment in a few ways. First, targets are chosen at random; some targets have included a professor from America, an Australian policeman, and a veterinarian from Norway (http://www.newscientist.com/article.ns?id=dn4037). This tests to see if “six degrees of separation” is applicable to arbitrary and “lower-status” targets. Second, participants are allowed to send multiple emails for the same target, as long as each person they send to is a mutual acquaintance. Since participants have to fill out information about each person they send to, how they know each other, and why they chose to send an email to them (as well as demographic information about themselves), spamming is not a major concern. Third, the project also seeks to analyze the “distribution of lengths, along with the effect of race, class, nationality, occupation, and education,” in addition to average length in general.

So far the project has estimated that for chains that started and ended in the same country, the average length was about 5, and for those that ended in different countries, the average was close to 7 (http://www.sciencemag.org/cgi/content/full/301/5634/827?ijkey=Evqpw33fK8Y.2&keytype=ref&siteid=sci).

In lecture we talked a little bit about how replicating Milgram’s experiment would be impractical, mainly because participation rates would be low. The Small World Project, however, has taken measures that helped it obtain substantial participation. First, the people who start every chain volunteer themselves by signing up on the website; thus the dropout rate at the first step is low. Second, email is quicker and more convenient than postal mail, which means that more people may be willing to take the small amount of time to participate if they receive an email from a previous person in the chain. Third, all emails and communications relevant to the experiment are handled on the project site itself, which makes it more trustworthy, especially since the site is hosted by Columbia University.

Posted in Topics: Education


Relationship Building

Wednesday, April 25th, 2007 9:57 pm

Contributed by: lfe56

http://www.stayfreemagazine.org/public/wsj_software.html

I recently came across an interesting article called “Six Degrees of Exploitation? — New Programs Help Companies `Mine’ Workers’ Relationships For Key Business Prospects”. It is a Wall Street Journal article written by William M. Bulkeley and Wailin Wong in August of 2003. Clearly, the article is quite old in technology years, but I think it presents many interesting ideas that can supplement our in-class discussion of “six degrees of separation”.

The article discusses the creation of software that companies can use to seek out new business contacts. This software works by searching through employees’ electronically stored correspondence with acquaintances, which could be stored in computer address books, calendars, email, buddy lists on instant messenger, or some other medium (Bulkeley and Wong). If the person using the software finds a business contact that interests him, he can request an introduction from the employee whose information he used to find this person (Bulkeley and Wong).

In the context of our discussion of “six degrees of separation”, this software can be thought of as allowing the user to see one step ahead in the network of acquaintances. The user still needs to, or at least should, ask a fellow employee for an introduction to the acquaintance, so no step is completely eliminated. This greater visibility of the social network should certainly make new business opportunities available, but it is obvious that these types of systems also introduce many new privacy and ethics issues.

These systems can be compared to the way online social networking sites could be used today. In particular, a user of Facebook.com may look through the profiles of an acquaintance’s list of friends. If the user decides that he wants to meet one of these people, he can ask his acquaintance to introduce him.

Posted in Topics: Education


The Six Degrees of Wikipedia

Wednesday, April 25th, 2007 9:44 pm

Contributed by: leerose

A friend recently sent me a link to The Six Degrees of Wikipedia, an online tool that finds the shortest path between any two articles on Wikipedia using links in that article. While certainly an addicting little procrastination tool (seriously, try to “beat” it), it is interesting to consider this in the context of what we have recently been studying in class. The website is not updated all that often (as in, synchronized with the current Wikipedia database), but the Wikipedia homepage currently claims 1,755,365 articles in English.

Obviously, the number of articles on Wikipedia is significantly less than the world population, but the average number of Wiki links per Wikipedia article is also probably significantly less than the number of people a typical person knows (although I was unable to find any real data on this). For example, not including articles on dates and years (an option of the 6 Degree tool), the article “Jon Kleinberg” has 11 outlinks to other Wikipedia articles. (Sadly, the article must be too recent to be included in the database searched by the 6 Degree tool.) With the current population of the world (according to Google) being around 6,525,170,264, roughly a factor of 3,700 more than the number of English articles on Wikipedia, we find that Wikipedia is an even smaller world than the world’s social network, assuming Professor Kleinberg’s Wikipedia entry is indicative of the average entry and that he doesn’t know 40,000+ people on a first-name basis.

What could make up for this, however, is that the articles on Wikipedia don’t necessarily all have something in common. All people, or at least most of us, are social beings. We are all part of the same large social network and inhabit the same world, breathe the same air and drink the same water. It is true that all articles on Wikipedia are, well, articles on Wikipedia, so they at least have that in common. But what, exactly, would the band Dispatch have in common with Torrenza, a technological initiative/project by Advanced Micro Devices? Yet plugging in “Dispatch (band)” and “Torrenza” gives just 4 degrees of separation:

Dispatch (band)

1990s

AOL

Advanced Micro Devices

Torrenza

which is somewhat remarkable.

The return path is similarly short, requiring only 5 steps:

Apparently, the floppy disk is somehow related to Zimbabwe, which is related to a nonprofit organization created (or inspired) by Dispatch, the jam band from Boston. Again, somewhat unbelievable, but looking through the path articles proves all of this.

While there is not nearly enough data available to actually create a model justifying this small-world phenomenon on Wikipedia, we could possibly come up with one if we knew the average number of outlinks and inlinks for any given article on Wikipedia (outlinks are only links to other Wikipedia articles, and inlinks are only links from other Wikipedia articles). Using the same intuition we used when first discussing the world’s social network, the average number of outlinks should equal the average number of inlinks in our closed network, since every outlink from one article must be an inlink to another. From this information we could possibly determine the average redundancy (i.e., the average number of outlinks that are not directly shared by an article’s “neighbors,” with its “neighbors” being the articles it directly links to) and thus create a fairly accurate model predicting the average “degree of separation” between any two Wikipedia articles.
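As a very rough illustration of such a model, the sketch below assumes zero redundancy: every outlink leads to an article not seen before, so the number of articles reachable in d steps grows like k to the power d. The value k = 11 is borrowed from the single “Jon Kleinberg” example earlier in the post and is an assumption, not a measured average; the function name is made up for this sketch.

```python
import math


def estimated_degrees(n_nodes, links_per_node):
    """Steps d at which links_per_node**d first covers n_nodes.

    Assumes no overlap between neighborhoods, so this is an optimistic
    (low) estimate of the typical separation.
    """
    return math.log(n_nodes) / math.log(links_per_node)


ARTICLES = 1_755_365   # English article count quoted in the post
OUTLINKS = 11          # assumed average, taken from one example article

wiki = estimated_degrees(ARTICLES, OUTLINKS)
print(round(wiki, 2))  # 6.0
```

Under these assumptions the estimate lands almost exactly on six, since 11 to the sixth power is 1,771,561. Real redundancy would push the number up, while a higher true average outlink count would pull it down, which is why measured averages are needed before trusting such a model.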

Posted in Topics: Education


Organizational Information Hierarchies: An Application of Information Cascade

Wednesday, April 25th, 2007 9:38 pm

Contributed by: ramuski

In what may be a politically biased article which bemoans flaws in US military intelligence organizations, Julian Sanchez argues that information cascades are a potential reason these flaws occur. He references an article on information cascades which mentions an experiment performed by economists Angela Hung and Charles Plott:

“Subjects were told that they would be picking marbles from one of two urns. Each urn contained a mix of dark and light colored marbles, but urn A had many more light than dark marbles, while urn B had many more dark than light ones. Each subject would pluck a marble from the urn, showing it to nobody, and then, in sequence, they made guesses as to which urn they were picking from, without showing the other group members what marble they’d seen. Each member of the group stood to win a few bucks if she guessed correctly.” - Sanchez

Assuming that Sanchez also meant to include that subjects would announce their guess, this seems very similar to the 2nd problem of our 5th homework. Hung and Plott’s experiment indicated that an information cascade would likely form in this situation, which was coincidentally also the answer to our homework problem. Sanchez asks us to think of the existence of a similar situation in an organization. Envisioning this, we could give a semi-formal structure to this problem.

Suppose a certain decision, A or B, is to be made by a manager and a number of analysts. The analysts, each having an initial 50/50 chance of receiving an A or B signal, voice their choices sequentially. After they are done, the boss adopts the majority choice as the decision. A cascade could very well develop here if the first people who voice their opinions choose A because they think the boss wants them to choose A (say, for political reasons). Sanchez also mentions a potential reason why it is in the individual’s interest to follow the cascade in an organization, which seems intuitive:

“If you’re wrong when the majority gets it wrong, you’re unlikely to get singled out, but if you dissent from an accurate consensus, the mistake is much more likely to get noticed.” - Sanchez

Now we can consider what happens if we add hierarchies to the situation, so that the boss is himself/herself an analyst reporting to some other boss, who reports to some other boss, and so on. What we are likely to have is an example of the SNAFU principle, which Sanchez quotes from Robert Anton Wilson:

“SNAFU Principle: Because subordinates tend to tell superiors what they want to hear, the higher up any hierarchical ladder you go, the more distorted the picture becomes. The person with the most authority in the system will likely be the most ignorant…” - Sanchez

I would like to change this to SNAFU’, which starts with “If subordinates tell superiors…” This decision-making structure can be envisioned as a tree where a boss with analysts corresponds to a node with children. A decision-making process would start with each analyst in a set of analysts in the lowest layer of the tree choosing some decision. When a set is done, its results are reported to the boss, who takes the majority decision as a signal and announces his or her own choice. Some bosses will announce earlier than others depending on how many analysts they have, which would make the announcement of choices by bosses more or less sequential.

If the politically motivated analysts choose faster than their more objective colleagues, sets of analysts with more politically motivated individuals will choose faster, and consequently their boss will announce his choice faster. This implies that the politically motivated cascade would likely continue to bubble up the tree until it reaches the head node, which makes the final decision.
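The tree structure described above can be sketched as a nested list, where a string is an analyst’s choice and a list is a boss with reports. This minimal illustration leaves out the timing and cascade effects entirely and only shows how majority reporting layer by layer can distort the picture; the `decide` function and the example organization are assumptions made for this sketch, and team sizes are kept odd to avoid ties.

```python
from collections import Counter


def decide(node):
    """Return the choice announced by an analyst (str) or a boss (list)."""
    if isinstance(node, str):
        return node                       # an analyst reports their own choice
    votes = Counter(decide(child) for child in node)
    return votes.most_common(1)[0][0]     # a boss reports the majority


def leaves(node):
    """Flatten the tree into the list of individual analyst choices."""
    if isinstance(node, str):
        return [node]
    return [choice for child in node for choice in leaves(child)]


# Hypothetical organization: three teams of three analysts under one head.
org = [
    ["A", "A", "B"],
    ["A", "A", "B"],
    ["B", "B", "B"],
]

print(Counter(leaves(org)))  # Counter({'B': 5, 'A': 4})
print(decide(org))           # A
```

Here most analysts chose B, yet the head node ends up with A, because four well-placed A choices are enough to carry two of the three teams.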

If this problem of politically motivated analysts does exist, Hung and Plott give advice on how to avoid a politically motivated cascade. In their experiment, they found that if all individuals announce their choices at the same time and without collaborating with each other, it increased the chance of the overall group being right. Paying the group as a whole if the majority chose the right answer also increased the chance of the overall group being right. Incorporating these suggestions into an organization would likely have the same effect although the second may be more feasible than the first. This is one example of how the information cascade concept (and research) can be used to model (and solve) real life problems.

Posted in Topics: Mathematics, social studies


Wednesday, April 25th, 2007 9:07 pm

Contributed by: Blanche Dubois

http://www.theage.com.au/news/web/facebook-the-future-for-social-networks/2007/04/24/1177180618008.html

There has always been a heated battle about whether Facebook or MySpace is the top social networking site, but according to Jimmy Wales, Wikipedia co-founder, Facebook will emerge on top. So what makes Facebook so much better than the more veteran MySpace? “It’s actually useful,” claims Wales. While MySpace offered online social networking first, it is simply a social network for the sake of social networking. Facebook, at least originally, had a purpose: to network college students with respect to relevant academic parameters like school, major, and classes. Though its basic social networking capabilities are now extended to pretty much anyone with an email address, its original purpose is still its most popular.

So what makes MySpace so horrible? The lack of a centralized purpose, for one thing. Wales also criticizes MySpace’s abuse of advertisements. Facebook, he claims, has tasteful placement and an appropriate number of advertisements, while MySpace bombards its users with flashing ads. An interesting question remains, however. How exactly did MySpace, the extremely popular and innovative social network, become second-rate to the newcomer, Facebook? While ad frequency and placement certainly affected this shift, perhaps the more important difference is purpose. As mentioned above, Facebook had a distinct purpose, and this purpose attracted thousands of people whose primary interest wasn’t social networking but rather learning more about their college. Of course, online network junkies from MySpace jumped on the Facebook bandwagon too, resulting in a rich and unique social networking experience. Mark Zuckerberg, the founder of Facebook, has declined offers of well over 900 million U.S. dollars for his site. Clearly Facebook’s developers see a bright future for the site.

Posted in Topics: Technology
