» Cornell Info 2040

Cornell Info 2040 - Networks

This is a supplemental blog for a course which will cover how the social, technological, and natural worlds are connected, and how the study of networks sheds light on these connections.

PageRank in different contexts

Monday, March 31st, 2008 12:29 am

Contributed by: bigT

PageRank is well known as a highly successful link analysis method for deducing the importance of a web page. While the context in which this method is used has been limited to web page analysis, the algorithm does not make much assumptions about the nature of the nodes. This raises an interesting question. Can we utilize PageRank in a different kind of network graph? For example, if we run PageRank on a social network, what do the converged PageRank values mean, in the context of this social network? Or, if we run PageRank on a pipe network flow, does it tell us anything about the flow itself? Does it even make any sense?

While I do not have the answers, it’s worthwhile to investigate these questions, just to see where it leads us. We know PageRank revolves around the concept of Hub and Authority, so we can ask what do Hub and Authority mean in the context of, say, a social network. Before we move on, it should be noted that PageRank runs on a directed graph. A social network is not a directed graph; however, it is trivia to convert it into one, depending on what the directed edges between nodes mean. For simplicity’s sake, we can just convert each undirected edge (u, v) into two directed eges (u, v) and (v, u), so that every node has as many incoming edges as outgoing edges. In a social network, popular nodes have many incoming edges as well as outgoing edges. It seems obvious that these nodes have very high authority scores. Then what about unpopular nodes? Do they necessarily have low authority scores? If we think about it, these nodes, on their own, are not able to generate very high authority score due to their lack of incoming edges. However, as soon as an unpopular node makes a connection with a popular node, the popular node’s high authority score serves as a strong endorsement on that unpopular node. Thus, even if an unpopular node has very few incoming edges, if all the incoming edges are connected to popular nodes, this unpopular node would have very high authority score as well (though not as high as a popular node’s).

But what could authority scores possibly mean in the context of a social network? A node’s popularity? power? influence? helpful connections? authority (seems apt)? Knowledge? All these seem equally valid, but are they? As mentioned earlier, even unpopular nodes can have high authority scores, so at least we know authority scores do not indicate popularity. What about power? When we talk about power, ability to manipulate/control other nodes comes to mind. Using this definition, can we infer that from authority scores? If we think in terms of network exchange, nodes with many incoming edges usually (but not always) have bargaining power. Their bargaining power is greatest when at least one of their neighbors is a lonely node (a node whose incoming edges are very few), more so when most of that lonely node’s edges connect to non-lonely nodes. Do all this correlate well with authority scores? Not really. Again, unpopular, or lonely nodes, can have high authority scores, so having high authority scores don’t really mean one can control others. How about influence? We commonly talk about peer influence, so influence could well mean the ability to affect other nodes. It is one of the mechanisms by which a fad, a trend, or a meme spreads out. Strong/weak influence usually means how much we are able to affect a particular person, so such measure is very much the same as power. However, having wide influence would mean the ability to affect nodes very far out. Does this correlate well with authority scores? Seems likely (even an unpopular node can spread his/her idea if the node can affect a popular node), however, this depends on the edge’s strength, i.e. how close a person is to another. Because we have not taken this into account, we’ll leave it as inconclusive for the moment.

What about helpful connections? Does a node with high authority score get help easily compared to a node with low authority score? Consider an unpopular node with high authority score. The node has a higher chance of getting help from his/her popular neighbors. But if we think about it, a popular node, more often than not, has to interact with many nodes. His/her attention span to any particular node is often very short. It seems reasonable to expect that popular node not to pay much attention to this unpopular node (unless we take in account of edge strength). So this unpopular node, even with high authority score, is pretty much hopeless, so to speak. However, it’s also conceivable that helping the unpopular node doesn’t require much effort, as the popular node can often relegate the request to one of his/her many neighbors. Then that unpopular node is effectively gaining help through a friend’s friend. As the unpopular node does so, he/she gains a new friend. Now we have made quite a lot of assumptions, but this seems to point to a totally different and new direction. The ability to gain new friends. Does high authority score mean one can easily gain new friends? Seems very likely. To see this, read ‘The Tipping Point’ chapter two. It talks of connectors and non-connectors. A non-connector gets most of his/her friends through connectors, who are well connected by its very definition. The non-connector’s high authority score correspond well to this. If he/she has low authority score, it also seems reasonable that he/she is not connected to that many connectors, so that he/she can’t make new friends easily.

Much of these conclusions are purely thought experiments using common sense and a lot of assumptions, without much buttress from real-life experiments or examples. However, they do serve to illuminate the point that utilizing PageRank on different networks can let us gain new insights.

Posted in Topics: Education, Technology, social studies

No Comments

Networks… in the comics?

Monday, March 31st, 2008 12:13 am

Contributed by: timewalker

Well, this doesn’t relate to things in class necessarily, but I saw today’s xkcd comic and couldn’t resist posting it:

While I’m at it, I thought I’d post a few more network themed things from the interwebs:

On maximizing social welfare:

You may now return to your regularly scheduled blog-posting.

Posted in Topics: Bookmarks

No Comments

Visualizing Wikipedia and the Art of Drawing Large Graphs

Sunday, March 30th, 2008 11:51 pm

Contributed by: easy

While we have long since passed graph theory in class, I still find it to be a very interesting topic. As such, when I found some references to trying to create graphs representing Wikipedia, I was compelled to discuss this issue.

In class, whenever we’ve drawn a network or a graph on the board or in our assignments, they have been relatively small (usually less than 10 nodes, almost always less than 20). Compare this to a sample the size of Wikipedia. As I am writing this, Wikipedia has approximately 2.3 million articles and 12.6 million pages (which include user pages, discussion pages, templates and articles which don’t link to any other articles (leaf nodes)[1]. Imagine trying to graph this by hand.

Although computer tools to generate graphs from datasets abound, creating large graphs is an extremely complicated problem, one which has significant amounts of research and publications dedicated to it. While the issues involved in this problem are interesting, I would rather focus on some interesting views of Wikipedia.

Chris Harrison has an interesting pair of projects related to this. Wikiviz[2] starts with a Wikipedia entry and builds a graph by traversing Wikipedia out to 5 levels. While only covering a tiny fraction of Wikipedia, these graphs still become amazingly complex. He uses size and color to emphasize hubs on the graphs. For a more manageable view, his project ClusterBall[3] takes a single page and goes out two levels, showing the interrelations between all of these pages in a circular graph. Both of these yield very beautiful pictures. Be sure to check out the movies of the ClusterBalls forming as the algorithm works through all the nodes.

Another tool that relates graph theory and Wikipedia and provides a way to visualize the links between articles is WikiMindMap[4]. WikiMindMap lets you select a Wikipedia article and view all the links to other articles it contains (they are separated into sections, so you have to select a section of the article to see the links). You can either view each article that is linked to the page or select it as the new root of the graph and see all of its links. The tool provides a new way to browse Wikipedia.

I would encourage everyone to take some time to think about how a graph of the entirety of Wikipedia would be both useful and cumbersome. While it would be a great tool to analyze the relationships between a large amount of topics in all areas of study, it would also be impossible for a human to comprehend. With 2.3 million nodes and tens or hundreds of millions of edges, there is simply no way to take it all in. (Of course, such a graph would also take a significant amount of computing resources to generate). In this, Wikipedia shows us both the power and limitations of graphs to visualize information.

[1] http://en.wikipedia.org/wiki/Special:Statistics

[2] http://www.chrisharrison.net/projects/wikiviz/index.html

[3] http://www.chrisharrison.net/projects/clusterball/index.html

[4] http://www.wikimindmap.org/

Posted in Topics: Education

No Comments

Sequential Voting as an Information Cascade

Sunday, March 30th, 2008 11:02 pm

Contributed by: Boolean

Almost two and a half months have passed since the Iowa caucuses, and we seem (at least on the side of the Democrats) to be at least another few months away from choosing a presidential nominee. We hear the constant debates over February momentum and superdelegates, but does this painfully drawn out process to select a party’s best candidate actually lead to different results than one simultaneous nation-wide election?

Well, as we know from the study of information cascades, early decisions can have a profound impact, for better or for worse, on later choices. But how do cascades affect sequential elections? Do later voters inefficiently follow the herd and pick a candidate that they don’t like, or do the inferred signals of previous voters lead to a more informed decision. In “Information Asymmetries and Simultaneous Versus Sequential Voting”, Rebecca B. Morton and Kenneth C. Williams asses the effect of different voting processes on electoral results, making use of many information cascade concepts to analyze the qualities of each system.

The results of experiments show quite strikingly that sequential elections do lead to different voter choices than simultaneous elections. Furthermore, the study finds that the information cascade of sequential voting, by allowing later voters to infer the signals of earlier voters, actually lead to voters choosing candidates that better reflect their preferences, a “better” electoral outcome. While simultaneous elections generally favor better-known candidates, sequential elections allow lesser-known candidates to gain momentum. This sequential process often leads to a more moderate nominee, a choice that makes more voters content.

So, as we sit through the remaining months of the nominating struggle, we can console ourselves with the fact that chances are we’ll eventually make the right decision.

Here’s the original article:

“Information Asymmetries and Simultaneous Versus Sequential Voting”

Rebecca B. Morton and Kenneth C. Williams

http://www.jstor.org/view/00030554/di011604/01p0355r/1?frame=noframe&userID=80540094@cornell.edu/01c0a848740050ba75b&dpi=3&config=jstor

Posted in Topics: Education

No Comments

Circular Mills and Information Cascades

Sunday, March 30th, 2008 10:37 pm

Contributed by: klaygenie

Information cascades occur very often in nature. One typical example is the case of army ants. Throughout their history forager ants developed an evolutionary trait whereby, when separated from the group of other forager ants, one ant chooses a random direction and the other “lost” ants simply follow the ant in front of them. Eventually, the army of ants ends up in what is called a “circular mill,” or a large circle of ants. Some mills can reach up to a 1,200 foot circumference (with a lap time of 2 ½ hours) and will continue until the ants collapse and die or a group accidentally stray from the circle.

This type of behavior can be exhibited in humans also. The following article describes several examples that we observe in everyday life. Information cascades are everywhere around us: imagine fads like Tamagotchi and Pokemon, the housing market bubble, and fashion trends. While obviously information cascades such as these do not result in death as with the circular mill, they can sometimes be very harmful. When cascades collapse, people can lose millions in situations such as failed investments or real estate ventures. Since they are so easy to trigger, we often get caught up in them without even realizing it. It is important to make sure that people are properly informed in their decision making, eventually with big decisions like buying stock. Despite their harmful effects, information cascades are built into our genes, and we must learn ways to avoid getting caught up in herd behavior.

http://www.philippahuckle.com/ph/document.php?document=running_on_empty.

Posted in Topics: Education, General, Health

No Comments

The “trickle down effect” and information cascades

Sunday, March 30th, 2008 10:37 pm

Contributed by: dakru

One of the easiest ways to see the trickle down effect was described briefly in Malcolm Gladwell’s The Tipping Point. Malcolm talks of how one fashion trend, the Hush Puppies shoes, were able to come out of near death because of a few trend-setting hipsters. But what does Miranda Priestly, Meryl Streep’s fashion editor character in “The Devil Wears Prada”, have to think about this whole idea. The difference between what really happens in fashion and what Miranda Priestly says happens in fashion are very different things.

The Hush Puppy trend began with a few local trend-setters and grew into a nationwide craze within a couple years. This could definitely be described as a trickle up. When designers caught on to how hot hush puppies were with the trendy hipsters they couldn’t get their hands on the shoes fast enough for their own runways. Anyone buying the shoes in this early stage would be labeled an early adopter and definitely had to have a little bit more edge and confidence to try something so new. But once the shoes were on runways and celebrities all the average adopters and the laggers would start buying up the hush puppies no matter what. Miranda Priestly wasn’t too far off, but left out one important detail about how fashion trends begin.

A fashion trend isn’t just created out of thin air by some higher up fashion editor or designer. In a style similar of most art, fashion takes its cues from what is happening around the world in style and pop culture. Miranda Priestly said that a character was “wearing the sweater that was selected for [her] by the people in this room.” Miranda was not too far off though, because without the fashion industry catching on to these trends and highlighting them there may not be any trends at all. No matter who is right they both make it very clear that a fashion trend starts somewhere and causes a cascade of information to other people.

No matter where you want to believe that fashion trends begin it’s easy to see that they are great examples of an information cascade. Just like in the idea of an information cascade there are a group of early adopters. These are users that take the initial risk and try out a new fashion trend. These can either be specific groups of trendy people or fashion designers and editors thinking up new exciting things that could become trends. The next group are the people that watch these early adopters. These could be fashionistas that follow the designers and big name fashion publications such as Vogue. This group could also contain people that are inside the trendy group that started the trend. These second round adopters only have the initial adopters information of whether this is a real trend or not to go off of and still have a decision whether or not they want to take part in this fashion choice. The third group of people consist of the average consumer. Most people watch TV and movies to see what is popular, but mainly rely on their friends. If they see enough of their fashion trendy friends or celebrities wearing a product they will rationally act on the trend as well, whether or not they truly love it. This is where the information cascade comes in and also where another information cascade starts to happen.

Just as the fashion trend becomes popular to the general public, the trend setters now feel they need something new and start a cascade to fade out the old trend. The same way a trend comes in through the three groups the trend leaves. Who really creates fashion might be in debate, but there is no doubt that every fashion trend follows the rules of an information cascade. Next time you wonder about why you just bought something, think about how an information cascade gave you almost no choice in the matter.

Posted in Topics: Education

No Comments

The Reverse Dutch Auction: Overbooked Airlines and LULUs

Sunday, March 30th, 2008 10:16 pm

Contributed by: beefcake

http://www.cato.org/pubs/regulation/reg14n4-inhaber.html

I found this article on the Cato Institute’s website as part of their Regulation magazine. Quoting directly from the About section of their site, “The Cato Institute was founded in 1977 by Edward H. Crane. It is a non-profit public policy research foundation headquartered in Washington, D.C. The Cato Institute seeks to broaden the parameters of public policy debate to allow consideration of the traditional American principles of limited government, individual liberty, free markets and peace. Toward that goal, the Institute strives to achieve greater involvement of the intelligent, concerned lay public in questions of policy and the proper role of government.”

The article discusses how the concept of a typical Dutch auction can be applied in the reverse fashion to solve situations involving undesirable outcomes for the participants. Two of these situations that were described in the article were the designation of the location of LULUs (locally unwanted land uses) such as hazardous wastes, halfway houses for prisoners, radioactive detritus, and trash; and how airlines determine who to kick off a flight when it is overbooked. In both cases parties exist that absolutely do not want to change their current preference, riding the plane at the current time or living without said LULU in their backyard, but there are also others who would be willing to be compensated to accept the undesirable outcome.

There is a good chance the reader is familiar with the process that airlines use when a flight is overbooked. Traditionally, a free ticket to any destination is offered to the first people to relinquish the amount of seats that are needed. If in that case no one decides to accept the offer, bonuses continue to be added until someone does. This could include an extra $50 and increasing until the right number of people has accepted. This is in essence a reverse Dutch auction. Instead of there being one price that is lowered until someone accepts, the compensation gets increased until someone accepts.

A similar process is used for the designation of LULUs. Once an environmentally adequate location has been determined by thorough study, the communities involved are subjected to the reverse Dutch auction procedure of being offered compensation for allowing the LULU into their community. This process has been implemented after riots and excessive protesting occurred consistently when sites were chosen using statistical criterion. The main protest was “not in my backyard” or NIMBY trying to get the site to be moved elsewhere. The reverse Dutch auction operates on the premise that everyone has their price and that by offering things like tax exemptions, parks and playgrounds, city funding, etc; the majority of residents will accept and pass the resolution.

The reverse Dutch auction encourages people to place their own values on the undesirable outcome. It also allows for only the interested parties to participate in the auction. If you wanted to remain on the particular flight you simply need to hold your ticket and do nothing. Likewise, if you do not want the LULU in your backyard, simply do not change your mind despite the compensation. It is also the most efficient process to accomplish the task and reach the desired goal. The interested parties must act as soon as the company meets their price to avoid being beaten by the competition that is also looking to get compensated. This article took a concept of an auction we learned about in class and applied it in a modified way to discuss real world processes.

Posted in Topics: Education

No Comments

A Nod to William Vickery and Second Price Auctions

Sunday, March 30th, 2008 9:57 pm

Contributed by: perfectstranger

Even though information cascades with and without direct effects have dominated our classroom discussion lately, this entry is going to go back to our discussion of auctions. The article discusses some of the accomplishments of William Vickery (the William Vickery of VCG prices) and how he integrated the concepts of game theory and auctions. In multiple contexts, we have affirmed that in a second price auction with 2 bidders, it is a dominant strategy to bid your true value. When I first heard that, I took the fact as a given; however, this article opened my mind to the fact that an auction is just one big game, each person reacting to the actions of other buyers. What I failed to consider is that the auction is a bilateral game. Bidders must choose which strategy is best to maximize their utility. However, sellers also face a decision as to which type of auction to use to maximize payoff. As is obvious from our dissection of different auctions, clearly the choice of auction to employ is directly related to bidder behavior. The seller should choose the auction mechanism that gets buyers to reveal their true value. The question that arises then is what auction to employ?

As noted in our text and by the article, there are four general types of auctions: the English ascending bid auction, the Dutch descending auction, the first price auction in which the buyer pays what he or she bid and the second price auction in which the buyer does not pay his or her bid but the bid of the second highest bidder. We assumed that all of these auctions were in the “private” context in which each bidder’s value is independent of the other bidders’ values (with common values, we get into the context of the “winner’s curse”). What I learned in this article is that the Second Price bid auction was actually designed by Vickery himself. In my reading of our text, I thought that Vickery played an instrumental role in studying the effects of such an auction, but I failed to recognize that he actually generated the auction that has served as the basis for much of our studies. The article gives an example (much like many that we have gone over in class) why a bidder’s best strategy is to bid his or her true value no matter what the bidding behavior is of other bidders. If you bid below your true value, there is a chance that the other bidder will obtain the object when you actually valued it more and could have gained a greater surplus. If you bid above your true value, there’s a likely chance that you may overpay for the auctioned item. So Vickery’s created a system in which buyer behavior was consistent and predictable: always bid your true value. But why would a seller want to use this method? Sure a buyer will reveal his or her true value, but the seller doesn’t get this price, the seller only receives the second highest price. It would seem more logical for the profit-maximizing seller to use a first-price auction framework whereby the winner pays whatever his or her bid was.

However, such a conclusion would ignore the fact that the buyer’s bid would then not only be a factor as to whether the buyer obtains the good but also determine the price he or she pays. In the second price framework proposed by Vickery, the price was an exogenous factor not determined by the bidder, so the buyer only had to worry about obtaining the good. Therefore, the rational bidder would bid his or her true value. If that bidder valued it the highest, he or she would obtain the auctioned item and wouldn’t have to worry about overpaying. If the bidder did not get the item, then he or she would intuit that another bidder just valued it more. Under a first price auction scheme, the dominant strategy to bid your true value does not exist. If it were, there would be no chance of consumer surplus as seen in second price auctions. As discussed in our text and reiterated in the article, those competing in a first price auction will “shade” their bid to something lower than their true value. The level of shading depends on the number of other buyers in the auction with bids approaching true values as the number of bidders increases. Though the level of shading varies in different contexts, Vickery found that (contrary to initial beliefs) sellers generate the same revenue in a first-price sealed bid auction as well as a second price auction. At first, such a conclusion may seem surprising, but once considering bidder behavior, it becomes clear. Under the second price auction, buyers have no incentive to hide their true values, so even though the seller won’t receive the highest bidder’s value, he or she will receive that of the second highest bidder. Under the first-price scheme, buyers have an incentive to hide their true values, so the seller will receive a discounted amount of the highest value which is exactly equal to the revenue obtained under the second-price scheme.

The most interesting result is that under certain circumstances that all auctions generate the same seller revenue as long as the item goes to the bidder who values it the most and there is no fee or reward for entering in the auction. These two caveats to Vickery’s auction theory are what keep research into auctions alive. In non-theoretical auctions, we observe these caveats in action. For example, in Ebay or in auctions for airwave frequencies, the buyers have to pay a fee to participate. The initial fee may extort behavior and may eliminate some of the high-value buyers thus altering seller revenue. Furthermore, the Vickery auction scheme does not consider redistribution in auctions. Perhaps the seller provides a bid matching scheme for some bidders rather than others which would greatly affect seller revenue. All in all, Vickery introduced an auction framework that was molded to form the basis of auctions we see used by many companies such as Ebay and Google. The 1996 Nobel prize winner has definitely contributed to a profitable form of e-commerce as well has added an interesting component to network analysis and game theoretic motives of profit maximization.

http://www.beyonddiscovery.org/content/view.page.asp?I=3685

Posted in Topics: social studies

No Comments

Beanie Babies: Victims of an Information Cascade

Sunday, March 30th, 2008 9:34 pm

Contributed by: cloud7

During the height of the Beanie Baby craze in the late 90s, throngs of buyers scooped up the little collectibles, hoping that their purchases might lead to future riches. The manufacturer’s plan to introduce a multitude of new characters to the Beanie Baby lineup led people to feel compelled to purchase as many as they could in the hopes that at least one of them might become rare and expensive, just as a Wall Street investor may buy shares of several promising IPOs. Unfortunately, interest in the toys ultimately waned, and so did the prices that they fetched at auction, leaving collectors with little plush animals that did little else than take up space.

Indeed, these hopeful collectors were victims of an information cascade. As we learned in class, if two people choose to do something because they think it is a good idea (i.e. they observed a high signal), a third, rational person will also choose to do it, regardless or whether he observes a high or low signal. In the case of Beanie Babies, many people bought them because they observed others doing so and assumed that it had to be a good idea, even if they personally did not see the appeal. Thus, it would have only taken a relatively small number of people who incorrectly assumed that collecting Beanie Babies was a good idea to start the cascade, eventually compelling scores of other people to buy them, as well.

http://seattletimes.nwsource.com/html/businesstechnology/2002020751_beaniebabybubble31.html

Posted in Topics: Education

No Comments

Customer Recommendations as Sources of Information

Sunday, March 30th, 2008 5:05 pm

Contributed by: andykupe

The article “Amazon Rival Adopts Customer Recommendation Scheme” describes the recent actions of The Nile, an online bookstore similar to Amazon.com. The site is targeted toward people in Australia and Asia. The Nile will use software from Avail Intelligence to create customer recommendations, similar to the feature on Amazon. With this “word-of-mouth” technology, users of The Nile will be able to find new books that may interest them. The new technology will also allow for recommendations to be made in real time, ensuring that customers will be referred to works that are relevant and up-to-date. Preliminary tests conducted by The Nile demonstrate that customer recommendations are correlated with increased sales.

The customer recommendation technology used by The Nile takes advantage of some of the ideas from information cascades. Recommendations are made by people who know about a certain product; this information has the potential to be very helpful for those who do not know much about the product. Thus, customers use the information from recommendations when making their decision about whether or not they should buy a product—the information from other people’s choices is instrumental for users who are considering buying a product. A recommendation validates the decision to buy a product, which may explain the increased sales that The Nile experienced in its preliminary tests; customers could be more confident that the product was reliable and worthwhile if it was recommended by someone else. Recommendations reflect information cascades in that customers use information from other people’s choices to make their own decisions.

There are further implications from customer recommendations. Some customers may be seen as more trustworthy since they may recommend many products that others like. These trustworthy customers may acquire more influence; people may rely on recommendations from the trustworthy customers more than they do for other recommendations.

Posted in Topics: Education

No Comments

PageRank in different contexts

Networks… in the comics?

Visualizing Wikipedia and the Art of Drawing Large Graphs

Sequential Voting as an Information Cascade

Circular Mills and Information Cascades

The “trickle down effect” and information cascades

The Reverse Dutch Auction: Overbooked Airlines and LULUs

A Nod to William Vickery and Second Price Auctions

Beanie Babies: Victims of an Information Cascade

Customer Recommendations as Sources of Information

Information

Categories

Previous Posts