This is a supplemental blog for a course which will cover how the social, technological, and natural worlds are connected, and how the study of networks sheds light on these connections.


Evidence for game theory

http://plato.stanford.edu/entries/game-theory/#Behav

Although this article is long, I will focus on the section titled "Game Theory and Behavioral Evidence." One of the major criticisms of game theory is that it does not make specific enough predictions and that it does not adequately predict behavior. The assumption that players are perfectly rational is argued by some to be unrealistic; while it works well for the mathematical model of game theory, it is not based on behavioral evidence. However, it has been shown empirically that game theory can predict the behavior of lower-order animals with great accuracy. Why the discrepancy between lower-order animals and higher-order ones, including humans? It is believed that since lower-order animals lack the ability to learn and create societies, traditional game theory models them very well. In light of this, variations on game theory such as evolutionary game theory have been developed. Evolutionary game theory allows players to develop preferences as they play the game repeatedly, as opposed to traditional game theory, in which the game is played once and each play is considered independently of the others. Unfortunately, even evolutionary game theory has a problem: humans tend to change preferences, sometimes arbitrarily, and when this happens the model breaks down.
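
To make the idea of evolving preferences concrete, here is a minimal sketch of replicator dynamics, one standard formulation of evolutionary game theory. It is not from the article, and the payoff numbers are hypothetical; the population share of each strategy grows in proportion to how well that strategy is currently doing:

    # Replicator dynamics for a two-strategy symmetric game.
    # The payoff matrix is hypothetical: payoff[i][j] is the payoff to a
    # player using strategy i against a player using strategy j.
    payoff = [[3.0, 0.0],   # strategy 0 ("cooperate")
              [5.0, 1.0]]   # strategy 1 ("defect")

    x = 0.9  # initial fraction of the population playing strategy 0

    for step in range(50):
        # Average payoff ("fitness") of each strategy against the current mix.
        f0 = x * payoff[0][0] + (1 - x) * payoff[0][1]
        f1 = x * payoff[1][0] + (1 - x) * payoff[1][1]
        avg = x * f0 + (1 - x) * f1
        # Strategies that outperform the average grow in share.
        x = x * f0 / avg
        if step % 10 == 0:
            print(f"step {step:2d}: fraction cooperating = {x:.3f}")

For these prisoner's-dilemma-like payoffs the cooperating share steadily shrinks; the point is that preferences emerge from repeated play rather than being fixed in advance.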

In researching game theory, I came across various people who believe that game theory is dead. They believe that game theory's vague predictions and assumptions are fatally flawed and that it should therefore be replaced with other theories. Clearly there is a place for game theory in this world, as it models lower-order animal behavior well. How difficult would it be to extend game theory to include higher-order animals? During our lectures on game theory, we were told that there is no reason why a player's goal cannot include outcomes for other players. Is it possible to add some irrationality to a player? Game theory is a model, so we must take the simplicity of the model into account when evaluating it. If another model makes predictions as good as game theory's but is simpler, then that model is better. The question then becomes: will changing game theory to more adequately predict human behavior destroy its appeal as a relatively simple model? Are there models that more accurately predict human behavior, and are they simple enough to be useful?

Posted in Topics: Education

No Comments

Connectedness of the Blogosphere

Discover magazine recently had an interesting article on the connectedness of the blogosphere. Since the web is so dynamic, with new blogs being created almost constantly (about one every two seconds) and outdated blogs being deleted or disconnected from the connected component of the blogosphere at a much slower rate, it is hard to model the exact state of the blogosphere at any one point in time. As a result, social network expert Matthew Hurst gathered link data between blogs for six weeks and published a plot of the connectedness of the blogosphere. This graph produces some interesting results:

1. Certain websites have significant prestige (similar to a high hub score in the HITS algorithm; a minimal sketch of HITS appears after this list). The blog DailyKos has half a million visitors a day, and as a result any link from DailyKos causes the target of that link to increase in popularity dramatically. This prestige builds on itself, as many other blogs will in turn link to DailyKos, which appears as the huge white dot in the graph.

2. Different online communities are loosely separated from the rest of the blogosphere. The popular LiveJournal blogging community is large and well connected within itself, but there are relatively few links between this mini-community and the most well-connected part of the blogosphere. Similarly, a sizable mini-community exists for blogs related to pornography, whether news and information or actual pornography itself. There is also another outside group of bloggers: sports blogs. These blogs are likewise connected within themselves, but have the added property of linking in toward the large component of the blogosphere frequently. Again, there are fewer links to these mini-communities, so they seem somewhat separated from the rest of the blogosphere. What is interesting, however, is that technology and political blogs seem to be very well connected to the rest of the blogosphere. The white dot marked by the number 2 represents Boing Boing, a very popular technology blog that links to many other tech blogs and sits in the center of the most connected portion of the blogosphere. Similarly, the blog of Michelle Malkin (a popular syndicated columnist) is in the center of all the political blogs, which are themselves in the well-connected portion of the blogosphere.
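
For the curious, here is a minimal sketch of the HITS hub/authority iteration mentioned in point 1, run on a tiny hypothetical link graph (not Hurst's data):

    # HITS iteration on a toy link graph: links[x] is the list of blogs x links to.
    links = {
        "dailykos": ["blogA", "blogB", "blogC"],
        "blogA":    ["dailykos"],
        "blogB":    ["dailykos", "blogC"],
        "blogC":    [],
    }
    nodes = list(links)
    hub = {n: 1.0 for n in nodes}   # start with uniform scores
    auth = {n: 1.0 for n in nodes}

    for _ in range(20):
        # Authority score: sum of hub scores of blogs linking in.
        auth = {n: sum(hub[m] for m in nodes if n in links[m]) for n in nodes}
        # Hub score: sum of authority scores of blogs linked to.
        hub = {n: sum(auth[m] for m in links[n]) for n in nodes}
        # Normalize so the scores don't blow up.
        a_norm = sum(v * v for v in auth.values()) ** 0.5 or 1.0
        h_norm = sum(v * v for v in hub.values()) ** 0.5 or 1.0
        auth = {n: v / a_norm for n, v in auth.items()}
        hub = {n: v / h_norm for n, v in hub.items()}

    print(sorted(auth.items(), key=lambda kv: -kv[1]))

Blogs linked to by many good hubs end up with high authority scores, and blogs that link to many good authorities end up with high hub scores.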

These results seem to indicate that the type of blog is a significant factor in the importance or connectedness of a blog in the blogosphere as a whole. The fact that politics and technology dominate the blogosphere is probably a result of the demographic of Internet users sophisticated enough to enter it: those up to date with technology issues and those with strong opinions on current issues. The results from this study largely confirm what one would already expect of the state of the web.

Posted in Topics: Technology, social studies

No Comments

The Small World Web

In the paper The Small World Web by Lada A. Adamic, the author attempts to demonstrate that the largest strongly connected component of the link graph representing the internet satisfies the small-world phenomenon. That is, she shows that sites in the largest component are separated by an average of 4.2 hops, which is roughly the same as for a random graph of the same size, while the clustering coefficient is 0.081, which is larger than the clustering coefficient of a random graph of the same size. For those unfamiliar, the clustering coefficient is defined as follows: at each vertex v, if v has k_v neighbors, then there can be at most k_v*(k_v-1) directed edges between them. If C_v is the fraction of these edges actually present, then the clustering coefficient is just the average of C_v over all vertices. According to the original paper of Watts and Strogatz, a graph satisfies the small-world phenomenon if the average number of hops between sites is approximately equal to that of a random graph of the same size while the clustering coefficient is much larger than that of a random graph.
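
As a sanity check on this definition, here is a short sketch that computes the clustering coefficient exactly as defined above, on a made-up three-site directed graph:

    # Directed clustering coefficient, per the definition above: for each
    # vertex v with k_v neighbors, C_v is the fraction of the k_v*(k_v-1)
    # possible directed edges among those neighbors that actually exist.
    edges = {("a", "b"), ("b", "a"), ("a", "c"), ("c", "a"), ("b", "c")}
    nodes = {u for e in edges for u in e}

    def neighbors(v):
        # Treat anyone v links to or is linked from as a neighbor of v.
        return {u for u in nodes if (v, u) in edges or (u, v) in edges}

    c_values = []
    for v in nodes:
        nbrs = neighbors(v)
        k = len(nbrs)
        if k < 2:
            continue  # C_v is undefined with fewer than two neighbors
        present = sum(1 for x in nbrs for y in nbrs
                      if x != y and (x, y) in edges)
        c_values.append(present / (k * (k - 1)))

    clustering = sum(c_values) / len(c_values)
    print(f"clustering coefficient ~ {clustering:.3f}")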

It's important to note that the author only considers the web at the site level as opposed to the page level. This means that two web pages hosted on the same site are treated as one node, which is a valid simplification since pages within a site are usually strongly connected. In addition to showing that the web is a small-world graph at the site level, the author goes on to propose a new search algorithm based on hubs/authorities on small-world graphs. The idea is as follows: first, query the search words and gather a list of sites containing pages relevant to those words. Next, find the strongly connected components of the returned sites, and calculate the average number of hops across each component to estimate which site is the center of that component (i.e., the site whose distance to any other is roughly the average number of hops). When this is done, we can build spanning trees rooted at the centers of the components and calculate the clustering coefficient of each. The search engine then returns the centers to the user, ranked, along with relevant data such as component size, clustering coefficients, average number of hops, and spanning trees. (A sketch of the center-finding step appears below.)
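
Here is a minimal sketch of that center-finding step, under the assumption that the center is the site whose average distance to the other sites in its component is smallest; the adjacency list is hypothetical:

    from collections import deque

    # One strongly connected component: each site maps to the sites it links to.
    component = {
        "hub.example": ["a.example", "b.example"],
        "a.example":   ["hub.example", "b.example"],
        "b.example":   ["hub.example"],
    }

    def avg_hops(src):
        # Breadth-first search gives shortest hop counts from src.
        dist = {src: 0}
        queue = deque([src])
        while queue:
            v = queue.popleft()
            for w in component[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        others = [d for s, d in dist.items() if s != src]
        return sum(others) / len(others)

    center = min(component, key=avg_hops)
    print("estimated center:", center, "avg hops:", avg_hops(center))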

This type of search has the benefit that if the center returned to you is not very useful, chances are a useful page is only a few clicks away, since the components are small-world graphs. The proposed search builds on the hubs/authorities idea, as the centers returned will most likely be hubs, while the pages they link to will be authorities. The author goes on to make some arguments for why the search will return good results, but I do not believe it is very practical. In most searches these days, it is very probable that Wikipedia would be included in a fairly large component, since it links to many other sites which probably have paths back to it, so you may encounter very large components. Calculating the components, the necessary quantities, and the spanning trees would then take a fair bit of computing power and probably couldn't be done quickly enough. Also, with Wikipedia as popular as it is, Wikipedia would probably be the center of a very large component and thus be returned first, which is no more helpful than just querying Wikipedia by itself.

Another issue with the paper is that it was written around the year 2000, and the web has evolved a great deal since then. It may no longer exhibit such clean small-world properties, and with Wikipedia dominating, I can't imagine this search would be all too effective.

Posted in Topics: Education

No Comments

Information Diffusion in Blogs

The following article takes a look at how information diffuses throughout blogs: http://delivery.acm.org/10.1145/990000/988739/p491-gruhl.pdf?key1=988739&key2=5402837711&coll=GUIDE&dl=GUIDE&CFID=20792120&CFTOKEN=30091602

There were two levels of analysis involved in this study. The first was a topical approach: seeing how various world events, and even local events, affected the spread of topics throughout blogs. The researchers called this the "macroscopic" approach.

The second approach considered individuals inside the blogs and how they spread certain topics. This approach was called the “microscopic” approach.

One thing to note is that in class our discussion centered on people who come into physical contact with each other. The internet, however, greatly loosens the bottlenecks placed on the spread of topics and information. Today, many millions of people have weblogs, including members of various media outlets (such as CNN and Fox News).

Under the macroscopic analysis, the researchers found that most topics on the web can be separated into three categories: just a spike (low activity to high activity, back to low activity), spiky chatter (multiple spikes throughout time; highly sensitive spikes), and mostly chatter (moderate activity but for a much longer time).

Spiky chatter would most likely describe topics that repeatedly impact a great number of people. This could be something like iPods: every time Apple makes an innovation in its iPod line, many bloggers post about their hopes for the product and how much they look forward to having one. Just a spike could be something like the Super Bowl (unless it's a particularly good one). There will be many posts from analysts and fans as to who they think will win; however, after the event is over, not many posts will be made.

Mostly chatter could be something that is an everyday occurrence, such as class sizes in schools.

Individuals were modeled differently from topics: the researchers modeled the spread of information through individuals as an infectious disease passing through a society. This is a bit different from the way we modeled diffusion in networks in class. In class, diffusion stopped if a threshold number of neighbors accepting the information was not met; in a web-based society, information can essentially only be stopped if someone lacks the means to access it.
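
As an illustration of the infectious-disease view (a generic transmission sketch, not necessarily the exact model the researchers fit), each blogger who has picked up a topic passes it to each neighboring reader independently with some fixed probability:

    import random

    # Hypothetical blog network and transmission probability.
    random.seed(1)
    neighbors = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
    p_transmit = 0.5

    infected = {0}          # the blogger who first posts the topic
    frontier = [0]
    while frontier:
        new_frontier = []
        for blogger in frontier:
            for reader in neighbors[blogger]:
                if reader not in infected and random.random() < p_transmit:
                    infected.add(reader)      # the reader reposts the topic
                    new_frontier.append(reader)
        frontier = new_frontier

    print(f"{len(infected)} of {len(neighbors)} blogs picked up the topic")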

Although this is an interesting article, studying the World Wide Web is difficult. The researchers do a good job, but the internet is such a large domain that an accurate picture of how information diffuses through blogspace warrants further study.

Posted in Topics: Education

View Comment (1) »

The Behavior of Global Information Cascades on Random Networks

http://www.pnas.org/cgi/reprint/99/9/5766.pdf

Duncan Watts' 2002 paper "A simple model of global cascades on random networks" analyzes the conditions under which global information cascades occur, using a simple model with varying parameters. A global information cascade is a cascade that influences most of the nodes in a large network, initially triggered by a small number of nodes starting in a novel "state." Similarly to one model we studied in class, Watts' model consists of individuals in a network where each person observes the states of his neighbors and decides to enter state X if a certain threshold fraction of his neighbors are in state X. Watts explored the global effect of using different thresholds for each individual, drawn from a particular distribution. He also used distributions to determine the number of neighbors each individual has, with the parameter z denoting the average number of neighbors.

Watts analyzed cascades mathematically and derived a cascade condition: an equation that predicts when a cascade can occur in this simple model. He constructed a graph relating the threshold of each node, the average degree (number of neighbors) of the nodes, and whether this combination had the capacity to form a global cascade. Cascades tended to form when each individual has a smaller number of neighbors and when thresholds are low. His mathematical findings were supported by experimental data from simulated networks with the same parameters. He found that when the network is not very well connected, the capacity for a global cascade is limited by the overall connectivity, and well-connected nodes are critical for global cascades to occur; when the network is well connected, the capacity for a global cascade is limited by the "stability" (threshold) of individual nodes. It is also interesting that this type of network seems very stable under various shocks, yet cascades dramatically and quickly when the proper conditions are met. Another parameter that affects the likelihood of global cascades is the heterogeneity of thresholds and node degrees: Watts found that when individual thresholds were diverse, global cascades were more likely to occur, but when the vertex degrees of nodes were diverse, cascades were less likely.

I found many of these findings quite interesting and non-intuitive, and very applicable to the study of network theory. Large information cascades clearly have a huge effect on the decisions that people make every day, and their mechanism is not well understood. Watts provides some basic analysis of the conditions under which global cascades are more likely to occur, and even though his findings are based on a simple model, they can definitely be insightful when analyzing and predicting real-world information cascades.
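
To get a feel for the model, here is a small simulation in its spirit. Watts draws both thresholds and degrees from distributions; for simplicity this sketch uses a single uniform threshold and an Erdos-Renyi random graph, and all parameter values are my own choices:

    import random

    # Nodes adopt state X once the fraction of their neighbors in state X
    # reaches their threshold. Parameters here are illustrative only.
    random.seed(2)
    n, z, threshold = 1000, 4.0, 0.18

    # Erdos-Renyi random graph with average degree z.
    p = z / (n - 1)
    nbrs = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if random.random() < p:
                nbrs[i].add(j)
                nbrs[j].add(i)

    active = {random.randrange(n)}  # a single initial adopter
    changed = True
    while changed:
        changed = False
        for i in range(n):
            if i in active or not nbrs[i]:
                continue
            frac = len(nbrs[i] & active) / len(nbrs[i])
            if frac >= threshold:
                active.add(i)
                changed = True

    print(f"cascade reached {len(active)} of {n} nodes")

Rerunning with a higher threshold or a much higher z tends to leave the cascade confined to a handful of nodes, which is the qualitative behavior the paper analyzes.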

Posted in Topics: Education

No Comments

Tipping into Cyclical and Chaotic Phenomena

We recently focused on simple tipping phenomena. Typically, these models involve a strictly increasing function, where the number of people attending rises with the number of people expected to attend. However, what if, at some point, additional forces - congestion costs, for example - cause this function to begin to decrease?

(Much of the mathematics behind this is based upon my own research in high school, and I can provide my old paper if one wishes.  I have made several programs exploring this sort of dynamic in a mathematical context.  This is a new application of that research.)

The new situation can be modeled as follows:

The first time some recurrent event happens, some initial number of people attend.

Suppose x people attend the event. With only this information about previous attendance, f(x) people would want to attend the event the next time it happens. The simplest interpretation is that x people attended last time, and everyone uses only that information to decide whether to attend this time. f(x) is then in fact a smooth function of x. (However, f(x) can end up needing to take outside considerations into account. A variant could let f(x) depend on multiple previous values of x - which could model decisions that depend on whether the trend is increasing, for example - but this becomes much harder to analyze using the tools I will briefly introduce.)

The simplest model would probably be a parabola that is zero at x = 0 and again at x = 1, and peaks halfway. So let f(x) = ax(1-x), where a is some parameter.  The function peaks at f(0.5) = 0.25a.  The parameter a represents a sort of popularity - if things are just right between crowded and sparse, how many people would maximally show up?

Here’s a diagram of what sort of thing happens, for a = 2:

Web Diagram example

Above, we picked a random initial value x_0 and then iterated as shown. Graphically, drawing vertically to the function evaluates it at our current x, and drawing horizontally to the diagonal updates the x value to the next x.
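
For those who prefer numbers to pictures, here is the numeric counterpart of the web diagram: simply iterating f(x) = ax(1-x) from an arbitrary starting value.

    # Iterate f(x) = a*x*(1-x); for a = 2 the attendance fraction settles
    # at the fixed point x* = 1 - 1/a = 0.5, matching the diagram above.
    a = 2.0
    x = 0.13  # an arbitrary initial attendance fraction
    for step in range(12):
        x = a * x * (1 - x)
        print(f"step {step:2d}: x = {x:.4f}")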

Want to see and explore these for yourself? I would recommend Winfeed, from http://math.exeter.edu/rparris/winfeed.html (click link at the title near the top). Download it, run it, and choose to make a new web diagram (you might have to change the function to the logistic option.)

Now we use lots of different values of a and plot the resulting value(s) once things settle down. The surprisingly interesting result is called the Feigenbaum diagram. (The variable a is labeled r in the following version.)

The Logistic Map
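
A crude text version of the Feigenbaum diagram can be produced by iterating for each value of a, discarding the transients, and counting the distinct long-run values. The cutoffs below (500 warm-up steps, 64 samples, rounding to four decimal places) are arbitrary choices of mine:

    # One distinct long-run value means a fixed point, two means a
    # period-2 orbit, four means period 4, and many means chaos.
    for a in [2.8, 3.2, 3.5, 3.9]:
        x = 0.3
        for _ in range(500):      # discard transient behavior
            x = a * x * (1 - x)
        seen = set()
        for _ in range(64):       # sample the long-run behavior
            x = a * x * (1 - x)
            seen.add(round(x, 4))
        print(f"a = {a}: {len(seen)} distinct long-run value(s)")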

So things don’t always reach stability! Why?

Well, first, analyze the left side. At values of a less than 2, fewer than half the maximum ever want to go, so there is very little interesting behavior: if there are too many, attendance immediately shrinks to less than half. Then there is no longer crowding, and attendance grows slowly until it reaches equilibrium.

Beyond 2, the equilibrium lies past the halfway mark, where the function peaks. The slope is negative at the equilibrium: if more people come, the next attendance drops below the equilibrium, and the reverse is true (if attendance is a little below, more than the equilibrium come). Values oscillate around the stable value until they converge there: too many come, then too few, then only slightly too many, then only slightly too few, and so on. On the web diagram, the values seem to spiral into the middle.

But what's with the fork? Well, eventually the slope around the point becomes too negative and iteration no longer converges: a small change in x creates an even greater change in the next step, and things diverge away from the equilibrium. But the system then finds an "orbit" of period 2 - if you solve the function taking steps two at a time, you end up with the solutions of those two branches. As you keep increasing a, this happens again, and again, and again... and you get orbits of 2, 4, 8, 16, 32...

And then at one point, things fall apart completely! Chaos reigns in the region beyond about 3.7, with occasional scattered windows of other periods (such as 3, which is the big "window"). The diagram is in fact a fractal: if you zoom in repeatedly, there are little pieces that are almost identical to the overall picture.

I’m just introducing this topic, so I won’t cover the math behind all these different phenomena in detail (I may choose to flesh this out in the short paper). But they all occur with this same simple model!

What does it mean? My interpretation is that as peak popularity and desirability increase (but congestion factors continue to apply), volatility and chaos increase. (As real-world systems usually don't follow a model exactly, it would be unlikely for something to hold the same a value and stay in the period-3 window, for example.)

An actual example of such behavior, I’ve heard anecdotally, is the periodic swing in the population of locusts.

But one might object that this chaos is not actually observed. The capacity usage that describes the NY mass transit system at rush hour probably corresponds to a value of a greater than the chaos point of roughly 3.7. If a subway is about half full, more people might be willing to ride, but when a subway crowds, the model drops riders' willingness much too sharply; the actual behavior is not this chaotic day to day. However, I would argue that there is a time lag here which dampens possible underlying chaos. Not all people re-evaluate their decision to use mass transit on a day-to-day basis, so there can be a much more gradual decrease in ridership. Also, people take the trend of ridership into account, so if it seems that other people on a too-crowded line are riding less, they might be more willing to continue riding and wait it out, which adds additional damping and prevents chaotic overshoot.

Another cautionary example: although Yankees attendance might seem to follow a model which should end up chaotic, the demand for tickets greatly outstrips the available supply, so not all the people who want to attend actually do. The expected number of people attending the next game would then equal the capacity of the stadium, not the number of people who wanted to attend, so there would be an equilibrium stuck at the capacity of the stadium. The proper way to model this is to set the capacity of the stadium to 1, and then let f(x) rise until it hits 1 and stay there.

Posted in Topics: Education, Mathematics, Science, social studies

No Comments

Reputation in online auctions

In class we discussed the characteristics of the market for lemons. In this case there is asymmetric information between the buyer and seller. While presumably the seller knows whether the car is good or not, the buyer cannot tell. This situation is very similar to the online market of buyers and sellers found in online auctions.

Several online auction sites, including eBay, rely on feedback systems to help minimize the risk involved in a transaction between two people who have had no previous interaction with one another. A paper by Daniel Houser and John Wooders provides an interesting analysis of these systems and their economic effects. As in the market for lemons, buyers experience risk due to the uncertainty of the transaction: the seller may not deliver the item, and its quality cannot be guaranteed. Unlike the market for lemons, however, the seller also experiences risk, since the buyer may fail to pay for the good. Houser and Wooders examined eBay users' feedback scores to determine whether the reputations of buyers, sellers, or both have a statistically significant effect on selling price.

For these feedback systems to be effective, positive feedback profiles should allow sellers to obtain higher prices for the items they sell, providing an economic incentive for establishing and maintaining a good reputation. In such systems, a seller's feedback score essentially represents the probability that he will deliver the item as described. This is analogous to the fraction of cars that are good in the market for lemons. As we saw with the market for lemons, this suggests that the buyer's expected value for the item should increase as the seller's feedback score improves, producing the desired reinforcement of sellers' reputations. On the other hand, we know that in second-price sealed-bid auctions each person's equilibrium bid is his expected value for the item, which suggests that bidders' reputations should not have an effect on price.
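
As a hypothetical worked example (the numbers are mine, not Houser and Wooders'): suppose the item is worth $100 to the buyer if delivered as described and $20 otherwise, and the seller's feedback implies a delivery probability p = 0.9. Then

    \[
      \mathbb{E}[\text{value}] = p \cdot 100 + (1-p) \cdot 20
                               = 0.9 \cdot 100 + 0.1 \cdot 20 = 92,
    \]

and since bidding one's expected value is the equilibrium strategy in a second-price sealed-bid auction, a better reputation (higher p) translates directly into higher bids and a higher selling price.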

These predictions were consistent with the results of the Houser and Wooders study. It was found that sellers' reputations, based on the numbers of positive and non-positive comments left, have a statistically significant impact on selling price: as expected, positive comments increase selling price and negative comments decrease it. It was also shown that bidder reputation has little effect on price. Based on the principles of the market for lemons, if the probability that sellers would fail to deliver the item were too high, buyers would assume that all sellers would default on the transaction, or their offer prices would be so low that only bad sellers would use the site. This situation would cause the market to collapse, and eBay would fail. Thus, the feedback system helps provide buyers with a reasonable estimate of the probability of a good transaction and ensures that these probabilities are high enough that eBay's market prospers.

Posted in Topics: Education

No Comments

Yahoo’s Social Network

“Yahoo’s Undercover Social Network”

http://blogs.business2.com/utilitybelt/2007/04/yahoos_undercov.html

In the world of Internet social networks, whose dominant participants are sites such as Myspace and Match.com, a new player has emerged. According to the article "Yahoo's Undercover Social Network," Yahoo Answers is the fastest growing component of Yahoo. In the span of one year, Yahoo Answers has increased its audience from 2.3 million users to 15.7 million users, more than quadrupling its user base - not bad compared to another major player in the network, Facebook.com, which can claim about 19 million users per month. (One point to note is that browsing Facebook.com is limited to members only, while Yahoo Answers is not.)

The article ponders the reasons behind this enormous increase in membership of the Yahoo Answers network. One possible reason is the fact that Google is no longer participating in the Answers business, so the increase in Yahoo's users may be thanks in part to former Google Answers members joining the Yahoo network. However, the more significant reason might be Yahoo Answers' new "invite a friend" feature. With this addition, members of the network can now invite their friends to join, increasing the number of nodes in the network: Yahoo Answers expands with new nodes that join on their own, and expands further with new nodes that existing nodes invite. With the entrance of all these new nodes, connections can be made between existing nodes and new nodes, creating edges and bridges and possibly further increasing the size of the network.

Another important point the article makes note of is the question topics being asked in Answers networks such as Yahoo Answers and Amazon's Askville, a new network that has also joined the Answers business. The discussions being had and the questions being asked in these networks are of a different variety than the topics discussed on Facebook or Myspace. The article points out that these networks are not only expanding their own size; they are expanding the entire 'market' of social networks as well.

Posted in Topics: Education

View Comment (1) »

1-UP’ing the Competition

Network Effects and Competition: An Empirical Analysis of the Home Video Game Industry

An article published a little while back in Wiley InterScience takes a very interesting look into the role network effects and externalities play in the video game industry. How does network size (user base) aid or harm a major competitor in this industry? Moreover, what does it mean to succeed in such a closed market?

This paper proves even more interesting because it was written a few years ago, which in terms of technology is quite a long time; with the advantage of hindsight we can now see how accurately its model and predictions depict the state of the industry. Shankar and Bayus focus primarily on the 16-bit generation of consoles - namely the Nintendo SNES vs. Sega Genesis period - for which much data existed at the time of writing. They take a look at externalities in the competition between two incompatible products. Shankar and Bayus used data on hardware sales, prices, advertising expenditures, and installed customer base for the two companies to predict Nintendo's eventual control of the market, despite Sega's almost two-year lead in release time over Nintendo.

These same conclusions about the different factors affecting the gaming industry can be applied to today's market and its major contributors. If the amounts that release time, initial user network size, price, and so on contribute to the success of one platform over another remain the same over time, can we predict what lies ahead for the now seventh generation of video games? Who among the major hardware contributors - Sony, Microsoft, and Nintendo - will guide the video gaming future?

Posted in Topics: Technology

No Comments

Cheating - Not an Evolutionarily Stable Strategy

Cheaters and Chumps: Game Theorists Offer a Surprising Insight Into the Evolution of Fair Play

Cooperation has long been prevalent in human civilization and society. This article sets up an example in which both players cooperating are rewarded and both players cheating do poorly, while one player cheating against one attempting to cooperate gives the largest possible reward to the cheater and no reward to the cooperator, similar to the prisoner's dilemma. Seeing cheating and cooperating as strategies similar to the hawk/dove example discussed in class, one might assume that cheating is an evolutionarily stable strategy: a population of cheaters may not fare as well as a population of cooperators, but an individual cheater fares better than a cooperator among cheaters (detailed as the "what-a-chump scenario" by the author).
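
A minimal sketch of that reasoning, using made-up payoff numbers with the structure the article describes, checks the standard ESS conditions directly:

    # payoff[(s, t)] is the reward to a player using s against a player using t.
    payoff = {
        ("cooperate", "cooperate"): 3,
        ("cooperate", "cheat"):     0,   # the "what-a-chump" outcome
        ("cheat",     "cooperate"): 5,
        ("cheat",     "cheat"):     1,
    }

    def is_ess(s, strategies):
        # s is an ESS if no rare mutant t can invade: s must do strictly
        # better against s, or tie and do strictly better against t.
        for t in strategies:
            if t == s:
                continue
            if payoff[(t, s)] > payoff[(s, s)]:
                return False
            if payoff[(t, s)] == payoff[(s, s)] and payoff[(t, t)] >= payoff[(s, t)]:
                return False
        return True

    for s in ("cooperate", "cheat"):
        print(s, "is ESS:", is_ess(s, ("cooperate", "cheat")))

With one-shot interactions and these payoffs, cheating is the only evolutionarily stable strategy, which is exactly the puzzle the rest of the article tries to resolve.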

However, altruism and cooperation are prevalent in modern times, and this article proposes an explanation as to why cooperation as a trait is not simply modeled by the hawk/dove example. The author cites studies in which participants will punish cheaters - even at the expense of their own reward - out of revenge or a sense of justice. Assuming that this sense of justice and altruism is a genetic trait, there must be a genetic basis for its being maintained in the population.

So, if random acts of cooperation do not pay off, what is wrong with comparing cooperation to the hawk/dove example? The author cites genetic relatedness, which may breed familial cooperation. This could provide a basic unit from which cooperation could evolve through self-interactions: exclusion of selfish, cheating individuals forces cheaters to compete with each other, while cooperators interact with each other, skewing rewards toward those cooperating. These interactions may not be one-shot deals, and as such, cheaters may be identified and ostracized, further preventing one-sided cheater/cooperator interactions. Repeated interactions with punishments for cheaters both deter cheating and allow for "potential payback." The author notes that society has even fostered relatedness, sometimes unwittingly, to forge cooperation, citing the army's support of a brotherhood-like bond and the way teams form substitutes for familial relationships.

Posted in Topics: Education

No Comments