This is a supplemental blog for a course which will cover how the social, technological, and natural worlds are connected, and how the study of networks sheds light on these connections.


University Rankings as Symmetry Breaking

Let’s suppose for a moment (this is purely hypothetical, of course) that there is a small group of elite institutions of higher education. And suppose that each is of about equal caliber, each attracting roughly the same quality of students. This would present us with a perfectly symmetric system, a veritable boulder sitting atop a hill. However, as admission to these highly selective institutions becomes more sought after, companies begin to see a business opportunity. For instance, the News Corporation, which publishes a weekly periodical, begins to rank these virtually equivalent schools to sell more issues. The only problem is that the quality of a school is far from quantifiable, depending on countless factors, some of which are unique to each student. So the News Corporation decides to pick somewhat arbitrary measures of a school’s quality: admission rates, average standardized test scores of its admits, etc. As chance would have it, the measures that they use, for this particular year, give the number one spot to a school that we’ll call School H, the number two spot to School P, and so on.

Here’s where the unforeseen consequences take place. Since potential applicants to these institutions are clamoring for quantified, easy-to-understand information about their prospective choices, they turn to the rankings to help them make their decision. Assuming that there are no other outside influences, all of the best candidates naturally make School H their first choice, School P their second choice, and so on. As a result, School H gets the strongest pool of applicants, and in turn the lowest admission rate and the highest average score on standardized tests. The impact of this turn of events is twofold. Not only does the ranking propagate itself, since it influences the very statistics it measures, but as more of the best and the brightest flock to School H, the other, otherwise equally good, schools are left with smaller and less qualified applicant pools. Not only does this hurt the other schools’ rankings, it actually hurts the quality of those institutions, since the caliber of the students influences the caliber of the school. Once again, the ranking propagates itself, widening the gap between the schools, and this time actually causing such a disparity to exist.
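This feedback loop is easy to see in a toy simulation (entirely my own invention, with made-up numbers): start two identical schools, and each year let the top-ranked school pull slightly stronger applicants, which improves the very statistic the ranking measures.

```python
# Toy model (my own illustration, not from any real ranking): two initially
# identical schools. Each year the top-ranked school attracts a larger share
# of the strongest applicants, which raises its average admit score -- the
# very statistic the ranking is based on.
quality = {"H": 50.0, "P": 50.0}   # hypothetical "average admit score"

for year in range(10):
    # Rank schools by last year's measured quality (ties broken arbitrarily).
    ranked = sorted(quality, key=quality.get, reverse=True)
    top, other = ranked[0], ranked[1]
    # The best applicants favor the #1 school, nudging its statistics upward
    # and the other school's downward.
    quality[top] += 1.0
    quality[other] -= 1.0

gap = abs(quality["H"] - quality["P"])
print(gap)  # 20.0 -- the initially symmetric schools have drifted apart
```

Any tie-breaking rule at all would do; the point is that once an arbitrary initial ranking exists, the gap only widens.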

Clearly, and unfortunately, a similar situation exists in real life. A few months back, a reporter in Time magazine noted that Harvard is miles ahead of the competition in prestige. And here at Cornell, you can’t go more than a few weeks without bumping into a mention of the US News and World Report rankings. Even though we all know that trying to reduce an entire college to a single number is absurd, prospective applicants seek guidance from the rankings. Further, since these rankings have a self-propagating nature, the problem, left alone, may only get worse.

Posted in Topics: Education

Comments (3) »

Analyzing and Designing Networks

I chose to review a paper by Milo, Itzkovitz, Kashtan, Levitt, Shen-Orr, Ayzenshtat, Sheffer, and Alon titled “Superfamilies of Evolved and Designed Networks.” This paper tackles the problem of how to characterize networks despite vast differences in scale. It accomplishes this by identifying “network motifs” within the structure of various networks. Here a network motif is one of the thirteen connected directed graphs possible on three nodes, or triads. The figure below shows the 13 possible triads.

[Figure: the 13 possible connected triads on three nodes]

Milo et al. analyzed networks present in different bacteria, larger organisms like fruit flies and worms, the world wide web, social networks, and finally models of different languages. These networks fell into four “superfamilies,” with the world wide web and social networks showing similar structure and the other categories of networks showing similarity within themselves. The “superfamilies” were characterized as having different relative mixes of the various triads. For instance, the networks within bacteria had triad #7 more often than would be expected in a random graph. Triad #7 represents a “feed-forward” loop, which is used to react to changes in the environment. Triad #10 was more present in networks describing the internal workings of larger organisms. This triad, as well as triad #9, represents a two-node feedback system that is regulated by a third node. The world wide web and social networks had similar structures, with the “clique” triad (triad #13) most prevalent. As a side note, this family of networks had few triads that violated triadic closure (triad #6). Finally, the networks modeling languages showed that triads #7–#13 were underrepresented compared to a random graph, possibly because words belong to certain categories and meaningful clauses draw words from many categories.
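To make the motif idea concrete, here is a minimal sketch (my own illustration, not the authors’ code) of counting one motif, the feed-forward loop (triad #7: A→B, B→C, plus the direct shortcut A→C), in a small directed graph. The paper’s actual method then compares such counts against randomized versions of the same network, which I omit here.

```python
# Count feed-forward loops in a tiny directed graph given as a set of edges.
from itertools import permutations

edges = {("A", "B"), ("B", "C"), ("A", "C"),  # one feed-forward loop
         ("C", "D")}                           # plus an unrelated edge
nodes = {u for edge in edges for u in edge}

def count_feed_forward(nodes, edges):
    """Count ordered node triples (a, b, c) forming a feed-forward loop."""
    return sum(1 for a, b, c in permutations(nodes, 3)
               if (a, b) in edges and (b, c) in edges and (a, c) in edges)

print(count_feed_forward(nodes, edges))  # 1
```

A full triad census would classify every connected triple into one of the 13 triad types the same way, by pattern-matching its edges.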


Other than providing a useful way to describe the structure of networks, this paper also discusses how the timescale the network must respond on influences the structure of the network. This addresses the question asked in lecture about how response times might affect how clusters form. In bacteria, where the signal processing time must be on the order of minutes, the network is cascade-averse, as passing information over many steps is not as efficient as a direct path. These networks can be characterized as “rate-limited networks” because the response rate of the individual components is on the order of the desired response time of the entire network. In contrast, when the network describes a higher-level organism, such as the synaptic connections between neurons, the speed at which the individual components work is much greater than the response time of the overall network. We therefore see more feedback loops and cascade elements. The languages and social structures are more networks of association, independent of time, and therefore have the more heavily linked structures. The paper suggests using higher-order subgraph profiles, but I think the triad description is illuminating by itself. This technique of analyzing a network might also be used in reverse to design a large network: instead of generating a graph with random connections, you might bias the connections so that certain triads form. This might lead to a network that behaves more like a rate-limited network or the other sorts discussed.

If you are interested in how to analyze the structure of a large network for your class paper, this short but fascinating paper might be a good resource for you.

Posted in Topics: Education, General, Mathematics, Science

View Comment (1) »

Is Global Warming Just a Fad?

So I should start with a disclaimer, since it’s not very popular right now to defy the idea that global warming is ruining our planet. DISCLAIMER: I am a strong supporter of protecting the environment. I think finding an alternative source of energy should be one of our nation’s top priorities. And regardless of my views on global warming, I don’t think pumping pollutants into our atmosphere is a good thing.

With that said, I would like to point out some interesting things. I performed 4 searches on Google News’ archives, each one simply containing the words ‘global warming’, not in quotes. I set the date ranges to 06/01/2006-Present (picked because that’s two months before An Inconvenient Truth came out), 06/01/2005-06/01/2006, 06/01/2004-06/01/2005, and 06/01/2003-06/01/2004. The numbers of results were 95,000 articles, 56,000 articles, 35,000 articles, and 20,500 articles, respectively. Note that the ranges do not overlap, so each figure is the number of new articles from that year. Now obviously, not too much can be inferred from such a basic search, since one could argue that blogs have become more popular in the last few years and are contributing to the increase in hits, among other things, but we can draw a couple of conclusions.

The question that needs to be asked: Have we really affected things so much in the last 4 years that the number of news stories regarding global warming has nearly quintupled? Or have the media latched onto a hot idea and made it into an epidemic? It all feels like a classic information cascade. A few people were talking about it, then a celebrity made a movie about it, and all of a sudden it became cool to promote environmental issues, and then it exploded! The number of news stories regarding global warming nearly doubled in one year! Now there’s always the argument that people have just now found out about this problem, and that it was a huge problem before but no one knew about it. I find that hard to believe, since I’ve known what global warming was since 2nd grade, and since searches over each of the three years from 2000 to 2003 return about 20,000 results apiece.

We have to be wary sometimes of information cascades and how they can affect our perceptions without us really knowing. I know this is all a relatively weak argument, but it deserves some attention. Here are some articles that argue the point better than I can:

An experiment that hints we are wrong on climate change

Inconvenient Truths

Why global warming fears are overblown

Posted in Topics: General, Science

No Comments

Symmetry Breaking in Tragic Violence

Slight changes to a balanced system can have a dramatic effect. The shooting at Virginia Tech is on all of our minds. People want answers to understand why this tragedy happened. While the world will most likely never know the true reasons Cho Seung-Hui lost control and killed over 30 innocent people, we can expect speculation. Cho’s personality and behavior worried his classmates and his teachers. He was a dark, violence-obsessed loner whose writing hinted at the terrifying world inside his mind. His teachers recommended that he see a psychologist, but he never did. I recommend reading The New York Times article “Experts Shy Away From Instant Diagnoses of Gunman’s Mental Illness, but Hints Abound,” which can be found at http://www.nytimes.com/2007/04/20/us/20psych.html?ref=us. Professionals are interested to find out what made Cho lose control. While many of his behaviors were suspicious and can now be pointed out as warning signs, they were “not all that uncommon,” according to Dr. Hare.

This situation reminds me of the example we discussed in class about “symmetry breaking” where a ball is balanced on the top of a hill.  The slightest fluctuation could have a dramatic effect, sending the ball to the bottom of the hill on any side.  Like the ball at the top of the hill, it seems that Cho was balanced precariously in his life.  Though his teachers were worried about his safety and the violent nature of his writing, no one would have predicted his actions.  That is because something happened, possibly the slightest fluctuation, and he lost balance. 

It is possible that if more had been known about Cho’s issues, this tragedy could have been prevented; however, I am not sure whether it is important to know what change sparked his actions. My main concern is that mental health awareness should be brought to the forefront. Each person is like the ball at the top of the hill, able to go in any direction. With the help of professionals, everyone has the chance to live a happier life. I hope we can learn better ways to reach out to people like Cho. One thing we should all try to do is deemphasize the stigma of going to a “shrink.” There are thousands of people, in college and otherwise, struggling to make it through each day, and getting help can prevent their lives from tipping in the wrong direction.

Posted in Topics: Education

No Comments

Chatting Not Enough? Now You Can Duke It Out Through AimFight!

AimFight.com

AimFight.com/whatisaimfight

AimFight is a new AOL website created as an extension of AOL Instant Messenger (AIM). Each AIM user has a “Buddy List,” in which all of the user’s online friends are listed. From this one can construct a time-dependent social network in which users are nodes, with a directed edge from one user to another whenever the first has the second on his or her Buddy List. The AimFight website itself is a flash application which allows the user to enter two AIM screen names (a user’s online identification) for participation in an AimFight. The more “popular” user wins the AimFight.

The actual details behind the networking algorithm used to determine popularity haven’t been released, but the basics are available.

    The algorithm works sort of like a stripped-down version of the PageRank method Google uses to rank websites. Only “buddies” who have you on their buddy list, i.e. who link to you, count toward your popularity score. However, buddies with more in-links themselves count more than those with fewer in-links. This is analogous to a website being linked to by a site with a high hub score. In the AimFight algorithm, though, there is no authority score and no updating over subsequent rounds. AimFight also only counts buddies out to three hops for the popularity score. While this is not as complex an algorithm as PageRank, AimFight still does effectively the same thing in ranking the “popularity” of AIM users.
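Since AOL hasn’t published the real algorithm, the following is only a guessed reconstruction of the basic idea described above: score a user by everyone who can reach them within three hops of “has you on their buddy list” links, with nearer endorsers weighted more heavily. The names and the 1/hop weighting are my own invention.

```python
# Hypothetical AimFight-style popularity score over a tiny buddy network.
from collections import defaultdict

# buddy_list[u] = the set of users that u has on their buddy list,
# so an entry u -> v is an endorsement of v.
buddy_list = {
    "alice": {"bob"},
    "bob":   {"carol"},
    "dave":  {"carol"},
    "carol": set(),
}

def popularity(target, buddy_list, max_hops=3):
    # Reverse the graph: endorsers[v] = users who list v.
    endorsers = defaultdict(set)
    for u, buddies in buddy_list.items():
        for v in buddies:
            endorsers[v].add(u)
    score, frontier, seen = 0.0, {target}, {target}
    for hop in range(1, max_hops + 1):
        frontier = {u for v in frontier for u in endorsers[v]} - seen
        seen |= frontier
        score += len(frontier) / hop   # closer endorsers count more
    return score

print(popularity("carol", buddy_list))  # 2.5: bob and dave at hop 1, alice at hop 2
```

Unlike real PageRank, there is no iteration to a fixed point here, matching the description above of a single pass with no updating.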

Posted in Topics: Education

No Comments

“SUPA-STAHH!” Is it talent via the power law that brings musical success, or just a tone-deaf following?

In his paper, Superstardom in Popular Music: Empirical Evidence, William A. Hamlen, Jr. discusses the two competing views of the popular music market. (If that link doesn’t work, try here. You might need to access it through the Cornell library system.) I have come across both views as well in my perusal of scholarly articles on the subject. One view is that people don’t really know anything about actual music or singing quality and simply follow everyone else in their preference of songs, as in an uninformative information cascade. The other view is referred to as the “Superstar Phenomenon” (attributed to Marshall 1947 and Rosen 1981), in which a small difference in vocal talent leads to a disproportionately large level of popularity. These people then earn tons of money and take over the industry.

This second view relates to the power law, in which these superstars are the unexpectedly hugely popular part of the population. Marshall and Rosen, among others, propose that a slight increase in talent can coincide with a huge increase in popularity, thus obeying the power law.
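A quick back-of-the-envelope illustration of the Marshall and Rosen idea (the exponent below is arbitrary, chosen purely for demonstration, not taken from either paper): if the market rewards talent convexly, a small talent edge produces an outsized popularity gap.

```python
# Hypothetical convex reward for talent: popularity ~ talent ** alpha.
alpha = 10  # made-up convexity of the market's reward for talent

def reward(talent):
    return talent ** alpha

ratio = reward(1.05) / reward(1.00)  # a 5% talent edge...
print(round(ratio, 2))               # ...yields about 1.63x the popularity
```

The larger the exponent, the more the tiny differences at the top of the talent distribution get magnified into superstar-sized gaps in earnings.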

According to Hamlen, this superstar view is the one that is widely accepted by the popular press and economic literature, but without any evidence. He uses a log-linear demand equation with variables that represent various attributes of the singer and the demand. For example, he attempts to quantify voice quality with external objective ranking systems. (In one case they examined one word sung by many singers – the word, obviously, was “love.”) Later on they compared their voice quality rankings between men and women and different types of singers to be sure that the coefficient did not vary between regimes. Other variables that went into the equation included the singer’s sex and previous career length.

Hamlen found that career longevity was the most influential factor in record sales, followed by the advantage of being a female singer. The fact that a singer who has already sold plenty of albums typically shows higher record sales relates to the “rich get richer” scheme from lecture. Singing talent also showed a significant correlation with a singer’s success, with R² = 0.79. This high value shows that the first theory – that pop music listeners cannot discern talent – is not quite right. However, the correlation isn’t close enough to 1 to say that the “Superstar Phenomenon” is completely accurate either.

So in the end Hamlen did not find empirical evidence for this second theory. He did show, however, that the correlation between record sales and previous success is much higher than that between sales and actual quality. So it appears that there is a smaller-scale version of superstardom, and that we music listeners deserve some credit for our taste.

The less experimental evidence that I personally can point to is the comparison of the success of Britney Spears and Christina Aguilera. I don’t know too much about this, but from my perspective it looks like Britney (disregarding her current behavior) ended up more popular and with more records sold, while Christina was obviously far more talented. Christina was genuinely talented, while Britney was just a long-lasting face that marketed well. So it looks like talent helped Christina, but not as much as a strong career length helped Britney.

This model only scratches the surface of the numerous studies on music popularity. Many discuss network effects applied to a song’s or a singer’s emergence and popularity. In lecture we talked about the Salganik, Dodds, and Watts study, conducted through a music downloading site, which showed that popularity levels are unpredictable but which agrees with Hamlen’s case that previously demonstrated popularity helps current popularity, and that talent isn’t completely disregarded by the music-listening and -buying audience. So Hamlen found that we can’t always blame everything in pop culture on blind herding and information cascades.

And it certainly is reassuring that Britney’s success might not have happened if we were to re-run history. But I like to think that Christina would still be going strong.

Posted in Topics: Education

No Comments

Dell gives customers what they want: Windows XP, not Vista

http://www.canada.com/topics/technology/news/gizmos/story.html?id=fbc529a0-f717-4819-885b-14799565c191&k=28728

Dell Inc. declared on April 19, 2007, that it will once again offer Windows XP to its customers when they purchase certain new machines. When Windows Vista first launched in January of this year, Dell stopped offering Windows XP on most home desktops and laptops. By the end of March, the company only offered XP on two models aimed at home users. However, in light of recent posts on Dell’s IdeaStorm Web site, Dell has decided to offer XP again as an option for four models of its Inspiron notebook and two models of its Dimension desktop PCs. A plea entitled “Don’t eliminate XP just yet” racked up more than 10,700 votes on the IdeaStorm page, and was the main reason for Dell’s change of heart. Michael Gartenberg, vice president and research director of JupiterResearch, said many consumers continue to buy XP because it’s familiar, it works with their existing hardware and programs, and is overall “good enough,” even though Vista boasts a prettier user interface and stronger security.

This is a perfect example of diffusion in networks. In recent lectures, we have learned that decisions by people to adopt new technologies are heavily influenced by the people to whom they are directly connected in their social network. In this article, the new technology, Windows Vista, is having a hard time spreading because of the existing technology, Windows XP. Prior to January, the large majority of PC owners used Windows XP as their operating system. When Vista hit the market for the new year, a small number of early adopters decided to upgrade their computers to Vista. Vista is clearly the more powerful technology, but the small number of computers utilizing it is certainly a deterrent for many people. Consumers may already be comfortable with XP because of its dominance over the past five years. Many consumers who buy new computers don’t want to have to deal with adjusting to an entirely new interface, especially when many of their friends and family have the old existing technology. As a result, many of Dell’s customers have decided that they want to purchase new computers with the old operating system installed on them.

Windows XP is not as technically sound or secure as Windows Vista, but due to its heavy establishment, it’s going to take quite a long time for Vista to become the dominant technology. People will only switch to the better technology when they can get the payoff benefits from a sizable fraction of the population already using it. Currently, the majority of PC owners aren’t going to want to go through the trouble of transitioning over to Vista since there isn’t a large population using it, and XP is good enough for their needs. However, this technology will eventually spread as Microsoft continues to market it, and as suppliers slowly limit the amount of products they sell with XP as the operating system.
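The lecture model of diffusion with a threshold can be sketched as follows (the payoffs and the six-node network are entirely made up): each person earns payoff a per neighbor when both run Vista and b per neighbor when both run XP, so a person switches to Vista once at least a fraction q = b / (a + b) of his or her neighbors has switched.

```python
# Threshold-model sketch of Vista adoption on a made-up contact network.
a, b = 3, 2              # Vista's hypothetical payoff edge over XP
q = b / (a + b)          # adoption threshold: 0.4

neighbors = {
    1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3, 5], 5: [4, 6], 6: [5],
}
adopted = {1, 2}         # seed the network with two early Vista adopters

changed = True
while changed:           # best-response updates until no one else switches
    changed = False
    for v, nbrs in neighbors.items():
        if v not in adopted:
            if sum(n in adopted for n in nbrs) / len(nbrs) >= q:
                adopted.add(v)
                changed = True

print(sorted(adopted))   # [1, 2, 3, 4, 5, 6]: a complete cascade
```

With a higher threshold (say b = 4, so q ≈ 0.57) the same seed stalls after node 3 adopts, which mirrors the post’s point: Vista spreads only once enough of a person’s contacts already use it.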

Posted in Topics: Education

No Comments

Blu-Ray and HD DVDs - Who needs them?

http://columbiatribune.com/2007/Apr/20070417Busi012.asp

Since the advent of Napster and digital music in the mid-1990s, CD sales have decreased dramatically, declining at an accelerating rate since 2003. Now, with videos and movies becoming increasingly available over the Internet, will a similar decline in sales of DVDs be observed? Already Wal Mart provides a digital video download service, allowing users to directly download movies that are available in Wal Mart stores. Similarly, Blockbuster plans to unveil a “digital rental service” by the end of the year. With faster, fiber optic Internet connections poised to become the norm, downloading a 700 MB movie to a personal computer will take under 10 minutes, far less than the time required to drive to the local Best Buy and purchase a DVD, and comparable to how long a song took to download before broadband Internet became widespread. With downloadable movies becoming increasingly easy to obtain, the only question that remains is whether they will eventually become the technology of choice, supplanting the new HD-DVD/Blu-Ray formats and replacing DVD storage altogether.

While other services providing downloadable movies have been created, the adoption of this technology throughout the network has been slow. However, now that companies like Wal Mart and Blockbuster are adopting the technology as well, the “celebrities” of the video retail networks have entered the equation. Both Wal Mart and Blockbuster control sizable market shares and thus have many “in-links” in their networks. With that in mind, the only thing preventing the spread of digital media is users’ willingness to adopt it.

To “switch” to HD-DVD/Blu-Ray technology, one will have to put up front the cost of purchasing an HD-DVD/Blu-Ray player. The Blu-Ray/HD-DVD combo players that are on the verge of entering the market will cost from 300 to 1,000 dollars. With the S-video input common on even the most basic new TVs and computers, the cost of playing a digital video is simply whatever Radio Shack is going to charge you for the S-video cable and perhaps a sound wire so that the audio does not have to play off of your computer speakers. All told, such products will run someone about 20 dollars. Thus, with the adoption of downloadable movie technology by the “celebs” of the industry and the much easier means by which someone can “upgrade” from DVDs to this new technology, we should see a rapid network cascade in which Blu-Ray and HD DVDs go the way of the 8-track and Betamax.

Posted in Topics: Education

No Comments

Statistical Discrimination in Fantasy Sports Leagues

Statistical discrimination is the concept of using the average behavior of a group to judge an individual representative of that group. Some forms of statistical discrimination are common, such as charging smokers and non-smokers different health insurance rates. However, statistical discrimination is harmful when it relies on unsound statistics, veiled racism, or unobservable traits. Paul Heaton at the University of Chicago investigates the prevalence of statistical discrimination in fantasy football, basketball, and baseball among sports fans in his paper

White Men Can’t Jump? Discrimination Evidence From Fantasy Sports Leagues.

Previous studies of statistical discrimination in sports usually focused on player salaries. This is not a sound measure of discrimination, since many variables influence a player’s salary, including possible fan biases, executive biases, and hard-to-quantify player traits such as leadership skills or media exposure. Heaton’s study focuses on possible fan biases by using data from hundreds of thousands of fans, and, since a fantasy player’s success in the game is based only on athletes’ statistics, there is an objective measure of an athlete’s value. He defines discrimination as occurring when two players who are comparable in individual ability, team, position, etc., receive different valuations because of their race.

After analyzing the data from the sports leagues, he concludes that “there is little evidence [of] discrimination against Blacks or Hispanics in football, basketball, or baseball leagues.” He also argues that his study is more successful than previous ones, since he is able to use data from a broad cross-section of the population. Furthermore, he avoids a pitfall of previous studies by using a situation where unobservable characteristics play little or no role in player valuations, so those traits cannot influence the results.

Posted in Topics: Education

No Comments

Randomness and Network Effects in Popular Music

This New York Times Magazine article, “Is Justin Timberlake a Product of Cumulative Advantage?”, discusses how the entertainment industry relies on hitting it big with a blockbuster to offset many failed investments. But what causes one artist to become enormously popular is hard to explain, which is why studio executives are so bad at predicting which of their many potential projects will become a home run. Recent research suggests that predicting hits is impossible, because predicting success is not just a matter of anticipating the preferences of millions of individual people. One of the wrong assumptions is that people make decisions about what they like independently of one another. In fact, people tend to like what other people like. Differences in popularity are subject to what is called “cumulative advantage,” or the “rich get richer” effect. As a result, tiny, random fluctuations can blow up, leading to long-run differences among indistinguishable competitors – a phenomenon similar to the “butterfly effect” from chaos theory.

Matthew Salganik, Peter Dodds, and Duncan Watts conducted a Web-based experiment in which more than 14,000 participants registered at their Web site, Music Lab (www.musiclab.columbia.edu), and listened to, rated, and, if they chose, downloaded songs by bands they had never heard of. Some of the participants saw only the names of the songs and bands, while others also saw how many times the songs had been downloaded by previous participants. This second group, the “social influence condition,” was further split into eight parallel “worlds” such that participants could see the prior downloads of people only in their own world. All the songs in all the worlds started out identically, with zero downloads. Because the different worlds were kept separate, they subsequently evolved independently of one another. If people know what they like regardless of what they think other people like, the most successful songs should draw about the same market share in both the independent and social-influence conditions, and the best songs should become hits in all social-influence worlds. What they found was the opposite. In all the social-influence worlds, the most popular songs were much more popular, and the least popular songs less popular, than in the independent condition. At the same time, the particular songs that became hits differed from world to world. Certain songs reached a tipping point. Introducing social influence into human decision making made the hits bigger and also made them more unpredictable. “Good” songs had higher market share, on average, than “bad” ones, but a listener’s own reactions were easily overwhelmed by his or her reactions to others. For example, a song in the Top 5 in terms of quality had only a 50 percent chance of finishing in the Top 5 in terms of success. Social influence played as large a role in determining the market share of successful songs as differences in quality did. This is just like a chapter out of The Tipping Point.
Long-run success of a song depends on the decisions of a few early-arriving individuals. Their choices are amplified and eventually locked in by the cumulative-advantage process. The “randomness,” according to the article, is that the early adopters, who are effectively chosen at random, make many different decisions, resulting in the market being unpredictable. This is a great example of network effects and of popularity as a network phenomenon, where the attractiveness of a song increases with the number of people downloading it. The models discussed in class would be helpful for evaluating this experiment further.
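The cumulative-advantage mechanism is easy to reproduce in a toy simulation (all parameters are my own invention, loosely in the spirit of the MusicLab worlds): in each of eight independent “worlds,” listeners choose between two equally good songs with probability proportional to prior downloads, so early random choices get amplified and locked in.

```python
import random

# Toy "rich get richer" urn: each listener picks a song with probability
# proportional to its current download count, then adds a download.
def run_world(seed, listeners=1000):
    rng = random.Random(seed)
    downloads = [1, 1]                    # both songs start out identical
    for _ in range(listeners):
        total = downloads[0] + downloads[1]
        pick = 0 if rng.random() < downloads[0] / total else 1
        downloads[pick] += 1
    return downloads[0] / sum(downloads)  # song 0's final market share

shares = [run_world(seed) for seed in range(8)]  # eight parallel worlds
print([round(s, 2) for s in shares])  # song 0's share in each world
```

Because the limiting share of this kind of urn process is essentially a random draw, the eight worlds typically end up far apart even though the two songs are identical, just as the MusicLab worlds produced different hits.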

Posted in Topics: Education

View Comment (1) »