This is a supplemental blog for a course which will cover how the social, technological, and natural worlds are connected, and how the study of networks sheds light on these connections.


Effective Information Networking For Health Care

Before the break, our classroom discussions focused on information networks and search. The Internet allows access to vast amounts of information; however, the underlying structure of that information can play a critical role in improving (or undermining) its usefulness. Search engines such as Google and Yahoo have enjoyed meteoric rises because of the utility they offer in sorting through the tangled knot of information on the internet and making it usable. They have also made money on their ability to direct traffic to specific sites for a fee. The key service provided is one of matching information sought with information retrieved, and it depends on a carefully constructed set of databases and programs.

 

Google, Microsoft, and other pioneers of the search industry (matching content desired to content provided) are now focusing these services on contexts beyond the internet. One of their most ambitious projects tackles the health care industry’s notorious difficulties with record keeping. This effort is reported in a CNN Money article (republished from Investor’s Business Daily): http://money.cnn.com/news/newsfeeds/articles/newstex/IBD-0001-23934810.htm The various companies developing tools for this effort want to collect patient data about prior conditions and health characteristics and link that information through algorithms to data about side effects, the latest research, and more. The goal is to give patients and their doctors a complete picture of health, one that would be transferable across state borders and doctor specialties.

 

The staggering cost of health care in America would benefit from the adoption of national data standards and electronic collaboration. The article cites a study in which hospitals were able to cut costs by 6% and avoid the unnecessary hospitalizations that result from missing information. It further projects that national health care costs would fall by 4% to 8% if electronic record systems became more widespread. Currently only about 20% of doctors use such systems.

 

Massachusetts is considering a law that would require all health care providers to convert to electronic records by 2015. This daunting task is also a business opportunity for some, as reported in the Worcester Telegram & Gazette News at http://www.telegram.com/article/20080323/NEWS/803230321/. The article also cites a 2005 Rand Corp. study estimating that if all US hospitals and doctors converted to electronic records, $77 billion could be saved. The study also predicts the elimination of 200,000 bad drug reactions per year. Studies such as this have prompted the Massachusetts action.

 

One final article, in the Australian newspaper The Age, reports on that government’s efforts to implement a similar policy: http://www.theage.com.au/news/national/database-to-link-patients-and-doctors/2008/03/24/1206207012399.html The savings and health benefits of switching to electronic networks for medical information are taking hold globally. The effort is extraordinarily costly in the short term, especially for smaller practices, but the investment pays off in office practices, doctor workload, and patient health. It will be exciting to see how the move to integrated, organized networks improves the health care industry.

Posted in Topics: Health, Technology, social studies

No Comments

Information Retrieval Via Social Networks

Though Googling “Cornell” will return - sure enough - www.cornell.edu as the first result, the PageRank algorithm has limitations that may be better overcome by networks more akin to social networks than information networks. A Search Engine Watch blog post, Is Twitter the New Google Alternative?, describes how people are increasingly turning to Twitter for information that isn’t easily retrieved through search engines. Though Twitter is designed as a “social networking and micro-blogging service” (Wikipedia) that lets people post short updates on their profile pages, it turns out that Twitter is also functioning as a tool for many to pose questions and get answers. According to the article, some are using it for business purposes to get advice on company and site improvements. Lisa Bledsoe, one person who uses Twitter in this way, states in the article:

“Because I deliberately cultivated a Twitter community of my industry peers, I knew they could give me the answer quickly. I can also ‘refine’ my ’search’ on Twitter because I’m talking to actual people, as opposed to posing questions to an algorithm.”

As discussed in class, the process of ranking search results suffers from problems such as multiple ways to say the same things, multiple meanings for the same term, differing authoring styles, differing intents behind queries, and the overall abundance of possibly relevant (and possibly irrelevant) information. However, questions posed over social networking sites are able to eliminate some of these problems. If a person fully types out a question with grammatical context (as opposed to keywords), other human readers will likely be able to correctly interpret the questioner’s meanings and intent. Furthermore, by taking advantage of focal closure to network with other experts in the same business, people like Bledsoe reduce the chance that illegitimate authors (children, conspiracy theorists, etc.) will respond to their queries. Though waiting for answers usually takes longer than a fraction of a second, social networking sites like Twitter offer the advantage of human interaction, a big plus over search engines in some cases.

Unsurprisingly, the popularity of answer sites is also increasing for likely the same reasons. According to the Search Engine Watch blog post What a Searcher Wants: Answers, “the US market share of Q&A site traffic grew a whopping 118% last year.” Just browsing some open questions on Yahoo! Answers, it’s clear why some questions (e.g. Best time during start-up to seek venture capital and timetable for getting money?) are more effectively posed on this sort of site instead of typed as keywords into a search engine. Very specific questions that nonetheless involve lots of common keywords would be less likely to turn up helpful results using a search engine. Of course, open questions on Yahoo! Answers may be answered by just about anyone, but the idea that other people may directly serve as better sources of information than web pages nonetheless persists.

Still, it wouldn’t do to disregard the obvious benefits of search engines. It’s much more efficient to Google “Cornell” for the university homepage instead of contacting peers on Twitter. As Bledsoe concedes:

“Searching for the right information isn’t necessarily an ‘either/or’ situation (either I use either Google or Twitter), it’s sometimes a ‘both/and.’”

Posted in Topics: Education, Technology

No Comments

Changing Congress Collaboratively

Larry Lessig, the creator of Creative Commons and a prominent name in Internet law, recently launched Change Congress. Change Congress is a multipartisan movement created to reform politics by combining existing data and ideas in a “Google mashup” that will ultimately allow voters to see which candidates support change. The problem Lessig sees in the current government is that there is too much influence from large corporations and the wealthiest. Similar to Lessig’s work at Creative Commons, which allows users to mark their content for free distribution, Change Congress currently allows voters as well as congressional candidates to pledge to support reform along four main planks:

  1. take no money from PACs
  2. ban earmarks
  3. support public financing of elections
  4. support total transparency.

Step 1: Users will pledge a combination of the four characteristics and then get a badge which they can display online; Lessig is hoping to spark a political “Livestrong” trend in which it will be hip to display a pledge (and which will hopefully pass a tipping point).


Figure 1: Pledge icon to be placed on voters’ and candidates’ websites.

Step 2: The website will track reform pledges using “wikified tools” to build a network of users and the types of reform they support, and to match them with congressional candidates who share those ideals. This will be done by mapping actual and pledged support on a Google Map.


Figure 2: Map showing the amount of PAC donations for Ithaca’s congressional district.

Step 3: Fund the reform which people have pledged for by allowing people to directly pledge money to candidates that support reform.

As we have discussed in class, the web is a series of pointers that refer to syntactic data: search keywords point to word matches, URLs point to specific files on servers. One of the major changes the internet is moving towards is a semantic web in which information is connected by the data behind the words. Lessig’s big picture seems to be promoting not only change in Congress but also a progressive shift in how we think about the internet as a tool. Each pledge badge displayed will link back to a user’s congressional representative and can be used to build a network of users. Change Congress is crowdsourcing at its finest. It takes a “Wikipedia” approach in which wiki-users gather data about their congressional districts. By forming a community of users who each do their own research on candidates, their pledge status, and their actual numbers, Change Congress can aggregate large amounts of data and overlay it on an easily visible map.

While we don’t yet have foolproof semantic web crawlers that can read a politician’s sentences and understand them, this may be a first step toward aggregating information and using the internet not just as a broadcasting tool but as a democratically interactive tool in which voters contribute not only money but also effort in promoting transparency. With many semantic search start-ups and Yahoo!’s recent announcement of its move to semantic web search, it is not out of the question that we will soon have the tools to analyze data on the internet based on content instead of word syntax. Relating back to the course, this would change the game in keyword-based advertising because users, instead of searching for certain keywords, would essentially be searching for ideas. Advertisers would then need to buy these ideas in order to secure advertising slots.

If Lessig is successful and does happen to stumble across an internet epidemic with a high stickiness factor, we may see a change not only in the structure of the legislative branch but also a change in how the internet can be used as a collaborative tool.

http://www.huffingtonpost.com/lawrence-lessig/fix-congress-first_b_92456.html?view=print

http://lessig.org/blog/

http://change-congress.org/

Posted in Topics: Technology, social studies

No Comments

Hardy-Weinberg/ Nash Equilibrium

http://findarticles.com/p/articles/mi_qa3659/is_200606/ai_n17175649/pg_3

The Hardy-Weinberg equilibrium is a principle that explains why the frequencies of alleles and genotypes in a population remain constant from generation to generation. This, of course, occurs only when certain conditions hold, and the principle works hand in hand with Mendelian segregation and the recombination of alleles. The five conditions are random mating, a large population, no gene flow, no mutation, and no natural selection. At first glance these conditions seem very hard to find on Earth, since our planet is constantly changing.

In this article Black, Wise, Wang, and Bittles study the ethnic diversity of the People’s Republic of China. While studying the Hardy-Weinberg equilibrium, I noticed that this equation is not so different from the Nash equilibrium in random cases. Ultimately the equation helps us understand the significance of data. The studies examined maternal allelic frequencies. Though history includes numerous invasions and increases in gene flow, the Han, Hui, Sala, and Tibetans share common maternal origins, as indicated by the high frequency of the D haplogroup. Drawing conclusions from these studies proved difficult. Some consistency exists thanks to the large population, one of the conditions required for Hardy-Weinberg equilibrium, but even where patterns were found within groups, they cannot be treated as definite facts because of the wide deviation.
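The equilibrium itself is easy to verify numerically. The sketch below uses an illustrative allele frequency, not data from the study: with allele frequencies p and q = 1 - p, the equilibrium genotype frequencies are p², 2pq, and q², and recomputing the allele frequency from those genotype frequencies returns the same p, which is the "constant from generation to generation" claim.

```python
# Hardy-Weinberg: for two alleles with frequencies p and q = 1 - p,
# equilibrium genotype frequencies are p^2 (AA), 2pq (Aa), q^2 (aa).

def hardy_weinberg(p):
    """Return expected genotype frequencies (AA, Aa, aa) for allele frequency p."""
    q = 1.0 - p
    return p * p, 2 * p * q, q * q

# Illustrative allele frequency (not taken from the study):
p = 0.7
aa, het, bb = hardy_weinberg(p)
print(aa, het, bb)            # ~0.49, ~0.42, ~0.09 -- the three sum to 1

# Under the five conditions, the next generation's frequency of allele A
# is unchanged: homozygotes contribute all A, heterozygotes half.
p_next = aa + 0.5 * het
print(p_next)                 # ~0.7, same as p
```

Note that the five conditions are exactly what keeps p_next equal to p; violating any of them (selection, mutation, gene flow, small populations, non-random mating) lets allele frequencies drift between generations.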

Posted in Topics: Science

No Comments

Auctions and Information Cascades Lead to Cheaper Vacations

Priceline.com, a travel site, offers everything from hotel rooms to car rentals. While many have already heard of the company (mainly through the tongue-in-cheek William Shatner ads airing on television), most do not know the specifics of proper bidding strategy, as discussed in lecture. This, along with the details of BiddingForTravel.com, a bid-advising site that facilitates information cascades, is discussed in a recent New York Times article.

Based on user descriptions of its bidding process, Priceline appears to combine features of a first-price auction (since bidders pay their bid price if they win) and a generalized, multi-slot auction (owing to several available ‘slots,’ e.g. rooms or flights), although it is impossible to be completely sure. What I can be sure of regarding the whole process, according to the New York Times, is the following:

First, potential vacationers submit their desired price, along with a preferred time of travel. Next, Priceline checks with its partners (affiliated hotels and airlines) to see whether any will accept the offer. If Priceline is able to find a match, the deal is complete. If not, the whole process can be repeated (obviously with a different bid or different departure dates).

Although the format of the auction is not perfectly clear (in reality, it might actually be a trading network scheme), we can be assured that all bidders are attempting to maximize their payoffs regardless of what any other Priceline user is doing at any given time. Simply put, no one wants to overpay for the good. Alternative means of purchasing the good (e.g. buying directly through the airline or resort) determine each bidder’s value. This type of background research naturally leads to information cascades.

BiddingForTravel.com lists winning bids for Priceline auctions for all potential vacationers to see. It keeps consumers informed on proper bidding techniques and helps them understand what they should be paying for a weekend in the city of their choice. More specifically, it helps people decide between ‘accepting’ or ‘rejecting’ two alternatives. For instance, should I change my bid of $1000 for a holiday in Mexico if past winning bids were approximately $1500? Most likely yes, if several bidders have paid $1500 in the past and the signal strength is high. Assuming that BiddingForTravel.com gets enough publicity, bidders should be using this information to improve their expected payoffs by bidding similarly.
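The accept-or-reject decision in the Mexico example can be caricatured as simple signal counting. This is a toy rule of my own, not Priceline's or BiddingForTravel's actual mechanism: treat each past winning bid as one public signal of the going rate, and abandon your private estimate only when the public signals clearly outweigh it.

```python
def revise_bid(private_estimate, past_winning_bids, threshold=2):
    """Toy cascade rule: if at least `threshold` more past winners paid above
    our private estimate than below it, adopt the median past winning bid;
    otherwise stick with our own estimate."""
    above = sum(1 for b in past_winning_bids if b > private_estimate)
    below = sum(1 for b in past_winning_bids if b < private_estimate)
    if above - below >= threshold:
        ordered = sorted(past_winning_bids)
        return ordered[len(ordered) // 2]   # median past winner
    return private_estimate

# The article's example: we plan to bid $1000 for a holiday in Mexico, but
# several past winners paid around $1500 -- the public signal wins out.
print(revise_bid(1000, [1500, 1450, 1550]))   # 1500
print(revise_bid(1000, [900, 1500]))          # 1000: signals balance, keep our own
```

The threshold plays the role of signal strength: with only one or two scattered past bids, a bidder should trust private information, which is exactly the condition under which cascades fail to start.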

Posted in Topics: Education

No Comments

Application of the PageRank algorithm to evaluate microarrays

In September 2005, Julie L. Morrison et al. published an article in the journal BMC Bioinformatics called “GeneRank: Using search engine technology for the analysis of microarray experiments.” This article can be found here: http://www.biomedcentral.com/1471-2105/6/233

Every cell in an organism has the same genes, but different genes are “turned on” and “turned off” in different cells. Cells in the kidney, for example, express different genes than cells in the brain because they serve different functions. A microarray is an experimental technique where scientists can see the expression levels of thousands of genes in a group of cells. Scientists can then compare the microarrays to see the relationship between the cell’s function and gene expression. Microarrays may be compared between different organs, different times during embryonic development, cells treated with drugs versus cells not treated with drugs, cancerous versus non-cancerous cells, etc.

Because of the huge sets of data involved, one challenge of microarray analysis is to prioritize the most important genes to analyze. Morrison et al. applied Google’s PageRank idea to the analysis of microarrays. In class, we discussed how PageRank ranks webpages. Similarly, GeneRank ranks the genes analyzed in the microarray. Just as a web user looks first at the pages at the top of the results for a search query, the goal of GeneRank is to give scientists a clear answer about which genes to investigate first.

The structure of GeneRank closely mirrors the structure of Google’s PageRank. As we discussed in class, the basic idea behind PageRank is that webpages are highly ranked if other highly ranked webpages link to them. Similarly, in GeneRank, a gene is highly ranked if it is associated with other highly ranked genes. Where a webpage is a node in PageRank, each gene is a node in GeneRank. In PageRank, hyperlinks are the edges; in GeneRank, an edge represents previously known biological knowledge about the association between the two genes it connects. Thus, GeneRank draws attention to the structural network connecting different genes.

Traditional analysis of microarrays depends on the expression fold changes; a gene is considered important if there is a large difference between its expression level in the control group versus the experimental group. With GeneRank, the expression data is combined with connectivity data—how related the gene is with other genes, and especially with other highly-ranked genes. The relative contributions of expression and connectivity data to a gene’s priority score can be altered in the GeneRank algorithm. The authors suggest that GeneRank should be used alongside the traditional way of analyzing microarrays.
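The combination of expression and connectivity can be sketched in a few lines of PageRank-style iteration. The network, expression values, and the weighting parameter d below are all invented for illustration (the real algorithm runs on thousands of genes): d = 0 ranks purely by expression change, while d near 1 ranks mostly by connectivity.

```python
# GeneRank-style ranking: each gene's score mixes its own (normalized)
# expression change with the degree-weighted scores of its neighbors:
#   r[i] <- (1 - d) * ex[i] + d * sum_j A[i][j] * r[j] / deg[j]

def gene_rank(adjacency, expression, d=0.5, iters=100):
    n = len(expression)
    total = sum(expression)
    ex = [e / total for e in expression]            # normalized expression
    deg = [max(sum(row), 1) for row in adjacency]   # guard isolated genes
    r = ex[:]
    for _ in range(iters):
        r = [(1 - d) * ex[i]
             + d * sum(adjacency[i][j] * r[j] / deg[j] for j in range(n))
             for i in range(n)]
    return r

# Hypothetical 4-gene network: gene 3 has a modest expression change but is
# linked to the two highest-expression genes, so connectivity lifts its score
# well above its expression share.
adj = [[0, 1, 0, 1],
       [1, 0, 0, 1],
       [0, 0, 0, 1],
       [1, 1, 1, 0]]
expr = [5.0, 4.0, 1.0, 2.0]
ranks = gene_rank(adj, expr, d=0.5)
print(sorted(range(4), key=lambda i: -ranks[i]))   # genes, best first
```

Varying d here is exactly the "relative contributions" knob the authors describe; the paper's point that GeneRank complements rather than replaces fold-change analysis corresponds to comparing the d = 0 ranking with intermediate d.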

Posted in Topics: Education

No Comments

The Housing Bubble and Information Cascade

We are in the midst of the subprime mortgage crisis, and the housing bubble has burst. People are placing blame on many factors, from the predatory practices of mortgage lenders and unscrupulous analysts to a lack of regulation in the markets, but could the housing bubble have come from the idea of information cascades?

The idea is very simple, and very similar to the examples used in class. Individuals in a group are looking to spend their money and are deciding whether or not to buy some property. Let’s say that each individual’s information is useful, but not definitive, and not clear enough to make a concrete decision as to whether or not there is a housing bubble. Each individual then makes a decision sequentially (i.e. no one enters a town hall meeting to buy property) and reveals that decision through action: in this case, bidding on a house and raising the price.

Now suppose houses are low in value. Person 1 makes a wrong decision, bids on a house, and raises the price. Person 2 faces no problem if his own information validates Person 1’s decision to pay a high price. He faces a problem if his beliefs contradict Person 1’s, in which case he would conclude that he has no worthwhile information and must make an arbitrary decision, say by flipping a coin.

The result is that two people may now be making a bad decision and raising the price. As others make purchases at rising prices, more and more people will conclude that these buyers’ information about the market outweighs their own. This is the housing bubble.
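The sequential story can be simulated directly. This is a toy version of the counting rule from class (the 0.7 signal accuracy is an arbitrary assumption): each buyer treats every earlier visible decision as one signal, adds their own private signal, follows the majority, and flips a coin on ties.

```python
import random

def simulate_cascade(true_value_good, n_people, signal_accuracy=0.7, seed=None):
    """Sequential buyers decide to buy (True) or pass (False). Each counts
    prior public decisions as one signal apiece plus their own private
    signal, follows the majority, and coin-flips when evidence is balanced."""
    rng = random.Random(seed)
    decisions = []
    for _ in range(n_people):
        # Private signal: correct with probability signal_accuracy.
        private = true_value_good if rng.random() < signal_accuracy \
                  else not true_value_good
        evidence = sum(1 if d else -1 for d in decisions) + (1 if private else -1)
        if evidence > 0:
            decisions.append(True)      # buy, raising the price
        elif evidence < 0:
            decisions.append(False)     # pass
        else:
            decisions.append(rng.random() < 0.5)
        # Once the running count of decisions reaches +/-2, no later private
        # signal can flip the majority: a cascade has locked in.
    return decisions

# Houses are actually low in value, yet an early pair of wrong decisions
# can lock everyone after them into buying:
print(simulate_cascade(true_value_good=False, n_people=10, seed=1))
```

Running this repeatedly shows the unsettling property from class: cascades almost always form, and with imperfect signals a noticeable fraction of runs lock the whole population into the wrong choice, i.e. a bubble.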

Information cascades give us insight into how rational people can make irrational decisions. Were all these people stupid? Surely not. They were simply caught up in the bubble, and in the information cascade.

Link: http://www.ohio.com/editorial/commentary/16246947.html?page=1&c=y

Posted in Topics: Education

No Comments

Think one more time before you make that next big financial decision

http://www.nytimes.com/2008/03/02/business/02view.html?pagewanted=print

Throughout the course, we have studied game theory and auctions, both of which deal with situations in which people face decisions. The article above is about information cascades and herding behavior, which also arise when people must make an important decision based on useful but incomplete information.

The article discusses people’s behavior arising from information cascades and herding in the financial markets. For instance, investors sometimes see no risk at all and only the prospect of huge returns. Why is this the case? Well, some would say that people are irrational. However, I would like to believe that people are in their right minds when they are risking huge sums of money. In 1992, three economists, Sushil Bikhchandani, David Hirshleifer, and Ivo Welch, revealed the answer. They explained that when people are faced with an important decision and have useful but incomplete information, they tend to disregard their own judgment and follow others’ decisions.

When Buyer A purchases a stock thinking it is a spectacular investment (even though it is not), Buyer B must decide whether to purchase that stock even when his own private information contradicts Buyer A’s purchase. In this case, Buyer B is likely to disregard his own private information and buy the stock despite its low real investment value. With Buyers A and B purchasing the stock, many other prospective buyers will also conclude that the stock is a great investment. The stock’s price then skyrockets, and it becomes a bubble that will burst in the same manner it formed. After a couple of buyers sell the stock, the remaining shareholders become more and more pessimistic and also sell quickly, generating a downward cascade.

This article shows that it is quite easy to set off cascading behavior and that it is actually hard to detect whether something is a bubble. Even Alan Greenspan, as the article shows, was never sure whether the stock market was a bubble in the 1990s. Experts like Greenspan are affected by the cascading behavior of others: if the price of a stock were constantly rising, why would any expert call it a bubble? The conclusion, as Bikhchandani, Hirshleifer, and Welch proposed in their original paper, is that even fully rational individuals can be swept into cascades.

Next time you face a crucial financial decision, do yourself a favor: stop and ask one question. “Am I making this decision based on my own information, and not based on what others are doing?”

Link to the original paper by Bikhchandani, Hirshleifer, and Welch:

http://www.jstor.org/view/08953309/di014715/01p0058j/0

Posted in Topics: Education

No Comments

eBay Breaks the Second-Price Auction

Although eBay is “essentially a second price auction,” the mechanics of the system create an environment that does not follow the implications of a second-price auction. Proxy bidding allows each bidder to bid up to his max bid, which, in most cases, should be his true value. The price he pays is the second-highest max bid among all the bidders, which the eBay system keeps track of. Unlike a normal second-price auction, however, eBay publicly lists the current high bid and sets a deadline for each listing, which allows people to “snipe” a listing with a last-minute bid.

Sniping has become one of the staple ways to win against someone who would normally outbid you. It prevents a bidding war, in which two or more bidders continuously try to outbid each other until all but one reach their max bids, which is how an auction is supposed to work. Instead, on eBay, many bidders are concerned with getting a good deal, and thus do not always bid their perceived true value of the item. If one bids low early in the auction, competing bidders have a harder time predicting the value of the winning bid, so the bids on a listing are often low until the last ten minutes. By bidding at the last minute, one can win the auction without having the highest valuation, simply because no one else can respond.

Although sniping is a very effective strategy, for the rational (or less patient) individual, bidding one’s true value is still the dominant strategy. It may not secure the best deal, or sometimes even the item itself, but it is the most rational play. Since eBay auctions can last anywhere from hours to weeks, the dominant strategy is also the best way to save time.
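The proxy mechanics described above can be sketched as follows. This is a simplification: real eBay adds a bid increment that varies with the price level and has reserve-price rules, and the increment of 1.00 below is just an assumption.

```python
def proxy_auction(max_bids, increment=1.00):
    """Simplified eBay-style proxy auction: the highest max bid wins and
    pays the second-highest max bid plus one increment, capped at the
    winner's own max. Returns (winner_index, price)."""
    order = sorted(range(len(max_bids)), key=lambda i: -max_bids[i])
    winner, runner_up = order[0], order[1]
    price = min(max_bids[winner], max_bids[runner_up] + increment)
    return winner, price

# Three bidders entering their true values of 50, 45, and 30 as proxy maxes:
winner, price = proxy_auction([50.00, 45.00, 30.00])
print(winner, price)   # 0 46.0 -- winner pays just above the second-highest max
```

This makes the post's point concrete: if every bidder enters their true value as a proxy max, the timing of the bids never changes the outcome, so sniping only helps when rivals shade below their true values early and run out of time to respond.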

Reference: http://www.usatoday.com/tech/science/columnist/2006-06-25-physics-of-ebay_x.htm

Posted in Topics: Education

No Comments

Save the Players

There have been a few posts over the past couple months that have touched on blood doping and steroid use by professional athletes (see references to these posts below). It is true that in cycling, track and field, baseball, and other professional sports, each athlete is faced with a prisoner’s dilemma where it would actually be irrational considering the payoffs to not use performance-enhancing drugs. Each of these posts adds something to the picture, but they’re also missing important points that I will try to address here. The article listed below does a thorough job of providing a detailed history and analysis of drug use in professional sports, presents statistics that are both convincing and startling, and makes recommendations that are plausible as well as impactful.

In the cycling and track and field worlds, blood doping means adding extra red blood cells to your blood so that more oxygen can be carried at once. This can be done through straight injections or by using a drug called r-EPO to boost production of red blood cells. According to reliable estimates, “between 50 and 80 percent of all professional baseball players and track-and-field athletes have been doping,” which is staggering considering that in the media we only hear about a handful of athletes actually being investigated. In this way the public has the wrong impression about drug use in sports. In many cases entire teams organize their use of performance-enhancing drugs as part of their “medical program”. Let’s say you’re a professional cyclist. Using r-EPO along with other drugs can make your performance 10-20% better, which translates into around a 150-second difference in a 31-mile time trial, or a 160-second difference on a 6-mile steep climb. If you choose not to use, you’ll almost surely earn much less money than you otherwise would (unless your parents endowed you with some awesome genes), and possibly even be cut from your team. Almost no one even gets tested for these drugs, let alone caught; even those who are caught face punishments that are not very severe. So why wouldn’t you?

Until the 1990s, the payoffs for athletes were such that it wasn’t worth it to cheat, but with the arrival of r-EPO most players have to either “cheat or lose”. The fact that this practice is so widespread can be explained by experiments on many-person, variable-round prisoner’s dilemma scenarios. What generally happens is that the players initially cooperate (play fair), but “once defection by confessing [cheating] builds momentum, it cascades throughout the game”, and this is what happened in professional sports. The reason everyone doesn’t simply agree not to cheat and forgo the health and reputation risks is that the consequences of getting caught maintain a code of silence among the athletes. There is no incentive for anyone to confess first: they would be threatened by their teammates, fined, discredited, and stripped of all the fame and fortune they would have earned. This is why Michael Shermer’s recommendations for altering the payoff matrix so that no one wants to cheat are sound (I summarize only some of them here). He suggests that there should be no repercussions for athletes who cheated before 2008, as long as they stop. Testing should be done by more reliable bodies, should happen to more athletes more often, and its methods should be improved by rewarding scientists for keeping up with new drugs. Last, the penalty for getting caught should be harsher and should affect your whole team, not just you.
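The “cheat or lose” logic is exactly the dominant-strategy check from class. The payoff numbers below are invented; only their ordering, which follows the article's story, matters.

```python
# Hypothetical payoffs for one cyclist, given what a rival does.
# The ordering encodes "cheat or lose": doping pays regardless of the
# rival's choice, because testing is rare and penalties are mild.
payoffs = {
    # (my_choice, rival_choice): my_payoff
    ("clean", "clean"): 3,   # fair race, shared success
    ("clean", "dope"):  0,   # I lose races, sponsors, maybe my team spot
    ("dope",  "clean"): 5,   # I dominate, little risk of being caught
    ("dope",  "dope"):  1,   # arms race: health risks, no edge gained
}

def best_response(rival_choice):
    """Return my payoff-maximizing choice against a given rival choice."""
    return max(("clean", "dope"), key=lambda me: payoffs[(me, rival_choice)])

for rival in ("clean", "dope"):
    print(f"If the rival rides {rival}, best response: {best_response(rival)}")
# Doping is the best response either way, even though (clean, clean) beats
# (dope, dope) for both riders -- the classic prisoner's dilemma.
```

Shermer's recommendations amount to editing this table: more frequent testing and harsher, team-wide penalties drag the two doping payoffs down until riding clean becomes the best response to either rival choice.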

The only problem with these recommendations is that (at least in my opinion, mostly because I was so surprised by some of the content in this article) the general public doesn’t realize how common drug use is. Many people would be crushed by this knowledge, and faith in professional sports could dwindle. This gives the commissioner of Major League Baseball little incentive to carry out any of these recommendations, and it gives no player, current or retired, incentive to come forward and help the cause, no matter how it would benefit the entire sports world. If the public were aware that the problem is as big as it is, it might not be a big deal (and might actually be looked upon more favorably) for a player to confess to doping. This is why articles like this one are so important.

The Doping Dilemma: Game theory helps to explain the pervasive abuse of drugs in cycling, baseball and other sports

By Michael Shermer

http://www.sciam.com/article.cfm?id=the-doping-dilemma

(This is only an e-preview - I have the full article on paper)

3/21/08 “Doping is a Dominant Strategy” - http://expertvoices.nsdl.org/cornell-info204/2008/03/21/doping-is-a-dominant-strategy/

3/5/08 “Athletes’ Prisoner’s Dilemma” - http://expertvoices.nsdl.org/cornell-info204/2008/03/05/athletes-prisoners-dilemma/

2/27/08 “Use of Steroids by Professional Athletes” - http://expertvoices.nsdl.org/cornell-info204/2008/02/27/use-of-steroid-by-professional-athletes/

Posted in Topics: Education

No Comments