This is a supplemental blog for a course which will cover how the social, technological, and natural worlds are connected, and how the study of networks sheds light on these connections.


Digg’s Algorithm

Digg is a social content website built around community-submitted news articles. “It combines social bookmarking, blogging, and syndication with a form of non-hierarchical, democratic editorial control.” News stories and websites are submitted by users and then promoted to the home page through a user-based ranking system, which differs from the hierarchical editorial systems that many other news sites employ. In Digg’s system, readers can look at all the stories submitted by other users, and when a story receives a certain number of “diggs” from community votes, it is promoted to the home page. However, Digg’s ranking algorithm does not simply compare the number of votes each submission gets and promote the stories with the most votes; many other factors are taken into account. Some of the elements in Digg’s ranking algorithm are the following:

1. Recent participation rank of users: When a story is submitted and voted on, the rank and recent successes of the users involved are taken into account. If a story acquires a fast succession of “diggs” from “high-value” users, it will most likely be promoted faster, and with fewer “diggs,” than if the “diggs” came from “low-value” users. This is similar to Google’s PageRank discussed in class, although here the nodes are users rather than documents (webpages). In the PageRank algorithm, links made by “important” documents weigh more heavily and help make other documents “important.” In both algorithms this is an essential element: a document linked to by an important document should be valued more than a document linked to by a less important one. For example, a document linked to by the New York Times would be valued more than a document linked to by the Cornell Daily Sun. (A small sketch of the PageRank idea appears after this list.)

2. Friends system: The friends system allows the development of a group of users who read and trust each other’s articles. Users can track their friends’ activity and combine their articles into something similar to a watchlist. According to the creator, Kevin Rose, “You’re into what they’re digging, you’re into what they’re submitting and commenting on. And digg takes all that information under your own personal profile on digg and combines it all together in a single feed for you to pick through.”

Digg’s “friends system” goes along with the idea of a social-affiliation network we learned about in class. Two Digg users can experience focal closure, which occurs when two people who share an activity become friends. Here the shared activity is the Digg website, and the two people are users who read and trust each other’s stories. As a result, focal closure can potentially develop a tightly knit group of users who all share and trust each other’s stories and articles.
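
To make the PageRank comparison in item 1 concrete, here is a minimal power-iteration sketch, assuming a tiny made-up link graph; the node names are hypothetical and only echo the New York Times / Cornell Daily Sun example above, and the damping constant is the commonly cited 0.85, not anything Digg or Google has published.

```python
# A minimal power-iteration sketch of the PageRank idea: a link (or "digg")
# from an important node counts for more. The graph below is made up.

def pagerank(links, damping=0.85, iterations=50):
    """links maps each node to the list of nodes it links to."""
    nodes = list(links)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Every node keeps a small base amount, then receives shares of
        # rank from the nodes that link to it.
        new_rank = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for node, targets in links.items():
            if not targets:
                continue
            share = damping * rank[node] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical graph: "story_a" is linked by the well-linked "nytimes" node,
# "story_b" only by the less-linked "daily_sun" node.
links = {
    "nytimes": ["story_a"],
    "daily_sun": ["story_b"],
    "story_a": ["nytimes"],
    "story_b": ["nytimes"],
    "reader": ["nytimes", "daily_sun"],
}
print(pagerank(links))
```

Running this, "story_a" ends up with a higher score than "story_b", even though each is linked by exactly one page, because its one endorsement comes from a more important node.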

http://blogs.zdnet.com/web2explorer/index.php?p=108

http://blogs.zdnet.com/web2explorer/?p=109

http://searchengineland.com/071128-122432.php

http://en.wikipedia.org/wiki/Digg

Posted in Topics: Technology


The Next Step in Internet-based Social Networks

On his website’s blog, John Breslin included an article he co-authored for IEEE magazine titled “The Future of Social Networking on the Internet.” The link to the above-mentioned post is here.

In the article, Breslin points out that while the internet has worked well at connecting people for many different purposes, today’s social networking sites (SNSs) still have flaws that prevent the creation of a single network giving people complete access to all online content as well as to each other. Some reasons for this include:

1. While real-world social networks form around people working together on common interests, evidence suggests that online social networks often lack these common goals. Rather, an SNS user may make a connection for no other reason than to increase their number of friends (Facebook being a good example). As Breslin states, “These explicitly established connections become increasingly meaningless because they aren’t backed up by common objects or activities.”

2. Different networking sites do not work together. Anyone who wants to join another social network must rebuild their profile and their network of connections from scratch every time.

To provide more meaning to online social networks, Breslin suggests that networks should be designed such that people form connections around the interests they share. This “object centered sociality” better simulates real life social networks and focuses the online network around the content that keeps people interested and continuing to visit the site, such as blog posts, photos, video, etc. With this approach, people will connect over things they have in common, rather than more or less at random.

Several efforts are already underway to achieve more interoperability between different SNSs. To achieve this, a common method of representing people, content, and the connections between them is needed. One way Breslin suggests this can be achieved is through the Semantic Web (Wikipedia Article), which provides the means to create an agreed-upon format for representing people, content, and the relations between them that can be used by all SNSs, allowing them to integrate effectively with each other.

There are several currently ongoing attempts to achieve some of the goals stated in Breslin’s article.

Among these are the Friend of a Friend (FOAF) project and the Semantically Interlinked Online Communities (SIOC) initiative, which use the Semantic Web to expand the ways content can be created and connected on SNSs, and OpenID, which provides a single user name and password for the various websites people visit that require a login. With these efforts underway, it will be interesting to follow how online social networks evolve in the next few years.
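
As a rough illustration of what a shared, machine-readable description of people and their connections looks like, here is a minimal sketch using the FOAF vocabulary with the Python rdflib library. The people and URIs are hypothetical; the point is only that any SNS could publish and read the same kind of description.

```python
# A minimal FOAF sketch with rdflib: describe two hypothetical people and
# one "knows" connection, then serialize it in Turtle, a common Semantic
# Web format that any site could consume.

from rdflib import Graph, URIRef, Literal
from rdflib.namespace import FOAF, RDF

g = Graph()
alice = URIRef("http://example.org/people/alice")
bob = URIRef("http://example.org/people/bob")

g.add((alice, RDF.type, FOAF.Person))
g.add((alice, FOAF.name, Literal("Alice")))
g.add((alice, FOAF.knows, bob))          # an explicit social connection
g.add((bob, RDF.type, FOAF.Person))
g.add((bob, FOAF.name, Literal("Bob")))

print(g.serialize(format="turtle"))
```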

Posted in Topics: Education


Gaming Twist on Ivy League Rivalries

Ivy League schools now have a new area to compete in - that of online gaming, via GoCrossCampus, or GXC. Developed by four undergrads at Yale and one at Columbia, GXC is a strategy territory-conquest game, much like Risk. The difference is that the battlefield consists of college campuses across the USA. Each player is given a set number of armies, and allowed to move once each day. The goal, of course, is to conquer everything. (Last Winter, Cornell turned the armies of Dartmouth into their vassals, apparently.) So far, more than eleven thousand players have taken part, and the game has been funded by both WGI Fund and Easton Capital.

What sets GXC apart from any other online game with a twist? GXC extends across both digital and everyday life. Rather than individual people sitting at separate computers and playing toward their own ends, the network of players is grouped by school and acts as a team, making GXC’s structure resemble a school intramural group. Additionally, GXC’s impact on players’ lives involves them banding together rather than setting each person apart. In some schools it’s a social event (commanders are designated and even give motivational speeches). It can even be seen as an online supplement to real-life networks.

The article compares the phenomenon to Facebook - while not nearly as widespread and with different purposes in mind, both applications are networking services between schools and within them. The game has become popular enough that it has extended into the political realm with GoCrossPoliticalBash08, which, as you can guess, pits presidential candidates’ supporters against each other instead of university students.

Original Article: http://www.nytimes.com/2008/03/21/technology/21ivygame.html?_r=1&ref=technology&oref=slogin

Posted in Topics: Education, Technology


An amateur guide to Silicon Valley real estate investments, credits to constricted sets

http://realtytimes.com/rtpages/20041020_siliconvalley.htm

Silicon Valley is located south of San Francisco in beautiful northern California. In addition to year-round sunny, 80-degree weather, Silicon Valley is also the home of the dot-com boom, Stanford University, and three public high schools that rank in the top 50 in the nation.

[Map of Silicon Valley and vicinity]

Silicon Valley is also known for its ridiculous home prices. In the city of Palo Alto, for example, it is not unusual for an average three-bedroom cottage to cost over $1.5 million. Many expected the real estate bubble to burst when the recession hit in 2001; however, the houses whose prices dropped significantly were the gated mansions in Los Altos Hills or Atherton that the average middle-class household could only dream of purchasing. Buying a home thus becomes a significant problem for young couples without established savings or credit.

Recently, real estate prices in the Ardenwood area of Fremont, across the bay, have been rising at a noticeably faster rate. But why? The theory of price formation in networks tells us to raise the prices of items in constricted sets in order to create a perfect matching. Although the dot-com boom is over, the internet and computer industry is still growing rapidly. New software, games, and electronic devices (iPhone, anyone?) are constantly being developed, making Silicon Valley residents wealthier and wealthier. The income of the buyers is increasing, so even though real estate prices keep climbing, the “total energy” in this network has not yet diminished.
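
Here is a minimal sketch of the market-clearing price procedure from class applied to a toy housing market: if no perfect matching exists in the preferred-seller graph, the over-demanded houses (the neighborhood of a constricted set of buyers) have their prices raised by one, and the process repeats. The buyers, houses, and valuations are made-up numbers, not real Silicon Valley data.

```python
# Sketch of the ascending-price (market-clearing) auction from class.

def preferred_sellers(valuations, prices):
    """For each buyer, the set of houses maximizing (valuation - price)."""
    prefs = {}
    for buyer, vals in valuations.items():
        payoffs = {house: vals[house] - prices[house] for house in prices}
        best = max(payoffs.values())
        prefs[buyer] = {h for h, p in payoffs.items() if p == best}
    return prefs

def try_match(prefs):
    """Simple augmenting-path bipartite matching on the preferred-seller graph."""
    match = {}  # house -> buyer

    def augment(buyer, seen):
        for house in prefs[buyer]:
            if house in seen:
                continue
            seen.add(house)
            if house not in match or augment(match[house], seen):
                match[house] = buyer
                return True
        return False

    unmatched = [b for b in prefs if not augment(b, set())]
    return match, unmatched

def market_clearing_prices(valuations, prices):
    """Raise prices on over-demanded houses until a perfect matching exists."""
    prices = dict(prices)
    while True:
        prefs = preferred_sellers(valuations, prices)
        match, unmatched = try_match(prefs)
        if not unmatched:
            return prices, match
        # Unmatched buyers plus everyone reachable by alternating paths form
        # a constricted set; its neighborhood is the over-demanded houses.
        constricted = set(unmatched)
        frontier = list(unmatched)
        over_demanded = set()
        while frontier:
            buyer = frontier.pop()
            for house in prefs[buyer]:
                if house not in over_demanded:
                    over_demanded.add(house)
                    owner = match.get(house)
                    if owner is not None and owner not in constricted:
                        constricted.add(owner)
                        frontier.append(owner)
        for house in over_demanded:
            prices[house] += 1

# Toy example: three buyers, three houses (valuations are hypothetical).
valuations = {
    "young_couple": {"palo_alto": 12, "ardenwood": 10, "atherton": 8},
    "engineer":     {"palo_alto": 11, "ardenwood": 9,  "atherton": 7},
    "executive":    {"palo_alto": 14, "ardenwood": 8,  "atherton": 13},
}
prices = {"palo_alto": 0, "ardenwood": 0, "atherton": 0}
print(market_clearing_prices(valuations, prices))
```

In this toy run, everyone initially prefers the Palo Alto house (a constricted set), so its price rises until the other houses become equally attractive and a perfect matching appears.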

Since prices only rise within constricted sets, a good real estate investment strategy is to seek areas near Silicon Valley that are affordable at present and expected to attract people in the future. During the boom, a large proportion of the people flooding into the Silicon Valley area were experienced individuals over 40. Nowadays, computer science knowledge is renewed faster than ever, and the majority of people settling in the area are engineers fresh out of college or graduate school. Since computer science is one of the most promising industries today, this trend is expected to continue. Knowing this, investors should look to areas near Silicon Valley with good elementary school districts, such as Fremont’s Ardenwood area. Fremont’s high schools are no match for the nationally famous Gunn and Lynbrook high schools in Silicon Valley, but its elementary schools are ranked among the top in the state. Although real estate prices in the heart of Silicon Valley are still rising, they are rising at a much slower pace than before. Young couples with young children and limited savings tend to look for housing that will satisfy their present demands, namely elementary education. Although the weather and location of Silicon Valley are attractive and thus create a constricted set, the unreasonable prices are pushing more and more buyers to look for other options that will satisfy their current needs. The Ardenwood area’s prices are already climbing at a faster rate for this particular reason. In the future, as more and more children of educated couples enter the school district, the quality of the high schools and middle schools will begin improving too, and perhaps yet another Silicon Valley will be born.

Posted in Topics: social studies


Does Do Not Track mean a new distribution of online ads?

http://www.readwriteweb.com/archives/do_not_track_legislation_could_change_ad_landscape.php

The article starts by mentioning online advertising, including the possible Do Not Track bill and Facebook advertising/privacy issues, and then gets into push vs. pull marketing. With the Do Not Call list and spam filters in e-mail, the only form of push marketing left is junk mail. This is why online advertising has become so huge: it was a new form of pull marketing (specifically search-driven CPC). However, this form of advertising is now showing problems: the price to advertisers is rising because of the auction model, it is most effective for impulse-buy products (around $100 and under), and it is open to click fraud.

So what are the alternatives to CPC (cost per click)? There’s the old CPM (cost per thousand impressions) model. This worked when the web was static HTML pages, but on today’s more complicated websites it doesn’t work as well (for various reasons the article does not go into).

What does all this have to do with Do Not Track? Most ads are now shown to users based on cookies, or data that has been tracked about their recent searches and the web sites they’ve visited. This allows advertisers to display ads related to topics the user has been thinking about recently, which translates to more clicks because the ads are relevant. If Do Not Track goes through, it ends this practice (called behavioral targeting), which is currently the most effective way to advertise on the web. There are several competing possibilities to replace CPC if Do Not Track harms it as much as some expect. These alternatives include CPA (cost per action, which swings heavily in the advertiser’s favor because they only pay when users give them something, whether it’s a sale or an e-mail sign-up), VRM (Vendor Relationship Management, which is still being developed but centers on the user saying what they want and vendors responding to that; see http://projectvrm.org/ for more information), and a return to brand advertising (with new creativity from the advertisers).

Is the future of CPC really as bleak as this article makes it sound? One of the major flaws the article claims CPC has is rising prices due to the auction model. As we discussed in class, auction pricing of ad slots should produce a social-welfare-maximizing result, provided advertisers bid their true values for the slots (the optimal strategy under the VCG prices discussed in class). As long as advertisers don’t bid (or pay) more than their value for a slot, this shouldn’t be a problem. Competition will always drive profits lower, and if there are lots of companies competing for a few ad slots, prices will be high and profits low (think of the buyer-trader-seller networks with competition). This isn’t a problem specific to CPC, and a new model is not going to change that.
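
For reference, here is a minimal sketch of the generalized second-price (GSP) slot auction from class, showing how the auction pricing the article worries about actually assigns slots and per-click prices. The bids and clickthrough rates are made-up numbers.

```python
# A minimal sketch of a generalized second-price (GSP) slot auction.

def gsp_auction(bids, clickthrough_rates):
    """Assign slots to bidders in order of bid; each winner pays the
    next-highest bid per click (the GSP rule)."""
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    results = []
    for slot, ctr in enumerate(clickthrough_rates):
        if slot >= len(ranked):
            break
        advertiser, bid = ranked[slot]
        # Price per click is the bid of the advertiser one position below
        # (or zero if there is no one below).
        price = ranked[slot + 1][1] if slot + 1 < len(ranked) else 0.0
        results.append({
            "slot": slot + 1,
            "advertiser": advertiser,
            "expected_clicks": ctr,
            "price_per_click": price,
        })
    return results

# Hypothetical example: three slots with declining clickthrough rates.
bids = {"shoes.com": 4.00, "boots.com": 3.00, "sandals.com": 1.00}
ctrs = [10, 5, 2]  # expected clicks per hour for slots 1..3
for r in gsp_auction(bids, ctrs):
    print(r)
```

The point of the sketch is that the prices come out of the competing bids themselves, which is why heavy competition for a few slots, not the CPC model as such, is what pushes prices up.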

Do Not Track could still be a serious threat to CPC, as its effectiveness would decrease significantly. However, how is CPA really different? You would still want users to see relevant ads so they click on them and take an “action,” so Do Not Track would harm CPA as much as CPC.

What Do Not Track would actually do is take power away from the ad-serving companies (Google, DoubleClick; not the companies that are advertising), because their current power is their ability to match users and advertisers. This power is also evidenced by the advertisers’ complaints about rising prices mentioned earlier (again recalling the buyer-trader-seller graph: there are many more buyers and sellers, i.e., users and advertisers, than there are traders, i.e., ad-serving companies, so the ad-serving companies have the power). Losing the ability to match users to advertisers harms not just the ad-serving companies but the advertisers as well, because their ads would be less effective. So what is needed is a new way to match consumers to advertisers. The only solution mentioned in this article that offers a new way to match is VRM. This leads me to believe that VRM could be the new model for online ads, with the power in online advertising going to the company that successfully does the matching.

Posted in Topics: Education, Technology


Video Road Hogs Stir Fear of Internet Traffic Jam

http://www.nytimes.com/2008/03/13/technology/13net.html?scp=1&sq=internet+networks&st=nyt

This article brings into focus the alarm that has been brewing over the past months about the staggering rise of Internet data. Because of the flashy methods now used to communicate, such as video and phone calls over the internet, as well as the sharing of video clips, pictures, movies, interactive gaming, and social networking websites, more bandwidth is being used than ever before. The alarm stems from research suggesting that by 2011, internet demand may even exceed network capacity; in all probability, the true effect would be slower downloading rather than an outright network crash. Others feel there may not be such a problem, since technology and innovation around internet access keep improving. Much also depends on people’s ability to afford and obtain tools such as cables and high-speed networks in order to maintain fast downloading, which is also a main reason the internet isn’t as fast in other nations as it is here in the United States. Thus, many believe the internet can and will be salvaged by technology, while others believe that the sheer number of websites, links, videos, games, and interactions leaves the internet vulnerable to many lost opportunities.

The connection between this article and the class material can be seen through the ideas of information networks, nodes, hyperlinks, and edges. There is a difference between the World Wide Web and the Internet: the World Wide Web is built using the technology of the Internet. The web is just an application of the Internet, even though it is the prevailing one, consisting of a virtual network made up of pages and hyperlinks. The pages can be viewed as nodes, each connected by hyperlinks, or edges. Hyperlinks allow the user to move from webpage to webpage by clicking the text or image serving as the link. These nodes and edges form a directed graph: each webpage (node) points to other nodes through its hyperlinks (edges). The Internet has continued to grow so rapidly because of its technical advances and because of the ever-increasing number of nodes and edges being formed around the World Wide Web; the massive increase in activity has led to estimates of roughly 100% annual growth in web usage. Thus the dilemma introduced here, the possible downfall of the Internet, can be directly related to the growth of these nodes and edges. The accumulation of these fundamental web elements has generated more bandwidth use, especially as more interactive, creative websites are formed. Before search engines, people had to “crawl” through pages and pages to find the information they wanted; since then, search engines have made every website easy to find, and many more sites have been created as a result. So the World Wide Web, and the Internet as a whole, has grown very rapidly, causing the “fear of internet traffic jam.”

So, with bandwidth use drastically increasing as a result of the information network elements, the nodes and edges that connect billions of pages, society will need more innovation that allows for such a high volume of downloads while maintaining the same downloading speed, in order for the World Wide Web as we know it to continue to exist efficiently.
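
As a small illustration of the directed-graph view described above, here is a sketch that represents a handful of made-up pages as nodes and their hyperlinks as directed edges; the page names are placeholders, not real sites.

```python
# The Web as a directed graph: pages are nodes, hyperlinks are directed
# edges, stored here as an adjacency list (each page maps to the pages it
# links to).

web = {
    "news_site": ["video_page", "photo_page"],
    "video_page": ["news_site"],
    "photo_page": ["video_page"],
    "blog": ["news_site", "video_page"],
}

def out_links(page):
    """Pages reachable in one click (outgoing edges)."""
    return web.get(page, [])

def in_links(page):
    """Pages that link here (incoming edges)."""
    return [p for p, targets in web.items() if page in targets]

print("from blog you can click to:", out_links("blog"))
print("pages linking to video_page:", in_links("video_page"))
```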

Posted in Topics: Education


Road coloring theorem solved

The road coloring conjecture was recently proved, after almost four decades, by an Israeli mathematician. Informally, the theorem states that for a suitable finite network of roads (roughly, a directed graph in which every intersection has the same number of outgoing roads and which is not periodic), it is possible to draw a universal color-coded map: following one fixed sequence of colors leads to a particular destination regardless of the point of origin. In other words, there is a single set of synchronized instructions that takes you to a certain node no matter which node in the network you start from.

“For example, consider the vertex marked in yellow. No matter where in the graph you start, if you traverse all nine edges in the walk “blue-red-red—blue-red-red—blue-red-red”, you will end up at the yellow vertex. Similarly, if you traverse all nine edges in the walk “blue-blue-red—blue-blue-red—blue-blue-red”, you will always end up at the vertex marked in green, no matter where you started.”

[Figure: directed graph with a synchronizing coloring]

This result has many real-world implications, from message and traffic routing to search algorithms to finding directions. For networks like the trading networks discussed in class, it shows that it is always possible to find or trace a route to a certain node from any other node, so the flow of information within a network can be guided by a single set of synchronized instructions that works for every starting node. Lost emails could be tracked down, and directions to a friend’s house would be the same from anywhere in the network. With regard to search algorithms and online advertising, this process could even be manipulated to direct traffic toward a particular site.
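
As a toy illustration of what a synchronizing instruction sequence does, here is a sketch on a small made-up edge-colored graph (not the specific graph from the quoted example); it checks that one fixed color sequence sends every starting node to the same place.

```python
# Checking a synchronizing instruction sequence on a small edge-colored
# directed graph. Each node maps each color to the node that edge leads to.
# The graph below is made up for illustration.

graph = {
    "A": {"blue": "B", "red": "C"},
    "B": {"blue": "C", "red": "A"},
    "C": {"blue": "A", "red": "C"},
}

def follow(start, instructions):
    """Follow a sequence of colored edges from a starting node."""
    node = start
    for color in instructions:
        node = graph[node][color]
    return node

def is_synchronizing(instructions):
    """True if every starting node ends at the same node."""
    endpoints = {follow(node, instructions) for node in graph}
    return len(endpoints) == 1

word = ["blue", "red", "red"] * 3
print(is_synchronizing(word), follow("A", word))
```

For this particular graph, the nine-step "blue-red-red" walk lands every starting node on node C, mirroring the kind of universal instructions described in the quote.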

Referenced articles:

http://en.wikipedia.org/wiki/Road_coloring_problem

http://ap.google.com/article/ALeqM5g2lh1_jNDbrmhNoMlwkZTfLeCw8gD8VHBPIO0

http://arxiv.org/pdf/0709.0099v4

Posted in Topics: Education, Mathematics


Viral Marketing

Viral marketing is an advertising technique that uses social networks to promote brand awareness or sales. Its distinguishing point is that advertisers try to spread their message through self-replicating processes, hence the name “viral” marketing, since the idea of self-replication is similar to a computer virus. The tools advertisers use include television, but more importantly the internet. Posting a video on a popular media site such as YouTube is essentially free, and it has the potential to be seen by many more people than a 30-second television ad. The idea of self-replication applies to viewers who see the ad and voluntarily pass the message along to others. The key here is “voluntarily.” Thus advertisers try to embed their message within things that may catch the viewer’s attention. I’m sure most people have seen one of the many flash ads designed as interactive games that you can play right in your browser. You might predict that such advertisements draw in a relatively small crowd, since most people do not want to stop what they are doing to play a simple game usually involving a few mouse clicks. It may be surprising, then, that on average 50% of people who see such “advergames” stop what they are doing to play, and the average playing time is 25 minutes! Advergames are just one of the many techniques advertisers employ to attract a customer’s attention; you can read more about them at http://en.wikipedia.org/wiki/Advergaming. Other forms of viral promotion include videos, online books, email, software, and images.

Research shows that the average customer will tell three people about a product marketed to them in a way they find interesting. This effect can cascade and grow exponentially, which is exactly what viral marketing tries to take advantage of. By targeting social networks with a high potential to spread viral messages, viral marketers are always looking for ways to spread their message quickly and effectively.
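
A toy branching-process sketch of that word-of-mouth cascade, assuming every recipient really does tell exactly three new people (a simplification of the claim above, ignoring overlap between audiences), shows how quickly the numbers grow.

```python
# Toy branching-process model of a word-of-mouth cascade: each person who
# receives the message passes it to a fixed number of new people. The
# fan-out of 3 comes from the "tell three people" claim above.

def cascade_size(fan_out, generations):
    """Total people reached if every recipient tells `fan_out` new people."""
    reached = 1  # the original viewer
    current = 1
    for _ in range(generations):
        current *= fan_out
        reached += current
    return reached

for g in range(1, 6):
    print(f"after {g} generations: {cascade_size(3, g)} people reached")
```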

You can read more about viral marketing at http://www.tamingthebeast.net/articles/viralmarketing.htm

Posted in Topics: Education


Google announces ad loading times will affect their AdWords Quality Score

As we learned in class, search engines such as Google harness the widespread demand for keyword-based advertising to generate bucketfuls of revenue. Though we discussed many of the details surrounding the use of generalized second-price auctions to match slots with advertisers, we assumed fixed clickthrough rates for slots and therefore did not take into account the ramifications of ad quality for keyword-based advertising.

However, Google cares very much about ad quality. It especially does not want advertisers with low-quality ads (which will have low clickthrough rates) to win slots with very high prices per click. Instead, Google wants those expensive slots to go to advertisers with high-quality ads that users will click on more often, so it can make more money. To solve this problem, Google has implemented a Quality Score system: high-quality ads receive higher Quality Scores and are automatically placed in better positions where they are more noticeable to users. Intuitively, this system rewards both Google and the advertisers with high-quality ads.
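
Google does not publish its exact ranking formula, but a common way to think about a Quality Score system is to rank ads by bid weighted by an estimated quality. Here is a minimal sketch under that assumption, with made-up ads and scores; it is not Google's actual implementation.

```python
# Ranking ads by bid weighted by a quality score, in the spirit of the
# Quality Score system described above. Numbers are hypothetical.

def rank_ads(ads):
    """Order ads by bid * quality_score, so a cheap high-quality ad can
    outrank an expensive low-quality one."""
    return sorted(ads, key=lambda ad: ad["bid"] * ad["quality_score"],
                  reverse=True)

ads = [
    {"name": "slow_low_quality_ad", "bid": 5.00, "quality_score": 0.2},
    {"name": "fast_high_quality_ad", "bid": 2.00, "quality_score": 0.9},
]
for position, ad in enumerate(rank_ads(ads), start=1):
    print(position, ad["name"])
```

Under this weighting, the cheaper but higher-quality ad takes the top position, which is exactly the incentive the Quality Score system is meant to create.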

Recently, Google declared advertisement landing-page load times to be yet another indicator of ad quality, deciding to punish advertisers with slow pages by lowering their Quality Score and to reward advertisers with fast pages by doing the opposite. Google especially benefits from fast-loading ads: users accustomed to fast-loading ads, and to a better overall web experience with Google, will over time click on more of the ads displayed in Google’s search results, generating more revenue for Google. The new evaluation system also benefits advertisers, because users are less likely to abandon fast websites.

 

Referenced Articles:

http://www.seroundtable.com/archives/016457.html

http://adwords.google.com/support/bin/answer.py?answer=87144

 

Posted in Topics: Education


Developing an Exchange Network Simulator

http://links.jstor.org/sici?sici=0731-1214%28199524%2938%3A4%3C519%3ADAENS%3E2.0.CO%3B2-O

The above link will direct you to an article by Barry Markovsky from Sociological Perspectives. Markovsky created a model for predicting which nodes have power in Network Exchange Theory experiments. This article discusses his approach to creating “X-Net,” a computer program that simulates outcomes of resource exchanges in networks. His program uses an “iterative” rather than “analytic” approach that he says produces more realistic, experimental results.

Markovsky had little programming experience, but he used QuickBASIC, Microsoft’s dialect of the BASIC programming language. To feed network graphs to the computer, he used matrices of 1’s and 0’s to indicate the nodes and edges in the network. The computer model could simulate Network Exchange Theory well because it satisfied one condition of the theory: that all nodes have a common objective and do not demonstrate idiosyncratic behavior. However, the theory does not explain how negotiations should take place or how a computational model should handle the individual interactions between nodes. Markovsky approached this problem by dividing his program into various levels. The top level was the most general (the experiment as a whole), and the lowest was the most specific (the “actions” level, which included the offers and counteroffers made by the nodes). X-Net removes nodes that have already made exchanges from the network until no nodes remain.
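
To illustrate the matrix input format described above, here is a minimal sketch (in Python rather than QuickBASIC) of a 1/0 adjacency matrix for a four-node line network; the network is a standard example from class, not one taken from Markovsky's paper.

```python
# Encoding an exchange network as a matrix of 1's and 0's: entry [i][j] is 1
# when node i and node j share an edge and can negotiate with each other.

# Line network A - B - C - D (the classic 4-node example from class).
nodes = ["A", "B", "C", "D"]
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

def neighbors(i):
    """Indices of the nodes that node i can exchange with."""
    return [j for j, connected in enumerate(adjacency[i]) if connected]

for i, name in enumerate(nodes):
    print(name, "can negotiate with", [nodes[j] for j in neighbors(i)])
```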

It is interesting to see how a computational model can run experiments on networks. It makes sense to use a computer to carry out the many rounds of each experiment, which involve many iterative processes. I am still uncertain where X-Net falls between practical experiments and mathematical predictions for Network Exchange Theory, and I would like Markovsky to elaborate on how his model’s results compare to theory and to practical tests. It would be interesting to use his program to simulate some of the network exchange experiments we discussed in class and compare the results to our predictions. I searched for a download of X-Net on Google but could not find one. Markovsky says his model is important and can be used to make generalizations about network exchange, but it would be risky to make these generalizations without support from both theoretical and practical data.

Posted in Topics: Education
