The Six Degrees of Wikipedia

A friend recently sent me a link to The Six Degrees of Wikipedia, an online tool that finds the shortest path between any two articles on Wikipedia using only the links within the articles themselves. While certainly an addictive little procrastination tool (seriously, try to “beat” it), it is interesting to consider in the context of what we have recently been studying in class. The website is not synchronized with the current Wikipedia database all that often, but the Wikipedia homepage currently claims 1,755,365 articles in English.

Obviously, the number of articles on Wikipedia is far smaller than the world population, but the average number of wiki links per Wikipedia article is probably also far smaller than the number of people a typical person knows (although I was unable to find any real data on this). For example, not including articles on dates and years (an option of the Six Degrees tool), the article “Jon Kleinberg” has 11 outlinks to other Wikipedia articles (sadly, the article must be too recent to be included in the database searched by the tool). With the current population of the world (according to Google) being around 6,525,170,264, roughly 3,700 times the number of English articles on Wikipedia, we find that Wikipedia is an even smaller world than the world’s social network, assuming Professor Kleinberg’s Wikipedia entry is indicative of the average entry and that he doesn’t know 40,000+ people on a first-name basis.
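The scale comparison can be checked directly from the figures quoted above. This is only a quick sketch of the arithmetic; the variable names are mine, and the numbers are the snapshots cited in this post, not current values.

```python
# Figures quoted in the post (snapshots, not current values).
world_population = 6_525_170_264
english_articles = 1_755_365
kleinberg_outlinks = 11  # outlinks counted on the "Jon Kleinberg" article

# How many people there are for every English Wikipedia article.
people_per_article = world_population / english_articles

# Acquaintances a person would need for the social network to be as
# densely linked, relative to its size, as an 11-outlink article.
equivalent_acquaintances = kleinberg_outlinks * people_per_article

print(f"people per article: {people_per_article:,.0f}")
print(f"equivalent acquaintances: {equivalent_acquaintances:,.0f}")
```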

What could make up for this, however, is that the articles on Wikipedia don’t necessarily have anything in common. All people, or at least most of us, are social beings. We are all part of the same large social network and inhabit the same world, breathe the same air and drink the same water. It is true that all articles on Wikipedia are, well, articles on Wikipedia - they at least have that in common. But what, exactly, would the band Dispatch have in common with Torrenza, a technological initiative/project by Advanced Micro Devices? Yet, plugging in “Dispatch (band)” and “Torrenza” gives just 4 degrees of separation:

Dispatch (band)

1990s

AOL

Advanced Micro Devices

Torrenza

which is somewhat remarkable.

The return path is similarly short, requiring only 5 steps:

Torrenza

Computer hardware

Floppy disk

Zimbabwe

Elias Fund

Dispatch (band)

Apparently, the floppy disk is somehow related to Zimbabwe, which is related to a nonprofit organization created (or inspired) by Dispatch, the jam band from Boston. Again, somewhat unbelievable, but looking through the path articles proves all of this.
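I don't know how the tool actually computes its paths, but breadth-first search is the standard way to find a shortest path when every link counts as one step. Below is a minimal sketch on a toy graph containing only the links from the two paths above (the real link graph is vastly larger, and `links` and `shortest_path` are names I made up for illustration). Because links are directed, the path there and the path back need not have the same length.

```python
from collections import deque

# Toy directed link graph: only the links reported in the two paths above.
links = {
    "Dispatch (band)": ["1990s"],
    "1990s": ["AOL"],
    "AOL": ["Advanced Micro Devices"],
    "Advanced Micro Devices": ["Torrenza"],
    "Torrenza": ["Computer hardware"],
    "Computer hardware": ["Floppy disk"],
    "Floppy disk": ["Zimbabwe"],
    "Zimbabwe": ["Elias Fund"],
    "Elias Fund": ["Dispatch (band)"],
}

def shortest_path(graph, start, goal):
    """Breadth-first search; returns a list of titles, or None if unreachable."""
    parent = {start: None}          # also serves as the visited set
    queue = deque([start])
    while queue:
        page = queue.popleft()
        if page == goal:
            # Walk the parent pointers back to the start.
            path = []
            while page is not None:
                path.append(page)
                page = parent[page]
            return path[::-1]
        for nxt in graph.get(page, []):
            if nxt not in parent:
                parent[nxt] = page
                queue.append(nxt)
    return None

there = shortest_path(links, "Dispatch (band)", "Torrenza")
back = shortest_path(links, "Torrenza", "Dispatch (band)")
print(len(there) - 1, "links:", " -> ".join(there))
print(len(back) - 1, "links:", " -> ".join(back))
```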

While there is not nearly enough data available to actually create a model justifying this small-world phenomenon on Wikipedia, we could possibly come up with one if we knew the average number of outlinks and inlinks for any given article (outlinks being links to other Wikipedia articles and inlinks being links from other Wikipedia articles). Using the same intuition we used when first discussing the world’s social network, the average number of outlinks should equal the average number of inlinks given our closed network - every outlink from one article must be an inlink to another. From this information we could possibly determine the average redundancy (i.e., the average number of outlinks that are not directly shared by an article’s “neighbors,” with its “neighbors” being the articles it directly links to) and thus create a fairly accurate model predicting the average “degree of separation” between any two Wikipedia articles.
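To get a feel for what such a model might predict, here is a very rough sketch. It assumes every article contributes the same number of non-redundant outlinks k, so that about k^d articles are reachable in d steps and the typical separation is around ln N / ln k. The function name and the alternative values of k are hypothetical; only 11 comes from the Kleinberg article, and real link counts vary enormously from article to article. With k = 11 and the article count above, the estimate lands at roughly six. The last few lines check the claim that total outlinks equal total inlinks on a tiny made-up graph.

```python
import math
from collections import Counter

def estimated_separation(n_articles, avg_new_links):
    """Steps d at which avg_new_links ** d reaches n_articles (idealized)."""
    return math.log(n_articles) / math.log(avg_new_links)

n_articles = 1_755_365  # English article count quoted in the post

# 11 is the outlink count of the "Jon Kleinberg" article;
# 5 and 20 are hypothetical values for comparison.
for k in (5, 11, 20):
    print(k, "links per article ->", round(estimated_separation(n_articles, k), 1), "steps")

# Every outlink is some article's inlink, so the totals (and therefore
# the per-article averages) must match in a closed network.
toy = {"A": ["B", "C"], "B": ["C"], "C": []}
total_outlinks = sum(len(targets) for targets in toy.values())
inlink_counts = Counter(t for targets in toy.values() for t in targets)
total_inlinks = sum(inlink_counts.values())
print(total_outlinks, total_inlinks)
```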

Posted in Topics: Education
