Websites as Graphs

This site provides a way of visually modeling the HTML tags throughout a website. The nodes are color coded to represent different types of tags - links, images, DIV, line breaks, etc. The model is created via a Java applet, which lets you watch the graph as it’s created. The nodes and edges for larger sites take longer to “fall into place.” There is a Flickr site where people can post images of their own graphs.

Comparing a site to its visual model can be revealing. For example, the model for http://www.spinning-jennie.com, a personal blog, has a large, symmetrical central cluster of primarily orange nodes (orange is for line breaks and block quotes). The focus on orange is not surprising for a blog, which is essentially a lot of formatted text with optional links and images.
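To make the idea concrete, here is a minimal Python sketch of how a page's tags could be turned into a colored graph. It is not the applet's actual code. It simply treats every tag as a node and draws an edge from each tag to the tag that encloses it. The orange and red categories come from the descriptions in this post; the other colors, the `TAG_COLORS` table, and the `build_tag_graph` helper are my own assumptions for illustration.

```python
from html.parser import HTMLParser

# Color per tag. Orange (line breaks, block quotes) and red (tables) are
# taken from the post; every other entry is an assumption.
TAG_COLORS = {
    "br": "orange", "blockquote": "orange",
    "table": "red", "tr": "red", "td": "red",
    "a": "blue",        # assumed: links
    "img": "purple",    # assumed: images
    "div": "green",     # assumed: DIV
    "form": "yellow", "input": "yellow",  # assumed: forms
}

# Tags that never have children, so they are never pushed on the stack.
VOID_TAGS = {"br", "img", "input", "hr", "meta", "link"}


class TagGraphBuilder(HTMLParser):
    """Collects one node per tag and one edge per parent/child pair."""

    def __init__(self):
        super().__init__()
        self.nodes = []    # (tag, color); the list index is the node id
        self.edges = []    # (parent_id, child_id)
        self._stack = []   # ids of the currently open tags

    def handle_starttag(self, tag, attrs):
        node_id = len(self.nodes)
        self.nodes.append((tag, TAG_COLORS.get(tag, "gray")))
        if self._stack:
            self.edges.append((self._stack[-1], node_id))
        if tag not in VOID_TAGS:
            self._stack.append(node_id)

    def handle_endtag(self, tag):
        # Close back to the nearest matching open tag; ignore stray closers.
        for i in range(len(self._stack) - 1, -1, -1):
            if self.nodes[self._stack[i]][0] == tag:
                del self._stack[i:]
                break


def build_tag_graph(html):
    """Hypothetical helper: returns (nodes, edges) for an HTML string."""
    builder = TagGraphBuilder()
    builder.feed(html)
    builder.close()
    return builder.nodes, builder.edges
```

Feeding the resulting nodes and edges to any force-directed layout would give the "falling into place" effect. A text-heavy blog page would produce mostly orange nodes, while a table-based layout would produce mostly red ones.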

You can compare this to a less clustered graph such as the one for the BBC’s homepage. This model shows a site with many more distinct pages and a wide variety of content (images, tables, outside links, etc.). It seems that branches of the tree with mostly dead-end blue nodes represent pages that consist mostly of links to outside sites. In the upper right of the BBC graph, a cluster of blue nodes leading to dead-end purple nodes likely represents a page with a lot of links to individual BBC photos.

One can also make inferences about the structure of a site and the style of the person who coded it - the BBC site is laid out with a lot of tables (red nodes), as compared to another site which might use frames or CSS positioning.
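The same kind of inference can be made without drawing anything, just by counting tags. The sketch below is only an illustration: the `layout_profile` helper and its three buckets (table, div, frame) are my own assumptions about what is worth counting, not something the modeling site provides.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Counts how many times each opening tag appears in a page."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        self.counts[tag] += 1


def layout_profile(html):
    """Hypothetical helper: tallies the tags that hint at layout style."""
    parser = TagCounter()
    parser.feed(html)
    parser.close()
    c = parser.counts
    return {
        "table": c["table"] + c["tr"] + c["td"] + c["th"],
        "div": c["div"],
        "frame": c["frameset"] + c["frame"] + c["iframe"],
    }
```

A page whose profile is dominated by the table bucket was probably laid out with tables, which is what the many red nodes in the BBC graph suggest.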

I ran the class weblog through the modeler and got this. There is a large cluster of blue and grey nodes off to the side, then a sprawling tree of oranges and greens. Yellow on one side likely represents the input form for new entries. I’m not sure what the circle of blues represents - a page with ONLY links and no formatting is rare.

Posted in Topics: Technology

2 Responses to “Websites as Graphs”

  1. The Humor of Networks at forever-digital Says:

    […] This can easily be seen in the following images generated on Websites as Graphs thanks to a tip from catrionag (click to enlarge): The above image shows the network of a search for the term “Networks”. To test out the theory presented in the xkcd comic, I decided to go on an adventure, click on links within articles, and see where it took me. Within 12 clicks I arrived at “Cadbury Creme Egg”. For comparison, take a look at the following image showing the Websites As Graphs version of the “Networks” page and the “Cadbury Creme Egg” page. The networks are nearly identical. Yet, despite that, if I challenged you to go from “Networks” to “Cadbury Creme Egg” in those same 12 clicks, odds are it would take you an awfully long time to do it. Go ahead - try it, then tomorrow you can tell your professor you couldn’t finish your homework because you were too busy looking for Cadbury Creme Eggs. […]

  2. Cornell Info 204 - Networks » Blog Archive » The Humor of Networks Says:

    […] This can easily be seen in the following images generated on Websites as Graphs thanks to a tip from catrionag (click to enlarge): The above image shows the network of a search for the term “Networks”. Note the connectedness. So far Bellomi and Bonato’s statement is holding up. There are no “islands”. Everything is connected, even if it’s only by that one lonely node between the two giant clusters. The central nodes, being the pages that are more like reference lists, help to connect you from what you were looking for to something completely random and seemingly unrelated. […]


