Flaws in the PageRank Algorithm

Link: http://en.wikipedia.org/wiki/PageRank

 

I find the PageRank system that we discussed in class to be overly simple. Yes, it works for the small networks we looked at, but I really didn't understand how it could be used on such a large scale. The Wikipedia page that I found on this algorithm goes into depth on how it is put into practice, and it looks closely at the damping factor. I found it interesting that the damping factor is there to account for random clicking: it is the probability that a person who is clicking on links at random keeps clicking rather than stopping. But can it not be said that all the links you click on are random clicks? Even though the damping factor drives down the scores, I still feel that it overcompensates for the random clicking. Looking at the equation that is given, the probabilities seem completely arbitrary as well. Does someone realistically go through all the possible web pages for a certain subject and estimate the probability that a person clicked on each one by mistake? So is the damping factor really doing anything that is not completely subjective? Google can only guess at these probabilities, and that fact alone gives the PageRank system some variability.
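To make the role of the damping factor more concrete, here is a minimal sketch of the iterative formula from the Wikipedia page. The four-page network `toy_web`, the function name `pagerank`, and the three values of `d` are my own assumptions for illustration; only 0.85 is the value the Wikipedia page says is generally assumed. Note that in this basic formula `d` is a single number applied to every page, not a separate probability estimated for each page.

```python
def pagerank(links, d=0.85, iterations=100):
    """Iterative PageRank. `links` maps each page to the pages it links to.

    Every link target must also appear as a key in `links`.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal scores

    for _ in range(iterations):
        # Every page gets a baseline (1 - d) / n from "random jumps".
        new_rank = {page: (1.0 - d) / n for page in pages}
        for page, outlinks in links.items():
            # A page with no outlinks spreads its score over all pages.
            targets = outlinks if outlinks else pages
            share = d * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical toy network: nothing links to D, and three pages link to C.
toy_web = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}

for d in (0.5, 0.85, 0.95):
    scores = pagerank(toy_web, d=d)
    print(d, {page: round(score, 3) for page, score in scores.items()})
```

In this sketch, page D has no incoming links, so its score is exactly the baseline (1 - d) / n. Changing `d` shifts how much of every score comes from links and how much from the baseline, which is one way to see how much the choice of that single number matters.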

 

The Wikipedia page also goes into the problem of false and manipulated PageRanks. The algorithm is flawed in that once someone figures out the system, there seems to be no way to stop them from getting high PageRanks. Ultimately, there could be topics where the relevance of every page that comes up in the rankings is close to zero. An advertising firm could potentially take over a searched topic, crowding out the genuine links and replacing them with its own by creating pages that link to each other. All it takes is for the firm to get a link on one highly rated page and then link to itself through as many pages as it is willing to create. I would not be surprised if there were a group of advertisers doing this successfully on topics that are not as popularly searched. Google has no way to check every single searched topic for spam, and advertisers know that. Although the algorithm seems to work, I feel that eventually every search will end up being manipulated by advertisements, and no one will get the relevant information they are actually looking for.
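The scenario above can be tried out on a toy network. This is only a sketch under the basic PageRank formula: the page names (`hub`, `spam`, `farm0`, ...), the helper `build_web`, and the farm sizes are all hypothetical, and it says nothing about what defenses a real search engine applies on top of the formula.

```python
def pagerank(links, d=0.85, iterations=100):
    """Basic iterative PageRank; `links` maps each page to its outlinks."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - d) / n for page in pages}
        for page, outlinks in links.items():
            # Pages without outlinks spread their score over all pages.
            targets = outlinks if outlinks else pages
            share = d * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


def build_web(farm_size):
    """Small honest network plus a spam page backed by `farm_size` farm pages."""
    farm = [f"farm{i}" for i in range(farm_size)]
    web = {
        "hub": ["a", "b", "spam"],  # the one highly rated page the firm got into
        "a": ["hub"],
        "b": ["hub"],
        "c": ["hub", "a"],
        "spam": farm,               # the spam page links out to its own farm
    }
    for page in farm:
        web[page] = ["spam"]        # every farm page links back to the spam page
    return web


for farm_size in (0, 5, 20):
    scores = pagerank(build_web(farm_size))
    top = max(scores, key=scores.get)
    print(f"farm_size={farm_size}: top={top}, "
          f"spam={scores['spam']:.3f}, hub={scores['hub']:.3f}")
```

In this toy setup the spam page and its farm pass score back and forth between themselves, so the spam page overtakes the honest hub once the farm is large enough. It is a simplified picture, but it shows why pages that only link to each other are a concern for a purely link-based score.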

Posted in Topics: Education
