Google Keeps Tweaking Its Search Engine

http://www.nytimes.com/2007/06/03/business/yourmoney/03google.html?pagewanted=2&sq=pagerank&st=nyt&scp=1

This article from the New York Times discusses Google’s ongoing mission to improve its search engine. Specifically, the company aims to better satisfy its users and reduce the number of people who leave the site without finding what they were looking for. To close this “gap between often (finding what you want) and always,” Google engineers are working to improve the “ranking algorithm” with a solution they have dubbed QDF (“Query Deserves Freshness”).

There are several problems Google has faced during this “tweaking” process:

- The sheer scale of web pages and users the site must constantly deal with

(This relates to the “problem of abundance” discussed in class)

- Fraudulent websites filled with ads, pornography, or financial scams

(This relates to the game-theoretic aspect of web search discussed in class: since the world reacts to what Google does, people can take advantage of this by writing pages designed to score highly)

- Figuring out exactly what the user is searching for (e.g., “apples” usually means the user is searching for fruit, while “apple” usually means the Mac computer)

(This relates to the Intent of the Searcher problem discussed in class: it is often not clear from a one-word query what the searcher is looking for)

- “Freshness” - how many recently constructed/changed pages should be included in the results?

(We discussed in class that web pages change rapidly)

While all of the problems mentioned above have posed significant challenges, the problem of freshness has proven to be particularly troublesome. Is it better to provide the most up-to-date information or to display pages that have proven reliable over time? Until recently, Google has preferred the latter. The company, however, is now working on a solution so that the choice is not so black and white. QDF is “a mathematical model that tries to determine when users want new information and when they don’t.” Although the exact mechanism behind QDF has yet to be completed, Google engineers believe the solution lies in determining whether or not a topic is “hot.” More specifically, if numerous web sites are all dealing extensively with a particular topic at the same time, QDF reasons that users will want the latest information on that subject.
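To make the “hot topic” idea concrete, here is a minimal sketch in Python. It is not Google’s QDF model, whose details are not public; it only illustrates the reasoning described above. The function names (`is_hot`, `rank_score`), the inputs, and the tuning values (`recent_days`, `spike_ratio`, `fresh_weight`) are all assumptions made up for this illustration.

```python
from statistics import mean


def is_hot(daily_mentions, recent_days=2, spike_ratio=3.0):
    """Guess whether a topic is "hot" from how many new pages mention it.

    daily_mentions: counts of new pages mentioning the topic, oldest first.
    recent_days and spike_ratio are hypothetical tuning knobs.
    """
    if len(daily_mentions) <= recent_days:
        return False  # not enough history to form a baseline
    baseline = mean(daily_mentions[:-recent_days])
    recent = mean(daily_mentions[-recent_days:])
    if baseline == 0:
        # No earlier activity: any recent mention counts as a spike.
        return recent > 0
    return recent / baseline >= spike_ratio


def rank_score(reliability, freshness, hot, fresh_weight=0.7):
    """Blend long-term reliability with freshness (both assumed in [0, 1]).

    For a hot topic, freshness dominates; otherwise only reliability counts,
    which mirrors the older "prefer proven pages" behavior.
    """
    if hot:
        return fresh_weight * freshness + (1 - fresh_weight) * reliability
    return reliability
```

For example, `is_hot([10, 12, 9, 11, 40, 55])` treats the jump in the last two days as a spike, while a flat series such as `[10, 12, 9, 11, 10, 12]` does not. A real system would presumably use far richer evidence than raw page counts.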

The latter half of the article goes on to describe the Google search engine’s method of ranking pages and how the firm’s engineers are working to improve other aspects of search quality as well. Currently, Google uses a system of “signals” and “classifiers.” It ranks pages using more than 200 types of information, or “signals.” PageRank, the ranking method discussed in class, is only one of these signals. The collected signals then feed into “classifiers,” formulas that attempt to infer useful information about the search in order to send the user to the most relevant, helpful web sites. While building this elaborate system, Google also had to find ways to account for user typos and ambiguous search terms.
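The interplay of signals and classifiers can be sketched roughly as follows. This is a toy illustration, not Google’s actual formula: the three signal names, the weights, and the one-rule “classifier” are assumptions invented here, standing in for the 200-plus signals and the real classifiers the article mentions.

```python
def classify_query(query):
    """Toy "classifier": infer one fact about the query (is it a single word?)."""
    return {"is_short": len(query.split()) == 1}


def score_page(signals, weights):
    """Weighted sum of a page's signals; a missing signal counts as 0."""
    return sum(weights[name] * signals.get(name, 0.0) for name in weights)


def rank(pages, query):
    """Order pages by score, letting the classifier output adjust the weights."""
    info = classify_query(query)
    # Hypothetical default weights for three made-up signals.
    weights = {"pagerank": 0.5, "title_match": 0.3, "freshness": 0.2}
    if info["is_short"]:
        # Assumption: a one-word query says little about intent,
        # so lean more heavily on link-based authority.
        weights = {"pagerank": 0.7, "title_match": 0.2, "freshness": 0.1}
    return sorted(pages, key=lambda p: score_page(p["signals"], weights), reverse=True)


pages = [
    {"url": "a", "signals": {"pagerank": 0.9, "title_match": 0.1, "freshness": 0.1}},
    {"url": "b", "signals": {"pagerank": 0.2, "title_match": 0.9, "freshness": 0.9}},
]
short_order = [p["url"] for p in rank(pages, "apple")]
long_order = [p["url"] for p in rank(pages, "apple pie recipe")]
```

With these made-up numbers, the one-word query ranks page "a" first because PageRank dominates, while the longer query ranks page "b" first because title match and freshness carry more weight. The point is only that the same signals can produce different orderings depending on what the classifier infers about the search.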

Google clearly has a seemingly never-ending amount of work on its hands as it continues its mission of creating the most efficient, helpful search engine. As the Internet constantly expands and changes, Google must continue to adapt and develop solutions to the problems that accompany this growth. New difficulties, in addition to those mentioned above, will inevitably arise, and the search engine will most likely have to become increasingly complex. Moreover, Google must constantly update and revise its search engine in the presence of competitors such as Yahoo! and Microsoft. If Google does manage to implement QDF, along with other solutions, and finally close the “gap between often and always,” the details will most likely remain a mystery, as secrecy is needed to protect the coveted solution from competitors and fraudulent Web designers. Before Google reaches this point, however, it seems to have an indefinite amount of “tweaking” yet to do.

Posted in Topics: Education
