Community:Search/StemmedFields

From NSDLWiki


The Lucene search engine supports the concept of "stemmed" search fields.

In these fields, each text token is reduced to its "stemmed" form before it is stored in the index, and each search term is stemmed the same way before it is matched against the index. The result is a sort of "fuzzy" search, but one that carries more semantic content than a simple statistical match of search terms to tokens, because words are matched through a shared root form.

So, for example, a text field of "deformed frogs" would be stemmed to "deform frog", and stored that way in the index. It could then be retrieved by searches against any of the following words:

  • deform
  • deformed
  • deforming
  • deforms
  • frog
  • frogs
  • frogging
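
The behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: toy_stem is a hypothetical, heavily simplified suffix stripper (it is not the Porter algorithm and not Lucene code), and StemmedIndex is a hypothetical in-memory stand-in for a stemmed index field. The point it shows is that the same stemming function is applied both when text is stored and when a search term is looked up.

```python
from collections import defaultdict

VOWELS = "aeiou"


def toy_stem(word):
    """Hypothetical, simplified stemmer for illustration only (NOT Porter)."""
    word = word.lower()
    for suffix in ("ing", "ed"):
        if word.endswith(suffix):
            stem = word[: -len(suffix)]
            # Only strip the suffix if a vowel remains, so "sing" stays "sing".
            if not any(c in VOWELS for c in stem):
                return word
            # Undouble a trailing double consonant: "frogg" -> "frog".
            if len(stem) >= 2 and stem[-1] == stem[-2] and stem[-1] not in "aeioulsz":
                stem = stem[:-1]
            return stem
    # Strip a plain plural or verb "s", but leave words ending in "ss" alone.
    if word.endswith("s") and not word.endswith("ss") and len(word) > 2:
        return word[:-1]
    return word


class StemmedIndex:
    """Hypothetical in-memory index that stores stems instead of raw tokens."""

    def __init__(self):
        self.postings = defaultdict(set)  # stem -> set of document ids

    def add(self, doc_id, text):
        # Index time: every token is stemmed before it is stored.
        for token in text.split():
            self.postings[toy_stem(token)].add(doc_id)

    def search(self, term):
        # Query time: the search term is stemmed the same way before lookup.
        return sorted(self.postings.get(toy_stem(term), set()))


index = StemmedIndex()
index.add(1, "deformed frogs")

# The stored terms are the stems, not the original words.
print(sorted(index.postings))

for query in ["deform", "deformed", "deforming", "deforms",
              "frog", "frogs", "frogging"]:
    print(query, "->", index.search(query))
```

Because both sides go through the same function, the exact stem does not need to be a real word; it only needs to be the same for every form that should match.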

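Lucene performs the stemming with its Porter stem filter (PorterStemFilter), an implementation of the Porter stemming algorithm for English.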
