TNS Internal:NDR/gSearch

From NSDLWiki




Overview of gSearch for NDR


  • subsecond delay between changes to Fedora and updates to the index
  • primarily used as a full datastream indexer

Constructing the Search Document

Code Location: CVS/repository/NDRgenericSearch

  • code from fedora
  • stripped down
    • comes with four configurations
    • can administer simultaneously (not needed)
  • designed to be an active web project (plugin)
    • look at .settings (sets up active web project)
    • no ant file

Directories/Files of Interest

  • README (at root of project) - written by Tim
    • lots of comments about what was changed
  • /src
    • location of java source code from fedora
  • /WebContent
    • web application content (includes the WEB-INF configuration)
  • /nsdlMods/java/dk/defxws/fgslucene/IndexDocumentHandler
    • changed the IndexDocumentHandler class to accept structured XML
    • gSearch previously tried to parse the structured XML and would fail
    • it now passes the structured XML on as a field value, so the XML itself is stored in the index
  • /WebContent/WEB-INF/classes/config/index/NDRFoxmlToLucene.xslt
    • takes a Fedora object and turns it into an index document that goes into Lucene
    • each Fedora object document becomes one index document
    • look for <IndexField>, which defines the fields
    • look for <xsl:variable>, which pulls in information from other objects to include in this object
  • /WebContent/WEB-INF/classes/config/index/basicIndex/GFindObjectsToResultPage.xslt
    • added a namespace to the output stream
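
The IndexDocumentHandler change described above (storing structured XML as a field value instead of parsing it further) can be illustrated with a small sketch. This is a Python illustration of the idea only, not the Java code in /nsdlMods; the function name `field_value`, the sample field names, and the use of an `IFname` attribute on the sample elements are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def field_value(field: ET.Element) -> str:
    """Return the value to store in the index for one field element.

    Plain text content is returned as-is. If the field contains nested
    (structured) XML, the child elements are serialized back to a string
    rather than parsed any further, so the XML itself ends up in the index.
    """
    children = list(field)
    if not children:
        return (field.text or "").strip()
    # Serialize each child element; the result is the raw XML markup.
    return "".join(ET.tostring(child, encoding="unicode") for child in children).strip()


# Hypothetical field elements, loosely modelled on the <IndexField> elements
# produced by the XSLT; the names and content are made up.
plain = ET.fromstring('<IndexField IFname="title">A title</IndexField>')
structured = ET.fromstring(
    '<IndexField IFname="record"><dc><title>A title</title></dc></IndexField>'
)

print(field_value(plain))
print(field_value(structured))
```

The point of the design is that a consumer of the index gets the original markup back verbatim and can parse it on its own terms.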

Focuses on metadata

  • processes RELS-EXT for all object types
  • processes metadata datastreams for metadata object types
  • commented out getting
  • does not index ncs:collectRecord



  • problem with namespace change (See README)
  • the "Field name" drop-down list shows all fields that have been indexed

Integrate with NDR API

  • takes the gSearch output, loads it into JAXB objects, and then uses JAXB with a different schema to produce the output
  • makes an HTTP call to get the gSearch document
  • uses the gfind API call (gSearch API)
  • returns the stored copy from Lucene (as opposed to getting anything directly from the NDR)
  • a composite object is constructed in gSearch so that a collection object can be returned as a whole
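
The binding step in the list above (turning the returned gSearch document into objects) might look roughly like the following. This is a Python stand-in for the JAXB step, not the NDR API code. The sample XML, its element and attribute names, the namespace URI, and the function names are all assumptions for illustration; the real result format should be checked against an actual gfind response.

```python
import xml.etree.ElementTree as ET

# Hypothetical result document. The README notes that a namespace was added
# to the output stream, so the sketch ignores namespaces when matching tags.
SAMPLE = """<resultPage xmlns="http://example.org/gsearch-result">
  <gfindObjects hitTotal="2">
    <objects>
      <object no="1"><field name="PID">demo:1</field><field name="title">First</field></object>
      <object no="2"><field name="PID">demo:2</field><field name="title">Second</field></object>
    </objects>
  </gfindObjects>
</resultPage>"""


def local(tag: str) -> str:
    """Strip a '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def parse_hits(xml_text: str) -> list:
    """Collect each <object> hit as a dict mapping field name to stored value."""
    root = ET.fromstring(xml_text)
    hits = []
    for element in root.iter():
        if local(element.tag) == "object":
            hits.append(
                {
                    field.get("name"): (field.text or "")
                    for field in element
                    if local(field.tag) == "field"
                }
            )
    return hits


for hit in parse_hits(SAMPLE):
    print(hit)
```

Because the values come from the stored copy in Lucene, nothing in this step needs to contact the NDR itself.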

gSearch API

  • couldn't find any documentation at
  • look at the REST call made from the UI
  • gfind is the api call
  • pass in lucene query to gfind

  Find all metadata objects returning 10 at a time...
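
A minimal sketch of how a paged gfind request could be assembled, assuming the REST parameter names `operation`, `query`, `hitPageStart`, and `hitPageSize` (compare them with the REST call the UI actually makes). The base URL and the query field `objectType` are hypothetical placeholders, and the 1-based paging in `page_start` is an assumption.

```python
from urllib.parse import urlencode


def page_start(page_number: int, page_size: int = 10) -> int:
    """Number of the first hit on a page, assuming hits are numbered from 1."""
    return (page_number - 1) * page_size + 1


def gfind_url(base_url: str, query: str, page_size: int = 10, start: int = 1) -> str:
    """Build a gfind request URL carrying a Lucene query and paging settings."""
    params = {
        "operation": "gfindObjects",  # assumed operation name for gfind
        "query": query,               # Lucene query string, URL-encoded below
        "hitPageStart": start,
        "hitPageSize": page_size,
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint and field name; substitute the real ones.
BASE = "http://localhost:8080/fedoragsearch/rest"
print(gfind_url(BASE, "objectType:Metadata"))
print(gfind_url(BASE, "objectType:Metadata", start=page_start(2)))
```

Fetching the next page only changes the start value, so the same Lucene query can be reused across requests.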