TNS Internal:NDR/API/2.0/UseCases/buildNCSforNSDL

From NSDLWiki


Use Case: Building a Library Collection, with NCS Managed Metadata only, for a Digital Library

Reference Materials


Building Process

The process described here makes assumptions about the existence of certain Administrative Support Objects. The list of objects that can be referenced in Use Cases without creating them first is in the Assumptions section.


Description:

NCS manages collection metadata for all Library Collections in NSDL. It also manages the metadata for resources in some of those Library Collections. The following describes two processes: creating a new Library Collection for which NCS does not maintain the item-level metadata, and creating one for which it does. This is written from the perspective of NCS.


The basic process for setting up this Library Collection in the NDR is described in Basic Process; the API Calls section gives more detail using example data.


NOTE: This example uses only v2.0 API calls that will be available by August 2010 and v1.0 API calls.


Basic Process

  • Create a Metadata Source for each means of receiving metadata about resources in the Library Collection
    When: this process is repeated for each source of metadata for every Library Collection
    Created By: each application responsible for this process (the Dashboard may handle this onBehalfOf other applications)

API Calls

Create Library Collection

API Calls:
  • addCollection (generic collection composite object)
  • addMetadata (ncs_collect, aka collection metadata)
addCollection (details)
Input
Signing Agent: 2200/20100301ANA       <!-- NCS Agent -->

<inputXML>
  <parentCollection>2200/2010030801T</parentCollection>     <!-- parent: NSDL Library -->
  <parentCollection>2200/20100302ANCN</parentCollection>    <!-- parent: NCS Collections -->
  <title>Biological Sciences Gateways and Resources</title> 
  <description>The Biological Sciences Gateways and Resources collection is comprised of web portals, web 
      sites, and individual digital resources in many areas of the  biological and life sciences, including 
      agriculture, botany, ecology, genetics, microbiology, natural history, marine biology, zoology, and 
      others. Here may be found educational materials for life science educators and learners (prekindergarten 
      through graduate school), resources intended for the general public, and materials aimed at biological 
      sciences research communities.</description>
  <contacts>
    <contact email="nsdlsupport@nsdl.ucar.edu">NSDL Support</contact>
  </contacts>
  <resourceURL>http://www.nsdl.org/collection/biological-sciences</resourceURL>
  <externalIdentifier source="NCS">NSDL-COLLECTION-000-003-111-903</externalIdentifier>
</inputXML>

NOTE: NSDL Library is added as a parent only for Library Collections that are part of NSDL. For NCS managed collections that are not part of NSDL, don't include NSDL Library as a <parentCollection>. Everything else remains the same.
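The parent rule in the note above can be sketched in Python. This is only an illustration, not NCS or NDR code: the function name and its parameters are hypothetical, the two handles are copied from the example input, and the contacts and resourceURL elements are left out for brevity.

```python
import xml.etree.ElementTree as ET

# Handles copied from the example input above.
NSDL_LIBRARY = "2200/2010030801T"
NCS_COLLECTIONS = "2200/20100302ANCN"


def build_add_collection_input(title, description, external_id, part_of_nsdl=True):
    """Return an addCollection <inputXML> payload as a string (hypothetical helper)."""
    root = ET.Element("inputXML")
    # NSDL Library is a parent only for Library Collections that are part of NSDL.
    parents = ([NSDL_LIBRARY] if part_of_nsdl else []) + [NCS_COLLECTIONS]
    for handle in parents:
        ET.SubElement(root, "parentCollection").text = handle
    ET.SubElement(root, "title").text = title
    ET.SubElement(root, "description").text = description
    ext = ET.SubElement(root, "externalIdentifier", source="NCS")
    ext.text = external_id
    return ET.tostring(root, encoding="unicode")


nsdl_xml = build_add_collection_input(
    "Biological Sciences Gateways and Resources",
    "Example description.",
    "NSDL-COLLECTION-000-003-111-903",
)
non_nsdl_xml = build_add_collection_input(
    "Biological Sciences Gateways and Resources",
    "Example description.",
    "NSDL-COLLECTION-000-003-111-903",
    part_of_nsdl=False,
)
print(nsdl_xml)
print(non_nsdl_xml)
```

The only difference between the two payloads is the NSDL Library parentCollection element; everything else remains the same.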

NOTE: NCS Collections is a parent to allow for queries such as "list all collections in the NDR that NCS knows about". This is an administrative convenience that might be of use to NCS. It is not required for NDR functioning.

NOTE: The externalIdentifier at the Library Collection level should be for the primary application responsible for maintaining the metadata related to the Library Collection. If this becomes the Dashboard, then this should be the Dashboard's external identifier. The NCS external identifier, if it would continue to exist, would be part of the GCCO for the NCS metadata source. For now, the NCS externalIdentifier is used as the externalIdentifier since it is the application maintaining the collection metadata.

Ostwald 17:34, 15 April 2010 (UTC) - small technicality - the NCS externalIdentifier above (which refers to the ncs_collect record in the NCS) seems appropriate here, since the NCS is the application maintaining the collection record. However, I don't think we can assume that this ID will be part of the GCCO for the NCS metadata source, because the NCS metadata source will already be related to the Library Collection through the NDR relationships.

--Elrayle
Here is my take on this. If the NCS manages metadata about the Library Collection, then its external ID for the collection metadata record belongs in the Library Collection GCCO. If the Dashboard takes over this functionality, then whatever it uses as an ID would be the external ID of the Library Collection GCCO. Basically, the application that manages the collection metadata gets to store its ID as the external ID at the Library Collection GCCO level.

The level below the Library Collection GCCO represents the metadata sources that contribute metadata about resources in the Library Collection. Since multiple applications can be metadata sources for the same Library Collection, it is at this level that the application specific external ID for a Library Collection would be stored. So in either case described above (i.e. NCS managing collection metadata or Dashboard managing collection metadata), if the NCS manages metadata for resources in the Library Collection, and thus is a metadata source, it should place its external ID in the NCS metadata source GCCO that is a child of the Library Collection GCCO.
/Elrayle

Output
returns: 2200/20100303BLC  (collection aggregator handle)
addMetadata (details) (ncs_collect, aka collection metadata)
Input
Signing Agent: 2200/20100301ANA       <!-- NCS Agent -->
 
<inputXML>
  <metadata>
    <properties>
      <uniqueID>NSDL-COLLECTION-000-003-111-903</uniqueID>
    </properties>
    <relationships>
      <metadataFor>2200/20100303BLC</metadataFor>                 <!-- handle returned by addCollection for Library Collection -->
      <metadataProvidedBy>2200/20100303BLC</metadataProvidedBy>   <!-- handle returned by addCollection for Library Collection -->
    </relationships>
    <data>
      <format type="ncs_collect"> 
        <meta> 
          <record 
              xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
              xmlns="http://ns.nsdl.org/ncs" 
              xsi:schemaLocation="http://ns.nsdl.org/ncs http://ns.nsdl.org/ncs/ncs_collect/1.02/schemas/ncs-collect.xsd">
            <general>
              <recordID>NSDL-COLLECTION-000-003-111-903</recordID>
              <url>http://www.nsdl.org/collection/biological-sciences/</url>
              <title>Biological Sciences Gateways and Resources</title>
              <description>The Biological Sciences Gateways and Resources collection is comprised of web portals, web 
                  sites, and individual digital resources in many areas of the  biological and life sciences, including 
                  agriculture, botany, ecology, genetics, microbiology, natural history, marine biology, zoology, and 
                  others. Here may be found educational materials for life science educators and learners (prekindergarten 
                  through graduate school), resources intended for the general public, and materials aimed at biological 
                  sciences research communities.</description> 
              <subject>Biological science</subject>
            </general>
            <collection>
              <dateTime>2008-03-14T14:28:37Z</dateTime>
              <brandURL>http://nsdl.org/images/brands/NSDL-COLLECTION-000-003-111-903.jpg</brandURL>
              <OAIvisibility>public</OAIvisibility>
              <pathway>false</pathway>
              <collectionSubjects>
                <collectionSubject>Biological and Health Sciences</collectionSubject>
              </collectionSubjects>
              <imageWidth>100</imageWidth>
              <altText>Biological Sciences Gateways and Resources</altText>
              <imageHeight>30</imageHeight>
              <collectionPurposes>
                <collectionPurpose>Educators and learners</collectionPurpose>
                <collectionPurpose>Researchers and professionals</collectionPurpose>
              </collectionPurposes>
              <contacts>
                <contact name="NSDL Support" email="nsdlsupport@nsdl.ucar.edu"/>
              </contacts>
            </collection>
            <educational>
              <educationLevels>
                <nsdlEdLevel>Informal Education</nsdlEdLevel>
                <nsdlEdLevel>Pre-Kindergarten</nsdlEdLevel>
                <nsdlEdLevel>Elementary School</nsdlEdLevel>
                <nsdlEdLevel>Middle School</nsdlEdLevel>
                <nsdlEdLevel>High School</nsdlEdLevel>
                <nsdlEdLevel>Higher Education</nsdlEdLevel>
                <nsdlEdLevel>Technical Education (Lower Division)</nsdlEdLevel>
                <nsdlEdLevel>Technical Education (Upper Division)</nsdlEdLevel>
                <nsdlEdLevel>Graduate/Professional</nsdlEdLevel>
              </educationLevels>
              <types>
                <dcmiType>Collection</dcmiType>
              </types>
            </educational>
          </record>
        </meta> 
        <info> 
          <nsdlAboutCategory>collection</nsdlAboutCategory> 
          <repositoryPrimaryIdentifier>http://www.nsdl.org/collection/biological-sciences</repositoryPrimaryIdentifier> 
          <metadataNamespace>http://ns.nsdl.org/nsdl_dc_v1.02/</metadataNamespace> 
        </info> 
      </format> 
      <format type="nsdl_dc"> 
        <meta> 
          <nsdl_dc:nsdl_dc 
              xmlns:nsdl_dc="http://ns.nsdl.org/nsdl_dc_v1.02/" 
              schemaVersion="1.02.000" 
              xmlns:dc="http://purl.org/dc/elements/1.1/" 
              xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
              xsi:schemaLocation="http://ns.nsdl.org/nsdl_dc_v1.02/ http://ns.nsdl.org/schemas/nsdl_dc/nsdl_dc_v1.02.xsd"> 
            <dc:title>Biological Sciences Gateways and Resources</dc:title> 
            <dc:description>The Biological Sciences Gateways and Resources collection is comprised of web portals, web 
                sites, and individual digital resources in many areas of the  biological and life sciences, including 
                agriculture, botany, ecology, genetics, microbiology, natural history, marine biology, zoology, and 
                others. Here may be found educational materials for life science educators and learners (prekindergarten 
                through graduate school), resources intended for the general public, and materials aimed at biological 
                sciences research communities.</dc:description> 
            <dc:identifier>http://www.nsdl.org/collection/biological-sciences</dc:identifier> 
          </nsdl_dc:nsdl_dc> 
        </meta> 
        <info> 
          <nsdlAboutCategory>collection</nsdlAboutCategory> 
          <repositoryPrimaryIdentifier>http://www.nsdl.org/collection/biological-sciences</repositoryPrimaryIdentifier> 
          <metadataNamespace>http://ns.nsdl.org/nsdl_dc_v1.02/</metadataNamespace> 
        </info> 
      </format> 
    </data>
  </metadata>
</inputXML>

NOTE: metadataFor and metadataProvidedBy both use the handle returned from addCollection for the Library Collection. This requires a change to the addMetadata code to look up the metadataProvider's handle from the Collection Aggregator.
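The lookup this note calls for might look roughly like the following Python sketch. It is not the NDR implementation: the dictionary stands in for whatever the repository uses to relate a Collection Aggregator to its metadataProvider, and the provider handle is invented for the example.

```python
# Hypothetical in-memory stand-in for the NDR: maps a Collection Aggregator
# handle to the handle of its metadataProvider. The provider handle is invented.
PROVIDER_FOR_AGGREGATOR = {
    "2200/20100303BLC": "2200/EXAMPLE-MDP",
}


def resolve_relationships(metadata_for, metadata_provided_by):
    """Swap the aggregator handle given in metadataProvidedBy for its provider handle."""
    try:
        provider = PROVIDER_FOR_AGGREGATOR[metadata_provided_by]
    except KeyError:
        raise ValueError(f"no metadataProvider known for {metadata_provided_by}")
    # metadataFor is passed through untouched; only the provider is resolved.
    return {"metadataFor": metadata_for, "metadataProvidedBy": provider}


resolved = resolve_relationships("2200/20100303BLC", "2200/20100303BLC")
print(resolved)
```

The application only ever supplies the handle returned by addCollection; the translation to a metadataProvider handle happens on the repository side.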

Ostwald 17:34, 15 April 2010 (UTC) - concern - The ability to look up an MDP given an AGG seems to imply that in our Collection Model, each Aggregator can be associated with only one MDP? This is at odds with the core NDR Object model. We may run into trouble since apps using NDR API 1.0 can break this assumption.

--Elrayle
I recognize your concern. The proposed GCCO does imply a one-to-one MDP to AGG relationship. Existing objects in the repository will be modified to comply with this. This will not be an issue for NSDL's use of the NDR.

There are two types of rules in the NDR: formal ones that are enforced by the code, and informal ones that are not enforced. The current implementation of Collection Composite Objects is an informal rule. It is a programming pattern that applications adhere to. I can list any number of inter-object relationships that applications could create that would be basically ignored by the NDR.

The new rule defined by the creation of GCCOs does not prevent apps from using the NDR API 1.0 calls to violate it. That could cause problems if the NDR needs to select from several MDP to AGG relationships and cannot determine which one was created as part of the GCCO. One question to answer is whether code should be added to prevent this scenario. A simple solution would add a property to the MDP added with the GCCO that marks it as part of the GCCO. Any additional MDPs would be ignored by the NDR, but applications could use them however they want. NSDL would not make use of this.

I'm not sure that this represents a problem for outside applications. I assume the concern is related to applications that have downloaded Edupak and are perhaps using it in a way that conflicts with this change. Perhaps you could expand on your concern.
/Elrayle
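The marker-property idea raised in the reply above could be sketched as follows. This is an illustration only: the class, the part_of_gcco field, and the MDP handles are all hypothetical, and the NDR may represent this quite differently.

```python
from dataclasses import dataclass


@dataclass
class MetadataProvider:
    handle: str
    aggregator: str              # handle of the Aggregator this MDP is related to
    part_of_gcco: bool = False   # hypothetical marker property


def gcco_provider(providers, aggregator_handle):
    """Return the single MDP marked as part of the GCCO for an Aggregator."""
    marked = [
        p for p in providers
        if p.aggregator == aggregator_handle and p.part_of_gcco
    ]
    if len(marked) != 1:
        raise LookupError(
            f"expected one GCCO provider for {aggregator_handle}, found {len(marked)}"
        )
    return marked[0]


providers = [
    MetadataProvider("2200/EXAMPLE-MDP-1", "2200/20100303BLC", part_of_gcco=True),
    # Related to the same Aggregator through API 1.0 by another application;
    # it carries no marker, so the lookup ignores it.
    MetadataProvider("2200/EXAMPLE-MDP-2", "2200/20100303BLC"),
]
print(gcco_provider(providers, "2200/20100303BLC").handle)
```

With the marker in place, extra MDP to AGG relationships no longer make the lookup ambiguous; they are simply skipped.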

Ostwald 17:34, 15 April 2010 (UTC) - clarification - From my understanding, the addCollection call creates a GCCO (including a Collection Metadata object that is related to the Collection Aggregator by metadataFor). Does the addMetadata call here (with ncs_collect payload) create ANOTHER Collection Metadata object (also related to the Collection Aggregator)? Would it be more appropriate to simply modify the existing Collection Metadata object, perhaps adding the ncs_collect as an additional data stream?

--Elrayle
I agree that this is a little clunky. The way things are designed now, the only "metadata" about the collection that is officially maintained by the GCCO is title, description, and contact information. This is passed in with the addCollection call. True collection metadata is much richer than this. An alternate solution is to allow the passing in of a metadata payload as part of the addCollection call. This does complicate things and forces us to address some issues that are currently being postponed until after the August deadline. Specifically, there has been no resolution for how metadata-related 2.0 API calls will handle multiple formats. The current definition of the API calls allows specification of a single XML format and a single XML stream for the metadata in the specified format. My proposal represents an attempt to stay in the scaled-down feature list for the August deadline without creating limitations that prevent future functionality as defined in the other proposed 2.0 API calls. Perhaps we should make this one of the items of discussion in the follow-up meeting.
/Elrayle

NOTE: The nsdl_dc presented in this example may not be the exact result of the NCS transform process from ncs_collect to nsdl_dc.

Output
returns: 2200/2010030501T  (collection metadata handle)

Create Metadata Source for NCS Managed Resource Metadata

API Calls:
  • addCollection (generic collection composite object)
  • addMetadata (dcs_data, aka appdata)
addCollection (details)
Input
Signing Agent: 2200/20100301ANA       <!-- NCS Agent -->

<inputXML>
  <parentCollection>2200/20100303BLC</parentCollection>   <!-- parent: Biological Sciences Gateways and Resources Library Collection -->
  <title>Biological Sciences Gateways and Resources managed in NCS</title> 
  <description>Resources for Biological Sciences Gateways and Resources that are managed in the NCS.</description>
  <contacts>
    <contact email="nsdlsupport@nsdl.ucar.edu">NSDL Support</contact>
  </contacts>
  <externalIdentifier source="NCS">NSDL-COLLECTION-000-003-111-903</externalIdentifier>
</inputXML>

Ostwald 17:48, 15 April 2010 (UTC) - Again, I'm not sure this use of the externalIdentifier makes the best sense for the NCS, but it is convenient to have this attribute here in general for whatever purpose apps want to use it for.

Output
returns: 2200/20100303NMP  (collection aggregator handle)
addMetadata (details) (dcs_data, aka appdata)
Input
Signing Agent: 2200/20100301ANA       <!-- NCS Agent -->
 
<inputXML>
  <metadata>
    <properties>
      <uniqueID>NSDL-COLLECTION-000-003-111-903appdata</uniqueID>
    </properties>
    <relationships>
      <metadataFor>2200/20100303NMP</metadataFor>                 <!-- handle returned by addCollection for NCS metadata source -->
      <metadataProvidedBy>2200/20100303NMP</metadataProvidedBy>   <!-- handle returned by addCollection for NCS metadata source -->
    </relationships>
    <data>
      <format type="dcs_data"> 
        <meta> 
          <dcsDataRecord 
              xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
              xsi:noNamespaceSchemaLocation="http://www.dlese.org/Metadata/dcs/dcs-data/dcs-data.xsd">
            <recordID>NSDL-COLLECTION-000-003-111-903</recordID>
            <isValid></isValid>
            <statusEntries>
              <statusEntry>
                <status>New</status>
                <statusNote>Record Created</statusNote>
                <editor>ginger</editor>
                <changeDate>2008-03-14T14:28:38Z</changeDate>
              </statusEntry>
              <statusEntry>
                <status>_|-final-1201216476279-|_</status>
                <statusNote>Sync with NDR</statusNote>
                <editor>jonathan</editor>
                <changeDate>2008-03-17T20:37:36Z</changeDate>
              </statusEntry>
              <statusEntry>
                <status>_|-final-1201216476279-|_</status>
                <statusNote></statusNote>
                <editor>miller</editor>
                <changeDate>2008-07-18T16:15:16Z</changeDate>
              </statusEntry>
            </statusEntries>
            <lastTouchDate>2009-09-09T11:28:50Z</lastTouchDate>
            <ndrHandle></ndrHandle>
            <validationReport></validationReport>
            <ndrInfo>
              <syncError></syncError>
              <ndrHandle>2200/20080317203738000T</ndrHandle>
              <lastSyncDate>2008-09-12T10:45:34Z</lastSyncDate>
              <metadataProviderHandle>2200/20080317203737583T</metadataProviderHandle>
              <nsdlItemId>oai:nsdl.org:ncs:NSDL-COLLECTION-000-003-111-903</nsdlItemId>
            </ndrInfo>
            <lastEditor>ginger</lastEditor>
          </dcsDataRecord>
        </meta> 
        <info> 
          <nsdlAboutCategory>appdata</nsdlAboutCategory> 
        </info> 
      </format> 
    </data>
  </metadata>
</inputXML>

NOTE: metadataFor and metadataProvidedBy both use the handle returned from addCollection for the NCS metadata source. This requires a change to the addMetadata code to look up the metadataProvider's handle from the Collection Aggregator.

QUESTION: What would be appropriate values for <info>?

Output
returns: 2200/2010030502T  (appdata metadata handle)


Adding Metadata for Resources

API Calls:
  • addResource (resource)
  • addMetadata (resource metadata)
addResource (details) (resource)
Input
Signing Agent: 2200/20100301ANA       <!-- NCS Agent -->
  
<inputXML>
  <resource> 
    <properties> 
      <identifier type="URL">http://instaar.colorado.edu/outreach/trees-and-vocs/</identifier> 
    </properties> 
    <relationships> 
      <memberOf>2200/20100303NMP</memberOf>    <!-- handle returned by addCollection for NCS metadata source -->
    </relationships> 
  </resource> 
</inputXML> 

NOTE: The call for adding a resource is the same as the current process for adding a resource.
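As a rough Python sketch, the resource input above could be assembled like this. The helper name and its parameters are hypothetical; the element names, URL, and handle are taken from the example.

```python
import xml.etree.ElementTree as ET


def build_add_resource_input(url, member_of):
    """Return an addResource <inputXML> payload as a string (hypothetical helper)."""
    root = ET.Element("inputXML")
    resource = ET.SubElement(root, "resource")
    properties = ET.SubElement(resource, "properties")
    ET.SubElement(properties, "identifier", type="URL").text = url
    relationships = ET.SubElement(resource, "relationships")
    # memberOf takes the handle returned by addCollection for the metadata source.
    ET.SubElement(relationships, "memberOf").text = member_of
    return ET.tostring(root, encoding="unicode")


resource_xml = build_add_resource_input(
    "http://instaar.colorado.edu/outreach/trees-and-vocs/",
    "2200/20100303NMP",
)
print(resource_xml)
```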

Output
returns: 2200/2010030701T  (resource handle)
addMetadata (details) (resource metadata)
Input
Signing Agent: 2200/20100301ANA       <!-- NCS Agent -->
  
<inputXML>
  <metadata>
    <properties>
      <uniqueID>BIOSCI-000-000-000-238</uniqueID>
    </properties>
    <relationships>
      <metadataFor>2200/2010030701T</metadataFor>                 <!-- handle returned by addResource -->
      <metadataProvidedBy>2200/20100303NMP</metadataProvidedBy>   <!-- handle returned by addCollection for NCS metadata source -->
    </relationships>
    <data>
      <format type="ncs_item"> 
        <meta> 
          <record 
              xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
              xmlns="http://ns.nsdl.org/ncs" 
              xsi:schemaLocation="http://ns.nsdl.org/ncs http://ns.nsdl.org/ncs/ncs_item/1.02/schemas/ncs-item.xsd">
            <general>
              <recordID>BIOSCI-000-000-000-238</recordID>
              <url>http://instaar.colorado.edu/outreach/trees-and-vocs/</url>
              <title>Trees and VOCs: Measuring volatile organic compounds from urban forests</title>
              <description>This web site describes a research project to measure volatile organic compounds 
                  emitted from species of trees and shrubs found in urban areas. Topics include a description 
                  of the project and a section on trees and air quality. A page updated each month or so 
                  reports field and lab work on the project. There is also a glossary, profiles of community 
                  partners, and profiles of the scientists and students involved in the project</description> 
              <subject>Education</subject>
              <subject>General science</subject>
              <subject>Agriculture</subject>
              <subject>Environment</subject>
              <subject>ozone</subject>
              <languages>
                <ISOcode>eng</ISOcode>
              </languages>
            </general>
            <educational>
              <educationLevels>
                <nsdlEdLevel>High School</nsdlEdLevel>
                <nsdlEdLevel>Higher Education</nsdlEdLevel>
                <nsdlEdLevel>Undergraduate (Lower Division)</nsdlEdLevel>
                <nsdlEdLevel>Informal Education</nsdlEdLevel>
                <nsdlEdLevel>General Public</nsdlEdLevel>
              </educationLevels>
              <types>
                <nsdlType>Event</nsdlType>
                <nsdlType>News</nsdlType>
                <nsdlType>Reference Material</nsdlType>
                <nsdlType>Glossary/Index</nsdlType>
                <nsdlType>Nonfiction Reference</nsdlType>
                <nsdlType>Report</nsdlType>
              </types>
              <audiences>
                <nsdlAudience>Educator</nsdlAudience>
                <nsdlAudience>General Public</nsdlAudience>
                <nsdlAudience>Learner</nsdlAudience>
              </audiences>
            </educational>
            <rights>
              <rights>Visit resource website for further information. Rights information not provided locally.</rights>
              <accessRights>
                <meansOfAccess>Free access</meansOfAccess>
              </accessRights>
            </rights>
            <contributions>
              <creators>
                <creator>Institute of Arctic and Alpine Research (INSTAAR) at the University of Colorado-Boulder</creator>
                <creator>The National Center for Atmospheric Research</creator>
              </creators>
            </contributions>
            <technical>
              <mimeTypes>
                <mimeType>text</mimeType>
                <mimeType>text/html</mimeType>
              </mimeTypes>
            </technical>
          </record>
        </meta> 
        <info> 
          <nsdlAboutCategory>appdata</nsdlAboutCategory> 
        </info> 
      </format> 
      <format type="nsdl_dc"> 
        <meta> 
          <nsdl_dc:nsdl_dc 
              xmlns:nsdl_dc="http://ns.nsdl.org/nsdl_dc_v1.02/" 
              schemaVersion="1.02.000" 
              xmlns:dc="http://purl.org/dc/elements/1.1/" 
              xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
              xsi:schemaLocation="http://ns.nsdl.org/nsdl_dc_v1.02/ http://ns.nsdl.org/schemas/nsdl_dc/nsdl_dc_v1.02.xsd"> 
            <dc:title>Trees and VOCs: Measuring volatile organic compounds from urban forests</dc:title> 
            <dc:description>This web site describes a research project to measure volatile organic compounds 
                emitted from species of trees and shrubs found in urban areas. Topics include a description 
                of the project and a section on trees and air quality. A page updated each month or so 
                reports field and lab work on the project. There is also a glossary, profiles of community 
                partners, and profiles of the scientists and students involved in the project</dc:description> 
            <dc:identifier>http://instaar.colorado.edu/outreach/trees-and-vocs/</dc:identifier> 
            ...
          </nsdl_dc:nsdl_dc> 
        </meta> 
        <info> 
          <nsdlAboutCategory>resource</nsdlAboutCategory> 
          <repositoryPrimaryIdentifier>http://instaar.colorado.edu/outreach/trees-and-vocs/</repositoryPrimaryIdentifier> 
          <metadataNamespace>http://ns.nsdl.org/nsdl_dc_v1.02/</metadataNamespace> 
        </info> 
      </format> 
    </data>
  </metadata>
</inputXML>

NOTE: Applications no longer have access to the metadataProvider's handle. The metadataProvidedBy element uses the handle returned from addCollection for the NCS metadata source. This requires a change to the addMetadata code to look up the metadataProvider's handle from the Collection Aggregator.

NOTE: The values for the <format>, <meta>, and <info> elements are exactly the same as those used in the current implementation. The values shown above are made-up examples.

Output
returns: 2200/2010030503T  (resource metadata handle)
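
Taken together, the calls in this use case chain handles from one step to the next. The Python sketch below shows that ordering only. FakeNdrClient is a hypothetical stand-in that hands out invented handles and records calls; it does not contact an NDR, and its method names and parameters are assumptions made for the illustration.

```python
import itertools


class FakeNdrClient:
    """Hypothetical stand-in for an NDR client; it only hands out fake handles."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.calls = []

    def _call(self, name, **args):
        handle = f"2200/FAKE{next(self._ids):03d}"
        self.calls.append((name, args, handle))
        return handle

    def add_collection(self, parents, title):
        return self._call("addCollection", parents=parents, title=title)

    def add_resource(self, url, member_of):
        return self._call("addResource", url=url, member_of=member_of)

    def add_metadata(self, metadata_for, provided_by, fmt):
        return self._call(
            "addMetadata", metadata_for=metadata_for, provided_by=provided_by, fmt=fmt
        )


def build_library_collection(client, nsdl_library, ncs_collections, title, resource_urls):
    # 1. Library Collection plus its ncs_collect collection metadata.
    library = client.add_collection([nsdl_library, ncs_collections], title)
    client.add_metadata(library, library, "ncs_collect")
    # 2. NCS metadata source as a child of the Library Collection, plus its appdata.
    source = client.add_collection([library], title + " managed in NCS")
    client.add_metadata(source, source, "dcs_data")
    # 3. Each resource joins the metadata source, then gets item-level metadata.
    for url in resource_urls:
        resource = client.add_resource(url, member_of=source)
        client.add_metadata(resource, source, "ncs_item")
    return library, source


client = FakeNdrClient()
library, source = build_library_collection(
    client,
    "2200/2010030801T",
    "2200/20100302ANCN",
    "Biological Sciences Gateways and Resources",
    ["http://instaar.colorado.edu/outreach/trees-and-vocs/"],
)
for name, args, handle in client.calls:
    print(name, "->", handle)
```

The point of the sketch is the data flow: the handle returned for the Library Collection becomes the parent of the metadata source, and the metadata source handle is then used both as memberOf for each resource and as metadataProvidedBy for its metadata.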