TNS Internal:CollectionAPI/AbstractAPI

From NSDLWiki


This is a proposed API for creating, updating and deleting collections, items and annotations in an NDR. The API could be implemented as an XML-HTTP/REST API or a Java API.



Overview

This API contains a minimum set of method calls necessary to define and manage collections, items and annotations in an NDR. Method calls are expressed as HTTP arguments, which may contain strings or XML, and responses are returned as XML. Alternatively, a Java implementation would accept Java Objects as arguments and return Java Objects or XML.
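As a rough illustration of the HTTP form, a method call could be encoded as query arguments. The sketch below assumes the `verb` argument and base URL that appear in the access link of the sample collection response; `build_request_url` is a hypothetical helper and not part of the proposal.

```python
from urllib.parse import urlencode

# Base URL copied from the <access> element of the sample response (an assumption).
BASE_URL = "http://ndr.nsdl.org/ndr/collectionsApi"

def build_request_url(verb, **args):
    """Encode an API method call as HTTP query arguments (hypothetical helper)."""
    params = {"verb": verb}
    # Optional arguments that were not supplied are left out of the request.
    params.update({key: value for key, value in args.items() if value is not None})
    return BASE_URL + "?" + urlencode(params)

url = build_request_url("ListRecords",
                        collectionId="2200/20091013232544291T",
                        xmlFormat=None)
print(url)
```

Note that `urlencode` percent-encodes the slash in the handle, so a server would need to decode the argument before using it.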

Authentication and authorization are required for write operations. It is assumed that both are handled at the transport level, in the same way as is currently done for the NDR (e.g. private/public keys).


Principals

agent - An agent is an entity (application) that can create, update and delete collections and metadataSources. Agents also have the rights of a metadataSource. For example, the NCS might be an agent that assigns metadataSource rights to OAI ingest and WFI.

metadataSource - A metadataSource is an entity (application) that can create, update and delete items in one or more collections. For example, OAI ingest is a metadataSource for collections that are harvested, WFI for collections with feeds, and NCS for direct cataloging.

consumer - A consumer is an entity (application) that can read collections information and act on it. There are certain restrictions on read operations as set by the agent or metadataSource, but in general, most data is available to all consumers. Examples: the Harvest Manager might read the collections information to generate its view of harvests and trigger the OAI ingests; WFI might read the collections information to discover the collections it needs to ingest, their feeds, and how often to ingest them.
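The relationship between the three principals can be sketched as a small permission table. The operation names below are hypothetical labels chosen for this illustration, and the sketch ignores per-collection scoping (a metadataSource only has write rights in the collections it has been assigned to).

```python
from enum import Enum

class Principal(Enum):
    AGENT = "agent"
    METADATA_SOURCE = "metadataSource"
    CONSUMER = "consumer"

# Operations each principal may perform, as read from the descriptions above.
# Agents also hold the rights of a metadataSource. The operation names are
# hypothetical labels used only in this sketch.
PERMISSIONS = {
    Principal.CONSUMER: {"read"},
    Principal.METADATA_SOURCE: {"read", "write_items"},
    Principal.AGENT: {"read", "write_items", "write_collections",
                      "write_metadata_sources"},
}

def is_allowed(principal, operation):
    """Return True if the principal type may perform the operation."""
    return operation in PERMISSIONS[principal]
```

For example, `is_allowed(Principal.METADATA_SOURCE, "write_collections")` is false, while an agent is allowed both collection and item writes.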


Concepts

Item - An item is a single XML document or binary object in the repository, which may contain metadata or primary content. All items must belong to a collection. Items are created using the PutRecord method and retrieved using the GetRecord and ListRecords methods.

Collection - A collection is a container for XML or binary items in the repository. A collection must exist before items can be inserted. Collections are created using the PutCollection method and retrieved using the GetCollection and ListCollections methods.

Annotation - An annotation is an XML item that extends or augments the content or metadata of another item. An annotation is like any other item except that it must also indicate the item that it annotates by providing a recordId. The NDR must have knowledge about the annotation record format to perform introspection to find and assign the annotates relationship. Annotations are created using the PutRecord method and retrieved using the GetRecord, ListRecords, GetCollection and ListCollections methods.

Repository content - Repository content represents collections, items and annotations that are intended for general use by consumers. For example, a collection of metadata about education resources, or a collection of annotations that contain star ratings. Repository content is created using the PutCollection and PutRecord methods.

Administrative content - Administrative content represents collections, items and annotations that are intended for use by agents and metadataSources for the purpose of internal bookkeeping and maintaining application state. An application may store any arbitrary data for its own internal use as well as data used to communicate and collaborate with other applications. The collection record is one place an agent can store administrative data. Another typical scenario is for agents and metadataSources to store administrative data in an annotation and attach it to a collection recordId or an item recordId, then access it later to retrieve or restore state. Collaborating applications can inspect each other's annotations and act on them as appropriate. If desired, the content may be restricted to one or more agents or metadataSources. Administrative content is created using the PutCollection and PutRecord methods.


Use cases

Some envisioned use cases.

  • DDS and CCS Use Cases - Description of some DDS and CCS use cases and how they would be implemented using the Collections API.

API Methods - Collections

PutCollection

Creates or updates a collection in the NDR.
PutCollection(collectionId, xmlFormat|binaryContent, collectionRecord, name, description, agentId, metadataSources[ ], restrictAccessTo[ ], groups[ ]);
  • collectionId - The identifier for the collection (handle)
  • xmlFormat|binaryContent - The XML format of the items that will reside in the collection, or binaryContent to indicate the collection contains binary data. For example 'msp2', 'ncs_item', 'dlese_anno', 'binaryContent' (reserved to indicate binary).
  • collectionRecord - A collection-level XML record of any format. For example, ncs_collect.
  • name - A simple name for the collection, for human consumption
  • description (optional) - A description for the collection, for human consumption
  • agentId - The identifier/handle for the agent that will have sole permission to update or delete this collection. The agent ID must exist prior to this call.
  • metadataSources[] - One or more metadataSource objects. A metadataSource is an entity that has rights to add items to the collection, for example WFI, OAI Harvest, NCS. Each metadataSource is defined by an ID and a name. If the given metadataSource does not exist it will be created.
  • restrictAccessTo[ ] (optional) - One or more metadataSourceId/agentId to restrict read access to this entire collection in all service responses. If omitted, the collection is accessible to all consumers.
  • groups[ ] - One or more group identifiers used for community-specific partitioning and grouping of collections. For example, a group might be nsdl to indicate the collection should be included in the NSDL.
Response - An XML container that provides the result of the request (success, error).
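For a client-side or Java-style object binding, the PutCollection arguments could be gathered into a single request object and checked before sending. The following is a minimal sketch; the class name, field names and `validate` method are hypothetical, and treating metadataSources and groups as possibly empty lists is an assumption of this sketch rather than something the proposal states.

```python
from dataclasses import dataclass, field
from typing import List, Optional

BINARY_CONTENT = "binaryContent"  # reserved format value indicating binary items

@dataclass
class PutCollectionRequest:
    collection_id: str
    content_format: str      # an xmlFormat such as 'ncs_item', or BINARY_CONTENT
    collection_record: str   # collection-level XML, e.g. an ncs_collect record
    name: str
    agent_id: str
    metadata_sources: List[dict] = field(default_factory=list)  # {"id": ..., "name": ...}
    groups: List[str] = field(default_factory=list)
    description: Optional[str] = None
    restrict_access_to: List[str] = field(default_factory=list)

    def validate(self):
        """Return a list of problems; an empty list means the request looks complete."""
        problems = []
        for attr in ("collection_id", "content_format", "collection_record",
                     "name", "agent_id"):
            if not getattr(self, attr):
                problems.append("missing required argument: " + attr)
        # Each metadataSource is defined by an ID and a name.
        for source in self.metadata_sources:
            if not source.get("id") or not source.get("name"):
                problems.append("metadataSource needs both an ID and a name")
        return problems

    def is_binary(self):
        return self.content_format == BINARY_CONTENT
```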


DeleteCollection

Removes a collection, its items and all related information in the NDR.
DeleteCollection(collectionId, agentId);
  • collectionId - The identifier for the collection (handle)
  • agentId - The identifier/handle for the agent that has permission to delete this collection. The agent ID must exist prior to this call.
Response - An XML container that provides the result of the request (success, error).


GetCollection

Gets all data about this collection, not including the items in the collection. This read-only request has no access restrictions.
GetCollection(collectionId);
  • collectionId - The identifier for the collection.
Response - An XML container that contains all information about the collection that has been assigned using the PutCollection and PutCollectionAnnotation calls. This includes standard information about the collection (name, format, number of items, etc.), the collection record assigned by the agent and any attachments that have been assigned to the collection record. XML response shown below.


ListCollections

Gets all data about all collections (not including the items). This read-only request has no access restrictions unless specified. The response format would be identical to ListRecords except it contains additional, collection-related information.
ListCollections(metadataSourceId/agentId);
  • metadataSourceId/agentId (optional) - An identifier for a metadataSource or agent. If included the response includes only those collections that pertain to that metadataSource or agent. If omitted, the response includes all collections.
Response - A list of XML containers that contain all information about the collections that have been assigned using the PutCollection and PutCollectionAnnotation calls, i.e. a list of the same information packets returned by GetCollection. XML response is shown below.
   <record>
     <header>
       <recordId>id/aaaa</recordId>
       <datestamp>2009-10-12T22:56:23Z</datestamp> 
       <xmlFormat>ncs_collect</xmlFormat>
       <accessRestrictedTo>metadataSource|agent</accessRestrictedTo>
     </header>
     <!-- Response container from GetCollection and ListCollections requests -->
     <NDRCollectionInfo xmlns="xxx">
       <name>Beyond Polar Bears</name>
       <description>A collection about more than just polar bears</description>
       <collectionId>2200/20091013232544291T</collectionId>
       <contentFormat>
         <type>XML|binaryContent</type>
         <xmlFormat>nsdl_dc</xmlFormat><!-- If type=XML -->
       </contentFormat>
       <!-- High-level data about the items in the collection and how to access them -->
       <items>
         <numMetadataRecords>2345</numMetadataRecords>
         <access>http://ndr.nsdl.org/ndr/collectionsApi?verb=ListRecords&amp;collectionId=2200/20091013232544291T</access>
       </items>
       <!-- Agent with permission to update/delete the collection AND items in the collection -->
       <managingAgent>
         <agentId>2200/xxxx</agentId>
         <agentName>NCS (human-readable name for convenience)</agentName>
       </managingAgent>
       <!-- Entities with permission to update/delete items in the collection -->
       <metadataSources>
         <metadataSource>
           <metadataSourceId>2200/xxxx</metadataSourceId>
           <metadataSourceName>WFI(human-readable name for convenience)</metadataSourceName>
         </metadataSource>
       </metadataSources>
       <groups>
         <group>nsdl</group>
       </groups>
     </NDRCollectionInfo>
     <metadata>
       <ncsCollect xmlns="xxx">
         <dataGoesHere>stuff</dataGoesHere>
       </ncsCollect>
     </metadata>
     <annotations>
       <annotation annoId="2200/20091013232544292T" submittedByEntityId="2200/xxx" 
           submittingEntityType="metadataSource|agent" accessRestrictedTo="metadataSource|agent">
           <xmlFormatXXX xmlns="xxx"> 
             <description>Annotations can contain any XML... for example an ncs_collect record.</description>
           </xmlFormatXXX>
       </annotation>
     </annotations>
   </record>
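A consumer such as the Harvest Manager would typically pull a few fields out of this container. The sketch below parses a trimmed, well-formed copy of the sample with Python's standard library. The namespace "xxx" is the placeholder from the sample, and `summarize_collection` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample response, kept well-formed for parsing.
SAMPLE = """
<record>
  <header>
    <recordId>id/aaaa</recordId>
    <xmlFormat>ncs_collect</xmlFormat>
  </header>
  <NDRCollectionInfo xmlns="xxx">
    <name>Beyond Polar Bears</name>
    <collectionId>2200/20091013232544291T</collectionId>
    <items>
      <numMetadataRecords>2345</numMetadataRecords>
    </items>
    <groups>
      <group>nsdl</group>
    </groups>
  </NDRCollectionInfo>
</record>
"""

NS = {"c": "xxx"}  # placeholder namespace copied from the sample

def summarize_collection(record_xml):
    """Extract a few collection-level fields from a GetCollection record."""
    record = ET.fromstring(record_xml)
    info = record.find("c:NDRCollectionInfo", NS)
    return {
        "recordId": record.findtext("header/recordId"),
        "name": info.findtext("c:name", namespaces=NS),
        "collectionId": info.findtext("c:collectionId", namespaces=NS),
        "numMetadataRecords": int(
            info.findtext("c:items/c:numMetadataRecords", namespaces=NS)),
        "groups": [g.text for g in info.findall("c:groups/c:group", NS)],
    }

summary = summarize_collection(SAMPLE)
print(summary["name"], summary["numMetadataRecords"])
```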
   


API Methods - Items

PutRecord

Creates or updates an XML record in the repository. The client (metadataSource/agent) must be authorized to write to the collection. Records that annotate another record, including collection records, would be recognized by their given xmlFormat, and the annotates relationship would be applied internally by the NDR. This allows the annotation to be returned in GetRecord, ListRecords and ListCollections responses for the annotated resource or collection record.
PutRecord(collectionId, recordId, restrictAccessTo[ ], xml);
  • collectionId - The collection identifier/handle in which the item will be inserted. Collection must exist prior.
  • recordId (optional) - The identifier/handle for the item. Required if the recordId cannot be derived from the XML payload. May be ignored if the recordId can be derived from the XML payload.
  • restrictAccessTo[ ] (optional) - One or more metadataSourceId/agentId to restrict read access to this record in all service responses. If omitted, the record is accessible to all consumers.
  • xml - The XML data. Operation may fail if xml is in the wrong format.
Response - An XML container that provides the result of the request (success, error) and echoes the ID.
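The recordId rule above (required only when it cannot be derived from the payload) might be implemented along these lines. This is a sketch under assumptions: `resolve_record_id` is a hypothetical helper, and `id_path` stands in for whatever per-format knowledge the NDR would have about where an ID lives in the XML.

```python
import xml.etree.ElementTree as ET

def resolve_record_id(xml_payload, record_id=None, id_path="recordId"):
    """Pick the ID to use for a PutRecord call.

    id_path is a hypothetical, per-format location of the ID inside the payload,
    relative to the root element.
    """
    root = ET.fromstring(xml_payload)  # raises ParseError on malformed XML
    derived = root.findtext(id_path)
    if derived and derived.strip():
        # An ID found in the payload wins; the explicit argument may be ignored.
        return derived.strip()
    if record_id:
        return record_id
    raise ValueError(
        "recordId is required when it cannot be derived from the XML payload")
```

Letting the payload ID take precedence follows the "may be ignored" wording; a stricter server could instead reject a call where the two IDs disagree.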


GetRecord

Gets a metadata record from the repository. This read-only operation may have certain access restrictions.
GetRecord(recordId,xmlFormat);
  • recordId - The identifier/handle for the item.
  • xmlFormat (optional) - The XML format to return the item in. If included, the repository will return the item in the given format or return an error if not supported. If omitted, the item is returned in its native format.
Response - An XML container with the item XML. This includes the record assigned by the agent or metadataSource and any annotations that have been attached to that record. Items and annotations that have restricted access rights are returned to authorized agents/metadataSources only. XML response shown below.
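The access rule for restricted items and annotations can be sketched as a filter applied before a response is built. The dictionary layout and function names here are hypothetical; only the `accessRestrictedTo` name comes from the sample responses.

```python
def visible_to(entry, requester_id=None):
    """An entry with no restriction list is readable by all consumers."""
    restricted = entry.get("accessRestrictedTo") or []
    return not restricted or requester_id in restricted

def filter_record(record, requester_id=None):
    """Return the record as the requester may see it, or None if it is hidden."""
    if not visible_to(record, requester_id):
        return None
    visible = dict(record)
    # Annotations carry their own restrictions, checked independently.
    visible["annotations"] = [
        anno for anno in record.get("annotations", [])
        if visible_to(anno, requester_id)
    ]
    return visible
```

An anonymous consumer would then receive the public record with any restricted annotations silently left out.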


ListXmlFormats

Lists the XML formats available from the repository.
ListXmlFormats(recordId);
  • recordId (optional) - If included, the list shows all xmlFormats available for the record. If omitted, the list shows all xmlFormats available from the repository as a whole.
Response - An XML container with the available xmlFormats.


DeleteRecord

Removes a metadata record from the repository. Client (metadataSource/agent) must be authorized to write to the collection.
DeleteRecord(recordId);
  • recordId - The identifier/handle for the item.
Response - An XML container that provides the result of the request (success, error) and echoes the ID.


PutContent

Creates or updates a binaryContent entry in the repository. Client (metadataSource/agent) must be authorized to write to the collection.
PutContent(collectionId, recordId, contentType, binaryContent, restrictAccessTo[ ]);
  • collectionId - The collection identifier/handle in which the item will be inserted. Collection must exist prior.
  • recordId - The identifier/handle for the item.
  • contentType - The contentType / mime-type for the item.
  • binaryContent - The binary content (base64).
  • restrictAccessTo[ ] (optional) - One or more metadataSourceId/agentId to restrict read access to this content. If omitted, the content is available to all consumers.
Response - The binary content or an XML container that provides an error message plus the appropriate HTTP status header (e.g. 404, 401).
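Since binaryContent is passed as base64, a client would encode the bytes before building the request. A minimal sketch, assuming the argument names listed above are used as HTTP argument keys; `encode_content_args` is a hypothetical helper.

```python
import base64

def encode_content_args(collection_id, record_id, content_type, data):
    """Build the PutContent arguments, base64-encoding the binary payload."""
    return {
        "collectionId": collection_id,
        "recordId": record_id,
        "contentType": content_type,
        "binaryContent": base64.b64encode(data).decode("ascii"),
    }

args = encode_content_args("2200/20091013232544291T", "id/xxx",
                           "image/png", b"\x89PNG")
# Decoding on the server side recovers the original bytes.
original = base64.b64decode(args["binaryContent"])
```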


GetContent

Gets a binary item from the repository. This read-only operation may have certain access restrictions.
GetContent(recordId);
  • recordId - The identifier/handle for the item.
Response - The binary content, or error if not available.


DeleteContent

Removes a content item from the repository. Client (metadataSource/agent) must be authorized to write to the collection.
DeleteContent(recordId);
  • recordId - The identifier/handle for the item.
Response - An XML container that provides the result of the request (success, error) and echoes the ID.


ListRecords/ListIdentifiers

Lists the metadata records in the collection or entire repository, similar to OAI ListRecords/ListIdentifiers but more flexible. ListRecords returns a list of the same data returned by GetRecord. This read-only operation may have certain access restrictions.
ListRecords/ListIdentifiers(collectionId, xmlFormat, query, resumptionToken);
  • collectionId (optional) - The collection identifier/handle to limit the items returned to. If omitted, all records in the repository are considered.
  • xmlFormat (optional) - The xml format in which the items will be returned. Only items that can be disseminated in this format will be returned. If omitted, items are returned in their native format.
  • query (optional) - A query to limit the records returned, similar to or the same as the find operation in the current NDR API. This would support a limited number of operations. (Open question: we need to discuss the trade-off between flexibility for clients and overhead/complexity for the server. Do we need query functionality?)
  • resumptionToken (optional) - For large lists (say more than 2000 items), responses may be broken into multiple parts. This argument is used for flow management to retrieve the next portion of the complete list of records. If included, the other arguments are omitted.
Response - An XML container that contains a list of XML headers, metadata and annotations. Each container has a header element with the recordId, collectionId(s), and datestamp and a metadata element with the record XML and optionally an annotations element. ListRecords returns the header, metadata and annotations; ListIdentifiers returns just the header. If a resumptionToken is included in the response, this indicates that the list is partial and the resumptionToken should be used to fetch the next portion of the complete list. Multiple responses with resumptionTokens may be returned to complete the list.


   <!-- Response container included in the GetRecord, ListRecords and ListIdentifiers responses -->
   <record>
     <header>
       <recordId>id/xxx</recordId>
       <datestamp>2009-10-12T22:56:23Z</datestamp> 
       <xmlFormat>nsdl_dc</xmlFormat>
       <collectionId>2200/20091013232544291T</collectionId>
       <accessRestrictedTo>metadataSource|agent</accessRestrictedTo>
     </header>
     <!-- GetRecord and ListRecords return metadata, ListIdentifiers does not -->
     <metadata>
       <nsdl_dc:nsdl_dc xmlns:dct="http://purl.org/dc/terms/" xmlns:dc="http://purl.org/dc/elements/1.1/" etc>
         <dc:title>How to Convert Temperatures</dc:title>
         <dc:identifier>http://myschool.org/index.html</dc:identifier>
       </nsdl_dc:nsdl_dc>
     </metadata>
     <annotations>
       <annotation>
         <header>
           <recordId>id/aaa</recordId>
           <datestamp>2009-10-15T22:26:33Z</datestamp> 
           <xmlFormat>nsdlAnnotation</xmlFormat>
           <collectionId>2200/20091013232544291T</collectionId>
           <accessRestrictedTo>metadataSource|agent</accessRestrictedTo>
         </header>
         <metadata>   
           <nsdlAnnotation xmlns:dct="http://xxxx" etc>
             <recordId>id/aaa</recordId>
             <annotatedRecordId>id/xxx</annotatedRecordId>
             <title>This annotation is about...</title>
           </nsdlAnnotation>
         </metadata>
       </annotation>
     </annotations>
   </record> 
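The resumptionToken flow described for ListRecords/ListIdentifiers amounts to a simple client loop. In the sketch below, `fetch_page` is a hypothetical callable that takes the request arguments as a dict and returns the records of one partial response together with its resumptionToken, or None when the list is complete.

```python
def list_all_records(fetch_page, collection_id=None, xml_format=None):
    """Yield every record, following resumptionTokens until the list is complete."""
    args = {"collectionId": collection_id, "xmlFormat": xml_format}
    while True:
        records, token = fetch_page(args)
        yield from records
        if not token:
            break  # no token in the response: this was the final portion
        # When a resumptionToken is sent, the other arguments are omitted.
        args = {"resumptionToken": token}

# Usage with a stand-in for the server that returns two partial lists.
pages = {None: (["id/1", "id/2"], "t1"), "t1": (["id/3"], None)}

def fake_fetch(args):
    return pages[args.get("resumptionToken")]

all_ids = list(list_all_records(fake_fetch, collection_id="2200/20091013232544291T"))
```

Writing the loop as a generator lets a client start processing records before the complete list has been fetched.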


API Methods - Administration

PutAgent

Creates an agent in the NDR that has permission to create collections and metadataSources. Open questions: who has permission to execute this request, and how are keys created and distributed?
PutAgent(agentId,name,description);
  • agentId - The identifier/handle for the agent.
  • name - A simple name for the agent, for human consumption
  • description (optional) - A description for the agent, for human consumption
Response - An XML container that provides the result of the request (success, error).


GetAgent, ListAgents

Gets/Lists the agent(s) in the NDR and what collections they have rights to manage.
GetAgent(agentId), ListAgents( );
  • agentId - The identifier/handle for the agent.
Response - An XML container that provides the necessary data.


PutMetadataSource

Creates a metadataSource in the NDR that has permission to add items to one or more collections. Only agents have permission to execute this request. This only creates the metadataSource, but does not assign it permission to edit collections. A metadataSource can also be created using the PutCollection request, where permission is granted to the metadataSource to write items in a collection.
PutMetadataSource(metadataSourceId,name,description);
  • metadataSourceId - The identifier/handle for the metadataSource.
  • name - A simple name for the metadataSource, for human consumption
  • description (optional) - A description for the metadataSource, for human consumption
Response - An XML container that provides the result of the request (success, error).


GetMetadataSource

Gets the metadataSource in the NDR with this ID and what collections it has permissions to write to.
GetMetadataSource(metadataSourceId);
  • metadataSourceId - The identifier/handle for the metadataSource
Response - An XML container that provides the necessary data.


ListMetadataSources

Lists the metadataSource(s) in the NDR and what collections they have permission to write to.
ListMetadataSources(collectionId);
  • collectionId (optional) - If provided, list the metadataSources for the given collection; if not provided, list all metadataSources in the NDR.
Response - An XML container that provides the necessary data.


PutAnnotationRelationPath

Defines or updates the XPath to the annotated recordId for a given XML format. Once defined, previous and subsequent records that reside in the NDR in the given xmlFormat will be inspected, their annotated records will acquire the implicit annotatedBy relationship, and the annotation XML will be returned along with the record in GetRecord, ListRecords, GetCollection and ListCollections responses. Restricted to agents and metadataSources.
PutAnnotationRelationPath(xmlFormat,xPath);
  • xmlFormat - The XML format of the annotation record, for example 'dlese_anno', 'nsdl_anno', 'ncs_application_data_anno'.
  • xPath - The XPath to the recordId inside the annotation record.
Response - An XML container that provides the result of the request (success, error).
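The introspection step could work roughly as follows: the NDR keeps a table from xmlFormat to XPath and evaluates the path against each record in that format. This sketch uses Python's ElementTree, which supports only a subset of XPath and evaluates paths relative to the root element; the function names and the in-memory table are hypothetical.

```python
import xml.etree.ElementTree as ET

relation_paths = {}  # xmlFormat -> XPath to the annotated recordId

def put_annotation_relation_path(xml_format, xpath):
    """Define or update the path for a format (in-memory stand-in for the NDR)."""
    relation_paths[xml_format] = xpath

def annotated_record_id(xml_format, annotation_xml):
    """Return the recordId this annotation points at, or None if not resolvable."""
    xpath = relation_paths.get(xml_format)
    if xpath is None:
        return None  # no path defined: the format is not treated as an annotation
    root = ET.fromstring(annotation_xml)
    value = root.findtext(xpath)
    return value.strip() if value and value.strip() else None

put_annotation_relation_path("nsdlAnnotation", "annotatedRecordId")

ANNOTATION = """
<nsdlAnnotation>
  <recordId>id/aaa</recordId>
  <annotatedRecordId>id/xxx</annotatedRecordId>
  <title>This annotation is about...</title>
</nsdlAnnotation>
"""
target = annotated_record_id("nsdlAnnotation", ANNOTATION)
```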


ListAnnotationRelationPaths

Lists all xmlFormats that have an XPath to an annotated recordId defined, and the XPath.
ListAnnotationRelationPaths( );
Response - An XML container that provides a list of xmlFormats that have defined annotated recordId XPaths, and the XPath that was defined.


Notes

This abstract API was first represented as a sketch on a whiteboard: http://ndrtest.nsdl.org/api/get/2200/test.20091015110012957T/content
