Community:Collections and Metadata/OAIResourceIdentifiers

From NSDLWiki

Proposed OAI Identifiers for aggregated resource-oriented metadata

One of the challenges in redistributing aggregated metadata records about resources is that the metadata is harvested from multiple sources, so the aggregated metadata record is essentially an abstraction. It therefore doesn't conform to our notion of derived OAI identifiers, in which the identifier that we serve is a composite of our identifier for the metadata provider (their Naming Authority) and the id that they assigned to that record.

Another challenge is that when retrieving aggregated metadata about a resource from multiple metadata providers, there may be many minor variations of the resource URL that we use to identify metadata records related to the same resource. While acknowledging that the 'equivalence problem' related to identifying the same resource served in multiple formats, from mirrored servers, and from different archives is indeed a much larger problem, we think it's still important to make some attempt to normalize minor URL variants that would otherwise balkanize the metadata related to a single URL.

Tim and I spent some time discussing these problems and came up with a proposal which is essentially this:

- Create two OAI identifiers for each URL for which we have metadata. For the URL http://www.exploratorium.edu, each identifier would follow the usual NSDL OAI identifier construct:
  - protocol: 'oai'
  - domain: 'nsdl.org'
  - the namespace of the process that will return the metadata, either 'nsdl.uri' or 'nsdl.likeuri'
  - the URL-encoded URI: http%3A%2F%2Fwww.exploratorium.edu

  The two resulting identifiers are:
  - oai:nsdl.org:nsdl.uri:http%3A%2F%2Fwww.exploratorium.edu
    returns metadata related to the specific URI provided.
  - oai:nsdl.org:nsdl.likeuri:http%3A%2F%2Fwww.exploratorium.edu
    returns metadata related to a normalized version of the URI provided.
- Create a record in the MR for each of the resource identifiers that would be served.
- Each record would have an nsdl unique id equal to the OAI id of the record being served.
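
The identifier construction described above can be sketched in Python. This is only an illustration, not the MR's implementation: the function names are hypothetical, and the normalization rule (drop a trailing `index.html` and trailing slashes, lowercase the host) is an assumption inferred from the example URLs in this proposal. It also assumes that the 'nsdl.likeuri' identifier carries the normalized form of the URI.

```python
from urllib.parse import quote, urlsplit, urlunsplit

OAI_PREFIX = "oai:nsdl.org"  # protocol and domain from the proposal


def normalize_uri(uri: str) -> str:
    """Hypothetical normalization rule (an assumption, not part of the proposal):
    drop a trailing 'index.html', strip trailing slashes, lowercase the host."""
    parts = urlsplit(uri)
    path = parts.path
    if path.endswith("/index.html"):
        path = path[: -len("index.html")]
    path = path.rstrip("/")
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), path, parts.query, parts.fragment)
    )


def oai_identifiers(uri: str) -> dict:
    """Build the exact-match and 'like'-match identifiers for one URI."""
    exact = quote(uri, safe="")  # safe="" so ':' and '/' are percent-encoded too
    like = quote(normalize_uri(uri), safe="")
    return {
        "nsdl.uri": f"{OAI_PREFIX}:nsdl.uri:{exact}",
        "nsdl.likeuri": f"{OAI_PREFIX}:nsdl.likeuri:{like}",
    }


if __name__ == "__main__":
    for name, identifier in oai_identifiers(
        "http://www.exploratorium.edu/index.html"
    ).items():
        print(name, identifier)
```

Under these assumptions, the variants http://www.exploratorium.edu, http://www.exploratorium.edu/ and http://www.exploratorium.edu/index.html each get a distinct 'nsdl.uri' identifier but share one 'nsdl.likeuri' identifier.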

Requesting resource-related metadata from the MR would become a simple matter of specifying an exact or a 'like' match and supplying a URI.

**dc:identifier Table**
| dc:identifier of resource supplied by metadata provider | mrec_id of original source record | mrec_id of nsdl.uri | nsdl_unique_id of nsdl.uri | dc:identifier 'Normalized' | mrec_id of nsdl.likeuri | nsdl_unique_id of nsdl.likeuri |
|---|---|---|---|---|---|---|
| http://www.exploratorium.edu | 1001 | 2001 | nsdl.uri:http%3A%2F%2Fwww.exploratorium.edu | http://www.exploratorium.edu | 3001 | nsdl.likeuri:http%3A%2F%2Fwww.exploratorium.edu |
| http://www.exploratorium.edu/index.html | 1002 | 2002 | nsdl.uri:http%3A%2F%2Fwww.exploratorium.edu%2Findex.html | http://www.exploratorium.edu | 3001 | nsdl.likeuri:http%3A%2F%2Fwww.exploratorium.edu |
| http://www.exploratorium.edu/ | 1003 | 2003 | nsdl.uri:http%3A%2F%2Fwww.exploratorium.edu%2F | http://www.exploratorium.edu | 3001 | nsdl.likeuri:http%3A%2F%2Fwww.exploratorium.edu |
| http://www.exploratorium.edu | 1004 | 2001 | nsdl.uri:http%3A%2F%2Fwww.exploratorium.edu | http://www.exploratorium.edu | 3001 | nsdl.likeuri:http%3A%2F%2Fwww.exploratorium.edu |
| http://www.exploratorium.edu/ | 1005 | 2003 | nsdl.uri:http%3A%2F%2Fwww.exploratorium.edu%2F | http://www.exploratorium.edu | 3001 | nsdl.likeuri:http%3A%2F%2Fwww.exploratorium.edu |

Based on the above set of 5 metadata records from 5 collections, the mrec_id served and the original metadata records aggregated and returned for each request would be:

oai:nsdl.org:nsdl.uri:http%3A%2F%2Fwww.exploratorium.edu, mrec_id = 2001, records served = 1001, 1004
oai:nsdl.org:nsdl.uri:http%3A%2F%2Fwww.exploratorium.edu%2F, mrec_id = 2003, records served = 1003, 1005
oai:nsdl.org:nsdl.uri:http%3A%2F%2Fwww.exploratorium.edu%2Findex.html, mrec_id = 2002, records served = 1002
oai:nsdl.org:nsdl.likeuri:http%3A%2F%2Fwww.exploratorium.edu, mrec_id = 3001, records served = 1001, 1002, 1003, 1004, 1005
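
The groupings listed above can be reproduced with a short Python sketch. The five (URI, mrec_id) pairs are taken from the dc:identifier table. The `normalize` helper is a hypothetical stand-in for whatever normalization rule is eventually adopted, and here it only handles the variants that appear in the table.

```python
from collections import defaultdict

# (dc:identifier as supplied, mrec_id of the original source record),
# taken from the dc:identifier table in this proposal.
SOURCE_RECORDS = [
    ("http://www.exploratorium.edu", 1001),
    ("http://www.exploratorium.edu/index.html", 1002),
    ("http://www.exploratorium.edu/", 1003),
    ("http://www.exploratorium.edu", 1004),
    ("http://www.exploratorium.edu/", 1005),
]


def normalize(uri):
    """Hypothetical rule (an assumption): drop 'index.html' and trailing slashes."""
    if uri.endswith("/index.html"):
        uri = uri[: -len("index.html")]
    return uri.rstrip("/")


def group_records(records):
    """Return (exact, like): source mrec_ids grouped by the URI as supplied
    and by the normalized URI."""
    exact = defaultdict(list)
    like = defaultdict(list)
    for uri, mrec_id in records:
        exact[uri].append(mrec_id)
        like[normalize(uri)].append(mrec_id)
    return dict(exact), dict(like)


if __name__ == "__main__":
    exact, like = group_records(SOURCE_RECORDS)
    for uri, ids in exact.items():
        print("nsdl.uri    ", uri, ids)
    for uri, ids in like.items():
        print("nsdl.likeuri", uri, ids)
```

An exact-match request only sees the source records whose supplied URI is identical, while a 'like' request sees every record whose URI normalizes to the same value.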

We think that we only have to add two fields to the existing uri_index table, holding the mrec_id of the actual_uri and like_uri aggregation records, in order to create the necessary relationships.
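
As a rough illustration of that idea, the sketch below models uri_index in an in-memory SQLite database. The existing schema of uri_index is not described here, so the `mrec_id` and `uri` columns are assumptions, and the two added column names (`actual_uri_mrec_id`, `like_uri_mrec_id`) are hypothetical. The row values come from the dc:identifier table.

```python
import sqlite3

# Only these two (hypothetical) columns may be used to look up an aggregation.
AGGREGATE_COLUMNS = ("actual_uri_mrec_id", "like_uri_mrec_id")


def build_uri_index():
    """Create an in-memory stand-in for uri_index with the two added fields."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE uri_index (
            mrec_id            INTEGER PRIMARY KEY,  -- original source record (assumed column)
            uri                TEXT NOT NULL,        -- dc:identifier as supplied (assumed column)
            actual_uri_mrec_id INTEGER,              -- added: nsdl.uri aggregation record
            like_uri_mrec_id   INTEGER               -- added: nsdl.likeuri aggregation record
        )
        """
    )
    rows = [
        (1001, "http://www.exploratorium.edu", 2001, 3001),
        (1002, "http://www.exploratorium.edu/index.html", 2002, 3001),
        (1003, "http://www.exploratorium.edu/", 2003, 3001),
        (1004, "http://www.exploratorium.edu", 2001, 3001),
        (1005, "http://www.exploratorium.edu/", 2003, 3001),
    ]
    conn.executemany("INSERT INTO uri_index VALUES (?, ?, ?, ?)", rows)
    return conn


def source_records(conn, column, aggregate_mrec_id):
    """List the source mrec_ids that feed one aggregation record."""
    if column not in AGGREGATE_COLUMNS:
        raise ValueError(f"unknown aggregation column: {column}")
    cursor = conn.execute(
        f"SELECT mrec_id FROM uri_index WHERE {column} = ? ORDER BY mrec_id",
        (aggregate_mrec_id,),
    )
    return [row[0] for row in cursor]


if __name__ == "__main__":
    conn = build_uri_index()
    print(source_records(conn, "actual_uri_mrec_id", 2001))
    print(source_records(conn, "like_uri_mrec_id", 3001))
```

With both fields in place, finding the source records behind either kind of aggregation record is a single indexed lookup on one column.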

Serving nsdl_augmented metadata using the nsdl_aug metadata prefix

Making a request for a standard metadata record in one of the nsdl_augmented metadata formats (which are by design resource-oriented) would return an augmented record based on the nsdl.likeuri relationships.

Both of the following records would return the same metadata aggregated from the original records == 1001, 1002, 1003, 1004, 1005
