TNS Internal:NDR/API/2.0/DataAnalysis/patternsNSDLCollections/NCS

PATTERN: NSDL Collection Records (NSDL CR)

  CAgg  CMDP     OAgent  R  M  M-MDP    Extra Parent
  X     ncs:CMF  NCS     X  X  NSDL CR


How to identify this pattern?

  • Get M with Query: * <http://ns.nsdl.org/api/relationships#metadataFor> <info:fedora/hdl:2200%2F__AGGREGATOR_HANDLE__>
  • Get CMDP with Query: <info:fedora/hdl:2200%2F__METADATA_HANDLE__> <http://ncs.nsdl.orgcollectionMetadataFor> *
  • If the CMDP query has results, then this pattern has been met.
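
The two queries above can be assembled from object handles. The following is a minimal sketch, not the NDR API itself: the function names are hypothetical, the predicate URIs are copied exactly as written in the queries above (the second one has not been verified against the real namespace), and the handle is assumed to be percent-encoded into the `info:fedora/hdl:` form shown there.

```python
from urllib.parse import quote

# Predicate URIs copied verbatim from the queries above (not independently verified).
METADATA_FOR = "http://ns.nsdl.org/api/relationships#metadataFor"
COLLECTION_METADATA_FOR = "http://ncs.nsdl.orgcollectionMetadataFor"


def fedora_uri(handle: str) -> str:
    """Turn a handle such as '2200/ABC' into 'info:fedora/hdl:2200%2FABC'."""
    return "info:fedora/hdl:" + quote(handle, safe="")


def metadata_query(aggregator_handle: str) -> str:
    """Triple pattern that finds the M objects pointing at an aggregator."""
    return f"* <{METADATA_FOR}> <{fedora_uri(aggregator_handle)}>"


def cmdp_query(metadata_handle: str) -> str:
    """Triple pattern that finds the CMDP referenced by a metadata object."""
    return f"<{fedora_uri(metadata_handle)}> <{COLLECTION_METADATA_FOR}> *"
```

A caller would run `metadata_query` first, then `cmdp_query` for each M handle returned; any result from the second query means the pattern has been met.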


Query to just get NSDL Collection Record style collection aggregators:

Actual Query:

  SELECT t5.o, t5.s, t17.o, t36.s, t36.o
  FROM (t5 JOIN t17 ON t5.s = t17.o)
  JOIN t36 
  ON t17.s = t36.s
  WHERE t5.o = '"Aggregator"'
  LIMIT 2000;

RESULTS: 248


Query with English interpretations included (won't run):

  SELECT t5.o(objectType), t5.s(CAggr), t17.o(CAggr), t36.s(M), t36.o(CMDP)
  FROM (t5(objectType_table) JOIN t17(metadataFor_table) ON t5.s(CAggr) = t17.o(CAggr))
  JOIN t36(ncs:CollectionMetadataFor_table) 
  ON t17.s(M) = t36.s(M)
  WHERE t5.o(objectType) = '"Aggregator"'
  LIMIT 2000;
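
To see how the join behaves, the actual query can be run against toy data. This is only an illustrative sketch: it assumes each of `t5`, `t17`, and `t36` is a two-column subject/object table (`s`, `o`), uses an in-memory SQLite database rather than the real triple store, and the row values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Assumed layout: one (subject, object) table per predicate.
for table in ("t5", "t17", "t36"):
    cur.execute(f"CREATE TABLE {table} (s TEXT, o TEXT)")

# t5: objectType. Note the stored value includes the double quotes.
cur.executemany("INSERT INTO t5 VALUES (?, ?)", [
    ("caggr-1", '"Aggregator"'),
    ("caggr-2", '"Aggregator"'),
    ("mdp-1", '"MetadataProvider"'),
])
# t17: metadataFor (M -> CAggr).
cur.executemany("INSERT INTO t17 VALUES (?, ?)", [
    ("m-1", "caggr-1"),
    ("m-2", "caggr-2"),
])
# t36: ncs:collectionMetadataFor (M -> CMDP). Only m-1 has one.
cur.execute("INSERT INTO t36 VALUES (?, ?)", ("m-1", "cmdp-1"))

rows = cur.execute("""
    SELECT t5.o, t5.s, t17.o, t36.s, t36.o
    FROM (t5 JOIN t17 ON t5.s = t17.o)
    JOIN t36
    ON t17.s = t36.s
    WHERE t5.o = '"Aggregator"'
    LIMIT 2000;
""").fetchall()

print(rows)
```

Only `caggr-1` is returned here, because it is the only aggregator whose metadata object also has a `collectionMetadataFor` row.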


Parts of existing objects

Legend:

  used as part of v2.0 objects; surfaced through v2.0 get calls
  available in non-collection object
  not available via v2.0 calls
  Deprecated - won't be created or used going forward


Name Path Example Value Use in 2.0 Comments
Collection Aggregator (CAggr) MC-CAggr object See comments for each part
properties
crs:collectionID oai:crs.nsdl.org:4781 carry forward as legacy data created by harvest ingest??? ONLY for older collections.
datastreams
DC MC-CAggr-d:DC USE AS IS
dc:title Aggregator USE AS IS
dc:identifier [handle for this object] USE AS IS
serviceDescription DEPRECATED - will continue to exist as legacy data
dc:title Aggregator for Chem Ed DL DEPRECATED
dc:description Collection of Chem Ed DL items DEPRECATED
dc:type Aggregator DEPRECATED
image brandURL, title, width, height, alttext DEPRECATED
contacts name,email LC-CAggr-p:contact(s);
LC-CM-d:format_ndr_collection!contacts;
MS-CAggr-p:contact(s);
MS-CM-d:format_ndr_collection!contacts
requires reformatting for both
relationships
nsdl:associatedWith R handle LC-CAggr-r:associatedWith R USE AS IS; will be part of MS because it won't be removed
nsdl:memberOf Aggr 2200/NSDL_Collection_of_Collections_Aggregator LC-CAggr-r:memberOf parentAggr 1st pass, leave as is for MS;
2nd pass REMOVE from MS;
parent NDR Collection that makes this a Library Collection
nsdl:aggregatorFor Agent 2200/NCS LC-CAggr-r:aggregatorFor OAgent 1st pass, leave as is for MS;
2nd pass UPDATE to appropriate OAgent for MS type;
NCS is OAgent for LC
nsdl:authorizedToChange Agent 2200/NCS LC-CAggr-r:authorizedToBeChangedBy;
MS-CAggr-r:authorizedToBeChangedBy
causes new Metadata Source NDR Collection for NCS if not using this object for NCS MS
nsdl:authorizedToChange Agent 2200/NSDL_Harvest_Ingest MS-CAggr-r:authorizedToBeChangedBy causes new Metadata Source NDR Collection for Harvest Ingest if not using this object for HI MS;
should this be removed from LC or left as legacy data???
nsdl:authorizedToChange Agent 2200/20091105132121677T (WFI) MS-CAggr-r:authorizedToBeChangedBy causes new Metadata Source NDR Collection for WFI if not using this object for WFI MS;
should this be removed from LC or left as legacy data???
Collection Metadata Provider (CMDP) MS-CMDP object See comments for each part
Comes from M ncs:collectionMetadataFor CMDP
properties
nsdl:setName Chemical Education DL carry forward as legacy data
nsdl:setSpec ncs-NSDL-COLLECTION-000-003-111-983 carry forward as legacy data
ncs:nativeFormat nsdl_dc carry forward as legacy data
oai:visibility public carry forward as legacy data
datastreams
DC MS-CMDP-d:DC USE AS IS
dc:title Metadata Provider USE AS IS
dc:identifier [handle for this object] USE AS IS
serviceDescription -- DEPRECATED
DEPRECATED exactly the same as CAggr serviceDescription
relationships
nsdl:metadataProviderFor Agent 2200/NCS LC-CMDP-r: metadataProviderFor LC-OAgent 1st pass, leave as is for MS;
2nd pass UPDATE to appropriate OAgent for MS type;
NCS is OAgent for LC
nsdl:aggregatedBy Aggr handle MS-CMDP-r: aggregated by MS-CAggr USE AS IS
nsdl:authorizedToChange Agent 2200/NCS how is this different from CAggr's use of authorizedToChange?
which should cause new Metadata Source Collection?
nsdl:authorizedToChange Agent 2200/NSDL_Harvest_Ingest how is this different from CAggr's use of authorizedToChange?
which should cause new Metadata Source Collection?
Owner Agent (OAgent) LC-OAgent object NO UPDATES TO NCS Agent
2200/NCS is OAgent -- Used as is as OAgent for LC and NCS MS
Resource (R) MS-R object and LC-R object Using for both because it doesn't really matter. See comments for each part
properties
nsdl:hasResourceURL http://sunearth.gsfc.nasa.gov/ LC-CAggr-p:hasResourceURL;
LC-CM-d:format_ndr_collection!resourceURL;
LC-R-p:hasResourceURL
USE AS IS -- do we want to set up the properties of MS to point to this as well? Or just leave the relationship without any specific NDR collection connections?
datastreams
DC LC-R-d:DC USE AS IS
dc:title Resource USE AS IS
dc:identifier [handle for this object] USE AS IS
content_1 redirect to resourceURL USE AS IS
relationships
nsdl:memberOf Aggr 2200/20080317171343451T Aggr for NSDL Collection Records
nsdl:memberOf Aggr handle may or may not be related to the CAggr
Metadata (M) non-Collection Object USE AS IS;
ADD metadataFor relationship with LC-CAggr
properties
nsdl:uniqueID NSDL-COLLECTION-000-003-111-983 LC-CAggr-p:externalIdentifier;
LC-CM-d:format_ndr_collection!externalIdentifier;
NCS:MS-CAggr-p:externalIdentifier;
NCS:MS-CM-d:format_ndr_collection!externalIdentifier
created by NCS; same as ncs:recordId
nsdl:itemID oai:nsdl.org:NSDL-COLLECTION-000-003-111-983 carry forward as legacy data
ncs:recordId NSDL-COLLECTION-000-003-111-983 carry forward as legacy data created by NCS; same as nsdl:uniqueID
ncs:status NCSFinalStatus carry forward as legacy data created by NCS
ncs:isValid true carry forward as legacy data created by NCS
oai:visibility public carry forward as legacy data used to generate PUBLIC OAI and SEARCH OAI
datastreams
DC USE AS IS
dc:title NSDL Repostiory Metadata USE AS IS
dc:identifier [handle for this object] ex. 2200/2006... USE AS IS
dc:identifier [handle for this object] ex. hdl:2200%2F2006... USE AS IS
dc:date 2008-04-29T14-54-34Z USE AS IS
format_nsdl_dc USE AS IS
datastream will be unchanged
dc:title Chemical Education Digital Library LC-CAggr-p:title; LC-CM-d:format_ndr_collection!title datastream will be unchanged;
access via 2.0 metadata calls TBD
dc:description ChemEd DL aims to... LC-CAggr-p:description; LC-CM-d:format_ndr_collection!description datastream will be unchanged;
access via 2.0 metadata calls TBD
format_ncs_collect carry forward as legacy data datastream will be unchanged;
access via 2.0 metadata calls TBD
format_dcs_data carry forward as legacy data datastream will be unchanged;
access via 2.0 metadata calls TBD
relationships
nsdl:metadataProvidedBy MDP 2200/20080317171403522T NSDL Collection Records
nsdl:metadataFor Aggr handle metadataFor LC-CAggr
nsdl:metadataFor R handle metadataFor R
ncs:collectionMetadataFor MDP handle The object referenced by this relationship becomes the LC-CMDP;
access to this relationship is not available with 2.0 calls
Metadata Provider (M-MDP) for Metadata No modifications to this object. Not part of collection.
2200/20080317171403522T MetadataProvider for NSDL Collection Records
datastreams
serviceDescription -- DEPRECATED DEPRECATED
dc:contacts System Programmer, systems@nsdl.org DEPRECATED
Extra Parent Aggregator
N/A


Matching Existing Data to New NDR Collection Structure

The following are the objects expected to exist for NDR Collections. Two types of NDR Collection are created for an NSDL Library Collection: one NDR Collection for the Library Collection itself, and one NDR Collection for each Metadata Source. The tables below identify how each object for the two collection types will be created or converted from existing objects.


Library Collection - NDR Collection info

To avoid having to touch every resource and every metadata object, the Library Collection will be created as a new NDR Collection using the addCollection call.


Name From (api call) Conversion Comments
Object Field Type Identifier/Path
Collection Aggregator
properties
title [from inputXML] M datastream format_nsdl_dc -> dc:title
description [from inputXML] M datastream format_nsdl_dc -> dc:description
hasResourceURL [from inputXML] R property hasResourceURL
contact [from inputXML] CAggr datastream serviceDescription -> contacts requires reformatting
externalIdentifier [from inputXML] M properties nsdl:uniqueID
datastreams
DC
dc:title
"Aggregator" hardcode "Aggregator"
dc:identifier
[handle for this object] [handle for LC-CAggr object]
relationships
collectionComponentOf [handle for this object] [handle for LC-CAggr object]
associatedWith [handle for R] [handle for LC-R object]
aggregatorFor [handle for OAgent] [handle for LC-OAgent object]
authorizedToChange [handle for OAgent] [handle for LC-OAgent object]
Collection Metadata Provider
properties
none
datastreams
DC
dc:title
"MetadataProvider" hardcode "MetadataProvider"
dc:identifier
[handle for this object] [handle for LC-CMDP object]
relationships
collectionComponentOf [handle for this object] [handle for LC-CAggr object]
metadataProviderFor [handle for OAgent] [handle for LC-OAgent object]
authorizedToChange [handle for OAgent] [handle for LC-OAgent object]
Collection Metadata
properties
none
datastreams
DC
dc:title
"Metadata" hardcode "Metadata"
dc:identifier
[handle for this object] [handle for LC-CM object]
format_ndr_collection
title
[from inputXML] M datastream format_nsdl_dc -> dc:title
description
[from inputXML] M datastream format_nsdl_dc -> dc:description
resourceURL
[from inputXML] R property hasResourceURL
contacts!contact
[from inputXML] CAggr datastream serviceDescription -> contacts requires reformatting
externalIdentifier
[from inputXML] M properties nsdl:uniqueID
relationships
collectionComponentOf [handle for this object] [handle for LC-CAggr object]
metadataProvidedBy [handle for CMDP] [handle for LC-CMDP object]
metadataFor [handle for CAggr] [handle for LC-CAggr object]


Metadata Source - NDR Collection info

The majority of NCS pattern collections have a single source of metadata, either NCS or Harvest Ingest. The existing CAggr and CMDP will be used for the primary Metadata Source NDR Collection. Any additional sources will be created using the addCollection call.


Create 1-3 metadata source NDR Collections based on...

  • add Harvest Ingest if CAggr has authorizedToChange relationship with 2200/NSDL_Harvest_Ingest
  • add NCS if CAggr has authorizedToChange relationship with 2200/NCS
  • add WFI if CAggr has authorizedToChange relationship with 2200/20091105132121677T

NOTE: The primary metadata source collection is not created, since it uses the existing objects.
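
The rule above can be sketched as a lookup from agent handle to metadata source. This is a minimal illustration: the function name and the idea of passing in a plain list of agent handles are assumptions, while the three handles are taken from the list above.

```python
# Agent handles copied from the list above.
SOURCE_AGENTS = {
    "2200/NSDL_Harvest_Ingest": "Harvest Ingest",
    "2200/NCS": "NCS",
    "2200/20091105132121677T": "WFI",
}


def metadata_sources(authorized_agents):
    """Return the metadata sources implied by a CAggr's authorizedToChange agents.

    `authorized_agents` is a hypothetical input: the handles of every agent
    the CAggr has an authorizedToChange relationship with.
    """
    present = set(authorized_agents)
    return [name for handle, name in SOURCE_AGENTS.items() if handle in present]
```

For example, `metadata_sources(["2200/NCS", "2200/NSDL_Harvest_Ingest"])` yields two sources; one of them becomes the primary (reusing the existing objects) and the other is created as a new collection.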


The non-primary metadata source collections are created using the addCollection call.


Name From (api call) Conversion Comments
Object Field Type Identifier/Path
Collection Aggregator -- use CAggr for primary MS
properties
title [from inputXML] M datastream format_nsdl_dc -> dc:title prepended with "Harvest Ingest Metadata Source for ", "NCS Metadata Source for ", or "WFI Metadata Source for ";
ADD to primary MS
description [from inputXML] M datastream format_nsdl_dc -> dc:description prepended with title of this MS-CAggr record.;
ADD to primary MS
hasResourceURL [from inputXML] NOT USED
contact [from inputXML] CAggr datastream serviceDescription -> contacts requires reformatting; same as for LC-CAggr;
ADD to primary MS
externalIdentifier [from inputXML] M properties nsdl:uniqueID if begins with "NSDL-COLLECTION", include for NCS; if begins with "oai:" include with Harvest Ingest; NOT USED for WFI;
ADD to primary MS if not WFI
datastreams
DC
dc:title
"Aggregator" hardcode "Aggregator" USE AS IS for primary MS
dc:identifier
[handle for this object] [handle for MS-CAggr object] USE AS IS for primary MS
relationships
collectionComponentOf [handle for this object] [handle for MS-CAggr object] ADD for primary MS
associatedWith [handle for R] [handle for MS-R object] USE AS IS for primary MS
aggregatorFor [handle for OAgent] [handle for MS-OAgent object] For primary MS: 1st pass, leave as is; 2nd pass, update as appropriate to NCS, HI, or WFI
authorizedToChange [handle for OAgent] [handle for MS-OAgent object] USE AS IS for primary MS;
may need to add another as appropriate to NCS, HI, or WFI
Collection Metadata Provider -- use CMDP for primary MS
properties
none
datastreams
DC
dc:title
"MetadataProvider" hardcode "MetadataProvider" USE AS IS for primary MS
dc:identifier
[handle for this object] [handle for MS-CMDP object] USE AS IS for primary MS
relationships
collectionComponentOf [handle for this object] [handle for MS-CAggr object] ADD for primary MS
metadataProviderFor [handle for OAgent] [handle for MS-OAgent object] For primary MS: 1st pass, leave as is; 2nd pass, update as appropriate to NCS, HI, or WFI
authorizedToChange [handle for OAgent] [handle for MS-OAgent object] USE AS IS for primary MS;
may need to add another as appropriate to NCS, HI, or WFI
Collection Metadata -- create from scratch for primary MS
properties
none
datastreams
DC
dc:title
"Metadata" hardcode "Metadata"
dc:identifier
[handle for this object] [handle for MS-CM object]
format_ndr_collection
title
[from inputXML] M datastream format_nsdl_dc -> dc:title prepended with "Harvest Ingest Metadata Source for ", "NCS Metadata Source for ", or "WFI Metadata Source for "
description
[from inputXML] M datastream format_nsdl_dc -> dc:description prepended with title of this MS-CAggr record.
resourceURL
[from inputXML] NOT USED
contacts!contact
[from inputXML] CAggr datastream serviceDescription -> contacts requires reformatting; same as for LC-CAggr
externalIdentifier
[from inputXML] M properties nsdl:uniqueID if begins with "NSDL-COLLECTION", include for NCS; if begins with "oai:" include with Harvest Ingest; NOT USED for WFI
relationships
collectionComponentOf [handle for this object] [handle for MS-CAggr object]
metadataProvidedBy [handle for CMDP] [handle for MS-CMDP object]
metadataFor [handle for CAggr] [handle for MS-CAggr object]
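
The title and externalIdentifier rules in the table above can be sketched as follows. The function names and the source labels used as dictionary keys are hypothetical; the prefixes and the "begins with" tests are taken from the Comments column. The description rule is left out because the table does not say how the title and description are joined.

```python
# Title prefixes from the Comments column, keyed by an assumed source label.
TITLE_PREFIX = {
    "Harvest Ingest": "Harvest Ingest Metadata Source for ",
    "NCS": "NCS Metadata Source for ",
    "WFI": "WFI Metadata Source for ",
}


def ms_title(source, collection_title):
    """Prepend the source-specific prefix to the collection's dc:title."""
    return TITLE_PREFIX[source] + collection_title


def ms_external_identifier(source, unique_id):
    """Return nsdl:uniqueID if it applies to this source, otherwise None."""
    if source == "NCS" and unique_id.startswith("NSDL-COLLECTION"):
        return unique_id
    if source == "Harvest Ingest" and unique_id.startswith("oai:"):
        return unique_id
    # NOT USED for WFI, or the identifier belongs to a different source.
    return None
```

Returning `None` rather than an empty string makes it easy for a caller to omit the externalIdentifier element entirely when it does not apply.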


Contacts Reformatting

Source from CAggr-d:serviceDescription...

Format in serviceDescription:

   <contacts>
     <contact>
       <name>First Last</name>
       <email>first.last@test.com</email>
     </contact>
     ...
   </contacts>

Converts to the following formats for the new NDR Collections...

Format needed for LC-CAggr-p:contact and MS-CAggr-p:contact:

   <contact>first.last@test.com (First Last)</contact>
   ...


Format needed for LC-CM-d:format_ndr_collection and MS-CM-d:format_ndr_collection:

   <contacts>
     <contact name="First Last" email="first.last@test.com" />
     ...
   </contacts>
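
The reformatting above can be sketched with the standard library XML parser. This is an illustration under assumptions: the function names are hypothetical, the `contacts` block is assumed to carry no XML namespace, and the property form is returned as the text value only (without the surrounding `<contact>` element).

```python
import xml.etree.ElementTree as ET


def parse_contacts(service_description_xml):
    """Extract (name, email) pairs from a serviceDescription contacts block."""
    root = ET.fromstring(service_description_xml)
    return [
        ((c.findtext("name") or "").strip(), (c.findtext("email") or "").strip())
        for c in root.iter("contact")
    ]


def contact_properties(contacts):
    """Values for CAggr-p:contact, e.g. 'first.last@test.com (First Last)'."""
    return [f"{email} ({name})" for name, email in contacts]


def contacts_datastream(contacts):
    """The contacts element for CM-d:format_ndr_collection, as an XML string."""
    root = ET.Element("contacts")
    for name, email in contacts:
        ET.SubElement(root, "contact", {"name": name, "email": email})
    return ET.tostring(root, encoding="unicode")


source = """
<contacts>
  <contact>
    <name>First Last</name>
    <email>first.last@test.com</email>
  </contact>
</contacts>
"""
pairs = parse_contacts(source)
```

Parsing once into `(name, email)` pairs and deriving both target formats from them keeps the property and datastream values consistent with each other.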


Conversion Process

Pass 1:

  • create one Library Collection NDR Collection for each NCS Collection
  • determine the number of metadata sources
    • determine the primary metadata source
      • check the first 25% of metadata objects in the collection and determine their source
      • note if any come from an alternate source
      • make any ADD updates to this object
    • create one Metadata Source NDR Collection for each non-primary metadata source


Pass 2:

  • make MODIFY updates to primary metadata source object
  • move any resources and metadata to appropriate metadata source if there is more than one source
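
The Pass 1 step that determines the primary metadata source could be sketched as below. The page does not say how the primary is chosen from the sample, so picking the most frequent source is an assumption, as are the function name and the input (one source label per metadata object, in collection order).

```python
import math
from collections import Counter


def primary_source(object_sources, fraction=0.25):
    """Inspect the first 25% of metadata objects and pick a primary source.

    Returns (primary, alternates), where `alternates` is the set of other
    sources seen in the sample. Choosing the most frequent source as the
    primary is an assumption, not something the process above specifies.
    """
    if not object_sources:
        return None, set()
    sample_size = max(1, math.ceil(len(object_sources) * fraction))
    counts = Counter(object_sources[:sample_size])
    primary = counts.most_common(1)[0][0]
    alternates = set(counts) - {primary}
    return primary, alternates
```

A non-empty `alternates` set signals that additional Metadata Source NDR Collections are needed and that Pass 2 will have to move objects to them.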