Table of contents

1      OVERVIEW

2      CURRENT STATUS

3      METADATA REPOSITORY COMPONENTS

3.1        METADATA RECORDS

3.1.1     RECORD FORMAT

3.1.2     RECORD CONTENT AND STRUCTURE

3.1.3     RECORD TYPE

3.1.4     RECORD RELATIONSHIP

3.2        METADATA REPOSITORY DATABASE

3.2.1     CHOICE OF DATABASE MANAGEMENT SYSTEM

3.2.2     ACCESS TO DATABASE

3.3        MRMS FRONT PORCH COMPONENT

3.4        INTERFACES TO MRMS TRUSTED SERVICES

3.4.1     INTERFACE TO SEARCH ENGINE

3.4.2     INTERFACE TO ACCESS CONTROL MANAGEMENT

3.4.3     INTERFACE TO CACHE SERVICE SERVER

3.4.4     INTERFACE TO WEB SERVICE SERVER

4      MAJOR DATA FLOW PROCESSES

4.1        DATA FLOW OF INGEST PROCESS

4.1.1     Direct feeding by collection partners

4.1.2     OAI Harvest

4.1.3     Web crawling

4.1.4     Web uploading

4.1.5     FTP uploading

4.2        DATA FLOW OF METADATA UPDATE PROCESS

4.3        DATABASE REPLICATION TO THE MRMS SEARCH ENGINE

4.4        DATA FLOW OF METADATA EXPOSURE PROCESS

4.5        DATA FLOW OF METADATA RETRIEVING BY TRUSTED SERVICES

5      PLATFORMS

5.1        METADATA REPOSITORY APPLICATION SERVER

5.2        DATABASE SERVER

6      MAINTENANCE OF METADATA REPOSITORY

6.1        LOGGING MANAGEMENT ARCHITECTURE

6.2        ON-GOING DATA INGESTING

6.3        UPDATE TO EXISTING RECORDS

6.4        SYSTEM BACKUP

6.5        FAULT TOLERANCE ARCHITECTURE

6.6        RISK MANAGEMENT ARCHITECTURE

APPENDIX A.         XML Schema for Dublin Core Metadata Records

APPENDIX B.         XML Schema of LTSC (IMS) Metadata Records

APPENDIX C.         XML Schema for ADL (SCORM) Metadata Records

APPENDIX D.         XML Schema for MARC 21 Metadata Records

APPENDIX E.         XML Tag Set for GILS Metadata Records

APPENDIX F.         XML Element Tag Name for EAD Metadata Records
 


An Architecture Design of Metadata Repository Management System of NSDL Project

(Draft version of November 27, 2001)

 
1          OVERVIEW

Metadata is generally defined as data about data. It is the key to the content data. In the metadata repository, metadata is regarded as “a set of prescribed properties of a resource”. The composition of those prescribed properties is expressed as the format of metadata records. The current version of NSDL (National Science Digital Library) will accept eight (8) different formats of metadata records (see 3.1.1 for details).

 

The metadata repository management system (MRMS) is a computing system that collects, stores and maintains the metadata records in the digital library; it also provides interfaces through which other components and services in the digital library can build their services on the metadata repository.  Figure 1 shows the components of the MRMS and the relationships among them.

 

The scope of this document is limited to describing the architecture of the metadata repository management system. It is concerned neither with the functionality of the system nor with what is implemented and how. Its concerns stop at the interfaces to other components in NSDL.

 

The purpose of this document is to describe the key concepts, the key components and the relationships among these components of the metadata repository management system. It is intended to serve as a coherent technical document for system design and development of the metadata management system and its interfaced systems in NSDL.

 

The primary goals of the technical architecture are as follows:

·        The system has to be an open system that other services can build on, while MRMS carefully manages the intellectual property rights.

·        The system has to be scalable, portable and easy to extend in future development.

·        The system has to be feasible to bring into production within the time frame of this project.

·        The system has to be efficient enough to accommodate very large amounts of data and transactions.

·        The system has to be easy and efficient for other components of NSDL to interact with and to build services on.

 

 

 

 

 


Figure 1.  Architecture Sketch of Metadata Repository Management System
2          CURRENT STATUS

The architecture design of the metadata repository management system builds upon the prototype metadata repository used at www.siteForScience.org. The functional requirements and the design of MRMS are being developed concurrently.

 
3          METADATA REPOSITORY COMPONENTS
3.1             METADATA RECORDS
3.1.1        RECORD FORMAT

Eight metadata record standards will be accepted as the standard formats of metadata records, as shown in Table 1. However, the internal standard format of the metadata repository database will be the Dublin Core (DC) format. The other seven formats will be converted to the DC format by crosswalk before they are ingested into the metadata repository database. The rest of this section briefly discusses the record layouts, XML schemas and crosswalk methods of these standards. More detailed information about these standards can be found on the web sites listed in Table 1 and in Appendices A-G.

 

Table 1. Eight accepted standards for MRMS

Standard | Maintaining Body
---------|-----------------
Dublin Core | Dublin Core Metadata Initiative (DCMI)
Dublin Core + DC-ED (proposed, based on GEM) | Dublin Core Metadata Initiative (DCMI)
LTSC (IMS) | IEEE Learning Technology Standards Committee (LTSC)
ADL (SCORM) | Advanced Distributed Learning Network (ADLNet)
MARC 21 | Library of Congress Network Development and MARC Standards Office
Content Standard for Digital Geospatial Metadata (CSDGM) (sometimes called FGDC) | Federal Geographic Data Committee
Global Information Locator Service (GILS) | U.S. Federal GILS???
EAD (Encoded Archival Description) | Library of Congress Network Development and MARC Standards Office

 
3.1.1.1              Dublin Core Standard

Dublin Core v1.1 will be the internal standard for metadata records in MRMS. Details about this standard can be found on the Dublin Core Metadata Initiative web site at http://dublincore.org. The XML schema for Dublin Core records is attached in APPENDIX A.
3.1.1.2              Dublin Core with Education Extension Standard

This standard was proposed by the DC Education Working Group. It is currently under consideration by the Dublin Core Usage Board. Details about this standard can be found at http://dublincore.org/documents/2000/10/05/education-namespace/. The XML schema of this standard will be the same as that of the Dublin Core standard in APPENDIX A.
3.1.1.3              LTSC (IMS)

In MRMS, the IMS metadata standard refers to IMS v1.2.2. It was developed by IMS Global Learning Consortium, Inc. The specification for IMS v1.2 can be found at http://www.imsproject.org/metadata/index.html. Its XML schema is attached in APPENDIX B.

 
3.1.1.4              ADL (SCORM)

This standard is developed by Advanced Distributed Learning Network. The Sharable Content Object Reference Model (SCORM) is a set of interrelated technical specifications built upon the work of the AICC, IMS and IEEE to create one unified "content model". The metadata specification for SCORM v1.2 can be found on http://www.adlnet.org. The XML schema of this standard is attached in the APPENDIX C.

 
3.1.1.5              MARC 21

MARC is the acronym for MAchine-Readable Cataloging. MARC 21 evolved from the original LC MARC standard developed by the Library of Congress. It has been the standard used by most library computer programs since the 1990s. The specification and other documents for MARC 21 can be found at http://lcweb.loc.gov/marc/. Its XML schema is attached in APPENDIX D.

 
3.1.1.6              CSDGM(FGDC)

CSDGM is the acronym for Content Standard for Digital Geospatial Metadata. The Federal Geographic Data Committee coordinates the development of this standard and that is why this standard is also called FGDC standard. Specification and other documents of CSDGM can be found on: http://www.fgdc.gov. Metadata presentation via XML for this standard can be found on: http://www.fgdc.gov/metadata/metaxml.html.

 
3.1.1.7              GILS

GILS is the acronym for Global Information Locator Service. The GILS standard is an international standard profile of ISO 23950. Specification for GILS V2 standard can be found on http://www.gils.net/standards.html and http://ifla.inist.fr/documents/libraries/cataloging/metadata/prof_v2.htm. Its XML tag names are listed in the APPENDIX E.

 
3.1.1.8              EAD

EAD is the acronym for Encoded Archival Description. This standard is developed by the EAD Working Group. More information on this standard can be found at http://lcweb.loc.gov/ead/tglib/tlelem.html. APPENDIX F lists the element tag names for Version 1.0 of the EAD DTD for XML encoding, released at the end of August 1998.

 
3.1.2        RECORD CONTENT AND STRUCTURE

A metadata record contains three (3) sets of essential information: its descriptive data, its administrative data and its structural data.

Ø      descriptive metadata, which describes the item for which the record is a surrogate, or contains annotation data. Those descriptive data are presented as a set of descriptive metadata elements which correspond to Dublin Core elements (qualified or unqualified, with or without educational extensions).

 

Ø      administrative metadata, which contains information about the metadata record.  Examples include the source of the metadata, the date the record was last modified, who has authority to access or to modify the record.

 

Ø      structural metadata, which describes relationships between metadata records.  For example, collection metadata records may have references to “child” metadata records which describe individual items in the collection.
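
As a minimal sketch of the three sets of information listed above, a record could be modeled as shown below. The class name, the field names and the sample values are hypothetical illustrations and are not taken from the MRMS design.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MetadataRecord:
    """Hypothetical in-memory model of one MRMS metadata record."""
    handle: str
    # Descriptive metadata: Dublin Core element name -> list of values.
    descriptive: Dict[str, List[str]] = field(default_factory=dict)
    # Administrative metadata: information about the record itself.
    administrative: Dict[str, str] = field(default_factory=dict)
    # Structural metadata: handles of related ("child") records.
    children: List[str] = field(default_factory=list)


# Sample values are made up for illustration only.
collection = MetadataRecord(
    handle="example:collection-1",
    descriptive={"title": ["Example collection"], "type": ["Collection"]},
    administrative={"source": "example partner", "last_modified": "2001-11-27"},
    children=["example:item-1", "example:item-2"],
)
```

Keeping the three sets in separate fields mirrors the separation in the text; an actual implementation might well store them differently.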

 
3.1.3        RECORD TYPE

Currently, three types of data are considered: collection, item and annotation. However, all three types of metadata will have a unified record structure.

 

Item metadata records describe items; they represent surrogates for the items themselves.  As used here, "item" is a rather general term which includes:

·         digital entities, web pages, images, sounds, video clips,  etc.

·         web sites, collections or groups of resources

·         online services, processes, databases or directories

·         other static or dynamically generated content which may be accessed using an HTTP request, protocol, or algorithm.

A collection metadata record (CMR) describes an aggregate of items and is conceptually at the “top level” of the library.  The CMR refers to zero or more metadata records that describe individual items or sub-aggregations in the collection. 

 

Annotation metadata records (AMRs) contain annotations on resources represented by other metadata records. AMRs may be used in two ways: AMRs may refer to an ordered list of one-or-more metadata records, or they may annotate the relationship between two metadata records.
3.1.4        RECORD RELATIONSHIP

Relationships among records are described by their links, which are parent/child relationships except where an annotation applies to the link itself. A metadata item is the basic unit of metadata record. A metadata collection is an aggregation of items, annotations and collections. A metadata collection can be “real” or “virtual”. A virtual collection is a collection that does not add any new items to the metadata repository but aggregates existing items into a collection. A metadata annotation is a metadata record that annotates a metadata entity (i.e., an annotation can annotate an item, a collection, another annotation or a link).
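
The idea of a virtual collection can be sketched as follows. This is only an illustration under assumed names: the dictionary-based repository, the record keys and the function `add_virtual_collection` are hypothetical.

```python
from typing import Dict, List

# Hypothetical repository: record handle -> record (a plain dict here).
Repository = Dict[str, dict]


def add_virtual_collection(repo: Repository, handle: str, members: List[str]) -> dict:
    """Add a collection record that only aggregates records already in repo."""
    missing = [m for m in members if m not in repo]
    if missing:
        # A virtual collection must not introduce new items.
        raise KeyError(f"unknown member records: {missing}")
    record = {"type": "collection", "virtual": True, "children": list(members)}
    repo[handle] = record
    return record


repo: Repository = {
    "item-1": {"type": "item", "children": []},
    "item-2": {"type": "item", "children": []},
    "note-1": {"type": "annotation", "children": ["item-1"]},
}
virtual = add_virtual_collection(repo, "coll-v1", ["item-1", "note-1"])
# The number of item records is unchanged by adding the virtual collection.
item_count = sum(1 for r in repo.values() if r["type"] == "item")
```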
3.2             METADATA REPOSITORY DATABASE
3.2.1        CHOICE OF DATABASE MANAGEMENT SYSTEM

The database management system is a key component of the metadata repository management system. It has to be a relational database system, support distributed databases, handle up to billions of records, and have reliable replication capabilities. The MySQL database management system (http://www.mysql.com) will be used for the first year of development and production because it is free, open-source software and is reasonably robust. However, a more robust commercial RDBMS will be considered in the development plan for the next phase.

 

At the October CI project meeting, it was agreed that there would be a single database for the metadata repository. This database will store not only metadata records but also managerial data for metadata repository maintenance.

 
3.2.2        ACCESS TO DATABASE

Each metadata record in MRMS has an access control list (ACL). Any external access to a record is intended to go through an internal access component, i.e., no external access will be made directly to the database. The internal access component will verify the requestor’s identity and permissions by interfacing with the authentication and rights brokers through the trusted service interfaces (Figure 1).
3.3             MRMS FRONT PORCH COMPONENT

 

The Front Porch component of MRMS will act as a metadata-processing buffer for data flows in both directions. The format of the metadata records is expected to be determined by the component with which the Front Porch interacts: when it interacts with the OAI Service or the Internal Database Access component, that component determines the format, and when it interacts with the Metadata Ingest and Exposure Interface component, that component determines the format. When the Metadata Ingest and Exposure Interface component feeds or requests metadata records in a format different from the one that the OAI Service or the Internal Database Access component accepts or provides, the Front Porch will perform the format conversion (crosswalking). The Front Porch component will also perform data validation whenever it is needed.

 

It is crucial for the Front Porch to ensure that the intellectual property rights of resources are protected when metadata are ingested or exposed. Hence, the Front Porch component of MRMS will also have to interface with the Authentication Broker and Rights Broker through the Trusted Service Interface component.

 

 

 
3.4             INTERFACES TO MRMS TRUSTED SERVICES

The metadata repository management system will provide a set of interfaces through which the trusted services of NSDL access the MRMS database.
3.4.1        INTERFACE TO SEARCH ENGINE

MRMS is not designed to be searched directly by external search systems. Instead, NSDL will build a search engine that searches a searchable database. The search engine will work with a replicated copy of the metadata repository database. Therefore, MRMS will provide an interface to push the data to the search engine.

 
3.4.2        INTERFACE TO ACCESS CONTROL MANAGEMENT

Each metadata record will have an access control list (ACL) that specifies who can access which parts of the record. When a service requests access to any metadata record in the metadata repository database, MRMS will have to verify the identity and authority of the requester by consulting the rights broker.
3.4.2.1              INTERFACE TO AUTHENTICATION BROKER

The metadata repository management system will rely on the authentication broker to authenticate the user's identity and associations. Authentication is expected to persist for the scope of a session or connection. MRMS will develop a negotiation interface with the authentication broker to accomplish the authentication process.
3.4.2.2              INTERFACE TO RIGHTS BROKER

It is anticipated that the access control list (ACL) will be a part of each record and will be stored in MRMS. However, not all user access privileges and associations will be stored in MRMS; some will be stored and managed by the users' organizations. Therefore, MRMS will rely on the rights broker to retrieve a user's associations and privileges, and will check them against the ACL of the individual record that the user tries to access. MRMS will provide an interface to the rights broker for sending requests and receiving results.
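
The check described above might look roughly like the following sketch. The ACL layout, the association names and the in-process `rights_broker_lookup` stand-in are all assumptions made for illustration; the actual rights broker would be an external service with its own interface.

```python
from typing import Dict, Set

# Hypothetical ACL layout: permission name -> set of associations allowed.
Acl = Dict[str, Set[str]]


def rights_broker_lookup(user_id: str) -> Set[str]:
    """Stand-in for the rights broker; a real broker would be a remote service."""
    directory = {
        "alice": {"cornell.staff", "nsdl.editor"},
        "bob": {"public"},
    }
    return directory.get(user_id, set())


def is_allowed(user_id: str, acl: Acl, permission: str) -> bool:
    """Check the user's associations (from the broker) against a record's ACL."""
    associations = rights_broker_lookup(user_id)
    allowed = acl.get(permission, set())
    # Access is granted if the user holds at least one allowed association.
    return bool(associations & allowed)


# The ACL is stored with the record; user associations are not.
record_acl: Acl = {"read": {"public", "nsdl.editor"}, "modify": {"nsdl.editor"}}
```

In this sketch only the ACL lives with the record, while user associations come from the broker, which matches the division of responsibility described in the text.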
3.4.3        INTERFACE TO CACHE SERVICE SERVER

Content records will be cached on the cache server when they are accessed, and a cache flag will be set in the metadata repository to indicate whether a record is cached. MRMS will be responsible for retrieving the cached data from the cache server if the cached data are considered up to date. The interface to the cache server will be developed in the form of an API.

 
3.4.4        INTERFACE TO WEB SERVICE SERVER

The NSDL web server could be the server that interfaces with MRMS most frequently. It is considered to be a trusted service in the sense that MRMS trusts the information sent by the NSDL web server.

 

The web server is expected to send requests to MRMS for metadata records on the user’s behalf. Such a request will include, but not be limited to, the metadata record handle and the user identification.
4          MAJOR DATA FLOW PROCESSES

This section will describe the major data flow processes associated with MRMS. These major data flows include: ingest data flow, metadata update data flow, export data flow, and web display data flow.
4.1             DATA FLOW OF INGEST PROCESS

As mentioned in Section 1, MRMS will be an open system that other services can build on and interact with, and the ingest process is one example of this principle. MRMS will provide various ways for users and services to ingest their metadata. The red arrows in Figure 2 indicate the data flow of the metadata ingest processes. Although metadata will be acquired by different methods, the data flows will generally share the same path. As mentioned earlier in this document, Dublin Core will be the internal metadata format in MRMS. If the acquired data are not in Dublin Core format, a crosswalk will be necessary to convert the records to Dublin Core format. In general, authentication and authorization will be performed before the ingest process proceeds.
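
A crosswalk can be thought of as a field-by-field mapping into Dublin Core elements. The sketch below illustrates the idea only: the source field names and the mapping table are invented for the example and do not represent the crosswalk of any of the standards in Table 1.

```python
from typing import Dict, List

# Hypothetical mapping from a partner's field names to Dublin Core elements.
# A real crosswalk (e.g. for MARC 21 or IMS) would be far more detailed.
EXAMPLE_CROSSWALK: Dict[str, str] = {
    "name": "title",
    "author": "creator",
    "keywords": "subject",
    "summary": "description",
}


def crosswalk_to_dc(source: Dict[str, List[str]],
                    mapping: Dict[str, str]) -> Dict[str, List[str]]:
    """Convert a source record to Dublin Core, dropping unmapped fields."""
    dc: Dict[str, List[str]] = {}
    for field_name, values in source.items():
        element = mapping.get(field_name)
        if element is None:
            continue  # no Dublin Core equivalent in this mapping
        dc.setdefault(element, []).extend(values)
    return dc


partner_record = {
    "name": ["Intro to Plate Tectonics"],
    "author": ["A. Example"],
    "keywords": ["geology", "earth science"],
    "internal_id": ["42"],
}
dc_record = crosswalk_to_dc(partner_record, EXAMPLE_CROSSWALK)
```

Silently dropping unmapped fields is a simplification chosen for this sketch; whether to drop, log or reject such fields is a design decision the text leaves open.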
4.1.1        Direct feeding by collection partners

MRMS will provide an OAI or XML-RPC interface through which federated collection partners feed their metadata records into the metadata repository through OAI. When the Front Porch of MRMS receives metadata records from a federated collection partner, it will crosswalk any record that is not in Dublin Core format. The OAI server will then insert the records directly into the MRMS database.

 
4.1.2        OAI Harvest

OAI harvesting is a batch ingest process. It simply harvests the metadata from an OAI server and ingests them into the metadata repository database. Because this is a pull method and the OAI server is considered a fully trusted server, neither authentication nor authorization will be needed.
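
As an illustration of the pull method, the sketch below only builds the URL of an OAI `ListRecords` request; it performs no network access. The function name and the partner base URL are placeholders, and the choice of the `oai_dc` metadata prefix is an assumption based on Dublin Core being the internal format.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(base_url: str, metadata_prefix: str = "oai_dc",
                     from_date: Optional[str] = None) -> str:
    """Build an OAI ListRecords request URL (no network access is performed)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        # Harvest only records changed since this date (incremental harvest).
        params["from"] = from_date
    return f"{base_url}?{urlencode(params)}"


# "http://partner.example.org/oai" is a placeholder, not a real repository.
url = list_records_url("http://partner.example.org/oai", from_date="2001-11-01")
```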
4.1.3        Web crawling

MRMS will have a web crawling method to collect metadata records from the Internet. Records collected by this method are expected to be in Dublin Core format so that they can be ingested directly into the MRMS database through OAI.
4.1.4        Web uploading

The web uploading method can be used both for metadata edited online and for existing metadata records in a file. Records edited online are expected to be in DC format when they are uploaded, so they can be inserted directly into the metadata repository database through OAI in the Front Porch. Records in a web-uploaded file may be in a different format. If so, the Front Porch will first perform a crosswalk before inserting them into the metadata repository database.
4.1.5        FTP uploading

This method is very similar to the web uploading method. The only difference is that FTP uploading uses the FTP protocol to upload the data file to the Front Porch, while web uploading uses a web uploading technology (which, behind the scenes, might well use the FTP protocol). Once the data reach the Front Porch, the rest of the process is the same as for web uploading.
4.2             DATA FLOW OF METADATA UPDATE PROCESS

Metadata records are subject to update by their owners. MRMS will provide a mechanism that makes such updates simple and painless for the owners. Each ingest method described in section 4.1 will have a corresponding update method, and these update methods will follow paths similar to those of their ingest counterparts. However, an individual online update will have to retrieve the metadata record before it can be edited and updated. Batch updating, on the other hand, will not need to retrieve the metadata before updating.

 
4.3             DATABASE REPLICATION TO THE MRMS SEARCH ENGINE

As mentioned in section 3.4.1, the MRMS database will not be a database that external users search. NSDL is expected to have its own search engine component, which will include a copy of the MRMS database. The databases will be synchronized by a replication process that pushes the data from the MRMS database to the search engine database.

 
4.4             DATA FLOW OF METADATA EXPOSURE PROCESS

MRMS is designed to be an open system that other services can build on, while MRMS carefully manages the intellectual property rights. When a client requests metadata records through the Ingest and Exposure Interface, the data flow will follow the green path shown in Figure 2, and MRMS will do the following:

·        The interface passes the request to the Front Porch.

·        The Front Porch consults the authentication and rights brokers to verify the client’s identity and rights.

·        The Front Porch passes the request and rights to either the OAI server or the Internal Database Access component to retrieve the metadata result set.

·        The Front Porch converts the metadata result set to the requested format and returns it to the client.
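
The steps above can be sketched as a single handler with the brokers and the database access passed in as functions. Everything named here (`expose`, the request keys, the stub handlers and the sample store) is hypothetical and stands in for components whose real interfaces are not defined in this document.

```python
from typing import Callable, Dict, List

Record = Dict[str, List[str]]


def expose(request: dict,
           authenticate: Callable[[str], bool],
           get_rights: Callable[[str], set],
           fetch: Callable[[str, set], List[Record]],
           convert: Callable[[Record, str], str]) -> List[str]:
    """Hypothetical Front Porch handler; calling it stands for the first step."""
    user = request["user"]
    # Verify the client's identity and rights through the brokers.
    if not authenticate(user):
        raise PermissionError("authentication failed")
    rights = get_rights(user)
    # Pass the request and rights on to retrieve the result set.
    result_set = fetch(request["query"], rights)
    # Convert each record to the format the client asked for.
    return [convert(rec, request["format"]) for rec in result_set]


# Stub data and handlers, for illustration only.
store = [{"title": ["Open item"], "rights": ["public"]},
         {"title": ["Restricted item"], "rights": ["staff"]}]
handlers = dict(
    authenticate=lambda u: u in {"alice", "bob"},
    get_rights=lambda u: {"public"} if u == "bob" else {"public", "staff"},
    fetch=lambda q, rights: [r for r in store
                             if q.lower() in r["title"][0].lower()
                             and set(r["rights"]) & rights],
    convert=lambda rec, fmt: rec["title"][0],
)
out = expose({"user": "bob", "query": "item", "format": "text"}, **handlers)
```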

 

Figure 2.  Data flow processes of MRMS

 

4.5             DATA FLOW OF METADATA RETRIEVING BY TRUSTED SERVICES

Some MRMS trusted services are expected to need metadata records in a format other than XML. To meet this requirement, MRMS will expose its metadata through the Internal Database Access component rather than through OAI. This data flow is shown as the blue path in Figure 2.

 
5          PLATFORMS
5.1             METADATA REPOSITORY APPLICATION SERVER

MRMS has to be a reliable, high-performance system when it is in production. A Linux server will be the application server in phase I of the project. A more powerful and reliable application server might be needed when the system runs in full production mode in the future.
5.2             DATABASE SERVER

The database server does not have to be the same machine as the application server. The current thinking, however, is that one server will host both.
6          MAINTENANCE OF METADATA REPOSITORY
6.1             LOGGING MANAGEMENT ARCHITECTURE

The RDBMS used by MRMS will provide comprehensive logging capability. MRMS will load the logging data into database tables to produce various statistics and reports, which will be used for future improvement and development.
6.2             ON-GOING DATA INGESTING

As discussed in section 4.1, the ingest process of MRMS is intended to be open to services and users. This means that MRMS will continuously pull and receive metadata records from various sources using the methods described in section 4.1.
6.3             UPDATE TO EXISTING RECORDS

Existing records can be updated in batch mode or in online editing mode. The methods for updating existing records are discussed in section 4.2.
6.4             SYSTEM BACKUP

System backup is an essential part of MRMS maintenance. The bottom line for the backup scheme is that no more than one day's worth of data will be lost in any worst-case scenario. The system backup for MRMS has two parts: database backup and application server backup. Both will follow the same backup scheme, although it could be implemented differently for each because of the nature of the data.
6.5             FAULT TOLERANCE ARCHITECTURE

It is crucial to keep MRMS available without interruption on a 24x7 basis when the system runs in full production mode. But things break. A standby MRMS server will take over production whenever the primary production server is down to ensure uninterrupted service. To avoid false alarms, an extremely reliable communication link between the primary and standby servers will be established when the system is in production.
6.6             RISK MANAGEMENT ARCHITECTURE

The project PI and project manager will be in charge of risk management. The major risk for this project is losing staff members of the project team. (With the current freeze on non-academic staff hiring at Cornell University, a waiver of the freeze would be needed to replace any team member who leaves.)

APPENDIX A.         XML Schema for Dublin Core Metadata Records

 

 

<schema xmlns="http://www.w3.org/2001/XMLSchema"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         targetNamespace="http://purl.org/dc/elements/1.1/"
         elementFormDefault="qualified"
         attributeFormDefault="unqualified">

  <annotation>
   <documentation>
    Schema for Dublin Core metadata format.
    the Open Archives Initiative. 2000.
    Schema validated at http://www.w3.org/2001/03/webdata/xsv on 05-09-2001
    Dublin Core semantics available at
           http://purl.org/DC/documents/rec-dces-19990702.htm
   </documentation>
  </annotation>

  <element name="dc" type="dc:dublincoreType"/>

  <complexType name="dublincoreType">
    <choice minOccurs="0" maxOccurs="unbounded">
      <element name="title" minOccurs="0" maxOccurs="unbounded" type="string"/>
      <element name="creator" minOccurs="0" maxOccurs="unbounded" type="string"/>
      <element name="subject" minOccurs="0" maxOccurs="unbounded" type="string"/>
      <element name="description" minOccurs="0" maxOccurs="unbounded" type="string"/>
      <element name="contributor" minOccurs="0" maxOccurs="unbounded" type="string"/>
      <element name="publisher" minOccurs="0" maxOccurs="unbounded" type="string"/>
      <element name="date" minOccurs="0" maxOccurs="unbounded" type="string"/>
      <element name="type" minOccurs="0" maxOccurs="unbounded" type="string"/>
      <element name="format" minOccurs="0" maxOccurs="unbounded" type="string"/>
      <element name="identifier" minOccurs="0" maxOccurs="unbounded" type="string"/>
      <element name="source" minOccurs="0" maxOccurs="unbounded" type="string"/>
      <element name="language" minOccurs="0" maxOccurs="unbounded" type="string"/>
      <element name="relation" minOccurs="0" maxOccurs="unbounded" type="string"/>
      <element name="coverage" minOccurs="0" maxOccurs="unbounded" type="string"/>
      <element name="rights" minOccurs="0" maxOccurs="unbounded" type="string"/>
    </choice>
  </complexType>

</schema>

 

This Schema is available at http://www.openarchives.org/OAI/1.1/dc.xsd
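
As a hedged illustration of what an instance document for this schema might look like, the sketch below builds a `dc` element in the schema's target namespace with the Python standard library. The helper name and the sample values are made up, and the code does not validate the result against the schema (the standard library has no XSD validator).

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # target namespace of the schema
ET.register_namespace("dc", DC_NS)


def build_dc_record(fields):
    """Build a <dc:dc> element from (element name, value) pairs."""
    root = ET.Element(f"{{{DC_NS}}}dc")
    for name, value in fields:
        # elementFormDefault="qualified": children share the namespace.
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return root


# Sample values are illustrative only; elements may repeat.
record = build_dc_record([
    ("title", "Example resource"),
    ("creator", "A. Example"),
    ("subject", "metadata"),
    ("subject", "digital libraries"),
])
xml_text = ET.tostring(record, encoding="unicode")
```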

 

APPENDIX B.         XML Schema of LTSC (IMS) Metadata Records

<?xml version="1.0" encoding="UTF-8" ?>
<!-- filename=ims_xml.xsd -->
<xsd:schema xmlns="http://www.w3.org/XML/1998/namespace" targetNamespace="http://www.w3.org/XML/1998/namespace" xmlns:xsd="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
  <!-- CHANGE HISTORY -->
  <xsd:annotation>
    <xsd:documentation>2001-02-22: Thomas Wason initial creation</xsd:documentation>
    <xsd:documentation>In namespace-aware XML processors, the "xml"</xsd:documentation>
    <xsd:documentation>prefix is bound to the namespace name http://www.w3.org/XML/1998/namespace.</xsd:documentation>
    <xsd:documentation>Do not reference this file in XML instances</xsd:documentation>
    <xsd:documentation>Schawn Thropp: Changed the uriReference type to string type</xsd:documentation>
    <xsd:documentation>2001-07-26: S Thropp: Changed the XSD namespace to point to</xsd:documentation>
    <xsd:documentation>Schema of schemas for the 5/2/2001 W3C Recommendation</xsd:documentation>
    <xsd:documentation>Changed the XSD types for base and link to xsd:anyURI</xsd:documentation>
  </xsd:annotation>
  <xsd:attribute name="lang" type="xsd:language">
    <xsd:annotation>
      <xsd:documentation>Refers to universal XML 1.0 lang attribute</xsd:documentation>
    </xsd:annotation>
  </xsd:attribute>
  <xsd:attribute name="base" type="xsd:anyURI">
    <xsd:annotation>
      <xsd:documentation>Refers to XML Base: http://www.w3.org/TR/xmlbase</xsd:documentation>
    </xsd:annotation>
  </xsd:attribute>
  <xsd:attribute name="link" type="xsd:anyURI" />
</xsd:schema>

 

This Schema is available at http://www.imsglobal.org/xsd/ims_xml.xsd

APPENDIX C.         XML Schema for ADL (SCORM) Metadata Records

 

<?xml version="1.0" encoding="UTF-8"?>

<!-- filename=ims_xml.xsd -->

<xsd:schema xmlns="http://www.w3.org/XML/1998/namespace"

            targetNamespace="http://www.w3.org/XML/1998/namespace"

            xmlns:xsd="http://www.w3.org/2001/XMLSchema"

            elementFormDefault="qualified">

      <!-- 2001-02-22 edited by Thomas Wason IMS Global Learning Consortium, Inc. -->

      <xsd:annotation>

            <xsd:documentation>In namespace-aware XML processors, the &quot;xml&quot; prefix is bound to the namespace name http://www.w3.org/XML/1998/namespace.</xsd:documentation>

            <xsd:documentation>Do not reference this file in XML instances</xsd:documentation>

                <xsd:documentation>Schawn Thropp: Changed the uriReference type to string type</xsd:documentation>

      </xsd:annotation>

      <xsd:attribute name="lang" type="xsd:language">

            <xsd:annotation>

                  <xsd:documentation>Refers to universal  XML 1.0 lang attribute</xsd:documentation>

            </xsd:annotation>

      </xsd:attribute>

      <xsd:attribute name="base" type="xsd:string">

            <xsd:annotation>

                  <xsd:documentation>Refers to XML Base: http://www.w3.org/TR/xmlbase</xsd:documentation>

            </xsd:annotation>

      </xsd:attribute>

      <xsd:attribute name="link" type="xsd:string"/>

</xsd:schema>

This Schema is available at http://www.adlnet.org/

 

APPENDIX D.         XML Schema for MARC 21 Metadata Records

<schema xmlns="http://www.w3.org/2001/XMLSchema"

   targetNamespace="http://www.openarchives.org/OAI/1.1/oai_marc"

   xmlns:oai_marc="http://www.openarchives.org/OAI/1.1/oai_marc"

   elementFormDefault="qualified"

   attributeFormDefault="unqualified">

 

   <annotation>

     <documentation>

      Schema for MARC metadata format.

      MARC semantics available at http://www.loc.gov/marc/

      .....

      This Schema has been successfully applied for MARC21 records.  

      It is likely to also work for older versions of USMARC and CANMARC.  

      Application of this Schema for other MARC formats has not been 

      tested and may require some adjustments. 

      ..... 

      the Open Archives Initiative. 2000. 

      Herbert Van de Sompel

      MARC XML transportation format on which this schema is inspired 

           available at http://www.dlib.vt.edu/projects/OAi/marcxml/marcxml.html

     This Schema validated at http://www.w3.org/2001/03/webdata/xsv on 05-09-2001.

     </documentation>

    </annotation>

 

  <element name="oai_marc">

   <complexType>

     <sequence>

       <element ref="oai_marc:fixfield" minOccurs="1" maxOccurs="unbounded"/>

       <element ref="oai_marc:varfield" minOccurs="0" maxOccurs="unbounded"/>

     </sequence>

    <attribute name="status" type="string" use="optional"/>

    <attribute name="type" type="string" use="required"/>

    <attribute name="level" type="string" use="required"/>

    <attribute name="ctlType" type="string" use="optional"/>

    <attribute name="charEnc" type="string" use="optional"/>

    <attribute name="encLvl" type="string" use="optional"/>

    <attribute name="catForm" type="string" use="optional"/>

    <attribute name="lrRqrd" type="string" use="optional"/>

   </complexType>

  </element>

 

  <element name="fixfield">

   <complexType>

     <simpleContent>

        <extension base="oai_marc:fixfieldType">

        <attribute name="id" type="oai_marc:idType" use="required"/>

        </extension>

     </simpleContent>

   </complexType>

  </element>

 

  <simpleType name="fixfieldType">

    <restriction base="string">

    <!-- fixfield must be enclosed between quotes because spaces 

         are meaningful -->

     <pattern value='[\n\r\t\s]*"[^"]*"[\n\r\t\s]*'/>

    </restriction>

  </simpleType>

 

  <element name="varfield">

   <complexType>

     <sequence>

       <element ref="oai_marc:subfield" minOccurs="1" maxOccurs="unbounded"/>

     </sequence>

    <attribute name="id" type="oai_marc:idType" use="required"/>

    <attribute name="i1" type="oai_marc:iType" use="required"/>

    <attribute name="i2" type="oai_marc:iType" use="required"/>

   </complexType>

  </element>

 

  <element name="subfield">

   <complexType>

     <simpleContent>

        <extension base="string">

        <attribute name="label" type="oai_marc:subfieldType" use="required"/>

        </extension>

     </simpleContent>

   </complexType>

  </element>

 

  <simpleType name="subfieldType">

    <restriction base="string">

    <!-- MARC subfield (the leading $ is not used)

         may be any lowercase alphabetic or numeric character  -->

     <pattern value="[0-9a-z]"/>

    </restriction>

  </simpleType>

 

  <simpleType name="idType">

    <restriction base="string">

    <!-- MARC tags are 1 to 3 digits -->

     <pattern value="[0-9]{1,3}"/>

    </restriction>

  </simpleType>

 

  <simpleType name="iType">

   <restriction base="string">

    <!-- MARC indicator may be any lowercase alphabetic or numeric character

         or a blank  -->

     <pattern value="[0-9a-z\s]?"/>

    </restriction>

  </simpleType>

 

 

</schema>

This Schema is available at http://www.openarchives.org/OAI/1.1/oai_marc.xsd
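
The simple types in this schema are plain regular-expression restrictions, so they can be checked without a full XSD validator. The Python sketch below mirrors the four patterns (fixfieldType, subfieldType, idType, iType) and applies them to one small record. The sample record, its field values, and the helper name check_record are illustrative assumptions and are not part of the schema. XSD patterns are implicitly anchored, which is approximated here with re.fullmatch, and the XSD meaning of \s (space, tab, newline, carriage return) is spelled out explicitly. The sketch checks patterns only; it does not enforce minOccurs, maxOccurs, or required attributes.

```python
import re
import xml.etree.ElementTree as ET

NS = "http://www.openarchives.org/OAI/1.1/oai_marc"
XSD_WS = r"[ \t\n\r]"  # the characters that \s stands for in an XSD pattern

# Python equivalents of the pattern facets in the schema above.
PATTERNS = {
    "fixfieldType": re.compile(XSD_WS + r'*"[^"]*"' + XSD_WS + "*"),
    "subfieldType": re.compile(r"[0-9a-z]"),
    "idType": re.compile(r"[0-9]{1,3}"),
    "iType": re.compile(r"[0-9a-z \t\n\r]?"),
}

# Hypothetical record, written by hand for this example only.
SAMPLE = f"""<oai_marc xmlns="{NS}" type="a" level="m">
  <fixfield id="008">"010101s2001    xxu           000 0 eng d"</fixfield>
  <varfield id="245" i1="1" i2="0">
    <subfield label="a">An example title</subfield>
  </varfield>
</oai_marc>"""


def check_record(xml_text):
    """Return a list of pattern violations found in one oai_marc record."""
    root = ET.fromstring(xml_text)
    problems = []

    def check(type_name, value, where):
        # fullmatch emulates the implicit anchoring of XSD patterns.
        if not PATTERNS[type_name].fullmatch(value):
            problems.append(f"{where}: {value!r} fails {type_name}")

    for fix in root.findall(f"{{{NS}}}fixfield"):
        check("idType", fix.get("id", ""), "fixfield/@id")
        check("fixfieldType", fix.text or "", "fixfield")
    for var in root.findall(f"{{{NS}}}varfield"):
        check("idType", var.get("id", ""), "varfield/@id")
        check("iType", var.get("i1", ""), "varfield/@i1")
        check("iType", var.get("i2", ""), "varfield/@i2")
        for sub in var.findall(f"{{{NS}}}subfield"):
            check("subfieldType", sub.get("label", ""), "subfield/@label")
    return problems


if __name__ == "__main__":
    print(check_record(SAMPLE))
```

An uppercase subfield label such as label="A" would be reported, because subfieldType admits only lowercase letters and digits.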

 

APPENDIX E.         XML Tag Set for GILS Metadata Records

Tag   Element                          Recommended Datatype
1     controlIdentifier                InternationalString
2     streetAddress                    InternationalString
3     city                             InternationalString
4     stateOrProvince                  InternationalString
5     zipOrPostalCode                  InternationalString
6     hoursOfService                   InternationalString
7     resourceDescription              InternationalString
8     technicalPrerequisites           InternationalString
9     westBoundingCoordinate           intUnit
10    eastBoundingCoordinate           intUnit
11    northBoundingCoordinate          intUnit
12    southBoundingCoordinate          intUnit
13    placeKeyword                     InternationalString
14    placeKeywordThesaurus            InternationalString
15    beginningDate                    GeneralizedTime
16    timePeriodTextual                InternationalString
17    linkage                          InternationalString
18    linkageType                      InternationalString
19    recordSource                     InternationalString
20    controlledTerm                   InternationalString
21    subjectThesaurus                 InternationalString
22    uncontrolledTerm                 InternationalString
23    originalControlIdentifier        InternationalString
24    recordReviewDate                 GeneralizedTime
25    generalAccessConstraints         InternationalString
26    originatorDisseminationControl   InternationalString
27    securityClassificationControl    InternationalString
28    orderInformation                 InternationalString
29    cost                             Boolean
30    costInformation                  InternationalString
31    scheduleNumber                   InternationalString
32    languageOfResource               InternationalString
33    medium                           InternationalString
34    languageOfRecord                 InternationalString
35    relationship                     InternationalString
36    endingDate                       GeneralizedTime
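
One way an ingest component might hold this tag set in memory is a lookup table keyed by tag number. The Python sketch below copies a handful of rows from the table above. The names GilsTag, GILS_TAGS, and describe_tag are hypothetical and chosen only for illustration; the remaining tags would follow the same shape.

```python
from typing import NamedTuple


class GilsTag(NamedTuple):
    element: str
    datatype: str


# Subset of rows copied from the tag table (illustrative, not the full set).
GILS_TAGS = {
    1: GilsTag("controlIdentifier", "InternationalString"),
    9: GilsTag("westBoundingCoordinate", "intUnit"),
    15: GilsTag("beginningDate", "GeneralizedTime"),
    17: GilsTag("linkage", "InternationalString"),
    36: GilsTag("endingDate", "GeneralizedTime"),
}


def describe_tag(tag: int) -> str:
    """Return a one-line description of a tag, or a note if it is not loaded."""
    entry = GILS_TAGS.get(tag)
    if entry is None:
        return f"{tag}: not in this subset"
    return f"{tag}: {entry.element} ({entry.datatype})"


if __name__ == "__main__":
    print(describe_tag(15))
```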

NOTE: The element "wellKnown" from tagSet-M (1,19) and referred to below has the following definition:

When an element is defined to be "structured into locally defined elements," the target may use this tag (i.e., wellKnown) in lieu of, or along with, locally defined tags. For example, an element named 'title' might be described to be "locally structured." The target might present the element structured into the following subelements: 'wellKnown,' 'spineTitle,' and 'variantTitle,' where the latter two tags are target defined. In this case, 'wellKnown' is assumed to mean 'title.'

51    purpose (Constructed as follows)
      This element may include the element wellKnown and may also include locally defined elements.

52    originator (Constructed as follows)
      This element may include the element wellKnown and may also include locally defined elements.

53    accessConstraints (Constructed as follows)
      This element may include any of the following as well as locally defined elements: generalAccessConstraints, originatorDisseminationControl, securityClassificationControl.

54    useConstraints (Constructed as follows)
      This element may include the element wellKnown and may also include locally defined elements.

55    orderProcess (Constructed as follows)
      This element may include any of the following as well as locally defined elements: orderInformation, cost, costInformation.

56    agencyProgram (Constructed as follows)
      This element may include the element wellKnown and may also include locally defined elements.

57    sourcesOfData (Constructed as follows)
      This element may include the element wellKnown and may also include locally defined elements.

58    methodology (Constructed as follows)
      This element may include the element wellKnown and may also include locally defined elements.

59    supplementalInformation (Constructed as follows)
      This element may include the element wellKnown and may also include locally defined elements.

70    availability (Constructed as follows)
      This element may include any of the following as well as locally defined elements: medium, distributor, resourceDescription, orderProcess, technicalPrerequisites, timePeriod, availableLinkage.

71    spatialDomain (Constructed as follows)
      This element may include any of the following as well as locally defined elements: boundingCoordinates, place.

90    distributor (Constructed as follows)
      This element may include any of the following as well as locally defined elements: name, organization, streetAddress, city, stateOrProvince, zipOrPostalCode, country, networkAddress, hoursOfService, telephone, fax.

91    boundingCoordinates (Constructed as follows)
      This element may include any of the following as well as locally defined elements: westBoundingCoordinate, eastBoundingCoordinate, northBoundingCoordinate, southBoundingCoordinate.

92    place (Constructed as follows)
      This element may include any of the following as well as locally defined elements: placeKeyword, placeKeywordThesaurus.

93    timePeriod (Constructed as follows)
      This element may include any of the following as well as locally defined elements: timePeriodTextual, timePeriodStructured.

94    pointOfContact (Constructed as follows)
      This element may include any of the following as well as locally defined elements: name, organization, streetAddress, city, stateOrProvince, zipOrPostalCode, country, networkAddress, hoursOfService, telephone, fax.

95    controlledSubjectIndex (Constructed as follows)
      This element may include any of the following as well as locally defined elements: subjectThesaurus, subjectTermsControlled.

96    subjectTermsControlled (Constructed as follows)
      This element may include any of the following as well as locally defined elements: controlledTerm.

97    subjectTermsUncontrolled (Constructed as follows)
      This element may include any of the following as well as locally defined elements: uncontrolledTerm.

98    crossReference (Constructed as follows)
      This element may include any of the following as well as locally defined elements: title, relationship, crossReferenceLinkage.

99    availableLinkage (Constructed as follows)
      This element may include any of the following as well as locally defined elements: linkage, linkageType.

100   crossReferenceLinkage (Constructed as follows)
      This element may include any of the following as well as locally defined elements: linkage, linkageType.

101   timePeriodStructured (Constructed as follows)
      This element may include any of the following elements: beginningDate, endingDate.

102   availableTimeStructured (Constructed as follows)
      This element may include any of the following elements: beginningDate, endingDate.
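
Because constructed elements can contain other constructed elements, it can help to see how one expands down to primitive elements. The Python sketch below copies a few of the child lists from the entries above and expands them recursively. The names CONSTRUCTED and leaf_elements are hypothetical, only a subset of the entries is included, and locally defined elements are deliberately not modelled.

```python
# Child lists copied from the constructed-element entries (subset only).
CONSTRUCTED = {
    "spatialDomain": ["boundingCoordinates", "place"],
    "boundingCoordinates": [
        "westBoundingCoordinate",
        "eastBoundingCoordinate",
        "northBoundingCoordinate",
        "southBoundingCoordinate",
    ],
    "place": ["placeKeyword", "placeKeywordThesaurus"],
    "timePeriod": ["timePeriodTextual", "timePeriodStructured"],
    "timePeriodStructured": ["beginningDate", "endingDate"],
}


def leaf_elements(name):
    """Expand a constructed element into the primitive elements it can carry."""
    children = CONSTRUCTED.get(name)
    if children is None:
        # Not listed as constructed here, so treat it as a primitive element.
        return [name]
    leaves = []
    for child in children:
        leaves.extend(leaf_elements(child))
    return leaves


if __name__ == "__main__":
    print(leaf_elements("timePeriod"))
```

Under these assumptions, timePeriod expands to timePeriodTextual, beginningDate, and endingDate, the last two reached through timePeriodStructured.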

 

APPENDIX F.          XML Element Tag Name for EAD Metadata Records

<abbr> Abbreviation

<abstract> Abstract

<accessrestrict> Restrictions on Access

<accruals> Accruals

<acqinfo> Acquisition Information

<add> Adjunct Descriptive Data

<address> Address

<addressline> Address Line

<admininfo> Administrative Information

<altformavail> Alternative Form Available

<appraisal> Appraisal Information

<archdesc> Archival Description

<archdescgrp> Archival Description Group

<archref> Archival Reference

<arrangement> Arrangement

<author> Author

<bibliography> Bibliography

<bibref> Bibliographic Reference

<bibseries> Bibliographic Series

<bioghist> Biography or History

<blockquote> Block Quote

<c> Component (Unnumbered)

<c01> Component (First Level)

<c02> Component (Second Level)

<c03> Component (Third Level)

<c04> Component (Fourth Level)

<c05> Component (Fifth Level)

<c06> Component (Sixth Level)

<c07> Component (Seventh Level)

<c08> Component (Eighth Level)

<c09> Component (Ninth Level)

<c10> Component (Tenth Level)

<c11> Component (Eleventh Level)

<c12> Component (Twelfth Level)

<change> Change

<chronitem> Chronology List Item

<chronlist> Chronology List

<colspec> Table Column Specification

<container> Container

<controlaccess> Controlled Access Headings

<corpname> Corporate Name

<creation> Creation

<custodhist> Custodial History

<dao> Digital Archival Object

<daodesc> Digital Archival Object Description

<daogrp> Digital Archival Object Group

<daoloc> Digital Archival Object Location

<date> Date

<defitem> Definition List Item

<dentry> Display Entry

<did> Descriptive Identification

<dimensions> Dimensions

<div> Text Division

<drow> Display Row

<dsc> Description of Subordinate Components

<dscgrp> Description of Subordinate Components Group

<ead> Encoded Archival Description

<eadgrp> EAD Group

<eadheader> EAD Header

<eadid> EAD Identifier

<edition> Edition

<editionstmt> Edition Statement

<emph> Emphasis

<entry> Table Entry

<event> Event

<eventgrp> Event Group

<expan> Expansion

<extent> Extent

<extptr> Extended Pointer

<extptrloc> Extended Pointer Location

<extref> Extended Reference

<extrefloc> Extended Reference Location

<famname> Family Name

<filedesc> File Description

<fileplan> File Plan

<frontmatter> Front Matter

<function> Function

<genreform> Genre/Physical Characteristic

<geogname> Geographic Name

<head> Heading

<head01> First Heading

<head02> Second Heading

<imprint> Imprint

<index> Index

<indexentry> Index Entry

<item> Item

<label> Label

<language> Language

<langusage> Language Usage

<lb> Line Break

<linkgrp> Linking Group

<list> List

<listhead> List Heading

<name> Name

<namegrp> Name Group

<note> Note

<notestmt> Note Statement

<num> Number

<occupation> Occupation

<odd> Other Descriptive Data

<organization> Organization

<origination> Origination

<otherfindaid> Other Finding Aid

<p> Paragraph

<persname> Personal Name

<physdesc> Physical Description

<physfacet> Physical Facet

<physloc> Physical Location

<prefercite> Preferred Citation

<processinfo> Processing Information

<profiledesc> Profile Description

<ptr> Pointer

<ptrgrp> Pointer Group

<ptrloc> Pointer Location

<publicationstmt> Publication Statement

<publisher> Publisher

<ref> Reference

<refloc> Reference Location

<relatedmaterial> Related Material

<repository> Repository

<revisiondesc> Revision Description

<row> Table Row

<runner> Runner

<scopecontent> Scope and Content

<separatedmaterial> Separated Material

<seriesstmt> Series Statement

<spanspec> Spanned Column Specification

<sponsor> Sponsor

<subarea> Subordinate Area

<subject> Subject

<subtitle> Subtitle

<table> Table

<tbody> Table Body

<tfoot> Table Foot

<tgroup> Table Group

<thead> Table Head

<title> Title

<titlepage> Title Page

<titleproper> Title Proper of the Finding Aid

<titlestmt> Title Statement

<tspec> Table Specification

<unitdate> Date of the Unit

<unitid> ID of the Unit

<unittitle> Title of the Unit

<userestrict> Restrictions on Use
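
The numbered component tags in this list (c01 through c12) follow a fixed naming pattern, with the unnumbered c as the alternative. The Python sketch below shows how a tool that writes EAD records might pick the tag name for a given nesting depth. The function name component_tag and its numbered parameter are hypothetical and are not taken from any EAD toolkit.

```python
def component_tag(depth: int, numbered: bool = True) -> str:
    """Return the EAD component tag name for a 1-based nesting depth."""
    if depth < 1:
        raise ValueError("depth must be 1 or greater")
    if not numbered:
        # The unnumbered component can be used at any depth.
        return "c"
    if depth > 12:
        raise ValueError("numbered components stop at c12")
    return f"c{depth:02d}"


if __name__ == "__main__":
    print([component_tag(d) for d in (1, 2, 3)])
```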