NSDL testing framework
Draft of 12 June 2002 by Dean B. Krafft and Jon Phipps
Introduction
Our initial test plan for the Core Integration portion of the
NSDL consists of the following overall elements:
- Unit testing (e.g. with JUnit): Unit tests are optional
for individual classes and components. In a number of cases, the NSDL is built
on top of large existing systems, where ground-up unit testing is problematic;
it is also harder to implement for servlet components. A minimal example of the
kind of test case we have in mind is sketched at the end of this list.
- Integration testing: Our current development
process calls for fully automating build and deployment using Ant. Full builds
from the CVS source code repository will take place at least daily, and a
simple set of “smoke tests” will be run against the result of the build and
deployment (this will probably be a subset of the functional tests described
below). Service-to-service integration (as opposed to code integration) will
be tested through the functional/acceptance testing.
- Functional/Acceptance testing (e.g. with HTTPUnit):
This will be the primary framework for testing that the NSDL is correct and
operational. The overall pattern will be to develop end-to-end tests. For
example, a test collection will be created consisting of a set of test
records. These will be harvested by the MR, indexed by search, and served up
through the NSDL Main Portal. Tests will be run against each of these
components verifying that the test collection information is correctly stored,
discovered, and presented.
- Usability testing (e.g. by end-user evaluation):
These tests will be designed and administered by the Evaluation working group.
Feedback will be through Comm Portal trackers that allow end users to
report bugs, request feature changes, and suggest future modifications.
- Data Integrity testing: Verifying and maintaining
the integrity of the harvested information is a major challenge for the
library. We plan to use statistical checks, visualization tools (e.g.
Spotfire), and sampling to verify information integrity.
- Performance testing (e.g. with JMeter, JUnitPerf,
and HTTPUnit): Load tests will be run against the uPortal interface simulating
simultaneous user access. Initially, the load tests will be based on a simple
model of expected usage patterns. As actual log information on usage becomes
available, the tests will be revised appropriately.
- Disaster recovery: We will define a set of test
procedures and protocols to cover operational failures, such as server
failover and recovery from tape backup.
- Security: Security will be based on enforcing
standard “good practice” for systems administration. This includes setting up
a process to make sure that all system patches are up to date and that
publicly reported security vulnerabilities are patched.
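For illustration, the unit testing mentioned in the first item above might look
like the minimal JUnit (3.x-style) test case sketched below. The MetadataRecord
class and its methods are hypothetical placeholders, not part of the current
code base.

    import junit.framework.TestCase;

    // Minimal JUnit 3.x-style unit test. MetadataRecord and its methods are
    // hypothetical names used only to illustrate the pattern.
    public class MetadataRecordTest extends TestCase {

        private MetadataRecord record;

        protected void setUp() {
            // Build a fresh fixture before each test method runs.
            record = new MetadataRecord("oai:test.nsdl.org:record-001");
        }

        public void testIdentifierIsStored() {
            assertEquals("oai:test.nsdl.org:record-001", record.getIdentifier());
        }

        public void testDateNormalization() {
            // Dates should be normalized to the ISO 8601 form used elsewhere.
            assertEquals("2002-06-12", record.normalizeDate("June 12, 2002"));
        }
    }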
In the sections below, we list the types of tests that will
be performed. This is a very early draft of the test list; contributions are
welcome.
User Interface
Primary UI usability testing will be based on end-user
testing organized by the Evaluation Working Group. Functional problems will be
reported to the CI developers through Comm Portal trackers in a
workspace set up for this purpose. The test plan needs to include:
- Alpha usability testing and feedback: The expected alpha
test period is September 1, 2002 to September 30, 2002. During this time,
functional changes will still be made to the base CI initial release.
- Beta user testing and feedback: Beta testing will begin
October 1, 2002 and run until public release. From October 1, the functions of
the base CI initial release will be frozen. However, development will continue
on elements of the system (annotation, MySite, etc.) that are planned to be in
beta for the initial release.
The following elements of the system will be included in the
usability testing:
- System
- uPortal framework
- Individual Channels
- Channel Integration
- Authentication
- User registration
- User login
- Access to restricted content
- Search
- Advanced search interface
- Accurate Search results
- Useful Search results (ranking, deduping)
- Multi-browser compatibility for all
- MR
- Collection registration interface
- DB Editorial interface
Data integrity
Data integrity tests take two forms. Automated tests
include checking a variety of counts and statistics as the data passes through
the various stages of the system: ingest, Metadata Repository, and search. These
tests also confirm that XML and database conversions preserve the data. The
second type of test involves manual visualization and sampling of data to
verify that the metadata being harvested from collections accurately represents
the underlying resources. A sketch of one such automated count check appears
after the list below.
- Record count constant from harvest through search/browse results
- Count from collection
- Harvest count
- Transform count
- Ingest count
- Relational tables count
- XML tables count
- External harvest count
- Search records count by collection
- All available metadata formats harvested by MR
- All metadata formats harvested by MR are harvested and indexed by search
- Harvested metadata is transformed properly to the NSDL format
- Relational data elements match the XML source elements and the resulting XML records
- Valid, available URLs to resources
- Visualized data indicates complete, useful, and error-free metadata is being provided to the MR by collections
- Sampling of metadata records confirms accuracy of delivered information regarding underlying resources.
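As a sketch of what one of the automated count checks could look like, the
following JUnit test compares a harvest-stage count with an ingest-stage count
through JDBC. The database URL, credentials, and table names are assumptions
made purely for illustration; the real Metadata Repository schema may differ.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;
    import junit.framework.TestCase;

    // Sketch of an automated record-count integrity check. The JDBC URL,
    // credentials, and table names below are illustrative assumptions,
    // not the actual Metadata Repository schema.
    public class RecordCountIntegrityTest extends TestCase {

        private static final String DB_URL = "jdbc:mysql://localhost/nsdl_mr"; // assumed

        public void testHarvestCountMatchesIngestCount() throws Exception {
            Connection conn = DriverManager.getConnection(DB_URL, "test", "test");
            try {
                int harvested = countRows(conn, "harvested_records"); // assumed table
                int ingested  = countRows(conn, "ingested_records");  // assumed table
                // The record count should stay constant from harvest through ingest.
                assertEquals("records lost or duplicated during ingest",
                             harvested, ingested);
            } finally {
                conn.close();
            }
        }

        private int countRows(Connection conn, String table) throws Exception {
            Statement stmt = conn.createStatement();
            ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM " + table);
            rs.next();
            int count = rs.getInt(1);
            stmt.close();
            return count;
        }
    }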
Components
Component-level testing is based on functional (or
acceptance) testing at the available external interfaces. Test data typically
enters the process through a special NSDL test collection. This collection
provides both a fixed set of test cases that ensure data is uniformly maintained
as the system is developed and expanded, and a growing set of test cases that
exercise the new functionality being developed for the library.
Sample test data includes:
- Examples of all metadata formats for which the MR supports standard cross-walks
- Fielded metadata that allows the construction and verification of a variety of searches.
- Metadata test cases for the Simple Metadata-Based Services (SiMBaS): News, Exhibits, Annotations, etc.
- Sample content to support search and SiMBaS uPortal channel testing.
The component testing is primarily based on automated test
cases built with HTTPUnit that access and verify web pages that the overall
system creates in response to OAI and uPortal requests.
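A representative test of this kind, following the standard HTTPUnit pattern, is
sketched below. The search URL, query parameter name, and expected test-collection
record title are placeholders rather than agreed interfaces.

    import com.meterware.httpunit.GetMethodWebRequest;
    import com.meterware.httpunit.WebConversation;
    import com.meterware.httpunit.WebRequest;
    import com.meterware.httpunit.WebResponse;
    import junit.framework.TestCase;

    // Sketch of an HTTPUnit functional test. The URL, parameter name, and
    // expected record title are placeholders for illustration only.
    public class SearchChannelFunctionalTest extends TestCase {

        public SearchChannelFunctionalTest(String name) {
            super(name);
        }

        public void testTestCollectionRecordIsDiscoverable() throws Exception {
            WebConversation conversation = new WebConversation();
            WebRequest request =
                new GetMethodWebRequest("http://test.nsdl.org/search"); // assumed URL
            request.setParameter("query", "NSDL test record 001");      // assumed field

            WebResponse response = conversation.getResponse(request);

            // The request should succeed, and the result page should contain
            // the title of a known record from the NSDL test collection.
            assertEquals(200, response.getResponseCode());
            assertTrue("test record not found in search results",
                       response.getText().indexOf("NSDL Test Record 001") >= 0);
        }
    }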
Component tests include:
Metadata Repository
- OAI Harvester
- Ingest transforms
- DB ingest
- DB edit UI
- Ingest workflow process
- OAI Server
- Cron jobs running as scheduled
Search
- SDS language functionality
- Valid search results
Authentication
- uPortal integration
- Default origin
- “Where Are You From” (WAYF) service
- Shibboleth integration
- Default profile server
User Interface
- uPortal framework
- Individual Channels
- Channel Integration
SDSC Archive Service
- Data and Resource snapshots
- API
Operational
Operational testing verifies the integrity of the system
under normal and abnormal operational conditions. One of the major challenges of
developing reasonable operational tests for the NSDL is the great uncertainty in
the expected system load. To establish an initial baseline, we inquired about
the load on the Library of Congress catalog service. Currently, the LOC limits
web catalog access to 275 simultaneous sessions. Another 250 simultaneous Z39.50
users are allowed, for a total of 525 simultaneous external sessions. This may
serve as a reasonable initial target for the NSDL uPortal service.
We currently plan to use a combination of JMeter (to impose a
specific system load), HTTPUnit (to create and verify specific accesses), and
JUnitPerf (to determine test response times) to do load testing on uPortal, the
Metadata Repository, and the Search service.
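As a rough sketch of how the JUnitPerf and HTTPUnit pieces could fit together,
the suite below wraps the hypothetical search test from the Components section
in a response-time ceiling and runs it under concurrent simulated users. The
50-user count and 2-second limit are placeholders, not agreed performance
targets; JMeter scenarios for imposing background load would be configured
separately.

    import com.clarkware.junitperf.LoadTest;
    import com.clarkware.junitperf.TimedTest;
    import junit.framework.Test;
    import junit.framework.TestSuite;

    // Sketch of a JUnitPerf load test. SearchChannelFunctionalTest is the
    // hypothetical HTTPUnit test sketched earlier; the user count and time
    // limit are illustrative values only.
    public class SearchLoadTest {

        public static Test suite() {
            // Each run of the functional test must complete within 2000 ms.
            Test timedTest = new TimedTest(
                new SearchChannelFunctionalTest("testTestCollectionRecordIsDiscoverable"),
                2000);

            // Run the timed test with 50 concurrent simulated users.
            Test loadTest = new LoadTest(timedTest, 50);

            TestSuite suite = new TestSuite();
            suite.addTest(loadTest);
            return suite;
        }
    }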
Load testing (server and I/O)
- MR OAI server
- MR ingest
- uPortal
- Channel components
- Search
System integration
Integration tests at the code level will be carried out by a
daily Ant build, deployment, and “smoke test” suite run. Service-level
integration testing will take place in conjunction with the functional component
testing described above.
Disaster recovery and system failover
We will develop a full set of procedures and protocols for
backups, system failover, and disaster recovery. Quarterly tests of full system
rebuild and recovery from backup tape will be performed. Quarterly tests of
system failover will also be performed.
Security
Security of the systems will be based on a set of procedures
and processes designed to ensure that the systems are protected against known
security hazards. Standard systems administration processes, including firewall
filtering, lockbox system logging, monitoring vendor security bulletins,
minimizing services, and other similar steps will be taken to maintain the
security of the systems.