Contents
(c) Bipin C. Desai

The Problem

The purpose of indices and bibliographies (called secondary information) is to inventory the primary information and allow easy access to it. The traditional method of generating bibliography entries required finding the primary source, identifying its subject and other attributes, describing it so that unknown future users could later match it to their needs, and classifying it according to accepted norms.

The unpredictable retrieval of appropriate information resources, documented in Table 1 and [DESA4], illustrates the need for a system which allows better-controlled 'search for' and 'access to' resources available on the Internet. With the current plethora of index services and search systems, most users are lost, and even after a search there is no guarantee that the appropriate information resource will be found. Furthermore, these systems cannot function together because of the differences in their coverage, indexing structure and user interface.

This phenomenon has been observed in distributed information systems which, even though under the control of a single administrative unit, suffer from multiple problems typically caused by differences in semantics and representation, and by incomplete and incorrect data dictionaries (cataloging) [DESA]. These problems would be magnified manyfold in any distributed information system which tries to integrate the resources offered by indexing and search systems over the Internet. It is also important to avoid a problem encountered in library systems where, even though the same cataloging system[4] is used, the same item may be catalogued or classified differently in two different libraries.

Such problems could be avoided by starting with a standard index structure and building a bibliographic system using standardized control definitions. Such definitions could be built into the knowledge base of expert-system-based index entry and search interfaces. Furthermore, there must be a mechanism to revise index information as the resource changes over time. Finally, annotation of a resource by independent users should be allowed.
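
As a rough illustration of these requirements, the following Python sketch shows an index record that is checked against a controlled vocabulary, can be revised as the resource changes, and accepts annotations from independent users. The class name IndexEntry, its fields, the helper normalize_subject and the CONTROLLED_SUBJECTS vocabulary are all hypothetical, chosen only for illustration; they are not the index structure proposed in this paper.

```python
from dataclasses import dataclass, field
from datetime import date

# Hypothetical controlled vocabulary; a real system would use shared,
# standardized control definitions rather than this toy set.
CONTROLLED_SUBJECTS = {"computer science", "library science", "physics"}


def normalize_subject(subject):
    """Return the subject in canonical form, or raise if it is not a controlled term."""
    canonical = subject.strip().lower()
    if canonical not in CONTROLLED_SUBJECTS:
        raise ValueError(f"subject {canonical!r} is not in the controlled vocabulary")
    return canonical


@dataclass
class IndexEntry:
    """One bibliographic record for a resource (illustrative fields only)."""
    title: str
    author: str
    subject: str
    url: str
    revised: date
    annotations: list = field(default_factory=list)

    def __post_init__(self):
        # Reject entries whose subject is not a controlled term.
        self.subject = normalize_subject(self.subject)

    def revise(self, when, **changes):
        """Update descriptive fields when the underlying resource changes."""
        unknown = set(changes) - {"title", "author", "subject", "url"}
        if unknown:
            raise AttributeError(f"cannot revise unknown fields: {sorted(unknown)}")
        if "subject" in changes:
            # Validate before touching the record so a bad value leaves it unchanged.
            changes["subject"] = normalize_subject(changes["subject"])
        for name, value in changes.items():
            setattr(self, name, value)
        self.revised = when

    def annotate(self, user, note):
        """Attach a comment from an independent user without altering the record."""
        self.annotations.append((user, note))


entry = IndexEntry(
    title="Indexing the Internet",
    author="A. Author",
    subject=" Library Science ",
    url="http://example.org/index",
    revised=date(1995, 1, 1),
)
entry.revise(date(1995, 6, 1), title="Indexing the Internet, 2nd draft")
entry.annotate("reader1", "Useful overview")
```

Validating the subject at entry time, rather than at search time, is one way to keep two providers from describing the same kind of resource with different terms.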

With the increasing amount of information on the Internet, it is difficult to follow the traditional centralized approach to cataloging, because the enormous number of resources involved makes it too costly in time and money. Consequently, the indexing system should be distributed and accessible to providers as well as users of the Internet. In a distributed system such as the Internet, it is natural to have the providers of resources prepare and enter the bibliographic information about each resource using a standardized index scheme. The entry system should be a distributed system and the index should be recorded in a distributed database. Finally, a search system to facilitate locating and retrieving appropriate information from this database is required.

Whereas the index entry and search systems (clients) could be located locally at the providers and users of information resources respectively, the bibliographic database systems (servers) should be distributed and replicated at a number of regional nodes for improved availability and response time. The entry and search systems have to be supported by an easy-to-use graphical interface for entering the index information and accessing it. These systems should incorporate the expertise and knowledge of expert cataloguers and reference librarians, with a help system to guide the user at all steps. The search system should, in addition, provide appropriate feedback indicating the number of hits for each search, and support access to the relevant resources. The navigation of database and resource nodes and the protocols and filters used would be selected by the system, thus facilitating the task of the user. The purpose is to provide uniform access to all resources, as is done in a centralized information system through the intermediary of an expert system analyst.
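
The client/server arrangement described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the names RegionalNode, NodeUnavailable and search_with_feedback are hypothetical, the replicas are in-memory lists rather than networked databases, and matching is a plain substring test rather than the expert-system search the paper envisions. It shows a client that picks a reachable replica on the user's behalf and reports the number of hits along with the results.

```python
class NodeUnavailable(Exception):
    """Raised when a regional node cannot answer (hypothetical failure signal)."""


class RegionalNode:
    """A replica of the bibliographic database held at one regional site."""

    def __init__(self, name, records, online=True):
        self.name = name
        self.records = records
        self.online = online

    def search(self, term):
        if not self.online:
            raise NodeUnavailable(self.name)
        term = term.lower()
        return [
            record for record in self.records
            if term in record["title"].lower() or term in record["subject"].lower()
        ]


def search_with_feedback(nodes, term):
    """Query the first reachable replica and report the hit count with the results."""
    for node in nodes:
        try:
            matches = node.search(term)
        except NodeUnavailable:
            continue  # fall back to the next replica; the user never chooses a node
        return {"node": node.name, "hits": len(matches), "results": matches}
    raise NodeUnavailable("no regional node could be reached")


# Illustrative records; the titles and URLs are made up for this sketch.
records = [
    {"title": "Indexing the Internet", "subject": "library science",
     "url": "http://example.org/a"},
    {"title": "Distributed Databases", "subject": "computer science",
     "url": "http://example.org/b"},
]
nodes = [RegionalNode("east", records, online=False), RegionalNode("west", records)]
result = search_with_feedback(nodes, "science")
```

Here the "east" replica is marked offline, so the query is answered by "west"; the hit count in the returned dictionary is the feedback the user would see before deciding whether to refine the search.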

A number of projects in the library domain have addressed the problem of cataloging, in particular the cataloging of information in electronic and multi-media formats. CORE[CROM], the MARC system[BRYN, CRAW, MARC, PETE], MLC[HORN, ROSS, RHEE] and TEI[GAYN, GIOR] are examples of these initiatives. These existing and proposed indexing systems range from a minimum to a full level of bibliographic information. However, such systems are designed for professional catalogers, and many of the elements included in them, though useful, are beyond the level of familiarity of most providers or users of information.

In the following sections, we present two recent initiatives that allow suppliers of resources to prepare well-thought-out catalog information for their resources.

