Cover page aka Semantic Header

(c) Bipin C. Desai, Department of Computer Science, Concordia University, 1455 De Maisonneuve Blvd. West, Montreal, QC, Canada H3G 1M8. Email: bcdesai@alcor.concordia.ca

This document is under construction/discussion. It will be submitted as an Internet-Draft for discussion by an IETF Working Group.

This document has been revised. See the current version here!

1. Summary

The cover page (or semantic header) is a portion of each document that should contain information useful in searching for the document based on a number of commonly used criteria. The information from the semantic header could be used by various indexing schemes to help locate appropriate documents with minimum effort. It is envisaged that regional and/or specialized databases would be created to maintain archives of cover pages. These databases could be searched by cooperating distributed expert systems to help users locate pertinent documents. Such an expert system is currently under development at Concordia.

Documents evolve: new versions replace older ones, or entirely new documents supersede them. The old cover pages in these databases therefore need to be replaced with new cover pages. This could be done by a document created for this purpose that includes both the old semantic header and the new semantic header.
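One way to picture such a replacement document is as a pair of headers applied to an archive of cover pages. The following is a minimal sketch only: the `HeaderDatabase` class, the choice of the URL field as the key, and the `replace` method are assumptions made for illustration and are not part of this proposal.

```python
class HeaderDatabase:
    """Toy in-memory archive of cover pages, keyed by the URL field (an assumption)."""

    def __init__(self):
        self._headers = {}

    def add(self, header):
        """Store a header, represented here as a plain dict of fields."""
        self._headers[header["URL"]] = header

    def replace(self, old_header, new_header):
        """Apply a replacement document: drop the old header, store the new one.

        Returns False when the old header is not the one on file,
        so that a stale replacement does not overwrite anything.
        """
        current = self._headers.get(old_header["URL"])
        if current != old_header:
            return False
        del self._headers[old_header["URL"]]
        self._headers[new_header["URL"]] = new_header
        return True

    def get(self, url):
        return self._headers.get(url)


# Hypothetical headers, reduced to two fields for brevity.
db = HeaderDatabase()
old = {"URL": "http://example.org/doc.html", "Version": "0.1"}
new = {"URL": "http://example.org/doc.html", "Version": "0.2"}
db.add(old)
replaced = db.replace(old, new)
```

Requiring the old header to match what is on file is one possible design choice; it lets a database reject a replacement that was built against an outdated cover page.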

The semantic header is delimited by the tags <semhdr> ... </semhdr> and has the format shown below. The sub-items in the header are positionally independent, and a given header must have at least one entry for each of the following items except those marked OPTIONAL.

<semhdr>

  • <title> ....... </title>
  • <subtitle> ....... OPTIONAL </subtitle>
  • <alttitle> ....... OPTIONAL </alttitle>
  • <char-set> ....... OPTIONAL </char-set>
  • <Language> ....... </Language>
  • <author>
    • <aname> ....... </aname>
    • <aorg> ....... </aorg>
    • <aaddress> ....... </aaddress>
    • <aphone> ....... </aphone>
    • <afax> ....... </afax>
    • <aemail> ....... </aemail>
    </author>
  • <Subject>
    • <General> ....... </General>
    • <Sublevel1> ....... OPTIONAL </Sublevel1>
    • <Sublevel2> ....... OPTIONAL </Sublevel2>
    </Subject>
  • <Keyword>
    • .......
    </Keyword>
  • <Dates>
    • <Creatred> ....... </Creatred>
    • <Expiry> ....... </Expiry>
    • <Updated> ....... </Updated>
    </Dates>
  • <Version> ....... </Version>
  • <Hardware>
    • .......
    </Hardware>
  • <Software>
    • .......
    </Software>
  • <Coverage> ....... </Coverage>
  • <Classification> ....... </Classification>
  • <Annotation> ....... OPTIONAL </Annotation>
  • <URL> ....... </URL>
  • <URN> ....... </URN>
  • <UAS> ....... </UAS>
  • <Cost> ....... </Cost>
  • <abstract> ....... OPTIONAL </abstract>
  • <size> ....... </size>
</semhdr>
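The rule that every item not marked OPTIONAL must appear at least once, in any position, can be checked mechanically. The sketch below is an illustration under stated assumptions: the function name `missing_required` is hypothetical, only top-level items are checked, and tag names are matched case-insensitively.

```python
import re

# Top-level items not marked OPTIONAL in the listing above.
REQUIRED_TAGS = [
    "title", "Language", "author", "Subject", "Keyword", "Dates",
    "Version", "Hardware", "Software", "Coverage", "Classification",
    "URL", "URN", "UAS", "Cost", "size",
]


def missing_required(header_text):
    """Return the required tags that have no non-empty <tag>...</tag> entry.

    The order of items is ignored, since the header is positionally
    independent. Case-insensitive matching is an assumption.
    """
    missing = []
    for tag in REQUIRED_TAGS:
        pattern = r"<{0}>(.*?)</{0}>".format(re.escape(tag))
        match = re.search(pattern, header_text, re.IGNORECASE | re.DOTALL)
        if match is None or not match.group(1).strip():
            missing.append(tag)
    return missing


# A deliberately incomplete header with its two items in "unusual" order.
sample = "<semhdr><Version> 0.1 </Version><title>Cover page</title></semhdr>"
result = missing_required(sample)
```

Here `result` lists every required item except `title` and `Version`, regardless of the order in which those two appear.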

2. Annotated example:

The semantic header for this document would have the following contents. Note that the comments, enclosed by the tags <comment> ... </comment>, are meant only for the author of the document and are not to be displayed by browsers. Since current browsers have not implemented this feature, you will see them; for now they serve as a reminder of the syntax of each field of the semantic header.

<semhdr>

<ul>

<li><title><comment>Title of document</comment>

Cover page aka Semantic Header</title>

<li><alttitle><comment>Surrogate title </comment>

BCD's baby</alttitle>

<li><char-set><comment> Default charset unless escaped </comment> ASCII </char-set>

<Language><comment>Language used in this document</comment>

English </Language>

<li><comment>List of author(s). Details for each author enclosed in <author>....</author> : consisting of author's name, author's organization, author's address, etc. each is enclosed by appropriate tags.</comment>

<author>

<ul>

<li><aname><comment>Name in the following format: Family (last) name, Given (first) name, and Middle name or initial(s) if any</comment> DESAI, Bipin C.</aname>

<li><aorg><comment>Name of Author's Organization</comment>

Concordia University, Department of Computer Science</aorg>

<li><aaddress><comment>Author's Address in following format: Street number, Street name, City, Region, Country, Postal Code</comment>

1455 Blvd. de Maisonneuve, Montreal, QC, CANADA, H3G 1M8 </aaddress>

<li><aphone><comment>Author's Phone number </comment>

(514) 848 3025</aphone>

<li><afax><comment>Author's Fax</comment>

(514) 848 8652</afax>

<li><aemail><comment>Author's Email</comment>

bcdesai@cs.concordia.ca</aemail>

</ul>

</author>

<Subject><comment>List of subject areas of the document</comment>

<ul>

<li><General>Cover Page</General>

<li><General>Semantic Header</General>

<li><General>Controlled Header</General>

</ul>

</Subject>

<li><Keyword><comment>List of Keywords/key concepts of the document</comment>

<ul>

<li>Content descriptor

<li>Index aid

<li>Search/index aid

</ul>

</Keyword>

<li><Dates> <comment>List of dates in year-month-day format: yyyy-mm-dd</comment>

<ul>

<li><Creatred> <comment>Date Created/Published</comment>

1994-06-07 </Creatred>

<li><Expiry><comment>Date after which the document is not valid</comment>

1995-02-09 </Expiry>

<li><Updated><comment>Date Last updated</comment>

1994-08-09 </Updated>

</ul>

</Dates>

<li><Version> 0.1 </Version>

<Hardware>

<ul>

<li> A computing system with appropriate graphic support.

</ul> </Hardware>

<Software>

<ul>

<li> Graphical or text based WWW browser.

</ul> </Software>

<li><Coverage> <comment>Type of information in document: local interest, regional interest, universal interest</comment>

Universal </Coverage>

<li><Classification><comment>Classification/Control Information </comment>

Public</Classification>

<li><Annotation><comment>Related Document(s)/Annotation(s)</comment>

http://www.cs.concordia.ca/bcd/navigate.html

http://www.cs.concordia.ca/bcd/priority.html

http://www.cs.concordia.ca/~bcdesai/expert-search.html

</Annotation>

<li><URL><comment>Unique Universal Resource Locator/Call No for this Document</comment>

http://www.cs.concordia.ca/bcd/semantic-header.html</URL>

<li><URN><comment>Unique Universal Resource Name for this Document</comment>

Bipin C. Desai's Cover Page aka Semantic Header</URN>

<li><UAS><comment>Universal Archive Site where this document is archived</comment>

ftp://ftp.cs.concordia.ca/bcd/semantic-header.html</UAS>

<li><Cost><comment>Cost, Currency</comment> 0.21, Can$</Cost>

<li><abstract>This document describes the contents of the semantic header which has to be included with each document. The author(s) of the document is(are) responsible for generating this part of the document. The contents of this header will facilitate search and could be used for generating indices based on subject areas and keywords.</abstract>

<li><size><comment> Size of documents in Kbytes </comment> 9.7 </size> </ul>

</semhdr>
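Until browsers hide the author-only comments themselves, they can be removed before display with a simple filter. The following is a rough sketch, not a prescribed implementation; the function name `strip_comments` is hypothetical, and the regular expression assumes comments are not nested inside other comments.

```python
import re

# Non-greedy match so that each comment ends at its own closing tag.
COMMENT = re.compile(r"<comment>.*?</comment>", re.IGNORECASE | re.DOTALL)


def strip_comments(text):
    """Remove every <comment>...</comment> span, as a comment-aware browser might."""
    return COMMENT.sub("", text)


line = "<aphone><comment>Author's Phone number </comment>(514) 848 3025</aphone>"
shown = strip_comments(line)
```

With the phone entry from the example above, `shown` keeps the `<aphone>` field and its value and drops only the reminder text.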

3. Modification to HTML

HTML has to be augmented with a number of tags, and all HTML editors and converters must enforce the presence of the non-optional items in the document. For instance, an editor starting a new document would bring up a screen with an empty semantic header, which the author fills in before the document is submitted to the Web. The "submit" procedure will not accept a document unless its required fields are filled in.

The submit procedure would also broadcast new semantic headers on the Web. Indexing and search programs of the nature of Archie, ALIWEB, etc. can traverse the Web looking for newly broadcast headers and store them in their local databases for use by users of the Web.
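The local database kept by such an indexing program can be pictured as a mapping from keywords to document locations. The sketch below is an illustration only: the `LocalIndex` class and its methods are assumptions, each header is reduced to its URL and keyword list, and the example URLs are placeholders.

```python
from collections import defaultdict


class LocalIndex:
    """Toy local database mapping lowercased keywords to document URLs."""

    def __init__(self):
        self._by_keyword = defaultdict(set)

    def store(self, url, keywords):
        """Record one newly seen header, reduced to its URL and keywords."""
        for keyword in keywords:
            self._by_keyword[keyword.lower()].add(url)

    def search(self, keyword):
        """Return the URLs indexed under a keyword, sorted for stable output."""
        return sorted(self._by_keyword.get(keyword.lower(), set()))


index = LocalIndex()
index.store("http://example.org/a.html", ["Index aid", "Content descriptor"])
index.store("http://example.org/b.html", ["index aid"])
hits = index.search("Index Aid")
```

Lowercasing keywords on both storage and lookup is one simple way to keep a search from failing on capitalization alone.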

Without some form of enforcement, authors tend to "forget" to enter this vital information. The resulting lack of an index record with pertinent information means that the Web community has no reliable way to search for a document in a given subject or topic.

Since this information is similar in nature to that which, in the traditional library system, is entered by professional librarians, a controlled vocabulary is needed to assist users in entering appropriate subjects and keywords. Editors should therefore also be required to include a help facility that assists users in choosing appropriate subjects and keywords.
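A help facility of this kind could, for example, propose controlled terms close to what the author typed. The following is a minimal sketch under stated assumptions: the vocabulary shown is hypothetical, the function name `suggest_subjects` is invented for illustration, and approximate string matching from Python's standard `difflib` module stands in for whatever matching a real editor would use.

```python
import difflib

# Hypothetical controlled vocabulary; a real one would be maintained by cataloguers.
CONTROLLED_SUBJECTS = ["Computer Science", "Information Retrieval", "Library Science"]


def suggest_subjects(entry, limit=3):
    """Return controlled terms that closely resemble the author's entry."""
    lowered = {term.lower(): term for term in CONTROLLED_SUBJECTS}
    close = difflib.get_close_matches(
        entry.lower(), list(lowered), n=limit, cutoff=0.6
    )
    return [lowered[term] for term in close]


# A misspelled entry still leads to the controlled term.
suggestions = suggest_subjects("information retreival")
```

The best match is returned first, so an editor could offer it as the default choice and list any remaining suggestions below it.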

We have introduced the following new tags:

  • <semhdr> ....... </semhdr>
  • <subtitle> ....... </subtitle>
  • <alttitle> ....... </alttitle>
  • <char-set> ....... </char-set>
  • <Language> ....... </Language>
  • <comment> ....... </comment>
  • <author> ....... </author>
  • <aname> ....... </aname>
  • <aorg> ....... </aorg>
  • <aaddress> ....... </aaddress>
  • <aphone> ....... </aphone>
  • <afax> ....... </afax>
  • <aemail> ....... </aemail>
  • <Subject> ....... </Subject>
  • <General> ....... </General>
  • <Sublevel1> ....... </Sublevel1>
  • <Sublevel2> ....... </Sublevel2>
  • <Keyword> .......</Keyword>
  • <Dates> ....... </Dates>
  • <Creatred> ....... </Creatred>
  • <Expiry> ....... </Expiry>
  • <Updated> ....... </Updated>
  • <Version> ....... </Version>
  • <Hardware> ....... </Hardware>
  • <Software> ....... </Software>
  • <Coverage> ....... </Coverage>
  • <Classification> ....... </Classification>
  • <Annotation> ....... </Annotation>
  • <URL> ....... </URL>
  • <URN> ....... </URN>
  • <UAS> ....... </UAS>
  • <Cost> ....... </Cost>
  • <abstract> ....... </abstract>
  • <size> ....... </size>

More details to come ASAP.

________________________________________

bcd