Workshop A: Web-wide Indexing/Semantic Header or Cover Page

Chair: Bipin C. Desai, Brian Pinkerton

John Leavitt

The WebAnts project is developing cooperative explorers and cooperative servers, called "ants". This cooperation allows the processing and throughput load to be distributed among many machines. During indexing and exploration, many explorers work at once and stay in constant communication with each other to ensure that no area gets re-explored. This expands both the processing and throughput resources available, so much more of the web can be explored in a given period of time.

When serving information, a similar situation will exist, with many servers each providing only part of the entire database. When a user asks one server for information, the server can not only check its own portion of the database but can also, if necessary or desired, pass the request on to the other servers. The responses from the different servers can then be assembled into a single response for the user. In general, the user should not even notice that this extra work is being done. The advantage is that no single machine has to provide all the processing, throughput, and storage.

Note that the WebAnts project is concerned with developing the technology to support this sort of distribution. There remain significant questions as to the best way to break up the database. One obvious approach is to distribute the information based on its geographic location: for example, a server in London holds the database for all European documents, one in Pittsburgh holds all of the US and Canada, and so on. This approach would have many advantages, not the least of which is eliminating unnecessary traffic between continents. Unfortunately, it also has a serious problem: it does not reflect the structure of the web, and is therefore unlikely to mirror the distribution during exploration.
This lack of symmetry would necessitate an intermediate step in which each explorer attempts to determine which servers should have which information and ships it off to them. Such a step would be non-trivial to implement and could decrease the value of distributing the exploration phase.

The WebAnts project recently received funding from Texas Instruments so that a more concentrated, albeit still part-time, effort may be put forth. As part of this effort, cooperative explorers are currently being developed. These will follow a master-and-servants design, in which each master coordinates the efforts of its servants, both among them and by communicating with other masters. The first versions should start exploring the web by the end of March. The next phase of the project is the development of distributed servers, which should make their appearance in early fall.
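
The two kinds of cooperation described in this abstract can be illustrated with a small sketch. It is not the WebAnts implementation, whose design is not given here; the names `Coordinator`, `explore`, and `Server` are hypothetical, and everything runs in a single process over in-memory data, whereas the real explorers and servers would communicate over the network. The first part shows explorers claiming pages through a shared registry so that no page is explored twice; the second shows a server answering from its own slice of the database and merging in its peers' answers.

```python
from dataclasses import dataclass, field


class Coordinator:
    """Hypothetical shared registry that explorers consult before fetching."""

    def __init__(self):
        self._claimed = set()

    def claim(self, url):
        """Return True only for the first explorer that asks for this URL."""
        if url in self._claimed:
            return False
        self._claimed.add(url)
        return True


def explore(frontier, links, coordinator):
    """Walk the link graph breadth-first, skipping URLs already claimed.

    `links` maps each URL to the URLs it points to; it stands in for
    actually fetching and parsing pages.
    """
    visited = []
    queue = list(frontier)
    while queue:
        url = queue.pop(0)
        if not coordinator.claim(url):
            continue  # another explorer (or this one) already has it
        visited.append(url)
        queue.extend(links.get(url, []))
    return visited


@dataclass
class Server:
    """Holds one slice of the index (term -> URLs) and knows its peers."""

    index: dict
    peers: list = field(default_factory=list)

    def local_lookup(self, term):
        return list(self.index.get(term, []))

    def query(self, term):
        """Answer from the local slice, then merge in each peer's answer."""
        results = self.local_lookup(term)
        for peer in self.peers:
            for url in peer.local_lookup(term):
                if url not in results:  # drop duplicates across servers
                    results.append(url)
        return results
```

In this sketch a peer is only asked for its local slice, which avoids requests bouncing between servers indefinitely; a real system would also have to decide when forwarding is "necessary or desired", which the abstract leaves open.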