The Interspace Analysis Environment

Les Tyrrell
CANIS - Community Architectures for Network Information Systems
University of Illinois at Urbana-Champaign, Champaign, IL 61820
tyrrell@canis.uiuc.edu


"The uncertainty about how to search text, and how to interpret it if we do search it, represents the principal difference between database management systems, such as dBase IV or SQL, and information retrieval systems, such as DIALOG or MEDLARS."

- Charles T. Meadow, "Text Information Retrieval Systems"

Enabling Interaction with Advanced Information Retrieval Systems

The Interspace Analysis Environment will provide users with the means to directly manipulate and combine arbitrary information retrieval capabilities, whether traditional techniques such as full-text search or advanced techniques such as exploring the term cooccurrence relationships provided by the Interspace Services. The emphasis on an environment indicates that we wish to give users a level of control and power over their work comparable to that enjoyed by the programmer, as opposed to more typical application-oriented user interfaces, which give the user only those abilities that the system programmer deemed necessary.

To give the user this level of control, we are taking an object-oriented approach to the construction of the Analysis Environment's direct-manipulation user interface. In this approach, the objects presented to the user are meant to embody as closely as possible the underlying concepts of the user semantic model, that is, the model we wish to present to the user as if it were the actual system, regardless of the actual implementation of the underlying system. This is a particularly good fit for the Interspace model: because it is being implemented using object-oriented techniques, the actual system model and our user semantic model correspond closely.

In our object-oriented direct-manipulation user interface, we emphasize the inherent capabilities of the presented objects. Applications are defined more by the objects found within them than by the windows in which they appear. This approach has the advantage that difficult-to-share application-specific capabilities are minimized, while object-specific capabilities are more readily identified and reused throughout the environment, independent of any one particular "application". For instance, within the Interspace model a term which has been extracted from the documents of a given domain has a number of inherent relationships with other objects, such as its appearance in a number of documents, its cooccurrence with other terms in the documents of the domain, and its use by particular authors. No matter where a term is presented within the analysis environment, these remain its inherent capabilities and they are always available to the user.
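
As a rough illustration of this idea, the following Python sketch models a term whose relationships travel with it wherever it is displayed. It is not the Interspace implementation (which is written in Smalltalk); the class names `Document`, `Domain`, and `Term`, and their methods, are hypothetical and chosen only for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical document record: a title, its authors, and its extracted terms."""
    title: str
    authors: list[str]
    terms: list[str]


@dataclass
class Domain:
    """Hypothetical stand-in for a domain: a named collection of documents."""
    name: str
    documents: list[Document] = field(default_factory=list)

    def term(self, text: str) -> Term:
        # A term is always created in the context of the domain it came from.
        return Term(text, self)


@dataclass
class Term:
    """A term carries its inherent relationships, independent of any window."""
    text: str
    domain: Domain

    def documents(self) -> list[Document]:
        # Documents of the domain in which this term appears.
        return [d for d in self.domain.documents if self.text in d.terms]

    def cooccurring_terms(self) -> dict[str, int]:
        # Other terms found in the same documents, with a document count for each.
        counts: dict[str, int] = {}
        for doc in self.documents():
            for other in set(doc.terms):
                if other != self.text:
                    counts[other] = counts.get(other, 0) + 1
        return counts

    def authors(self) -> set[str]:
        # Authors of the documents in which this term appears.
        return {a for d in self.documents() for a in d.authors}
```

Any view that holds a `Term`, whether a search result list or a document abstract, can offer the same three relationships, because they belong to the object rather than to the view.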

Design Guidelines

To be convincing in our illusion of the user semantic model while simultaneously giving the user great control over that illusion, we follow these three guidelines:

Architecture

Components of the Analysis Environment are built according to a layered strategy. Each layer corresponds to a particular role or task required by the analysis environment, and objects are developed to implement one or more of these roles. Beginning at the lowest level, these layers are:

Development

The layered architecture we are using in development of the Analysis Environment has allowed us to focus on developing particular layers while remaining somewhat independent of the maturity of the components in the other layers. We have used this feature to emphasize the development of the lower layers, from the system model up through the Interaction/Presentation layers, while relying on a simple textual implementation of the Presentation half of the I/P layer through to the Browser layer. This initial phase of development is known as the "Interspace Services Technology Demonstrator, Mk. I" (ISTD). The second phase of development (now underway) will focus on developing the upper layers of the architecture. This will employ an entirely different form of display technology, based fully upon a graphical direct-manipulation user interface toolkit known as "Morphic", originally developed in the Self language but now ported into the dialect of Smalltalk used to build the analysis environment. This phase is distinguished with the Mk. II designation for ISTD. Both of these phases are considered to be leading up to the final realization of the Analysis Environment, which might also be thought of as ISTD Mk. III.

The Show so Far

The goals of ISTD mk. I were to encounter and develop the roles and interaction patterns of the lower-level infrastructure needed by the analysis environment. This is sufficiently complete that we have now shifted emphasis to the upper levels, which are the province of ISTD mk. II. Lower-level work will continue as needed, in response to requirements identified as the upper layers of the analysis environment architecture change.

In Action

Presently ISTD mk. I is working with real information held in object databases built according to the system model for the Interspace Services. An example session using ISTD mk. I is shown in the following screendump, which contains, from left to right:

Domains is a window listing select root domains and their subdomains, as exposed by the user. From this list, two were chosen, one to be the source and the other the target for this switching demonstration.

Colorectal Neoplasms, Hereditary Nonpolyposis and its accompanying Search box are two windows for browsing this domain, which contains documents on a form of hereditary cancer. In the search box, we have used the term hereditary cancer as a starting point. The first level of terms shown beneath the title consists of terms matching this search query. The second level consists of terms related via cooccurrence to the phrase Hereditary nonpolyposis colorectal cancer; in that list, the phrase mismatch repair genes appeared to be interesting. Terms related to mismatch repair genes are further exposed as the third level of terms.

Genes, Regulator and its accompanying search and switching boxes are windows corresponding to another domain of knowledge, in which the user is looking for possible connections between regulating genes and hereditary cancer. The user has already tried a search for mismatch repair genes, but this returned no results. Instead, they have made use of a rudimentary vocabulary switch based on a permuted lexical exploded search for the phrase mismatch repair genes and all phrases related to it via cooccurrence within the domain "Colorectal Neoplasms, Hereditary Nonpolyposis". Results of this switch form the first level of phrases in the "Genes, Regulator" window.

The term polymerase chain reaction technique appeared to be an interesting topic, one that is in both domains and perhaps somehow related to mismatch repair genes. The second level of items in the list (the underlined items) represents the documents in which the term polymerase chain reaction technique was found. From that list, the document titled "Regulation by CDF/LIF and retinoic acid of multiple ChAT mRNAs produced from distinct promoters" was selected in order to view its abstract.

Within the abstract, a modest hint about an associative hypertext ability exists in the form of the buttons upon which the author names are given. These could launch other views onto the selected author, and similar abilities are planned for the text within the documents so that the user would be informed of the phrases within the document which have become part of the domain and/or the domain's concept space.
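
The vocabulary switch used in this session can be sketched roughly as follows. The text does not define "permuted lexical exploded search", so this sketch assumes one plausible reading: the source phrase and its cooccurring phrases are broken into individual words ("exploded"), and target-domain phrases sharing any of those words, in any order ("permuted"), are returned. The function names, the stopword list, and the ranking by word overlap are all assumptions made for illustration, not the ISTD algorithm.

```python
# Small illustrative stopword list (an assumption, not taken from ISTD).
STOPWORDS = {"and", "of", "the", "in", "by", "via"}


def explode(phrase: str) -> set[str]:
    """Break a phrase into its lower-cased content words."""
    return {w for w in phrase.lower().split() if w not in STOPWORDS}


def vocabulary_switch(
    phrase: str,
    source_cooccurrence: dict[str, list[str]],
    target_terms: list[str],
) -> list[str]:
    """Map a source-domain phrase onto candidate phrases in a target domain.

    source_cooccurrence maps a phrase to the phrases that cooccur with it in
    the source domain; target_terms is the phrase vocabulary of the target.
    """
    # Start from the phrase itself plus everything it cooccurs with.
    candidates = [phrase, *source_cooccurrence.get(phrase, [])]
    words: set[str] = set()
    for candidate in candidates:
        words |= explode(candidate)

    # Keep target phrases sharing at least one word, regardless of word order.
    scored = []
    for term in target_terms:
        overlap = explode(term) & words
        if overlap:
            scored.append((len(overlap), term))

    # Most shared words first; ties broken alphabetically.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [term for _, term in scored]
```

Under this reading, a phrase absent from the target domain can still lead somewhere: its cooccurring phrases in the source domain supply additional words that may match the target vocabulary.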

As it turns out, this was probably not a productive line of search, but that too is part of the search process. The environment allows the user to pursue differing lines of attack without getting locked into any particular path: at any point, any object reachable by the user can serve as a context-breaking point via that object's inherent relationships with other objects in the environment. Alternatively, the path taken by the user in exposing the various (and self-describing) relationships can be used to form a context, enabling the user to narrowly specify a particular query as a natural side effect of browsing through the space of concepts. This context-forming ability is not demonstrated explicitly here, but is an existing ability within ISTD.
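
One way to picture context formation is as a record of the relationships the user has exposed, which is then reused as a conjunctive filter. The sketch below is a minimal illustration under that assumption; `BrowsePath`, its methods, and the relationship labels are hypothetical, and ISTD's actual mechanism is not described here.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class BrowsePath:
    """Hypothetical record of the (relationship, term) steps a user has exposed."""
    steps: list[tuple[str, str]] = field(default_factory=list)

    def expose(self, relationship: str, term: str) -> BrowsePath:
        # Record one browsing step; returning self allows steps to be chained.
        self.steps.append((relationship, term))
        return self

    def context_terms(self) -> list[str]:
        # The terms visited so far, in the order they were exposed.
        return [term for _, term in self.steps]

    def narrow(self, documents: dict[str, set[str]]) -> list[str]:
        # Keep only documents that contain every term on the path.
        wanted = set(self.context_terms())
        return sorted(title for title, terms in documents.items() if wanted <= terms)
```

For example, a path that starts from a search for one term and then follows a cooccurrence link to a second term would, under this sketch, select only the documents containing both terms, without the user ever writing that query explicitly.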