Feature Extraction: A Pattern for Information Retrieval

SUMMARY
The volume of information we can access is constantly increasing. Naturally, this growth has been influenced by the technology available for information processing. For a long time, printed paper was the prevalent information carrier. With the ubiquity of computers, current information is produced and processed in electronic format.

However, volume is only one of the prerequisites of an information-driven society. Equally important is the ability to find the pieces of information relevant to a particular problem. Having the answer to a question but not being able to find it is equivalent to not having it at all.

Current searching methods and algorithms are based on assumptions about technology and goals that seemed reasonable before the widespread use of computers. However, these assumptions no longer hold in the context of information retrieval systems. "Good ideas do not always scale." [Kay]

This paper presents a pattern that provides a proven solution for searching large volumes of information. The pattern is specific to the information retrieval domain. However, information retrieval has expanded into other fields such as office automation, genome databases, fingerprint identification, medical imaging, data mining, and multimedia. Because the pattern works with any kind of data, it is applicable in these domains as well. The paper provides examples for text searching, telecommunications, stock prices, medical imaging, and trademark symbols.

The key idea of the pattern is to map from a large, complex problem space into a small, simple feature space. The mapping is the creative part of the solution and it is different for every type of application. Mapping into the feature space is also the hard part of this pattern.
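To make the mapping idea concrete, here is a minimal sketch in Python. It is an illustration only, not the mapping used in the paper: the choice of letter frequencies as features, the Euclidean distance, and the names extract_features, distance, and most_similar are all assumptions made for this example. Each text, whatever its length, is reduced to a fixed vector of 26 numbers, and similarity search then runs in that small feature space instead of over the original texts.

```python
import math
import string
from collections import Counter


def extract_features(text):
    """Map a text of any length to a 26-dimensional vector of letter frequencies.

    Hypothetical mapping chosen for illustration; a real application would
    design features suited to its own problem space.
    """
    counts = Counter(ch for ch in text.lower() if ch in string.ascii_lowercase)
    total = sum(counts.values())
    if total == 0:
        # No letters at all: return the zero vector rather than dividing by zero.
        return [0.0] * 26
    return [counts[ch] / total for ch in string.ascii_lowercase]


def distance(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def most_similar(query, documents):
    """Return the name of the document whose features are closest to the query's.

    `documents` maps a name to its full text. In practice the feature index
    would be built once and reused, not rebuilt for every query.
    """
    index = {name: extract_features(text) for name, text in documents.items()}
    q = extract_features(query)
    return min(index, key=lambda name: distance(q, index[name]))
```

A call such as most_similar("some query text", {"doc1": "...", "doc2": "..."}) compares only 26-element vectors, so the cost of each comparison no longer depends on document length. The price is that the match is approximate: two different texts can map to the same point in feature space.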

Traditional algorithms scale poorly for problems typical of the information retrieval domain. Since they were designed for exact matching, their use for similarity search is cumbersome. In contrast, feature extraction provides an elegant and efficient alternative. With information retrieval expanding into other fields, this pattern is useful to a wide range of applications.


FOR MORE INFORMATION

http://jerry.cs.uiuc.edu/~manolesc/plop98/
manolesc@cs.uiuc.edu