Feature Extraction: SUMMARY

The volume of information alone is only one of the prerequisites of an information-driven society. Equally important is the ability to find the pieces of information relevant to a particular problem. Having the answer to a question but not being able to find it is equivalent to not having it at all. Current searching methods and algorithms are based on assumptions about technology and goals that seemed reasonable before the widespread use of computers. However, these assumptions no longer hold in the context of information retrieval systems. "Good ideas do not always scale." [Kay]

This paper presents a pattern that provides a proven solution for searching large volumes of information. The pattern is specific to the information retrieval domain. However, information retrieval has expanded into other fields such as office automation, genome databases, fingerprint identification, medical imaging, data mining, and multimedia. Because the pattern works with any kind of data, it is applicable in many of these domains as well. The paper provides examples for text searching, telecommunications, stock prices, medical imaging, and trademark symbols.

The key idea of the pattern is to map from a large, complex problem space into a small, simple feature space. The mapping is the creative part of the solution, and it is different for every type of application. It is also the hard part of the pattern.

Traditional algorithms scale poorly for problems typical of the information retrieval domain. Because they were designed for exact matching, using them for similarity search is cumbersome. In contrast, feature extraction provides an elegant and efficient alternative. With information retrieval expanding into other fields, this pattern is useful to a wide range of applications.
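To make the mapping idea concrete, the following is a minimal sketch for the text-searching case. It is not the mapping used in the paper: the choice of a 26-dimensional letter-frequency vector as the feature space, the Euclidean distance, and the names `extract_features`, `build_index`, and `most_similar` are all assumptions made for illustration. The point is only that documents of arbitrary length are reduced to small fixed-size vectors, and similarity search then runs entirely in that feature space.

```python
import math
import string
from collections import Counter

# Assumed feature space: one dimension per lowercase ASCII letter.
ALPHABET = string.ascii_lowercase


def extract_features(text):
    """Map a text of any length to a 26-dimensional letter-frequency vector."""
    counts = Counter(ch for ch in text.lower() if ch in ALPHABET)
    total = sum(counts.values())
    if total == 0:
        # No letters at all: return the zero vector.
        return [0.0] * len(ALPHABET)
    return [counts[ch] / total for ch in ALPHABET]


def distance(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_index(documents):
    """Extract features once per document; `documents` maps name -> text."""
    return {name: extract_features(body) for name, body in documents.items()}


def most_similar(query, index):
    """Return the name of the indexed document closest to the query."""
    q = extract_features(query)
    return min(index, key=lambda name: distance(q, index[name]))


if __name__ == "__main__":
    # Hypothetical toy corpus, for illustration only.
    docs = {
        "a": "aaaa bbbb",
        "z": "zzzz yyyy",
    }
    index = build_index(docs)
    print(most_similar("aab", index))
```

Letter frequencies are a deliberately crude feature set. In a real application the mapping would be chosen to preserve whatever notion of similarity matters for the data at hand, which is why the summary calls the mapping the creative and hard part of the pattern.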