Abstract of:
Entry Vocabulary -- a Technology to Enhance Digital Search
Fredric Gey, Michael Buckland, Aitao Chen, and Ray Larson
This paper describes a search technology which enables improved search
across diverse genres of digital objects -- documents, patents, cross-language
retrieval, numeric data and images. The technology leverages human indexing
of objects in specialized domains to provide increased accessibility to
non-expert searchers. Our approach is to reverse-engineer text categorization
to supply mappings from ordinary language vocabulary to specialist vocabulary
by constructing maximum likelihood mappings between words and phrases
and classification schemes. This forms the training data or 'entry vocabulary';
subsequently user queries are matched against the entry vocabulary to
expand the search universe. The technology has been applied to search
of patent databases, numeric economic statistics, and foreign language
document collections.
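
The mapping step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the human-indexed records are available as (text, category) pairs, and it estimates the association between a word and a category by simple relative frequency, which stands in for the maximum likelihood mappings mentioned in the abstract. The function names, the toy records, and the category labels are all hypothetical.

```python
from collections import Counter, defaultdict

def build_entry_vocabulary(records):
    """Count, for each word, how often it occurs in items of each category.

    records: iterable of (text, category) pairs taken from human-indexed items.
    """
    word_cat = defaultdict(Counter)
    for text, category in records:
        # Count each word once per record so long texts do not dominate.
        for word in set(text.lower().split()):
            word_cat[word][category] += 1
    return word_cat

def suggest_categories(entry_vocab, query, top_n=3):
    """Rank categories by the summed relative frequency of the query words."""
    scores = Counter()
    for word in query.lower().split():
        counts = entry_vocab.get(word)
        if not counts:
            continue  # word never seen in the indexed collection
        total = sum(counts.values())
        for category, n in counts.items():
            scores[category] += n / total  # estimate of P(category | word)
    return [category for category, _ in scores.most_common(top_n)]

# Toy, made-up records; real training data would come from an indexed database.
records = [
    ("car engine exhaust", "exhaust-systems"),
    ("engine exhaust muffler", "exhaust-systems"),
    ("automobile engine piston", "engines"),
    ("car brake disc", "brakes"),
]
entry_vocab = build_entry_vocabulary(records)
suggested = suggest_categories(entry_vocab, "car brake")
```

The suggested categories could then be added to the user's original query, so that a search phrased in ordinary language also retrieves items that share no words with it but carry the specialist classification.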