Last Update: 24-Sep-2001


DARPA TIDES Project
Translingual Information Management Using Domain Ontologies
Overview:
The very large and continuing investment in the creation of online
bibliographies and digital libraries has resulted in a body of tens of
millions of textual records in all languages, carefully categorized by
topic using systems for the organization of recorded knowledge -- indexing
languages, library classifications, and topical thesauri, collectively
"domain ontologies." This vast infrastructure, maintained in accordance
with well-established and increasingly interoperable standards and
protocols, can be viewed as a corpus of carefully coded language fragments:
titles, metadata, and, sometimes, summaries or the full text of documents.
This project will demonstrate how these language fragments can be
extracted and manipulated to:
  - Create topical dictionaries showing the topic(s) associated with each
    word in any selected language;
  - Extend the range and scale of these dictionaries using conventional
    bilingual or multilingual dictionaries;
  - Use bilingual parallel texts where available to extend the range and
    scale of topical dictionaries;
  - Collect corpora in digital form of contemporary discourse in
    little-documented languages.
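As a rough illustration of the first two goals, the sketch below builds a small topical dictionary from categorized titles and then projects it onto another language through a bilingual word list. It is not the project's implementation: the record format, the naive tokenizer, the sample records, the French translations, and the function names build_topical_dictionary and extend_with_bilingual are all assumptions made for this example.

```python
import re
from collections import Counter, defaultdict


def build_topical_dictionary(records):
    """Map each word to a Counter of the topics it co-occurs with.

    `records` is an iterable of (title, topics) pairs; this layout is an
    assumption for illustration, not a real bibliographic record format.
    """
    word_topics = defaultdict(Counter)
    for title, topics in records:
        # Naive tokenization: lowercase runs of letters. Real data would
        # need language-specific tokenization and stop-word handling.
        for word in set(re.findall(r"[^\W\d_]+", title.lower())):
            for topic in topics:
                word_topics[word][topic] += 1
    return word_topics


def extend_with_bilingual(word_topics, bilingual):
    """Carry topic counts over to translations from a bilingual word list."""
    extended = defaultdict(Counter)
    for source_word, translations in bilingual.items():
        for target_word in translations:
            extended[target_word].update(word_topics.get(source_word, Counter()))
    return extended


# Hypothetical sample data: titles with their assigned topic headings.
records = [
    ("Irrigation of rice fields", ["Agriculture"]),
    ("Rice trade and tariffs", ["Economics", "Agriculture"]),
    ("Tariffs in world trade", ["Economics"]),
]
# Hypothetical English-to-French word list.
bilingual = {"rice": ["riz"], "trade": ["commerce"]}

topical = build_topical_dictionary(records)
extended = extend_with_bilingual(topical, bilingual)

print(topical["rice"].most_common())   # [('Agriculture', 2), ('Economics', 1)]
print(extended["commerce"].most_common(1))  # [('Economics', 2)]
```

Raw co-occurrence counts are the simplest possible association measure; function words such as "of" and "in" receive topics too, so a practical version would likely weight or filter the counts.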
This project builds directly on the Unfamiliar Metadata project
and on the CHESHIRE II retrieval system and is part of the Metadata
Research Program.
Quad chart: Ideas, impact and schedule.