Metadata Research Program: Tides Project Home Page

Last Update: 24-Sep-2001

DARPA TIDES Project Home

Investigators:

Michael Buckland | Fred Gey | Ray Larson

Translingual Information Management Using Domain Ontologies

Overview: The very large and continuing investment in the creation of online bibliographies and digital libraries has resulted in a body of tens of millions of textual records in all languages, carefully categorized by topic using systems for the organization of recorded knowledge -- indexing languages, library classifications, and topical thesauri, collectively "domain ontologies." This vast infrastructure, maintained in accordance with well-established and increasingly interoperable standards and protocols, can be viewed as a corpus of carefully coded language fragments: titles, metadata, and, sometimes, summaries or the full text of documents.
    This project will demonstrate how these language fragments can be extracted and manipulated to:
      Create topical dictionaries showing the topic(s) associated with each word in any selected language;
      Extend the range and scale of these dictionaries using conventional bilingual or multilingual dictionaries;
      Use bilingual parallel texts where available to extend the range and scale of topical dictionaries;
      Collect corpora in digital form of contemporary discourse in little-documented languages.
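    The first two steps above can be illustrated with a minimal Python sketch. It is not the project's implementation: the sample records, the topic labels, the bilingual entries, and the function names build_topical_dictionary and extend_with_bilingual are all hypothetical, and raw co-occurrence counts stand in for whatever association measure the project actually uses.

```python
from collections import Counter, defaultdict


def build_topical_dictionary(records):
    """Map each title word to a Counter of the topic codes it co-occurs with.

    `records` is an iterable of (title, topics) pairs, standing in for
    bibliographic records that carry assigned subject categories.
    """
    word_topics = defaultdict(Counter)
    for title, topics in records:
        # Count each word once per record, however often it repeats in the title.
        for word in set(title.lower().split()):
            for topic in topics:
                word_topics[word][topic] += 1
    return word_topics


def extend_with_bilingual(word_topics, bilingual):
    """Carry topic counts over to target-language words via a bilingual dictionary."""
    extended = defaultdict(Counter)
    for source_word, translations in bilingual.items():
        for target_word in translations:
            extended[target_word].update(word_topics.get(source_word, Counter()))
    return extended


# Hypothetical sample data, invented for illustration only.
RECORDS = [
    ("Coastal tides and erosion", ["Oceanography"]),
    ("Tides of the North Sea", ["Oceanography", "Geography"]),
    ("Erosion control in agriculture", ["Agriculture"]),
]
BILINGUAL = {"tides": ["Gezeiten"], "erosion": ["Erosion"]}

topical = build_topical_dictionary(RECORDS)
german = extend_with_bilingual(topical, BILINGUAL)

print(topical["tides"].most_common())
print(german["Gezeiten"].most_common())
```

    In this sketch a target-language word simply inherits the topic counts of every source word that translates to it. A real system would presumably need to weight or disambiguate translations; that refinement is left out here.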
    This project builds directly on the Unfamiliar Metadata project and on the CHESHIRE II retrieval system and is part of the Metadata Research Program.
    Quad chart: Ideas, impact and schedule.