Jacek Purat
School of Information Management and Systems
University of California,
Berkeley, CA 94720-4600
September 1, 1998
Introduction
Current increased interest in Multilingual Environmental Thesauri (MET) may be attributed to concerns in two major areas of knowledge: (1) environmental information and (2) information retrieval. Buckland (1991) distinguishes four major categories of retrieval: physical, locating, identification, and identification by subject matter. The last category is the one applicable to the discussion of MET. The scope of the other area, environmental information science, resists precise definition (Lidicker, W. pers. comm.). In general it is understood to be a multidisciplinary field, composed of elements of the natural sciences, including biology, physical geography, physics, chemistry, and medicine; the social sciences, such as cultural geography, archaeology, anthropology, and sociology; economics; the technical sciences, such as engineering, architecture, transport, and energy; and other fields, such as law, education, and ethics. Each of these fields has its own information tradition and uses its own specific terminology. This difference is reflected in the field-specific indexing languages and other metadata used in each field. In many cases terminology overlaps between fields, or different or similar terms are used to describe the same information object, creating a problem of access for different users. The problem is especially troublesome within the environmental field, which demands a multidisciplinary approach to provide solutions to complex environmental problems. It is not confined to the English language, but is amplified by existing in parallel within other natural languages. Multilingual information retrieval adds even further complexity to the environmental information field, because different classification and indexing traditions exist in each language.
We are thus dealing with two "multi" levels: multidisciplinary and multilingual.
There are numerous retrieval approaches that address these complexities,
designed to aid in multilingual retrieval of environmental information.
One of the major approaches is through Multilingual Environmental Thesauri.
Thesauri
Thesaurus is a term derived from the Greek word ‘thesauros’, meaning treasure or treasury. The plural of the term is "thesauri". The traditional definition of a thesaurus is a work that contains the entire knowledge of a given discipline, or a dictionary representing the entire vocabulary of a given language (Doroszewski, W. 1981). The most famous example is Roget's Thesaurus, a subject guide to lists of words with similar meanings.
The field of information and library science defines a thesaurus as "a compilation of words and phrases showing synonyms, hierarchical and other relationships and dependencies, the function of which is to provide a standardized vocabulary for information storage and retrieval systems" (Rowley, J. 1996). Such a list of thesaurus terms, also called an authority list, is useful in showing which terms may be used in indexing and which should not. Thesauri are thus used as linguistic tools in research and reading, but their most widespread use is in indexing and information retrieval.
The primary purpose of the thesaurus is to aid in information retrieval (Aitchison, J. et al, 1997). This is achieved by utilizing the thesaurus in indexing, and later in searching. Based on retrieval utility, there are four ways that the thesaurus may be used:
(1) Thesaurus used in both indexing and searching
(2) Thesaurus used in indexing, but not searching
(3) Thesaurus used in searching, but not indexing
(4) Thesaurus used in neither
Most desirable from the retrieval point of view is the first type of use.
Thesauri are categorized by the disciplines that they represent, for example an astronomical thesaurus or a chemical thesaurus. Some thesauri represent single classical academic disciplines, such as a medical thesaurus, and others represent multidisciplinary fields, such as an environmental thesaurus. A significant group of thesauri represents applied fields of knowledge, such as a water resources thesaurus or an agricultural thesaurus. In addition, thesauri are divided into monolingual and multilingual, based on the number of languages that they include.
The thesaurus controls the use of the vocabulary by showing which are
the approved terms for indexing. The indexing terms may be selected
from standard terminologies or from any established ones. The ISO-2788
Standard for Monolingual Thesauri gives the following definition:
Thesaurus: the vocabulary of a controlled indexing language, formally organized so that a priori relationships between concepts (for example as "broader" and "narrower") are made explicit.
The standard further specifies the indexing terms that must be
used in indexing thesauri.
Indexing term: the representation of a concept, preferably in the form of a noun or a noun phrase. An indexing term can consist of more than one word, and is then known as a compound term. In a controlled indexing vocabulary, a term is designated either as a preferred term or as a non-preferred term.
Preferred term: A term used consistently when indexing to represent a given concept; sometimes known as "descriptor".
Non-preferred term: The synonym or quasi-synonym of a preferred term. A non-preferred term is not assigned to documents, but is provided as an entry point in a thesaurus or alphabetical index, the user being directed by an instruction (for example USE or SEE) to the appropriate preferred term; sometimes known as "non-descriptor".
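The USE direction from a non-preferred entry term to its preferred term can be sketched in a few lines of Python. This is only an illustration: the sample vocabulary and the function names `resolve` and `used_for` are assumptions made for the example, not part of ISO 2788 or of any thesaurus discussed here.

```python
# Hypothetical sample vocabulary: non-preferred term -> preferred term (USE).
USE = {
    "garbage": "refuse",
    "rubbish": "refuse",
    "trash": "refuse",
}

# Preferred terms (descriptors): the only terms assigned to documents.
PREFERRED = {"refuse", "molecule"}


def resolve(term):
    """Return the preferred term for `term`, or None if it is not in the vocabulary."""
    key = term.strip().lower()
    if key in PREFERRED:
        return key
    return USE.get(key)


def used_for(preferred):
    """Return the non-preferred entry terms (UF) that point at `preferred`."""
    return sorted(t for t, p in USE.items() if p == preferred)
```

An indexer or searcher who enters "Trash" is directed to the descriptor "refuse", while `used_for("refuse")` lists the entry points that lead to it.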
All preferred and non-preferred terms can be classified into five distinct forms:
--Single words (example: Molecule; Bioregionalism; Recarbonization; Refuse)
--Phrases of two or three words (example: Country Life; Money Laundering)
--Two words linked by ‘and’ or ‘&’ (example: Boats & Boating)
--Compound phrases (example: Habitat conservation plans; Human exposure evaluation)
--Names of persons, bodies, places (example: Kosciuszko, Thaddeus; Berkeley; Miskito Nation)
The relationships specified by the standard, and their abbreviations
are presented below:
SN   Scope note; a note attached to a term to indicate its meaning within an indexing language
USE  The term that follows the symbol is the preferred term when a choice between synonyms or quasi-synonyms exists
UF   Used for; the term that follows the symbol is a non-preferred synonym or quasi-synonym
TT   Top term; the term that follows the symbol is the name of the broadest class to which the specific concept belongs; sometimes used in the alphabetical section of a thesaurus
BT   Broader term; the term that follows the symbol represents a concept having a wider meaning
NT   Narrower term; the term that follows the symbol refers to a concept with a more specific meaning
RT   Related term; the term that follows the symbol is associated, but is not a synonym, a quasi-synonym, a broader term or a narrower term
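The hierarchical relationships lend themselves to a short sketch. The Python below assumes a simplified mono-hierarchy in which each term has at most one broader term; the sample terms and the function names are hypothetical and serve only to show how NT and TT can be derived from stored BT links.

```python
# Hypothetical BT links: term -> its broader term (mono-hierarchy for simplicity).
BT = {
    "redwood grove": "forest",
    "forest": "ecosystem",
    "wetland": "ecosystem",
}


def narrower(term):
    """NT is the reciprocal of BT: all terms whose broader term is `term`."""
    return sorted(t for t, b in BT.items() if b == term)


def top_term(term):
    """Follow BT links upward until a term with no broader term is reached (TT)."""
    seen = {term}
    while term in BT:
        term = BT[term]
        if term in seen:  # guard against a cycle in malformed data
            raise ValueError("cycle in BT links")
        seen.add(term)
    return term


def expand(term):
    """Return `term` followed by all of its narrower terms, at every level.

    Assumes the BT links contain no cycle.
    """
    result = [term]
    for child in narrower(term):
        result.extend(expand(child))
    return result
```

Storing only the BT links and deriving NT from them keeps the two directions consistent by construction. A search system could use `expand` to widen a query for a broad term so that it also retrieves documents indexed under its narrower terms.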
Two good lists of thesauri are located at:
http://www.asindexing.org/thesonet.htm
http://www-cui.darmstadt.gmd.de/%7Elutes/thesauri.html
Multilingual Thesauri
Multilingual thesauri consist of two or more monolingual thesauri cross-referenced
by concept rather than alphabetically. By means of these links, a user
can follow correspondences across multiple languages and rapidly browse
an entry's sub-categorizations of meanings and synonyms. Like a traditional
monolingual thesaurus, a multilingual thesaurus is best used to complement
one's knowledge of a language, particularly as a memory aid for infrequently
used words and among translators for basic grammatical information, such
as gender and inflection (EAGLES, 1994). As such, it is more useful as
an aid for writing in a foreign language. For this reason, multilingual
thesauri have found favor in the administrative sector among non-language
professionals preparing correspondence in foreign languages. Professional
translators, in fact, may find the basic vocabulary of the current generation
of such packages too limited.
A well-designed interface can make an electronic multilingual thesaurus
easier to navigate than traditional monolingual paper-based thesauri. In
addition, the underlying lexical data can be organized in such a way as
to allow other ways of navigating around entries than just on the basis
of the formal synonym relationship.
A multilingual thesaurus used in indexing and retrieval is a thesaurus
containing indexing vocabularies representing a given field in two or more
languages. This type of thesaurus can be used to support indexing
and searching in several languages. In addition to standard relationships
present in the thesaurus (SN, USE, UF, TT, BT, NT, RT) it contains
the additional relationship called equivalence (=), which connects equivalent
terms between languages (Aitchison, et al 1997). The problems of multilingual
thesaurus construction are similar to monolingual thesaurus construction;
providing, of course, that there are competent linguists available. Perhaps
the most difficult aspect is that of human organization, often involving
the work of international committees. The relevant standard here
is ISO 5964: 1985: Guidelines for the establishment and development of
multilingual thesauri (International, 1985a), the equivalent to British
Standard 6723: 1985 (British, 1985a), which states clearly right at the
start: ‘The guidelines given in this International Standard should be used
in conjunction with ISO 2788, and regarded as an extension of the scope
of the monolingual guidelines. The majority of procedures and recommendations
contained in ISO 2788 are equally valid for a multilingual thesaurus’.
Thus, the handling of descriptors in multilingual thesauri is much the same as in monolingual work, with the added variety of morphologies arising from the different language forms. It may be useful in working on the thesaurus, to know that languages are classified broadly into four forms:
Inflectional languages, such as Latin, which use case endings.
These root-suffixes qualify the nouns and verbs.
Agglutinative languages, such as Turkish, Finnish and Hungarian
where the root-suffixes can, and regularly do, stand as separate words.
Isolating languages, such as Chinese, which make no use of inflection,
agglutination or prepositions.
Analytical languages, such as English, which use word order,
auxiliary words and some vestiges of inflection to provide the grammatical
structure.
However, the knowledge of structure is secondary to a good working knowledge of the languages being handled, including the socio-cultural nuances, particularly present in non-scientific subjects.
Environmental thesauri
Lloyd, G. (1996), in an assessment of environmental thesauri, observed that the domain of “environment” is so all-encompassing that it is impossible to define its boundaries. Instead, he proposed a macro/micro thesaurus approach, which would leave the linking of terms to the user. In his model, the user would have a tool to develop their own micro thesaurus, which could then be linked to the existing macro thesaurus.
Since the definition of environmental thesauri is essential for this work, I have attempted to construct a definition of an environmental thesaurus, based in part on the definition of environmental information (Purat, 1998).
Environmental thesaurus – a vocabulary of a controlled indexing language that lists environmental information objects (Purat, 1998). The goal of environmental thesauri is to further a better understanding of our environment, including all living and non-living elements and the relationships between them. The vocabulary is formally organized so that a priori relationships between concepts (for example as "broader" and "narrower") are made explicit. Types of information include real space terms: objects, events, processes, and their relations; and represented space terms: data, text, pictures, sounds and documents, collected objects, and relations between them. The following are examples of these categories:
Real space terms:
*Objects: brain, plant, animal, metal, salt, redwood grove, nest, building
*Processes: hurricane, birth, explosion, hunt, construction, copulation, seasons, life, coagulation, baking, storing, growth, symbiosis
*Relations between them: historical, spatial, analogy, parallel, genealogical
Represented space terms:
*Oral tradition: Native languages, Stories, Legends, Myths, Beliefs, Songs, Plays, Ways, Mental maps, Native classifications. (Since there are no oral thesauri, oral tradition terms are not used in environmental thesaurus)
*Documented tradition:
Data: numerical description of real space
Text: written description of real space
Maps & Tables: spatial representation of info. objects
Pictures: painted description of real space
Sound records: recorded description of real space
Documents: Vehicles for data, text, pictures or sound of informative nature
Collected representative objects: holotypes (species holotypes in museums)
Relations between them: statistical, morphological, cognitive,
Collections of above
Gareth Lloyd (1996) conducted an assessment of environmental thesauri
for the EEA Catalogue of Data Sources. He focused on the logical
structuring of the thesauri and the nature of the terms used. The
work includes: (1) a bilateral comparison between the 1991 CNR Trilingual Thesaurus
and the 1990 INFOTERRA Thesaurus, (2) a recommendation for the GEMET Thesaurus,
(3) a description of software currently used for the construction, maintenance,
and searching of thesauri.
Multilingual environmental thesauri
Multilingual environmental thesaurus (MET) – an expanded environmental
thesaurus. It is composed of a set of multilingual vocabularies of controlled
indexing languages, which describe the types of information objects that
define the aggregate surroundings. The terms represent documented tradition
that originated from oral tradition. The indexing terms in MET include
a set of standard thesaurus relationships. MET contains the source language
and target languages, with no implication as to their status, and no one
language having dominance over the other. Like other multilingual
thesauri, MET contains indexing vocabularies representing a given field
in two or more languages. In addition to standard relationships present
in a thesaurus (SN, USE, UF, TT, BT, NT, RT) MET contains the additional
relationship called equivalence (=), which connects equivalent terms between
languages. The degree of equivalence between foreign terms varies.
There are a total of five types of equivalence occurring in METs:
-exact equivalence
-inexact equivalence
-partial equivalence
-single-to-multiple equivalence
-non-equivalence
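One way to record the degree of equivalence alongside each inter-language link is sketched below in Python. The German-to-English sample links, and the degree assigned to each, are illustrative assumptions chosen for the example; they are not taken from any of the thesauri described in this paper.

```python
from dataclasses import dataclass
from enum import Enum


class Equivalence(Enum):
    EXACT = "exact"
    INEXACT = "inexact"
    PARTIAL = "partial"
    SINGLE_TO_MULTIPLE = "single-to-multiple"
    NONE = "non-equivalence"


@dataclass(frozen=True)
class Link:
    source: str          # term in the source language
    targets: tuple       # zero or more terms in the target language
    degree: Equivalence  # how well the targets cover the source concept


# Hypothetical German -> English links, for illustration only.
DE_EN = {
    "Wald": Link("Wald", ("forest",), Equivalence.EXACT),
    "Wissenschaft": Link("Wissenschaft", ("science",), Equivalence.PARTIAL),
    "Schnecke": Link("Schnecke", ("snail", "slug"), Equivalence.SINGLE_TO_MULTIPLE),
    "Berufsverbot": Link("Berufsverbot", (), Equivalence.NONE),
}


def translate(term):
    """Return (target terms, degree label) for a source term; KeyError if unknown."""
    link = DE_EN[term]
    return list(link.targets), link.degree.value
```

Keeping the degree explicit lets an indexing or search interface warn the user when a translated term covers more or less than the original, instead of silently treating every link as exact.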
Abbreviations for MT construction are usually given in English, German,
French, and other languages, along with the mathematical symbols:
BT Broader term <
NT Narrower term >
RT Related term --
USE Use _
UF Used for =
SN Scope Note (none)
Multilingual thesaurus construction
It is rather unlikely that a new thesaurus would be constructed without
any use of existing thesauri, classifications, or subject headings.
Thus the major challenge in the construction of a multilingual thesaurus is
the integration of existing thesauri. There are five
chief problems of compatibility of terminology (International, 1996):
-Specificity: differences in degree of fineness of definition (precision)
-Exhaustivity: differences in coverage of the field
-Compound terms: differences in using compound vs. separate terms
-Synonyms: different choice regarding synonyms
-Inter-relationships: differences in structure and emphasis
Methods proposed to address these problems include mapping, switching,
merging, and integration. A historical review and annotated bibliography on
the subject are given by Dahlberg (1996) and (International, 1996).
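A very reduced form of mapping can be sketched in Python as follows. The sketch assumes that two thesauri can be compared by simple string matching on their entry terms, which real mapping work cannot rely on, since it requires intellectual judgment about meaning. Both sample thesauri and the function names are hypothetical.

```python
# Two hypothetical thesauri: preferred term -> set of non-preferred synonyms.
THESAURUS_A = {
    "refuse": {"garbage", "rubbish"},
    "sewage": {"waste water"},
    "noise": set(),
}
THESAURUS_B = {
    "waste": {"refuse", "trash"},
    "wastewater": {"sewage", "waste water"},
}


def entry_index(thesaurus):
    """Map every entry term (preferred or not) to its preferred term."""
    index = {}
    for preferred, synonyms in thesaurus.items():
        index[preferred] = preferred
        for synonym in synonyms:
            index[synonym] = preferred
    return index


def map_terms(source, target):
    """Map each preferred term in `source` to a preferred term in `target`, or None."""
    target_index = entry_index(target)
    mapping = {}
    for preferred, synonyms in source.items():
        # Try the preferred term first, then its synonyms in a fixed order.
        candidates = [preferred] + sorted(synonyms)
        mapping[preferred] = next(
            (target_index[c] for c in candidates if c in target_index), None
        )
    return mapping
```

In this toy data, "refuse" is preferred in one thesaurus and non-preferred in the other (a difference in synonym choice), while "noise" has no counterpart at all (a difference in exhaustivity); the mapping makes both cases visible.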
Multilingual thesauri management
Since the use of words changes over time and new concepts emerge, all
thesauri become outdated, and it is essential to maintain and update their
contents periodically (Aitchison, J. et al, 1997). This role should be
assigned to an editor or a group of editors who should oversee the continuous
procedure, which becomes more complex in a multilingual setting.
Major changes fall into six categories:
-Amendment of existing terms
-Status of existing terms
-Deletion or demotion of existing terms
-Addition of new, or deletion of old relationships
-Addition of new terms
-Amendment of existing structure
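As an illustration of one of these changes, the demotion of an existing term, the Python sketch below turns a preferred term into a non-preferred entry point and redirects the entry terms that used to lead to it. The data layout (a set of preferred terms plus a USE dictionary) and the function name `demote` are assumptions made for the example.

```python
def demote(preferred, use, old, new):
    """Demote preferred term `old` to a non-preferred entry pointing at `new`.

    `preferred` is a set of preferred terms; `use` maps each non-preferred
    term to its preferred term. Both are modified in place.
    """
    if old == new:
        raise ValueError("a term cannot be demoted in favour of itself")
    if old not in preferred or new not in preferred:
        raise KeyError("both terms must currently be preferred terms")
    preferred.discard(old)
    # Entry terms that pointed at the demoted term must now point at `new`.
    for entry, target in use.items():
        if target == old:
            use[entry] = new
    # The demoted term itself survives as an entry point (USE new).
    use[old] = new
```

Keeping the demoted term as an entry point means that users who still search with the old term are directed to the new descriptor. In a multilingual thesaurus the same change would also have to be reviewed for every language version.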
Thesaurus management packages can aid the management of thesaurus revision.
The comprehensive listing of such software was provided by Milstead (1998),
for the American Society of Indexers.
List of Multilingual Environmental Thesauri
1. GEMET: General European Multilingual Environment Thesaurus
The General European Multilingual Environmental Thesaurus has been
developed as an indexing, retrieval and control tool for the Catalogue
of Data Sources of the European Environment Agency. GEMET was conceived
as a "general" thesaurus, aimed to define a common general language, a
core of general terminology for the environment. The current version,
GEMET 1.0, is available in British English with equivalents in American
English. Danish, Dutch, French, German, Italian and Norwegian equivalents
are available to a large extent. GEMET comes with a glossary in English,
which contains about 1,400 definitions of descriptors that are either
relevant or ambiguous.
GEMET has been compiled by merging the terms of the following multilingual
documents:
-Umwelt Thesaurus of Umweltbundesamt (UBA), Berlin,
-Thesaurus Italiano per Medio Ambiente (TIA) of Consiglio Nazionale delle Ricerche (CNR), Rome,
-Multilingual Environment Thesaurus (MET) of Nederlands Bureau voor Onderzoek Informatie (NBOI), Amsterdam EnVoc Thesaurus,
-The UNEP Infoterra thesaurus,
-Thesaurus de Medio Ambiente of Ministerio de Obras Publicas, Transportes y Medio Ambiente (MOPTMA), Madrid,
-Lexique environnement -Planète, of the Ministère de l'environnement, Paris,
-Descriptors for the Dobris +3 report of the EEA and of Eurovoc, the Thesaurus of the European Parliament
The resulting 4,954 descriptors have been arranged in a classification
scheme of 34 groups. Each descriptor has been placed in a hierarchy.
Furthermore, to allow thematic retrieval, each descriptor has been assigned
to one or more of 41 themes. At the moment, the user can access the
thesaurus through the thematic structure, through the group-hierarchical
list, or through the alphabetical list. GEMET follows the ISO norms on
monolingual and multilingual thesauri.
Address: http://www.eea.dk/Locate/GEMET/default.htm
Format: electronic, paper
2. Environment and Development (Multilingual Thesaurus)
This is one of a very few thesauri/glossaries containing Arabic and
Chinese characters.
Retrieval utility: indexing
Languages: English, French, Spanish, Russian, Arabic, Chinese
Fields: General environmental science, policy and development
Terms: Narrow thesaurus
Authors: United Nations
Reference: Terminology Bulletin No. 344. Volume II (indexes) and Volume I.
ST/SC/SER. F/344, United Nations, New York, 1992
Format: paper
3. Multilingual Thesaurus in Geosciences
The IUGS Commission on the Management and Application of Geoscience Information
(COGEOINFO, previously COGEODOC) in collaboration with ICSTI (International
Council for Scientific and Technical Information) has recently published
the second edition of the Multilingual Thesaurus of Geosciences. The printed
version contains six languages: English (American), French, German, Russian,
Spanish and Italian. The database also contains Czech and Finnish. The
Multilingual Thesaurus in Geosciences contains 5823 key terms expressed
as descriptors (preferred terms) or non-descriptors (non-preferred terms)
in each language.
The Multilingual Thesaurus in Geosciences is compatible with major international
and national bibliographic databases and indexing vocabularies in geosciences:
the English (American) version of the MT is compatible with the AGI (American
Geological Institute) GeoRef Thesaurus, the French version with the INIST/BRGM
(Institute de l'Information Scientifique et Technique / Bureau de Recherches
Géologiques et Minières) PASCAL/GEODE Lexique, and the German
with the BGR (Bundesanstalt für Geowissenschaften und Rohstoffe) GEOLINE
Thesaurus. The other language versions are compatible with national thesauri used in Russia
(VSEGEI), Spain (ITGE/GEOMINER), Italy (CNR/GEODOC) and Finland (Geological
Survey of Finland /FINGEO). The MT gives the documentary equivalents of
major geoscience concepts in the languages included at present. The Multilingual
Thesaurus in Geosciences is also an operational tool for the development
and management of other national geoscience thesauri in a manner that ensures
compatibility with other major geoscience information systems.
Retrieval utility: indexing and searching (Georef)
Structure: alphabetical, poly-hierarchical, code numbers assigned
Languages: English, French, German, Russian, Spanish, Italian
Fields: Physical geography, geology, geophysics
Authors: IUGS Commission on the Management and Application of Geoscience Information
Format: paper, electronic
4. POPIN Thesaurus
The POPIN Thesaurus is a project of the Committee for International Cooperation
in National Research in Demography (CICRED). CICRED is a non-governmental
organization, founded in 1972 with the aim of developing cooperation amongst
national population research centers and encouraging new research. It also
establishes a link between research centers and international organizations
whose main activities lie in the field of demography: the United Nations
Population Division, the United Nations Population Fund (UNFPA), and also the
World Health Organization (WHO) and the Food and Agriculture Organization (FAO).
To enable the creation of an international network for the storage and retrieval
of population information, it was necessary to adopt a classification of
publications using a set of descriptors. The Multilingual Population
Thesaurus (English, Arabic, Chinese, Spanish, French, and Portuguese),
prepared by CICRED in collaboration with the Population Division, provides
this common language. The "Population Information Network" (POPIN), a network
created in 1981 and managed by the UN Population Division, uses this thesaurus,
of which a third version was published in 1993 under the title POPIN Thesaurus.
Scope: General
Retrieval utility: indexing and searching
Structure: listed alphabetically (possible “hidden” hierarchy)
Languages: English, French, Spanish
Fields: population, general environment, demography, public health
Terms: terms do not match retrieved terminology
Authors: CICRED
Address: http://www.cicred.ined.fr/index.html
Format: electronic (French, Spanish, English), paper (French, Spanish,
English, Arabic, Chinese, Portuguese)
5. INFOTERRA Multilingual Thesaurus
This is one of the oldest environmental thesauri. A multilingual
version is said to be under development, but it is not yet available for
viewing.
Scope: General
Retrieval utility: indexing and searching
Structure: polyhierarchy, listed alphabetically
Languages: English
Fields: general, environment, developing countries, conservation
Authors: UNEP
Address: http://www.csa.com/routenet/read/infoterrathesauraus.html
Format: electronic (English only), paper (English only)
6. Dutch Milieu-thesaurus (Multilingual Environmental Thesaurus version)
The Netherlands Agency for Research Information (NBOI) has published,
under contract of the European Commission, the Multilingual Environmental
Thesaurus. It comprises approx. 2,400 preferred terms in 8 languages (English,
French, German, Spanish, Italian, Norwegian, Danish and Dutch) and, depending
on the language, a varying number (500 - 800) of non-preferred terms (synonyms
and variants). Preference has been given to the conceptual equivalents
instead of the exact "terminological" translation. The terms of the thesaurus
have been divided into 31 groups, each group representing a specific part
of the "environmental cycle" (pollution sources, polluting substances,
air, water, soil, noise, biology, processes and techniques, etc.).
The hierarchy has been laid down in a coded structure. The linguistic equivalents
are linked to each other by these codes. The thesaurus has the normal thesaurus
relations, such as BT-NT (broader term - narrower term), USE-UF (use -
used for), RT (related term) and SEE. The publication comprises 8
volumes. Each volume consists of:
- a hierarchical, systematic part in which the terms are presented
in 31 hierarchical groups;
- an alphabetical part in which all terms are given in alphabetical
order, including their hierarchical relations (BT, NT, USE, UF, RT and
SEE);
- a numerical list of all the codes and their corresponding terms in
8 languages (this part is the same for all volumes).
Retrieval utility: indexing
Languages: English, French, German, Spanish, Italian, Norwegian,
Danish, Dutch
Fields: 31 groups of "environmental cycles"
Terms: 2,400 preferred & 500-800 non-preferred terms per language
Author: Laurens de Lavieter
Reference: http://www.hgur.se/envir/miljo21/littips/0074.html
Format: paper, 8 volumes
7. FAO AGROVOC Thesaurus ONLINE
To make the AGRIS System (International Information System for the
Agricultural Sciences and Technology) of the Food and Agriculture Organization
of the United Nations (FAO) available to non-English speaking users, FAO
and the European Community (EC) developed in the early 1980s a multilingual
agricultural vocabulary, AGROVOC. The AGROVOC Thesaurus has been
used by AGRIS for indexing and retrieval since 1986. The Third edition
of AGROVOC was published in 1995, with the first supplement. Supplements
to AGROVOC are produced yearly. The latest version of AGROVOC/Edition
3, version 1998 is available through the AGRIS Processing unit's FTP server
at the International Atomic Energy Agency in Vienna: ftp://ftp.iaea.org/dist/agris/outgoing/
Retrieval utility: indexing and searching
Structure: listed alphabetically and polyhierarchically
Languages: English, French, Spanish, German
Fields: agriculture, environment, developing countries, biology
Terms: 36,089 terms (lead language French), classical thesaurus
Author: FAO
Address: (1) http://www.fao.org/scripts/agrovoc/frame.htm
(2) http://www.cirad.fr:80/cgi-bin/agrovoc/agrovoc
Format: electronic
8. ECDIN - Environmental Chemicals Data and Information Network (thesaurus)
ECDIN is a factual databank created under the Environmental Research
Programme of the Joint Research Centre (JRC) of the Commission of European
Communities at the Ispra Establishment. The main characteristic
of ECDIN is its multidisciplinary approach, intended to allow comprehensive
management of the environmental impact of each chemical. The thesaurus
also contains factual information, to make it usable for users without
a scientific or technical background, or without easy access to libraries.
ECDIN deals with the whole spectrum of parameters and properties that might
help the user to evaluate real or potential risk in the use of a chemical
and its economical and ecological impact. The following major categories
are covered:
Identification, Physical-Chemical Properties, Production and Use, Legislation and Rules, Occupational Health and Safety, Toxicity, Concentrations and Fate in the Environment, Detection Methods, Hazards and Emergency.
The data contained in ECDIN are extracted from original published literature
or from existing databanks. Other sources are the list of names used
by the European Customs Inventory of Chemicals and European http://ulisse.ei.jrc.it/Ecdin/E_hinfo.html
Scope: Specific taxonomy
Retrieval utility: indexing and searching
Structure: polyhierarchy (7 levels), listed alphabetically
Languages: Standard, English, German, French, Dutch, Greek, Spanish, Danish, Italian, Portuguese
Fields: environmental chemicals
Terms: specific taxons
Authors: JRC/CEC
Address: http://ulisse.ei.jrc.it/
Format: electronic
9. ECPHIN – European Community Pharmaceutical Information Network
Databank
The objective of ECPHIN has been to create an information system to
store data on pharmaceutical products from official sources (or officially
recommended sources) on prices of medical products and scientific information
(SPC) to highlight disparities in prices and presentations of technical
information of equivalent pharmaceutical products.
Scope: General
Retrieval utility: indexing and searching
Structure: polyhierarchy, listed alphabetically
Languages: English
Fields: general, environment, and developing countries
Terms: specific taxons
Authors: JRC/CEC
Address: http://ulisse.ei.jrc.it/
Format: electronic
10. Astronomy Thesaurus (Multilingual Supplement)
Retrieval utility: indexing and searching
Structure: Based on The Astronomy Thesaurus, by Robert R. Shobbrook and Robyn M. Shobbrook, Version 2.0, January 1995, and ISO 5964
Languages: English, French, German, Italian, Spanish
Fields: Astronomy, Cosmology, Space
Terms: preferred and nonpreferred terms
Authors: Eugenia Gomez, Lucenya Kedziora, Nora Loiseau, Edith Sachtschal, Marie-Jose Vin, Marina Zuccoli
Reference: http://www.aao.gov.au/lib/mlsintro.html
Format: electronic
11. Thesaurus Sozialwissenschaften
The terminology included in this thesaurus covers the areas of environmental
studies, including social ecology, environmental movements, activism, and
some environmental philosophy. It is thus a useful tool for indexing
and searching the growing social sphere of environmentalism.
Retrieval utility: indexing and searching
Structure: Traditional
Languages: German, English
Fields: Economy, History, Philosophy, State and Law
Terms: 10,500 terms (6,900 preferred, 3,600 nonpreferred)
Authors: Hannelore Schott (editor)
Reference: ISBN 3-8206-0122-8; ISBN 3-8206-0123-6
Format: electronic, paper
12. The UMLS Metathesaurus
The UMLS Metathesaurus is one of four knowledge sources developed and
distributed by the National Library of Medicine as part of the Unified
Medical Language System (UMLS) project. The Metathesaurus aggregates information
about biomedical concepts and terms from many controlled vocabularies and
classifications used in patient records, administrative health data, bibliographic
and full-text databases and expert systems. It preserves the names, meanings,
hierarchical contexts, attributes, and inter-term relationships present
in its source vocabularies; adds certain basic information to each concept;
and establishes new relationships between terms from different source vocabularies.
The Metathesaurus supplies information that computer programs can use to interpret user inquiries, interact with users to refine their questions, identify which databases contain information relevant to particular inquiries, and convert the users' terms into the vocabulary used by relevant information sources. The scope of the Metathesaurus is determined by the combined scope of its source vocabularies. The Metathesaurus is produced by automated processing of machine-readable versions of its source vocabularies, followed by human review and editing by subject experts. The Metathesaurus is intended primarily for use by system developers, but can also be a useful reference tool for database builders, librarians, and other information professionals.
The Metathesaurus is organized by concept or meaning. Alternate names for the same concept (synonyms, lexical variants, and translations) are linked together. Each Metathesaurus concept has attributes that help to define its meaning, e.g., the semantic type(s) or categories to which it belongs, its position in the hierarchical contexts from various source vocabularies, and, for many concepts, a definition. A number of relationships between different concepts are represented. Some of these relationships are derived from the source vocabularies; others are created during the construction of the Metathesaurus. Most inter-concept relationships in the Metathesaurus link concepts that are similar along some dimension. The Metathesaurus also includes use information, including the names of selected databases in which the concept appears, and, for MeSH® terms, information about the qualifiers that have been applied to the terms in MEDLINE®. Information on the co-occurrence of concepts in MEDLINE and in some other information sources is also included.
Content of the Metathesaurus
The 1998 version of the Metathesaurus contains 476,322 biomedical concepts
with 1,051,903 different concept names from more than 40 source vocabularies.
Important additions for 1998 include the remainder of the October 1995
edition of the Read Codes, which is produced by the National Health Service
in the United Kingdom; the International Classification of Diseases, 10th
edition (ICD10), as issued by the World Health Organization; the Health
Care Financing Administration's Common Procedure Coding System (HCPCS),
which includes the American Dental Association's Current Dental Terminology
(CDT); the International Classification of Primary Care (ICPC); the Nursing
Outcomes Classification (NOC); the 1997 FDA Standard Product Nomenclature
(SPN97); terminology from the University of Washington's Digital Anatomist
Symbolic Knowledge Base (UWDA); and the transliterated Russian edition
of MeSH.
Address: http://www.nlm.nih.gov/pubs/factsheets/umlsmeta.html
13. Macrothesaurus for Information Processing in the Field of
Economic and Social Development
The Macrothesaurus for Information Processing in the Field of Economic
and Social Development is published by the Organisation for Economic Co-operation
and Development (OECD). The OECD is a Paris-based intergovernmental
organization that provides its 29 member countries with a forum in which
to consult and co-operate with each other in order to achieve the highest
sustainable economic growth in their countries and improve the economic
and social well-being of their populations.
The thesaurus is a bilingual (trilingual in paper) publication that is
used for indexing and searching multidisciplinary publications dealing
with sustainable development. This new edition of the Macrothesaurus for
Information Processing in the Field of Economic and Social Development
represents a continuation of the combined efforts of many organizations
over a period of almost 30 years to create a common vocabulary to facilitate
the indexing, retrieval and exchange of development-related information.
The Macrothesaurus comprises descriptors (keywords) designed for indexing
books and documents covering the field of economic and social development.
It can also be used as a search aid for documentation centers, libraries,
databases and on-line networks. Efforts have been made to improve the user-friendliness
and flexibility of the Macrothesaurus by increasing the number of non-descriptors
(i.e. cross-references) and scope notes in this edition. The preparation
of this fifth edition was guided by an Advisory Committee composed of representatives
from the International Development Research Centre, Ottawa, the United
Nations Department of Economic and Social Affairs, New York, the Organisation
for Economic Co-operation and Development (OECD), and the OECD Development
Centre, Paris.
Retrieval utility: indexing and searching
Structure: listed alphabetically, hierarchically, code number assigned
Languages: English, French, Spanish
Fields: sustainable development
Terms: all
Authors: OECD
Address: http://electrade.gfi.fr/cgi-bin/OECDBookShop.storefront/
Format: electronic (English, French) paper (English, French, Spanish)
14. Multilingual European Thesaurus on Health Promotion
The Multilingual European Thesaurus on Health Promotion is a project
of the International Union for Health Promotion and Education (IUHPE).
The IUHPE is a global association of people and organizations working in
the fields of health promotion and health education, dedicated to the promotion
of world health through education, communication action and the development
of healthy public policies. The thesaurus covers the field of health promotion
and has terms in English, French, German and Dutch. It was compiled in
1996 by information managers in the field of health promotion education
from several European countries. The aim of the thesaurus is to improve
communication in the field of health promotion and health education at
the European level. The thesaurus consists of 1,300 keywords in four languages.
Scope: General
Retrieval utility: indexing and searching
Structure: listed alphabetically, hierarchically, code number assigned
Languages: English, French, German, Dutch
Fields: public health, environmental health
Terms: UF, BT, RT, NT
Authors: IUHPE
Address: http://www.nigz.nl/multhes/th529.html
(2) http://www.ccer.ggl.ruu.nl/thes/pref.htm
Format: electronic, paper
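As a rough illustration of the structure summarized in this entry (a code number per descriptor, labels in four languages, and UF, BT, NT and RT relationships), the sketch below shows one possible record layout. The codes "HP" and "HE", the sample terms, and the hierarchy between them are hypothetical and are not taken from the thesaurus itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Descriptor:
    code: str                                     # code number assigned to the descriptor
    labels: Dict[str, str]                        # language -> preferred term
    uf: List[str] = field(default_factory=list)   # Used For: non-preferred terms
    bt: List[str] = field(default_factory=list)   # Broader Terms (codes)
    nt: List[str] = field(default_factory=list)   # Narrower Terms (codes)
    rt: List[str] = field(default_factory=list)   # Related Terms (codes)


def add_hierarchy(thesaurus: Dict[str, Descriptor], narrower: str, broader: str) -> None:
    """Record a BT link and its reciprocal NT link."""
    if broader not in thesaurus[narrower].bt:
        thesaurus[narrower].bt.append(broader)
    if narrower not in thesaurus[broader].nt:
        thesaurus[broader].nt.append(narrower)


def broader_terms(thesaurus: Dict[str, Descriptor], code: str, language: str) -> List[str]:
    """Preferred labels of a descriptor's broader terms in one language."""
    return [thesaurus[b].labels[language] for b in thesaurus[code].bt]


# Hypothetical sample records, for illustration only.
thesaurus = {
    "HP": Descriptor("HP", {"en": "health promotion",
                            "fr": "promotion de la santé",
                            "de": "Gesundheitsförderung",
                            "nl": "gezondheidsbevordering"}),
    "HE": Descriptor("HE", {"en": "health education",
                            "fr": "éducation pour la santé",
                            "de": "Gesundheitserziehung",
                            "nl": "gezondheidsvoorlichting"}),
}
add_hierarchy(thesaurus, narrower="HE", broader="HP")
print(broader_terms(thesaurus, "HE", "nl"))
```

Storing relationships by code rather than by term keeps the hierarchy language-independent, so one BT/NT link serves all four languages.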
15. Multilingual Egyptological Thesaurus
The Multilingual Egyptological Thesaurus is a valuable tool for indexing
and searching in the fields of environmental history, cultural diversity,
bioarchaeology, and anthropology where they relate to environmental issues.
It is one of the most advanced multilingual thesauri in the archaeology/anthropology
fields, and it offers a sound foundation for further multilingual development
of these fields. The Multilingual Egyptological Thesaurus, compiled
mainly for the (computerized) documentation and retrieval of museum objects,
results from the collaboration between the Computer Working Group of the
International Association of Egyptologists (IAE) and the Comité
International pour l'Égyptologie (CIPEG) of the International Council
of Museums (ICOM). The initiative was taken by the former during the "Table
ronde Informatique & Égyptologie" which took place in Paris,
July 1990. The latter decided to join the project according to the resolution
of the CIPEG conference in Moscow, July 1991. The following thesauri
were used in the construction of this multilingual thesaurus: the
French Faraon thesaurus of the Louvre Museum, the German list compiled
by Rolf Gundlach, Joachim Karig and Dietrich Wildung (Die Begriffslisten
der Dokumentation Ägyptischer Altertümer, Berlin-Darmstadt-München
1970), an English thesaurus made by Fathi Saleh (Cairo), the
Thesauri zur Erfassung kunst- und kulturgeschichtlicher Denkmäler
aus und in Ägypten in einem relationalen Datenbanksystem, compiled
by Maya Müller (Basel), and the Index of the Tübinger Atlas des
Vorderen Orients.
Scope: general, specific
Retrieval utility: indexing and searching
Structure: listed alphabetically, hierarchically, code numbers assigned
Languages: English, French, German
Fields: General archaeology, general anthropology, Egyptology, environmental
history
Terms: 15 categories with sub-hierarchies
Authors: IAE Computer Working Group, CIPEG (ICOM)
Address: http://www.ccer.ggl.ruu.nl/thes/pref.htm
Format: electronic, paper
References
Aitchison, Jean, David Bawden, and Alan Gilchrist. (1997) Thesaurus Construction and Use: A Practical Manual. 3rd ed. London: Aslib, 1997. (Available in USA from Portland Press).
Buckland, Michael Keeble (1991) Information and Information Systems, Greenwood Press, Westport, CT, 1991, 225 pp.
Dahlberg, Ingetraut (1996). ‘The compatibility guidelines: a re-evaluation.’ In: Compatibility and integration of order systems: Research Seminar Proceedings of the TIP/ISKO Meeting, Warsaw, 13-15 September 1995. Warsaw: Wydawnictwo SBP, 1996, pp. 32-45
Doroszewski, Witold. (1981) Slownik Poprawnej Polszczyzny, Panstwowy Instytut Wydawniczy, Warszawa 1981.
EAGLES (1994). Evaluation of Natural Language Processing Systems, EAG-EWG-PR.2, ILC-CNR, Pisa.
International Society for Knowledge Organization (ISKO), Polish Librarians Association and Society of Professional Information (TIP) (1996). Compatibility and integration of order systems: Research Seminar Proceedings of the TIP/ISKO Meeting, Warsaw, 13-15 September 1995. Warsaw: Wydawnictwo SBP, 1996
Johannes, Robert E. (1993). Integrating Traditional Ecological Knowledge and Management with Environmental Impact Assessment. In Inglis, J.T. (ed.), op. cit.
Lloyd, Gareth. (1996). Assessment of Environmental Thesauri. EEA Information Interchange Project. Work Package 44.4, WCMC, November 1996 (draft version)
Milstead, Jessica (1998). Thesaurus Management Software. In: American Society of Indexers Web Site: http://www.asindexing.org/thessoft.htm
National Information Standards Organization. American National Standard Guidelines for the Construction, Format, and Management of Monolingual Thesauri. Bethesda, MD: NISO Press, 1994. (ANSI/NISO Z39.19-1993).
Purat, Jacek (1998). Organization of Environmental Terminologies. Research Paper. (http://www.sims.berkeley.edu/~purat/org_env_term.html)
Rowley, Jennifer E. (1992) Organizing Knowledge, Second Edition, Gower Publishing, Vermont, USA, 510 pp.
United Nations, (1992) Environment and Development. Terminology Bulletin No. 344. Volume II (indexes) and Volume I. ST/SC/SER. F/344, United Nations, New York, 1992
Copyright: Jacek Purat, SIMS, UC Berkeley.