The World of Multilingual Environmental Thesauri

Jacek Purat
School of Information Management and Systems
University of California,
Berkeley, CA 94720-4600

September 1, 1998
 
 
 

Introduction

The current surge of interest in Multilingual Environmental Thesauri (MET) may be attributed to concerns in two major areas of knowledge: (1) environmental information and (2) information retrieval. Buckland (1991) distinguishes four major categories of retrieval: physical, locating, identification, and identification by subject matter. The last category is the one applicable to the discussion of MET.  The scope of the other area of knowledge, environmental information science, resists precise definition (Lidicker, W., pers. comm.).  In general it is understood to be a multidisciplinary field, composed of elements of the natural sciences, including biology, physical geography, physics, chemistry, and medicine; the social sciences, such as cultural geography, archaeology, anthropology, and sociology; economics; the technical sciences, such as engineering, architecture, transport, and energy; and other fields, such as law, education, and ethics.  Each of these fields has its own information tradition and uses its own specific terminology.  This difference is reflected in the field-specific indexing languages and other metadata used in each field.  In many cases terminology overlaps between fields, or different or similar terms are used to describe the same information object, creating a problem of access for different users. The problem is especially troublesome in the environmental field, which demands a multidisciplinary approach to solving complex environmental problems.  Nor is the problem confined to English: it exists in parallel in other natural languages. Multilingual information retrieval thus adds further complexity to the environmental information field, because classification and indexing traditions differ from language to language.

We are thus dealing with two kinds of multiplicity: the multidisciplinary and the multilingual.  Numerous retrieval approaches address these complexities and are designed to aid the multilingual retrieval of environmental information.  One of the major approaches is through Multilingual Environmental Thesauri.
 
 

Thesauri

The term thesaurus is derived from the Greek word ‘thesauros’, meaning treasure or treasury.  The plural of the term is "thesauri".  Traditionally, a thesaurus is defined as a work that contains the entire knowledge of a given discipline, or a dictionary representing the entire vocabulary of a given language (Doroszewski, W. 1981).  The most famous example is Roget's Thesaurus, a subject guide to lists of words with similar meanings.

The field of information and library science defines a thesaurus as "a compilation of words and phrases showing synonyms, hierarchical and other relationships and dependencies, the function of which is to provide a standardized vocabulary for information storage and retrieval systems" (Rowley, J. 1996).  Such a list of thesaurus terms, also called an authority list, is useful in showing which terms may be used in indexing and which may not. Thesauri are thus used as linguistic tools in research and reading, but their most widespread use is in indexing and information retrieval.

The primary purpose of the thesaurus is to aid in information retrieval (Aitchison, J. et al, 1997).  This is achieved by utilizing the thesaurus in indexing, and later in searching.  Based on retrieval utility, there are four ways in which a thesaurus may be used:

(1) Thesaurus used in both indexing and searching
(2) Thesaurus used in indexing, but not searching
(3) Thesaurus used in searching, but not indexing
(4) Thesaurus used in neither

Most desirable from the retrieval point of view is the first type of use.

Thesauri are categorized by the disciplines they represent, for example an astronomical thesaurus or a chemical thesaurus.  Some thesauri represent single classical academic disciplines, such as a medical thesaurus, and others represent multidisciplinary fields, such as an environmental thesaurus.  A significant group of thesauri represents applied fields of knowledge, such as a water resources thesaurus or an agricultural thesaurus.  In addition, thesauri are divided into monolingual and multilingual, based on the number of languages they include.

The thesaurus controls the use of the vocabulary by showing which are the approved terms for indexing.  The indexing terms may be selected from standard terminologies or from any established ones.  The ISO-2788 Standard for Monolingual Thesauri gives the following definition:
 
Thesaurus: the vocabulary of a controlled indexing language, formally organized so that
     a priori relationships between concepts (for example as "broader" and "narrower") are
     made explicit.
 
The standard further defines the indexing terms used in thesauri:

Indexing term: the representation of a concept, preferably in the form of a noun or a
     noun phrase.  An indexing term can consist of more than one word, and is then known as
     a compound term. In a controlled indexing vocabulary, a term is designated either as
     a preferred term or as a non-preferred term.

Preferred term: A term used consistently when indexing to represent a given concept; sometimes known as "descriptor".

Non-preferred term: The synonym or quasi-synonym of a preferred term. A non-preferred term is not assigned to documents, but is provided as an entry point in a thesaurus or alphabetical index, the user being directed by an instruction (for example USE or SEE) to the appropriate preferred term; sometimes known as "non-descriptor".

All preferred and non-preferred terms can be classified into five distinct forms:

--Single words (example: Molecule; Bioregionalism; Recarbonization; Refuse )
--Phrases of two or three words (example: Country Life; Money Laundering)
--Two words linked by ‘and’ or ‘&’ (example: Boats & Boating)
--Compound phrases (example: Habitat conservation plans; Human exposure evaluation)
--Names of persons, bodies, places (example: Kosciuszko, Thaddeus; Berkeley; Miskito Nation)

The relationships specified by the standard, and their abbreviations are presented below:
 
SN
      Scope note; a note attached to a term to indicate its meaning within an indexing
      language
USE
      The term that follows the symbol is the preferred term when a choice between
      synonyms or quasi-synonyms exists
UF
      Use for; the term that follows the symbol is a non-preferred synonym or quasi-synonym
TT
      Top term; the term that follows the symbol is the name of the broadest class to which
      the specific concept belongs; sometimes used in the alphabetical section of a thesaurus
BT
      Broader term; the term that follows the symbol represents a concept having a wider
      meaning
NT
      Narrower term; the term that follows the symbol refers to a concept with a more
      specific meaning
RT
      Related term; the term that follows the symbol is associated, but is not a synonym, a
      quasi-synonym, a broader term or a narrower term
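
The relationships listed above can be sketched as a small data structure. The following is a minimal illustration only, not a format prescribed by ISO 2788: the class name `Term`, the helper functions, and the sample entries are all invented for this example. It shows how a USE reference leads from a non-preferred term to its preferred term, and how a top term (TT) can be derived by following BT links upward.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Term:
    """One thesaurus entry; each field mirrors one abbreviation from the list."""
    label: str
    scope_note: str = ""                               # SN
    use: Optional[str] = None                          # USE (non-preferred terms only)
    used_for: List[str] = field(default_factory=list)  # UF
    broader: List[str] = field(default_factory=list)   # BT
    narrower: List[str] = field(default_factory=list)  # NT
    related: List[str] = field(default_factory=list)   # RT


def preferred(thesaurus: Dict[str, Term], label: str) -> str:
    """Follow a USE reference, if present, to the preferred term."""
    term = thesaurus[label]
    return term.use if term.use is not None else term.label


def top_terms(thesaurus: Dict[str, Term], label: str) -> Set[str]:
    """Derive TT by climbing BT links until a term has no broader term."""
    term = thesaurus[preferred(thesaurus, label)]
    if not term.broader:
        return {term.label}
    tops: Set[str] = set()
    for parent in term.broader:
        tops |= top_terms(thesaurus, parent)
    return tops


# Invented sample entries, for illustration only.
THESAURUS = {
    t.label: t
    for t in [
        Term("Pollution", narrower=["Water pollution"]),
        Term(
            "Water pollution",
            scope_note="Contamination of water bodies",
            used_for=["Aquatic pollution"],
            broader=["Pollution"],
            related=["Water quality"],
        ),
        Term("Aquatic pollution", use="Water pollution"),
        Term("Water quality", related=["Water pollution"]),
    ]
}

print(preferred(THESAURUS, "Aquatic pollution"))
print(top_terms(THESAURUS, "Aquatic pollution"))
```

In this sketch TT is computed rather than stored, which avoids keeping it in step with the BT links by hand; a real thesaurus may store it explicitly.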
 
Two good lists of thesauri are located at:
http://www.asindexing.org/thesonet.htm
http://www-cui.darmstadt.gmd.de/%7Elutes/thesauri.html.
 

Multilingual Thesauri
Multilingual thesauri consist of two or more monolingual thesauri cross-referenced by concept rather than alphabetically. By means of these links, a user can follow correspondences across multiple languages and rapidly browse an entry's sub-categorizations of meanings and synonyms. Like a traditional monolingual thesaurus, a multilingual thesaurus is best used to complement one's knowledge of a language, particularly as a memory aid for infrequently used words and among translators for basic grammatical information, such as gender and inflection (EAGLES, 1994). As such, it is more useful as an aid for writing in a foreign language. For this reason, multilingual thesauri have found favor in the administrative sector among non-language professionals preparing correspondence in foreign languages. Professional translators, in fact, may find the basic vocabulary of the current generation of such packages too limited.
A well-designed interface can make an electronic multilingual thesaurus easier to navigate than traditional monolingual paper-based thesauri. In addition, the underlying lexical data can be organized in such a way as to allow other ways of navigating around entries than just on the basis of the formal synonym relationship.
 

A multilingual thesaurus used in indexing and retrieval is a thesaurus containing indexing vocabularies that represent a given field in two or more languages.  Thesauri of this type can be used to support indexing and searching in several languages. In addition to the standard relationships present in a thesaurus (SN, USE, UF, TT, BT, NT, RT), a multilingual thesaurus contains an additional relationship called equivalence (=), which connects equivalent terms between languages (Aitchison et al., 1997). The problems of multilingual thesaurus construction are similar to those of monolingual thesaurus construction, provided, of course, that competent linguists are available. Perhaps the most difficult aspect is that of human organization, which often involves the work of international committees.  The relevant standard here is ISO 5964:1985, Guidelines for the establishment and development of multilingual thesauri (International, 1985a), the equivalent of British Standard 6723:1985 (British, 1985a), which states clearly right at the start: ‘The guidelines given in this International Standard should be used in conjunction with ISO 2788, and regarded as an extension of the scope of the monolingual guidelines. The majority of procedure and recommendations contained in ISO 2788 are equally valid for a multilingual thesaurus’.
 
 

Thus, the handling of descriptors in multilingual thesauri is much the same as in monolingual work, with the added variety of morphologies arising from the different language forms. It may be useful in working on the thesaurus, to know that  languages are classified broadly into four forms:

 Inflectional languages, such as Latin, which use case endings. These root-suffixes qualify the nouns and verbs.
 Agglutinative languages, such as Turkish, Finnish and Hungarian where the root-suffixes can, and regularly do, stand as separate words.
 Isolating languages, such as Chinese, which make no use of inflection, agglutination or prepositions.
 Analytical languages, such as English, which use word order, auxiliary words and some vestiges of inflection to provide the grammatical structure.

However, the knowledge of structure is secondary to a good working knowledge of the languages being handled, including the socio-cultural nuances, particularly present in non-scientific subjects.

 

Environmental thesauri

Lloyd, G. (1996), in an assessment of environmental thesauri, noted that the domain of “environment” is so all-encompassing that it is impossible to define its boundaries. Instead, he proposed a macro/micro thesaurus approach, which would leave the linking of terms to the user.  In his model, users would have a tool to develop their own micro thesaurus, which could then be linked to the existing macro thesaurus.

Since a definition of the environmental thesaurus is essential for this work, I have attempted to construct one, based in part on the definition of environmental information (Purat, 1998).

Environmental thesaurus: the vocabulary of a controlled indexing language that lists environmental information objects (Purat, 1998).  The goal of environmental thesauri is to further a better understanding of our environment, including all living and non-living elements and the relationships between them.  The vocabulary is formally organized so that a priori relationships between concepts (for example as "broader" and "narrower") are made explicit. Types of information include real space terms: objects, events, processes, and their relations; and represented space terms: data, text, pictures, sounds and documents, collected objects, and relations between them.  Examples of these categories follow:

Real space terms:
 *Objects: brain, plant, animal, metal, salt, redwood grove, nest, building

 *Processes: hurricane, birth, explosion, hunt, construction, copulation, seasons, life, coagulation, baking, storing, growth, symbiosis

 *Relations between them: historical, spatial, analogy, parallel, genealogical
 

Represented space terms :
*Oral tradition: Native languages, Stories, Legends, Myths, Beliefs, Songs, Plays,
Ways, Mental maps, Native classifications.  (Since there are no oral thesauri,
oral tradition terms are not used in environmental thesaurus)
 
*Documented tradition:
  Data: numerical description of real space
  Text: written description of real space
  Maps & Tables: spatial representation of info. objects
  Pictures: painted description of real space
  Sound records: recorded description of real space
  Documents: Vehicles for data, text, pictures or sound of informative nature
  Collected representative objects: holotypes (species holotypes in museums)
  Relations between them: statistical, morphological, cognitive,
  Collections of above
 
 
 
 

Gareth Lloyd (1996) conducted an assessment of environmental thesauri for the EEA Catalogue of Data Sources.  He focused on the logical structuring of the thesauri and the nature of the terms used.  The work includes: (1) a bilateral comparison between the 1991 CNR Trilingual Thesaurus and the 1990 INFOTERRA Thesaurus, (2) recommendations for the GEMET Thesaurus, and (3) a description of the software currently used for the construction, maintenance, and searching of thesauri.
 

Multilingual environmental thesauri
Multilingual environmental thesaurus (MET): an expanded environmental thesaurus. It is composed of a set of multilingual vocabularies of a controlled indexing language, which describe the types of information objects that define the aggregate surroundings. The terms represent documented tradition that originated from oral tradition.  The indexing terms in a MET include a set of standard thesaurus relationships. A MET contains a source language and target languages, with no implication as to their status, and with no one language having dominance over another.  Like other multilingual thesauri, a MET contains indexing vocabularies representing a given field in two or more languages.  In addition to the standard relationships present in a thesaurus (SN, USE, UF, TT, BT, NT, RT), a MET contains an additional relationship called equivalence (=), which connects equivalent terms between languages.  The degree of equivalence between terms in different languages varies. Five types of equivalence occur in METs:
 -exact equivalence
 -inexact equivalence
 -partial equivalence
 -single-to-multiple equivalence
 -non-equivalence
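
One way to picture the five degrees of equivalence listed above is as a label attached to each cross-language link. The sketch below is an assumption-laden illustration, not a structure defined by ISO 5964: the names `Equivalence`, `Link` and `translate`, the language codes, and the sample English-French links are all invented for this example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Tuple


class Equivalence(Enum):
    """The five degrees of equivalence named in the list above."""
    EXACT = "exact equivalence"
    INEXACT = "inexact equivalence"
    PARTIAL = "partial equivalence"
    SINGLE_TO_MULTIPLE = "single-to-multiple equivalence"
    NONE = "non-equivalence"


@dataclass(frozen=True)
class Link:
    """An equivalence (=) link from one source term to its target terms."""
    source_lang: str
    source_term: str
    target_lang: str
    target_terms: Tuple[str, ...]
    degree: Equivalence


def translate(links: Iterable[Link], lang: str, term: str, target_lang: str):
    """Return (target terms, degree); an unrecorded pair counts as non-equivalence."""
    for link in links:
        if (link.source_lang, link.source_term, link.target_lang) == (lang, term, target_lang):
            return link.target_terms, link.degree
    return (), Equivalence.NONE


# Invented sample links, for illustration only.
LINKS = [
    Link("en", "water", "fr", ("eau",), Equivalence.EXACT),
    # One English term corresponds to two more specific French terms.
    Link("en", "river", "fr", ("fleuve", "rivière"), Equivalence.SINGLE_TO_MULTIPLE),
]

print(translate(LINKS, "en", "river", "fr"))
```

Recording the degree on the link itself lets an indexer or a search interface warn the user when a translated term is only an approximate match.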

Abbreviations used in multilingual thesaurus (MT) construction are usually given in English, German, French, and other languages, along with the mathematical symbols:
BT   Broader term   <
NT   Narrower term  >
RT   Related term   --
USE  Use            _
UF   Used for       =
SN   Scope note     (none)
 

Multilingual thesaurus construction
It is rather unlikely that a new thesaurus would be constructed without any use of existing thesauri, classifications, or subject headings.  Thus the major challenge in constructing a multilingual thesaurus is the integration of existing thesauri.  There are five chief problems of compatibility of terminology (International, 1996):
 -Specificity: differences in degree of fineness of definition (precision)
 -Exhaustivity: differences in coverage of the field
 -Compound terms: differences in using compound vs. separate terms
 -Synonyms: different choice regarding synonyms
 -Inter-relationships: differences in structure and emphasis
Methods proposed to address these problems include mapping, switching, merging, and integration. A historical review and annotated bibliography of these methods is given by Dahlberg (1996) and in (International, 1996).
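
As a rough illustration of why merging needs editorial attention, the sketch below merges two tiny term lists and reports cases of the "synonyms" problem, where one thesaurus prefers a term that the other treats as a non-preferred synonym. This is a deliberately naive sketch, not one of the published methods: the function name `merge`, the dictionary layout, and the sample terms are assumptions made for this example, and the other four compatibility problems are not handled.

```python
from typing import Dict, List, Set, Tuple

# Each thesaurus is modeled as {preferred term: set of non-preferred synonyms}.
Thesaurus = Dict[str, Set[str]]


def merge(a: Thesaurus, b: Thesaurus) -> Tuple[Thesaurus, List[Tuple[str, str]]]:
    """Merge b into a copy of a.

    Returns the merged thesaurus and a list of conflicts, each a pair
    (term preferred by b, term preferred by a) left for an editor to resolve.
    """
    merged: Thesaurus = {term: set(syns) for term, syns in a.items()}
    conflicts: List[Tuple[str, str]] = []
    # Lookup from each non-preferred term in a to its preferred term.
    a_lookup = {syn: term for term, syns in a.items() for syn in syns}

    for term, syns in b.items():
        if term in a_lookup:
            # b prefers a term that a treats as a synonym.
            conflicts.append((term, a_lookup[term]))
            continue
        # b treats as a synonym a term that a prefers.
        clashing = sorted(s for s in syns if s in a)
        if clashing:
            conflicts.extend((term, s) for s in clashing)
            continue
        merged.setdefault(term, set()).update(syns)
    return merged, conflicts


# Invented sample data, for illustration only.
A = {"Waste": {"Refuse"}, "Noise": {"Acoustic pollution"}}
B = {"Refuse": {"Garbage"}, "Noise": {"Sound pollution"}, "Soil": set()}

merged, conflicts = merge(A, B)
print(merged)
print(conflicts)
```

Conflicting entries are held back rather than resolved automatically, since the choice of preferred term is an editorial decision.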
 

Multilingual thesauri management
Since the use of words changes over time and new concepts emerge, all thesauri become outdated, and it is essential to maintain and update their contents periodically (Aitchison, J. et al, 1997). This role should be assigned to an editor or a group of editors, who should oversee the continuous procedure; it becomes more complex in a multilingual setting.  Major changes fall into six categories:
 -Amendment of existing terms
 -Status of existing terms
 -Deletion or demotion of existing terms
 -Addition of new, or deletion of old relationships
 -Addition of new terms
 -Amendment of existing structure
 
Thesaurus management packages can aid in thesaurus revision. A comprehensive listing of such software was provided by Milstead (1998) for the American Society of Indexers.
 
 

List of Multilingual Environmental Thesauri

1. GEMET : General European Multilingual Environment Thesaurus
The General European Multilingual Environmental Thesaurus has been developed as an indexing, retrieval and control tool for the Catalogue of Data Sources of the European Environment Agency. GEMET was conceived as a "general" thesaurus, intended to define a common general language, a core of general terminology for the environment. The current version, GEMET 1.0, is available in British English with equivalents in American English. Danish, Dutch, French, German, Italian and Norwegian equivalents are largely available. GEMET comes with a glossary in English, which contains about 1,400 definitions of descriptors that are either relevant or ambiguous.
GEMET has been compiled by merging the terms of the following multilingual documents:
-Umwelt Thesaurus of Umweltbundesamt (UBA), Berlin,
-Thesaurus Italiano per Medio Ambiente (TIA) of Consiglio Nazionale delle Ricerche (CNR), Rome,
-Multilingual Environment Thesaurus (MET) of Nederlands Bureau voor Onderzoek Informatie (NBOI), Amsterdam EnVoc Thesaurus,
-The UNEP Infoterra thesaurus,
-Thesaurus de Medio Ambiente of Ministerio de Obras Publicas, Transportes y Medio Ambiente (MOPTMA), Madrid,
-Lexique environnement -Planète, of the Ministère de l'environnement, Paris,
-Descriptors for the Dobris +3 report of the EEA and of Eurovoc, the Thesaurus of the European Parliament
 

The resulting 4,954 descriptors have been arranged in a classification scheme made up of 34 groups. Each descriptor has been placed in a hierarchy.  Furthermore, to allow thematic retrieval, each descriptor has been assigned to one or more of a set of 41 themes. At the moment, the user can access the thesaurus through the thematic structure, through the group-hierarchical list, or through the alphabetical list. GEMET follows the ISO norms on monolingual and multilingual thesauri.

Address: http://www.eea.dk/Locate/GEMET/default.htm
Format: electronic, paper
 
 
2. Environment and Development (Multilingual Thesaurus)
This is one of a very few thesauri/glossaries containing Arabic and Chinese characters.
Retrieval utility: indexing
Languages: English, French, Spanish, Russian, Arabic, Chinese
Fields: General environmental science, policy and development
Terms: Narrow thesaurus
Authors: United Nations
Reference: Terminology Bulletin No. 344. Volume II (indexes) and Volume I.
ST/SC/SER. F/344, United Nations, New York, 1992
Format: paper
 

3. Multilingual Thesaurus in Geosciences
The IUGS Commission on the Management and Application of Geoscience Information
(COGEOINFO, previously COGEODOC) in collaboration with ICSTI (International Council for Scientific and Technical Information) has recently published the second edition of the Multilingual Thesaurus of Geosciences. The printed version contains six languages: English (American), French, German, Russian, Spanish and Italian. The database also contains Czech and Finnish. The Multilingual Thesaurus in Geosciences contains 5823 key terms expressed as descriptors (preferred terms) or non-descriptors (non-preferred terms) in each language.

The Multilingual Thesaurus in Geosciences is compatible with major international and national bibliographic databases and indexing vocabularies in geosciences: the English (American) version of the MT is compatible with the AGI (American Geological Institute) GeoRef Thesaurus, the French version with the INIST/BRGM (Institut de l'Information Scientifique et Technique / Bureau de Recherches Géologiques et Minières) PASCAL/GEODE Lexique, and the German version with the BGR (Bundesanstalt für Geowissenschaften und Rohstoffe) GEOLINE Thesaurus. The other language versions are compatible with national thesauri used in Russia (VSEGEI), Spain (ITGE/GEOMINER), Italy (CNR/GEODOC) and Finland (Geological Survey of Finland / FINGEO). The MT gives the documentary equivalents of major geoscience concepts in the languages included at present. The Multilingual Thesaurus in Geosciences is also an operational tool for the development and management of other national geoscience thesauri in a manner that ensures compatibility with other major geoscience information systems.

Retrieval utility: indexing and searching (Georef)
Structure: alphabetical, poly-hierarchical, code numbers assigned
Languages: English, French, German, Russian, Spanish, Italian
Fields: Physical geography, geology, geophysics
Authors: IUGS Commission on the Management and Application of Geoscience Information
Format: paper, electronic

 

4. POPIN Thesaurus
The POPIN Thesaurus is a project of the Committee for International Cooperation in National Research in Demography (CICRED). CICRED is a non-governmental organization, founded in 1972 with the aim of developing cooperation amongst national population research centers and encouraging new research. It also establishes a link between research centers and international organizations whose main activities lie in the field of demography, namely the United Nations Population Division and the United Nations Population Fund (UNFPA), as well as the World Health Organization (WHO) and the Food and Agriculture Organization (FAO). To enable the creation of an international network for the storage and retrieval of population information, it was necessary to adopt a classification of publications using a set of descriptors. The Multilingual Population Thesaurus (English, Arabic, Chinese, Spanish, French, and Portuguese), prepared by CICRED in collaboration with the Population Division, provides this common language. The "Population Information Network" (POPIN), a network created in 1981 and managed by the UN Population Division, uses this thesaurus, a third version of which was published in 1993 under the title POPIN Thesaurus.

Scope: General
Retrieval utility: indexing and searching
Structure: listed alphabetically (possible “hidden” hierarchy)
Languages: English, French, Spanish
Fields: population, general environment, demography, public health
Terms: terms do not match retrieved terminology
Authors: CICRED
Address: http://www.cicred.ined.fr/index.html
Format: electronic (French, Spanish, English), paper (French, Spanish, English Arabic, Chinese, Portuguese)
 

5. INFOTERRA Multilingual Thesaurus
This is one of the oldest environmental thesauri.  A multilingual version is said to be under development, but it is not yet available for viewing.
Scope: General
Retrieval utility: indexing and searching
Structure: polyhierarchy, listed alphabetically
Languages: English
Fields: general, environment, developing countries, conservation
Authors: UNEP
Address: http://www.csa.com/routenet/read/infoterrathesauraus.html
Format: electronic (English only), paper (English only)
 

6. Dutch Milieu-thesaurus (Multilingual Environmental Thesaurus version)
The Netherlands Agency for Research Information (NBOI) has published, under contract to the European Commission, the Multilingual Environmental Thesaurus. It comprises approx. 2,400 preferred terms in 8 languages (English, French, German, Spanish, Italian, Norwegian, Danish and Dutch) and, depending on the language, a varying number (500 - 800) of non-preferred terms (synonyms and variants). Preference has been given to conceptual equivalents over exact "terminological" translations. The terms of the thesaurus have been divided into 31 groups, each group representing a specific part of the "environmental cycle" (pollution sources, polluting substances, air, water, soil, noise, biology, processes and techniques, etc.).  The hierarchy has been laid down in a coded structure, and the linguistic equivalents are linked to each other by these codes. The thesaurus has the normal thesaurus relations, such as BT-NT (broader term - narrower term), USE-UF (use - used for), RT (related term) and SEE.  The publication comprises 8 volumes. Each volume consists of:
- a hierarchical, systematic part in which the terms are presented in 31 hierarchical groups;
- an alphabetical part in which all terms are given in alphabetical order, including their hierarchical relations (BT, NT, USE, UF, RT and SEE);
- a numerical list of all the codes and their corresponding terms in 8 languages (this part is the same for all volumes).

Retrieval utility: indexing
Languages: English, French, German, Spanish, Italian, Norwegian,
Danish, Dutch
Fields: 31 groups of "environmental cycles"
Terms: 2,400 preferred & 500-800 non-preferred terms/per language
Author: Laurens de Lavieter
Reference: http://www.hgur.se/envir/miljo21/littips/0074.html
Format: paper, 8 volumes
 

7. FAO AGROVOC Thesaurus ONLINE
To make the AGRIS System (International Information System for the Agricultural Sciences and Technology) of the Food and Agriculture Organization of the United Nations (FAO) available to non-English-speaking users, FAO and the European Community (EC) developed a multilingual agricultural vocabulary, AGROVOC, in the early 1980s.  The AGROVOC Thesaurus has been used by AGRIS for indexing and retrieval since 1986. The third edition of AGROVOC was published in 1995, together with the first supplement. Supplements to AGROVOC are produced yearly. The latest version, AGROVOC Edition 3, version 1998, is available through the AGRIS Processing Unit's FTP server at the International Atomic Energy Agency in Vienna: ftp://ftp.iaea.org/dist/agris/outgoing/

Retrieval utility: indexing and searching
Structure: listed alphabetically and  polyhierarchically
Languages: English, French, Spanish, German
Fields: agriculture, environment, developing countries, biology
Terms: 36,089 terms (lead language French), classical thesaurus
Author: FAO
Address: (1) http://www.fao.org/scripts/agrovoc/frame.htm
(2) http://www.cirad.fr:80/cgi-bin/agrovoc/agrovoc
Format: electronic
 

8. ECDIN - Environmental Chemicals Data and Information Network (thesaurus)
ECDIN is a factual databank created under the Environmental Research Programme of the Joint Research Centre (JRC) of the Commission of the European Communities at the Ispra Establishment. The main characteristic of ECDIN is its multidisciplinary approach, intended to allow comprehensive management of the environmental impact of each chemical. The thesaurus also contains factual information, to make it usable by users without a scientific or technical background, or without easy access to libraries.  ECDIN deals with the whole spectrum of parameters and properties that might help the user to evaluate the real or potential risk in the use of a chemical and its economic and ecological impact. The following major categories are covered:

Identification, Physical-Chemical Properties, Production and Use, Legislation and Rules, Occupational Health and Safety, Toxicity, Concentrations and Fate in the Environment, Detection Methods, Hazards and Emergency.

The data contained in ECDIN are extracted from original published literature or from existing databanks.  Other sources are the list of names used by the European Customs Inventory of Chemicals and European http://ulisse.ei.jrc.it/Ecdin/E_hinfo.html
Scope: Specific taxonomy
Retrieval utility: indexing and searching
Structure: polyhierarchy (7 levels), listed alphabetically
Languages: Standard, English, German, French, Dutch, Greek, Spanish, Danish, Italian, Portuguese
Fields: environmental chemicals
Terms: specific taxons
Authors: JRC/CEC
Address: http://ulisse.ei.jrc.it/
Format: electronic
 

9. ECPHIN – European Community Pharmaceutical Information Network Databank
The objective of ECPHIN has been to create an information system to store data on pharmaceutical products from official sources (or officially recommended sources) on prices of medical products and scientific information (SPC) to highlight disparities in prices and presentations of technical information of equivalent pharmaceutical products.

Scope: General
Retrieval utility: indexing and searching
Structure: polyhierarchy, listed alphabetically
Languages: English
Fields: general, environment, and developing countries
Terms: specific taxons
Authors: JRC/CEC
Address: http://ulisse.ei.jrc.it/
Format: electronic
 

10. Astronomy Thesaurus (Multilingual Supplement)
Retrieval utility: indexing and searching
Structure: Based on The Astronomy Thesaurus, by Robert R. Shobbrook and Robyn M. Shobbrook, Version 2.0, January 1995, and ISO 5964
Languages: English, French, German, Italian, Spanish
Fields: Astronomy, Cosmology, Space
Terms: preferred and nonpreferred terms
Authors: Eugenia Gomez, Lucenya Kedziora, Nora Loiseau, Edith Sachtschal, Marie-Jose Vin, Marina Zuccoli
Reference: http://www.aao.gov.au/lib/mlsintro.html
Format: electronic
 

11. Thesaurus Sozialwissenschaften
The terminology included in this thesaurus covers the areas of environmental studies, including social ecology, environmental movements, activism, and some environmental philosophy.  It is thus a useful tool for indexing and searching the growing social sphere of environmentalism.
Retrieval utility: indexing and searching
Structure: Traditional
Languages: German, English
Fields: Economy, History, Philosophy, State and Law
Terms: 10,500 terms (6,900 preferred, 3,600 nonpreferred)
Authors: Bearb. von Hannelore Schott
Reference: ISBN 3-8206-0122-8; ISBN 3-8206-0123-6
Format: electronic, paper
 

12. The UMLS Metathesaurus
The UMLS Metathesaurus is one of four knowledge sources developed and distributed by the National Library of Medicine as part of the Unified Medical Language System (UMLS) project. The Metathesaurus aggregates information about biomedical concepts and terms from many controlled vocabularies and classifications used in patient records, administrative health data, bibliographic and full-text databases and expert systems. It preserves the names, meanings, hierarchical contexts, attributes, and inter-term relationships present in its source vocabularies; adds certain basic information to each concept; and establishes new relationships between terms from different source vocabularies.

The Metathesaurus supplies information that computer programs can use to interpret user inquiries, interact with users to refine their questions, identify which databases contain information relevant to particular inquiries, and convert the users' terms into the vocabulary used by relevant information sources. The scope of the Metathesaurus is determined by the combined scope of its source vocabularies. The Metathesaurus is produced by automated processing of machine-readable versions of its source vocabularies, followed by human review and editing by subject experts. The Metathesaurus is intended primarily for use by system developers, but can also be a useful reference tool for database builders, librarians, and other information professionals.

The Metathesaurus is organized by concept or meaning. Alternate names for the same concept (synonyms, lexical variants, and translations) are linked together. Each Metathesaurus concept has attributes that help to define its meaning, e.g., the semantic type(s) or categories to which it belongs, its position in the hierarchical contexts from various source vocabularies, and, for many concepts, a definition. A number of relationships between different concepts are represented. Some of these relationships are derived from the source vocabularies; others are created during the construction of the Metathesaurus. Most inter-concept relationships in the Metathesaurus link concepts that are similar along some dimension. The Metathesaurus also includes use information, including the names of selected databases in which the concept appears, and, for MeSH® terms, information about the qualifiers that have been applied to the terms in MEDLINE®. Information on the co-occurrence of concepts in MEDLINE and in some other information sources is also included.
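
The idea of organizing by concept rather than by name can be sketched as follows. This is only an illustration of the general principle, not the actual Metathesaurus file format: the concept identifiers, the sample names, and the functions `build_name_index` and `names_for` are invented for this example.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical concept records: identifier -> list of (language, name).
# The identifiers and names are made up and are not real Metathesaurus data.
CONCEPTS: Dict[str, List[Tuple[str, str]]] = {
    "C-0001": [
        ("en", "Myocardial infarction"),
        ("en", "Heart attack"),
        ("fr", "Infarctus du myocarde"),
    ],
    "C-0002": [("en", "Cold"), ("en", "Common cold")],
    "C-0003": [("en", "Cold"), ("en", "Low temperature")],
}


def build_name_index(concepts: Dict[str, List[Tuple[str, str]]]) -> Dict[str, Set[str]]:
    """Map each name, case-insensitively, to the concepts that carry it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for concept_id, names in concepts.items():
        for _lang, name in names:
            index[name.casefold()].add(concept_id)
    return index


def names_for(concepts, index, name: str, lang: Optional[str] = None) -> Set[str]:
    """All names linked to the concept(s) of `name`, optionally in one language."""
    result: Set[str] = set()
    for concept_id in index.get(name.casefold(), ()):
        result |= {n for l, n in concepts[concept_id] if lang is None or l == lang}
    return result


INDEX = build_name_index(CONCEPTS)

print(names_for(CONCEPTS, INDEX, "heart attack", lang="fr"))
print(sorted(INDEX["cold"]))
```

Because the name "Cold" maps to two concepts in this sample, the lookup returns both, which is where attributes such as semantic types or definitions would help a program choose the intended meaning.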

Content of the Metathesaurus
The 1998 version of the Metathesaurus contains 476,322 biomedical concepts with 1,051,903 different concept names from more than 40 source vocabularies. Important additions for 1998 include the remainder of the October 1995 edition of the Read Codes, which is produced by the National Health Service in the United Kingdom; the International Classification of Diseases, 10th edition (ICD10), as issued by the World Health Organization; the Health Care Financing Administration's Common Procedure Coding System (HCPCS), which includes the American Dental Association's Current Dental Terminology (CDT); the International Classification of Primary Care (ICPC); the Nursing Outcomes Classification (NOC); the 1997 FDA Standard Product Nomenclature (SPN97); terminology from the University of Washington's Digital Anatomist Symbolic Knowledge Base (UWDA); and the transliterated Russian edition of MeSH.

Address: http://www.nlm.nih.gov/pubs/factsheets/umlsmeta.html
 
 
 

13. Macrothesaurus for Information Processing in the Field of Economic and Social Development
The Macrothesaurus for Information Processing in the Field of Economic and Social Development is published by the Organisation for Economic Co-operation and Development (OECD). The OECD is a Paris-based intergovernmental organization that provides its 29 Member countries with a forum in which to consult and co-operate with each other in order to achieve the highest sustainable economic growth and to improve the economic and social well-being of their populations. The thesaurus is bilingual in its electronic form (English and French) and trilingual on paper (English, French, and Spanish), and it is used for indexing and searching multidisciplinary publications dealing with sustainable development. The fifth edition continues the combined efforts of many organizations over a period of almost 30 years to create a common vocabulary that facilitates the indexing, retrieval, and exchange of development-related information. The Macrothesaurus comprises descriptors (keywords) designed for indexing books and documents covering the field of economic and social development. It can also be used as a search aid by documentation centers, libraries, databases, and on-line networks. To make the Macrothesaurus more user-friendly and flexible, this edition increases the number of non-descriptors (i.e., cross-references) and scope notes. Its preparation was guided by an Advisory Committee composed of representatives from the International Development Research Centre, Ottawa; the United Nations Department of Economic and Social Affairs, New York; the OECD; and the OECD Development Centre, Paris.

Retrieval utility: indexing and searching
Structure: listed alphabetically, hierarchically, code number assigned
Languages: English, French, Spanish
Fields: sustainable development
Terms: all
Authors: OECD
Address: http://electrade.gfi.fr/cgi-bin/OECDBookShop.storefront/
Format: electronic (English, French); paper (English, French, Spanish)
 
 
 

14. Multilingual European Thesaurus on Health Promotion
The Multilingual European Thesaurus on Health Promotion is a project of the International Union for Health Promotion and Education (IUHPE). The IUHPE is a global association of people and organizations working in the fields of health promotion and health education, dedicated to the promotion of world health through education, communication action and the development of healthy public policies. The thesaurus covers the field of health promotion and has terms in English, French, German, and Dutch. It was compiled in 1996 by information managers in health promotion and education from several European countries. Its aim is to improve communication in the field of health promotion and health education at the European level. It consists of 1,300 keywords in four languages.

Scope: General
Retrieval utility: indexing and searching
Structure: listed alphabetically, hierarchically, code number assigned
Languages: English, French, German, Dutch
Fields: public health, environmental health
Terms: UF, BT, RT, NT
Authors: IUHPE
Address: http://www.nigz.nl/multhes/th529.html
 (2) http://www.ccer.ggl.ruu.nl/thes/pref.htm
Format: electronic, paper
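
The relation codes listed under Terms above are the standard thesaurus relationships: UF (used for), BT (broader term), NT (narrower term), and RT (related term). The following is a minimal sketch of how such relations and per-language labels might be stored and queried. The class, the USE code (the conventional reciprocal of UF), and all sample terms and translations are illustrative assumptions, not data from the IUHPE thesaurus.

```python
from collections import defaultdict

# Reciprocal of each relation code, so both ends of a link stay consistent.
INVERSE = {"BT": "NT", "NT": "BT", "RT": "RT", "UF": "USE", "USE": "UF"}


class Thesaurus:
    def __init__(self):
        # term -> relation code -> set of linked terms
        self.relations = defaultdict(lambda: defaultdict(set))
        # descriptor -> {language code: label}
        self.labels = defaultdict(dict)

    def relate(self, term, code, other):
        """Record a relation and its reciprocal on the other term."""
        self.relations[term][code].add(other)
        self.relations[other][INVERSE[code]].add(term)

    def preferred(self, term):
        """Follow a USE reference from a non-descriptor to its descriptor."""
        use = self.relations[term]["USE"]
        return next(iter(use)) if use else term

    def translate(self, term, language):
        """Return the label of the term's descriptor in the given language."""
        return self.labels[self.preferred(term)].get(language)


# Hypothetical entries for illustration only.
t = Thesaurus()
t.relate("health education", "BT", "health promotion")
t.relate("health education", "UF", "health instruction")
t.relate("health education", "RT", "health policy")
t.labels["health education"] = {
    "en": "health education",
    "fr": "éducation pour la santé",
    "de": "Gesundheitserziehung",
    "nl": "gezondheidsvoorlichting",
}

print(t.preferred("health instruction"))
print(t.translate("health instruction", "de"))
```

Storing the reciprocal with every link means that a searcher who enters a non-preferred term is led to the descriptor first, and only then to its equivalents in the other languages.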
 

15. Multilingual Egyptological Thesaurus
The Multilingual Egyptological Thesaurus is a valuable tool for indexing and searching in the fields of environmental history, cultural diversity, bioarchaeology, and anthropology where these relate to environmental issues. It is one of the most advanced multilingual thesauri in archaeology and anthropology, and it offers a sound foundation for further multilingual development of these fields. The thesaurus, compiled mainly for the (computerized) documentation and retrieval of museum objects, results from the collaboration between the Computer Working Group of the International Association of Egyptologists (IAE) and the Comité International pour l'Égyptologie (CIPEG) of the International Council of Museums (ICOM). The initiative was taken by the former during the "Table ronde Informatique & Égyptologie", which took place in Paris in July 1990. The latter decided to join the project under a resolution of the CIPEG conference in Moscow in July 1991. The following thesauri were used in its construction: the French Faraon thesaurus of the Louvre Museum; the German list compiled by Rolf Gundlach, Joachim Karig and Dietrich Wildung (Die Begriffslisten der Dokumentation Ägyptischer Altertümer, Berlin-Darmstadt-München 1970); an English thesaurus made by Fathi Saleh (Cairo); the Thesauri zur Erfassung kunst- und kulturgeschichtlicher Denkmäler aus und in Ägypten in einem relationalen Datenbanksystem, compiled by Maya Müller (Basel); and the Index of the Tübinger Atlas des Vorderen Orients.

Scope: general, specific
Retrieval utility: indexing and searching
Structure: listed alphabetically, hierarchically, code numbers assigned
Languages: English, French, German
Fields: general archaeology, general anthropology, Egyptology, environmental history
Terms: 15 categories with sub-hierarchies
Authors: IAE Computer Working Group, CIPEG (ICOM)
Address: http://www.ccer.ggl.ruu.nl/thes/pref.htm
Format: electronic, paper
 
 
 

References
 

Aitchison, Jean, David Bawden, and Alan Gilchrist (1997). Thesaurus Construction and Use: A Practical Manual. 3rd ed. London: Aslib. (Available in USA from Portland Press).

Buckland, Michael Keeble (1991). Information and Information Systems. Westport, CT: Greenwood Press, 225 pp.
 

Dahlberg, Ingetraut (1996). ‘The compatibility guidelines – a re-evaluation.’ In: Compatibility and integration of order systems: Research Seminar Proceedings of the TIP/ISKO Meeting, Warsaw, 13-15 September 1995. Warsaw: Wydawnictwo SBP, 1996, pp. 32-45.

Doroszewski, Witold. (1981) Slownik Poprawnej Polszczyzny, Panstwowy Instytut Wydawniczy, Warszawa 1981.

EAGLES (1994). Evaluation of Natural Language Processing Systems, EAG-EWG-PR.2, ILC-CNR, Pisa.

International Society for Knowledge Organization (ISKO), Polish Librarians Association and Society of Professional Information (TIP) (1996). Compatibility and integration of order systems: Research Seminar Proceedings of the TIP/ISKO Meeting, Warsaw, 13-15 September 1995. Warsaw: Wydawnictwo SBP, 1996.

Johannes, Robert E. (1993). Integrating Traditional Ecological Knowledge and Management with Environmental Impact Assessment. In: Inglis, J.T. (ed.), op. cit.

Lloyd, Gareth. (1996).  Assessment of Environmental Thesauri.  EEA Information Interchange Project.  Work Package 44.4, WCMC, November 1996 (draft version)

Milstead, Jessica (1998). Thesaurus Management Software. In: American Society of Indexers Web Site: http://www.asindexing.org/thessoft.htm

National Information Standards Organization (1994). American National Standard Guidelines for the Construction, Format, and Management of Monolingual Thesauri. Bethesda, MD: NISO Press. (ANSI/NISO Z39.19-1993).

Purat, Jacek (1998). Organization of Environmental Terminologies.  Research Paper. (http://www.sims.berkeley.edu/~purat/org_env_term.html)

Rowley, Jennifer E. (1992). Organizing Knowledge. 2nd ed. Vermont, USA: Gower Publishing, 510 pp.

United Nations (1992). Environment and Development. Terminology Bulletin No. 344. Volume I and Volume II (indexes). ST/SC/SER.F/344. New York: United Nations.

Copyright: Jacek Purat, SIMS, UC Berkeley.
