...records
Specifically, the titles, authors and abstracts were examined.

...terms
Where ``collocation'' commonly implies ``linearly adjacent'', we relax it to mean ``standing in a consistent, identifiable relationship'' with a document.

...temporal
Examples of such temporal distances are pre-publication advertisements and announcements for forthcoming books, reports, records, standards, etc.

...vocabulary
They also dealt with identifiers (words and phrases from titles or abstracts), but this was complicated by changes in indexing policy over the history of ISA: from 1966 to 1983 identifiers came from natural language, whereas from 1984 onward they were drawn from a controlled vocabulary.

...as
This measure is a simpler version of Sievert's consistency percentage measure. Sievert's scaling factor is merely for convenience of reading, and the denominators express the same idea: the number of terms assigned in common plus the number assigned separately (Sievert's measure) equals the number of terms assigned by either (the formulation in [Soergel1994]).
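
The equivalence of the two denominators can be checked with a minimal sketch. The function names and the scaling factor of 100 are assumptions made for illustration; the footnote states only that Sievert's measure is a percentage with a scaling factor.

```python
def consistency(terms_a, terms_b):
    """Terms assigned in common divided by terms assigned by either."""
    a, b = set(terms_a), set(terms_b)
    either = a | b
    if not either:
        return 0.0
    return len(a & b) / len(either)


def sievert_percentage(terms_a, terms_b, scale=100.0):
    """Same ratio, with the denominator written as common + separate.

    The default scale of 100 is an assumption (a percentage).
    """
    a, b = set(terms_a), set(terms_b)
    common = len(a & b)    # terms assigned in both sets
    separate = len(a ^ b)  # terms assigned in only one of the sets
    total = common + separate
    if total == 0:
        return 0.0
    return scale * common / total
```

Because common plus separate always equals the size of the union, the two functions differ only by the scaling factor.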

...multi-set
A multi-set allows for duplicate entries, which in this case allows for words to be used more than once in a document. In other words, a word occurring twice in a document will give rise to two occurrences of the word-subject heading pair.
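
As a rough illustration of the multi-set idea, the sketch below counts word and subject heading pairs with a counter, so that a repeated word contributes a repeated pair. The function name and the sample words and headings are hypothetical and not taken from the paper's data.

```python
from collections import Counter
from itertools import product


def pair_counts(words, headings):
    """Count every (word, subject heading) pair, keeping duplicates.

    `words` is a list, not a set, so a word that occurs twice in the
    document produces two occurrences of each of its pairs.
    """
    counts = Counter()
    for word, heading in product(words, headings):
        counts[(word, heading)] += 1
    return counts


# Hypothetical document: "network" occurs twice.
example = pair_counts(
    ["network", "resources", "network"],
    ["internetworking"],
)
```

Here `example[("network", "internetworking")]` is 2, while the pair for "resources" is counted once.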

...Usenet''
Optionally, common or unhelpful stopwords such as ``on'' may be ignored.

...pairings
In this case, {biomedical, information-services}, {biomedical, internetworking}, {biomedical, medical-computing}, {resources, information-services}, and so on.

...database
The numbers reported here were gathered online on 30 January, 1995.

...relevant
Relevant here means ``a match'' between the algorithm's subject heading assignment and the actual subject heading already assigned by a human indexer.

...recall
With regard to the terms of [Soergel1994], our definition of precision is equivalent to his purity of indexing (descriptive view) or impurity of indexing (descriptive view), and our definition of recall is equivalent to his completeness of indexing (descriptive view).
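
A minimal sketch of how precision and recall might be computed when "relevant" means a match with the human-assigned subject headings. It assumes the standard set-based definitions; the function name and the sample headings are hypothetical.

```python
def precision_recall(suggested, assigned):
    """Compare algorithm-suggested headings with human-assigned ones.

    precision = matches / number of suggested headings
    recall    = matches / number of human-assigned headings
    """
    suggested, assigned = set(suggested), set(assigned)
    matches = suggested & assigned
    precision = len(matches) / len(suggested) if suggested else 0.0
    recall = len(matches) / len(assigned) if assigned else 0.0
    return precision, recall


# Hypothetical case: four suggested headings, three assigned by the indexer.
p, r = precision_recall(
    ["information-services", "internetworking", "databases", "libraries"],
    ["information-services", "internetworking", "medical-computing"],
)
```

In this example two of the four suggestions match, so precision is 0.5, and two of the three human-assigned headings are recovered, so recall is 2/3.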

...dictionaries
This measure reached 0.26 at a depth of four retrieved subject headings (not shown in the table).

...four
Depth four results are not shown in the table.

Christian Plaunt

School of Information Management and Systems

UC Berkeley

chris@bliss.berkeley.edu

Wed Dec 20 16:53:25 PST 1995