Turning the GND subject headings into a SKOS thesaurus: an experiment

The "Integrated Authority File" (Gemeinsame Normdatei, GND) of the German National Library (DNB), the library networks of the German-speaking countries and many other institutions is a widely recognized and widely used authority resource. The authority file covers persons, institutions, locations and other entity types, in particular subject headings. With more than 134,000 concepts, organized in almost 500 subject categories, the subjects part - the former "Schlagwortnormdatei" (SWD) - is huge. That would make it a nice resource for stress-testing SKOS tools - if it were available in SKOS. A seminar at the DNB on requirements for thesauri on the Semantic Web (slides, in German) provided another reason for the experiment described below.

skos-history: New method for change tracking applied to STW Thesaurus for Economics

“What’s new?” and “What has changed?” are questions users of Knowledge Organization Systems (KOS), such as thesauri or classifications, ask when a new version is published. Even more so when a thesaurus that has existed since the 1990s has been completely revised, subject area by subject area. After four intermediate versions published in as many consecutive years, ZBW's STW Thesaurus for Economics has recently been relaunched as version 9.0. In total, 777 descriptors have been added; 1,052 (of about 6,000) have been deprecated, the vast majority of them merged into others. More subtle changes include modified preferred labels, or merges and splits of existing concepts.

Since STW was first published on the web in 2009, we have gone to great lengths to make changes traceable: no concept and no web page has been deleted, and everything from prior versions is still available. Following a presentation at DC-2013 in Lisbon, I started the skos-history project, which aims to exploit published SKOS files of different versions for change tracking. A first beta implementation of Linked-Data-based change reports went live with STW 8.14, making use of SPARQL "live queries" (as described in a prior post). With the publication of STW 9.0, full reports of the changes are available. How do they work?



"What's new?" and "What has changed?" are common user questions when a new version of a vocabulary is published - be it a thesaurus, a classification, or a simple keyword list. Making use of the regular structure of SKOS files, changes can be derived from the differences between versions (deltas) and grouped to give an overview of additions, deletions/deprecations, and hierarchy or label changes. The resulting reports should be comprehensible to humans and processable by machines. skos-history aims to develop a set of processing practices and a supporting ontology to this end.
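
The delta idea can be illustrated with a minimal sketch. It is not the skos-history implementation, which works on published SKOS files and SPARQL queries; here plain Python sets of (subject, predicate, object) tuples stand in for the RDF data. The concept identifiers (ex:c1 ...), the labels, and the function names compute_delta and group_changes are all made up for illustration, and the sketch assumes one preferred label per concept (no language tags).

```python
# Hypothetical illustration: triples are plain tuples, not real RDF terms.
TYPE, CONCEPT = "rdf:type", "skos:Concept"
PREF_LABEL, BROADER = "skos:prefLabel", "skos:broader"
DEPRECATED = "owl:deprecated"


def compute_delta(old, new):
    """Return (deletions, insertions) between two versions of a vocabulary."""
    old, new = set(old), set(new)
    return old - new, new - old


def group_changes(deletions, insertions):
    """Group raw delta triples into a small, human-readable change report."""
    added = {s for s, p, o in insertions if p == TYPE and o == CONCEPT}
    deprecated = {s for s, p, o in insertions if p == DEPRECATED and o == "true"}

    # A label change shows up as one deleted and one inserted prefLabel
    # on the same concept (assumption: a single prefLabel per concept).
    old_labels = {s: o for s, p, o in deletions if p == PREF_LABEL}
    new_labels = {s: o for s, p, o in insertions if p == PREF_LABEL}
    relabeled = [
        (s, old_labels[s], new_labels[s])
        for s in sorted(old_labels.keys() & new_labels.keys())
    ]

    # Existing concepts whose broader links appear anywhere in the delta.
    moved = {s for s, p, o in deletions | insertions if p == BROADER} - added

    return {
        "added_concepts": sorted(added),
        "deprecated_concepts": sorted(deprecated),
        "label_changes": relabeled,
        "hierarchy_changes": sorted(moved),
    }


# Two tiny made-up versions of a vocabulary.
old_version = {
    ("ex:c1", TYPE, CONCEPT), ("ex:c1", PREF_LABEL, "Money"),
    ("ex:c2", TYPE, CONCEPT), ("ex:c2", PREF_LABEL, "Banks"),
    ("ex:c2", BROADER, "ex:c1"),
    ("ex:c3", TYPE, CONCEPT), ("ex:c3", PREF_LABEL, "Credit"),
    ("ex:c3", BROADER, "ex:c1"),
    ("ex:c5", TYPE, CONCEPT), ("ex:c5", PREF_LABEL, "Barter"),
}
new_version = {
    ("ex:c1", TYPE, CONCEPT), ("ex:c1", PREF_LABEL, "Money"),
    ("ex:c2", TYPE, CONCEPT), ("ex:c2", PREF_LABEL, "Banking"),
    ("ex:c2", BROADER, "ex:c1"),
    ("ex:c3", TYPE, CONCEPT), ("ex:c3", PREF_LABEL, "Credit"),
    ("ex:c3", BROADER, "ex:c2"),
    ("ex:c4", TYPE, CONCEPT), ("ex:c4", PREF_LABEL, "Fintech"),
    ("ex:c4", BROADER, "ex:c2"),
    ("ex:c5", TYPE, CONCEPT), ("ex:c5", PREF_LABEL, "Barter"),
    ("ex:c5", DEPRECATED, "true"),
}

deletions, insertions = compute_delta(old_version, new_version)
report = group_changes(deletions, insertions)
for category, items in report.items():
    print(category, items)
```

Because the grouping only looks at the delta, unchanged triples never have to be inspected again, and the same report structure can be printed for humans or passed on to other programs.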

Automatic Indexing: ZBW Indexer

Automatic indexing methods make analyzing and structuring electronic content much easier and faster. This is why we are testing a statistics-based automatic indexing method. Using our STW Thesaurus for Economics, the ZBW Indexer generates possible keywords from any economic text. (The demo is no longer functional.)

Economics Taxonomies in Drupal

The Drupal economics_taxonomies module can be used in Drupal installations to gain access to well-established vocabularies for economics over the web. Economics content curated in Drupal (journal articles, blog entries, etc.) can be indexed using terms from these vocabularies, without the need to install them locally.

Web Services for Economics

As the publisher of the STW Thesaurus for Economics, ZBW provides experimental thesaurus web services for use by humans and by machines. These services are designed primarily to support autosuggest functions and query expansion in information retrieval applications. Parts of the delivered data originate from datasets which were created by third parties and shared under open licenses.
