Thesaurus

Turning the GND subject headings into a SKOS thesaurus: an experiment

The "Integrated Authority File" (Gemeinsame Normdatei, GND) of the German National Library (DNB), the library networks of the German-speaking countries and many other institutions is a widely recognized and widely used authority resource. The authority file comprises persons, institutions, locations and other entity types, in particular subject headings. With more than 134,000 concepts, organized in almost 500 subject categories, the subjects part - the former "Schlagwortnormdatei" (SWD) - is huge. That would make it a nice resource for stress-testing SKOS tools - if it were available in SKOS. A seminar at the DNB on requirements for thesauri on the Semantic Web (slides, in German) provided another reason for the experiment described below.

skos-history: New method for change tracking applied to STW Thesaurus for Economics

“What’s new?” and “What has changed?” are questions users of Knowledge Organization Systems (KOS), such as thesauri or classifications, ask when a new version is published. All the more so when a thesaurus that has existed since the 1990s has been completely revised, subject area by subject area. After four intermediate versions published in as many consecutive years, ZBW's STW Thesaurus for Economics was recently relaunched as version 9.0. In total, 777 descriptors have been added; 1,052 (of about 6,000) have been deprecated, the vast majority of them merged into other descriptors. More subtle changes include modified preferred labels, or merges and splits of existing concepts.

Since STW was first published on the web in 2009, we have gone to great lengths to make changes traceable: no concept and no web page has been deleted, and everything from prior versions is still available. Following a presentation at DC-2013 in Lisbon, I started the skos-history project, which aims to exploit published SKOS files of different versions for change tracking. A first beta implementation of Linked-Data-based change reports went live with STW 8.14, making use of SPARQL "live queries" (as described in a prior post). With the publication of STW 9.0, full reports of the changes are available. How do they work?

skos-history

"What's new?" and "What has changed?" are common user questions when a new version of a vocabulary is published - be it a thesaurus, a classification, or a simple keyword list. Because SKOS files have a regular structure, changes can be derived from the differences between versions (deltas) and grouped to give an overview of additions, deletions/deprecations, and hierarchy or label changes. The resulting reports should be comprehensible to humans and processable by machines. skos-history aims to develop a set of processing practices and a supporting ontology to this end.

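The idea of deriving changes from version deltas can be illustrated with a minimal sketch. It is not the skos-history implementation; it only assumes that each version is available as a plain set of (subject, predicate, object) triples. The example.org concept URIs, the sample labels, and the function names categorize and delta_report are hypothetical, as is the rough mapping of predicates to change categories.

```python
from collections import defaultdict

SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
OWL_DEPRECATED = "http://www.w3.org/2002/07/owl#deprecated"

# Hypothetical concept URIs, for illustration only.
C1 = "http://example.org/c1"
C2 = "http://example.org/c2"
C3 = "http://example.org/c3"


def categorize(triple):
    """Map a triple to a coarse change category based on its predicate."""
    _, predicate, obj = triple
    if predicate == RDF_TYPE and obj == SKOS + "Concept":
        return "concept"
    if predicate == OWL_DEPRECATED:
        return "deprecation"
    if predicate in (SKOS + "prefLabel", SKOS + "altLabel"):
        return "label"
    if predicate in (SKOS + "broader", SKOS + "narrower"):
        return "hierarchy"
    return "other"


def delta_report(old, new):
    """Group the triples added to and removed from a version by category."""
    report = {"added": defaultdict(set), "removed": defaultdict(set)}
    for triple in new - old:  # in the new version only
        report["added"][categorize(triple)].add(triple)
    for triple in old - new:  # in the old version only
        report["removed"][categorize(triple)].add(triple)
    return report


# Two tiny made-up versions of a vocabulary.
old_version = {
    (C1, RDF_TYPE, SKOS + "Concept"),
    (C1, SKOS + "prefLabel", "Money"),
    (C2, RDF_TYPE, SKOS + "Concept"),
    (C2, SKOS + "prefLabel", "Coins"),
    (C2, SKOS + "broader", C1),
}
new_version = {
    (C1, RDF_TYPE, SKOS + "Concept"),
    (C1, SKOS + "prefLabel", "Money and currency"),  # label changed
    (C2, RDF_TYPE, SKOS + "Concept"),
    (C2, SKOS + "prefLabel", "Coins"),
    (C2, SKOS + "broader", C1),
    (C2, OWL_DEPRECATED, "true"),                    # concept deprecated
    (C3, RDF_TYPE, SKOS + "Concept"),                # concept added
    (C3, SKOS + "prefLabel", "Cryptocurrency"),
    (C3, SKOS + "broader", C1),
}

report = delta_report(old_version, new_version)
for direction in ("added", "removed"):
    for category, triples in sorted(report[direction].items()):
        print(direction, category, len(triples))
```

A changed preferred label shows up as one removed and one added label triple for the same concept, so a real report would still have to pair such triples up. Language tags, blank nodes and merges of concepts are ignored here.
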
Automatic indexing: ZBW-Indexer

Automated subject indexing methods considerably speed up the analysis and structuring of electronic content and make it more consistent. We are therefore currently testing a statistics-based machine indexing method. Using the STW Thesaurus for Economics (Standard-Thesaurus Wirtschaft), the ZBW-Indexer generates candidate keywords from arbitrary economics texts. (The demo application is no longer active.)
