From Athenawiki


A7. IMPLEMENT YOUR THESAURUS


Actions:

Now that you have designed your thesaurus structure, chosen your terms and found equivalent terms in different languages, you need to build the thesaurus technically by:

  • Refining your overall design and preparing the implementation by consulting the standards that provide guidance on building thesauri:
    • Three finalized standards: ISO 2788:1986, ISO 5964:1985 and ANSI/NISO Z39.19-2003
    • But above all: BS8723: Structured Vocabularies for Information Retrieval, and the upcoming ISO 25964-1: Thesauri and interoperability with other vocabularies: Thesauri for information retrieval
  • Using your in-house thesaurus or collections management tool or, if your collections management tool has no terminology management component, a spreadsheet tool (such as Excel or OpenOffice Calc) to declare and organize the lists of terms and the transversal groups.


Purpose:

The objective is to actually build the thesaurus you designed in the previous steps. If your design is sound, the technical implementation will be quick and easy. Before building your thesaurus, we recommend that you consult the standards that give guidance on developing this kind of terminology. The work of ISO in particular is a good guide for implementing your thesaurus.

The three standards ISO 2788:1986, ISO 5964:1985 and ANSI/NISO Z39.19-2003 are finalized and worth knowing when you want to design a thesaurus precisely, but we recommend the most recent ones:

  • BS8723: Structured Vocabularies for Information Retrieval. This standard, a British adaptation of ISO 2788, is intended to cover every kind of terminology, not only thesauri, and also focuses on interoperability between vocabularies. It takes into account the connection between terminologies and collections and objects, with a view to SKOSification.
  • ISO 25964: Thesauri and Interoperability with other Vocabularies. This standard is divided into two parts: the first part, on "Thesauri for Information Retrieval", is to be published in 2011; the second part, on "Interoperability with other vocabularies", is to be published in 2012. It updates the previous standards on thesauri (ISO 2788 and ISO 5964) regarding their design, and adds technical specifications for thesaurus design and maintenance software. It also gives recommendations on interchange formats and protocols. The UML diagram presenting the general design of a thesaurus and its implementation as defined by this standard is included in its annexes.

Our recommendations follow those of these standards.

Among all the existing tools we identified during our benchmark, none is really dedicated to implementing a new thesaurus. Ideally, with SKOSification in mind (especially step B2: Roughly SKOSify your thesaurus), you should at this step work directly in an XML editor in which you can already format your terminology in RDF. However, it is easier to use a spreadsheet tool and then convert the result to XML.

XML is not mandatory here, but it puts your terminology in a more standard form than a spreadsheet. Its first benefit is that you take a first step towards the SKOSification of your terminology. Its second is that the tree display of XML (for instance in a Web browser) shows at a glance how your thesaurus is structured. Note that, while the previous steps did not require specific knowledge of information engineering, this step is the first to require technical skills.
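To give an idea of what "formatting your terminology in RDF" can look like, here is a minimal sketch in Python that writes two terms as SKOS concepts in RDF/XML. It is only an illustration under assumptions: the base URI http://example.org/thesaurus/, the helper names add_concept and build_rdf, and the two placeholder terms are hypothetical, and a real SKOSification (steps B1 to B8) involves many more decisions than shown here.

```python
import xml.etree.ElementTree as ET

# Standard RDF and SKOS namespaces.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
# Hypothetical base URI, chosen for illustration only.
BASE = "http://example.org/thesaurus/"

ET.register_namespace("rdf", RDF)
ET.register_namespace("skos", SKOS)


def add_concept(root, term, broader=None):
    """Append a skos:Concept for `term`, optionally linked to a broader term."""
    node = ET.SubElement(
        root,
        f"{{{SKOS}}}Concept",
        {f"{{{RDF}}}about": BASE + term.replace(" ", "_")},
    )
    ET.SubElement(node, f"{{{SKOS}}}prefLabel").text = term
    if broader is not None:
        ET.SubElement(
            node,
            f"{{{SKOS}}}broader",
            {f"{{{RDF}}}resource": BASE + broader.replace(" ", "_")},
        )
    return node


def build_rdf(pairs):
    """Build an rdf:RDF element from (term, broader_term_or_None) pairs."""
    root = ET.Element(f"{{{RDF}}}RDF")
    for term, broader in pairs:
        add_concept(root, term, broader)
    return root


# Placeholder terms: "palace" is declared as narrower than "monument".
rdf_root = build_rdf([("monument", None), ("palace", "monument")])
ET.indent(rdf_root)  # pretty-printing, requires Python 3.9 or later
rdf_xml = ET.tostring(rdf_root, encoding="unicode")
print(rdf_xml)
```

Opening the resulting file in a Web browser or XML editor shows the nested structure mentioned above.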


For example

You have a thesaurus about architecture containing two micro-thesauri: one about monuments and another about habitations. Your "monument" list of terms contains, for example, "palace", "triumphal arch", "thermae"... and your "habitation" list contains "apartment", "hut", "house", "squat"... Finally, your transversal group of terms on the theme of "buildings" gathers "palace" and "apartment".

To implement such a thesaurus, you use OpenOffice Calc as your spreadsheet software. Your main sheet is called "Architecture Thesaurus". The first column holds the names of the micro-thesauri ("monument", "habitation"); the second holds the terms that belong to each of them in a hierarchical relation.

Image: A7_table_1.jpg

Then, to declare the transversal group of terms related to the theme "buildings", you create a new sheet entitled "buildings" in your spreadsheet, in which the first column gives the terms and the second their source micro-thesauri.

Image: A7_table_2.jpg
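The two sheets described in this example can be sketched as plain CSV data, which is also a convenient way to check that every term in a transversal group really exists in its declared source micro-thesaurus. The Python sketch below is only an illustration: the column headers (micro_thesaurus, term, source_micro_thesaurus) and the function names are assumptions, not part of the original tables, and only a subset of the example terms is listed.

```python
import csv
import io

# Main sheet "Architecture Thesaurus": column 1 = micro-thesaurus, column 2 = term.
# Column headers are hypothetical; only some of the example terms are listed.
MAIN_SHEET = """micro_thesaurus,term
monument,palace
monument,triumphal arch
habitation,apartment
habitation,hut
habitation,house
habitation,squat
"""

# Sheet "buildings": column 1 = term, column 2 = source micro-thesaurus.
BUILDINGS_SHEET = """term,source_micro_thesaurus
palace,monument
apartment,habitation
"""


def read_rows(text):
    """Parse CSV text into a list of dictionaries keyed by column header."""
    return list(csv.DictReader(io.StringIO(text)))


def group_by_micro_thesaurus(rows):
    """Map each micro-thesaurus name to the list of its terms."""
    grouped = {}
    for row in rows:
        grouped.setdefault(row["micro_thesaurus"], []).append(row["term"])
    return grouped


def check_group(group_rows, thesaurus):
    """Return the group entries whose term is missing from its declared source."""
    return [
        (row["term"], row["source_micro_thesaurus"])
        for row in group_rows
        if row["term"] not in thesaurus.get(row["source_micro_thesaurus"], [])
    ]


thesaurus = group_by_micro_thesaurus(read_rows(MAIN_SHEET))
problems = check_group(read_rows(BUILDINGS_SHEET), thesaurus)
print(thesaurus)
print(problems)  # an empty list means the transversal group is consistent
```

Keeping the transversal group in its own sheet, with a pointer back to the source micro-thesaurus, means a term is declared only once in the hierarchy and merely referenced from the group.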

Methods and tools:

If you do not have an in-house thesaurus management tool which enables you to implement a thesaurus from scratch and convert it to XML, we advise you to use a spreadsheet tool such as OpenOffice.

It is a free tool whose features are suited to organising terms according to both hierarchical and transversal approaches. You can also export your data to XML with the Save As function.
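If you prefer to control the XML structure yourself, one possible alternative is to save the main sheet as CSV and convert it with a short script. The Python sketch below assumes a CSV export with the hypothetical headers micro_thesaurus and term; the element names thesaurus, microThesaurus and term are likewise invented for illustration and do not follow any particular standard.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical CSV export of the main sheet (subset of the example terms).
CSV_EXPORT = """micro_thesaurus,term
monument,palace
monument,triumphal arch
habitation,apartment
habitation,hut
"""


def csv_to_xml(csv_text, name):
    """Convert (micro-thesaurus, term) rows into a nested XML tree."""
    root = ET.Element("thesaurus", {"name": name})
    micro_nodes = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        micro = row["micro_thesaurus"]
        # Create one <microThesaurus> element per distinct name.
        if micro not in micro_nodes:
            micro_nodes[micro] = ET.SubElement(root, "microThesaurus", {"name": micro})
        ET.SubElement(micro_nodes[micro], "term").text = row["term"]
    return root


def outline(element, depth=0):
    """Return an indented text outline of the tree, one line per element."""
    label = element.get("name") or (element.text or "").strip()
    lines = ["  " * depth + f"{element.tag}: {label}"]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


xml_root = csv_to_xml(CSV_EXPORT, "Architecture Thesaurus")
print(ET.tostring(xml_root, encoding="unicode"))
print("\n".join(outline(xml_root)))
```

The outline printed at the end plays the same role as the tree display of XML in a Web browser: it shows the hierarchy of micro-thesauri and terms at a glance.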


Navigation

We invite you to continue the step-by-step process by going to the next step: B: Make your terminology interoperable.


The different tasks we detail are:

  • A1: Define your collection domain(s)
  • A2: Identify your users' expectations about your semantic descriptions
  • A3: Define your connection with the datamodel
  • A4: Choose the terms for the semantic description of your digital resources
  • A5: Organise your terms into a thesaurus structure
  • A6: Find equivalent terms in other languages
  • A7: Implement your thesaurus

You can also navigate through the recommendations using the synoptic map below, which is available on each page of the recommendations process. To see the name of a particular step, hover over its box until the name appears.

A: Conceive your terminology

  • A1: Define your domains
  • A2: Identify your users' expectations
  • A3: Define your connection with the datamodel
  • A4: Choose your terms
  • A5: Organize your terms into a thesaurus structure
  • A6: Find equivalent terms in other languages
  • A7: Implement your thesaurus

B: Make your terminology interoperable

  • B1: Evaluate how far SKOS is compliant with your main features
  • B2: Roughly SKOSify your thesaurus
  • B3: Define with precision the labels expressing concepts
  • B4: Identify your concepts
  • B5: Map your concepts
  • B6: Map your terms
  • B7: Ensure the documentation of concepts
  • B8: Validate your SKOSification

C: Link your terminology to a network

  • C1: Definition of metadata on your terminology
  • C2: Identification of resources for mapping
  • C3: Mapping with other resources
  • C4: Validation of the interoperability

This page was last modified on 3 June 2011, at 10:23.