From Athenawiki
A7. IMPLEMENT YOUR THESAURUS

Actions:
Now that you have designed your thesaurus structure, chosen your terms and found equivalent terms in other languages, you need to build the thesaurus technically by:
- Refining your overall design and preparing the implementation by consulting the standards that provide guidance on building thesauri:
  - Three finalized standards: ISO 2788:1986, ISO 5964:1985 and ANSI/NISO Z39.19-2003
  - But above all: BS 8723: Structured Vocabularies for Information Retrieval, and the upcoming ISO 25964-1: Thesauri and interoperability with other vocabularies, Part 1: Thesauri for information retrieval
- Using your in-house thesaurus or collections management tool or, if your collections management tool has no terminology management component, a spreadsheet tool (such as Excel or OpenOffice Calc) to declare and organise the lists of terms and the transversal groups in practice.
Purpose:
The objective is to actually build the thesaurus you have designed. If your design is sound, the technical implementation will be quick and easy. Before building your thesaurus, we recommend that you consult the standards that give guidance on constructing this kind of terminology; the work of ISO in particular is a good guide.
Although the three standards ISO 2788:1986, ISO 5964:1985 and ANSI/NISO Z39.19-2003 are finalized and worth knowing when you want to design a thesaurus precisely, we recommend the most recent ones:
- BS 8723: Structured Vocabularies for Information Retrieval. This standard, a British adaptation of ISO 2788, aims to cover every kind of terminology, not only thesauri, and also addresses interoperability between vocabularies. It takes into account the connection between terminologies and collections and objects, with a view to SKOSification.
- ISO 25964: Thesauri and Interoperability with other Vocabularies. This standard is divided into two parts: the first, "Thesauri for information retrieval", will be published in 2011; the second, "Interoperability with other vocabularies", will be published in 2012. It updates the earlier thesaurus standards (ISO 2788 and ISO 5964) regarding thesaurus design, and also gives technical specifications for thesaurus design and maintenance software. It includes recommendations for interchange formats and protocols. The UML diagram presenting the general design of a thesaurus and its implementation as defined by this standard is included in the annexes.
Our recommendations follow those of these standards.
Among all the existing tools we identified during our benchmark, none is really dedicated to implementing a new thesaurus. Ideally, with SKOSification in mind (especially step B2: Roughly SKOSify your thesaurus), you would use an XML editor at this step and format your terminology directly in RDF. However, it is easier to use a spreadsheet tool and then convert the result to XML.
XML is not mandatory here, but it puts your terminology in a more standard form than a spreadsheet. Its first benefit is that it is a first step towards the SKOSification of your terminology. The second is that the tree display of XML (for instance in a Web browser) shows at a glance how your thesaurus is structured. Note that, whereas the previous steps did not require very specific knowledge of information engineering, this one is the first to require technical skills.
For example:
You have a thesaurus about architecture containing two micro-thesauri: one about monuments and another about habitations. Your "monument" list of terms contains, for example, "palace", "triumphal arch", "thermae"... and your "habitation" list contains "apartment", "hut", "house", "squat"... Finally, your transversal group of terms on the theme of "buildings" gathers "palace" and "apartment". To implement such a thesaurus, you use OpenOffice as your spreadsheet software. Your main sheet is called "Architecture Thesaurus". The first column holds the micro-thesaurus names ("monument", "habitation"); the second holds the terms that belong to each of them in a hierarchical relation. Then, to declare the transversal group of terms related to the theme "buildings", you create a new sheet entitled "buildings", in which the first column gives the terms and the second their source micro-thesaurus.
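The two sheets of this example can also be sketched in a few lines of Python, which may help to check the layout before typing it into a spreadsheet. This is only an illustration under stated assumptions: the column headers, the function names and the choice of CSV as output are ours, and only a few of the example terms are included.

```python
import csv
import io

# Main sheet "Architecture Thesaurus": (micro-thesaurus, term) pairs.
# Only a subset of the example terms is listed here.
main_sheet = [
    ("monument", "palace"),
    ("monument", "triumphal arch"),
    ("habitation", "apartment"),
    ("habitation", "hut"),
    ("habitation", "house"),
    ("habitation", "squat"),
]

# Terms gathered in the transversal group "buildings".
buildings_terms = ["palace", "apartment"]


def build_group_sheet(main_rows, group_terms):
    """Return (term, source micro-thesaurus) rows for a transversal group."""
    source_of = {term: micro for micro, term in main_rows}
    missing = [t for t in group_terms if t not in source_of]
    if missing:
        # A transversal group should only reuse terms already declared.
        raise ValueError(f"terms not declared in the main sheet: {missing}")
    return [(term, source_of[term]) for term in group_terms]


def to_csv(header, rows):
    """Serialise one sheet as CSV text (header names are assumptions)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


buildings_sheet = build_group_sheet(main_sheet, buildings_terms)
print(to_csv(["micro-thesaurus", "term"], main_sheet))
print(to_csv(["term", "source micro-thesaurus"], buildings_sheet))
```

Deriving the second column of the "buildings" sheet from the main sheet, rather than typing it by hand, keeps the transversal group consistent with the micro-thesauri if a term is later moved.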
Methods and tools:
If you do not have an in-house thesaurus management tool that lets you implement a thesaurus from scratch and convert it to XML, we advise you to use a spreadsheet tool such as OpenOffice.
It is a free tool whose features are suited to organising terms both hierarchically and transversally. You can also export your data to XML through the Save As function.
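If the XML produced by the spreadsheet tool is not convenient to read, one possible alternative is to save the sheet as CSV and convert it with a short script. The sketch below uses only the Python standard library; the element names (`thesaurus`, `microThesaurus`, `term`) and the CSV column names are hypothetical choices made for illustration, not a format defined by the standards above or by OpenOffice.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical CSV export of the main sheet (column names are assumptions).
CSV_EXPORT = """micro-thesaurus,term
monument,palace
monument,triumphal arch
habitation,apartment
habitation,hut
"""


def csv_to_xml(csv_text, thesaurus_name):
    """Group terms under their micro-thesaurus and return the XML root element."""
    root = ET.Element("thesaurus", {"name": thesaurus_name})
    micro_elements = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        micro = row["micro-thesaurus"].strip()
        term = row["term"].strip()
        if micro not in micro_elements:
            # Create each micro-thesaurus element once, in order of appearance.
            micro_elements[micro] = ET.SubElement(
                root, "microThesaurus", {"name": micro}
            )
        ET.SubElement(micro_elements[micro], "term").text = term
    return root


root = csv_to_xml(CSV_EXPORT, "Architecture Thesaurus")
ET.indent(root)  # pretty-printing helper, available from Python 3.9
print(ET.tostring(root, encoding="unicode"))
```

Opened in a Web browser, the resulting file shows each micro-thesaurus as a collapsible branch with its terms underneath, which is the tree display mentioned earlier. It is plain XML only; formatting the terminology in RDF is left to step B2.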
Navigation
We invite you to continue the step-by-step process by going to the next step: B: Make your terminology interoperable.

The different tasks we detail are:
- A1: Define your collection domain(s)
- A2: Identify your users' expectations about your semantic descriptions
- A3: Define your connection with the datamodel
- A4: Choose the terms for the semantic description of your digital resources
- A5: Organise your terms into a thesaurus structure
- A6: Find equivalent terms in other languages
- A7: Implement your thesaurus
You can also navigate through the recommendations using the synoptic map below, which is available on each page of the recommendations process. To see the name of a particular step, hover over its box for a moment until the name appears.



