Metadata Conventions

Metadata are often proposed as a way to organize data, manage documents effectively, and obtain better results from search engines. Data without accompanying metadata are often hard to find, difficult to access, troublesome to integrate, and perplexing to understand or interpret. Furthermore, as time passes, undocumented data may lose their value, and relevant memories can dissolve without a trace.

Metadata are structured data describing facts about documents in a way that helps users make sense of their content, their relationships and their history. Careful decisions about which structured data to use for describing which facts about documents determine the metadata schema or, under some circumstances, the ontology. Metadata and ontologies are ways of organizing data about data, or information used to retrieve information, so as to help in organizing, understanding and searching facts within huge quantities of documents.

Metadata provide improved reliability in search results, good support for workflow processes, data filtering, an inventory of the information held by organizations, inter-organizational consistency in describing shared facts and documents, interoperability, and long-term organizational memory.

The Dublin Core (http://dublincore.org/) is probably the most widespread and best-known schema for metadata structures as applied to electronic documents. Yet its generality limits the flexibility of metadata for the scope and extent of applications internal to an organization. For this reason, many metadata schemas build on the Dublin Core but extend it for purposes of local interest only. Far more interesting are initiatives such as the one sponsored by the International Federation of Library Associations (IFLA), called Functional Requirements for Bibliographic Records (FRBR, http://www.ifla.org/VII/s13/frbr/frbr.pdf), which tries to capture the several different natures of a document, such as the persistent characteristics of different versions of the same document, as expressed in the WORK/EXPRESSION/MANIFESTATION/ITEM specification.

It is also important to mention the Semantic Web initiative: within the World Wide Web Consortium, the Semantic Web is the collection of models, protocols and software tools aimed at providing software with the features to “understand” documents, rather than simply “display” them. The idea behind the Semantic Web is to create sophisticated applications that can derive new knowledge and exhibit complex behaviour based on formalized statements about the content of the documents, and expressed in terms of metadata accompanying the documents themselves.

Naming Convention

to identify metadata unambiguously without name clashes

A mechanism is needed to ensure that metadata belonging to the core set can be identified unambiguously and that proprietary metadata can be added to the various Akoma Ntoso document models without name clashes.

Proprietary metadata names are constructed using a namespace prefix that points to the country code of the relevant emanating body (such as the Parliament). It is assumed that name clashes among metadata items within a single country are resolved before the items are committed to actual Akoma Ntoso documents.

Akoma Ntoso assumes the following conventions regarding metadata items:

  • Items without namespace prefix belong to the core set of metadata elements and are available to all emanating bodies in their explicitly stated meaning.
  • Items with a given country prefix are local to a specific country and are not to be used outside of documents coming from that country. Applications receiving documents that contain metadata elements with an unknown prefix CAN take no further action on them, but MUST preserve them when delivering the documents to other applications down the line.
  • Items within the Akoma Ntoso namespace are to be considered local, but are scoped to the whole Akoma Ntoso framework rather than to an individual country. A central repository and registry of metadata names belonging to the Akoma Ntoso prefix will be kept, to which any individual emanating body can contribute. The registry will make sure that no name clashes exist in Akoma Ntoso scoped local extensions.
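
The conventions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of Akoma Ntoso itself: the function names, the use of ":" to separate prefix and local name, the "an" framework prefix, and the sample item names are all hypothetical.

```python
def scope_of(name, framework_prefix="an"):
    """Classify a metadata item name by its namespace prefix.

    Assumption: names look like "prefix:local" or plain "local",
    and "an" stands in for the Akoma Ntoso prefix.
    """
    prefix, sep, _local = name.partition(":")
    if sep == "":
        return "core"        # no prefix: core set, shared by all bodies
    if prefix == framework_prefix:
        return "framework"   # local, but scoped to the whole framework
    return "country"         # local to one country


def relay(items, understood_prefixes):
    """Return (acted_on, outgoing) for a list of (name, value) pairs.

    Items with an unknown prefix are not acted on, but every item is
    kept, in order, in the outgoing list handed to the next application.
    """
    acted_on = []
    outgoing = []
    for name, value in items:
        prefix, sep, _local = name.partition(":")
        if sep == "" or prefix in understood_prefixes:
            acted_on.append(name)
        outgoing.append((name, value))  # preserved regardless of prefix
    return acted_on, outgoing
```

For example, an application that understands only the hypothetical "ke" prefix would act on "title" and "ke:gazetteNumber", ignore "zz:custom", and still pass all three items on unchanged.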

Subject descriptors, keywords and thesauri

the provider of values for a number of metadata items

Any generic or specialized thesaurus can be used to specify terms in a number of metadata elements of the Akoma Ntoso metadata section. In particular this is true for the subject descriptors (element name: <keyword>). Whenever a thesaurus is used, the attribute dictionary must be present to specify the name of the thesaurus itself. It is possible to use terms that do not belong to any thesaurus, in which case the “none” value must be used for the dictionary attribute.

<meta>
    ...
    <classification source="#FV">
        <keyword value="Community programme" dictionary="Eurovoc"/>
        <keyword value="Internet" dictionary="Eurovoc"/>
        <keyword value="information technology" dictionary="Eurovoc"/>
        <keyword value="new technology" dictionary="Eurovoc"/>
        <keyword value="data protection" dictionary="Eurovoc"/>
    </classification>
    ...
</meta>
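
The rule that every keyword carries a dictionary attribute (with the value "none" for terms outside any thesaurus) can be checked mechanically. The sketch below uses Python's standard xml.etree.ElementTree module. It assumes a fragment without XML namespaces, as in the example above; the function name and the last two sample keywords are hypothetical and serve only to show the "none" case and a violation.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment: two thesaurus terms, one free term marked
# with dictionary="none", and one keyword that breaks the rule.
SAMPLE = """
<meta>
    <classification source="#FV">
        <keyword value="Internet" dictionary="Eurovoc"/>
        <keyword value="data protection" dictionary="Eurovoc"/>
        <keyword value="local broadband pilot" dictionary="none"/>
        <keyword value="untagged term"/>
    </classification>
</meta>
"""


def keywords_missing_dictionary(meta_xml):
    """Return the values of keywords with no (or an empty) dictionary attribute."""
    root = ET.fromstring(meta_xml.strip())
    return [
        kw.get("value")
        for kw in root.iter("keyword")
        if not kw.get("dictionary")
    ]
```

A document that follows the convention yields an empty list. Documents that declare an XML namespace would need namespace-qualified element names in the lookup.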

Other Metadata Initiatives

identification of semantic equivalences

Akoma Ntoso strongly encourages the identification of semantic equivalences between Akoma Ntoso metadata and well-known international metadata initiatives. Particular importance is given to equivalences between Akoma Ntoso and IFLA FRBR, Dublin Core (http://dublincore.org/) and FOAF (http://xmlns.com/foaf/0.1/).

The central repository of the Akoma Ntoso metadata initiative provides a direct conversion mapping between all relevant Akoma Ntoso metadata and other metadata initiatives. In particular, all required and optional Akoma Ntoso items are mapped, whenever appropriate, to specific Dublin Core and FOAF properties. The central repository also requires that all local extensions (whether or not they belong to the Akoma Ntoso prefix) be explicitly associated with a Dublin Core or FOAF property whenever appropriate.
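
As a rough illustration of the association requirement, the sketch below models a registry as a Python dictionary and flags local extensions whose declared target is neither a Dublin Core nor a FOAF property. Everything here is an assumption made for the example: the registry layout, the "dc:" and "foaf:" prefixes, the use of None to mean "no equivalent considered appropriate", and all local item names. It does not reproduce the actual Akoma Ntoso mapping.

```python
# Prefixes accepted as mapping targets (assumed notation).
TARGET_PREFIXES = ("dc:", "foaf:")


def invalid_mappings(registry):
    """Return, sorted, the local names whose target is not a DC or FOAF property.

    `registry` maps a local item name to a target property string, or to
    None when no equivalent is considered appropriate (None is accepted).
    """
    return sorted(
        name
        for name, target in registry.items()
        if target is not None and not target.startswith(TARGET_PREFIXES)
    )


# Hypothetical registry entries, for illustration only.
EXAMPLE_REGISTRY = {
    "ke:sponsor": "dc:contributor",
    "ke:clerkName": "foaf:name",
    "an:sittingNumber": None,   # no appropriate equivalent
    "ke:topic": "subject",      # target lacks a recognised prefix
}
```

With these entries, only "ke:topic" would be reported, since its target names neither vocabulary.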