We are in the process of updating this content to reflect changes to the latest release of AKOMA NTOSO!!!
AKOMA NTOSO in detail
1.
What is it
AKOMA NTOSO - Architecture for Knowledge-Oriented Management of African Normative Texts using Open Standards and Ontologies
defines a set of simple, technology-neutral representations of
parliamentary documents for e-services in a Pan-African context and
provides an enabling framework for the effective exchange of "machine readable" parliamentary, legislative and judiciary documents such as legislation, debate records, minutes, etc.
Providing access to primary legal materials, parliamentary works and judiciary documents is not just a matter of giving physical or on-line access to them. "Open access" requires the information to be described and classified in a uniform and organized way, so that the content is structured into meaningful elements that software applications can read and understand; that is, so that the content is made "machine readable".
The opportunity to make the structure and semantic components of parliamentary, legislative and judiciary documents accessible to software applications means being able to apply the huge capacity of ICTs to manipulate documents not as plain, undifferentiated text but according to their structure and semantic components, so that high-value information services can be developed to help both institutions and citizens better play their respective roles.
AKOMA NTOSO fulfils the citizens' right to access parliamentary and judiciary proceedings and legislation by providing "open access" and advanced functionalities such as "point-in-time" legislation. It does so through standardised representations of data and metadata in the African parliamentary domain, and through a mechanism for citation and cross referencing of legal documents that also improves data exchange and document life cycle automation.
The AKOMA NTOSO logo is a combination of two symbols:
- Linked Hearts - a symbol used by the Akan people of West Africa to represent understanding and agreement;
- The African continent - to indicate the regional focus of AKOMA NTOSO.
Placing the 'linked hearts' over the African continent is a symbolic representation of the AKOMA NTOSO goal: "To provide open access to parliamentary documentation and allow African parliaments to exchange information efficiently, promoting understanding and collaboration".
1.1.
Open Access
The "meaning" and "structure" of every element in a parliamentary, legislative or judiciary document will be available to software applications.
Interoperability Framework
Although
each Parliament has its unique characteristics, all Parliamentary
democracies have a number of characteristics in common: Actors,
Structures, Procedures, Acts and Information. AKOMA NTOSO defines
common building blocks in a single model that can be applied to each
(or at least most) parliamentary, legislative and judiciary documents.
AKOMA NTOSO
defines a set of recommendations and guidelines for e-services in a
pan-African context. The framework is an essential pre-requisite for
interlinking and web-enabling Parliaments and Courts. It will address
information content and recommend technical policies and specifications
for connecting information systems across Africa.
Country
Parliaments and Courts should use the guidance provided to supplement
their national e-Government Interoperability Frameworks with a
pan-African dimension and thus enable pan-African interoperability of
Parliaments. AKOMA NTOSO is meant to supplement, rather than replace,
national interoperability guidance that may exist by adding the
pan-African dimension.
This initiative will enable open access by focussing on both "semantic" and "technical" Interoperability.
- Semantic interoperability
is concerned with ensuring that the precise meaning of exchanged
information is understandable by any person or application receiving
the data. The majority of AKOMA NTOSO's efforts are dedicated to this
area.
- Technical interoperability
is aimed at ensuring that all AKOMA NTOSO-related applications, systems
and interfaces are based on a shared core of technologies, languages
and technical assumptions easing data interchange, data access and
reuse of acquired competencies and tools. AKOMA NTOSO ensures technical
interoperability by enforcing the use of open standards and open
document formats, based on the XML (eXtensible Markup Language)
language whose specifications are a world-wide standard and for which
numerous tools and applications have been developed and are widely
available.
By adopting AKOMA NTOSO specifications, parliamentary and court system designers can ensure interoperability between systems while retaining the flexibility to select different hardware, systems software and application software to implement solutions.
From presentation to structure and semantics
There are three aspects to any parliamentary, legislative and judiciary document:
- Presentation - how the information looks, e.g. the colour of the text used in the document, the headings and other such formatting issues;
- Structure - how the information is organized;
- Semantics - what the information represents or means.
Online publishing of documents has long been confined to presentation issues. Documents have been put on line trying to replicate as closely as possible the layout and formatting of paper. The way a document looks is very important to the "human reader", but it does not give a computer much useful information for actually "reading" the document as a knowledgeable human being would.
The development of descriptive markup meta-languages such as XML makes it possible to add information to any document so that both its structure and its semantics become "readable" by a computer. Computers do not have the kind of experience and knowledge that allows a human professional to deduce structure and semantics from a document, so the document must first be "marked up" to make it "machine readable".
More specifically:
- Semantic markup - semantically identifies parts of the document (e.g., headings, names, references, provisions). In this way the "meaning" of the different parts can be "understood" by machines as well.
- Structural markup - categorizes the different parts of a document according to their function; e.g. in a parliamentary document you may want to indicate that a certain section of the document is the Preamble, a Question, a Motion, etc.
AKOMA
NTOSO provides a way to move digital documents from the presentation
era to the semantic one. Digital parliamentary, legislative and
judiciary documents will not just be displayed online, they will now be
"understood" by software applications. Both the "meaning" and
"structure" of every element in a parliamentary, legislative or
judiciary document will be available for all machines to access, thus
providing the unprecedented opportunity to exploit the speed and
accuracy of ICTs to manage, access and distribute such documents.
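To make the difference between presentation and marked-up content concrete, here is a minimal sketch in Python. The XML fragment and its element and attribute names (`act`, `preamble`, `section`, `ref`, `date`) are hypothetical and chosen only for illustration; they are not taken from the AKOMA NTOSO schema. Once structure and semantics are marked up, a generic program can list the parts of a document and extract its references and dates without any knowledge of how the text looks on paper.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: element and attribute names are illustrative only,
# not the actual AKOMA NTOSO vocabulary.
SAMPLE = """
<act>
  <preamble>
    <p>Enacted by the <organization>Parliament</organization> on
       <date value="1999-03-01">1 March 1999</date>.</p>
  </preamble>
  <section id="sec1">
    <heading>Short title</heading>
    <p>This Act amends <ref href="/act/1998/12">Act 12 of 1998</ref>.</p>
  </section>
</act>
"""


def describe(xml_text):
    """Read structure and semantics directly from the markup."""
    root = ET.fromstring(xml_text)
    return {
        # Structural markup: which top-level parts the document has.
        "structure": [child.tag for child in root],
        # Semantic markup: what specific pieces of text mean.
        "references": [ref.get("href") for ref in root.iter("ref")],
        "dates": [date.get("value") for date in root.iter("date")],
    }


result = describe(SAMPLE)
print(result)
```

The same information is present in a printed page, but only a human reader can recover it there; in the marked-up version it is available to any XML-aware tool.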
2.
Scope
Define a common FORMAT
Parliaments and Courts function through the medium of documents. Debate in Parliamentary chambers and courts proceedings are recorded as documents. Legislation is passed through the voting process via a combination of documents, the proposed legislation itself, proposed amendments, committee working papers and so on.
Given that the process is document-centric, the key enabler of streamlined Information Technology in Parliaments and Courts will be the use of open document formats for the principal types of documents. Such open document formats will allow easy exchange and aggregation of information - in addition to reducing the time required to make the information accessible via different electronic publishing media.
The Information Technology industry has coalesced around a standard technology for Open Document Formats known as XML (eXtensible Markup Language). AKOMA NTOSO makes use of industry standard XML (eXtensible Markup Language) to define the open documents. It includes a set of XML-based parliamentary, legislative and judiciary Open Document Formats:
- Parliamentary Debates
- Committee briefs
- Journals
- Primary Legislation - covering the life-cycle of a piece of legislation
- Judgements
- Others to be added
Define a MODEL
Define a MODEL for data interchange and open access to parliamentary, legislative and judiciary documents
Regardless of the processes that generate and use parliamentary, legislative and judiciary documents, regardless of the cultural and historical factors that give shape and substance to these documents, and regardless of the human languages in which these documents are written, there are undeniable relationships that connect documents of the same type, of different types, of different countries.
One of the main objectives of AKOMA NTOSO is to be able to capture and describe these similarities so as to unify and streamline, wherever possible and as far as possible, the processes and formats and tools related to parliamentary, legislative and judiciary documentation. This lends itself to reducing investments in tools and systems, helping open access, and enhancing cooperation and integration of governmental bodies both within the individual African countries and between them.
AKOMA NTOSO defines a model for open access focused on the following issues:
- generation of documents: it should be possible to use the same tools for creating the documents, regardless of the type, country, language, and generation process of the document.
- presentation of documents: it should be possible to use the same tools to show on screen and print on paper all documents, regardless of their type, country, language and generation process.
- accessibility of documents: it should be possible to reference and access documents across types, languages, countries, etc., implementing the network of explicit references among texts into a web of hypertext links that allow the reader to navigate easily and immediately across them.
- description of documents: it should be possible to describe all documents, regardless of their types, languages, countries, etc., so as to make it possible to create repositories, search engines, analysis tools, comparison tools, etc.
At the same time, the AKOMA NTOSO model considers the differences that exist in individual document types, that are derived from using different human languages, and that are implicit in the legislative culture of each country. Therefore the common open access model is designed to be flexible, support exceptions, and allow extensions far enough to provide support for all peculiarities that can be found in the complete document set.
Define a DATA schema
Define a common African parliamentary, legislative and judiciary DATA schema
Parliaments and Courts work with a number of distinct types of documents such as legislation, debate record, Parliamentary Questions, Judiciary Proceedings, Judgements etc. AKOMA NTOSO defines a distinct document type for each major type of document. The definition takes the form of human and machine-readable document models, one for each document type. All document types share the same basic structures, provide support for metadata, addressing and references, and differentiate common structure and national peculiarities and extensions. All documents can be produced by the same set of tools (although specialized tools may provide more detailed and specific help in specific situations), need the same tools to be displayed or printed (although specialized tools can provide more sophisticated and individual presentations), can reference each other in an unambiguous and machine-processable way, and can be described by a common set of metadata that helps in indexing, analysing and storing all documents.
Define a METADATA schema
Define a common African parliamentary, legislative and judiciary METADATA schema and ontology
Metadata is structured information about a resource. Metadata records information about a document that does not actually belong to it, but is necessary to examine in order to deal with it (for instance, information about its publication, lifecycle, etc.). Metadata also enables a resource to be found by indicating what the resource is about and how it can be accessed. Furthermore, metadata facilitates the discovery and use of online resources by making them easier to locate for search engines that index metadata.
Metadata values are labelled and collected according to a common ontology, i.e. an organized description of the metadata values that describe the resources. A common ontology is fundamental to managing, organizing and comparing metadata. The African parliamentary, legislative and judiciary ontology is concerned particularly with records management and resource management, and covers the core set of elements that contain data needed for the effective management and retrieval of official parliamentary, legislative and judiciary information. The aim of the African parliamentary, legislative and judiciary ontology is to provide a universal container for all the information about a resource that is available to the owner of the resource, does not belong to the resource itself, and might be needed for management or searching.
Two metadata vocabularies are of foremost importance for the AKOMA NTOSO ontology: the Dublin Core and the Eurovoc-Africa thesaurus. The AKOMA NTOSO ontology provides direct translation of its values into the corresponding Dublin Core properties, and systematically uses values and terms drawn from the Eurovoc-Africa thesaurus. Yet again, the AKOMA NTOSO ontology is designed to be extensible, so that those Parliaments and Courts with different, or more specific, metadata needs may add extra elements and qualifiers to meet their own requirements.
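The translation into Dublin Core and the extensibility described above can be sketched as follows. This is only an illustration under stated assumptions: the local field names (`docTitle`, `author`, `publicationDate`, `keywords`, `sittingNumber`) are hypothetical and are not the actual AKOMA NTOSO metadata names; the targets are standard Dublin Core element names. Fields without a Dublin Core counterpart are kept as local extensions rather than discarded.

```python
# Hypothetical local metadata fields mapped to Dublin Core element names.
# The left-hand names are assumptions made for this sketch.
TO_DUBLIN_CORE = {
    "docTitle": "dc:title",
    "author": "dc:creator",
    "publicationDate": "dc:date",
    "language": "dc:language",
    "keywords": "dc:subject",
}


def to_dublin_core(record):
    """Split a metadata record into Dublin Core properties and local extensions."""
    mapped, extensions = {}, {}
    for field_name, value in record.items():
        target = TO_DUBLIN_CORE.get(field_name)
        if target is not None:
            mapped[target] = value
        else:
            # Institution-specific elements survive as extensions.
            extensions[field_name] = value
    return mapped, extensions


record = {
    "docTitle": "Education Act",
    "publicationDate": "1999-03-01",
    "language": "en",
    "sittingNumber": "42",  # hypothetical local extension
}
dublin_core, local = to_dublin_core(record)
print(dublin_core)
print(local)
```

Keeping the extensions in a separate container means that a tool that understands only the common vocabulary can still index the record, while local tools lose nothing.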
Define a mechanism for cross referencing
The AKOMA NTOSO Naming Convention and the AKOMA NTOSO Reference Mechanism are intended to enable a persistent, location-independent, resource identification mechanism. The adoption of a scheme based on this Naming Convention will allow the full automation of distributed hypertext. The AKOMA NTOSO reference mechanism, based on a shared naming convention, will allow automated generation of hypertext links and access to resources explicitly cited in AKOMA NTOSO documents. This automation can cater for:
- the availability, at a certain time, of more than one resource corresponding to the document referred to;
- the possibility that references to resources not yet published on the web are present.
Official documents, bills, laws, acts and judgements contain numerous references to other official documents, judgements, bills, laws and acts. The whole parliamentary, legislative and judiciary corpus of documents can be seen as a network, in which each document is a node linking, and linked by, several other nodes through natural language expressions.
The adoption of a common naming convention and a reference mechanism to connect a distributed document corpus, like the one embodied by the African Parliaments and Courts, will greatly enhance the accessibility and richness of cross references. It will enable comprehensive cross referencing and hyper-linking, so vital to any parliamentary, legislative and judiciary corpus, from:
- debate record into legislation
- section of legislation to section of legislation in the same act
- section of legislation to section of legislation in another act of the same Parliament or of an institution like the Pan African Parliament
- from judgements to other judgements and acts.
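The following sketch illustrates the idea of location-independent names and a resolver. It makes several assumptions that are not part of AKOMA NTOSO: the citation pattern is modelled on a reference of the form "section 36(2)(c)(ii) of Act 2-1999", the name layout `/act/<year>/<number>#sec...` is invented for this example, and the URLs are placeholders. The resolver returns every known location for a name, so it handles both cases mentioned above: more than one resource for the same document, and references to documents that are not yet published.

```python
import re
from dataclasses import dataclass, field

# Hypothetical citation pattern; real citation styles vary by country.
CITATION = re.compile(
    r"section (?P<section>\d+)(?P<subs>(?:\([0-9a-z]+\))*)"
    r" of Act (?P<number>\d+)-(?P<year>\d{4})"
)


def citation_to_name(text):
    """Turn a natural-language citation into a location-independent name."""
    match = CITATION.search(text)
    if match is None:
        return None
    subs = re.findall(r"\(([0-9a-z]+)\)", match.group("subs"))
    fragment = ".".join([match.group("section"), *subs])
    # The name layout below is an assumption made for this sketch.
    return f"/act/{match.group('year')}/{match.group('number')}#sec{fragment}"


@dataclass
class Resolver:
    """Maps names to zero, one or many physical locations."""
    catalogue: dict = field(default_factory=dict)

    def register(self, name, url):
        self.catalogue.setdefault(name, []).append(url)

    def resolve(self, name):
        document = name.split("#", 1)[0]  # drop the fragment part
        return list(self.catalogue.get(document, []))


resolver = Resolver()
resolver.register("/act/1999/2", "https://parliament.example/acts/1999-2.xml")
resolver.register("/act/1999/2", "https://mirror.example/acts/1999-2.xml")

name = citation_to_name("as required by section 36(2)(c)(ii) of Act 2-1999")
print(name)                                  # the location-independent name
print(resolver.resolve(name))                # two locations for one document
print(resolver.resolve("/act/2001/7#sec1"))  # not yet published: empty list
```

Because the document stores only the name, a resource can move or be mirrored without any cited text having to change; only the resolver's catalogue is updated.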
3.
Users
AKOMA NTOSO aims at providing support for a large number of tasks and users spread throughout time, space and competencies. The types of potential users that might end up using or benefiting from AKOMA NTOSO can be grouped in the following categories:
"The author"
"The author" can be a member of an African Parliament, a judge, a legal practitioner, or a clerk or personal assistant to one of them. He/she is currently drafting a new piece of legislation, due to be discussed and maybe approved in a future session of the Parliament, or preparing a judgement. "The author" is not aware of the existence of AKOMA NTOSO, XML, or any such technicality. He/she might, or might not, be aware of the existence of guidelines for the formal drafting of laws and judgements; he/she does not know what XML is, and does not care. He/she wants to be able to retrieve bills, acts, judgements, etc. effectively, and to access the explicit references to other laws made in a bill or act. "The author" also wants to be able to access "point-in-time" consolidations of laws, which combine the original act and the subsequent amendments up to a specific point in time. In short, "the author" wants easy and effective tools to find and retrieve bills, acts, judgements, etc. in order to carry out his/her duties more effectively.
"The drafter"
"The drafter" is a member of the office supporting the process of legal drafting, parliamentary proceedings or judgements. During the work-flow phase, "the drafter" receives, for example, all proposed text modifications to a bill under discussion, from which any of a number of documents used by members of the Parliament are generated (such as summaries, synoptic views of amendments, etc.). When the proposed bill is finally approved, he/she creates the final version of the bill, either directly in an XML editor or in a word processing file that is then translated into XML by some downstream process phase. "The drafter" is a subject matter expert in a specific area (e.g. law, judgements, etc.) and has some computing experience, but he/she is definitely not a computer programmer or scientist. He/she is aware of AKOMA NTOSO and knows about its structural and semantic requirements, but may know very little about XML. "The drafter" will never be exposed to XML or required to know anything about it, but must be very knowledgeable about the structures, semantics, and explicit and implicit information that the document he/she is drafting carries.
"The toolmaker"
"The toolmaker" works for a computing firm that has a contract for creating AKOMA NTOSO software for a specific African Parliament and Court. "The toolmaker" decides to create a specialized editing tool by customizing a well-known word processor (such as OpenOffice.org or MS Office), and a conversion tool that creates valid AKOMA NTOSO documents by recognizing formatting characteristics of the input texts. He/she aims to make the tools usable for "the drafter" and his/her colleagues and, at the same time, compatible with AKOMA NTOSO rules. Unlike "the drafter", "the toolmaker" has full access to AKOMA NTOSO documentation, and can talk to his/her users to work out together which parts of AKOMA NTOSO are really relevant to their task and how to proceed.
"The citizen"
"The citizen" lives in an African country where the AKOMA NTOSO system is being used. He/she might be a lawyer, a public employee, a business person or just an ordinary citizen needing fast and easy access to laws, legislation or judgements for his/her own purposes. "The citizen's" main objective is searching for laws either through an explicit reference (e.g. "section 36(2)(c)(ii) of Act 2-1999") or via a search interface (either textual or exploiting vocabularies and ontologies specified through the AKOMA NTOSO metadata). "The citizen" doesn't know that AKOMA NTOSO is a project to provide the text of laws, parliamentary proceedings, judgements, etc. to the citizens through some kind of esoteric machinery behind the scenes. He/she does not know what XML is, and does not care. He/she wants his/her web browser to display the text of the law searched for, wants all explicit references to other laws to be hypertext links, and wants a reasonable interface that lets him/her read the text on the screen and, when necessary, print it on paper.
"The future toolmaker"
"The future toolmaker" is 10 years old now. He/she is playing with his/her school friends and does not know anything about AKOMA NTOSO and does not care. Yet. He/she is in this list because in fifteen years, when he/she is 25, he/she will be a professional computer programmer and will have to create new tools for AKOMA NTOSO. The key difference between "the toolmaker" and "the future toolmaker" is that "the future toolmaker" will not have access to complete documentation. He/she will only have sparse documentation of the actual requirements of the system. Furthermore, he/she will have to deal with a fairly stratified situation where the basic ideas (on which "the toolmaker" has worked) have evolved, been modified, expanded and changed emphasis. More often than not, these changes will have happened slowly and without documentation. The only sure thing that "the future toolmaker" has to work on is more than 15 years of legislation available in XML format, whose documentation is certainly introductory, but far from complete and sufficient. Fortunately, one of the early AKOMA NTOSO decisions was to make the XML format as self-explanatory as possible, so that "the future toolmaker" can, in principle, deduce all undocumented facts about AKOMA NTOSO by simply examining a few relevant XML instances of the legislation and discovering there how it should work. In a sense, "the future toolmaker" is a more important user for our system than "the toolmaker", and the possibility for "the future toolmaker" to deduce fundamental properties of AKOMA NTOSO from the visual examination of XML documents will assure the long-term existence and usefulness of the AKOMA NTOSO system itself.
4.
Method
Strategic Goals
"lingua franca", long term storage, common metadata, self-explanatory, extensible
The AKOMA NTOSO model has been informed by the following strategic goals:
- To create a "lingua franca" for the interchange of parliamentary, legislative and judiciary documents between institutions in Africa. For example, Parliament/Court X should be able to easily import a piece of legislation made available in AKOMA NTOSO format by Parliament/Court Y. The goal here is to speed up the process of drafting new legislation/writing sentences/etc. by reducing the amount of re-keying, re-formatting etc. required.
- To provide a long term storage and access format to parliamentary, legislative and judiciary documents that allow search, interpretation and visualization of such documents several years from now, even in the absence of the specific applications and technologies that were used to generate them.
- To provide an implementable baseline for parliamentary, legislative and judiciary systems in African institutions. It is envisaged that this will lead to one or more systems that provide a base layer of software "out of the box" that can then be customized to local needs. The goals here are twofold. Firstly, to facilitate the process of introducing IT into African institutions. Secondly, to reduce the amount of re-invention of the wheel that would result if all institutions pursued separate IT initiatives in the area of parliamentary, legislative and judiciary document production and management.
- To create common data and metadata models so that information retrieval tools and techniques used in Parliament/Court X can also be used in Parliament/Court Y. To take a simple example, it should be possible to search across the document repositories of multiple Parliaments/Courts in a consistent and effective way.
- To create common resource naming and resource linking models so that documents produced by Parliaments/Courts can be easily cited and cross-referenced - either by other Parliaments/Courts or by other users.
- To be "self-explanatory", that is to be able to provide all information for their use and meaning through a simple examination, even without the aid of specialized software.
- To be "extensible", that is it must be possible to allow local customisations to the models within the AKOMA NTOSO framework so that local customisation can be achieved without sacrificing interoperability with other systems.
Simple data model
identify a number of basic, fundamental classes of structures
The AKOMA NTOSO document model is designed, first and foremost, to be actually used. As a consequence, a high premium has been placed on simplicity throughout its design. Data models created to handle complex document types (such as legislation) need to deal with two apparently opposed requirements: on the one hand, they need to be sufficiently sophisticated to handle all the occurrences and situations that may arise in the actual documents. On the other, they need to be speedily understood and used by the people who would need to apply these models. These opposed requirements can be jointly satisfied not by simplifying the vocabularies of available structures and elements, which would reduce the descriptive sophistication of the language, but rather by simplifying the variability and types of the structures (in XML parlance, the content models), thereby reducing the learning time and the software complexity without compromising the full and detailed descriptive power of the language. The idea therefore is to identify a number of basic, fundamental classes of structures (containers, hierarchies, blocks, etc.) that can be immediately understood and used appropriately, regardless of their actual names.
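As a rough illustration of this design choice, the sketch below assigns many element names to a few structure classes, and writes a tool against the classes rather than the names. The three classes come from the text above; the element names and their assignment to classes are assumptions made for this example, not the actual AKOMA NTOSO content models.

```python
import xml.etree.ElementTree as ET
from enum import Enum


class Pattern(Enum):
    CONTAINER = "container"
    HIERARCHY = "hierarchy"
    BLOCK = "block"


# Hypothetical assignment of element names to structure classes.
ELEMENT_PATTERNS = {
    "act": Pattern.CONTAINER,
    "body": Pattern.CONTAINER,
    "chapter": Pattern.HIERARCHY,
    "section": Pattern.HIERARCHY,
    "p": Pattern.BLOCK,
}


def outline(element, depth=0):
    """Build a table of contents from any element classed as a hierarchy."""
    lines = []
    if ELEMENT_PATTERNS.get(element.tag) is Pattern.HIERARCHY:
        lines.append("  " * depth + f"{element.tag} {element.get('id', '?')}")
        depth += 1
    for child in element:
        lines.extend(outline(child, depth))
    return lines


DOC = ET.fromstring(
    '<act><body><chapter id="ch1">'
    '<section id="s1"><p>Text</p></section>'
    '<section id="s2"><p>More</p></section>'
    "</chapter></body></act>"
)
toc = outline(DOC)
print("\n".join(toc))
```

Adding a new hierarchical element name (say, a hypothetical `part`) would only require one new entry in the table; the `outline` function itself would not change, which is the practical benefit of keeping the number of structure classes small.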
Ability to evolve
built to stand evolutions and changes over time
A critical attribute of a successful XML model is its ability to evolve over time. This "evolvability" has been a key concern in the creation of the AKOMA NTOSO model. Thus, the General Schema is built to withstand evolution and change over time, while each individual Detailed Schema can be customized at will, even over time, and still remain compatible with the overall AKOMA NTOSO infrastructure and the General Schema. Furthermore, the General Schema is built to withstand evolution and change in the number of functionalities actually provided: features such as the number of metadata items, the automatic generation of amended text, or the activation of special analysis tools on the text may, with time, require the schema to evolve. In these cases, it can be guaranteed that existing documents already marked up according to the initial versions of AKOMA NTOSO will be either immediately compatible with the new schemas, or easily convertible to them via a single XSLT stylesheet to be provided.
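The kind of one-step conversion described here can be sketched as follows. Note the substitution: the text speaks of an XSLT stylesheet, whereas this example uses Python's standard `xml.etree.ElementTree` module as a stand-in, because the Python standard library has no XSLT processor. The `version` attribute and the rename of `clause` to `paragraph` are hypothetical and serve only to show the shape of such a migration.

```python
import xml.etree.ElementTree as ET

# Hypothetical change between two schema versions: one element is renamed.
RENAMED = {"clause": "paragraph"}


def migrate(xml_text, target_version="2"):
    """Convert a document to the target version; leave current documents alone."""
    root = ET.fromstring(xml_text)
    if root.get("version") == target_version:
        return xml_text  # already compatible, nothing to do
    for element in root.iter():
        element.tag = RENAMED.get(element.tag, element.tag)
    root.set("version", target_version)
    return ET.tostring(root, encoding="unicode")


old = '<act version="1"><clause id="c1">Text</clause></act>'
new = migrate(old)
print(new)
```

The conversion touches only the markup: the text of the provision and its identifier are carried over unchanged, and running the migration on an already converted document has no effect.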
Validation
verifying the correctness of an XML document against specific schema
Validation is the act of checking the correctness of an XML document against pre-defined structural rules expressed in one or more DTDs or XML Schemas. The validation step verifies whether the XML document contains, in number and position, all the expected elements of the type this document is an instance of. The AKOMA NTOSO schema could impose a number of constraints and restrictions on the final form of the XML document, requiring for instance a specific order in the containment of parts, or that every part be preceded by a heading, etc.
The problem with being too restrictive in the constraints of the schema is that Parliaments, for example, may have approved, and may decide to approve in the future, documents that do not conform to these rules. In most countries there are guidelines for the correct drafting of legislation, but that is just what they are: guidelines, which can be ignored and modified at will by a higher authority such as a Parliament or Court. This fact has a very important effect on the generation of XML versions of documents: everything that gets approved by Parliaments or written by Courts has to be accepted by the system, and everything that has already been approved even more so. Therefore, failing XML validation (i.e., violating one or more of the constraints and restrictions expressed in the schemas) cannot have the effect of rejecting documents but, at most, of pointing out issues and differences from the guidelines that the authority itself, if it wishes and has the time, can consider for editing and modification.
In reality there are two different actors in the complex issue of validating a piece of legislation: the legislator, who writes the actual content of the document, and the marker, who converts it into XML by identifying all the interesting parts of the text and correctly marking them up using the AKOMA NTOSO vocabulary.
If we consider the validation schema a contract, then this contract clearly binds only the marker, leaving the legislator/judge absolutely free to do as he/she chooses. Thus compliance with rules such as "An identifier will always be added to each substructure of the act" or "The enactment date will be specified" can be safely required, as they bind the behaviour of the marker only, while structural rules (such as "Every subpart will have a heading", or "A section will contain paragraphs which contain clauses") cannot be imposed, as they would interfere with the authority and independence of the legislator, which in the case of Parliaments is most often complete and cannot be constrained by more mundane requirements such as adherence to an abstract document structure.
Forcing markers to fully describe in XML all document parts, and yet leaving the legislator the maximum freedom in writing, may seem incompatible and hard-to-reach goals, but they can be and are reached in the AKOMA NTOSO framework. AKOMA NTOSO clearly separates data and metadata, thereby clearly distinguishing the contribution of the legislator (data) from the contribution of the marker (metadata). AKOMA NTOSO provides a richly evocative vocabulary of structures and elements, so that the marker can correctly and precisely describe what is actually contained in the documents. AKOMA NTOSO imposes few or no constraints on data, letting the legislator write and organize the text matter as he/she wishes, but imposes a number of constraints on the metadata, forcing the marker of texts to provide all the information that is necessary to manage and organize the documents. Yet it might be appropriate to also give guidance and help in following the drafting guidelines enacted in each country.
This is the reason for providing both a General Schema (GS) and several Detailed Schemas (DSs): the GS is fully descriptive, binding only the marker and not the legislator, but allowing the marker to describe as precisely as possible the actual structure of the document as approved and generated by the Parliament. The DSs are more prescriptive, and are used to check whether the document actually conforms to the existing legal drafting guidelines in each individual country. Successful validation of documents will only be required against the GS, as errors there would signal incorrect markup by the marker, while the DSs can be used, at the discretion of the Parliament itself, to automatically check conformance of the proposed bill against the drafting guidelines, so that it can be modified accordingly if conformance is sought.
The descriptive (GS) and prescriptive (DS) schemas are, of course, closely related. As mentioned, they both use the same vocabularies, and differ in the number of rules and constraints they enact. All rules enforced in the GS are also enforced in the DSs, so that, correspondingly, all documents that are valid according to one of the DSs (i.e. conform to a set of stricter rules) are also valid according to the GS (i.e., they also conform to the looser set of rules).
Furthermore, this pair of schema classes also allows interoperability of systems dealing with documents coming from different African countries. In fact, the descriptive GS can work on all AKOMA NTOSO documents of all the interested African countries, and can be used as the baseline for accessing and displaying documents regardless of their provenance. On the other hand, each prescriptive DS is created to deal with the specific guidelines of an individual country, helping the legal drafting process and the correct preservation of the cultural peculiarities of that country.
The fundamental commonality of GS and DSs provides therefore full description of individual and country-specific document types without renouncing interoperability and document interchange.
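The relationship between the GS and a DS can be pictured as two rule sets, one containing the other. The following Python sketch is only an illustration of that idea: the document model (a plain dictionary), the rule functions and the sample bill are all hypothetical, and real validation is performed against the schemas themselves, not with code like this.

```python
from typing import Callable, List

# A rule is a predicate over a document, modelled here as a plain dict.
Rule = Callable[[dict], bool]

def has_identifiers(doc: dict) -> bool:
    # Marker-side rule: every substructure carries an identifier.
    return all(part.get("id") for part in doc["parts"])

def has_enactment_date(doc: dict) -> bool:
    # Marker-side rule: the enactment date is specified.
    return bool(doc.get("enactment_date"))

def every_part_has_heading(doc: dict) -> bool:
    # Drafting-guideline rule of a hypothetical country.
    return all(part.get("heading") for part in doc["parts"])

GENERAL_SCHEMA: List[Rule] = [has_identifiers, has_enactment_date]
# A Detailed Schema keeps every GS rule and adds stricter ones.
DETAILED_SCHEMA: List[Rule] = GENERAL_SCHEMA + [every_part_has_heading]

def is_valid(doc: dict, rules: List[Rule]) -> bool:
    return all(rule(doc) for rule in rules)

# Hypothetical bill whose second part has no heading.
bill = {
    "enactment_date": "2005-03-01",
    "parts": [{"id": "sec1", "heading": "Short title"}, {"id": "sec2"}],
}

gs_ok = is_valid(bill, GENERAL_SCHEMA)   # markup is complete
ds_ok = is_valid(bill, DETAILED_SCHEMA)  # drafting guideline not followed
```

Because the DS rule list is built by extending the GS list, any document accepted by the DS is necessarily accepted by the GS, while the reverse does not hold.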
Tools
editor, converter, name resolver, post-editing tools
Just as the users are many (some of whom are not even aware that they are using or relying on AKOMA NTOSO-compatible systems), so too are the tools that will be created around the AKOMA NTOSO document model. Some of them are basic tools that are necessary for the AKOMA NTOSO system to work at all. Others are additional applications that will be created once the basic tasks have been catered for. Although this is not the place to provide a full list of the foreseeable tools, a brief list of the main categories may help in explaining the breadth and variety of the AKOMA NTOSO project, and the number of issues that need to be considered in the development of the data formats.
The editor
The editor is the fundamental tool for the generation of XML versions of legislation, judgements, etc. Although in real-life scenarios not all drafting needs to be done in a specialized editor (much less an XML editor), there will be situations in which that will be possible and actually necessary. The editor will be used in three different scenarios:
- As an interface to activate, control and verify the automatic conversion tool (the converter). Through the editor "the legal drafter" will be able to verify the correctness of the conversion, and change and add whatever the conversion engine has missed or misidentified.
- As a tool to manually mark up a document provided in a different format. Depending on the sophistication of the conversion engine, this scenario will most probably blend naturally with the first one. The editor will certainly provide functionalities to edit and add any kind of AKOMA NTOSO-conformant markup, and will be able to check the validity of the intermediate result.
- As an application for direct insertion of both text and markup, starting from an empty document. This will probably be the rarest scenario of use, as the drafting offices will most usually work from an existing document in some other format.
The converter
Together with the editor, the converter is the most fundamental tool of the AKOMA NTOSO system. It will take some convincing for "the drafter" to switch from her old faithful word processor, and her manual system of handling amendments through a combination of glue and scissors, to any kind of unfamiliar text editor. In the meantime, one of the most important tools will be the converter. The converter has the double purpose of converting into AKOMA NTOSO files the documents that "the drafter" is still producing traditionally and, most importantly of all, of converting into AKOMA NTOSO files the legacy documents: the already approved bills and acts that form the current legislative situation of the country, and whose conversion to XML is needed for any hypertext web of references to work at all. Since legacy documents are, by definition, in any old format, and since "the drafter" is not interested in converting them into XML using an editor, "the toolmaker" will have to create an automatic mechanism for the task anyway. The converter is based on the idea of semi-automatic conversion, i.e., it has automatic processes to determine as correctly as possible the structures of interest, and a manual process to confirm (or, if there is an error, to edit) the inferences made by the automatic process. In fact, this application could even be one of the modules of the editor, and use the editor itself for corrections to the automatic inferences of the converter. Of course, the amount of human editing is inversely proportional to the sophistication of the converter, and in theory large quantities of documents could be processed automatically with little or no manual intervention. The converter works by examining the typographical and textual regularities of the document, and inferring a structural or semantic role for each text fragment.
For every fragment that has no deducible structural or semantic role, the presentation characteristics will be recorded instead, and it will be left to the human user to infer the structural or semantic role (if any) that needs to be associated with the fragment. For example, experience with European laws shows that the basic structure of the bill (sections, subsections, clauses, preambles, conclusions, attachments, etc.) can be inferred automatically with great precision and few errors. The most important semantic elements, references and dates, can also be deduced automatically with great precision as long as the human-readable text used for them follows one of a limited number of acceptable forms. More complex structural elements (explicit modifications, specialized terms, people, etc.) might be difficult to catch automatically, but not impossible.
Name resolvers
The AKOMA NTOSO Naming Convention is a standard mechanism for creating identifiers of documents that can be used for accessing content and metadata regardless of storage options and architecture. AKOMA NTOSO documents will be stored on networked computers and will be accessible by specifying their addresses. Yet these addresses are extremely dependent on the specificities of the architecture that is in vogue or appropriate for the economic and technical context of the moment. It is therefore extremely inappropriate for any content or structure that is planned to last for more than a short period of time to be given direct access to the physical address of the document in the form that will eventually be used for display. For this reason, the AKOMA NTOSO Naming Convention specifies an architecture-independent URI address for all relevant structures of the AKOMA NTOSO standard, which cannot be used directly for accessing these structures.
A name resolver is a software tool that, given an architecture-independent URI, can identify the resource being sought and provide the current architecture-dependent address that needs to be used at any given time for actual access. Name resolvers are either indirect (they redirect the client application to the current address of the requested document) or direct (they immediately provide the requested document by generating the actual address and requesting the document as a proxy for the initial client application).
Post-editing tools
The post-editing tools are a number of validation, enrichment, and storage tools that are used after "the legal drafter" has finished her editing job. All these tools require no user interface to speak of, and are managed automatically or by the system administrator of the storage centre for all AKOMA NTOSO documents. These tools include at least the following (but the list might be longer and more sophisticated):
- A content and structure validator that checks the correctness of the document instance with regard to the AKOMA NTOSO schema document, and any additional rules that were added locally.
- A reference validator that checks whether all references contained in the document already belong to the document collection and are correctly referenced.
- A metadata validator that checks whether the metadata stored with the document are correct and complete.
- A sophisticated and complex document management system, with search engines, hypertext functionalities, XSLT support and versioning facilities.
- An XSLT stylesheet (or a series thereof) to create visualizations of individual documents for a number of browsers and applications that will increase and get more sophisticated in time.
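Returning to the name resolvers described in this section, the difference between the indirect and the direct kind can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the lookup table, the URI and address strings, and the `fake_fetch` helper that stands in for a real network request are all hypothetical and are not part of the AKOMA NTOSO Naming Convention.

```python
from typing import Callable, Dict

# Hypothetical table kept by the storage centre: it maps an
# architecture-independent URI to the address currently in use.
CURRENT_ADDRESSES: Dict[str, str] = {
    "/xx/act/2005-01-01/3": "http://docs.example.org/store/a/0003.xml",
}

def resolve_indirect(uri: str) -> str:
    """Indirect resolver: return the address the client is redirected to."""
    return CURRENT_ADDRESSES[uri]

def resolve_direct(uri: str, fetch: Callable[[str], bytes]) -> bytes:
    """Direct resolver: fetch the document as a proxy for the client."""
    return fetch(CURRENT_ADDRESSES[uri])

def fake_fetch(address: str) -> bytes:
    # Stand-in for a real request, so that the sketch is self-contained.
    return ("<akomantoso><!-- served from %s --></akomantoso>" % address).encode()

redirect_target = resolve_indirect("/xx/act/2005-01-01/3")
document_bytes = resolve_direct("/xx/act/2005-01-01/3", fake_fetch)
```

In both cases only the lookup table has to change when the storage architecture changes; the architecture-independent URIs stored inside documents stay the same.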
5.
Schema
This document introduces and explains the
schemas for AKOMA NTOSO 1.0, an XML-based document format for
parliamentary, legislative and judiciary documents in African
Institutions (Pre-Enactment Legislation; Post Enactment Legislation;
Parliamentary Debate Record, Parliamentary Order Paper; Miscellaneous
Parliamentary Documents; Judgements, etc.).
These Technical Annexes provide details about the AKOMA NTOSO 1.0
document structure and elements, state any assumptions made in the
development of the schemas, and offer some hints for
document markup using this schema.
Global Overview
AKOMANTOSO makes use of two different but connected families of schemas:
- AKOMA NTOSO General Schema: A vocabulary and minimal set of constraints that all AKOMA NTOSO documents must comply with.
- AKOMA NTOSO Detailed Schemas:
A set of stricter schemas. They provide more constraints over the same
vocabulary of elements to enforce the rules of specific document types
in specific African Parliaments. It is a requirement of AKOMA NTOSO
that all documents satisfying one of the Detailed Schemas also satisfy
the General Schema.
In this document release only the General
Schema is described in full. Thus, except when explicitly mentioned,
all rules are expected to refer to the General Schema (and thus to all
AKOMA NTOSO documents).
AKOMA NTOSO documents are completely
qualified, i.e., namespaces are used throughout. Even though some
elements use the same name as HTML elements, and in fact are directly
drawn from the HTML vocabulary, for simplicity it has been decided
to use one namespace only, so that all elements are similarly
qualified. The net result is that it is possible to specify the
AKOMA-NTOSO namespace as the default namespace and have no prefixes in
the instance document, while maintaining full qualification of the
documents.
The namespace for this release of AKOMA NTOSO is "http://www.akomantoso.org/1.0".
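The effect of declaring the namespace as the default can be checked with any namespace-aware parser. The sketch below uses Python's standard `xml.etree.ElementTree` module; the sample document is a deliberately incomplete skeleton (it would not validate against the General Schema) and is only meant to show that unprefixed elements are still fully qualified.

```python
import xml.etree.ElementTree as ET

AKN_NS = "http://www.akomantoso.org/1.0"

# The namespace is declared once as the default, so no element needs a prefix.
SOURCE = """<akomantoso xmlns="http://www.akomantoso.org/1.0">
  <act>
    <meta/>
  </act>
</akomantoso>"""

root = ET.fromstring(SOURCE)

# ElementTree reports qualified names in the form {namespace}localname.
qualified_names = [element.tag for element in root.iter()]

# Every element, including the HTML-derived ones, lives in the same namespace.
all_in_akn = all(name.startswith("{%s}" % AKN_NS) for name in qualified_names)
```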
5.1.
General Schema: Patterns and Content Models
patterns are the abstraction and distillation of past experiences
Patterns are the abstraction and distillation
of past experiences in designing and resolving design problems. They
are general and widely applicable guidelines for approaching and
justifying design issues that often occur in XML-based projects.
We
distinguish between patterns in content models (a restriction of
content models to the ones that are actually useful) and patterns in
schema design (guidelines on how to make a schema more modular,
flexible and understandable by users). Both types of patterns are well
known and well established in the literature, although by different
experts in different ways. (For patterns in content
models, we used our own resource as a source of guidelines, the
document titled "Design patterns for document substructures", by F.
Vitali, A. Di Iorio, and D. Gubellini, while the authoritative resource
for patterns in schema design is www.xmlpatterns.com)
For patterns in content models, the AKOMA NTOSO Schema systematically uses five of the seven patterns described in "Design patterns for document substructures".
This means that all content models and complex types used in the schema
follow precisely the form of the relevant pattern, and all elements can
be simply described and treated according to their pattern rather than
individually.
These patterns are:
- The markers:
markers are content-less elements that are scattered here and there in
the document and are meaningful for their names as well as their
attributes. Markers are also known in literature as empty elements or milestones.
There are two main families of markers in the AKOMA NTOSO schema:
placeholders in the text content (e.g., note references) that can
appear in any position that also has text, and metadata elements that
only appear in some subsection of the <meta>
section. As discussed in the metadata section, all metadata elements
are markers so that metadata values are not part of the text content of
a document, but rather are attribute values.
- The inlines:
an inline element is an element placed within a mixed model element
that identifies some text fragment as relevant for some reason. There
are both semantically relevant inlines and presentation oriented
inlines. There is but one content model using inlines (and markers),
which means that all mixed model elements (i.e., those that allow both
text and elements) also allow a repeatable selection of all inline
elements. See at the end of this section for an explanation of why this
is only a trade-off decision, and not the ideal solution.
- The blocks:
a block is a container of text or structures that is organized
vertically on the display (i.e., has paragraph breaks) and can contain
either substructures or text. Most blocks in AKOMA NTOSO are based on
the HTML language. There is only one content model using blocks, and it
allows a repeatable selection of all available blocks. This means that
wherever a block is allowed (e.g., a paragraph), a table or a list is
also allowed.
- The containers: containers are sequences of specific elements, some of which can be optional. The corresponding pattern in "Design patterns for document substructures" is the record.
Containers are all different from each other (as the actual list of
contained elements vary), and so there is no single container content
model, but rather a number of content models that share the record
pattern.
- The hierarchy:
a hierarchy is a set of arbitrarily deep nested sections with title and
numbering. Each level of the nesting can contain either more nested
sections or blocks. The corresponding pattern in "Design patterns for document substructures"
is the table. No text is allowed directly inside the hierarchy, but
only within the appropriate block element (or, of course, titles and
numbering). AKOMA NTOSO uses only one hierarchy, with predefined names
and no constraints on their order or systematic layering.
There are three exceptions to the systematic use of patterns:
- The <li> element allows both inlines and other nested lists (<ul> and <ol>). The pattern would require <li> elements to contain only text, and nested lists to be direct children of the main list element (<ul>s within <ul>).
Since this goes against universal HTML practice, we have decided
against full pattern adherence and in favour of HTML tradition.
- The <mod>
element allows quoted text and structures within its content. No
problems for quoted texts, but when an amendment clause specifies in
full a new structure (such as an article) within the main discourse,
the full structure needs to be described, and it is thus possible to
have an article within a paragraph within a clause, which is against
the inline pattern. There is no simple way out of this issue.
- There are six inline elements that only make sense in the preface and preamble parts of the document: these are <ActTitle>, <ActNumber>, <ActType>, <ActProponent>, <ActDate> and <ActPurpose>.
They are in fact part of the one inline content model and thus are
available everywhere in the document. There is no simple way to define
blocks within <preamble> and <preface>
to allow these elements and paragraphs elsewhere to not allow them, so
it is better to allow them everywhere rather than unnecessarily
complicating the schema. In "Design patterns for document substructures", a direct solution to this issue is proposed (additive context, also known as inclusions),
but in the current XML technology such a constraint would require
validation using a different or additional language such as Schematron
or SchemaPath, which constitutes a possible evolution of the AKOMA
NTOSO project, but certainly not an immediate one.
For
patterns in schema design, whenever there has been a design choice to
be made that was not immediately obvious and naturally acceptable, a
relevant pattern has been sought and properly used. You can find the
relevant mentions within the schema itself, in comments and
documentation.
5.2.
General Schema
components of the general schema
All AKOMA NTOSO documents share the same root element <akomantoso>,
under which the specific document type is selected. The single root
element follows a specific design pattern "Universal root" aimed at
better identification of the root and separation of namespace and
schema declaration (available in the root) and meaningful attributes
(available in the document type element).
Types, Attributes and Groups
The schema starts with a few <group>s and <attributeGroup>s
used throughout the schema for content models and types. They are
followed by common simple types (mostly enumerations of string values)
and complex types. Complex types in this section include those
supporting four of the five main content model patterns used throughout
this schema:
- hierarchy (a hierarchy of nested elements with numbers and titles, as shown below);
Fig. 1 The structure of the hierarchy content model
- blocks (a sequence of block elements, e.g., paragraphs, used within containers), either with required or optional identifiers;
- inline (the content model for all mixed model elements such as paragraphs);
- marker (zero-length elements characterized by their attributes), either with required or optional identifiers;
- container
(the fifth content model pattern has no common form, but lists
different elements in different orders, and individual container-like
complex patterns are spread throughout the schema). Content model
patterns are described in the section "Content Models used in the
General Schema".
Elements
After the
previously described section of the schema, the next section contains
Elements which are organized in meaningful sequence as follows:
- The root element <akomantoso>
Fig.2 akomantoso root element
- The document elements, one for each document type (<act>, <bill>, <doc>, <report> and <minutes>),
that share one of the three document formats:
&HierarchicalStructure; (that has an explicit hierarchy inside),
&OpenStructure;, that allows basically everything inside, and
&DebateStructure;, a slightly hierarchical structure for minutes
and reports.
Fig. 3 AKOMA-NTOSO with act document type showing
- The container elements,
one for each main part of the above mentioned structures, except for
clauses, described next, and meta, described in its own section.
- The hierarchical elements, listing the main elements that are used in the full hierarchy of nested structures of acts and bills, as well as <title>s, <num>s and <subtitle>s.
- Elements for parliamentary debates, particularly <subdivision>, <speech>, <question> and elements for open structures, particularly <item>.
- AKOMA NTOSO specific block and inline elements, including the table of content (<TOC>), the normative reference (<ref>), the defined term in a definition (<def>), the note marker (<noteref>) pointing to an editorial note placed out of line (in the meta section), the recorded time of a spoken remark (<recordedTime>), the container for amendments (<mod>)
and the two types of amendment quoted fragments: simple text fragments
(such as a few words inside quotes) or full structures (such as an
entire clause or article).

Fig 4. An example of an inline element, the <ref> element
- Generic elements:
the list of available generic elements (one for each of the five main
patterns for content models), explained in detail in a separate
section.
- HTML elements:
the list of elements, directly derived from HTML, used to provide for
presentation-oriented, rather than semantic-oriented, markup within
AKOMA NTOSO documents. They form a very strict simplification of the
HTML language, but allow for many useful structures inside a
legislative act. HTML elements and how to use them in AKOMA NTOSO are
described in the section "HTML elements and CSS rules".
- Metadata elements provide a location for all relevant information about an AKOMA NTOSO document that does
not belong to its actual content. Metadata thus are all, by definition,
editorial additions to the text as originally composing the document.
Metadata are described in a separate section.
In this release the AKOMA NTOSO 1.0 schema
contains a total of 129 elements, of which 59 are specific to the
AKOMA NTOSO vocabulary, 6 generic elements, 16 HTML elements, and 51
metadata elements.
Design details
Generic Elements
AKOMA NTOSO 1.0
strongly supports the idea of using semantically rich terms whenever a
semantically justifiable text fragment exists in the document. This
means that it is possible that users of AKOMA NTOSO in daily work will
find the need for more elements than currently provided.
Generic elements
come to aid in this respect. Whenever a new semantic is needed to
describe a text fragment, a generic element of the appropriate content
model is used instead, and the correct label is specified in the name
attribute.
It is strongly
discouraged to use presentation-oriented elements (such as b, i, etc.)
to emphasize fragments that do have a semantic justification
for being emphasized. Also, each text fragment needs to be enclosed
within the appropriate generic element according to its position and
content model, which is the reason for there being five generic
elements (one for each content model pattern).
Finally, an
explicit equivalence is provided between named elements and generic
elements: all named elements are just generic elements in disguise, the
value of the name attribute having been upgraded to being the full
element name. Therefore, for instance, <section> is absolutely equivalent to <hcontainer name="section">, or <noteref> is equivalent to <marker name="noteref">.
This in turn means
that it is possible to reverse the approach, and, after a revision
process, officially enrich the AKOMA NTOSO language with new elements
that have been used in the past as values for the name attribute of
generic elements.
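The equivalence between named and generic elements can be made concrete with a small transformation. The Python sketch below rewrites named elements as their generic counterparts; it lists only the two pairs given in the text, omits namespaces for brevity, and assumes that the identifier attribute is called `id`. It is an illustration of the equivalence, not a tool provided by AKOMA NTOSO.

```python
import xml.etree.ElementTree as ET

# Named element -> generic element of the same content model pattern.
# Only the two pairs mentioned in the text are listed here.
GENERIC_FORM = {"section": "hcontainer", "noteref": "marker"}

def to_generic(element: ET.Element) -> ET.Element:
    """Rewrite named elements, in place, as their generic equivalents."""
    for node in element.iter():
        generic = GENERIC_FORM.get(node.tag)
        if generic is not None:
            node.set("name", node.tag)  # the old element name becomes the label
            node.tag = generic
    return element

# Namespace-free fragment used purely for illustration.
doc = ET.fromstring('<section id="s1"><p>Text<noteref id="n1"/></p></section>')
to_generic(doc)
```

After the call, the root is an `hcontainer` whose name attribute is "section", and the note reference is a `marker` whose name attribute is "noteref"; all other elements and attributes are left untouched.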
HTML elements and CSS rules
AKOMA NTOSO uses a
number of HTML elements for text fragments whose purpose is mainly
presentation-oriented. These include paragraphs, lists, images, tables,
and so on. Furthermore, as mentioned, even HTML elements have been placed
in the AKOMA-NTOSO namespace, so as to simplify the namespace
management.
Only a strict subset of the HTML language has been chosen, and no additional element should be added. In particular, headings (<H1>, <H2>
and so forth) cannot be used in AKOMA NTOSO documents, since they
enforce a flat organization of sections, which is against the
fundamentally hierarchical nature of AKOMA NTOSO documents. This is
compatible with future developments of the HTML language, in particular
considering that XHTML 2.0 will include nested hierarchies with <section> and <h> elements closely resembling AKOMA NTOSO <hcontainer> and <title> respectively.
All HTML elements have exactly the same nature and role as they have in HTML documents, with one exception: <div> is a generic container rather than a generic block as in HTML. This is due to the fact that a generic block already exists (<p>), and that in many automatically produced HTML documents (e.g., Open Office and MS Word), the <div> element is in fact used as a section separator (i.e., a container) rather than a paragraph.
The <div>, <p> and <span>
elements can be considered as additional generic elements for the
container, block and inline content models, and are in fact to be
considered absolutely equivalent to <container>, <block> and <inline> elements, using the class attribute instead of the name attribute.
All HTML elements
(and, in fact, all AKOMA NTOSO elements as well) can be optionally
enriched with standard HTML core attributes allowing CSS styles with
precise presentation instructions to be associated to them. The class
and style attributes can be used as in HTML for external or internal
CSS rules, liberally and without limitations on both HTML and AKOMA
NTOSO elements.
Metadata elements
The meta section
contains all the meta-information that needs or can be added to the
actual content of the document. As a rule, all editorial content (i.e.,
content added by the editorial process outside Parliament) needs to
be placed in the meta section, except for markup and note references.
Vice versa, all actual content of the document needs to have a place
outside of the meta section in the appropriate content sections.
All discourse and
all description of legal sources can be characterized as referring to
one of the four levels of a document as introduced by IFLA-FRBR
(International Federation of Library Associations and
Institutions-Functional Requirements for Bibliographic Records
http://www.ifla.org/VII/s13/frbr/frbr.pdf)
- WORK: the abstract concept of the legal resource (e.g., act 3 of 2005)
- EXPRESSION: any version of the WORK (whose content is specified and different from others for any reason: language, versions, etc.)
- MANIFESTATION: any electronic or physical format of the EXPRESSION: word, xml, Tiff, pdf, etc.
- ITEM: physical copy of any manifestation in the form of a file stored somewhere in some computer on the net or disconnected.
These levels
impact both on the metadata elements (each metadata element refers to
one and only one level of the four) and the identifiers (each level is
associated to a different identifier).
Meta elements are divided in eight subsections:
- Identification:
i.e., a set of information providing identification information about
each of the four FRBR levels, such as authorship, delivery date and
URI.
- Publication: all metadata elements specifying publication information about the document, such as issue and date of the official gazette.
- Classification:
a set of keywords belonging to a specified vocabulary (typically, a
thesaurus such as Eurovoc or similar) that describe the content of the
document and each individual fragment thereof.
- Lifecycle:
information about the events that the document has undergone, and
references to the documents that have caused these events. Lifecycle is
explained in section 10.
- Analysis:
a set of analytical statements about the document. Currently, these
only include information about active modifications (for amending
documents) and passive modifications (for amended documents), but can
in future expand to include detailed formal analysis of the contained
provisions.
- References:
a set of references to external entities explicitly or implicitly
mentioned in the document and in the metadata. These include both other
documents (amending, amended, referenced, referencing acts) and
instances of the AKOMANTOSO ontology.
- Notes:
this subsection contains the text of the editorial notes that might be
produced to comment and expand the actual text of the document. Note
references inside the text point to notes contained here.
- Proprietary:
this subsection allows any additional metadata to be specified in any
order and vocabulary (provided it uses a different namespace than
AKOMA-NTOSO). Proprietary metadata can be used within a specific
document management system to specify additional information useful for
internal search and document management that is not worth standardizing
and imposing across all AKOMA NTOSO implementations.
The development of the meta section is
not finished yet. For instance, support for Dublin Core metadata is
currently imperfect (there are semantic equivalences between Dublin
Core elements and AKOMA NTOSO elements, but they are not complete nor
officially described as equivalent).
Identifiers
Identifiers are
systematically used in AKOMA NTOSO. All AKOMA NTOSO elements allow an
identifier. Many relevant elements and sections require it.
Identifiers are the main way to identify fragments and parts of the
document in an unambiguous form. They can be used in document
references (e.g. links and amendment commands) as a precise pointer to
the actual part of the document mentioned (as opposed to simply
referring to a document as a whole). Also internal links need to use
identifiers. The schema does not explicitly provide a syntax for
identifiers, which is described here in human readable format.
Two kinds of identifiers are relevant to the schema:
- Document URIs: A resource is identified by a unique name according to the naming convention specified in section XX.
- Section identifiers: Identifiers
are composed by juxtaposing subidentifiers of the path needed to access
them. Legal documents provide explicit global numbering for sections
and articles, and local numbering for hierarchical subparts of them.
For instance, all parts in different sections are numbered starting
each time from 1, so "part 1" is not sufficient to clearly identify the
actual part, while "article 12" clearly points to a single and
well-specified element.
- Other concepts
dealt with by the Akoma Ntoso ontology also derive from the IFLA FRBR
ontology, and include but are not limited to individuals (Person),
organizations (Corporate Body), actions and occurrences (Event),
locations (Place), ideas (Concept) and physical objects (Object).
Amendments, versions and document lifecycle
AKOMA NTOSO 1.0
includes a sophisticated mechanism to keep track of the life cycle and
evolution of a legislative document. This is particularly useful for
acts that are amended and modified in time, while maintaining their
fundamental nature.
The management of evolution of a document makes two very important assumptions:
- Amendments
and events in the life cycle of a document (including original
approval, final repeal and any other event affecting its presence in
the law system or its content) happen in precise moments in time that
can be determined objectively (albeit with difficulty) and attributed a
specific date.
- Amendments
and events in the life cycle are due to the enactment of a specific,
individual document that can be objectively traced back and identified
with a URI. If two different documents affect the same act on the same
date, then these must be counted as two different and separate events
on the amended act.
Handling events in AKOMA NTOSO centres around the <lifecycle> and <references> elements in the <meta> section. The <lifecycle> element is used to list the dates of all the events affecting a document, while <references> contains
the URIs of all the documents generating these events. Each reference
is provided with a required identifier, which is used by the event list
to specify which document is responsible for which events. These
elements must appear in all documents that have undergone two or more
events (i.e., all acts except the ones that still have no amendments).
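The link between the event list and the list of references can be sketched as a simple lookup. In the Python example below only `<meta>`, `<lifecycle>` and `<references>` come from the text; the child element names (`event`, `entry`), the attribute names (`id`, `date`, `source`, `href`) and the URIs are hypothetical placeholders chosen for illustration, and namespaces are omitted.

```python
import xml.etree.ElementTree as ET

# Child element names, attribute names and URIs are illustrative
# assumptions, not taken from the schema.
META = """<meta>
  <lifecycle>
    <event id="e1" date="2005-03-01" source="r1"/>
    <event id="e2" date="2007-06-15" source="r2"/>
  </lifecycle>
  <references>
    <entry id="r1" href="/xx/act/2005-03-01/3"/>
    <entry id="r2" href="/xx/act/2007-06-15/9"/>
  </references>
</meta>"""

def events_with_sources(meta: ET.Element):
    """Pair each event date with the URI of the document that caused it."""
    uris = {ref.get("id"): ref.get("href") for ref in meta.find("references")}
    return [(ev.get("date"), uris[ev.get("source")]) for ev in meta.find("lifecycle")]

timeline = events_with_sources(ET.fromstring(META))
```

The required identifier on each reference is what makes the lookup possible: an event never repeats the URI of the responsible document, it only points to the reference that holds it.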
Documents in AKOMA
NTOSO are organized in three main categories, as specified in the
contain attribute of the document type element:
- OriginalVersion:
this value reflects the fact that the content of the document is
exactly the content that has been formally and explicitly approved by
the relevant authority, with no amendments applied.
- SingleVersion:
this value reflects the fact that the content of the document is an
editorially modified version of the original act, according to one or
more subsequent amendment acts. These amendment acts and the enactment
dates of the amendments must be all present in the <lifecycle> element. Individual additions and deletions are not necessarily marked in the content.
- MultipleVersions:
this value reflects the fact that the content of the document is the
juxtaposition of fragments belonging to two or more different versions
of the same act, each fragment marked as belonging to one or many of
these versions. Thus in a MultipleVersions act there could be two or
more copies of article 2, each associated to the date it started
enactment and ended enactment.
The <lifecycle>
element is a required element for all SingleVersion and MultipleVersions
documents, and must be complete up to the enactment date of the latest
document referenced in the <lifecycle>
element (i.e., there can potentially be subsequent amendments not
included in a SingleVersion or MultipleVersions document, but all
intermediate amendments must be correctly listed and referenced, even
if they play no part in the displayed content). OriginalVersion
documents need not have the <lifecycle> element, but can certainly have it if the editors so decide.
In case a
MultipleVersions document is being generated, each element and text
fragment may be associated with an enactment specification by means
of the three enactment attributes: start, end and status. Each fragment
(a whole element if appropriate, otherwise a newly inserted <span> or <inline> element if no exact containing element exists) uses these attributes to specify its nature.
The start and end
attributes contain an IDREF to the ID of the event that has marked the
beginning or the end of the enactment of the fragment. A start
attribute with no end attribute marks a fragment that has appeared in
an amendment and still exists in the latest recorded version of the
document. An end attribute with no start attribute marks a fragment that
was part of the original document but has been repealed before or at
the latest recorded version of the document. The status attribute
records the type of amendment of the fragment. The value "omissis" can
only be used by private editors that want to display only part of the
whole document. In this case, the structure must be complete anyway,
but the actual content can be removed if the status="omissis" attribute
is present.
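To illustrate how the start and end attributes support "point-in-time" views, here is a minimal Python sketch that selects the fragments in force on a given date. It is an illustration under stated assumptions, not the reference implementation: the <event> element name, its date attribute, the sample document and the function name fragments_in_force are hypothetical, and nested fragments are not handled (a child of a repealed element would still be reported).

```python
import xml.etree.ElementTree as ET
from datetime import date

# Hypothetical, simplified MultipleVersions document: article 2 exists in
# two copies, one ended by event e2 and one started by event e2.
SAMPLE = """
<act contain="MultipleVersions">
  <lifecycle>
    <event id="e1" date="2000-01-01"/>
    <event id="e2" date="2003-06-01"/>
  </lifecycle>
  <body>
    <article id="art1">Unchanged text.</article>
    <article id="art2-v1" end="e2">Original article 2.</article>
    <article id="art2-v2" start="e2">Amended article 2.</article>
  </body>
</act>
"""


def fragments_in_force(doc_xml, when):
    """Return the ids of body fragments that are in force on the date `when`."""
    root = ET.fromstring(doc_xml)
    # Resolve each event ID to its date (assumed ISO format, YYYY-MM-DD).
    event_dates = {
        ev.get("id"): date.fromisoformat(ev.get("date"))
        for ev in root.iter("event")
    }
    in_force = []
    for el in root.find("body").iter():
        start, end = el.get("start"), el.get("end")
        # No start: the fragment was part of the original document.
        started = start is None or event_dates[start] <= when
        # No end: the fragment still exists in the latest recorded version.
        ended = end is not None and event_dates[end] <= when
        if el.get("id") and started and not ended:
            in_force.append(el.get("id"))
    return in_force


print(fragments_in_force(SAMPLE, date(2001, 1, 1)))
print(fragments_in_force(SAMPLE, date(2004, 1, 1)))
```

With the sample above, the first call selects art1 and the original copy of article 2, and the second selects art1 and the amended copy. The sketch treats the date of the end event as the first day on which the fragment is no longer in force; a real system would need to confirm that convention against the schema documentation.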
5.3.
AKOMA NTOSO Detailed Schemas
Specialized schemas for all document types of individual countries
The AKOMA NTOSO Detailed Schemas are a number
of specialized schemas (in both DTD and XML Schema) for all the
document types of individual countries. As deemed necessary, they can
include up to one schema for each relevant African country of each of
the following document types:
- act – Post Enactment Legislation
- bill – Pre-Enactment Legislation
- debate record – Parliamentary Debate Record
- report – Parliamentary Order Paper
- doc – Miscellaneous Parliamentary Document
- judgement – Court's Judgements
Although
customized to deal with the fundamental structures of the corresponding
document types in each national parliamentary system, each AKOMA NTOSO
DS contains a subset of the structures contained in the GS and
therefore is completely upward compatible with it: every document valid
according to a specific DS is also valid according to the GS.
6.
Metadata Conventions
Metadata have often been proposed to help in
organizing data, managing documents effectively and obtaining better
results with search engines. Data that do not have accompanying
metadata are often hard to find, difficult to access, troublesome to
integrate, and perplexing to understand or interpret. Furthermore, as
time passes, undocumented data may lose their value and relevant
memories can dissolve without trace.
Metadata
are structured data describing facts about documents, in such a way as
to help users make sense of their content, their relationships and
their history. Careful decisions about which structured data to use for
describing which facts about documents determine the identification of metadata schemas, or, under some circumstances, ontologies.
Metadata and ontologies are a way of organizing data about data, or
information used to retrieve information, in such a way as to help in
organizing, understanding and searching facts within huge quantities of
documents.
Metadata provide improved
reliability of searches, support for workflow processes, data
filtering, support for inventory of what information any organization
holds, inter-organizational consistency in describing shared facts and
documents, interoperability and long-term organization memory.
The
Dublin Core is probably the most widespread and famous schema for
metadata structures as applied to electronic documents. Yet, its
generality limits the flexibility of metadata for the scopes and extent
of applications internal to an organization. For this reason, many
metadata schemas build on the Dublin Core, but at the same time they
actually extend it for purposes of local interest only. Far more
interesting appear to be initiatives such as the one sponsored by the
International Federation of Library Associations (IFLA), called
Functional Requirements for Bibliographic Records (FRBR,
http://www.ifla.org/VII/s13/frbr/frbr.pdf), which tries to capture
several different natures in documents, such as the persistent
characteristics of different versions of the same document, as
expressed in the WORK/EXPRESSION/MANIFESTATION/ITEM specification.
It is also important to mention the Semantic Web: within the
World Wide Web Consortium it is a recent initiative, strongly
backed by the director of W3C itself, Tim
Berners-Lee, the inventor of the Web, aimed at providing software with
the features to "understand" documents, rather than simply "display"
them. The idea behind the Semantic Web is to create sophisticated
applications that can derive new knowledge and exhibit complex
behaviour based on formalized statements about the content of the
documents, and expressed in terms of metadata accompanying the
documents themselves.
6.1.
Naming Convention
to identify metadata unambiguously without name clashes
A mechanism is needed to
ensure that metadata belonging to the core set can be identified
unambiguously and that proprietary metadata can be added to the various
AKOMA NTOSO document models without name clashes.
Metadata
names are constructed using a prefix pointing to the country code of
the relevant emanating body (such as the Parliament). It is assumed
that name clashes in metadata items within a single country are
resolved before being committed to actual AKOMA NTOSO documents.
AKOMA NTOSO assumes the following conventions regarding metadata items:
- Items
without prefixes belong to the
|