Schema

This section introduces and explains the schema for Akoma Ntoso 1.0 and provides details about the Akoma Ntoso 1.0 document structure, elements and attributes. The design guidelines and conceptual assumptions made in the development of the schemas are discussed, together with some hints for document markup using this schema.

Global Overview of the schema

Akoma Ntoso uses two kinds of schema:

  • the Akoma Ntoso general schema is the vocabulary of elements and attributes, as well as the minimal set of constraints that all Akoma Ntoso documents must comply with.
  • the Akoma Ntoso custom schemas are an open set of stricter schemas providing more constraints over the same vocabulary of elements and attributes depending on the wishes of their designers. They are meant to enforce the rules of specific document types in specific Parliaments, and as such may enforce a larger number of requirements and constraints than the general schema. It is a requirement of Akoma Ntoso that all custom schemas are restrictions of the general schema, i.e., that all documents satisfying a custom schema must also satisfy the general schema.

In this document the general schema is described in full, and all rules, except when explicitly specified, are meant to refer to the general schema (and thus are applicable to all Akoma Ntoso documents).

Akoma Ntoso documents are completely qualified, i.e., namespaces are used throughout the grammar. Some elements share their names with HTML elements, and are in fact drawn directly from the HTML vocabulary, but they are not qualified within the HTML namespace: for simplicity it was decided to use one namespace only, so that all elements are similarly qualified. The net result is that it is possible to specify the Akoma Ntoso namespace as the default namespace and have no prefixes in the instance document, while maintaining full qualification of the documents.
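
As a minimal sketch of this point, the following Python snippet parses a small instance that declares the Akoma Ntoso 1.0 namespace as its default namespace. The sample markup is illustrative only and has not been validated against the schema. No prefix appears in the instance, yet the parser reports every element, including the HTML-derived b, as qualified in the single Akoma Ntoso namespace.

```python
import xml.etree.ElementTree as ET

AN_NS = "http://www.akomantoso.org/1.0"  # namespace stated in this document for version 1.0

# Illustrative instance: the default namespace is declared once on the root,
# so no element in the document needs a prefix.
SAMPLE = f"""<akomaNtoso xmlns="{AN_NS}">
  <act>
    <body>
      <section id="sec1">
        <content><p>Some <b>bold</b> text.</p></content>
      </section>
    </body>
  </act>
</akomaNtoso>"""

root = ET.fromstring(SAMPLE)

# ElementTree reports qualified names in the form "{namespace}localname".
qualified_names = [el.tag for el in root.iter()]
all_in_an_namespace = all(name.startswith("{" + AN_NS + "}") for name in qualified_names)

print(all_in_an_namespace)
```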

The namespace for version 1.0 of Akoma Ntoso is “http://www.akomantoso.org/1.0”. This namespace, and the qualification “Akoma Ntoso 1.0”, have been used throughout a number of dated releases of Akoma Ntoso. Starting from July 2011, a new version of Akoma Ntoso, version 2.0, has been released; it introduces important and backward-incompatible modifications to the schema under a new namespace, “http://www.akomantoso.org/2.0”. Akoma Ntoso 2.0 is also undergoing a sequence of releases, each carefully dated. Thus, to clearly identify the specific version of the schema used for a document, one must refer both to the version number and to the release date of the schema. It is to be noted, though, that all documents written using any release of Akoma Ntoso 1.0 or any release of Akoma Ntoso 2.0 should be correct and valid (although possibly not as specific) according to the schemas describing subsequent releases of the same version, including the most recent ones. Furthermore, it may often be the case that documents valid for an Akoma Ntoso 1.0 schema are also perfectly valid for Akoma Ntoso 2.0 (or that they can be modified with limited effort to this end) with just the specification of the new namespace.
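
The following sketch, under the assumption that only the two namespace URIs quoted above need to be recognised, shows how a tool might read the major version from the root element. The function name schema_version is hypothetical. Note that the namespace alone cannot reveal the release date, which has to be recorded separately.

```python
import xml.etree.ElementTree as ET

# Namespace URIs quoted in this section, mapped to the major version they identify.
KNOWN_NAMESPACES = {
    "http://www.akomantoso.org/1.0": "1.0",
    "http://www.akomantoso.org/2.0": "2.0",
}

def schema_version(xml_text):
    """Return the major version implied by the root namespace, or None if unknown."""
    root = ET.fromstring(xml_text)
    if not root.tag.startswith("{"):
        return None  # the root element is not namespace-qualified
    namespace = root.tag[1:].split("}", 1)[0]
    return KNOWN_NAMESPACES.get(namespace)

print(schema_version('<akomaNtoso xmlns="http://www.akomantoso.org/2.0"/>'))
```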

On 20 January 2013, Akoma Ntoso 3.0 CSD01 was released within the OASIS LegalDocML TC.

General Schema: Patterns and Content Models

patterns are the abstraction and distillation of past experiences

Patterns are the abstraction and distillation of past experiences in designing and resolving design problems. They are general and widely applicable guidelines for approaching and justifying design issues that often occur in XML-based projects.

We distinguish between element categories (or patterns in content models, which are a restriction of content models to the ones that are actually useful in real-life text documents) and best practices (or patterns in schema design, which are guidelines on how to make a schema more modular, flexible and understandable by users). Both types of patterns are well known and well established in the literature about markup languages.

The Akoma Ntoso general schema is systematically based on six element categories. This means that all content models and complex types used in the schema follow precisely the form of the relevant category, and all elements can be simply described and treated according to their category rather than individually.

These categories are:

  1. The marker: markers are content-less elements that are meaningful for a combination of their position within the text, their names and their attributes. There are two main families of markers in the Akoma Ntoso schema: placeholders in the text content (e.g., note references) that can appear in any position where text can be found, and metadata elements that only appear in some subsection of the metadata section.
    <noteRef n="1" href="#note1"/>
  2. The inline: an inline element is a container of text placed within a block to indicate some special semantic or structural characteristics associated with it. For instance, a reference within an act’s section, the name of a speaker within a hansard and the official date in the heading of a bill are all examples of inline elements, as are bold or italic text within a paragraph. In the example given under the next item, both def and ref are inline elements contained within a block and mixed with plain text.
  3. The block: a block element is a container of text that stands autonomously and can contain inline elements. In Western countries blocks are presented as stacked vertically over each other. Examples of blocks are paragraphs, scenes, headings, etc. In the following example, p is a block containing both text and inline elements.
    <p><def>Commission</def> means the commission established by <ref href="#sect20">section 20</ref></p>
  4. The container: a container is a grouping element that collects elements of other types and gives them a cumulative name. It has no presentation requirements, but provides justification to other presentational elements. Containers are all different from each other (since the actual list of contained elements varies), and so there is no single container content model, but rather a number of content models that belong to the container category. In the following example, act is a container element containing meta, preamble, body and attachments. Although not shown in the example, meta, preamble and attachments are containers themselves, while section is a hierarchical container, as explained in the next item.
    <act>
      <meta> ... </meta>
      <preamble> ... </preamble>
      <body> ... </body>
      <attachments> ... </attachments>
    </act>
  5. The hierarchical container, or hcontainer: these are special containers that form a hierarchy of containment. A hierarchy is a set of arbitrarily deep nested sections with title and numbering. Each level of the nesting can contain either more nested sections or blocks. No text is allowed directly inside the hierarchy, but only within the appropriate block element (or, of course, headings and numberings). Akoma Ntoso uses only one hierarchy, with predefined names and no constraints on their order or systematic layering. In the following example, body, chapter, paragraph and clause are sections, while p is the block that holds the text at the lowest level.
    <body>
      <chapter id="cha1">
        <num>Chapter 1</num>
        <heading>Traditional communities and ...</heading>
        <paragraph id="cha1-par2">
          <num>2</num>
          <heading>Recognition of traditional ...</heading>
          <clause id="cha1-par2-cla1">
            <num>1</num>
            <content>
              <p>A community may be recognised as …</p>
            </content>
          </clause>
          ...
        </paragraph>
        ...
      </chapter>
      ...
    </body>
  6. The popups, i.e., those elements that, within an inline flow of text, create full and fully independent structures that do not meddle nor interact with the text and inline elements that surround them. The quotedStructure element is an example of the popup category:
    <body>
      <section id="sec1">
        <heading>Amendment of section 4 of Decree 43 of 1990 (Ciskei)</heading>
        <clause id="sec1-cla1">
          <num>1.</num>
          <content>
            <p> Section 4 of the Supreme Court Decree, 1990 (Ciskei), is hereby amended by the substitution for subsection (2) of the following subsection:
              <mod id="sec1-cla1-mod1">"
                <quotedStructure id="sec1-cla1-mod1-qst1">
                  <subsection id="sec1-cla1-mod1-qst1-sec4-ssc2">
                    <num>(2)</num>
                    <content>
                      <p> Notwithstanding the provision of section [1]...</p>
                    </content>
                  </subsection>
                </quotedStructure>
              "</mod>
            </p>
          </content>
        </clause>
      </section>
    </body>
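
The rule that no text may sit directly inside the hierarchy lends itself to a mechanical check. The sketch below is an illustration, not part of the schema: the set HCONTAINER_NAMES is a small subset of hierarchical names taken from the examples above, the function name is hypothetical, and the samples omit the namespace for brevity.

```python
import xml.etree.ElementTree as ET

# Illustrative subset of hierarchical element names drawn from the examples above;
# the real schema defines more of them.
HCONTAINER_NAMES = {"chapter", "section", "subsection", "paragraph", "clause"}

def stray_text_in_hierarchy(xml_text):
    """Return the ids of hierarchical elements that hold text directly."""
    offenders = []
    for el in ET.fromstring(xml_text).iter():
        if el.tag not in HCONTAINER_NAMES:
            continue
        # Direct text is the text before the first child plus the tail after each child.
        direct_text = [el.text or ""] + [child.tail or "" for child in el]
        if any(chunk.strip() for chunk in direct_text):
            offenders.append(el.get("id"))
    return offenders

GOOD = """<body>
  <chapter id="cha1">
    <num>Chapter 1</num>
    <paragraph id="cha1-par2">
      <num>2</num>
      <content><p>A community may be recognised.</p></content>
    </paragraph>
  </chapter>
</body>"""

# Same document with text placed directly inside the paragraph hcontainer.
BAD = GOOD.replace("<num>2</num>", "<num>2</num> loose text")

print(stray_text_in_hierarchy(GOOD))
print(stray_text_in_hierarchy(BAD))
```

Text inside num, heading and p is not flagged, since those elements are not themselves part of the hierarchy.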
    

There are two exceptions to the systematic use of patterns:

  • The <li> element allows both inlines and other nested lists (<ul> and <ol>). The pattern would require <li> elements to contain only text, and nested lists to be direct children of the main list element (<ul> elements within <ul>). Since this goes against universal HTML practice, it was decided against full adherence to the category and in favour of the HTML tradition.
  • There are some inline elements that only make sense in the preface and preamble parts of the document: such as <docTitle>, <docNumber> etc. They are in fact part of the unique existing inline content model and thus are technically available everywhere in the document, rather than simply within the preamble. There is no simple way to define blocks within <preamble> and <preface> to allow these elements and blocks elsewhere to not allow them, so it is better to allow them everywhere rather than unnecessarily complicating the schema.

Regarding guidelines, or patterns in schema design, whenever there has been a design choice to be made that was not immediately obvious and naturally acceptable, a relevant pattern has been sought and properly used. You can find the relevant mentions within the schema itself, in comments and in the following documentation.

The general Schema

components of the general schema

All Akoma Ntoso documents share the same root element <akomaNtoso>, under which the specific document type is selected. The single root element follows a specific design pattern “Universal root” aimed at better identification of the root and separation of namespace and schema declaration (available in the root) and meaningful attributes (available in the document type element).
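
A practical consequence of the universal root is that a processor can always start from the same element and then dispatch on the document type found beneath it. The following sketch assumes that the document type element is the first child of the root; the helper names local_name and document_type are hypothetical.

```python
import xml.etree.ElementTree as ET

def local_name(tag):
    """Strip the "{namespace}" part that ElementTree puts in front of a tag."""
    return tag.rsplit("}", 1)[-1]

def document_type(xml_text):
    """Return the local name of the document type element under the root.

    Assumption: the document type element is the first child of <akomaNtoso>.
    """
    root = ET.fromstring(xml_text)
    if local_name(root.tag) != "akomaNtoso":
        raise ValueError("root element is not akomaNtoso")
    if len(root) == 0:
        raise ValueError("root element has no document type element")
    return local_name(root[0].tag)

SAMPLE = '<akomaNtoso xmlns="http://www.akomantoso.org/1.0"><act/></akomaNtoso>'
print(document_type(SAMPLE))
```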

Types, Attributes and Groups

The schema starts with a few groups and attribute groups used throughout the schema for content models and types. They are followed by common simple types (mostly enumerations of string values) and complex types. Complex types in this section include those supporting five of the six main content model categories used throughout this schema:

  1. hierarchy
  2. blocks 
  3. inline 
  4. marker
  5. popup

The sixth content model category, container, has no common form, but each instance lists different elements in different orders, and individual container-like complex patterns are spread throughout the schema. Content model patterns are described in the section “Content Models used in the General Schema”.

Elements

After the most important and shared groups and types, the Akoma Ntoso schema introduces the elements, which are organized in the following sequence:

  1. The root element <akomaNtoso>

Figure: Akoma Ntoso root element

  2. The document elements, one for each document type (<act>, <bill>, <judgement>, <debateReport>, <debate>, <amendmentList>, <officialGazette>, <documentCollection>, <amendment> and <doc>), that share one of the four document formats: hierarchicalStructure, which has an explicit hierarchy inside; openStructure, which allows basically everything inside; debateStructure, a slightly hierarchical structure for minutes and reports; and judgementStructure, a flat document type composed of named sections of paragraphs.
  3. The container elements, one for each main part of the above-mentioned structures, except for clauses, described next, and meta, described in the appropriate section.
  4. The hierarchical elements, listing the main elements that are used in the full hierarchy of nested structures of acts and bills, as well as the corresponding headings (e.g., <heading>, <num>, etc.).
  5. Elements for the subdivisions of a parliamentary debate, such as <administrationOfOath>, <declarationOfVote>, <communication>, <petitions>, <papers>, <noticesOfMotion>, <questions>, <address>, <proceduralMotions>, <pointOfOrder>, <adjournment>, <rollCall>, <prayers>, <oralStatements>, <writtenStatements>, <personalStatements>, <ministerialStatements>, <resolutions>, and <nationalInterest>, which are then filled with the elements of the specific speeches of the debate.
  6. Elements for the speeches of a debate, and particularly <speech>, <question>, <scene>, <narrative>, <summary>.
  7. Elements for judgements and open structures, particularly <blockList> and <item>.
  8. Akoma Ntoso specific block and inline elements, including the table of content (<toc>), the normative reference (<ref>), the defined term in a definition (<def>), the note marker (<noteRef>) pointing to an editorial note placed out of line (in the meta section), the recorded time of a spoken remark (<recordedTime>), the container for amendments (<mod>), and the narrative part of the debate (<remark>).

 

Figure: an example of an inline element, the <ref> element

  9. Generic elements: the list of available generic elements (one for each of the five main patterns for content models), explained in detail in a separate section.
  10. HTML elements: the list of elements, directly derived from HTML, used to provide for presentation-oriented, rather than semantic-oriented, markup within Akoma Ntoso documents. They form a very strict simplification of the HTML language, but allow for many useful structures inside a legislative act. HTML elements and how to use them in Akoma Ntoso are described in the section “HTML elements and CSS rules”.
  11. Metadata elements provide a location for all relevant information about an Akoma Ntoso document that does not belong to its actual content. Metadata are thus all, by definition, editorial additions to the text that originally composed the document. Metadata are described in a separate section.

Design details

Generic Elements

Akoma Ntoso 1.0 strongly supports the idea of using semantically rich terms whenever a semantically justifiable text fragment exists in the document. This means that, although we expect that most usual needs in markup of legal documents will have their precise corresponding element at hand, it is possible that some users of Akoma Ntoso may sometimes need more elements than are currently provided.

Generic elements come to aid in this respect. Whenever a new semantic is needed to describe a text fragment, a generic element of the appropriate content model is used instead, and the correct label is specified in the name attribute.

It is strongly discouraged to use presentation-oriented elements (such as b, i, etc.) to emphasize fragments that do have a semantic justification for being emphasized. Also, each text fragment needs to be enclosed within the appropriate generic element according to its position and content model, which is the reason for there being five generic elements (one for each content model pattern).

Finally, an explicit equivalence is provided between named elements and generic elements: all named elements are just generic elements in disguise, the value of the name attribute having been upgraded to being the full element name. Therefore, for instance, <section> is absolutely equivalent to <hcontainer name="section">, and <noteRef> is equivalent to <marker name="noteRef">.
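
The equivalence can be sketched as a simple rewriting step. In the snippet below the function name to_generic is hypothetical, and the caller supplies the generic element name (hcontainer, marker and so on) because the mapping from element names to categories lives in the schema, not in the instance.

```python
import xml.etree.ElementTree as ET

def to_generic(element, generic_name):
    """Rewrite a named element as the generic element of the given category."""
    # The former element name becomes the value of the name attribute.
    generic = ET.Element(generic_name, {"name": element.tag, **element.attrib})
    generic.text, generic.tail = element.text, element.tail
    generic.extend(list(element))  # children are carried over unchanged
    return generic

section = ET.fromstring('<section id="sec1"><num>1</num></section>')
generic = to_generic(section, "hcontainer")
print(ET.tostring(generic, encoding="unicode"))
```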

This in turn means that it is possible to reverse the approach and, after a revision process, officially enrich the Akoma Ntoso language with new elements that have been used in the past as values for the name attribute of generic elements.

HTML elements and CSS rules

Akoma Ntoso uses a number of HTML elements for text fragments whose purpose is mainly presentation-oriented. These include paragraphs, lists, images, tables, and so on. Furthermore, as mentioned, even HTML elements have been placed in the Akoma Ntoso namespace, so as to simplify namespace management.

Only a strict subset of the HTML language has been chosen, and no additional element should be added. In particular, headings (<H1>, <H2> and so forth) cannot be used in Akoma Ntoso documents, since they enforce a flat organization of sections, which is against the fundamentally hierarchical nature of Akoma Ntoso documents. This is compatible with future developments of the HTML language, in particular considering that future versions of HTML may include nested hierarchies with <section> and <h> elements closely resembling Akoma Ntoso <hcontainer> and <title> respectively.

All HTML elements have exactly the same nature and role as they have in HTML documents, with one exception: <div> is a generic container rather than a generic block as in HTML. This is due to the fact that a generic block already exists (<p>), and that in many automatically produced HTML documents (e.g., Open Office and MS Word), the <div> element is in fact used as a section separator (i.e., a container) rather than a paragraph.

The <div>, <p> and <span> elements can be considered as additional generic elements for the container, block and inline content models, and are in fact to be considered absolutely equivalent to <container>, <block> and <inline> elements, using the class attribute instead of the name attribute.

All HTML elements (and, in fact, all Akoma Ntoso elements as well) can be optionally enriched with standard HTML core attributes allowing CSS styles with precise presentation instructions to be associated to them. The class and style attributes can be used as in HTML for external or internal CSS rules, liberally and without limitations on both HTML and Akoma Ntoso elements.

Metadata elements

The meta section contains all the meta-information that needs to be, or can be, added to the actual content of the document. As a rule, all editorial content (i.e., content added by the editorial process outside Parliament rooms) needs to be placed in the meta section, except for markup and note references. Vice versa, all actual content of the document needs to have a place outside of the meta section in the appropriate content sections.

All discourse and all description of legal sources can be characterized as referring to one of the four levels of a document as introduced by IFLA-FRBR (International Federation of Library Associations and Institutions, Functional Requirements for Bibliographic Records, http://www.ifla.org/VII/s13/frbr/frbr.pdf):

  • WORK: the abstract concept of the legal resource (e.g., act 3 of 2005)
  • EXPRESSION: any version of the WORK (whose content is specified and different from others for any reason: language, versions, etc.)
  • MANIFESTATION: any electronic or physical format of the EXPRESSION: word, xml, Tiff, pdf, etc.
  • ITEM: physical copy of any manifestation in the form of a file stored somewhere in some computer on the net or disconnected.

These levels impact both on the metadata elements (each metadata element refers to one and only one level of the four) and the identifiers (each level is associated to a different identifier).

Meta elements are divided into nine subsections:

  • Identification: i.e., a set of information providing identification information about each of the four FRBR levels, such as authorship, delivery date and URI.
  • Publication: all metadata elements specifying publication information about the document, such as issue and date of the official gazette.
  • Classification: a set of keywords belonging to a specified vocabulary (typically, a thesaurus such as Eurovoc or similar) that describe the content of the document and each individual fragment thereof.
  • Lifecycle: information about the events that the document has undergone, and references to the documents that have caused these events. Lifecycle is explained in section 10.
  • Analysis: a set of analytical statements about the document. Currently, these only include information about active modifications (for amending documents) and passive modifications (for amended documents), but can in future expand to include detailed formal analysis of the contained provisions.
  • TemporalInfo: a set of temporal arguments for defining the intervals of time linked to the life-cycle of the document or with the legislative process workflow.
  • References: a set of references to external entities explicitly or implicitly mentioned in the document and in the metadata. These include both other documents (amending, amended, referenced, referencing acts) and instances of the Akoma Ntoso ontology.
  • Notes: this subsection contains the text of the editorial notes that might be produced to comment and expand the actual text of the document. Note references inside the text point to notes contained here.
  • Proprietary: this subsection allows any additional metadata to be specified in any order and vocabulary (provided it uses a different namespace than Akoma Ntoso). Proprietary metadata can be used within a specific document management system to specify additional information useful for internal search and document management that is not worth standardizing and imposing across all Akoma Ntoso implementations.

Identifiers

Identifiers are systematically used in Akoma Ntoso. All Akoma Ntoso elements allow an identifier. Many relevant elements and sections require it. Identifiers are the main way to refer to fragments and parts of the document in an unambiguous form. They can be used in document references (e.g. links and amendment commands) as a precise pointer to the actual part of the document mentioned (as opposed to simply referring to a document as a whole). Also internal links need to use identifiers. The schema does not explicitly provide a syntax for identifiers, which is described here in human readable format.

Two kinds of identifiers are relevant to the schema:

  • Document URIs: A resource is identified by a unique name according to the naming convention of Akoma Ntoso (continually updated in the Release Notes of each release).
  • Section identifiers: Identifiers are composed by juxtaposing subidentifiers of the path needed to access the corresponding elements. Legal documents provide explicit global numbering for sections and articles, and local numbering for hierarchical subparts of them. For instance, all parts in different sections are numbered starting each time from 1, so “part 1” is not sufficient to clearly identify the actual part, while “section 12” clearly points to a single and well-specified element.
  • Other concepts dealt with by the Akoma Ntoso ontology also derive from the IFLA FRBR ontology, and include but are not limited to individuals (Person), organizations (Corporate Body), actions and occurrences (Event), locations (Place), ideas (Concept) and physical objects (Object).
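
The juxtaposition rule for section identifiers can be sketched as follows. The abbreviations (cha, par, cla) are the ones that appear in the markup examples of this section; the hyphen separator is inferred from those same examples, and the function names are hypothetical.

```python
def section_id(path):
    """Join (abbreviation, number) pairs into a nested section identifier."""
    return "-".join(f"{abbr}{num}" for abbr, num in path)

def parent_id(identifier):
    """Return the identifier of the enclosing element, or None at the top level."""
    head, sep, _ = identifier.rpartition("-")
    return head if sep else None

clause = section_id([("cha", 1), ("par", 2), ("cla", 1)])
print(clause)             # cha1-par2-cla1
print(parent_id(clause))  # cha1-par2
```

Because each identifier carries the whole path, a locally numbered element such as “clause 1” remains unambiguous across the document.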

Amendments, versions and document life cycle

Akoma Ntoso 1.0 includes a sophisticated mechanism to keep track of the life cycle and evolution of a legislative document. This is particularly useful for acts that are amended and modified in time, while maintaining a continuity in time of their fundamental nature.

Managing the evolution of a document requires two very important assumptions: that amendments and events in the life cycle of a document (including original approval, final repeal and any other event affecting its presence in the law system or its content)

  • happen in precise moments in time that can be determined objectively (albeit possibly with difficulty) and attributed to a specific date.
  • are due to the enactment of a specific, individual document that can be objectively traced and identified with a URI. If two different documents affect the same act on the same date, then these must be counted as two different and separate events on the amended act.

Handling events in Akoma Ntoso centres around the <lifecycle> and <references> elements in the <meta> section. The <lifecycle> element is used to list the dates of all the events affecting a document, while <references> contains the URIs of all the documents generating these events. Each reference is provided with a required identifier, which is used by the event list to specify which document is responsible for which events. These elements must appear in all documents that have undergone two or more events (i.e., all acts except the ones that still have no amendments).
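
The link between the event list and the references can be modelled independently of the XML syntax. The sketch below is a plain data model, not the schema's own structure: the class names, field names, identifiers and URIs are all hypothetical placeholders. It reports events whose source does not resolve to any reference identifier.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Reference:
    id: str    # required identifier of the reference
    uri: str   # URI of the document that generated the event (placeholder values below)

@dataclass(frozen=True)
class Event:
    id: str
    when: date   # the precise date the event is attributed to
    source: str  # identifier of the Reference responsible for the event

def unresolved_sources(events, references):
    """Return ids of events whose source matches no reference identifier."""
    known = {ref.id for ref in references}
    return [ev.id for ev in events if ev.source not in known]

references = [
    Reference("ref-original", "/example/original-act"),
    Reference("ref-amend1", "/example/first-amending-act"),
]
events = [
    Event("e1", date(2005, 1, 10), "ref-original"),
    Event("e2", date(2007, 6, 1), "ref-amend1"),
    Event("e3", date(2009, 3, 15), "ref-amend2"),  # no matching reference
]

print(unresolved_sources(events, references))
```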

Documents in Akoma Ntoso are organized in three main categories, as specified in the contains attribute of the document type element:

  1. OriginalVersion: this value reflects the fact that the content of the document is exactly the content that has been formally and explicitly approved by the relevant authority, with no amendments applied.
  2. SingleVersion: this value reflects the fact that the content of the document is an editorially modified version of the original document, according to one or more subsequent amendments. These amendments and the enactment dates of the amendments must be all present in the <lifecycle> element. Individual additions and deletions are not necessarily marked in the content. 
  3. MultipleVersions: this value reflects the fact that the content of the document is the juxtaposition of fragments belonging to two or more different versions of the same document, each fragment marked as belonging to one or many of these versions. Thus in a MultipleVersions document there could be two or more copies of article 2, each associated to the date it started enactment and ended enactment.

The <lifecycle> element is a required element for all SingleVersion and MultipleVersions documents, and must be complete up to the enactment date of the latest document referenced in the <lifecycle> element (i.e., there can potentially be subsequent amendments not included in a SingleVersion or MultipleVersions document, but all intermediate amendments must be correctly listed and referenced, even if they play no part in the displayed content). OriginalVersion documents need not have the <lifecycle> element, but can certainly have it if the editors so decide.
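
A minimal sketch of this requirement follows. It assumes that the document type element is the first child of the root and that <lifecycle> sits directly under <meta>; the comparison is case-insensitive because the exact spelling of the contains values in a given schema release is not restated here. The function name is hypothetical and the samples omit the namespace for brevity.

```python
import xml.etree.ElementTree as ET

# Categories for which this section requires a <lifecycle> element (lower-cased).
VERSIONS_REQUIRING_LIFECYCLE = {"singleversion", "multipleversions"}

def lifecycle_missing(xml_text):
    """True if the document's category requires <lifecycle> but none is present."""
    root = ET.fromstring(xml_text)
    doc = root[0]  # assumed to be the document type element, e.g. <act>
    category = (doc.get("contains") or "").lower()
    has_lifecycle = doc.find("meta/lifecycle") is not None
    return category in VERSIONS_REQUIRING_LIFECYCLE and not has_lifecycle

AMENDED = '<akomaNtoso><act contains="SingleVersion"><meta/></act></akomaNtoso>'
ORIGINAL = '<akomaNtoso><act contains="OriginalVersion"><meta/></act></akomaNtoso>'

print(lifecycle_missing(AMENDED))
print(lifecycle_missing(ORIGINAL))
```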

When a MultipleVersions document is generated, each element and text fragment may be associated with an enactment specification by means of the three enactment attributes: start, end and status. Each fragment (a whole element if appropriate, otherwise a newly inserted <span> or <inline> element if no exact containing element exists) uses these attributes to specify its nature.

The start and end attributes contain an IDREF to the ID of the event that marked the beginning or the end of the enactment of the fragment. A start attribute with no end attribute marks a fragment that has appeared in an amendment and still exists in the latest recorded version of the document. An end attribute with no start attribute marks a fragment that was part of the original document but has been repealed before or at the latest recorded version of the document. The status attribute records the type of amendment of the fragment. The value omissis can only be used by private editors who want to display only part of the whole document. In this case, the structure must be complete anyway, but the actual content can be removed if the status="omissis" attribute is present.
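
The reading of start and end described above can be sketched as a small selection function. The representation is hypothetical: fragments are plain dictionaries and events are resolved through a dictionary of dates. The sketch also assumes that a fragment stops being in force on the very date of its end event, which is one possible interpretation and not something the schema states.

```python
from datetime import date

def in_force(fragment, event_dates, on):
    """Decide whether a fragment belongs to the version in force on a date.

    fragment: dict with optional "start" and "end" keys holding event ids.
    event_dates: dict mapping event id -> date of that event.
    """
    start = fragment.get("start")
    end = fragment.get("end")
    # No start means the fragment was already in the original document.
    started = start is None or event_dates[start] <= on
    # Assumption: the end event takes effect on its own date.
    ended = end is not None and event_dates[end] <= on
    return started and not ended

event_dates = {"e1": date(2005, 1, 10), "e2": date(2007, 6, 1)}
old_article_2 = {"end": "e2"}    # original text, repealed by event e2
new_article_2 = {"start": "e2"}  # text introduced by event e2

for day in (date(2006, 1, 1), date(2008, 1, 1)):
    print(day, in_force(old_article_2, event_dates, day), in_force(new_article_2, event_dates, day))
```

With two copies of the same article marked this way, exactly one of them is selected for any date, which is how a single version can be extracted from a MultipleVersions document.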

Akoma Ntoso Custom Schemas

specialized schemas for all document types of individual countries

The Akoma Ntoso Custom Schemas are a number of specialized schemas (in XML Schema, but a non-authoritative DTD version is available as well) that contain special additional rules specific to document types that are peculiar to individual countries. As deemed necessary, a custom schema can be created for each country and each of the allowed document types.

Although customized to deal with the fundamental structures of the corresponding document types in each national parliamentary system, each Akoma Ntoso Custom Schema contains a number of additional constraints on the same structures of the general schema and therefore is completely upward compatible with it: all documents valid according to a specific custom schema are also valid according to the general schema.