Customizing Akoma Ntoso: modularization, restrictions, extensions
Restricting Akoma Ntoso without a custom schema
In this section we introduce the three mechanisms for creating restrictions in Akoma Ntoso that require no modifications or additions to the current set of schemas.
Again, it is worth recalling that a restriction of an XML schema is closely related to the concept of subset in set theory. A schema implicitly identifies a set of valid documents, so a restriction of that schema must identify a subset of that set: every document that is valid according to the restriction must also be valid according to the main schema.
The first and most obvious type of restriction is simply to avoid using undesired elements and attributes. Since a restriction basically means eliminating all unneeded optional elements and attributes from actual documents, the desired restriction can be obtained exactly by not using the optional elements that are not needed in the specific context, without affecting the schema at all.
On the other hand, the main reason for using a schema is to have a mechanism that verifies whether XML documents follow the specified rules, and self-constrained markup has no supporting tool to verify that the self-constraint is actually being respected. Self-constrained markup therefore makes sense either when the individual author of the XML markup is also the author of the self-constraint rules, or when it is possible to implement the self-constraint rules in a separate tool, e.g. an XML editor.
In the latter case, the tool never allows the author of the markup to select and use the undesired elements and attributes, and the restriction is enforced automatically.
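As an illustration of implementing self-constraint rules in a separate tool, the following is a minimal sketch in Python of a check that reports which unwanted elements a document uses. It is not part of Akoma Ntoso or of any existing editor: the function names, the list of disallowed elements, and the namespace URI in the sample are all hypothetical and chosen only for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical house rule: these element names are allowed by the schema,
# but we have decided never to use them in our own documents.
DISALLOWED = {"tblock", "toc"}

def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag, if any."""
    return tag.rsplit("}", 1)[-1]

def find_violations(xml_text, disallowed=DISALLOWED):
    """Return the sorted names of disallowed elements used in the document."""
    root = ET.fromstring(xml_text)
    used = {local_name(el.tag) for el in root.iter()}
    return sorted(used & set(disallowed))

# Sample document; the namespace URI is a placeholder, not the real one.
SAMPLE = """<akomaNtoso xmlns="http://example.org/hypothetical-namespace">
  <act><body><section><tblock/></section></body></act>
</akomaNtoso>"""

print(find_violations(SAMPLE))  # ['tblock']
```

A check of this kind only reports violations after the fact. An editor that hides the unwanted elements from the author, as described above, prevents them from being entered in the first place.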
Selecting combinations and individual modules
Version 2.0 of the Akoma Ntoso vocabulary introduced a new modular architecture of the schema, together with a tool that allows anyone to create custom schemas by selecting individual modules of the vocabulary. The tool is called “Akoma Ntoso Sub Schema Generator” (AKNSSG) and is shown in figures 1 and 2. A test installation is available at the URI http://akn.web.cs.unibo.it/aknssg/aknssg.html.
Currently there are 26 modules, as shown in fig. 1. The core set is required, and the user can add optional modules. The modules are organized first by document type (legislation, reports, amendments, judgments, collections) and then by optional feature (specific elements such as titled blocks, tables of content, etc.).
fig. 1: the modules’ interface of the Akoma Ntoso SubSchema Generator
Each module is associated with a name, a code, and a list of the features (i.e., the elements and attributes) it contains. Each selection of modules creates a subschema of the general schema, whose header lists the modules chosen for its generation after the sentence “Current subversion contains the following modules:”, as in fig. 2.
fig. 2 – The header of a subschema with the list of selected modules
To further simplify the creation of subschemas, predefined combinations have been created for the most common situations. They are available on a different page of the same application, shown in fig. 3.
fig. 3: The combinations’ interface of the Akoma Ntoso SubSchema Generator
By clicking on “more details” it is possible to read the list of selected modules for each preset combination.
Creating new combinations and new modules
Of course, the name and content of each preset combination are design choices of much lesser importance than the details of the schema itself. For instance, the current choice of preset combinations, or the decision of whether, say, the “amendments” combination should contain the “special elements” module, is a matter of taste and sensibility that may well prove ungrounded or inappropriate in other situations.
Even the current list of modules is nothing more than a reasonable choice of elements and attributes for specific purposes. The number and content of the modules are easier to dispute and modify than the overall schema (specific requests in that direction sent to email@example.com would be implemented rapidly and with little discussion). Changing them also takes little effort and no specific competency in XML Schema or DTD++: some competency in XML and familiarity with the overall structure of the Akoma Ntoso schema are enough.
The modular architecture of Akoma Ntoso relies on an XML vocabulary that uses no namespace and contains exactly 5 XML elements:
- modular: the root element.
- combos: the list of available preset combinations.
- combo: the definition of a preset combination. It has the attributes id, name (used in the web interface to identify the combination), desc (a string shown when the “show more” link is selected) and content (a list of the ids of the modules included in the preset combination). For instance, the preset combination for the Act is as follows:
<combo id="acts" name="Acts" desc="All elements for describing acts and existing legislative documents" content="core legislativeDocs act modifications tblock semantic advancedRefs authorialNote specials delimiters table"/>
- report: the place in the document where the actual list of modules is shown. It is currently present in one place only, in the initial large comment describing the release. The following report element:
<report version="Release 12/10/2011 - Akoma Ntoso 2.0"/>
is shown in the final schema as follows for the complete version:
Release 12/10/2011 - Akoma Ntoso 2.0
while for a subschema (in this case the minutes subschema) it is automatically rendered as:
Release 12/10/2011 - Akoma Ntoso 2.0
Automatically generated modular subversion from the full schema.
Current subversion contains the following modules:
debateDocs core tblock toc advancedPreface semantic advancedRefs
- include: a fragment of the schema that belongs to an individual module. It has the attributes label, desc (for description), if and v (for value). Consider for instance the two following examples:
<include if="debateDocs" label="Documents for parliamentary debates and hansards" desc="Document elements $debateReport and $debate">…</include>
<include if="debateDocs" v="…" />
The if attribute determines the module to which each document fragment belongs. Thus in this example both include structures belong to the same module, debateDocs. There is no limit to the number and position of include elements having the same value for their if attribute: all of them belong to the same module.
The label attribute contains the short name describing the module, and is used in the interface of the Akoma Ntoso SubSchema Generator to identify the module. Similarly, the content of each desc attribute is shown as one bullet in the full list displayed when the corresponding “show more” link is clicked, as follows:
Fig. 4: a detail of the on-screen representation of a module in the subschema generator
As can be seen, the display automatically shows in a different font every word beginning with the “$” character, which is used to point out the element names described in the module.
There are two types of include elements: those including larger portions of text of the full schema (such as the first example), whose content is placed directly between the start and end tags of the include element, and those that contain only small fragments of text (as in the second example). The latter consist of an empty include element in which the text to be included is specified by the v (for value) attribute. They are used to specify, within a more complex structure where the fragment is present, just the few characters that need to be included or excluded depending on whether the corresponding module is selected.
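The effect of the two forms of include can be sketched as follows. This is not the code of the Akoma Ntoso SubSchema Generator: it is a simplified model that assumes, purely for illustration, that the schema text appears as plain character data around and inside the include elements. The function name render, the module ids m1 and m2, and the toy content are hypothetical.

```python
import xml.etree.ElementTree as ET

def render(node, selected):
    """Concatenate the text under node, keeping each include fragment
    only when the module named by its 'if' attribute is selected."""
    parts = [node.text or ""]
    for child in node:
        if child.tag == "include":
            if child.get("if") in selected:
                if "v" in child.attrib:
                    # Empty form: the fragment is carried by the v attribute.
                    parts.append(child.get("v"))
                else:
                    # Full form: the fragment is the content of the element.
                    parts.append(render(child, selected))
        else:
            parts.append(render(child, selected))
        # Text following the child always belongs to the enclosing structure.
        parts.append(child.tail or "")
    return "".join(parts)

# Toy modular file with one full include and one empty include per module m2.
TOY = ('<modular>A <include if="m1">B </include>'
       'C(<include if="m2" v="x|"/>y) '
       '<include if="m2">D</include></modular>')

root = ET.fromstring(TOY)
print(render(root, {"m1"}))        # A B C(y)
print(render(root, {"m1", "m2"}))  # A B C(x|y) D
```

In the toy input, the empty include contributes only the characters "x|" inside a larger structure, which mirrors the case where a content model must mention an element only when its module is selected.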
To create a new combination, one needs to add a new combo element and list in its content attribute all the ids of the modules that belong to it. After placing the full modular schema in the appropriate directory of the Akoma Ntoso SubSchema Generator, the new combination is shown in the preset tab and can be chosen to create the corresponding subschema.
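Since a preset combination is just an id plus a list of module ids in its content attribute, reading one back is straightforward. The sketch below uses the combo element for Acts quoted earlier. Nesting it inside combos and modular follows the element descriptions above, but the exact layout of the real file is an assumption, and the function name modules_of is hypothetical.

```python
import xml.etree.ElementTree as ET

# Minimal modular file holding only the "acts" preset combination.
MODULAR = """<modular>
  <combos>
    <combo id="acts" name="Acts" desc="All elements for describing acts and existing legislative documents" content="core legislativeDocs act modifications tblock semantic advancedRefs authorialNote specials delimiters table"/>
  </combos>
</modular>"""

def modules_of(modular_xml, combo_id):
    """Return the list of module ids listed by the combo with the given id."""
    root = ET.fromstring(modular_xml)
    for combo in root.iter("combo"):
        if combo.get("id") == combo_id:
            # The ids in the content attribute are separated by whitespace.
            return combo.get("content", "").split()
    raise KeyError(combo_id)

print(modules_of(MODULAR, "acts"))
```

A new combination would be one more combo element with its own id, name, desc and content values.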
To create a new module, one needs to create as many include elements as there are fragments of text, throughout the full schema, that need to be included in or excluded from the subschema depending on whether the module is selected. This applies both to whole sections of the schema, e.g., where the elements and their attributes are defined, which are wrapped in plain include tags, and to smaller fragments of the schema within the types and element groups that refer to these elements and attributes. For the latter, the empty version of the include element should be used, specifying in the v attribute the few characters that should be included or excluded. The existing schema has plenty of examples of both approaches, which should be used as models.
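When adding modules and combinations by hand, a quick consistency check can catch a combination that lists a module id for which no include element exists. The following is a minimal sketch of such a check, not a feature of the generator. The function name and the toy data are hypothetical, and it assumes that every module, including the required core set, is declared through include elements, which may not hold for the real file.

```python
import xml.etree.ElementTree as ET

def undefined_modules(modular_xml):
    """Map each combo id to the module ids it lists that no include declares."""
    root = ET.fromstring(modular_xml)
    # A module exists as soon as one include names it in its 'if' attribute.
    declared = {inc.get("if") for inc in root.iter("include")}
    missing = {}
    for combo in root.iter("combo"):
        unknown = [m for m in combo.get("content", "").split()
                   if m not in declared]
        if unknown:
            missing[combo.get("id")] = unknown
    return missing

# Toy file: the combination refers to modB, but only modA has an include.
TOY_MODULAR = """<modular>
  <combos>
    <combo id="demo" name="Demo" desc="Hypothetical combination" content="modA modB"/>
  </combos>
  <include if="modA" label="Module A" desc="Hypothetical module">fragment</include>
</modular>"""

print(undefined_modules(TOY_MODULAR))  # {'demo': ['modB']}
```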