Customizing Akoma Ntoso: modularization, restrictions, extensions
The customization of XML schemas
A schema that aims at generality and universality, such as Akoma Ntoso, may need ways to tighten its validation rules so that they closely fit the requirements of different documents and the needs of different institutions. It could be that the schema is too general, and combinations of elements exist that are valid according to the schema but are not appropriate in a specific context. Or the schema may be incomplete, and specific elements or combinations of elements that are required in some contexts are not contemplated.
The two situations described above sum up the basic variety of customization options available for an XML Schema:
- a restriction is a derived rule that is stricter than the base one, so that all documents that are correct according to the derived rule are also valid according to the base one: the set of valid documents for the derived rule is a mathematical subset of the set of valid documents according to the base rule.
- an extension is a derived rule that adds in a controlled way new features to a base rule, so that all documents that are correct according to the derived rule can be validated against the base rule by removing the additional features. The set of valid documents according to the derived rule is a controlled superset of the set of valid documents according to the base rule.
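The subset and superset relations described above can be illustrated with a deliberately small Python sketch. It is a toy model, not XML Schema validation: a "rule" is reduced to the set of child element names allowed in a container, and the names BASE, RESTRICTED, EXTENDED and x:note are hypothetical.

```python
# Toy model: a "rule" is just the set of child element names a container may hold.
# Real XML Schema content models are far richer; this only shows the set relations.

def is_valid(children, allowed):
    """A document (here, a list of child names) is valid if every child is allowed."""
    return all(name in allowed for name in children)

def strip_extras(children, base_allowed):
    """Drop the additional features so the document can be checked against the base rule."""
    return [name for name in children if name in base_allowed]

BASE = {"num", "heading", "content"}   # hypothetical base rule
RESTRICTED = BASE - {"heading"}        # restriction: stricter than the base rule
EXTENDED = BASE | {"x:note"}           # extension: adds one new, namespaced feature

doc_restricted = ["num", "content"]
doc_extended = ["num", "x:note", "content"]

# Every document valid under the restriction is also valid under the base rule.
restricted_ok = is_valid(doc_restricted, RESTRICTED) and is_valid(doc_restricted, BASE)

# A document using the extension is not valid under the base rule as it stands,
# but becomes valid once the additional features are removed.
extended_ok = is_valid(doc_extended, EXTENDED)
base_rejects_extended = not is_valid(doc_extended, BASE)
stripped_ok = is_valid(strip_extras(doc_extended, BASE), BASE)
```

In this model a restriction can only remove members from the allowed set, and an extension can only add members that are clearly distinguishable from the base vocabulary.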
Akoma Ntoso provides a few tools for making both restrictions and extensions in a simple and controlled way, without the need to manually create a custom schema. More radical restrictions and extensions are possible by creating a custom schema; since both can find a place in the same custom schema, they are described together.
The easiest way to do restrictions is simply to avoid using the undesired markup features. We call this option self-constrained markup, and although it might sound trivial, it is by far the most effective and simple way to do restrictions.
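Self-constrained markup needs no schema changes, but an institution may still want to check that its house rules are followed. The following is a minimal sketch of such a check, assuming the Akoma Ntoso 3.0 namespace URI (documents written against another version of the schema use a different one) and a hypothetical house rule stored in AVOIDED. The sample document is an incomplete fragment used only to exercise the check; it is not meant to be schema-valid.

```python
import xml.etree.ElementTree as ET

# Assumption: the Akoma Ntoso 3.0 namespace; adjust for other schema versions.
AKN_NS = "http://docs.oasis-open.org/legaldocml/ns/akn/3.0"

# Hypothetical house rule: element names this institution has decided not to use.
AVOIDED = {"hcontainer", "tblock"}

# Incomplete fragment for illustration only; not schema-valid.
SAMPLE = f"""<akomaNtoso xmlns="{AKN_NS}">
  <act><body>
    <hcontainer name="schedule"><content><p>text</p></content></hcontainer>
  </body></act>
</akomaNtoso>"""

def find_avoided(xml_text, avoided=AVOIDED, ns=AKN_NS):
    """Return the sorted names of avoided Akoma Ntoso elements found in the document."""
    qualified = {f"{{{ns}}}{name}": name for name in avoided}
    root = ET.fromstring(xml_text)
    return sorted({qualified[el.tag] for el in root.iter() if el.tag in qualified})

violations = find_avoided(SAMPLE)
```

A check of this kind complements the schema rather than replacing it: the document is still validated against the full Akoma Ntoso schema, and the house rule is enforced separately.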
In addition to this, we created a new web-based tool, called Akoma Ntoso subschema generator (anssg), which creates restricted schemas in a simple and controlled environment by selecting and composing predefined modules. This is the easiest mechanism available for creating actual schema files that allow a validation engine to verify the documents against derived rules.
Modules come in preset combinations but can be selected and combined freely, always generating a valid subschema of Akoma Ntoso. A simple XML vocabulary over the global XML Schema allows defining additional modules and preset combinations, so a third mechanism for restrictions is to create one’s own modules and combinations by acting on the XML vocabulary.
Extending does not mean loosening constraints: if an item is required or prohibited in a rule, it must remain required or prohibited in the extended rule. In XML Schema, extension only means the inclusion of additional elements in the allowed vocabulary, not the modification or overriding of the rules that govern existing items.
In Akoma Ntoso, there are three easy ways to do extensions without affecting the existing schema. The first caters to the most frequent reason for extensions: the need to specify additional, task-dependent or site-dependent metadata. For these situations, the Akoma Ntoso grammar defines a specific metadata element, <proprietary>, within which it is possible to specify any collection of metadata with no restrictions on vocabulary, values or number, except that the added elements must not belong to the Akoma Ntoso namespace.
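As an illustration of the namespace constraint on <proprietary>, here is a minimal sketch that parses a fragment and reports any direct child that belongs to the Akoma Ntoso namespace. The namespace URI shown is the Akoma Ntoso 3.0 one and is an assumption about the schema version in use; the ex: vocabulary, its element names and the value of the source attribute are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: the Akoma Ntoso 3.0 namespace; adjust for other schema versions.
AKN_NS = "http://docs.oasis-open.org/legaldocml/ns/akn/3.0"
# Hypothetical local metadata vocabulary.
EX_NS = "http://example.org/ns/local-metadata"

# Illustrative fragment; the source attribute and its value are assumptions.
SAMPLE = f"""<proprietary xmlns="{AKN_NS}" xmlns:ex="{EX_NS}" source="#parliament">
  <ex:workflow status="draft"/>
  <ex:reviewer>J. Doe</ex:reviewer>
</proprietary>"""

def namespace_of(tag):
    """Extract the namespace URI from an ElementTree tag of the form '{uri}local'."""
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""

def akn_children(proprietary, akn_ns=AKN_NS):
    """Return the direct children that wrongly belong to the Akoma Ntoso namespace."""
    return [child.tag for child in proprietary if namespace_of(child.tag) == akn_ns]

block = ET.fromstring(SAMPLE)
offending = akn_children(block)
child_namespaces = {namespace_of(child.tag) for child in block}
```

Here offending is empty because both children live in the local namespace; adding an unprefixed child such as <p> inside <proprietary> would place it in the Akoma Ntoso namespace and make the check report it.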
The second way to do extensions affects the actual content, and is appropriate for those (hopefully rare) situations in which the descriptive power of the existing Akoma Ntoso elements is not enough. A number of generic elements are available for this situation, with the understanding that they must be used in the appropriate contexts and with the appropriate content and attributes.
Finally, a specific element, <foreign>, was added to allow the inclusion of content expressed in an XML vocabulary other than Akoma Ntoso, for example mathematical formulae expressed in MathML or drawings expressed in SVG.
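A short sketch of how a processing tool might discover which external vocabularies a document embeds is given below. It assumes the Akoma Ntoso 3.0 namespace URI, which depends on the schema version in use; the MathML namespace is the standard W3C one. The fragment is illustrative and shows only the foreign element with its content.

```python
import xml.etree.ElementTree as ET

# Assumption: the Akoma Ntoso 3.0 namespace; adjust for other schema versions.
AKN_NS = "http://docs.oasis-open.org/legaldocml/ns/akn/3.0"
MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Illustrative fragment: a MathML formula wrapped in the foreign element.
SAMPLE = f"""<foreign xmlns="{AKN_NS}">
  <math xmlns="{MATHML_NS}">
    <mi>E</mi><mo>=</mo><mi>m</mi><msup><mi>c</mi><mn>2</mn></msup>
  </math>
</foreign>"""

def foreign_vocabularies(xml_text, akn_ns=AKN_NS):
    """Return the namespaces of the elements placed directly inside foreign elements."""
    root = ET.fromstring(xml_text)
    found = set()
    # Element.iter() also visits the root, so a bare foreign fragment is handled too.
    for foreign in root.iter(f"{{{akn_ns}}}foreign"):
        for child in foreign:
            tag = child.tag
            found.add(tag[1:].split("}", 1)[0] if tag.startswith("{") else "")
    return found

vocabularies = foreign_vocabularies(SAMPLE)
```

A tool could use the resulting set to decide which external renderer or validator (for MathML, SVG and so on) to hand each embedded fragment to.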
Once the above-mentioned options for customizing the Akoma Ntoso schema have been explored and found insufficient, the next step is to create one’s own version of the schema, either by including the general Akoma Ntoso schema and specifying only the differences, or by duplicating the schema and changing it where needed. This requires careful choices, as there are requirements in the existing schema and in its underlying philosophy that should not be ignored or violated.
In particular, Akoma Ntoso heavily relies on a few fundamental principles that need to be respected in any custom schema, including the fundamental separation between content and metadata, and the basic organization of elements in patterns (containers, hierarchical containers, blocks, inlines and metadata elements). Any custom schema needs to respect these basic principles both in practice and in concept, and use separate namespaces for existing structures and new ones.