Schematron Validation Reports

This page provides technical details about Schematron.

This page is a short guide to understanding the output of Schematron. In brief, Schematron detects and reports on the presence of specific things within documents using XPath. Its most common use is validation: the automatic detection of problems or noteworthy items within XML documents in quality control scenarios. When a Schematron schema is run on a document, it produces a report.

Schematron produces reports in an XML format known as Schematron Validation Report Language (SVRL). SVRL, being XML, is machine readable and can be transformed into other formats such as HTML. SVRL is described in the International Standard that defines Schematron, ISO/IEC 19757-3.

Some Schematron implementations offer an API that can be used to change the format of the reports they produce. The API may be used, for example, to insert additional information into the SVRL report or to produce reports in an entirely different format (other than SVRL). The API is specific to each implementation of Schematron. SchXslt and Schematron Skeleton each offer such an API, described in their own documentation.

This document focuses on standard SVRL, beginning with a basic example and then describing some more advanced or optional features.

When a Schematron schema is run on a document, it produces a linear sequence of messages based on its processing of the document. This sequence of messages, which makes up the content of the SVRL report, includes the messages produced by evaluating the tests in the Schematron schema, among other details.

The Schematron processor may make one or more passes through the document, during which the tests that are defined in the Schematron schema are evaluated. When the Schematron processor evaluates tests on a particular context node (part) of the XML document and encounters a test that produces a message, the processor may fail fast and not evaluate any other tests that could also apply to that context node. In SVRL, the element <svrl:active-pattern> indicates the start of a pass through the document, and the element <svrl:fired-rule> indicates the start of evaluating tests that apply to a particular context node. The presence of a <svrl:fired-rule> element means that the Schematron schema has processed something in the document. Conversely, an SVRL report that does not contain any <svrl:fired-rule> elements indicates that the schema did not find anything in the document that it could check.
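As a rough illustration of reading these structural elements, the following XSLT 1.0 stylesheet is a minimal sketch that counts the <svrl:active-pattern>, <svrl:fired-rule>, and message elements in a report and prints a note when no rule fired. It is not part of any Schematron implementation. It assumes the standard SVRL layout, in which all of these elements are direct children of <svrl:schematron-output>, and the wording of the output lines is invented for this example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:svrl="http://purl.oclc.org/dsdl/svrl">

  <xsl:output method="text"/>

  <!-- Summarise the structure of an SVRL report as plain text. -->
  <xsl:template match="/svrl:schematron-output">
    <xsl:text>active patterns: </xsl:text>
    <xsl:value-of select="count(svrl:active-pattern)"/>
    <xsl:text>&#10;fired rules: </xsl:text>
    <xsl:value-of select="count(svrl:fired-rule)"/>
    <xsl:text>&#10;messages: </xsl:text>
    <xsl:value-of select="count(svrl:failed-assert | svrl:successful-report)"/>
    <xsl:text>&#10;</xsl:text>

    <!-- No fired-rule at all: nothing in the document matched any rule context. -->
    <xsl:if test="not(svrl:fired-rule)">
      <xsl:text>No rule fired: the schema found nothing in this document that it could check.&#10;</xsl:text>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

A report with fired rules but no messages means the document was checked and nothing was flagged. A report with no fired rules means nothing was checked, which is a different situation.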

In Schematron, tests can be written in one of two ways:

  1. <sch:assert> outputs a message if an XPath test evaluates to false.
  2. <sch:report> outputs a message if an XPath test evaluates to true.

In other words, the assert element asserts that X is true and produces a message if it is false, and the report element does the opposite. In SVRL, the message produced by <sch:assert> uses a <svrl:failed-assert> element. The message produced by <sch:report> uses a <svrl:successful-report> element.
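The difference between the two elements can be sketched with a small, hypothetical schema in which the same condition is written both ways. The context, tests, and message texts here are invented for illustration. For an article-meta element that has no article-id child, both tests would produce a message: the first as a <svrl:failed-assert> and the second as a <svrl:successful-report>.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
  <sch:pattern>
    <sch:rule context="article-meta">
      <!-- assert: the message is produced when the test evaluates to FALSE -->
      <sch:assert test="article-id">article-meta should contain an article-id.</sch:assert>

      <!-- report: the message is produced when the test evaluates to TRUE -->
      <sch:report test="not(article-id)">article-meta does not contain an article-id.</sch:report>
    </sch:rule>
  </sch:pattern>
</sch:schema>
```

In practice only one of the two forms would be used for a given check, whichever makes the XPath easier to read.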

In some older Schematron schemas and implementations it was assumed that assert would be used for error messages and report would be used for informational messages. This approach has limitations and is no longer good practice. The author of a Schematron schema may find that an XPath test is clearer when written to evaluate to true, or to false, and may choose <sch:assert> or <sch:report> accordingly. Whether a message is an error, informational, or has any other role can be specified in the role attribute.

The <sch:assert> and <sch:report> elements each have a required test attribute that contains the XPath. <sch:assert> and <sch:report> also have optional attributes that get carried through to the SVRL report, which we are using consistently:

  • role defines the role (or level) of the message. The role values that we are using are ‘error’, ‘warning’, ‘info’. (The attribute is not constrained so other values are possible.)
  • id contains a unique identifier for the XPath test.

In SVRL, the <svrl:failed-assert> and <svrl:successful-report> elements have attributes test, role, id that are carried through from the <sch:assert> and <sch:report> elements. In addition, the <svrl:failed-assert> and <svrl:successful-report> elements have a location attribute that contains the XPath to the exact place in the document where the test produced the message.

The message text is defined in the Schematron as content of the <sch:assert> and <sch:report> elements. The message text may include XPath that is evaluated to produce part of the message, thereby making the message specific and helpful. In SVRL, the message text is contained within <svrl:text> elements. The <svrl:text> elements are normally directly within <svrl:failed-assert> or <svrl:successful-report>, but can also be nested within <svrl:diagnostic-reference> elements.
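To show how these pieces fit together in the schema, here is a hypothetical Schematron fragment. Its ids, roles, tests, and message texts mirror the hypothetical SVRL example on this page, but it is a sketch and not the actual schema. The <sch:value-of> element in the first message is an added assumption, included only to show XPath being evaluated inside message text.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
  <sch:pattern>
    <sch:rule context="article-meta">
      <!-- id, role and test are carried through to svrl:failed-assert -->
      <sch:assert id="am-0001" role="error" diagnostics="d1"
                  test="count(article-id[@pub-id-type='doi']) = 1">An article should have one DOI tagged in &lt;article-id&gt; with pub-id-type="doi" (found <sch:value-of select="count(article-id[@pub-id-type='doi'])"/>)</sch:assert>

      <!-- id, role and test are carried through to svrl:successful-report -->
      <sch:report id="am-0002" role="warn" diagnostics="d1"
                  test="ancestor::article[@article-type='book-review'] and not(product)">A book review article should have details of the book(s) being reviewed tagged in &lt;product&gt; element(s)</sch:report>
    </sch:rule>
  </sch:pattern>

  <!-- Diagnostics are defined once and referenced by id from assert/report.
       In SVRL they appear inside svrl:diagnostic-reference elements. -->
  <sch:diagnostics>
    <sch:diagnostic id="d1" xml:lang="en">Please refer to tagging guidelines documentation.</sch:diagnostic>
  </sch:diagnostics>
</sch:schema>
```

The location attribute is not written in the schema. The processor adds it to the report based on the context node that was being evaluated.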

Note: in our Schematron for JATS 1.2, the id for each XPath test is repeated at the start of the message text. The ids were added to the message text with the intention of removing them after testing, but so far having the id in the message text still seems useful.

Here is a hypothetical example SVRL report that shows <svrl:failed-assert>, <svrl:successful-report>, and <svrl:text>.

<svrl:schematron-output xmlns:svrl="http://purl.oclc.org/dsdl/svrl" xmlns:sch="http://purl.oclc.org/dsdl/schematron" xmlns:error="https://doi.org/10.5281/zenodo.1495494#error" xmlns:schxslt-api="https://doi.org/10.5281/zenodo.1495494#api" xmlns:schxslt="https://doi.org/10.5281/zenodo.1495494" xmlns:xs="http://www.w3.org/2001/XMLSchema">
   <svrl:active-pattern documents="file:///C:/Users/lizziv/Projects/JATS/jats-1.2-upgrade/jats-1.2-workspace/jats-schematron/doc/example.xml"/>
   <svrl:fired-rule context="article-meta"/>
   <svrl:failed-assert location="/Q{}article[1]/Q{}front[1]/Q{}article-meta[1]" role="error" id="am-0001" test="count(article-id[@pub-id-type='doi']) = 1">
      <svrl:diagnostic-reference diagnostic="d1">
         <svrl:text id="d1" xml:lang="en">Please refer to tagging guidelines documentation.</svrl:text>
      </svrl:diagnostic-reference>
      <svrl:text>An article should have one DOI tagged in &lt;article-id&gt; with 
                pub-id-type="doi"</svrl:text>
   </svrl:failed-assert>
   <svrl:successful-report location="/Q{}article[1]/Q{}front[1]/Q{}article-meta[1]" role="warn" id="am-0002" test="ancestor::article[@article-type='book-review'] and not(product)">
      <svrl:diagnostic-reference diagnostic="d1">
         <svrl:text id="d1" xml:lang="en">Please refer to tagging guidelines documentation.</svrl:text>
      </svrl:diagnostic-reference>
      <svrl:text>A book review article should have details of the book(s) being 
                reviewed tagged in &lt;product&gt; element(s)</svrl:text>
   </svrl:successful-report>
</svrl:schematron-output>

The important information in the above SVRL example is extracted and shown below.

Document: example.xml

id | role | message | location
am-0001 | error | Please refer to tagging guidelines documentation. An article should have one DOI tagged in <article-id> with pub-id-type="doi" | /Q{}article[1]/Q{}front[1]/Q{}article-meta[1]
am-0002 | warn | Please refer to tagging guidelines documentation. A book review article should have details of the book(s) being reviewed tagged in <product> element(s) | /Q{}article[1]/Q{}front[1]/Q{}article-meta[1]
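An extraction like the one shown above could be automated with a small transformation. The following XSLT 1.0 stylesheet is a minimal sketch, not a tool that ships with any Schematron implementation. It writes one tab-separated line per message with the id, role, message text, and location. It assumes that each message carries its text in <svrl:text> elements, either directly or inside <svrl:diagnostic-reference>, as in the example report.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:svrl="http://purl.oclc.org/dsdl/svrl">

  <xsl:output method="text"/>

  <xsl:template match="/svrl:schematron-output">
    <!-- Header row -->
    <xsl:text>id&#9;role&#9;message&#9;location&#10;</xsl:text>

    <!-- One row per message, in report order -->
    <xsl:for-each select="svrl:failed-assert | svrl:successful-report">
      <xsl:value-of select="@id"/>
      <xsl:text>&#9;</xsl:text>
      <xsl:value-of select="@role"/>
      <xsl:text>&#9;</xsl:text>

      <!-- Join all text for this message in document order,
           collapsing line breaks and indentation to single spaces -->
      <xsl:for-each select="svrl:diagnostic-reference/svrl:text | svrl:text">
        <xsl:value-of select="normalize-space(.)"/>
        <xsl:if test="position() != last()">
          <xsl:text> </xsl:text>
        </xsl:if>
      </xsl:for-each>

      <xsl:text>&#9;</xsl:text>
      <xsl:value-of select="@location"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Because SVRL is plain XML, the same approach works for other output formats: changing the output method and the literal text would produce HTML table rows instead of tab-separated lines.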

The illustrative Schematron schema sample-svrl.sch produces an SVRL report that includes a variety of errors, warnings, and informational messages. It can be run on any XML input. The resulting SVRL report, sample-svrl-report.xml, shows how error, warning, and informational messages are organized in SVRL.