Empty Tags

The Journal Article Tag Suite (JATS) defines a set of XML tags with semantic meaning that enables automated processing of journal articles. In some respects, the XML tags used to mark up an article are nearly as important as its intellectual content, because the tags are the basis for formatting, display, indexing, distribution, discovery, and re-use of journal articles. To support this functionality, the JATS XML tags must be used correctly to identify the parts of a journal article at a granular level. A tag that is included where it is not needed can interfere with processing, even if it contains no text or data and even if it has no visible effect on the display of the article.

Most of the XML tags defined by JATS are intended to contain text or data. In general, it is an error for an element or attribute to be present without containing text or data. The only exceptions to this rule are a few elements defined by JATS that have semantic meaning merely by their presence and do not need to contain text or data.

All elements in JATS fall into one of the following categories:

  • Elements that contain text and/or other elements and may have attributes. Most elements are in this category.

  • Elements that are normally in the above category, but in certain situations may need to be present and empty. The only elements in this category are those for table cells and table headings: <oasis:entry>, <td>, and <th>.

  • Elements that have attributes to hold data and contain neither text nor elements. Examples of this are <graphic> and <page-count>.

  • Elements that are declared EMPTY and have no attributes to contain data. Examples of this are <break/> and <hr/>.

  • Elements that preserve whitespace (spaces, tabs, and line endings), instead of allowing whitespace to be reformatted, in which a space without any other characters can be significant. The only element in this category is <x>, although <tex-math>, <preformat>, <code>, and <glyph-data> have a similar characteristic for preserving whitespace.
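
The categories above can be turned into a simple automated check. The following is a minimal sketch in Python, not an official JATS validator: the function name find_suspect_empty_elements, the allow-lists, and the sample article are assumptions made for illustration, and the allow-lists contain only the elements named in this section. It reports elements that have no child elements and no text, unless they fall into one of the exception categories. It does not check for empty attribute values.

```python
import xml.etree.ElementTree as ET

# Assumed allow-lists built only from the elements named in this section;
# they are not a complete inventory of JATS elements.
MAY_BE_EMPTY = {"td", "th", "entry"}              # table cells and headings
DATA_IN_ATTRIBUTES = {"graphic", "page-count"}    # data is held in attributes
DECLARED_EMPTY = {"break", "hr"}                  # meaningful by presence alone
WHITESPACE_SIGNIFICANT = {"x", "tex-math", "preformat", "code", "glyph-data"}


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on qualified names."""
    return tag.rsplit("}", 1)[-1]


def find_suspect_empty_elements(xml_text):
    """Return the names of empty elements outside the exception categories."""
    root = ET.fromstring(xml_text)
    suspects = []
    for elem in root.iter():
        if len(elem) > 0:
            continue  # has child elements, so it is not empty
        name = local_name(elem.tag)
        text = elem.text or ""
        if name in WHITESPACE_SIGNIFICANT:
            # A lone space counts as content here; only truly empty is suspect.
            if text == "":
                suspects.append(name)
            continue
        if text.strip():
            continue  # contains real text
        if name in MAY_BE_EMPTY or name in DECLARED_EMPTY:
            continue
        if name in DATA_IN_ATTRIBUTES and elem.attrib:
            continue  # empty content is fine when attributes carry the data
        suspects.append(name)
    return suspects


# Hypothetical sample article, simplified for illustration.
sample = """<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front><article-meta>
    <title-group>
      <article-title>A long<break/>title</article-title>
      <subtitle/>
    </title-group>
    <counts><page-count count="12"/></counts>
  </article-meta></front>
  <body>
    <p>Some text<x> </x>more text</p>
    <p></p>
    <graphic xlink:href="fig1.tif"/>
    <table><tr><td/><th>Heading</th></tr></table>
  </body>
</article>"""

print(find_suspect_empty_elements(sample))  # prints ['subtitle', 'p']
```

In this sample, only the empty <subtitle/> and the empty <p></p> are reported. The empty <td/>, the <break/>, the attribute-bearing <graphic> and <page-count>, and the <x> that holds a single space are all accepted, matching the categories listed above.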