File and ID Naming Conventions

A file naming convention is a framework for naming files in a way that describes what they contain and how they relate to other files. All assets must follow a set of specified naming conventions to pass validation within T&F’s systems.

T&F’s file naming conventions are based on the TFJA and CATS file naming conventions and are aimed at simplifying delivery requirements whilst enabling delivery of “complete” files to support figure tracking in CATS and re-purposing content. Specific delivery requirements (for example, print assets vs online assets) will be defined using subsets of this convention.

Several data elements commonly used in the file naming convention are indicated in this document as shown below. Patterns are described using a syntax similar to Regular Expressions:

Element Token Example Pattern
Journal Acronym {journal} BATC [A-Z0-9]+
Article ID / Manuscript ID {articleid} 123456 [0-9]+
Volume Number {volume} 2 [0-9a-zA-Z]+(-[0-9a-zA-Z]+)?
Issue Number {issue} 3 or 3-4 [0-9a-zA-Z]+(-[0-9a-zA-Z]+)?
File Format Extension {ext} pdf [a-z0-9]+
Print or Online formatted {po} P [PO]
Description segment {description} any text [a-zA-Z0-9\x2D]+
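As an illustration of how the tokens and patterns above combine, the following Python sketch expands a naming template into a regular expression. The helper name compile_template and the TOKEN_PATTERNS dictionary are hypothetical and not part of any T&F system. The sketch only handles templates without optional groups such as (_{description})?.

```python
import re

# Token patterns copied from the table above.
TOKEN_PATTERNS = {
    "journal": r"[A-Z0-9]+",
    "articleid": r"[0-9]+",
    "volume": r"[0-9a-zA-Z]+(?:-[0-9a-zA-Z]+)?",
    "issue": r"[0-9a-zA-Z]+(?:-[0-9a-zA-Z]+)?",
    "ext": r"[a-z0-9]+",
    "po": r"[PO]",
    "description": r"[a-zA-Z0-9\x2D]+",
}

_TOKEN = re.compile(r"\{(\w+)\}")


def compile_template(template: str) -> re.Pattern:
    """Turn e.g. '{journal}_I_{volume}_{issue}_J.zip' into a compiled regex."""
    parts = []
    pos = 0
    for m in _TOKEN.finditer(template):
        # Literal text between tokens is escaped so '.' matches only a dot.
        parts.append(re.escape(template[pos:m.start()]))
        name = m.group(1)
        parts.append(f"(?P<{name}>{TOKEN_PATTERNS[name]})")
        pos = m.end()
    parts.append(re.escape(template[pos:]))
    return re.compile("".join(parts))


issue_zip = compile_template("{journal}_I_{volume}_{issue}_J.zip")
match = issue_zip.fullmatch("BATC_I_2_3-4_J.zip")
print(match.groupdict())
```

Using fullmatch rather than search means that a name with extra leading or trailing characters is rejected.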

The naming convention uses data elements and identifier tokens separated by the underscore (_) character, followed by a standard file name extension. Usually, the first segment identifies the journal, the second segment identifies the file as related to an issue (I) or an article (A), followed by segments that identify the issue or article, followed by segments that identify the contents of the file. Standard file extensions (.xml, .zip, .pdf, and so forth) are used to identify the file format.

The second segment is also used to identify item types other than issues and articles. Codes used to identify item types in the CATS file naming convention are as follows:

Code Description Example Pattern
I Issue BATC_I_2_3_J.zip See Issue Level section below…
A Article BATC_A_123456_J.zip See Article Level section below…
GN General Content BATC_GN_123456.pdf {journal}_GN_{gnid}(_{description})?
Where {gnid} is a general content identifier assigned by CATS matching [0-9]+
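A minimal Python sketch of reading the item type code from the second segment, and of matching the General Content pattern, is given here for illustration. The function name item_type is hypothetical, and the trailing file extension in GN_PATTERN is an assumption based on the example BATC_GN_123456.pdf.

```python
import re

# Item type codes from the table above.
ITEM_TYPES = {"I": "Issue", "A": "Article", "GN": "General Content"}

# {journal}_GN_{gnid}(_{description})? followed by an assumed file extension.
GN_PATTERN = re.compile(
    r"(?P<journal>[A-Z0-9]+)_GN_(?P<gnid>[0-9]+)"
    r"(?:_(?P<description>[a-zA-Z0-9\x2D]+))?"
    r"\.(?P<ext>[a-z0-9]+)"
)


def item_type(file_name: str) -> str:
    """Return the item type named by the second underscore-separated segment."""
    segments = file_name.split("_")
    if len(segments) < 3:
        raise ValueError(f"too few segments: {file_name!r}")
    code = segments[1]
    if code not in ITEM_TYPES:
        raise ValueError(f"unknown item type code: {code!r}")
    return ITEM_TYPES[code]


print(item_type("BATC_I_2_3_J.zip"))
print(item_type("BATC_GN_123456.pdf"))
print(GN_PATTERN.fullmatch("BATC_GN_123456_OFC.pdf").group("description"))
```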

All files for an issue or an article should be packaged in .zip format for delivery. Issue and article .zip files should be structured the same as the issue and article folders described in the ‘Examples’ section below. The .zip file should be structured with content files placed at the top level. There should not be a single top-level folder within the .zip.
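The packaging rule above can be checked mechanically. The sketch below is one possible check, not the validation T&F actually runs: it treats an archive as acceptable when at least one file sits at the top level rather than everything being wrapped in a folder. The function names has_top_level_files and build_zip are hypothetical, and build_zip exists only to create small in-memory archives for the demonstration.

```python
import io
import zipfile


def has_top_level_files(zip_bytes: bytes) -> bool:
    """Return True if the archive holds at least one file at its top level."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        # Skip explicit directory entries, which end with '/'.
        files = [n for n in zf.namelist() if not n.endswith("/")]
    return any("/" not in n for n in files)


def build_zip(paths):
    """Build an in-memory .zip with empty files at the given paths (demo only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for p in paths:
            zf.writestr(p, b"")
    return buf.getvalue()


good = build_zip([
    "cats.xml",
    "BATC_A_1000001_J.xml",
    "graphic/BATC_A_1000001_F0001_B.jpg",
])
wrapped = build_zip([
    "BATC_A_1000001_J/cats.xml",
    "BATC_A_1000001_J/BATC_A_1000001_J.xml",
])
print(has_top_level_files(good), has_top_level_files(wrapped))
```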

Please match case according to the examples provided. Note that all file extensions are case-sensitive. For example, BATC_I_2_3_J.ZIP would not be accepted because the .ZIP extension is capitalised.

Examples

Issue Level Files (Online)

The issue folder, or issue .zip, collects all issue-level files and article folders. An issue folder, or issue .zip, should contain at minimum an Issue XML file, the three issue cover files (as specified in the table below) and one or more article folders. The zipped issue folder and contained issue level files should be named as follows:

File/Folder type Example Pattern
Issue .zip BATC_I_2_3_J.zip {journal}_I_{volume}_{issue}_J.zip
Table of Contents PDF BATC_I_2_3_TOC.pdf {journal}_I_{volume}_{issue}_TOC.pdf
Cover Images BATC_I_2_3_COVER_200.jpg, BATC_I_2_3_COVER_555.jpg, BATC_I_2_3_COVER_2055.tif {journal}_I_{volume}_{issue}_COVER_{width}.{ext}
Where {width} is the image width in pixels matching [0-9]+ and the format is .jpg or .tif
Issue XML BATC_I_2_3_J.xml {journal}_I_{volume}_{issue}_J.xml
CATS Data File cats.xml cats.xml
Article Folders BATC_A_123456 See Article Level section below.
Miscellaneous Scanned Pages PDF (Retro Issues only) BATC_I_2_3_MISC.pdf {journal}_I_{volume}_{issue}_MISC.pdf
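The minimum contents of an online issue folder (an Issue XML file, three cover images and at least one article folder) could be checked along the following lines. This is a sketch under stated assumptions: the function name missing_issue_items and its labels are hypothetical, and the input is assumed to be the list of top-level entry names in the issue folder.

```python
import re


def missing_issue_items(entries, journal, volume, issue):
    """List which of the minimum issue-level items are absent from entries."""
    prefix = f"{journal}_I_{volume}_{issue}"
    missing = []

    if f"{prefix}_J.xml" not in entries:
        missing.append("Issue XML")

    # Cover images: {prefix}_COVER_{width}.jpg or .tif
    cover = re.compile(re.escape(prefix) + r"_COVER_[0-9]+\.(?:jpg|tif)")
    if sum(1 for e in entries if cover.fullmatch(e)) < 3:
        missing.append("cover images")

    # Article folders: {journal}_A_{articleid}
    article = re.compile(re.escape(journal) + r"_A_[0-9]+")
    if not any(article.fullmatch(e) for e in entries):
        missing.append("article folder")

    return missing


entries = [
    "cats.xml",
    "BATC_I_2_3_TOC.pdf",
    "BATC_I_2_3_COVER_200.jpg",
    "BATC_I_2_3_COVER_555.jpg",
    "BATC_I_2_3_COVER_2055.tif",
    "BATC_I_2_3_J.xml",
    "BATC_A_1000001",
]
print(missing_issue_items(entries, "BATC", "2", "3"))
```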

For current content, issue numbers should not be zero padded. Ranges in issue numbers or volume numbers should use a hyphen (“-”, Unicode character U+002D) to represent the range. Example: BATC_I_2_3-4_J.zip

Note on Supplement Issues:

The naming convention of a Supplement issue reflects the presence of a supplement with the use of an ‘S’ combined with the number of the supplement. For example, the journal RAIJ published a supplement in volume 175, and the supplement issue final file was named: RAIJ_I_175_S1_J.zip

If RAIJ published a second supplement in volume 175, the file would be named: RAIJ_I_175_S2_J.zip

If a supplement was published in a new volume (e.g. Volume 176), the supplement number would revert to 1 (i.e. RAIJ_I_176_S1_J.zip).
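The rules for issue ranges and supplement issues can be summarised in a short Python sketch. The function names below are hypothetical and are shown only to make the construction of the names explicit.

```python
def issue_zip_name(journal, volume, issue):
    """Build an online issue .zip name: {journal}_I_{volume}_{issue}_J.zip."""
    return f"{journal}_I_{volume}_{issue}_J.zip"


def issue_range(first, last):
    """Join a range with a hyphen (U+002D), without zero padding."""
    return f"{int(first)}-{int(last)}"


def supplement_zip_name(journal, volume, supplement_number):
    """Supplements use 'S' plus the supplement number within the volume."""
    return issue_zip_name(journal, volume, f"S{int(supplement_number)}")


print(issue_zip_name("BATC", 2, issue_range(3, 4)))
print(supplement_zip_name("RAIJ", 175, 1))
print(supplement_zip_name("RAIJ", 176, 1))
```

Because the supplement number is counted per volume, the caller is expected to restart at 1 for each new volume, as in the RAIJ example.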

Example Issue (Online)

Example file names and .zip file structure for a current issue in production. This example is for journal BATC, volume 2, issue 3.

BATC_I_2_3_J.zip (Issue zip file)

  cats.xml (CATS XML metadata) 

  BATC_I_2_3_TOC.pdf (Table of Contents PDF) 

  BATC_I_2_3_COVER_200.jpg (Cover image small for online display) 

  BATC_I_2_3_COVER_555.jpg (Cover image large for online display) 

  BATC_I_2_3_COVER_2055.tif (Cover image large for online display – high res) 

  BATC_I_2_3_J.xml (Issue XML) 

  BATC_A_1000001 (Article folder) 

        BATC_A_1000001_J.xml (Article XML) 

        BATC_A_1000001_O.pdf (Article PDF) 

        graphic (Article images folder) 

              BATC_A_1000001_ILM0001.gif (Inline math image) 

              BATC_A_1000001_F0001_B.jpg (Figure 1 image, black & white) 

              BATC_A_1000001_F0002_OC.jpg (Figure 2 image, color for online) 

              BATC_A_1000001_F0002_PC.tif (Figure 2 image, color for print) 

  BATC_A_1000002 (Article folder)

        BATC_A_1000002_J.xml (Article XML)

        BATC_A_1000002_O.pdf (Article PDF)

Note on retrodigitised issues: any relevant non-article pages (for example, front matter that has not been assigned a DOI but may still be of interest) should be named using the “Miscellaneous Scanned Pages PDF” convention. Retrodigitised issues will not contain cats.xml.

Article Level Files (Online)

An article folder, or an article .zip, should contain all files relevant to an article, and should have, at minimum, an Article XML file. The zipped article folder and contained article level files should be named as follows:

File/Folder type Example Pattern
Article .zip BATC_A_123456_J.zip {journal}_A_{articleid}_J.zip
Article XML BATC_A_123456_J.xml {journal}_A_{articleid}_J.xml
Article PDF BATC_A_123456_O.pdf {journal}_A_{articleid}_O.pdf
CATS Data File cats.xml cats.xml
Images folder graphic graphic
Image File BATC_A_123456_F0001_B.jpg {journal}_A_{articleid}_{xmlid}(_{po}?{cb})?.{ext}
Supplementary Files folder suppl suppl
Supplementary File BATC_A_123456_SM0001.zip {journal}_A_{articleid}_{xmlid}(_{description})?.{ext}
Media Files Folder media media
Media File BATC_A_123456_MED0001.mov {journal}_A_{articleid}_{xmlid}(_{description})?.{ext}
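For illustration, the Image File pattern in the table above can be written as a single Python regular expression with named groups. The {xmlid} and {cb} sub-patterns used here are the ones given in the Image Files section of this document. The name IMAGE_FILE is hypothetical.

```python
import re

# {journal}_A_{articleid}_{xmlid}(_{po}?{cb})?.{ext}
IMAGE_FILE = re.compile(
    r"(?P<journal>[A-Z0-9]+)_A_(?P<articleid>[0-9]+)"
    r"_(?P<xmlid>[A-Za-z]+[0-9]{4}[A-Za-z]*)"
    r"(?:_(?P<po>[PO])?(?P<cb>[CB]))?"
    r"\.(?P<ext>[a-z0-9]+)"
)

for name in (
    "BATC_A_1000001_ILM0001.gif",
    "BATC_A_123456_F0001_B.jpg",
    "BATC_A_1000001_F0002_OC.jpg",
):
    m = IMAGE_FILE.fullmatch(name)
    print(name, m.group("xmlid"), m.group("po"), m.group("cb"))
```

Groups that are absent from a name, such as {po} in BATC_A_123456_F0001_B.jpg, come back as None.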

Example Article (Online)

Example file names and .zip file structure for a current article in production. This example is for journal BATC, article ID 1000001.

BATC_A_1000001_J.zip (Article zip file)

  cats.xml (CATS XML metadata) 

  BATC_A_1000001_J.xml (Article XML) 

  BATC_A_1000001_O.pdf (Article PDF) 

  graphic (Article images folder) 

        BATC_A_1000001_ILM0001.gif (Inline math image) 

        BATC_A_1000001_F0001_B.jpg (Figure 1 image, black & white) 

        BATC_A_1000001_F0002_OC.jpg (Figure 2 image, color for online) 

        BATC_A_1000001_F0002_PC.tif (Figure 2 image, color for print) 

Article PDF (Online)

The article PDF should be referenced using the <self-uri> element within the <article-meta> section. The content-type attribute should specify “pdf”, and the xlink:href attribute should specify the file name of the PDF. For example:

<self-uri content-type="pdf" xlink:href="BATC_A_1000001_O.pdf"/> 
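A hedged sketch of reading this reference back out of an article XML file with Python's standard library follows. The function name pdf_href is hypothetical, and the surrounding article and front elements in the sample are assumed for the demonstration; only self-uri and article-meta are named in this document.

```python
import xml.etree.ElementTree as ET

# Standard XLink namespace, bound to the xlink: prefix in the article XML.
XLINK = "http://www.w3.org/1999/xlink"


def pdf_href(article_xml: str):
    """Return the xlink:href of the first <self-uri content-type="pdf">, or None."""
    root = ET.fromstring(article_xml)
    for el in root.iter("self-uri"):
        if el.get("content-type") == "pdf":
            return el.get(f"{{{XLINK}}}href")
    return None


sample = (
    '<article xmlns:xlink="http://www.w3.org/1999/xlink"><front><article-meta>'
    '<self-uri content-type="pdf" xlink:href="BATC_A_1000001_O.pdf"/>'
    "</article-meta></front></article>"
)
print(pdf_href(sample))
```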

Image Files (Online and Print)

Image files used in an article should be placed in a folder named “graphic” within the article folder and named using the Image File convention in the above ‘Article Level Files’ table. All image files should be referenced from within the article XML file.

Image files should be saved at the same resolution as used to create the PDF for print. When different versions of an image are needed, for example, a figure that appears in color online and in black and white in print, all versions of the file should be included and identified using the naming convention and attributes on the graphic or inline-graphic element.

For image files the tokens are as follows:

Code Description Example Pattern
{xmlid} Should match the id attribute of the object in the article XML. See list of ID prefixes below. F0001 [A-Za-z]+[0-9]{4}[A-Za-z]*
{cb} Color or Black and White (grey scale). Should match the graphic content-type attribute in the XML as “color” or “black-white” C [CB]
{po} Print or Online. Should match the graphic specific-use attribute in the XML as “print-only” or “web-only” O [PO]
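The correspondence between the {po} and {cb} codes in a file name and the XML attributes can be expressed as a small lookup. The function name expected_graphic_attributes is hypothetical. It takes the suffix that follows the {xmlid} segment (for example "OC", "PB" or "B") and returns the attribute values that the table above says the graphic element should carry.

```python
# Code-to-attribute mappings taken from the table above.
SPECIFIC_USE = {"P": "print-only", "O": "web-only"}
CONTENT_TYPE = {"C": "color", "B": "black-white"}


def expected_graphic_attributes(suffix: str) -> dict:
    """Map a {po}?{cb} suffix to the expected graphic attributes."""
    if suffix == "":
        return {}  # no descriptors in the file name
    if len(suffix) == 1 and suffix in CONTENT_TYPE:
        return {"content-type": CONTENT_TYPE[suffix]}
    if len(suffix) == 2 and suffix[0] in SPECIFIC_USE and suffix[1] in CONTENT_TYPE:
        return {
            "specific-use": SPECIFIC_USE[suffix[0]],
            "content-type": CONTENT_TYPE[suffix[1]],
        }
    raise ValueError(f"not a valid {{po}}?{{cb}} suffix: {suffix!r}")


print(expected_graphic_attributes("OC"))
print(expected_graphic_attributes("B"))
```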

All image files included with an article should be referenced within the article XML. When an item is represented in multiple forms the <alternatives> element should be used to enclose the tags for all forms of the item (for example, a figure supplied as black and white and as color, or a math equation supplied as .gif image, TeX and MathML).

Example, a standard black and white figure:

<fig id="f0001"><label>Figure 1</label> 

  <graphic xlink:href="BATC_A_123456_F0001_B.jpg" content-type="black-white"/> 

</fig> 

Example, a figure that appears in color online and in black and white (or grey scale) in print:

<fig id="f0002"><label>Figure 2</label> 

  <alternatives> 

    <graphic xlink:href="BATC_A_123456_F0002_PB.jpg" specific-use="print-only" content-type="black-white"/> 

    <graphic xlink:href="BATC_A_123456_F0002_OC.jpg" specific-use="web-only" content-type="color"/> 

  </alternatives> 

</fig> 

Inline images that have only one version do not need to include content-type or specific-use descriptors. For example:

<inline-graphic xlink:href="BATC_A_123456_ILG0001.gif"/> 

Supplementary Material Files

When an article includes supplementary online-only files, these files should be placed in a “suppl” folder within the article folder and named using the Supplementary File convention in the above ‘Article Level Files’ table using “SM” at the start of the {xmlid} segment. This example is for journal BATC article ID 1000001.

BATC_A_1000001_J.zip (Article zip file)

  cats.xml (CATS XML metadata)

  BATC_A_1000001_J.xml (Article XML)

  BATC_A_1000001_O.pdf (Article PDF)

  suppl (Article supplementary files folder)

        BATC_A_1000001_SM0001.zip (Supplementary material file)

Supplementary files should be referenced from within the article XML using the <supplementary-material> element with the file name placed in the xlink:href attribute. For example:

<supplementary-material id="sm0001" content-type="dataset"  

      mimetype="application" xlink:href="BATC_A_1000001_SM0001.zip"> 

   <caption><title>Survey Data</title></caption> 

</supplementary-material> 

The XML should contain a <supplementary-material> tag for each supplementary file. It is also allowable for <supplementary-material> tags to be added to the article XML before the supplementary files themselves are added to the article files.
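Since every supplementary file needs a tag, but a tag may exist before its file arrives, a check only needs to look for files without tags. The following Python sketch illustrates this; the function name files_without_tags is hypothetical, and the article and body wrapper elements in the sample are assumed for the demonstration.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"


def files_without_tags(article_xml: str, suppl_files):
    """Return supplementary files with no matching <supplementary-material> tag."""
    root = ET.fromstring(article_xml)
    referenced = {
        el.get(f"{{{XLINK}}}href") for el in root.iter("supplementary-material")
    }
    # Tags that point at files not yet delivered are allowed, so they are ignored.
    return sorted(set(suppl_files) - referenced)


sample = (
    '<article xmlns:xlink="http://www.w3.org/1999/xlink"><body>'
    '<supplementary-material id="sm0001" content-type="dataset" '
    'mimetype="application" xlink:href="BATC_A_1000001_SM0001.zip">'
    "<caption><title>Survey Data</title></caption>"
    "</supplementary-material></body></article>"
)
print(files_without_tags(sample, ["BATC_A_1000001_SM0001.zip"]))
```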

Media Files

When an article includes media files, these files should be placed in a “media” folder within the article folder and named using the Media File convention in the above ‘Article Level Files’ table using “MED” at the start of the {xmlid} segment. This example is for journal BATC article ID 1000001.

BATC_A_1000001_J.zip (Article zip file)

  cats.xml (CATS XML metadata)

  BATC_A_1000001_J.xml (Article XML)

  BATC_A_1000001_O.pdf (Article PDF)

  media (Article media files folder)

        BATC_A_1000001_MED0001.mov (Media file)

Media files should be referenced from within the article XML using the <media> element with the file name placed in the xlink:href attribute. For example:

<media id="MED0001" mimetype="video" mime-subtype="quicktime"  

      xlink:show="new" xlink:href="BATC_A_1000001_MED0001.mov"/> 

A media file may have an associated image file that should be used when the media file cannot be displayed. The associated image file should have the same base file name as the media file but will have a different file extension, will be placed in the “graphic” folder and referenced in the XML using the <graphic> element. For example:

media (Media files folder)

  BATC_A_1000001_MED0001.mov (Media file)

graphic (Images folder)

  BATC_A_1000001_MED0001.jpg (Graphic alternative for media file)
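The relationship between a media file and its fallback image can be sketched as follows. The function name fallback_image_path is hypothetical, and the default jpg extension is an assumption taken from the example above; other image formats may be used.

```python
from pathlib import PurePosixPath


def fallback_image_path(media_path: str, image_ext: str = "jpg") -> str:
    """Derive the 'graphic' folder path of the image that stands in for a media file."""
    # Same base file name as the media file, different extension and folder.
    stem = PurePosixPath(media_path).stem
    return f"graphic/{stem}.{image_ext}"


print(fallback_image_path("media/BATC_A_1000001_MED0001.mov"))
```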

XML ID Conventions

The id attribute that appears on several XML elements is restricted by the schema to an NMTOKEN ([a-zA-Z0-9]+). The following conventions should be used in assigning values to id attributes based on the element the id identifies, and id attribute values should match the pattern for {xmlid} described above. Numbers should be zero-padded to four digits, and suffix letters (for example F0001A for “Figure 1a”) should be included after the number. The identifier should match the item type and number (or label) of the item in the document. Other formats, such as id values assigned by the generate-id() XPath function, are allowed but the preference is to use this convention.

Component ID prefix Example
Figure* f <fig id="f0001"><label>Figure 1</label> <fig id="f0002a"><label>Figure 2a</label> <fig-group id="f0003"><label>Figures 3a, b, c</label> <fig id="f0003a"><label>a</label>
Unnumbered Figure uf <fig id="uf0001">
Inline Graphic ilg <inline-graphic id="ilg0001" xlink:href="..."/>
Graphic g <graphic id="g0001">
Math m <disp-formula-group id="m0001"><label>1</label>
Unnumbered Math um <disp-formula id="um0001">
Inline Math ilm <inline-formula id="ilm0001">
Chemistry c <chem-struct-wrap id="c0001"><label>Compound 1</label>
Unnumbered Chemistry uc <chem-struct id="uc0001">
Table* t <table-wrap id="t0001"><label>Table 1</label>
Unnumbered Table ut <table-wrap id="ut0001">
Affiliation aff <aff id="aff0001">
Biography b <bio id="b0001">
Boxed Text (sidebar) bt <boxed-text id="bt0001">
Footnote fn <fn id="fn0001">
Table Footnote tfn <table-wrap id="t0001"> <table-wrap-foot><fn id="tfn0001a">
Endnote en <fn id="en0001">
Citation cit <ref id="cit0001">
Section s <sec id="s0001"> <sec id="s0001-0001"> <sec id="s0001-0001-0001">
List l <list id="l0001"><list id="l0001a" continued-from="l0001">
Appendix app <app id="app0001">
Author Note an <author-notes> <corresp id="an0001">...</corresp> <fn id="an0002">...</fn></author-notes>
Supplementary Material* sm <supplementary-material id="sm0001">
Media* med <media id="med0001">
Query* q <tf:query id="q0001">

*Additional validation may be applied to these ids by CATS data.
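The id convention (prefix, number zero-padded to four digits, optional suffix letter) is simple to generate programmatically. The sketch below is illustrative only; make_xml_id and section_id are hypothetical names, and the hyphen-joined form for nested sections follows the Section row in the table above.

```python
def make_xml_id(prefix: str, number: int, suffix: str = "") -> str:
    """Build an id such as 'f0001' or 'f0002a' from its parts."""
    if not 0 <= number <= 9999:
        raise ValueError("number must fit in four digits")
    return f"{prefix}{number:04d}{suffix}"


def section_id(numbers) -> str:
    """Build a nested section id such as 's0001-0001' from section numbers."""
    return "s" + "-".join(f"{n:04d}" for n in numbers)


print(make_xml_id("f", 1))
print(make_xml_id("f", 2, "a"))
print(section_id([1, 1, 1]))
```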

Note that the file name convention normally uses upper-case characters except for the file extension, whereas the XML id attribute convention normally uses lower-case characters (although lower case is not required in XML id attributes). Any comparison between id attribute values and file name segments should therefore be case-insensitive. File names placed in the xlink:href attribute, however, should match the actual file name exactly, including case.

For example: <inline-graphic id="ilg0001" xlink:href="BATC_A_123456_ILG0001.gif"/>.
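A case-insensitive comparison between an id attribute and the {xmlid} segment of an article-level file name might look like the following sketch. The function name id_matches_file is hypothetical, and the code assumes the article-level layout {journal}_A_{articleid}_{xmlid}..., so that {xmlid} is the fourth underscore-separated segment.

```python
def id_matches_file(xml_id: str, file_name: str) -> bool:
    """Compare an XML id with the {xmlid} segment of a file name, ignoring case."""
    stem = file_name.rsplit(".", 1)[0]  # drop the file extension
    segments = stem.split("_")
    # segments: journal, 'A', articleid, xmlid, optional further segments
    return len(segments) >= 4 and segments[3].lower() == xml_id.lower()


print(id_matches_file("ilg0001", "BATC_A_123456_ILG0001.gif"))
print(id_matches_file("f0001", "BATC_A_123456_F0002_OC.jpg"))
```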

Print Deliverables

Final Print Issue

The print issue folder, or print issue .zip, collects all issue-level files and article folders. A print issue zip should contain at minimum a cover PDF, a composite Text PDF, cover and text log files, cats.xml and one or more article PDF files. The zipped print issue folder and contained issue level files should be named as follows:

File/Folder type Example Pattern
Print Issue .zip BATC_I_2_3_P.zip {journal}_I_{volume}_{issue}_P.zip
Print Cover PDF BATC_I_2_3_COVER_P.pdf {journal}_I_{volume}_{issue}_COVER_P.pdf
Composite Print Issue text PDF BATC_I_2_3_TEXT_P.pdf {journal}_I_{volume}_{issue}_TEXT_P.pdf
CATS Data File cats.xml cats.xml
Print Article Files BATC_A_123456_P.pdf {journal}_A_{articleid}_P.pdf
General Content BATC_GN_123456_P.pdf {journal}_GN_{articleid}_P(_{description})?.pdf
Log file, cover BATC_I_2_3_COVER_P_LOG.pdf {journal}_I_{volume}_{issue}_COVER_P_LOG.pdf
Log file, composite text BATC_I_2_3_TEXT_P_LOG.pdf {journal}_I_{volume}_{issue}_TEXT_P_LOG.pdf

For print issues, covers and other non-article pages are logged in CATS as General Content. Covers should be named according to the cover naming convention above, and any other non-article pages should be named using the “General Content” naming convention. The description segment may contain “C1”, “C2”, “C3”, “C4”, “OFC”, “IFC”, “IBC” or “OBC” for covers. For other non-article pages, the description segment may be assigned a descriptive label.

In print final files delivered to CATS, the “Print Cover PDF” naming convention above may also be used to name composite cover files, and the “Composite Print Issue text PDF” convention may be used to name the composite inside pages PDF, if these files are required.

Example Issue (Print)

Example file names and .zip file structure for a print issue in production. This example is for journal BATC, volume 2, issue 3.

BATC_I_2_3_P.zip (Print issue zip file)

  cats.xml (CATS XML metadata) 

  BATC_I_2_3_COVER_P.pdf (print cover PDF) 

  BATC_I_2_3_TEXT_P.pdf (composite print issue PDF) 

  BATC_A_100001_P.pdf (print article) 

  BATC_A_100002_P.pdf (print article) 

  BATC_A_100003_P.pdf (print article) 

  BATC_A_100004_P.pdf (print article) 

  BATC_I_2_3_COVER_P_LOG.pdf (cover log file) 

  BATC_I_2_3_TEXT_P_LOG.pdf (text log file) 
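The minimum contents of a print issue .zip listed at the start of this section can be checked in the same spirit. This is a sketch, not the CATS validation itself; the function name missing_print_issue_items and its labels are hypothetical, and the input is assumed to be the list of file names at the top level of the archive.

```python
import re


def missing_print_issue_items(entries, journal, volume, issue):
    """List which of the minimum print issue items are absent from entries."""
    prefix = f"{journal}_I_{volume}_{issue}"
    required = {
        "cover PDF": f"{prefix}_COVER_P.pdf",
        "composite text PDF": f"{prefix}_TEXT_P.pdf",
        "cover log file": f"{prefix}_COVER_P_LOG.pdf",
        "text log file": f"{prefix}_TEXT_P_LOG.pdf",
        "cats.xml": "cats.xml",
    }
    missing = [label for label, name in required.items() if name not in entries]

    # At least one print article PDF: {journal}_A_{articleid}_P.pdf
    article_pdf = re.compile(re.escape(journal) + r"_A_[0-9]+_P\.pdf")
    if not any(article_pdf.fullmatch(e) for e in entries):
        missing.append("print article PDF")
    return missing


entries = [
    "cats.xml",
    "BATC_I_2_3_COVER_P.pdf",
    "BATC_I_2_3_TEXT_P.pdf",
    "BATC_A_100001_P.pdf",
    "BATC_I_2_3_COVER_P_LOG.pdf",
    "BATC_I_2_3_TEXT_P_LOG.pdf",
]
print(missing_print_issue_items(entries, "BATC", "2", "3"))
```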

Final Print Article

A print article folder, or a print article .zip, should contain all files relevant to an article, and should always contain a print article PDF file, and corresponding cats.xml. The zipped article folder and contained article level files should be named as follows:

File/Folder type Example Pattern
Print Article Zip BATC_A_123456_P.zip {journal}_A_{articleid}_P.zip
Print Article PDF BATC_A_123456_P.pdf {journal}_A_{articleid}_P.pdf
CATS Data File cats.xml cats.xml

Example Article (Print)

Example file names and .zip file structure for a print article in production. This example is for journal BATC, article ID 1000001.

BATC_A_1000001_P.zip (Print article zip file)

  cats.xml (CATS XML metadata) 

  BATC_A_1000001_P.pdf (Print article PDF) 

Legacy Naming Conventions

The table below details T&F’s legacy naming conventions, which are no longer in use, but may be present in our archives.

File/Folder type Example Pattern
Formatted (zero-padded) Issue Number (some legacy issues may be zero padded to two digits; for example, issue 3 with zero padding is 03, and issue 4-6 with zero padding is 04-06) BATC_I_1_02_J [0-9a-zA-Z]{2,}(-[0-9a-zA-Z]{2,})?
EPF file (print) EPF.pdf EPF.pdf
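When comparing archived names against the current convention, it may help to strip the legacy zero padding from an issue number. The following Python sketch shows one way to do this; the function name strip_zero_padding is hypothetical, and nothing in this document requires legacy files to be renamed.

```python
import re

# Legacy zero-padded issue number pattern from the table above.
LEGACY_ISSUE = re.compile(r"[0-9a-zA-Z]{2,}(?:-[0-9a-zA-Z]{2,})?")


def strip_zero_padding(legacy_issue: str) -> str:
    """Convert a legacy issue number such as '04-06' to the current form '4-6'."""
    if not LEGACY_ISSUE.fullmatch(legacy_issue):
        raise ValueError(f"not a legacy issue number: {legacy_issue!r}")
    # Remove leading zeros from each end of the range, keeping a lone '0'.
    return "-".join(part.lstrip("0") or "0" for part in legacy_issue.split("-"))


print(strip_zero_padding("03"))
print(strip_zero_padding("04-06"))
```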