PubMed Central Tagging Guidelines


Elements

These XML elements have specific style rules associated with them. This is not a complete list of XML elements included in the NLM Journal Publishing DTD. For guidance using elements not listed here, consult the Tag Library.

<abbrev-journal-title>
Used to hold a shortened form of the journal title.
<abstract>
There may be more than one abstract in an article. If there are, identify each with an @abstract-type.
<abstract> allows <title>, <p>, and <sec>.
Only set a <title> if the title is something other than "Abstract".
Only use <sec> if the abstract has been divided into sections; most abstracts can be tagged with just <p>. Do not just set the section titles in <bold> or other formatting.
Do not use @sec-type for sections inside <abstract>.
Do not include Citation Information and/or Copyright Information at the end of the abstract. This information should be included elsewhere in the <article-meta>.
Any keywords should be set in <kwd-group>. Do not include keywords in the <abstract>.
See Sample 1 for examples.
Use @xml:lang for articles with abstracts in multiple languages.
English article with non-English title and/or abstract

An English article does not need to have an @xml:lang at the <article>. A non-English article title should be tagged as a <trans-title> within <title-group>. A non-English abstract should be tagged as a <trans-abstract> following the <abstract>.
Non-English article with English title and/or abstract

Set the @xml:lang value to the main language of the article on the <article>. Use <trans-title> within <title-group> and <trans-abstract> to set the English translation of the article title and abstract (or any other language version that is not in the main language of the article).
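A minimal sketch of the non-English case, assuming a French article with an English translation of the title and abstract. All text content is placeholder, and required elements not relevant to the point are only indicated by comments.

```xml
<article article-type="research-article" xml:lang="fr">
  <front>
    <!-- journal-meta omitted -->
    <article-meta>
      <!-- article-id omitted -->
      <title-group>
        <article-title>Titre de l'article</article-title>
        <!-- English translation of the title (placeholder text) -->
        <trans-title xml:lang="en">Title of the article</trans-title>
      </title-group>
      <!-- contrib-group, pub-date, etc. omitted -->
      <abstract>
        <p>Texte du r&#x00E9;sum&#x00E9;.</p>
      </abstract>
      <!-- the translated abstract follows the main-language abstract -->
      <trans-abstract xml:lang="en">
        <p>Text of the abstract.</p>
      </trans-abstract>
    </article-meta>
  </front>
  <!-- body and back omitted -->
</article>
```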
<aff>
Within <aff> it is not necessary to identify and tag each element of the address.
Set any label or symbol in the <label> element.
Do not add symbols or labels to define the relationship between contributors and affiliations.
Put the rest of the affiliation information into the <aff> as PCDATA. Follow copy for all punctuation.
See Author/Affiliation Relationship for information on how authors and affiliations are "linked".
See Sample 1 and Sample 2 for examples.
<article>
The root element. It allows <front>, <body>, <back>, <sub-article>, and <response>.
If namespace declarations for MathML and XLink are needed in the article, tag them as attributes on the article (see below).

attributes:
  • article-type—Most of the articles should have the value of "research-article". Allowed values are: "abstract", "addendum", "announcement", "article-commentary", "book-review", "books-received", "brief-report", "calendar", "case-report", "correction", "discussion", "editorial", "in-brief", "introduction", "letter", "meeting-report", "news", "obituary", "oration", "other", "product-review", "reply", "research-article", "retraction", "review-article". See the Tag Library for descriptions of these values. (#REQUIRED)
  • dtd-version—Do not use.
  • xmlns:mml—Fixed value "http://www.w3.org/1998/Math/MathML"
  • xmlns:xlink—Fixed value "http://www.w3.org/1999/xlink"
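As an illustration, the root element of a typical research article that uses both MathML and XLink might open as follows. The child elements are shown only as comments.

```xml
<article article-type="research-article"
         xmlns:mml="http://www.w3.org/1998/Math/MathML"
         xmlns:xlink="http://www.w3.org/1999/xlink">
  <!-- front, body, and back go here -->
</article>
```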
<article-categories>
This is a required element that holds subject and other "sorting" type information about the article. PMC requires that there be a <subj-group> with @subj-group-type="heading" to hold headings used to sort the articles on the TOC.
The content of the <subject> may describe the type of article ("Editorial", "Obituary") or the content of the article ("Physical Sciences", "Psychology"). If no subject is available, use "Article".
A set of subjects may have multiple levels, see the Tag Library for more details.
Other <subj-group> with different @subj-group-type may be included.
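A minimal sketch of the required heading group, using one of the subject values mentioned above:

```xml
<article-categories>
  <!-- heading used to sort the article on the TOC -->
  <subj-group subj-group-type="heading">
    <subject>Physical Sciences</subject>
  </subj-group>
</article-categories>
```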
<article-id>
Contains any unique identifier assigned to the article, such as pii, doi, or PubMed ID.
Each <article-id> can store a single identifier.
See Sample 2 for examples.

attributes:
  • pub-id-type—Use values defined in Tag Library.
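For example, an article with three identifiers would carry three <article-id> elements. The identifier values below are placeholders, not real identifiers.

```xml
<article-id pub-id-type="pmid">00000000</article-id>
<article-id pub-id-type="pii">EXAMPLE-0001</article-id>
<article-id pub-id-type="doi">10.0000/example.0001</article-id>
```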
<article-meta>
Contains information specific to the article, like citation information and the Abstract. It includes the following elements, in order:
<article-id> - contains any unique identifier assigned to the article.
<title-group> - this includes the article title/subtitle
<contrib-group> - contains contributor information. <contrib-group> and <aff> may interleave.
<author-notes> - contains notes with information specific to the author(s).
<pub-date> - holds the publication date of the article.
<license> - contains information about terms of use of the content.
<self-uri> - a link to a different version of the article.
<abstract> - the article's abstract.
<kwd-group> - keywords, if supplied.
<contract-num> - contract number. This should link to/from <contract-sponsor>.
<contract-sponsor> - contract sponsor. This should link to/from <contract-num>.
<counts> - counts of objects in the article.
<custom-meta-wrap> - custom metadata (might not need this one yet).
<article-title>
This contains the article title exactly as it appears on the article.
If the article is a Book Review, use the explicit title found on the article. If no explicit title is present, use the title of the first (or only) book being reviewed. Then, use <product> to include detailed information about the title being reviewed; see Article Title for examples.
<author-notes>
This is a wrapper element for any footnotes that relate directly to the author. It includes <fn> and <corresp>. Correspondence information (beyond a simple corresponding author yes/no; see Author Names) should be set in <corresp>. All other author-related footnotes should be set in <fn>. Appropriate @fn-type values include:
Value            Meaning
com              article was communicated by
con              article was contributed by
current-aff      current affiliation
deceased         Person has died since the article was written.
equal            contributed equally in the creation of the document
present-address  contributor's current address
See Sample 1 for an example.
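A short sketch of author footnotes using two of the @fn-type values above. The footnote text and the id values are placeholders; the ids follow the "FN" prefix described under <fn>.

```xml
<author-notes>
  <fn id="FN1" fn-type="equal">
    <p>These authors contributed equally to this work.</p>
  </fn>
  <fn id="FN2" fn-type="present-address">
    <p>Present address: Department of Examples, Example University.</p>
  </fn>
</author-notes>
```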
<back>
Carries all of the article backmatter. It allows <ack>, <app-group>, <bio>, <fn-group>, <ref-list>, <glossary>, <notes>, and <sec>.
Do not set <title>.
<fn-group> should be used to hold all article-level footnotes.
<body>
Carries all of the article body. It allows <p> and all paragraph-level objects and then repeating recursive sections (<sec>).
Use <title> for section titles. Do not just set the section titles in <bold> or other formatting.
All named Figures and Tables should be collected in a <sec> in <back>.
Back matter elements, such as <app-group>, <glossary> and <ref-list>, should not appear at the end of <body>.
<caption>
Within the caption, a <title> should be tagged separately from other text of the caption (inside <p>). Do not tag the object's identifying number here; use <label> instead.
<citation>
Contains a bibliographic description of a work.
<citation> can appear within <body> or <ref-list>. It should include a @citation-type attribute.
See Sample PubMed Central Citations for fully-tagged examples.

attributes:
  • citation-type—defines the type of work being cited (for example, book or journal). Required.
<contract-num>
It should be linked to a <contract-sponsor>.

attributes:
  • rid—Reference to corresponding <contract-sponsor> with the prefix "CS".(#REQUIRED)
  • xlink:atts—Do not use.
<contract-sponsor>
<contract-sponsor> needs to be set only once, even if there is more than one grant listed. It should be linked to/from at least one <contract-num>.

attributes:
  • id—A target for the corresponding <contract-num> with the prefix "CS".(#REQUIRED)
  • xlink:atts—Do not use.
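A sketch of the linking between grant numbers and a sponsor, with placeholder grant numbers and sponsor name. The sponsor is set once and both numbers point to it through @rid.

```xml
<contract-num rid="CS1">EX-000001</contract-num>
<contract-num rid="CS1">EX-000002</contract-num>
<contract-sponsor id="CS1">Example Funding Agency</contract-sponsor>
```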
<contrib>
This holds information about a single author. We expect <name> (for people) or <collab> (for a group), <degrees>, <aff>, <email>, <ext-link> (@ext-link-type="uri" for an author's website), <role>, and <xref>.
Any <xref> in the author information should have @ref-type="author-notes" and point to a target <fn> in <author-notes>.
See Author Names for more information.
See Sample 1 and Sample 2 for examples.

attributes:
  • contrib-type—This should be set to "author" for all <contrib>, unless someone has been explicitly labelled as an editor.(#REQUIRED)
  • corresp—set as "yes" if the author is listed as the corresponding author.
  • deceased—set as "yes" if the author is indicated to be deceased.
  • equal-contrib—set as "yes" on each author that is indicated to have "contributed equally to this work".
  • id—Do not use.
  • rid—Do not use.
  • xlink:atts—Do not use.
<contrib-group>
This element holds the authors' names.
The relationships between <contrib> and <aff> can be complex, but we should be able to simplify things here.
If there is an address or affiliation supplied for each <contrib>, include the <aff> in the tagging for the <contrib>.
If there is one affiliation supplied for all of the contributors, include the <aff> in the tagging for the <contrib-group>.
If there are multiple <contrib-group>, each with a different <aff>, include the corresponding <aff> in the tagging for the <contrib-group>.
See Sample 1 and Sample 2 for examples.
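The first two cases can be sketched as follows, with placeholder names and addresses. In the first group each contributor has its own affiliation, so the <aff> sits inside the <contrib>; in the second group one affiliation applies to everyone, so the <aff> sits in the <contrib-group>.

```xml
<!-- Case 1: an affiliation supplied for each contributor -->
<contrib-group>
  <contrib contrib-type="author">
    <name><surname>Doe</surname><given-names>Jane</given-names></name>
    <aff>Department of Examples, Example University, Example City</aff>
  </contrib>
  <contrib contrib-type="author">
    <name><surname>Roe</surname><given-names>Richard</given-names></name>
    <aff>Institute of Samples, Sample Town</aff>
  </contrib>
</contrib-group>

<!-- Case 2: one affiliation shared by all contributors -->
<contrib-group>
  <contrib contrib-type="author">
    <name><surname>Doe</surname><given-names>Jane</given-names></name>
  </contrib>
  <contrib contrib-type="author">
    <name><surname>Roe</surname><given-names>Richard</given-names></name>
  </contrib>
  <aff>Department of Examples, Example University, Example City</aff>
</contrib-group>
```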
<copyright-statement>
Contains the complete copyright statement as it appears in the source.
The contents will usually be the word "Copyright", a copyright symbol, the copyright year, and the name of the copyright holder. The year of copyright should also be tagged in <copyright-year>, whether or not it appears as part of the <copyright-statement>.
Should be contained in <permissions>.
<copyright-year>
Contains only the 4-digit year of copyright.
Should be contained in <permissions>.
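A minimal sketch of the two copyright elements inside <permissions>, assuming a placeholder copyright holder. The year is tagged separately even though it also appears in the statement.

```xml
<permissions>
  <copyright-statement>Copyright &#x00A9; 2005 Example Publisher</copyright-statement>
  <copyright-year>2005</copyright-year>
</permissions>
```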
<corresp>
Within <corresp> it is not necessary to identify and tag each element of the address.
Set any label or symbol in the <label> element.
Put the address, phone, and fax information into the <corresp> as PCDATA. Follow copy for all punctuation.
Tag any email address as <email>. Do not use <phone> or <fax>.
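For illustration, a correspondence note might be tagged like this. The address, numbers, and email are placeholders; only the label and the email address get their own elements.

```xml
<author-notes>
  <corresp><label>*</label>Corresponding author. Mailing address: Department of Examples, Example University, 1 Example Street, Example City. Phone: 000-000-0000. Fax: 000-000-0001. E-mail: <email>jane.doe@example.org</email>.</corresp>
</author-notes>
```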
<day>
Must be an integer from 1-31.
<elocation-id>
Use only when the article does not have a <fpage>. Used mostly for electronic-only or online-first articles.
See Sample 1 for an example.
<ext-link>
Tag link information outside of the scope of the article.

attributes:
  • ext-link-type—Use value defined in Tag Library.
  • xlink:href—Must be defined to identify external reference.
<fig>
See Sample 2 for examples.

attributes:
  • id—Define using prefix "F".
  • position—Use "anchor" for an inline-figure, including figures that are contained within another object (<fig>, <table>, <media>). Use "float" for all other figures.
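A sketch of a floating figure that follows the rules for @id, <label>, and <caption>. The caption text and the file name in @xlink:href are placeholders, and the <graphic> element is an assumption here since this page does not describe it; see the Tag Library for its rules.

```xml
<fig id="F1" position="float">
  <!-- the identifying number goes in label, not in the caption -->
  <label>Figure 1</label>
  <caption>
    <title>Title of the figure.</title>
    <p>Remaining caption text.</p>
  </caption>
  <!-- requires the xlink namespace declaration on article -->
  <graphic xlink:href="example-f1"/>
</fig>
```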
<fig-group>
Do not use.
<fpage>
The first page of the article.
If more than one article shares the same first page, specify @seq. The first article should have @seq="a", the second @seq="b", etc. See description of Continuous Makeup Articles.
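As a sketch, two articles that both begin on page 12 would be tagged as follows, each in its own article file. The page numbers are placeholders.

```xml
<!-- first article starting on page 12 -->
<fpage seq="a">12</fpage>
<lpage>12</lpage>

<!-- second article starting on page 12 -->
<fpage seq="b">12</fpage>
<lpage>14</lpage>
```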
<floats-wrap>
Tag all unreferenced, floating figures and tables in this element with @position="float".
<fn>
Tag author footnotes in <author-notes>.
Tag footnotes that apply to the article as a whole in <fn-group> in <back>.
Tag table footnotes in <table-wrap-foot>.

attributes:
  • id—Required. Use prefix FN for non-table footnotes. Use prefix TFN for table footnotes.
  • symbol—Do not use. Capture information in <label>.
<front>
Carries all of the article frontmatter. It allows <journal-meta>, <article-meta>, and <notes>.
<journal-meta> and <article-meta> are required and detailed later in this document.
Do not use <notes> here in the frontmatter.
<history>
Contains one or more <date> elements.
See Sample 1 for an example.
<issn>
Contains the ISSN(s) for the journal. At least one <issn> must be supplied.
See Sample 2 for examples.

attributes:
  • pub-type—set as "ppub" for a print ISSN and "epub" for an electronic ISSN. If a journal has both, include two successive <issn> tags.(#REQUIRED)
<issue>
Tag numeric issues as an integer only. If Roman numerals are used, tag as the Roman numeral only. Do not include the word "issue" in the tag.
Tag "Part" issues as "Pt [integer]". Tag "Supplement" issues as "Suppl [integer]". If the Part or Supplement has no integer, then tag as "Pt" or "Suppl" only.
If there is no issue number, do not tag at all.
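The rules above give forms like the following. Each line is an alternative for a different article, not a sequence within one article.

```xml
<issue>4</issue>        <!-- numeric issue -->
<issue>IV</issue>       <!-- Roman numeral issue -->
<issue>Pt 2</issue>     <!-- "Part 2" -->
<issue>Suppl 1</issue>  <!-- "Supplement 1" -->
<issue>Suppl</issue>    <!-- supplement with no number -->
```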
<journal-id>
Multiple journal-ids may be tagged. Specify type of id in @journal-id-type. If PubMed abbreviation is available, tag with @journal-id-type="nlm-ta".

attributes:
  • journal-id-type—Required
<journal-meta>
Contains all information about the journal.
Requires <issn>, <journal-id>, <journal-title>, and <publisher>.
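A minimal sketch of the four required pieces, with placeholder titles and ISSNs. <publisher-name> is not described on this page and is included on the assumption that the publisher is identified by name; check the Tag Library for the full content model.

```xml
<journal-meta>
  <journal-id journal-id-type="nlm-ta">Example J</journal-id>
  <journal-title>Example Journal</journal-title>
  <issn pub-type="ppub">0000-0000</issn>
  <issn pub-type="epub">0000-0000</issn>
  <publisher>
    <publisher-name>Example Publisher</publisher-name>
  </publisher>
</journal-meta>
```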
<journal-title>
Contains the complete title of the journal in which the article is published.
<label>
Contains label information only (Table 1, Figure 1). Does not include <caption> data, including <title>.
Do not include emphasis that encompasses the entire contents.
<license>
Contains license information for the article. Should not contain <copyright-statement> information.
Should be contained in <permissions>.
A URI to the license description should be captured in the @xlink:href.
See Sample 1 for an example.
<license license-type="open-access" 
   xlink:href="http://creativecommons.org/licenses/by/2.5/">
	<p>This is an open-access article distributed under the terms of the 
	Creative Commons Attribution License, which permits unrestricted use, 
	distribution, and reproduction in any medium, provided the original 
	work is properly cited.</p>
</license>
	
<list>
<list> may or may not have a title. It must have one <list-item> for each point in the list.
See Sample 1 and Sample 4 for examples.

attributes:
  • list-type—Use list of values defined in Tag Library: "order", "bullet", "alpha-lower", "alpha-upper", "roman-lower", "roman-upper", "simple".(#REQUIRED)
  • prefix-word—This is for a word that should prefix the generated label in the list, e.g., "Step".
<list-item>
Contains one item in the list per tag. Each item may contain multiple <p>.
Do not use <label>.
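A short sketch of an ordered list with a prefix word. The item text is placeholder, and the second item shows multiple <p> in one <list-item>.

```xml
<list list-type="order" prefix-word="Step">
  <list-item>
    <p>First point.</p>
  </list-item>
  <list-item>
    <p>Second point.</p>
    <p>A further paragraph belonging to the same point.</p>
  </list-item>
</list>
```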
<lpage>
The numeric last page of the article. Tag even if the last page is the same as the first.
<mml:math>
This is the root element for MathML.
Do not use to tag single characters.
<month>
Must be an integer from 1-12.
If you have a month range (January-March), tag it as a <season>.
<notes>
Used for article-level notes. These often appear at the beginning of the article, the end of the article or after a <table>.
Notes about the author(s) should be placed in <author-notes>.
Article Disclaimers should be tagged as <notes> in the front matter.
Notes in Proof should be tagged as <notes> in the back matter.
<p>
Paragraph. Contains text of article.
<product>
Tag the citation information of the product as completely as possible. Information that does not have an associated element should be captured in <comment>. If there is an associated image, tag as an <inline-figure> within product.
See Sample 4 for an example.
<pub-date>
Holds the publication date(s) for the article.
If an article has more than one publication date, create successive <pub-date> tags.
The most common @pub-type values are "epub", "ppub", "epub-ppub", "epreprint", and "collection". When an article is published in more than one medium, there should be more than one <pub-date>. The number and type of <pub-date> elements to set depend on the publication model of the journal. There are two classes of publication: issue-based publication and article-based publication.
Issue-based publication is when an entire issue is "published" at one time — in print, online, or both. Issue publication dates and article publication dates coincide, so the issue publication date is all that is needed.
Article-based publication is when articles are "published" individually or in small groups. They may be published in Issues (e.g., all of the articles published online in June 2005 are in the "June" issue). This issue may be printed, or it may not. Even if the articles are not collected in Issues, they are collected in some way, perhaps by random collection dates, by volume, or by year. This "collection" date usually does not coincide with the article publication date.
The following models illustrate the options:
Print-only Model — This is the traditional print model. Articles are collected by the editor, formed into issues, and published an issue's worth at a time. All articles have the same publication date, which coincides with the Issue cover date. For this model, each article will have <pub-date> with @pub-type="ppub". In this model, the issues may go "online" the same day as the print date or sometime thereafter, but the official publication date is controlled by the Issue cover date.
Print–Online Coincident Model — This is a similar model to the Print-only Model, except there is a little more emphasis on the fact that the online version of the issue is published on the same date. For this model, each article will have <pub-date> with @pub-type="epub-ppub".
Print with Electronic Articles Prepublished Model — In this model, some or all of the articles are published (their official publication date) online before the publication date of the print issue. For this model, each prepublished article will have <pub-date> with @pub-type="epub" to represent its individual online publication date, and every article will have a <pub-date> with @pub-type="ppub" to represent the printed issue date.
Print with Electronic "Preprints" — This is similar to the "Print with Electronic Articles Prepublished Model" where articles from an issue appear online before the publication date of the issue. The difference, however, is that the online versions of the articles are electronic preprints and are not officially published. All of the articles are published on the issue cover date, and they all have the same publication date. For this model, each preprint article will have <pub-date> with @pub-type="epreprint" to represent the date it was available online, and every article will have a <pub-date> with @pub-type="ppub" to represent the printed issue date.
Articles Published Online and Collected into Print Issues Model — For this model, each article will have <pub-date> with @pub-type="epub" to represent its individual publication date and a <pub-date> with @pub-type="ppub" to represent publication date of the print issue.
Articles Published Online and Collected into "Issues" Online Model — Issues here do not need to be named "Issue 1", "Issue 2", etc. These are just collections of articles online. The collecting might be done by dates (Months, Quarters, Years) or by Volume. For this model, each article will have <pub-date> with @pub-type="epub" to represent its individual publication date and a <pub-date> with @pub-type="collection" to represent the date of the online collection it belongs in. This date may be a day/month/year, month/year, season/year, or just year.
See Sample 2 for examples.

attributes:
  • pub-type—The values depend on the model of publication used. See above for details.(#REQUIRED)
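Two of the models above, sketched with placeholder dates. The first pair is for an article prepublished online ahead of its print issue; the second pair is for an article published online and later gathered into an online collection dated by year only.

```xml
<!-- Print with Electronic Articles Prepublished Model -->
<pub-date pub-type="epub">
  <day>15</day>
  <month>6</month>
  <year>2005</year>
</pub-date>
<pub-date pub-type="ppub">
  <month>7</month>
  <year>2005</year>
</pub-date>

<!-- Articles Published Online and Collected into "Issues" Online Model -->
<pub-date pub-type="epub">
  <day>15</day>
  <month>6</month>
  <year>2005</year>
</pub-date>
<pub-date pub-type="collection">
  <year>2005</year>
</pub-date>
```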
<ref-list>
<ref-list> contains a set of references (<ref>). It may or may not have a title.
<ref>
<ref> contains a reference of some kind. It will usually contain a single <citation> or <nlm-citation>; however, complex References may contain multiple citations or a combination of text and citations.
See References for details on complex references.
See Sample PubMed Central Citations for fully-tagged examples of <citation> and <nlm-citation>.
<related-article>
The related article's citation information should be captured using the attributes. When using a DOI as the citation, tag the DOI in @xlink:href and set @ext-link-type="doi". The element may be empty.
See Sample 1 and Sample 5 for examples.

attributes:
  • related-article-type—Value should describe the type of article being pointed to. Only use a value listed in the Tag Library.(#REQUIRED)
  • ext-link-type—Required if @xlink:href is specified.
  • journal-id-type—Required if @journal-id is specified.
  • id—(#REQUIRED)
  • page—Include only the first page of the target article.
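A sketch of an empty <related-article> that points to its target by DOI. The DOI is a placeholder, the id value "RA1" is a hypothetical choice (no prefix is defined on this page), and "corrected-article" is assumed to be among the Tag Library values; confirm the value there before use.

```xml
<related-article id="RA1"
                 related-article-type="corrected-article"
                 ext-link-type="doi"
                 xlink:href="10.0000/example.0001"/>
```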
<response>
Tag fully, including metadata. Any <journal-meta> or <article-meta> not explicitly tagged in <front-stub> is inherited from the parent <article>.
See fully-tagged examples in Response and Sub-Article.

attributes:
  • response-type—Specify. Values: reply, discussion, addendum
<season>
This is a text element. Values might include "Spring", "Fall-Winter", "April-June".
Ranges of months are considered seasons.
Do not include the year in <season>.
<sec>
Section should contain <title>, <label>, or both. If it does not have either, do not tag it as a section.
<sig-block>
Used to capture signatures. If multiple signatures appear, capture in a single <sig-block>.
<sub-article>
Tag fully, including metadata. Any <journal-meta> or <article-meta> not explicitly tagged in <front-stub> is inherited from the parent <article>.
See fully-tagged examples in Meeting Reports/Abstracts and Response and Sub-Article.

attributes:
  • article-type—Specify. Use the same values described on <article>
<subject>
Defines the subjects of an article within <article-categories>. These are often used to indicate the Table of Contents headings for an article.
Do not include any <xref> within <subject>. Any article-level footnotes that must be referenced should be referenced from the <article-title>.
<table>
Table should be fully tagged.
See Sample 1 and Sample 2 for examples. More fully-tagged tables are available on the DTD website.

attributes:
  • frame—Specify as "hsides"
  • rules—Specify as "groups"
<table-wrap>


attributes:
  • id—Define using prefix "T".
  • position—Use "anchor" for an inline-table, including tables that are contained within another object (<fig>, <table>, <media>). Use "float" for all others.
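A compact sketch that combines the <table-wrap>, <table>, <label>, and table footnote rules from this page. Cell contents and footnote text are placeholders.

```xml
<table-wrap id="T1" position="float">
  <label>Table 1</label>
  <caption>
    <title>Title of the table</title>
  </caption>
  <table frame="hsides" rules="groups">
    <thead>
      <tr><th>Group</th><th>Count</th></tr>
    </thead>
    <tbody>
      <tr>
        <td>A</td>
        <td>10<xref ref-type="table-fn" rid="TFN1">a</xref></td>
      </tr>
    </tbody>
  </table>
  <table-wrap-foot>
    <!-- table footnotes use the TFN prefix -->
    <fn id="TFN1"><label>a</label><p>Footnote text.</p></fn>
  </table-wrap-foot>
</table-wrap>
```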
<table-wrap-group>
Do not use.
<title-group>
This holds the <article-title>.
It also allows <subtitle>, <trans-title>, <alt-title> and <fn-group>.
If there is a footnote to the title, put the <xref> (with @ref-type="fn") in the <article-title> or <subtitle> element, and set the <fn> in the <fn-group> in <back>.
If the article is a book review, see <article-title> for special rules.
<volume>
Tag numeric volumes as an integer only. If Roman numerals are used, tag as the Roman numeral only. Do not include the word "volume" or any related abbreviation in the tag.
If the citation is part of a special issue with no specified volume (like "Supplement 2005"), tag "Suppl" in <volume> and "2005" in <year>.
If there is no volume number, do not tag at all.
<xref>
Used to link to objects in the text. There is a complete list of @ref-types in the DTD documentation, but we should only need those from the following table in this project. The ID prefixes should correspond to the IDs defined under ID.
Value                   Meaning                   Prefix  Target Element
app                     appendix                  APP     <app>
author-notes            footnote to author        FN      <fn>
bibr                    bibliographic reference   R       <ref>
boxed-text              textbox or sidebar        BX      <boxed-text>
disp-formula            display formula           FD      <disp-formula>
fig                     figure                    F       <fig>
fn                      footnote                  FN      <fn>
list                    list or list item         L       <list> or <list-item>
sec                     section                   S       <sec>
supplementary-material  supplementary content     SD      <supplementary-material>
table                   table                     T       <table-wrap>
table-fn                table footnote            TFN     <fn>
We don't expect to need @ref-type="aff" because the affiliation information should all be supplied with the <contrib> or <contrib-group>.
Do not tag emphasis or style inside or around the <xref>.
When the text contains a list of <xref>s, tag each individually with the corresponding @rid value. When there is a range, tag the first <xref> with the @rid of the beginning of the range and the last <xref> with the @rid of the end of the range. Separate the two with the Unicode en-dash (&#x2013;). Do not tag multiple values in @rid.
Set the "linked text" inside of the <xref>. Do not expect PMC to generate the content of the link based on the id.
See Sample 1 for an example.

attributes:
  • ref-type—See the table above for allowed values.(#REQUIRED)
  • rid—This is the @id of the referenced object. See the table above for ID/IDREF prefixes.(#REQUIRED)
  • id—Do not use.
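A sketch of a citation list, a citation range, and a figure reference in running text. The sentence and the ids are placeholders that follow the prefixes in the table above; note that the linked text is set explicitly inside each <xref>.

```xml
<p>Earlier studies (<xref ref-type="bibr" rid="R1">1</xref>,
<xref ref-type="bibr" rid="R3">3</xref>&#x2013;<xref ref-type="bibr" rid="R5">5</xref>)
reported the effect shown in <xref ref-type="fig" rid="F2">Figure 2</xref>.</p>
```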
<year>
Must be a 4-digit number.