Section 6.1 Paragraphs

Much of your writing will happen in paragraphs, delimited by the simple tag, <p>. You are reading one right now. They are a basic building block of divisions, and also a basic building block of other structures. For example, an ordered list, <ol>, contains a sequence of list items, <li>, and a typical list item might be a sequence of paragraphs. (Do not confuse this element with the anomalous <paragraphs> subdivision, Section 6.7).

Paragraphs are a choke-point of sorts. Many tags can only be used within paragraphs, and many others cannot be used within paragraphs. Notice too, that we do a certain amount of manipulation of whitespace in a paragraph, in ways that you may not even notice.

The following subsections together contain allowed, or encouraged, markup within a paragraph. Many of these may be used in captions and titles, but some of the more complicated constructions (which appear later here) cannot be used in captions or titles.

Note that some of this may appear to be overkill, and if you choose not to use it, you may even have success for a while with one of the output formats. But if you wish to ever produce multiple outputs, then the following is necessary. For example, a plain octothorpe (hash), #, will migrate just fine to HTML output, but will cause fatal errors in LaTeX output, and can also cause problems in output formats that employ Markdown. JSON is another component of some output formats which can be problematic.

We will say it again. PreTeXt is a markup language, and our various output formats (LaTeX, HTML, EPUB, Jupyter notebooks) in turn employ markup languages. These use different escape characters and give different characters special meanings. Our job is to insulate you from this variety, but it requires that you use markup in places where you might “normally” just press a key on your keyboard. The descriptions below will contain more specific information.

One more comment: typewriters, computer keyboards, and the ASCII character set limit the full range of characters that typographers and printers have used historically. A case in point is the hyphen, which is a single key on a keyboard. However, there are at least three common dashes of differing lengths (hyphen, en dash, and em dash), and in the context of mathematics or a computer program, the hyphen might be the binary operation of subtraction or the unary operation of negation. PreTeXt will help you navigate this complexity, but you will want to use keyboard characters or markup appropriately. So if you care about communicating clearly, and making your writing easy for a reader to use, absorb the details that follow and the philosophy they implement.
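As a small illustration of the distinctions just described, here is a sketch of one paragraph of source that uses a keyboard hyphen inside a compound word, an en dash for a range, em dashes to set off a phrase, and mathematics markup for subtraction and negation. The sentence itself is invented for this example.

```xml
<p>
  The well-known years 1955<ndash />1975 were productive<mdash />remarkably
  so<mdash />even as the balance fell from <m>x</m> to <m>x - 5</m>,
  and once reached <m>-3</m>.
</p>
```

The hyphen in “well-known” is typed directly, while the subtraction and negation signs live inside <m>, so each converter can choose the right character for its output.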

We begin with some simple “grouping” elements which contain several excellent examples of the importance and utility of careful markup. There is a plethora of empty tags for individual characters, and these are very important (see Subsection 6.1.4). We defer them to the end of this section, since they are not as instructive, but do not think this means they are an afterthought. They can be extremely critical for successful conversions. Also do not miss Best Practice 6.1.1 in the conclusion of this section.

Subsection 6.1.1 Simple Markup in Paragraphs

Beyond empty tags that translate to various characters (Subsection 6.1.4), there are relatively simple tags that can call attention to various portions of a sentence, or generate more complicated constructions than those individual characters.

Most, if not all, of the markup in this subsection may also be used within titles and captions, though they might lose some of their features when used in a title, especially when the title is duplicated in other contexts, such as a Table of Contents.

<q>, quotes, “group”

This is the first of several grouping tags, using characters with left and right variants (see the individual characters in isolation in Subsection 6.1.4), and some of the most common markup in your writing. Presentation uses double quote marks that are smart quotes, meaning that they look different in their opening and closing variants. With output based on HTML, you may use your keyboard's double quote mark instead and get acceptable output, but your LaTeX output will look wrong (plus you will look like a beginner with LaTeX) and outputs based on JSON will be totally broken. Furthermore, if you use the <q> element your HTML output will look better (typographically) than if you do not use it. (See <blockquote> for extensive runs of quoted text that can stand alone, and which can carry an attribution.)

<sq>, single quotes, ‘group’

Perhaps less often used than <q>, so a couple more characters to type. Presentation is paired single quotes, opening and closing. Read the discussion above about double quotes if you have not already; this tag is entirely similar.
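A minimal sketch of the two quotation tags used together, with a single-quoted word nested inside a double-quoted sentence. The sentence is invented for illustration.

```xml
<p>She said, <q>The sign read <sq>closed</sq> when I arrived.</q></p>
```

No quote characters are typed at the keyboard here; each converter supplies opening and closing marks suited to its output format.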

<braces>, braces, aka curly brackets, {group}

Left and right braces to enclose a phrase. This is not for creating a set in mathematics; use the appropriate mathematics tag and syntax for that.

<angles>, angle brackets, 〈group〉

Left and right angle brackets to enclose a phrase. This is not for creating a set of generators in mathematics; use the appropriate mathematics tag and syntax for that. Note also that the characters used here are distinct from the inequalities, <less> and <greater> (< and >).

<brackets>, square brackets, [group]

Left and right square brackets to enclose a phrase. This is not for grouping expressions in mathematics; use the appropriate mathematics tag and syntax for that.

<dblbrackets>, double square brackets, ⟦group⟧

Double left and right square brackets to enclose a phrase. This is not for grouping expressions in mathematics; use the appropriate mathematics tag and syntax for that. These are used in the analysis of texts to note various restorations or deletions. Inquire if you feel there should be more semantic markup for this purpose.

<em>, emphasis, important

Use this element to surround characters in a phrase that is to be emphasized. This will typically be rendered in italics, though this choice is left to the implementation of a particular conversion. See also, <alert>.

If you are new to using a markup language, this is a place to stop and think. As a PreTeXt author you are never able to say, “I want this text to appear in italics.” Rather, you specify that certain text has a certain purpose or meaning. Emphasis is a way of calling attention to a portion of a sentence or paragraph. A font change (to italic) is a common and effective device. But a particular format might have a better, or different, way to achieve this. Perhaps in an electronic format, the letters are animated and dance up and down. (Just kidding. But you may be reminded of frequent blinking text in the early days of web design, supported by a non-standard <blink> element.) Seriously, now would be a good time to review Section 1.1.

<alert>, alert, critical

Use this to heavily emphasize something to a greater degree than just emphasis. Maybe think of it as SHOUTING. Bold italic font, or a bright color, or both, would be normal choices for presentation. Overuse of this tag will dilute its effectiveness.
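For comparison, a short sketch with one use of each tag. The sentences are invented, and how each is rendered is left to the converter.

```xml
<p>
  Save your work <em>before</em> running the conversion.
  <alert>Never</alert> edit the generated files by hand.
</p>
```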

<term>, terminology, larvae

Use this to identify a word or phrase that is being defined, in contrast to actually using a structured <definition>. Typical presentation is a bold font. Caution: the use of this tag is to communicate a defined term and converters may make use of this interpretation, given the importance of definitions in scholarly work. It would be considered tag abuse to use this tag to simply bold a word or phrase for some other reason, perhaps as an alternative to <em> or <alert>.

<foreign>, foreign words, idioms, phrases, Hola

This tag is used to identify words or phrases from a language other than the main one used for the overall document. It is best practice to use an @xml:lang attribute to identify the language, since this will assist screen readers and hyphenation algorithms. We may also recognize the need for a different script (font). Usual presentation is italics for languages using a Latin script. This should not be used for entire paragraphs as a way of assisting with a translation of an entire document.

Note that when we use italics for emphasis and to point out foreign words or phrases, there is a loss of information in our output. In other words, we can no longer reliably (in an automated way) convert our output back to equivalent PreTeXt source from its visual representations. C'est la vie. See Section 1.1 again.
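A sketch of a foreign phrase marked with its language. The value "fr" is an assumption (a generic language code for French); check Section 1.1 and the rest of this guide for the exact form of language codes that PreTeXt expects.

```xml
<p>
  After the third rejection she shrugged:
  <foreign xml:lang="fr">C'est la vie</foreign>.
</p>
```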

<pubtitle>, <articletitle>, titles of books and articles

These provide the ability to typographically distinguish the title of another work, and are not a replacement for careful bibliographies and citations. Use <pubtitle> for longer, complete works, such as books, plays, or entire websites. Use <articletitle> for shorter, component works, such as a chapter of a book, a research article, or a single webpage.

Presentation for a longer work will be italics or an oblique (slanted) font, and for a shorter work, the title will simply be quoted.
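A minimal sketch showing both tags in one sentence. The book title is the Bringhurst work mentioned later in this section; the article title “Notes on Dashes” is made up for illustration.

```xml
<p>
  Before reading <pubtitle>The Elements of Typographic Style</pubtitle>,
  I skimmed a short review titled <articletitle>Notes on Dashes</articletitle>.
</p>
```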

<abbr>, <init>, <acro>, abbreviation, initialism, acronym, Mr., XML, SCUBA

An abbreviation is a condensed or shortened version of some word or phrase, such as Mr. for “Mister.” Converters should take care with periods (full stop) inside an <abbr> as distinguished from the end of a sentence (which may not always be clear given the absence of a tag delimiting sentences). An initialism is an abbreviation read as a sequence of letters, often the first letter of words in a phrase, such as HTML for “HyperText Markup Language.” An acronym is much like an initialism, but the letters are read as a pronounceable word (which sometimes actually enters the language as a word, such as “radar” which began as RAdio Detection And Ranging). An example is SCUBA which stands for “Self-Contained Underwater Breathing Apparatus.” Initialisms and acronyms may be presented in a small-capitals font or as regular capitals reduced in size.
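The three tags might appear together as in this sketch, which reuses the examples from the heading above in an invented sentence.

```xml
<p>
  <abbr>Mr.</abbr> Smith learned <init>XML</init> long before
  taking up <acro>SCUBA</acro> diving.
</p>
```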

<delete>, <insert>, <stale>, editing assistance, gone, new, old

These denote portions of a text that are being changed in some way, presumably as part of an editing process. Conceivably, they could be managed by some other tool acting on your source. Stale text is that which is slated for removal eventually, but is left in place so that it may be consulted. There is no support presently for actually deleting or incorporating text, though that would be a reasonable feature request.

Red and green, for leaving and entering, are natural choices for presentation. But in consideration of those readers who cannot always distinguish different colors, other devices, such as strikethrough or underlining, should also be employed.
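A sketch of the three editing tags in a single invented paragraph: one word replaced by another, and one sentence kept for consultation but slated for removal.

```xml
<p>
  The meeting is on <delete>Tuesday</delete><insert>Thursday</insert>.
  <stale>Bring the old agenda.</stale>
</p>
```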

<tag>, <tage>, <attr>, tag, empty tag, attribute, <section>, <hash />, @width

These are PreTeXt tags for when we write about PreTeXt and need to discuss tags, empty tags, and attributes. Given how we design PreTeXt tags, the content of these elements should only be the 26 lower-case letters and a dash/hyphen. These should render in ways that make the three types of language elements obvious without much further discussion. Just a bit self-serving, but not unjustified.
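The three renderings in the heading above could come from source like the following sketch. Note that the content is just the bare name, with no angle brackets, slash, or at sign.

```xml
<p>
  A <tag>section</tag> is a division, <tage>hash</tage> is an empty
  element, and <attr>width</attr> is an attribute.
</p>
```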

<taxon>, scientific names, Escherichia coli

This element may surround a full scientific name, resulting in presentation in italics. There are subelements <genus> and <species> which may be used to delineate those components.

A @ncbi attribute on <taxon> accepts an identifier from the National Center for Biotechnology Information. Feature requests for ways to make this more useful are welcome.
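A sketch combining the subelements and the attribute. The value 562 is, to the best of our knowledge, the NCBI Taxonomy identifier for this species, but treat it as an assumption and confirm it against the NCBI database before relying on it.

```xml
<p>
  Most strains of
  <taxon ncbi="562"><genus>Escherichia</genus> <species>coli</species></taxon>
  are harmless.
</p>
```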

<fn>, footnotes

A footnote can be inserted in a paragraph and a mark will be left behind. Where the content of the footnote goes depends on the capabilities of the output format. Because a footnote allows you to begin a new piece of text anywhere, it can be difficult to handle technically. For this reason it is banned from places like titles and its possible content is limited (for openers, no paragraphs).

A footnote is the farthest thing from structured writing that we can think of. It can go anywhere. Resist the temptation to use it, and your writing will improve. We frequently entertain the thought of making footnotes impossible in PreTeXt. See the <aside> for a possible alternative.
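If you do use one, a footnote sits inline at the point where its mark should appear, as in this minimal sketch with invented text. Its content is a short run of text, with no paragraphs inside.

```xml
<p>
  Paragraphs are a basic building block.<fn>So are list items, in a
  sense.</fn> Most of your writing will happen inside them.
</p>
```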

<m>, mathematics, \(x^2+y^2\)

Simple, inline expressions using mathematical notation may be used in paragraphs, and in titles and captions. The syntax is LaTeX. See Section 6.8 for full details.
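A short sketch of inline mathematics, extending the expression shown in the heading above. The sentence is invented for illustration.

```xml
<p>
  A circle of radius <m>r</m> centered at the origin satisfies
  <m>x^2+y^2=r^2</m>.
</p>
```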

<c>, code, verbatim text, literal text, import

Short bursts of raw, or verbatim, text can be wrapped in the <c> element. Strictly speaking, “code” is a misnomer, as the text may be anything you need to communicate exactly as one would type it at a keyboard or as input to a computer program. Anything longer than a handful of characters, or needing accurate line breaks should consider the <cd>, <pre>, <program> or <console> tags. Presentation is normally a monospaced sans serif font, perhaps of a slightly heavier weight, and designed for the job with features such as unambiguous zeros (versus the letter ‘oh’). See Section 6.5 for details.

<email>, email address, nobody@example.com

Very similar to the <c> tag, this may be used to get a monospace presentation of an email address, possibly as an active link in some formats.
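The two verbatim-style tags above might be used as in this sketch. The command is an arbitrary example and the address is the placeholder from the heading.

```xml
<p>
  Type <c>import math</c> at the prompt, and send questions to
  <email>nobody@example.com</email>.
</p>
```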

Subsection 6.1.2 Cross-References and Paragraphs

There are several devices for creating cross-references. Generally, these are unwise (or banned) in titles, and if allowed may be inactive in certain portions of an electronic output format (such as when migrating to a Table of Contents).

<url>, linking external resources

This is a cross-reference to some item separate and distinct from your document.

A Uniform Resource Locator (URL) is, loosely speaking, an Internet address for some item. Presentation depends on the capabilities of the output format to serve the resource. There is a mandatory attribute, @href, that contains the full URL, including a protocol (such as http://). When the tag is used as an empty tag, the visual text will be the exact contents of the @href attribute. So, http://www.example.com can be achieved with

<url href="http://www.example.com"/>

You may also wish to provide some text other than the actual URL, which you can specify as the content of the <url> element. For example, IANA Test Site can be achieved with

<url href="http://www.example.com">IANA Test Site</url>

In order to have a URL occur in print output in a useful way, and in electronic output in an active way, I often shorten the full URL in the visual portion and mark it up as verbatim text. So illustrating again, we get www.example.com from

<url href="http://www.example.com"><c>www.example.com</c></url>

Notice the necessity and/or desirability of marking the text in a way that distinguishes it as literal text.

Note also that this tag is meant for external resources, so see the <xref> element (below) or Section 6.6 for ways to link internally (i.e. within your document).

<xref>, cross-references

This is a cross-reference to some item contained within your document.

Extensive and intuitive capabilities for cross-referencing are a primary feature of PreTeXt. Typical use is an empty tag with the @ref attribute specifying the value of an @xml:id on the target of the cross-reference. This should work easily without much more instruction, but familiarize yourself with the details in Section 6.6 to get the most out of some of the available options.
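The typical use described above can be sketched as a target carrying an @xml:id and a later paragraph pointing to it. The identifier theorem-pythagoras is a name chosen for this example.

```xml
<theorem xml:id="theorem-pythagoras">
  <title>Pythagorean Theorem</title>
  <statement>
    <p>
      In a right triangle with legs <m>a</m> and <m>b</m> and
      hypotenuse <m>c</m>, we have <m>a^2+b^2=c^2</m>.
    </p>
  </statement>
</theorem>

<p>We now apply <xref ref="theorem-pythagoras" /> to a specific triangle.</p>
```

The text of the cross-reference is generated for you, so renumbering or moving the theorem requires no change to the paragraph that cites it.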

<idx>, index target

This indicates that the containing structure (theorem, example, etc.), or top-level paragraph, should be the target of an entry of the index (a special sort of cross-reference). See Section 3.20 and Section 6.16 for general details.
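A minimal sketch of an index entry placed in a top-level paragraph, which then becomes the target of the entry. The entry text is invented, and only the simplest form is shown; see the sections cited above for headings, subheadings, and other options.

```xml
<p>
  <idx>smart quotes</idx>
  Smart quotes look different in their opening and closing variants.
</p>
```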

<notation>, notation target

This indicates that the containing definition, or top-level paragraph, should be the target of an entry of the list of notation (a special sort of cross-reference). See Section 3.20 and Section 6.16 for general details.

Subsection 6.1.3 Structured Markup in Paragraphs

There are three categories of items which typically are structured further, and which are almost entirely restricted to appearing in a paragraph. Given their complexity, details are covered in other sections of this guide.

Lists

With only one major exception (and three minor ones), a list must appear within a paragraph. See Section 3.7 for an introduction and Section 6.9 for precise details.
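In the usual case, then, the list element sits inside the <p> that introduces it, as in this sketch with invented content.

```xml
<p>
  A good draft needs three things:
  <ol>
    <li><p>a clear outline,</p></li>
    <li><p>careful markup, and</p></li>
    <li><p>patience.</p></li>
  </ol>
</p>
```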

Display Mathematics

Displayed mathematics, which is a single equation or a sequence of (aligned) equations, can only be placed within a paragraph. The relevant tags are <me>, <men>, <md>, and <mdn>, with the latter two necessarily structured with <mrow> elements. See Section 3.5 for an introduction and Section 6.8 for precise details.
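A sketch of a two-line aligned display inside a paragraph, using <md> structured with <mrow>. The alignment points are marked with the \amp macro, since a bare ampersand cannot appear in XML source. The algebra is an arbitrary example.

```xml
<p>
  Expanding the square gives
  <md>
    <mrow>(x+1)^2 \amp = (x+1)(x+1)</mrow>
    <mrow>\amp = x^2 + 2x + 1</mrow>
  </md>
  which we will use repeatedly.
</p>
```

Notice that the paragraph continues after the display, which is one reason displays live inside paragraphs rather than between them.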

Display Verbatim

The <cd> tag, by analogy with the <md> tag for displayed mathematics, may be used to display one or more lines of verbatim text (such as a series of commands), possibly structured with the <cline> tag. See Section 3.14 for an introduction and Section 6.5 for precise details.

This should not be confused with the <pre>, <console>, or <program> tags, which have slightly different uses, and all of which must be used outside of a paragraph.
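A sketch of a short series of commands displayed inside a paragraph with <cd> and <cline>. The commands and the file name source.xml are hypothetical.

```xml
<p>
  Record your changes with
  <cd>
    <cline>git add source.xml</cline>
    <cline>git commit -m "Revise paragraphs"</cline>
  </cd>
  and then rebuild your output.
</p>
```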

Subsection 6.1.4 Characters in Paragraphs

<nbsp />, non-breaking space

A space, but which ties two words together and discourages a line break when formatted, such as Summer<nbsp />1967. This can also be used to discourage a period in an abbreviation from being interpreted as the end of a sentence, such as C.R.<nbsp />Darwin.

<ndash />, –, en dash

A dash, the width of a lowercase ‘n’, or exactly half the width of the em dash. This is typically used to express a range, such as 1955<ndash />1975, with no intervening spaces. It is often expressed as two hyphens when typed. Bringhurst suggests an en dash surrounded by spaces – thusly – when setting off phrases.

<mdash />, —, em dash

A dash, the width of a lowercase ‘m’, or twice the width of the en dash. This is typically used to express a secondary part of a phrase, much like the use of a semi-colon or parentheses.

<ampersand />, &

This is the symbol often read as “and”, a stylised contraction of “et”. Be careful, ampersands in mathematics and in verbatim text (code) are implemented differently. This version is for use with “normal” text. This is the escape character for XML and has special meanings in LaTeX.

ALWAYS use this empty element in normal text, or no processing of any kind will succeed. Review Section 3.13 for use within mathematics or verbatim text. See also Subsection 6.8.4 and the introduction to Section 6.5.

<less />, <greater />, <, >, less than, greater than

If you need these symbols in “normal” text (which should be unlikely), this is the way to get them. A bare ‘<’ will totally confuse the XML processor, since, as you now know, it has a special meaning in XML syntax. Mathematics and verbatim text have their own variants. Note that these are the keyboard characters commonly used for inequalities in text, and are different from the angle-bracket grouping characters <langle /> and <rangle /> (compare the <angles> tag in Subsection 6.1.1).

ALWAYS use these empty elements in normal text, or no processing of any kind will succeed. Review Section 3.13 for use within mathematics or verbatim text. See also Subsection 6.8.4 and the introduction to Section 6.5.

<hash />, #, octothorpe, numeral sign, pound avoirdupois, hash

This character has special meaning in LaTeX, and in Markdown, so consistently using this empty element for normal text is critical for successful conversion to popular outputs, and so is a good practice.

<dollar />, $, dollar sign

This character has special meaning in LaTeX syntax for mathematics, so consistently using this empty element for normal text is critical for successful conversion to every output, and so should be used ALWAYS.

<percent />, %, percent, per centum

This character has special meaning in LaTeX, so consistently using this empty element for normal text is critical for successful conversion to popular outputs, and so is a good practice.

<circumflex />, ^, circumflex, caret

This character has special meaning in LaTeX, so consistently using this empty element for normal text is critical for successful conversion to popular outputs, and so is a good practice.

<underscore />, _, underscore

This character has special meaning in LaTeX, and in Markdown, so consistently using this empty element for normal text is critical for successful conversion to popular outputs, and so is a good practice.

<lbrace />, <rbrace />, {, }, left brace, right brace

These characters have special meaning in LaTeX, and in JSON, so consistently using these empty elements for normal text is critical for successful conversion to popular outputs, and so is a good practice.

<tilde />, ~, tilde

This character has special meaning in LaTeX, so consistently using this empty element for normal text is critical for successful conversion to popular outputs, and so is a good practice.

<backslash />, \, backslash

This is the escape character for LaTeX, Markdown, and JSON! So consistently using this empty element for normal text is critical for successful conversion to almost every popular output, and so is recommended ALWAYS.
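To see several of the preceding empty elements at work, here is a sketch of one paragraph of normal text in which every troublesome keyboard character has been replaced by its markup. The sentences are invented for illustration.

```xml
<p>
  Order <hash />42 from Smith <ampersand /> Sons cost <dollar />30,
  a 15<percent /> saving. The folder was listed as
  C:<backslash />temp on the invoice.
</p>
```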

<lq />, <rq />, “, ”, left and right double-quotes

These are the characters in isolation. If you are using them in pairs, see the <q> tag in Subsection 6.1.1 for a matched pair. Note that LaTeX syntax expresses quote marks in a very complicated way, and typographically a left quote mark is different from a right quote mark, so it is never good practice to use the keyboard versions.

<lsq />, <rsq />, ‘, ’, left and right single-quotes

These are the characters in isolation. If you are using them in pairs, see the <sq> tag in Subsection 6.1.1 for a matched pair. Note that LaTeX syntax expresses quote marks in a very complicated way, and typographically a left quote mark is different from a right quote mark, so it is never good practice to use the keyboard versions. In particular, note that the <lsq /> character is not the same as the “backtick” of various markup languages.

<ellipsis />, …, ellipsis

Typically three low dots with no intervening space, to indicate a continuation. This will always perform better than three consecutive periods.

<asterisk />, ∗, asterisk, star

This is the character, and not the raised mark used to indicate a footnote or endnote. It has a special meaning in Markdown.

<backtick />, `, backtick, accent grave

This is the keyboard character that looks like a left single quote. It is not a quote mark (use <lsq /> for that character). It is often used to modify other characters, as an accent. But that is not the use of this empty element either. This is the keyboard character, which is sometimes used in other markup languages. So, again, you could do fine pressing the key, but LaTeX might turn it into a left quote mark, and it might cause confusion when Markdown is employed.

<slash />, /, slash, forward slash

This is the character used to separate words/information/text. It is not to be confused with the <solidus /> (or virgule) used to form a fraction in normal text. You should be able to reliably use the keyboard character / instead. We include the markup version if making the distinction is important for your project.

<midpoint />, ·, midpoint

A small centered (vertically) dot, which can be used to separate pieces of information, especially in displayed text (i.e. outside of paragraphs). Not to be confused with a bullet preceding a list item, or multiplication in mathematics.

<swungdash />, ⁓, swung dash

Another decorative separator, not to be confused with the keyboard tilde character.

<permille />, ‰, per mille

Like per cent, but now a number expressed as its product with \(1000\) (rather than with \(100\)).

<pilcrow />, ¶, pilcrow, paragraph mark

Mark used historically to indicate the start of an internal paragraph, and in a more modern use, to indicate a permalink.

<section-mark />, §, section mark

Used to prefix the number of a section, or other division. (So the word section is being used generically here.)

<copyright />, ©, copyright

The symbol used in publishing, legal, or business contexts. For a PreTeXt project, copyright information can be specified within the <colophon> portion of the <frontmatter>.

<trademark />, ™, trademark

The symbol used in legal or business contexts.

<registered />, ®, registered

The symbol used in legal or business contexts.

<today />, <timeofday />, September 16, 2018, 19:58:19 (-07:00)

Values at the time of XML processing. Useful for marking drafts or other frequently revised material such as online versions.

<tex />, <latex />, <pretext />, <webwork />, TeX, LaTeX, PreTeXt, WeBWorK

Conveniences for frequently-mentioned accessories to PreTeXt.

<ie />, <eg />, <circa />, <vs />, <etc />, i.e., e.g., c., vs., etc.

A small collection of frequently-used Latin abbreviations, with attempts to handle the tricky periods wisely in output. Strictly speaking BC is not Latin, but we include it for completeness. Tags are always lowercase, no punctuation, usually two letters.

Tag     Realization   Meaning
ad      AD            anno Domini, in the year of the Lord
am      A.M.          ante meridiem, before midday
bc      BC            English, before Christ
circa   c.            circa, about
eg      e.g.          exempli gratia, for example
etal    et al.        et alia, and others
etc     etc.          et caetera, and the rest
ie      i.e.          id est, in other words
nb      N.B.          nota bene, note well
pm      P.M.          post meridiem, after midday
ps      P.S.          post scriptum, after what has been written
vs      vs.           versus, against
viz     viz.          videlicet, namely

<fillin />, fill-in blank

A “fill in the blank” blank. While meant for use with normal text, it may also be used within mathematics contexts. The @characters attribute may be used to hint at how long the line will be, should the default value of 10 be too long or too short. Here we have set @characters to the value 5.
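A sketch of a blank with the @characters attribute set to 5, as described above. The question itself is invented for illustration.

```xml
<p>The capital of France is <fillin characters="5" />.</p>
```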

<times />, ×, times, multiplication

For simple arithmetic expressions in text, this symbol may be used. Or it may be used to specify dimensions, as in “I bought a 2×4 at the lumber yard.”

<solidus />, ⁄, solidus, virgule, fraction bar

For simple arithmetic expressions in text, this symbol may be used to form a fraction. It should appear to have a significantly shallower slope than the forward slash, /.
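The two arithmetic characters might be used together as in this sketch, which extends the lumber yard sentence above with an invented fraction.

```xml
<p>
  I bought a 2<times />4 at the lumber yard, but it was
  3<solidus />4 of an inch too short.
</p>
```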

SI Units

Système international (d'unités) (International System of Units) is a system of measurement used universally in science. PreTeXt has comprehensive support for this system and its notation and abbreviations. See Section 3.24 for a short introduction and Section 6.22 for detailed descriptions of the relevant elements and their use.

𝄪, ♯, ♮, ♭, 𝄫, music notation

Notes, chords, and other notation may appear within sentences as part of a discussion. See Section 6.21 for detailed descriptions of the relevant elements.

Best Practice 6.1.1 Understand the Importance of Careful Markup

There is a lot of detailed information in this section. Much of it is critically important, some is superfluous. If you are new to thinking in terms of markup (rather than WYSIWYG tools), it might be overwhelming, a lot to digest, and hard to separate the wheat from the chaff. Careful here means using the necessary markup, not using it for other purposes different than its intent (tag abuse), planning ahead for different output formats, but not becoming a slave to over-doing it.

So come back here often for a re-read. And keep in mind that PreTeXt is designed around principles (List 1.1.1), and that it is markup (Item 1.1.1:1) which enables multiple outputs (Item 1.1.1:3) and effective and beautiful online versions (Item 1.1.1:6).