Use lightweight markup - Specifics

Modern technical writing - Andrew Etter 2016

Specifics

Now that we've discussed the basic things technical writers should do, we can delve into the nitty-gritty of how to do them.

Use lightweight markup

Someone once said, "XML is like violence: if it doesn’t solve your problem, you aren’t using enough of it." But whoever said this was definitely a) a developer and b) not writing XML by hand.

Because writing XML by hand is crazy, in the same way that writing any significant web application in JavaScript is crazy. Application developers should write TypeScript, Dart, CoffeeScript, or practically anything else and then compile to JavaScript. Likewise, if XML is a part of your publishing pipeline, you should write lightweight markup and then build to XML. The entire point of lightweight markup is to make it easier to produce well-formed XML, and we need XML in order to build websites.
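The "write lightweight markup, build to XML" idea can be sketched in a few lines of Python. This is a minimal illustration, not a real converter: the markup subset (`= ` for a title, `* ` for list items, anything else a paragraph), the `build_xml` function, and the DocBook-like element names are all assumptions made for the example, and the output is not valid DocBook.

```python
import xml.etree.ElementTree as ET


def build_xml(source: str) -> str:
    """Convert a tiny, hypothetical markup subset to an XML string.

    '= Title' becomes <title>, '* item' becomes <listitem> inside an
    <itemizedlist>, and any other non-blank line becomes <para>.
    """
    article = ET.Element("article")
    current_list = None
    for line in source.splitlines():
        line = line.strip()
        if not line:
            current_list = None  # a blank line ends any open list
            continue
        if line.startswith("= "):
            ET.SubElement(article, "title").text = line[2:]
            current_list = None
        elif line.startswith("* "):
            if current_list is None:
                current_list = ET.SubElement(article, "itemizedlist")
            ET.SubElement(current_list, "listitem").text = line[2:]
        else:
            ET.SubElement(article, "para").text = line
            current_list = None
    # The serializer closes every tag and escapes special characters.
    return ET.tostring(article, encoding="unicode")


xml_output = build_xml(
    "= My Title\n\nTom & Jerry < 3\n\n* List item\n* Another list item"
)
print(xml_output)
```

The writer never balances a tag or escapes an ampersand by hand; the build step does both, which is why the resulting XML is well-formed by construction.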

Why is lightweight markup so superior? Consider the following AsciiDoc:

= My Title

This introductory paragraph contains a link to
https://www.google.com[Google].

[source,ruby]
puts "Here is some code."

== My Section

* List item
* Another list item

Here is the corresponding DocBook, an XML-based markup language:

This introductory paragraph contains a link to

Google.

puts "Here is some code."

List item

Another list item

The AsciiDoc content is human-readable in raw form, straightforward to learn, and weighs in at 179 characters. The DocBook content is challenging to parse even with syntax highlighting, requires specialized knowledge of odd tags, and contains 723 characters. That's four times the number of characters for the exact same content.
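As a rough check of the AsciiDoc figure, the sample can be counted directly. The sketch below assumes the sample is laid out with one blank line between blocks and no trailing newline; the DocBook figure of 723 is taken from the text rather than recomputed.

```python
# The AsciiDoc sample as a plain string (layout assumed: one blank line
# between blocks, no trailing newline).
ASCIIDOC_SAMPLE = """\
= My Title

This introductory paragraph contains a link to
https://www.google.com[Google].

[source,ruby]
puts "Here is some code."

== My Section

* List item
* Another list item"""

asciidoc_chars = len(ASCIIDOC_SAMPLE)
docbook_chars = 723  # figure quoted in the text, not recounted here

ratio = docbook_chars / asciidoc_chars
print(asciidoc_chars)   # 179
print(round(ratio, 2))  # 4.04
```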

As we discussed earlier, one of the tenets of modern technical writing is that everyone is a contributor. Storing content directly in XML-based languages like XHTML, DocBook, and DITA dramatically reduces people's ability to contribute.

What about specialized editors? MadCap Flare and Adobe FrameMaker, two popular authoring applications, provide you with WYSIWYG editors for the rapid creation of well-formed XML. But they cost $1,448 and $999 [1], respectively, and are only available for Microsoft Windows. Rather than being mere deterrents (like writing in XML), specialized applications actually prevent people from contributing. Amazing text editors are available on every operating system, mostly for free, and writers can use whichever they like. Popular editors include:

· Atom

· Sublime Text

· Notepad++

· TextWrangler

· gedit

· Vim

· Emacs

Note

The steep learning curve on Vim and Emacs is such that I don't recommend them for most people, but their power and flexibility are undeniable.

Microsoft Word is a wonderful choice for creating résumés and a horrible choice for creating documentation. Its lone purpose in this world—one that, again, it really does perform admirably—is to create short, attractive PDFs that can be consumed and discarded. Documentation with any sort of lifespan needs to be kept in version control, which Word's DOCX file format (a compressed collection of XML files) actively opposes. Documentation should live online, and Word's abysmal HTML export is totally unsuitable for creating websites. You have to style content in Word as you write it, rather than taking advantage of the natural separation of content and style, of HTML and CSS. Even though most companies provide licenses to their employees, Word still costs money and is only available for Windows and macOS. For documentation, lightweight markup is free and superior in every meaningful way.
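To see why a compressed collection of XML files sits poorly in version control, compare a plain-text change with the same change packed into a ZIP archive. This is only a sketch: the `pack` helper and the sample sentences are made up for illustration, and the archive is a stand-in that borrows the `word/document.xml` member name, not a valid DOCX file.

```python
import difflib
import io
import zipfile


def pack(xml_text: str) -> bytes:
    """Zip one XML member in memory; a stand-in for a DOCX, not a valid one."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("word/document.xml", xml_text)
    return buffer.getvalue()


old_text = "Install the package.\nRun the tests.\n"
new_text = "Install the package.\nRun the full test suite.\n"

# Plain text: a line-based diff shows exactly which line changed.
text_diff = list(difflib.unified_diff(
    old_text.splitlines(), new_text.splitlines(), lineterm=""))
print("\n".join(text_diff))

# Zipped XML: the file on disk is a binary archive, so a line-based diff
# has nothing readable to work with until the archive is unpacked.
packed = pack(f"<document>{new_text}</document>")
print(packed[:2])  # b'PK'
with zipfile.ZipFile(io.BytesIO(packed)) as archive:
    print(archive.namelist())  # ['word/document.xml']
```

A lightweight markup file behaves like the first case: each commit shows the sentences that changed, which is what reviewers and contributors need.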

Plenty of lightweight markup languages exist, but only three are really worth discussing: Markdown, reStructuredText, and AsciiDoc.

Markdown

Markdown is simultaneously incredible and infuriating, wonderful and maddening. It's the most widely used lightweight markup language in the world [2] and has the cleanest syntax, but it also has a limited set of features and no defined standard. These deficiencies have led to a couple dozen "flavors" of Markdown, including MultiMarkdown, Markdown Extra, GitHub Flavored Markdown, and a recent standardization effort called CommonMark. Each flavor adds features, some of which are implemented across multiple flavors with inconsistent syntax. The more maddening part, though, is that almost every Markdown parser (and many exist) processes whitespace just a little bit differently. Nested lists, for example, might have odd spacing in one flavor and not in another.

Using only "vanilla" Markdown syntax allows for broad compatibility, but you miss out on features like tables and "fenced" code blocks. Using a flavor of Markdown means you might have to rework your source files if you ever want to switch to a different flavor, but the level of effort in such a switch would likely be low. GitHub Flavored Markdown is a popular and fine choice for simple web-based help systems. Here's what it looks like:

# My Title

This introductory paragraph contains a link to
[Google](https://www.google.com).

```ruby
puts "Here is some code."
```

## My Section

- List item
- Another list item

Markdown's popularity means that many specialized text editors exist for it. The best are:

· MarkdownPad (Windows)

· iA Writer (macOS)

· ReText (Linux)

reStructuredText

reStructuredText (RST) comes from the Python community. Unlike Markdown, it has an actual, standardized implementation. It's also feature-rich, supporting tables [3], footnotes, and a wide variety of directives for code and other content blocks. With this wealth of features comes a syntax that has more than a few rough edges. Whereas you can learn Markdown in minutes, learning RST takes an hour or two. This might not seem like a big deal, but when you're trying to empower contributors, the learning curve can be an unfortunate barrier. This book is written in RST.

The main appeal of reStructuredText is that it is the source language for one of the best documentation generators in the world: Sphinx. We'll discuss Sphinx later. Here's what RST looks like:

My Title
========

This introductory paragraph contains a link to `Google
<https://www.google.com>`_.

.. code:: ruby

   puts "Here is some code."

My Section
----------

- List item
- Another list item

AsciiDoc

AsciiDoc is popular in the Linux community and is the language with which I have the least familiarity. Quite a few O'Reilly technology books are written in AsciiDoc.

Summary

AsciiDoc is semantically equivalent to DocBook and is thus a straightforward, obvious improvement to existing DocBook toolchains. This equivalence likely makes it the best choice for creating books, especially ones that require complex formatting. RST is a decent all-purpose language, but its lack of popularity means that it won't evolve at the same pace as Markdown. I am quite confident that the next great documentation site generator will use a flavor of Markdown as its source language, so choosing Markdown gives you a certain measure of future-proofing. At time of writing, however, Markdown is missing many useful features compared to AsciiDoc and reStructuredText.

Online editors with live previews exist for Markdown, reStructuredText, and AsciiDoc. These editors are helpful for testing out features, learning language syntax, and deciding which you prefer:

· dillinger.io (Markdown)

· rst.ninjs.org (reStructuredText)

· asciidoclive.com (AsciiDoc)

LaTeX

LaTeX is a powerful markup language with complex syntax that I purposely didn't mention until now. If you need it, you already know that you need it. If you don't need it (and few outside of academia do), its only purpose is as a stop in your publishing pipeline on the road to a PDF. More on the pipeline later.