Create PDF documentation like a pro

In this article I will show you how you can use a mixture of Markdown, LaTeX, and JavaScript to create PDF documents as part of your Grunt project. There are several advantages to this approach:

  • Text-based documents are much more friendly to revision control systems than, say, Microsoft Word documents or even LibreOffice Writer documents.
  • Markdown is convenient. 90% of the time it is all you need.
  • LaTeX is powerful. You can use it for that other 10% of the stuff you want to do. But you don’t want it to get in the way every time you start a new section or add a hyperlink.
  • JavaScript is even more powerful. Thanks to grunt you can link variables from your project’s metadata into your text. Think names and version numbers. When your project’s metadata changes, your documents reflect this. If this is not cool, I don’t know what is.
  • All of this can be nicely integrated into your build process. I will show you how I did this using grunt, grunt-contrib-watch, grunt-panda, and a dual monitor PC for a near-WYSIWYG authoring experience. Read on!

RTFM in the third millennium?

Generally speaking, software documentation is a thing of the past. Most end-user software does not come with instructions any more. The reason? Most end-users do not read the instructions. (This is why engineers had to come up with the acronym RTFM. But users have their own acronym: TL;DR!) So the fact remains that users don’t read instructions, even if they are right there on the screen, let alone in a separate PDF file on their computer. If the UI is not intuitive enough for them to figure out how to do something, users will do one of the following:

  1. Switch to some user-friendlier piece of software.
  2. Ask their children or grandchildren, depending on availability, to figure out how to do it.
  3. If all else fails, they might go ask a question in a user community such as a forum, issue tracker, comment section, Twitter, etc. This is the ideal user.

Users will crawl on broken glass to avoid intentionally learning something new. Users will never read your documentation, not even at gunpoint.

But there still are, and always will be, cases when you do need to produce documentation for your software. For instance, your intended audience might be technically minded. Or you might be explicitly required to produce a User’s Manual. Envato, for instance, is a platform that asks its contributors to include a manual with themes and plugins. This makes sense, because users of themes and plugins are often technically minded themselves. They might even welcome knowing that, if all else fails, they can dive into a manual and find the solution to their problem. It’s all a matter of perspective.

How pros create PDFs

Traditionally, whenever they dabble in typography, computer geeks such as ourselves have used LaTeX, an evolution of Donald Knuth’s TeX. In addition to the immense power of TeX, there is also the advantage of the esoteric syntax, which makes you feel like a l33t h@xor when you use it. In fact the language is so complex that there is an entire Stack Exchange Q&A site devoted to it: http://tex.stackexchange.com/.

Purists might argue that easier is better. Bah! As a geek who enjoys assembly code and Perl, I tend to disagree, but I see their point: pressing deadlines leave little time for you to play with TeX when fancy word processors are at your fingertips. But as a programmer you really don’t want to be using binary files in your revision control system. They don’t diff well. Hence people nowadays use Markdown, which is very easy to write and can readily be transformed to HTML, PDF, etc., albeit at a loss of expressive power.

If you enjoy such handy features as Unicode support and using your own fonts, you might decide to use XeTeX and its XeLaTeX compiler instead of LaTeX.

pandoc

Enter pandoc. Pandoc is really cool, so you’ll want to install it (for PDF output you will also need a TeX distribution that provides the xelatex command):

sudo apt-get install pandoc

Here’s how you’d convert your markdown document to a PDF using XeTeX:

pandoc readme.md -o readme.pdf --latex-engine=xelatex

Easy. Pandoc silently converts the markdown to LaTeX and then spares you the usual complexity of converting LaTeX to DVI, DVI to PS, and PS to PDF. In goes markdown, out comes PDF. Magic!
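One caveat if you run a newer pandoc: version 2.0 renamed the --latex-engine option to --pdf-engine, so the command above needs adjusting there. Here is a minimal sketch of a helper that picks the flag from a version number. The function name engineFlag and the sample version strings are made up for illustration; they are not part of pandoc or grunt-panda.

```javascript
// Hypothetical helper: choose the pandoc option that selects the TeX engine.
// Assumes `version` is the version number reported by `pandoc --version`,
// for example "1.19.2" or "2.5".
function engineFlag(version, engine) {
  const major = parseInt(String(version).split('.')[0], 10);
  if (Number.isNaN(major)) {
    throw new Error('Unrecognized pandoc version: ' + version);
  }
  // pandoc 2.0 renamed --latex-engine to --pdf-engine.
  const name = major >= 2 ? '--pdf-engine' : '--latex-engine';
  return name + '=' + engine;
}

console.log(engineFlag('1.19.2', 'xelatex'));
console.log(engineFlag('2.5', 'xelatex'));
```

The returned string can be used anywhere this article writes --latex-engine=xelatex.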

grunt

Since pandoc is a thing that exists in this world, naturally there must be a grunt plugin for it, right? Correct. It is called panda. You’ll need to install it in your grunt project as a dev dependency with

npm install grunt-panda --save-dev

and then also load it in your Gruntfile.js:

grunt.loadNpmTasks( 'grunt-panda' );

You’ll want to read the panda README for details, but this is what one might do to convert a markdown document into PDF:

 panda: {
   doc: {
     options: {
       process: true,
       pandocOptions: "--latex-engine=xelatex"
     },
     files: {
       'build/readme.pdf': [ 'docs/readme.md' ],
     }
   }
 },

If you want to control your LaTeX headers or other document metadata, you can prepend your markdown code with YAML headers.

---
title: '**<%= pkg.name.replace(/_/g,' ') %>** **<%= pkg.version %>** user manual'
author: 
- name: '<%= pkg.author %>'
header-includes: |
 \setmainfont{Arial}
abstract: |
 This paragraph is an abstract that briefly talks about this software manual.
---

# Introduction

_Hello_ PDF world!

Check out how I defined the title of the document and author name using grunt-flavored JavaScript. The values come from my package.json file. All of this is possible thanks to panda’s option process: true. Since JavaScript is Turing-complete, you could potentially go full meta in there and generate chunks of your document dynamically. Please don’t do that! :-p Here I just replace underscores with spaces in the pkg.name field.
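To see what happens to the title line, here is a deliberately simplified stand-in for the tag expansion. Grunt’s real template processing is built on Lo-Dash templates and does more than this; the expand function and the sample pkg values below are hypothetical and only illustrate the idea.

```javascript
// Simplified illustration of <%= %> expansion. This is NOT Grunt's implementation.
function expand(text, data) {
  return text.replace(/<%=([\s\S]+?)%>/g, function (match, expr) {
    // Evaluate the JavaScript expression with `pkg` in scope.
    const evaluate = new Function('pkg', 'return (' + expr + ');');
    return String(evaluate(data.pkg));
  });
}

// Hypothetical package.json contents.
const pkg = { name: 'my_cool_plugin', version: '1.2.3', author: 'Jane Doe' };

const source =
  "title: '**<%= pkg.name.replace(/_/g,' ') %>** **<%= pkg.version %>** user manual'";

const expanded = expand(source, { pkg: pkg });
console.log(expanded);
```

The underscores in the package name become spaces and the version number is filled in before pandoc ever sees the text.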

You can go on and use LaTeX syntax, Markdown syntax and <%= %> tags throughout the document. The only downside is that if you do something really really funky, panda might fail silently. In that case, simply attempt to convert the file manually using pandoc, as we saw before. You will then get error messages from the xelatex engine that will point you in the right direction.
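If you would rather script that manual check, a small Node helper can run the converter and hand back whatever it printed to stderr. This is only a sketch: runAndReport is a hypothetical name, and the commented-out call assumes pandoc and xelatex are installed and that your files live at the paths used in the Gruntfile snippet above.

```javascript
const { spawnSync } = require('child_process');

// Run a command and return whether it succeeded, plus anything it wrote to stderr.
function runAndReport(command, args) {
  const result = spawnSync(command, args, { encoding: 'utf8' });
  if (result.error) {
    // The program could not be started at all (for example, it is not installed).
    return { ok: false, stderr: String(result.error.message) };
  }
  return { ok: result.status === 0, stderr: result.stderr };
}

// Example call (not executed here):
// const report = runAndReport('pandoc',
//   ['docs/readme.md', '-o', 'build/readme.pdf', '--latex-engine=xelatex']);
// if (!report.ok) console.error(report.stderr);
```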

WYSIWYG

Once you’re satisfied that calling grunt panda does indeed generate your documents, why not sprinkle some grunt-contrib-watch on top of it all?

 watch: {
   docs: {
     files: [ 'docs/**/*.md' ],
     tasks: [ 'panda' ],
     options: {
       spawn: false,
     },
   },
 },

Aaah nice. Now whenever you save changes to your markdown, the PDF is regenerated. If you’re on Linux, I recommend that you open the PDF with Evince, which is usually the default PDF viewer in Gnome desktops. It has the nice feature that it auto-reloads a PDF that has changed on disk. Slide the window to your second monitor, and leave grunt watch running. Now you can edit your Markdown/LaTeX/JavaScript source and see the result almost in real time.
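For reference, this is roughly how the fragments from this article could fit together in one Gruntfile.js. Treat it as a sketch under a few assumptions: that grunt, grunt-panda and grunt-contrib-watch are installed as dev dependencies, that the pkg key is how the templates find your package.json (the usual Grunt convention), and that you want a default task that builds once and then watches. The function name configure is just a label for this example.

```javascript
// Gruntfile.js: a sketch that combines the panda and watch fragments.
function configure(grunt) {
  grunt.initConfig({
    // Expose package.json so that <%= pkg.name %> and friends can resolve.
    pkg: grunt.file.readJSON('package.json'),

    panda: {
      doc: {
        options: {
          process: true, // expand <%= %> tags before calling pandoc
          pandocOptions: '--latex-engine=xelatex'
        },
        files: {
          'build/readme.pdf': ['docs/readme.md']
        }
      }
    },

    watch: {
      docs: {
        files: ['docs/**/*.md'],
        tasks: ['panda'],
        options: { spawn: false }
      }
    }
  });

  grunt.loadNpmTasks('grunt-panda');
  grunt.loadNpmTasks('grunt-contrib-watch');

  // Assumption: plain `grunt` builds the PDF once, then keeps watching.
  grunt.registerTask('default', ['panda', 'watch']);
}

module.exports = configure;
```

With this in place, running grunt in the project directory gives you the build-then-watch loop described above.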
