My preferred way to construct an informal report describing a data analysis project is as a web page. A great advantage is that I don’t need to worry about page breaks and the placement of figures.

Web pages are written in html. But html is cumbersome to write directly, and so for analysis reports, I’ll generally use either Markdown or AsciiDoc. These are two systems for writing simple, readable text, with the sort of marks that you’d use in an email message (for example, **bold** for bold or _italics_ for italics), that can be easily converted to html.

Here, I’ll discuss Markdown. This is a prerequisite for what comes next: R Markdown with knitr.

HTML

It’s helpful to know a bit of html, which is the markup language that web pages are written in. html really isn’t that hard; it’s just cumbersome.

An html document contains pairs of tags to indicate content, like <h1> and </h1> to indicate that the enclosed text is a “level one header”, or <em> and </em> to indicate emphasis (generally italics). A web browser will parse the html tags and render the web page, often using a Cascading Style Sheet (CSS) to define the precise style of the different elements.

But we won’t get into all of that; html is great, but the code is cumbersome to create directly, as it looks something like this:

<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
</head>

<body>
<h1>Markdown example</h1>

<p>This is a simple example of a Markdown document.</p>

<p>Use a blank line between paragraphs.
You can use a bit of <strong>bold</strong> or <em>italics</em>. Use backticks to indicate
<code>code</code> that will be rendered in monospace.</p>

<p>Here's a list:</p>

<ul>
<li>an item in the list</li>
<li>another item</li>
<li>yet another item</li>
</ul>

<p>You can include blocks of code using three backticks:</p>

<pre><code>x &lt;- rnorm(100)
y &lt;- 2*x + rnorm(100)
</code></pre>

<p>Or you could indent four spaces:</p>

<pre><code>mean(x)
sd(x)
</code></pre>

<p>It'll figure out numbered lists, too:</p>

<ol>
<li>First item</li>
<li>Second item</li>
</ol>

<p>And it's easy to create links, like to
the <a href="http://daringfireball.net/projects/markdown/">Markdown</a>
page.</p>
</body>
</html>

Pretty ugly. That’s probably more than you really needed to see. But knowing about html gives you a greater appreciation of Markdown.

Note that there are six levels of headers, with tags <h1>, <h2>, <h3>, …, <h6>. Think of these as the title, section, subsection, sub-subsection, …

A key design principle for creating good html documents (as well as Markdown, AsciiDoc, and LaTeX documents) is to focus on the semantics (i.e., the meaning of elements) rather than the style in which the material is to be presented. So focus on things like “section” or “heading” rather than “large and bold”. This gives the web browser more information about the material, and it lets you revise the style externally, with CSS, without having to go in and revise the html code.

Markdown

As I mentioned above, Markdown is a system for writing simple, readable text that is easily converted into html. The reason it’s useful to know a bit of html is that then you have a better idea how the final product will look. (Plus, if you want to get fancy, you can just insert a bit of html within the Markdown document.)

A Markdown document looks like this:

# Markdown example

This is a simple example of a Markdown document.

Use a blank line between paragraphs.
You can use a bit of **bold** or _italics_. Use backticks to indicate
`code` that will be rendered in monospace.

Here's a list:

- an item in the list
- another item
- yet another item

You can include blocks of code using three backticks:

```
x <- rnorm(100)
y <- 2*x + rnorm(100)
```

Or you could indent four spaces:

    mean(x)
    sd(x)

It'll figure out numbered lists, too:

1. First item
2. Second item

And it's easy to create links, like to
the [Markdown](http://daringfireball.net/projects/markdown/)
page.

That bit of Markdown text gets converted to the html code in the previous section. (Here is the source file and the derived html file.)

I hope the markup is reasonably self-explanatory. Markdown is just a system of marks that will get searched-and-replaced to create an html document. A big advantage of the Markdown marks is that the source document is much like what you might write in an email, and so it’s much more human-readable.
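To make the search-and-replace idea concrete, here is a toy sketch using sed. It handles only the three inline marks used in the example above (bold, italics, and inline code) on a single line; real converters deal with far more (nesting, escaping, lists, code blocks). The function name md_inline_to_html is made up for illustration and is not part of any Markdown tool.

```shell
# Toy inline converter (illustration only, not a real Markdown implementation).
# Reads text on stdin and rewrites **bold**, _italics_, and `code` as html tags.
md_inline_to_html() {
  sed -E \
    -e 's/\*\*([^*]+)\*\*/<strong>\1<\/strong>/g' \
    -e 's/_([^_]+)_/<em>\1<\/em>/g' \
    -e 's/`([^`]+)`/<code>\1<\/code>/g'
}

printf '%s\n' 'Use **bold**, _italics_, or `code`.' | md_inline_to_html
# prints: Use <strong>bold</strong>, <em>italics</em>, or <code>code</code>.
```

A sketch this naive would also mangle things like a file name with two underscores in it, which is one reason to leave the job to a proper converter.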

Here’s a more extensive Markdown example. Also look at the Markdown basics page, and the more complete Markdown syntax, or just the Markdown cheatsheet.

Converting Markdown to html

You can skip this section and move on to knitr with R Markdown, but for completeness let me explain how to convert a Markdown document to html.

Via RStudio

If you use RStudio, the simplest way to convert a Markdown document to html is to open the document within RStudio. You’ll see a “Preview HTML” button just above the document. Click that, and another window will open, with a preview of the result. (The resulting .html file will be placed in the same directory as your .md file.) You can click “Open in browser” to open the document in your web browser, or “Publish” to publish the document to the web (where it will be viewable by anyone).

Another nice feature in RStudio: when you open a Markdown document, you’ll see a little button with a question mark. Click that, and then “Markdown Quick Reference,” and you’ll get a cheat-sheet on the Markdown syntax. Like @StrictlyStat, I seem to visit the Markdown site almost every time I’m writing a Markdown document. If I used RStudio, I’d have easier access to this information.

Via the command line

Markdown is a formatting syntax, but it’s also a software tool; in particular, it’s a Perl script. So one approach to converting a Markdown document to html is to download and use that Perl script.

But I prefer to use the markdown package for R.

Within R, you can install the package with install.packages("markdown"). Then load it with library(markdown). And then convert a Markdown document to html with

markdownToHTML('markdown_example.md', 'markdown_example.html')

In practice, I do this on the command line, like so:

R -e "markdown::markdownToHTML('markdown_example.md', 'markdown_example.html')"

(Note that in Windows, it’s important to use double-quotes on the outside and single-quotes inside, rather than the other way around.)

Rather than actually type that line, I include it within a GNU make file, like this one. (Also see my minimal make tutorial.)
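The main thing make buys you here is that the html gets rebuilt only when the Markdown source has changed. As a rough illustration of that idea (this is not the make file mentioned above, and needs_rebuild and build_html are hypothetical helper names), the same timestamp check could be sketched in the shell, assuming R and the markdown package are installed:

```shell
# needs_rebuild SOURCE TARGET: succeeds if TARGET is missing or older than SOURCE.
# (Hypothetical helper that mimics the timestamp comparison make performs.)
needs_rebuild() {
  [ ! -e "$2" ] || [ "$1" -nt "$2" ]
}

# build_html FILE.md: run the conversion only when the html is out of date.
build_html() {
  src=$1
  out=${src%.md}.html
  if needs_rebuild "$src" "$out"; then
    R -e "markdown::markdownToHTML('$src', '$out')"
  else
    echo "$out is up to date"
  fi
}
```

Calling build_html markdown_example.md twice in a row should run the conversion only the first time. For anything beyond a single file, make handles dependencies far better than a hand-rolled script like this.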

RStudio uses the rmarkdown package to convert from Markdown to html. This uses pandoc for the actual conversion. The RStudio Desktop software includes pandoc, so if you install RStudio, you won’t need to install pandoc separately; you just need to include it within your PATH. On a Mac, you’d use:

export PATH=$PATH:/Applications/RStudio.app/Contents/MacOS/pandoc

In Windows, you’d include "c:\Program Files\RStudio\bin\pandoc" in your Path system environment variable. (For example, see this page, though it’s a bit ad-heavy.)
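On a Mac or Linux machine, one quick way to confirm that the PATH change took effect is to ask the shell whether it can find pandoc. This is a small sketch; have_cmd is a hypothetical helper name, and the messages are just examples.

```shell
# have_cmd NAME: succeeds if a program called NAME can be found on the PATH.
# (Hypothetical helper, built on the standard `command -v`.)
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd pandoc; then
  echo "pandoc found at: $(command -v pandoc)"
else
  echo "pandoc is not on your PATH"
fi
```

If pandoc isn't found, check the directory you added to PATH and make sure the export line is in effect in the shell you are using.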

To convert your Markdown document to HTML, you’d then use

R -e "rmarkdown::render('markdown_example.md')"

(I still sort of prefer the markdown package to the use of the rmarkdown package and pandoc; the output file is a lot larger with the latter. But it’s best to follow the RStudio folks on this.)

Up next

Now go to the heart of this tutorial, knitr with R Markdown.