My preferred way to construct an informal report describing a data analysis project is as a web page. A great advantage is that I don’t need to worry about page breaks and the placement of figures.
Web pages are written in
html. But html is cumbersome to
write directly, and so for analysis reports, I’ll generally use either
Markdown or AsciiDoc. These are two systems
for writing simple, readable text, with the sort of marks that you’d
use in an email message (for example,
**bold** for bold or
_italics_ for italics), that can be easily converted to html.
It’s helpful to know a bit of html, which is the markup language that web pages are written in. html really isn’t that hard; it’s just cumbersome.
An html document contains pairs of tags to indicate content, like
<h1> and </h1> to indicate that the enclosed text is a “level one
header”, or <em> and </em> to indicate emphasis (generally
italics). A web browser will
parse the html tags and render
the web page, often using a
Cascading style sheet (CSS)
to define the precise style of the different elements.
But we won’t get into all of that; html is great, but the code is cumbersome to create directly, as it looks something like this:
<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
</head>
<body>
<h1>Markdown example</h1>
<p>This is a simple example of a Markdown document.</p>
<p>Use a blank line between paragraphs.
You can use a bit of <strong>bold</strong> or <em>italics</em>.
Use backticks to indicate <code>code</code> that will be rendered in monospace.</p>
<p>Here's a list:</p>
<ul>
<li>an item in the list</li>
<li>another item</li>
<li>yet another item</li>
</ul>
<p>You can include blocks of code using three backticks:</p>
<p><code>
x <- rnorm(100)
y <- 2*x + rnorm(100)
</code></p>
<p>Or you could indent four spaces:</p>
<pre><code>mean(x)
sd(x)
</code></pre>
<p>It'll figure out numbered lists, too:</p>
<ol>
<li>First item</li>
<li>Second item</li>
</ol>
<p>And it's easy to create links, like to the
<a href="http://daringfireball.net/projects/markdown/">Markdown</a> page.</p>
</body>
</html>
Pretty ugly. That’s probably more than you really needed to see. But knowing about html gives you a greater appreciation of Markdown.
Note that there are six levels of headers, with tags
<h1> through <h6>. Think of these as the title,
section, subsection, sub-subsection, …
A key design principle for creating good html documents (as well as Markdown, AsciiDoc, and LaTeX documents) is to focus on the semantics (i.e., the meaning of elements) rather than the style in which the material is to be presented. So focus on things like “section” or “heading” rather than “large and bold”. The reason is that you’re giving the web browser more information about the material, and you can more easily revise the style externally, with Cascading Style Sheets (CSS), without having to go in and revise the html code.
As I mentioned above, Markdown is a system for writing simple, readable text that is easily converted into html. The reason it’s useful to know a bit of html is that then you have a better idea how the final product will look. (Plus, if you want to get fancy, you can just insert a bit of html within the Markdown document.)
A Markdown document looks like this:
# Markdown example

This is a simple example of a Markdown document.

Use a blank line between paragraphs.
You can use a bit of **bold** or _italics_. Use backticks to indicate
`code` that will be rendered in monospace.

Here's a list:

- an item in the list
- another item
- yet another item

You can include blocks of code using three backticks:

```
x <- rnorm(100)
y <- 2*x + rnorm(100)
```

Or you could indent four spaces:

    mean(x)
    sd(x)

It'll figure out numbered lists, too:

1. First item
2. Second item

And it's easy to create links, like to the
[Markdown](http://daringfireball.net/projects/markdown/) page.
I hope the markup is reasonably self-explanatory. Markdown is just a system of marks that will get searched-and-replaced to create an html document. A big advantage of the Markdown marks is that the source document is much like what you might write in an email, and so it’s much more human-readable.
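To make the "searched-and-replaced" idea concrete, here is a toy sketch in base R that turns a few Markdown marks into html tags. The function names `md_inline_to_html` and `md_line_to_html` are made up for this illustration, and this is not how any real Markdown converter is implemented; real converters handle nesting, escaping, lists, and many other cases that this ignores.

```r
# Toy illustration only: a handful of search-and-replace rules.
# Function names are hypothetical, not part of any package.
md_inline_to_html <- function(text) {
  # **bold** -> <strong>bold</strong>
  text <- gsub("\\*\\*(.+?)\\*\\*", "<strong>\\1</strong>", text, perl = TRUE)
  # _italics_ -> <em>italics</em>
  text <- gsub("_(.+?)_", "<em>\\1</em>", text, perl = TRUE)
  # `code` -> <code>code</code>
  text <- gsub("`(.+?)`", "<code>\\1</code>", text, perl = TRUE)
  text
}

md_line_to_html <- function(line) {
  # One to six leading '#' characters mark a header of that level
  m <- regmatches(line, regexec("^(#{1,6})\\s+(.*)$", line, perl = TRUE))[[1]]
  if (length(m) == 0) {
    # Not a header: treat the line as a paragraph
    return(paste0("<p>", md_inline_to_html(line), "</p>"))
  }
  level <- nchar(m[2])
  paste0("<h", level, ">", md_inline_to_html(m[3]), "</h", level, ">")
}

md_line_to_html("# Markdown example")
md_line_to_html("You can use a bit of **bold** or _italics_.")
```

The point is only that each mark maps onto a pair of html tags; for real documents, use one of the converters described in the next section.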
Converting Markdown to html
You can skip this section and move on to knitr with R Markdown, but for completeness let me explain how to convert a Markdown document to html.
If you use RStudio, the simplest way to
convert a Markdown document to html is to open the document within
RStudio. You’ll see a
“Preview HTML” button just above the document. Click that, and another
window will open, with a preview of the result. (The resulting
file will be placed in the same directory as your
.md file.) You
can click “Open in browser” to open the document in your web browser,
or “Publish” to publish the document to the web (where it will be
viewable by anyone).
Another nice feature in RStudio: when you open a Markdown document, you’ll see a little button with a question mark. Click that, and then “Markdown Quick Reference,” and you’ll get a cheat-sheet on the Markdown syntax. Like @StrictlyStat, I seem to visit the Markdown site almost every time I’m writing a Markdown document. If I used RStudio, I’d have easier access to this information.
Via the command line
Within R, you can install the markdown package with
install.packages("markdown"). Then load it with
library(markdown). And then convert a Markdown document to html with
markdownToHTML('markdown_example.md', 'markdown_example.html').
In practice, I do this on the command line, like so:
R -e "markdown::markdownToHTML('markdown_example.md', 'markdown_example.html')"
(Note that in Windows, it’s important to use double-quotes on the outside and single-quotes inside, rather than the other way around.)
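If you'd rather run the conversion from within an R session or a script, the steps above might look something like the following sketch. It assumes the markdown package is already installed, and it writes a small example file to a temporary directory (the file contents are made up for illustration) so that it can be run as is.

```r
# Assumes the markdown package is installed: install.packages("markdown")
library(markdown)

# Create a small Markdown file in a temporary directory (illustrative content)
md_file <- file.path(tempdir(), "markdown_example.md")
html_file <- file.path(tempdir(), "markdown_example.html")
writeLines(c("# Markdown example",
             "",
             "A bit of **bold** and _italics_."),
           md_file)

# Convert the Markdown file to an html file
markdownToHTML(md_file, html_file)

# Peek at the first few lines of the result
head(readLines(html_file))
```

With your own document, replace `md_file` and `html_file` with the paths to your .md file and the html file you want to create.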
RStudio uses the
rmarkdown package to
convert from Markdown to html. This uses
pandoc for the actual
conversion. The RStudio Desktop software
includes pandoc, so if you install RStudio, you won’t need to install
pandoc separately; you just need to include it within your
PATH. On a Mac, you’d add the pandoc directory bundled with RStudio to your PATH.
In Windows, you’d include
"c:\Program Files\RStudio\bin\pandoc" in your
Path system environment variable. (Guides to editing that variable are
easy to find online, though they can be a bit ad-heavy.)
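To check whether pandoc is actually visible on your PATH, a quick test from within R might look like the sketch below. It uses only base R. The helper `add_to_path` is a made-up name for this illustration, and it changes the PATH for the current R session only, not for your system as a whole.

```r
# Sys.which() returns an empty string if the program isn't found on the PATH
pandoc_path <- Sys.which("pandoc")

if (nzchar(pandoc_path)) {
  message("pandoc found at: ", pandoc_path)
} else {
  message("pandoc is not on the PATH")
}

# Hypothetical helper: prepend a directory to the PATH for this R session only.
# Pass it the directory that contains pandoc on your own system.
add_to_path <- function(dir) {
  Sys.setenv(PATH = paste(dir, Sys.getenv("PATH"), sep = .Platform$path.sep))
  invisible(Sys.getenv("PATH"))
}
```

A permanent change still needs to be made in your shell configuration (Mac) or in the system environment variables (Windows), as described above.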
To convert your Markdown document to HTML, you’d then use
R -e "rmarkdown::render('markdown_example.md')"
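The same thing can be done from within an R session. The sketch below assumes the rmarkdown package is installed, and it first checks that rmarkdown can find pandoc, since the conversion will fail without it. As before, the example file and its contents are made up so that the code can be run as is.

```r
# Assumes the rmarkdown package is installed: install.packages("rmarkdown")
library(rmarkdown)

out <- NULL
if (pandoc_available()) {
  # Create a small Markdown file in a temporary directory (illustrative content)
  md_file <- file.path(tempdir(), "markdown_example.md")
  writeLines(c("# Markdown example",
               "",
               "A bit of **bold** and _italics_."),
             md_file)

  # render() returns the path of the file it created
  out <- render(md_file, output_format = "html_document", quiet = TRUE)
  message("Wrote ", out)
} else {
  message("pandoc not found; check your PATH before calling render()")
}
```

By default, the html file is placed in the same directory as the input file.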
Now go to the heart of this tutorial, knitr with R Markdown.