Resources and further reading
Reproducible research
- Best Practices for Scientific Computing (paper)
- Ten simple rules for reproducible computational research (paper)
- Top ten reasons not to share your code
- Computational and Policy Tools for Reproducible Research (slides for a talk by Roger Peng)
- Roger Peng’s 2009 ENAR course: Methods for Reproducible Research
- Reproducible Research in Signal Processing - What, why, and how
- Replication, reproduction, and remixing in research software
- Implementing reproducible research (book; chapter PDFs online)
- Starting data analysis and wrangling with R, Stian Håklev
Make
- GNU make
- Manual for make
- Minimal Make tutorial
- Managing projects with GNU make book (part of the Open Books project)
- Software Carpentry’s make tutorial
- Mike Bostock’s “Why Use Make”
- GNU Make for reproducible data analysis by Zachary Jones
- Makefiles for R/LaTeX projects by Rob Hyndman
Unix command line
- Survival guide for Unix newbies
- Settling into Unix
- Shell programming with bash
- Essential Unix commands
- Basic Unix commands
- Command line essentials (slides)
- How to look like a unix guru
- Linux essentials
- Important unix commands
- UW-Madison Software Carpentry tutorial on the shell
- A command-line murder mystery
- Command-line bootcamp
- Book: Learning the bash shell
- Karl’s .bash_profile and .bashrc files
- How to install the Xcode command line tools (Mac)
- Git for Windows for Git BASH and other command-line tools in Windows
- Mintty and ConEmu are terminal emulators for Windows.
R
RStudio, emacs, vim
- RStudio
- RStudio documentation
- Navigating code in RStudio
- Emacs Wiki
- Book: Learning GNU Emacs, 3rd ed
- Karl’s .emacs file
- Viper mode to mimic vi key bindings in emacs [Manual | Installation]
- Evil: Vim features within emacs
- Vim
- Vim Tips wiki
- Vim-R-plugin [github], like ESS for Vim
- Book: Learning the vi and Vim editors
Knitr, Markdown, Asciidoc
- knitr
- knitr in a knutshell
- Dynamic Documents with R and knitr (book)
- R Markdown
- Markdown
- MathJax
- knitrBootstrap for nice-looking reports (see L. Collado-Torres’s post)
- pander, an R package that is especially good for making tables with knitr, R Markdown, and pandoc.
- xtable, an R package for creating LaTeX and HTML tables; it’s better for LaTeX than for HTML.
- asciidoc
- knitr resources
- polymode, an emacs mode that handles R Markdown nicely
Git, GitHub and Bitbucket
- Karl’s git/github guide
- GitHub
- Education discount on GitHub personal account (allows private repositories)
- Happy Git and GitHub for the useR (from Jenny Bryan’s Data Science course)
- Pro Git book
- The git documentation
- The GitHub help pages
- Software Carpentry notes on git
- Git can facilitate greater reproducibility and increased transparency in science (paper)
- Karthik Ram’s slides
- git - the simple guide
- magit, an emacs mode for git
- Tutorial on magit; also this one
Organizing projects and data
- Hadley Wickham’s paper on Tidy Data
- KnitR’s stitch() and spin()
- Nine simple ways to make it easier to (re)use your data (paper)
- Some simple guidelines for effective data management (paper)
- Quick guide to organizing computational biology projects (paper)
Writing clear code
- Best Practices for Scientific Computing (paper)
- Tidyverse style guide
- Oliveira & Stewart, Writing scientific software (book)
- Kernighan & Plauger, The elements of programming style, 2nd ed (book)
- Kernighan & Pike, The practice of programming (book)
R packages
- Hadley Wickham’s R Packages book
- Documenting R packages and functions, from that book
- Karl’s R package primer
- devtools package
- Jeff Leek on developing R packages
- Hilary Parker on making a simple R package
- Stat 545 guide to writing an R package
- Writing R extensions (official manual): [pdf | html]
- Developing R packages with RStudio
- How to build package vignettes with knitr
- Rd2roxygen package: Translate Rd files to roxygen comments
- Building R packages on Windows
- Karl’s largely out-of-date page about building R packages on Windows
- A web service for building and checking R packages for Windows: upload page
- R for Mac developer’s page
- R for Mac tools
Testing and debugging
- assertthat and testthat packages
- Hadley Wickham’s paper on testthat
- Testing in the R Packages book
- Debugging in Hadley Wickham’s Advanced R book
- Dr. Climate blog post on testing your code
- Yihui Xie’s testit package
- Jeff Leek’s notes on developing R packages talk a bit about unit tests
- Software Carpentry on testing
- Marick, The craft of software testing (book)
- Agans, Debugging (book)
- Debugging with RStudio
Big jobs
- KnitR cache [Also see Knitr options]
- KnitR chunk references
- Parallel R (book)
- A no BS guide to the basics of parallelization in R at librestats
- Parallel Options for R slides by Glenn Lockwood
- HTCondor
- UW-Madison Center for High Throughput Computing
- Kill Linux processes easier with pkill
- Reproducibility of parallel tasks in R
LaTeX
- LyX, a WYSIWYM application for LaTeX, with knitr
- Overleaf, Authorea: online collaborative LaTeX editors
- Setting up LaTeX by Rob Hyndman
- A not so short introduction to LaTeX
- Getting started with LaTeX (pdf)
- Slides with introduction to LaTeX
- A hitchhiker’s guide to LaTeX (pdf)
- Detexify2 - LaTeX symbol classifier
- xtable, an R package for creating LaTeX and HTML tables
- xtableGallery (pdf vignette)
- UnicodeIt (for Mac) - convert LaTeX expressions into Unicode characters
- LaTeXiT (for Mac) - LaTeX-based equation editor
- LaTeX table generator
- AUCTeX, for working with LaTeX within emacs.
Slides and posters
- LaTeX Beamer
- Beamer Quickstart
- Beamer appearance cheat sheet
- LaTeX templates for conference posters
- Modern, simple presentations written in R Markdown by Benomics
- How to give a scientific presentation (pdf) [source on github]
- PDFjam
- LaTeX poster templates
- Nathaniel Johnston’s poster template
Python
- Python for biologists
- Software Carpentry
- Python Scientific lecture notes
- Jupyter notebooks
- Practical computing for biologists (book)
- Bioinformatics programming using Python (book)
- Python scripting for computational science (book)
- Python unittest introduction
- Python nose unit testing quick start
- Differences between Python 2 and 3
- What’s new in Python 3.0 (vs 2.6)
- Guide to getting started with Data Science and Python by Thomas Wiecki
- Learn Python for epidemiology (particularly regarding pandas)
Copyright and software/data licenses
- Victoria Stodden:
- Coding horror blog
- Understanding open source and free software licensing (book)
- The whys and hows of licensing scientific code by Jake VanderPlas
- A quick guide to software licensing for the scientist-programmer (Morin et al. PLoS Comput Biol 8:e1002598, 2012)
- Creative Commons licenses
- tl;dr legal: Software licenses in plain English
- MIT license at Wikipedia
- GNU General Public License
- GPL frequently asked questions
- Copyright basics (pdf from US Copyright Office)
- Works for hire (pdf from US Copyright Office)
- Fair use (US Copyright Office)
- Copyright basics
- Copyright and fair use
- Copyright of facts and data [concise]
- Database legal protections [detailed]
- VertNet guide to copyright and licenses for dataset publication
- UW-Madison: