Putting your R package on GitHub
This page will be more motivational than instructional, but there’s a bit of instruction at the end.
Use version control
If you are developing an R package (or, really, doing almost anything on a computer), you need a system for keeping track of the changes to the source code. There are three basic ways to keep track of those changes.
- Don’t keep track at all.
- Periodically save numbered zip files.
- Use a formal version control system, like git.
Over my career, I did a bit of the first, then a whole bunch of the second; I dipped my toe in the third with a single project, and now I use git to keep track of everything I do: software, data analysis projects, manuscripts, slides for talks, and websites (like this one).
The advantages of a formal version control system like git include
- You have a full record of exactly how your code got to be in its current state.
- If something stops working, you can easily go back to previous versions to see when it stopped working, and so more easily identify why it stopped working.
- You can try out new features without worrying about breaking things that work.
- git, in particular, does a fantastic job of merging simultaneous changes from multiple collaborators. Even if your collaborator is not using version control, the fact that you are using it will make it easier to incorporate and understand their changes.
If you’re totally new to version control, consider my git/github guide.
What is GitHub?
GitHub is a website that serves as a home for git repositories. It’s sort of like Facebook for programmers (and data scientists): everyone’s on there; you can look at what they’re working on, easily peruse their code, and make suggestions or changes.
GitHub lowers the barriers to collaboration. It’s easy to offer suggested changes to others’ code through GitHub, and it’s easy for them to incorporate your suggested changes.
Why put your R package on GitHub?
There are a number of advantages to putting your R package on GitHub.
- It’ll be easier for others to peruse your code. They can do so in the web browser without having to download, extract, and start fishing.
- GitHub includes issue tracking: people (including yourself) can note problems they’re having or suggestions for improvements they’d like you to make.
- In addition to just pointing out problems, people can actually fix the problem and send you a patch, which you can easily test and then incorporate into your package. Some of this can be done entirely online, with no knowledge of git. Rather than having someone say, “There’s a typo in your documentation,” they can say “Here, I’ve fixed a typo in your documentation.”
- With the install_github() function in Hadley Wickham’s devtools package, it’s easy for people to install your package directly from GitHub. It doesn’t have to be on CRAN. (As you’ll see, getting your package on CRAN can be a bit difficult.)
It’s important to mention that there are alternatives to GitHub. The main one is BitBucket. GitHub has the advantage of being more popular, and I prefer its interface. But BitBucket allows unlimited private repositories. (With a free account at GitHub, all of your repositories must be completely open, though faculty and students can get a free upgrade, for educational use, to an account that allows up to 5 private repositories.) With BitBucket, you can use either git or the mercurial version control system; mercurial is a bit simpler than git.
How to install a package from GitHub
How do you install a package that’s sitting on GitHub?
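Load the devtools package with library(devtools), then call install_github() with the repository given in the form username/reponame.
There’s some extra fanciness that you need to do if the version you want sits on some branch of the repository, or if the package is in a subdirectory of the main repository (see the ref and subdir arguments of install_github()).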
Put your R package on GitHub
- Change to the package directory
- Initialize the repository with git init
- Add and commit everything with git add . and git commit
- Create a new repository on GitHub
- Connect your local repository to the GitHub one
  git remote add origin https://github.com/username/reponame
- Push everything to GitHub
  git push -u origin master
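The steps above can be tried end to end without touching GitHub. The following is a minimal sketch, assuming git is installed. The package directory mypkg, the placeholder identity, and the local bare repository standing in for https://github.com/username/reponame are all hypothetical, made up for illustration.

```shell
#!/bin/sh
set -e

# Hypothetical scratch area; a real package already has its own directory.
work=$(mktemp -d)
mkdir -p "$work/mypkg"
printf 'Package: mypkg\nVersion: 0.1\n' > "$work/mypkg/DESCRIPTION"

# Local stand-in for the repository you would create on GitHub.
git init --quiet --bare "$work/reponame.git"

# 1. Change to the package directory.
cd "$work/mypkg"

# 2. Initialize the repository.
git init --quiet

# 3. Add and commit everything (the identity here is a placeholder).
git add .
git -c user.name="Your Name" -c user.email="you@example.com" \
    commit --quiet -m "Initial commit"

# Newer versions of git may call the first branch "main"; rename it so
# that the push command from the text works unchanged.
git branch -M master

# 4. Connect the local repository to the remote one.
#    With GitHub this would be https://github.com/username/reponame
git remote add origin "$work/reponame.git"

# 5. Push everything; -u records origin/master as the upstream branch.
git push --quiet -u origin master
```

Because of the -u flag, later changes can be sent with a plain git push from the same branch.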
One extra thing you may want to add is a README file (or a README.md). This will show up nicely at your GitHub repository, below the list of files.
R will largely ignore your README.md file. It will also ignore the .git subdirectory that is created when you make the package a git repository.
You may also want to create a website for your package. It’s easy to do so with GitHub Pages. For an example, see the website for my R/qtlcharts package.
You create an empty gh-pages branch for your package git repository and fill it with a website. GitHub makes it easy to use Jekyll Bootstrap for the website, so you can write things in Markdown rather than HTML. See my simple site tutorial, particularly the page on making a project site.
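As a rough sketch of the "empty gh-pages branch" step: one common approach (an assumption here, not necessarily the one used in the tutorial mentioned above) is an orphan branch, which starts with no history and is then cleared of the package files. The scratch repository, file names, and identity below are hypothetical.

```shell
#!/bin/sh
set -e

# Hypothetical scratch repository standing in for your package.
repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.name "Your Name"        # placeholder identity, local to this repo
git config user.email "you@example.com"
echo "Package: mypkg" > DESCRIPTION
git add .
git commit --quiet -m "Initial commit"
git branch -M master

# Start a branch with no parent commit, then remove the package files
# that were carried over into the index and working directory.
git checkout --quiet --orphan gh-pages
git rm --quiet -rf .

# Fill the branch with the website; a single page is enough to start.
echo "<h1>mypkg</h1>" > index.html
git add index.html
git commit --quiet -m "Start website"

# Publishing would then be: git push origin gh-pages
# Switch back to the package source.
git checkout --quiet master
```

The two branches share no files or history, so the website never gets mixed into the package source.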
Note that if you call your README file something other than README.md, R CMD check --as-cran will report a “Note” (not as bad as a warning or error, but to be avoided). Personally, I prefer ReadMe.md. One solution to this:
- Add a .Rbuildignore file with a line matching that file (for example, ^ReadMe\.md$).
- Create an inst subdirectory containing a soft-link to your file.
mkdir inst
cd inst
ln -s ../ReadMe.md
R CMD build will then ignore the ReadMe.md file in the root directory of your package, but at installation everything in the inst subdirectory is copied to the top level of the installed package. So this method lets you name your readme file whatever you want, but avoids the flag from R CMD check --as-cran.
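Put together, the workaround might look like the following minimal sketch. The scratch package directory is hypothetical, and the anchored pattern written to .Rbuildignore is an assumption: the lines of that file are regular expressions, so anchoring keeps the pattern from also matching the copy under inst.

```shell
#!/bin/sh
set -e

# Hypothetical package directory with a readme named ReadMe.md.
pkg=$(mktemp -d)
cd "$pkg"
echo "# mypkg" > ReadMe.md

# Tell R CMD build to skip the top-level file. printf is used rather
# than echo so that the backslash is written out literally.
printf '%s\n' '^ReadMe\.md$' >> .Rbuildignore

# Create inst/ with a relative soft-link back to the real file.
mkdir -p inst
(cd inst && ln -s ../ReadMe.md ReadMe.md)
```

Running the link command in a subshell leaves you in the package root afterwards, so later commands don't need a cd back.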
Now go to the page about getting your R package on CRAN.