I was at the useR! Conference at The University of Warwick in Coventry, UK, last week. My goal in going was to learn the latest things regarding (simple) dynamic graphics, (simple) web-based apps, parallel computing, and memory management (dealing with big data sets). I got just what I was hoping for and more. There are a lot of useful tools available that I want to adopt. I’ll summarize the high points below, with the particular areas of interest to me covered more exhaustively than just “highlights”.
I left feeling that my programming skills are crap. My biggest failing is in not making sufficient use of others’ packages, but rather just building what I need from scratch (with great effort) and skipping dynamic graphics completely.
There were 440 participants from 41 countries (342 Europe; 60 North America).
There are now >3000 packages on CRAN, with 110 submissions per week (of which 80 are successful), basically all handled by Kurt Hornik.
CRAN will throw out binaries of packages that are more than two years old.
What’s within the base of R will shrink rather than grow.
There have been a lot of improvements in the rendering of graphics.
R is heavily dependent on a small number of altruistic developers, many of whom feel their contributions are not treated with respect.
library() is to be replaced by
There will soon be a parallel package for parallel computing.
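As a rough sketch of what using such a package looks like, the snippet below relies on the parallel package as it later shipped with R (from version 2.14.0). The talk only announced the package, so the worker count and the toy task here are my own assumptions for illustration.

```r
library(parallel)

# Start a small local cluster; 2 workers is an arbitrary choice for this sketch.
cl <- makeCluster(2)

# Apply a toy function to each element, spreading the work across the workers.
squares <- parLapply(cl, 1:8, function(x) x^2)

# Always shut the workers down when finished.
stopCluster(cl)

squares <- unlist(squares)
print(squares)
```

For a task this small the communication overhead will dominate; the pattern only pays off when each call does real work.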
Barry Rowlingson gave a great lightning talk, “Why R-help must die!” He suggested the Q-and-A type sites Stack Overflow (on programming) and Cross Validated (on statistics), both part of Stack Exchange; they hope to create an R-specific site soon (currently R things are split between the two).
Tal Galili (founder of R-bloggers) gave a lightning talk about blogging and R. He emphasized that one need not write frequent posts. At the 2010 useR! Conference he gave a more comprehensive introduction to blogging available here.
Patrick Burns talked about random input software testing and had a great analogy: if writing test suites is like digging ditches, random input testing is like digging in the sand (i.e., fun). (I do random input testing, but with my users providing the inputs.) [slides (with transcript)]
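To make the idea concrete, here is a minimal sketch of random input testing in base R. It is not taken from the talk: the function under test, clamp(), is a hypothetical example, and the properties checked are just ones I picked for it.

```r
set.seed(1)  # fixed seed so any failure can be reproduced

# Hypothetical function under test: restrict values to the interval [lo, hi].
clamp <- function(x, lo, hi) pmin(pmax(x, lo), hi)

n_failures <- 0
for (trial in seq_len(200)) {
  # Generate a random input: a vector of random length and a random interval.
  x <- rnorm(sample(1:20, 1), sd = 100)
  bounds <- sort(runif(2, -50, 50))

  out <- clamp(x, bounds[1], bounds[2])

  # Properties that should hold for any input, whatever the values are.
  ok <- length(out) == length(x) &&
    all(out >= bounds[1] & out <= bounds[2])
  if (!ok) n_failures <- n_failures + 1
}

print(n_failures)
```

Rather than hand-picking cases, the loop throws many generated inputs at the function and checks properties that must always be true.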
Olaf Mersmann gave a cool talk about microbenchmark to get accurate nanosecond timings of R expressions. Why do that? Because if you repeat something 1000 times in a for loop and then time it with system.time(), you are including the overhead of the for loop.
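The two approaches can be sketched side by side. The expression being timed, sqrt(2), is just a placeholder of mine, and the second half assumes the microbenchmark package is installed from CRAN (it is skipped otherwise).

```r
n <- 1000L

# Approach 1: repeat the expression in a for loop and time the whole loop.
# The elapsed time covers the loop machinery as well as sqrt(2) itself.
loop_time <- system.time(for (i in seq_len(n)) sqrt(2))
per_call_sec <- loop_time[["elapsed"]] / n
print(per_call_sec)

# Approach 2: let microbenchmark time each evaluation individually.
# Guarded so the sketch still runs when the package is not available.
if (requireNamespace("microbenchmark", quietly = TRUE)) {
  mb <- microbenchmark::microbenchmark(sqrt(2), times = n)
  print(mb)
}
```

For something as cheap as sqrt(2), the per-call figure from the loop is mostly a measurement of the loop and of the coarse resolution of system.time().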
The next useR! Conference will be at Vanderbilt in Nashville, TN, June 12-15, 2012.
Toby Dylan Hocking had a poster about the directlabels package for putting labels near curves or clusters of points in a plot rather than have a separate legend. I really like the technique and am eager to try out the package. For figures for a publication, one will probably want to edit things by hand, but for day-to-day, the package looks extremely useful.
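The technique itself can be sketched by hand in base graphics. This is not the directlabels interface, which I haven't tried yet; the curves and the label placement rule (at the right-hand end of each curve) are assumptions chosen to keep the example short.

```r
pdf(NULL)  # draw to a null device so no file is written

x <- seq(0, 10, length.out = 50)
curves <- list(linear = x, sqrt = 2 * sqrt(x), log = log1p(x))

# Leave extra room on the right (xlim up to 12) for the labels.
plot(range(x), range(unlist(curves)), type = "n",
     xlim = c(0, 12), xlab = "x", ylab = "y")

for (nm in names(curves)) {
  y <- curves[[nm]]
  lines(x, y)
  # Put the name next to the last point of the curve instead of in a legend.
  text(max(x), y[length(y)], labels = nm, pos = 4)
}

# Vertical positions at which the labels were drawn.
label_y <- vapply(curves, function(y) y[length(y)], numeric(1))

invisible(dev.off())
```

Doing this by hand breaks down as soon as curve ends are close together, which is presumably where an automatic placement package earns its keep.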
Sina Rüeger presented uniPlot to reduce time to polish reports, by making base graphics, ggplot2, and lattice all use the same style (otherwise the reader may be distracted by the differences). [CRAN]
Alexander Kowarik presented sparkTable for creating html tables with very small graphs included.
Simon Urbanek mentioned several new features in R graphics:
R will soon include dev.flush() (written by Prof. Brian Ripley) so that you can tell a graphics device when you actually want to see a plot. This should improve graphics rendering.
rasterImage() is way faster than
polypath() to plot polygons with holes.
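Here is a small sketch of these functions as they exist in current versions of R. The shapes and the gradient image are made up for illustration, and pairing dev.flush() with dev.hold() is my assumption about typical use; on devices without buffering (such as the null device used here) the two calls do nothing.

```r
pdf(NULL)  # null device: nothing is written to disk

plot(c(0, 10), c(0, 10), type = "n", xlab = "", ylab = "")

# Ask the device to hold output until we say we are ready.
dev.hold()

# A square with a square hole: sub-paths are separated by NA,
# and the "evenodd" rule leaves the inner square unfilled.
outer_x <- c(1, 9, 9, 1); outer_y <- c(1, 1, 9, 9)
inner_x <- c(4, 6, 6, 4); inner_y <- c(4, 4, 6, 6)
polypath(c(outer_x, NA, inner_x), c(outer_y, NA, inner_y),
         col = "grey", rule = "evenodd")

# A 10 x 10 greyscale gradient drawn as a raster image inside the hole.
m <- matrix(seq(0, 1, length.out = 100), nrow = 10, ncol = 10)
img <- as.raster(m)
rasterImage(img, 4.2, 4.2, 5.8, 5.8)

# Tell the device that the plot is complete and can be shown.
dev.flush()

invisible(dev.off())
```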
Simon Urbanek talked about iPlots eXtreme (currently: codename Acinonyx) which has fabulous and easy-to-create dynamic graphics. You basically just prefix the usual plot functions with an “i” (iplot, ihist). Super fast and can handle big data sets. It uses OpenGL (a solution developed by the gaming industry).
TIBCO Spotfire has some ways to develop interactive graphics tools, but it’s commercial and Windows only.
Adrian Waddell talked about RnavGraph for interactive graphics. He had some neat ideas about navigating among multiple scatterplots: a graph where nodes are images and where moving along an edge between two nodes involves morphing from one image to another. Moving from one scatterplot to another is like rotating a 3d scatterplot. [CRAN]
Simple web applications
Timothee Carayol gave a lightning talk about how to use RGoogleDocs and rApache for quick and easy deployment of a web interface. You set up a spreadsheet, which acts as a configuration file for rApache, so RGoogleDocs handles the inputs in place of what could be complex web programming. It sounds neat but I don’t fully understand it. But Timothee wrote to say that he would write a tutorial in the coming weeks. [slides]
Eleni-Anthippi Chatzimichali talked about iWebPlots for making dynamic, web-based scatterplots.
Comprehensive web applications and GUIs
E. James Harner spoke about Rc2 for collaborative use of R (including shared R sessions with voice chat), aimed to support distance learning. It seems really complicated and not easily adapted for others’ use.
Naim Matasci had a poster about iPlant which has a fancy web front end with interactive analysis (using R). They have something like 10 developers working on it, but he said that the source code will be available and that it could be adapted for other purposes.
Xavier de Pedro Puente discussed the use of Tiki with R to make comprehensive web sites with wiki-like pages including R (converted to output or graphs on the fly) or web-based forms. It uses his PluginR package. It seems a great idea, but is likely too complicated for me. The key coding is with Smarty. [slides]
Sheri Gilley from Revolution Analytics presented a GUI that they’re working on. It looks like it will be superb for the novice who wants a GUI. They’ll have a beta by the end of 2011, with the real release in 2012. Sheri spent 25 years doing UI design at SPSS (so I guess she was 10 or 15 when she started).
I was surprised by the large number of companies forming around R.
Revolution Analytics: have code for handling large datasets and parallel computing and are developing a GUI.
RStudio: aimed to be an IDE (supporting programmers) rather than a GUI. Upcoming features (including quick traversal of code across multiple files) look cool, but I’ll probably stay with emacs.
TIBCO: purchased Insightful (who had bought Splus) in 2008.
CloudNumbers: cloud-based computing, including the use of R.
Talks I wish I’d seen
Andrej Blejec talked about his animatoR package for creating animations in R.
Jonathan Rougier talked about nomograms (and donkeys).
Things (particularly packages) I need to try out
directlabels package for automatically putting labels directly next to curves or clouds of points.
hexbin package for dense scatterplots.
animation for making animations in R
grid and ggplot2 (I’m still just using base graphics)
gridSVG for making complex web-based dynamic graphics
sparkTable for making html tables with small figures inserted
compareGroups for making complex tables with confidence intervals and p-values and such, like epidemiologists (and my collaborators) often want
arrayQualityMetrics which creates fancy web-based dynamic reports
osmap (contained in snippets) for making maps.
animatoR for making animations