updating d3panels and R/qtlcharts for D3 version 4

I just spent some time updating my d3panels library and R/qtlcharts package for D3 version 4. It took just about a day, and the majority of the time was spent puzzling over d3-force and d3-brush.

(Note that I just barely know what I’m doing, by which I mean I don’t know what I’m doing. I’m able to get things to work, but I don’t always know why.)

I was very glad that I’d written a bunch of tests, because I could use those to figure out what was working and what was not working, and whether my changes were effective. Tests, tests, tests. There’s nothing better than tests for this sort of refactoring business.

I like the changes in D3 version 4, but trying to figure them out feels a bit like being back in 2012, when I was first trying to understand D3. I’ve not found many tutorials that explain how to use the new version, so I mostly read the API documentation (which isn’t easy for me to understand) or walked through the code for some of Michael Bostock’s examples. There are loads of books and tutorials on D3, but they’re almost all still talking about D3 version 3. (This will change shortly. For example, the 2nd edition of Scott Murray’s excellent Interactive Data Visualization for the Web is at the printer, and covers D3 version 4.)

Irene Ros’s slides on what’s new in D3 V4 were super helpful. (Also I just saw Tom Roth’s nice tutorial on d3-force, linked on the D3 tutorials page; duh.)

My D3 code isn’t particularly fancy. And note that I’m still writing in CoffeeScript; I toyed with the idea of switching to ES6, particularly because we can now use (d) => d.x. But I love CoffeeScript and don’t want to lose list comprehensions, and actually my favorite thing is:

some_option = chartOpts?.some_option ? the_default

I do fight with the indentation at times, but I’ve grown accustomed to it.
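
For anyone reading without CoffeeScript, the closest plain-JavaScript spelling of that line is probably optional chaining (?.) plus nullish coalescing (??), both of which arrived in the language well after ES6. A rough sketch; getOption and theDefault are made-up names for illustration, not anything in d3panels:

```javascript
// Hypothetical helper: return chartOpts.some_option if it is set,
// otherwise fall back to theDefault. Safe when chartOpts itself is
// null or undefined.
function getOption(chartOpts, theDefault) {
  return chartOpts?.some_option ?? theDefault;
}
```

Like the CoffeeScript version, this keeps values such as 0 or false that are defined but falsy, rather than replacing them with the default.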

Anyway, back to the point of this: what did I have to change to get d3panels and R/qtlcharts to work with D3 version 4?

Simple replacements

The bulk of the changes were simple replacements:

  • d3.scale.linear() → d3.scaleLinear()
  • d3.svg.line() → d3.line()
  • d3.scale.category20().range() → d3.schemeCategory20
  • d3.random.normal(10,3) → d3.randomNormal(10,3)

Slightly more tricky: d3.scale.ordinal().rangeBands([0,w],0,0) became d3.scaleBand().range([0,w]).
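
With both paddings at zero, either form just splits the interval [0,w] into equal-width bands, one per category. Here is a minimal plain-JavaScript sketch of that arithmetic; bandPositions is a hypothetical stand-in written for illustration, not D3 code:

```javascript
// Hypothetical stand-in for a zero-padding band scale; not part of D3.
// Splits [start, stop] into one equal-width band per category.
function bandPositions(categories, start, stop) {
  const bandwidth = (stop - start) / Math.max(1, categories.length);
  const position = {};
  categories.forEach((name, i) => {
    position[name] = start + i * bandwidth; // left edge of band i
  });
  return { position, bandwidth };
}
```

One related rename: the width of a band came from .rangeBand() in version 3, and comes from .bandwidth() on the band scale in version 4.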

Also, I had written (well, borrowed from somewhere) methods .moveToFront() and .moveToBack(), which I can omit as D3 version 4 has .raise() and .lower().

So really, the majority of the changes were made by searching for d3., making some tiny edit, and then seeing if my tests were working.

d3-force

My D3 code is rather primitive. I’m basically just drawing and then adding some .on("mouseover", something) or .on("click", something_else) for interactivity.

But force-directed graphics are pretty awesome, particularly for beeswarm-type plots, so I did make use of d3.layout.force in two places.

But I didn’t really know what I was doing before, and that made the change to d3.forceSimulation a bit more puzzling. My old code implementing a beeswarm-type dot chart was really ugly and so not worth looking at or discussing. The new version is just 21 lines (vs 65 lines before), and way easier to read.

For d3panels.dotchart, the main bit looks like this:

d3.range(scaledPoints.length).map( (i) ->
    scaledPoints[i].fy = scaledPoints[i].y)

force = d3.forceSimulation(scaledPoints)
      .force("x", d3.forceX((d) -> d.x))
      .force("collide", d3.forceCollide(pointsize*1.1))
      .on("tick", ticked)

I have a data set scaledPoints which is an array of objects with x and y values for point locations. I add .fy to each element of the array, to prevent the y values from being changed. (This is a beeswarm-type chart where the x-axis is a category, and I want those values to be dynamically adjusted using the force, but the y-axis is the quantitative value, and I don’t want those values to change.)

Next I use d3.forceSimulation, pass in my data, and then add a force that makes the points want to go towards their x value and another force that makes them not collide with each other.

Finally, I have the ticked function that does the updating of the point locations.

ticked = () ->
    points.attr("cx", (d) -> d.x)
          .attr("cy", (d) -> d.y)

And that’s that. There’s a bit more code since I want the option of having the opposite orientation, with the categories on the y-axis and the quantitative values on the x-axis, but really it’s just those 9 lines of code plus a couple of blank lines.
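
For readers who prefer ES6, the CoffeeScript above might translate roughly as follows. This is a sketch rather than the actual d3panels code: pinY and beeswarm are names invented here, and d3 is passed in as an argument so the sketch assumes nothing about how the library gets loaded.

```javascript
// Pin each point's y value. d3-force leaves a node's y alone
// when the node has a non-null `fy`.
function pinY(points) {
  points.forEach((d) => { d.fy = d.y; });
  return points;
}

// Hypothetical wrapper around the simulation set-up shown above.
// `d3` is the D3 version 4 library object; `onTick` redraws the points.
function beeswarm(d3, points, radius, onTick) {
  return d3.forceSimulation(pinY(points))
    .force("x", d3.forceX((d) => d.x))               // pull toward the category's x
    .force("collide", d3.forceCollide(radius * 1.1)) // keep points from overlapping
    .on("tick", onTick);
}
```

The tick callback would be the ES6 version of ticked, something like () => points.attr("cx", (d) => d.x).attr("cy", (d) => d.y).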

It was hard work figuring them out, but only because I was spending too much time hacking away without understanding, rather than trying to come to some understanding before doing any hacking.

d3-brush

The last major thing I had to figure out was d3-brush. I’ve not implemented any actual brushing in either d3panels or R/qtlcharts, but I did use d3.svg.brush in one of my tests of d3panels.scatterplot, to show that it could be done.

In my original brush code, which was applied to a matrix of three scatterplots, I was creating a separate brush for each of the three scatterplots. And with the old d3.svg.brush(), you’d pass x- and y-axis scales with .x() and .y().

You don’t pass scales to the new d3.brush(). Instead, you use d3.event.selection to grab the current selection in screen coordinates and then have to convert them back to plot coordinates with your scales' inverses.
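
That back-conversion might look something like the sketch below. The names are assumptions for illustration: linearScale is a tiny stand-in for d3.scaleLinear() so the example runs without D3, and selectionToPlot is a hypothetical helper, not part of d3-brush or d3panels.

```javascript
// Hypothetical stand-in for d3.scaleLinear(), with just enough
// behavior (call and .invert) to run this sketch without D3.
function linearScale(domain, range) {
  const [d0, d1] = domain;
  const [r0, r1] = range;
  const scale = (value) => r0 + ((value - d0) / (d1 - d0)) * (r1 - r0);
  scale.invert = (pixel) => d0 + ((pixel - r0) / (r1 - r0)) * (d1 - d0);
  return scale;
}

// `selection` is [[x0, y0], [x1, y1]] in screen coordinates,
// or null when the brush is empty.
function selectionToPlot(selection, xscale, yscale) {
  if (!selection) return null;
  const xs = [selection[0][0], selection[1][0]].map((p) => xscale.invert(p));
  const ys = [selection[0][1], selection[1][1]].map((p) => yscale.invert(p));
  return {
    xmin: Math.min(...xs), xmax: Math.max(...xs),
    ymin: Math.min(...ys), ymax: Math.max(...ys),
  };
}
```

Inside a brush handler this would be called as selectionToPlot(d3.event.selection, xscale, yscale). Taking the min and max matters because a y-axis scale usually runs from the bottom of the panel up, so the smaller pixel value corresponds to the larger data value.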

I revised my code to what I thought should work, and it did work for the first of the three scatterplots, but I got cryptic errors if I tried to brush the other two. And so finally, in the revised code, I decided to use a single brush that was applied across the three scatterplots. (It seems that you can implement multiple brushes, but it’s complicated.) The single-brush solution is perfectly fine for my test case, and actually it’s maybe easier to understand.

So the conversion from d3.svg.brush to d3.brush was really pretty easy. I needed to abandon the multiple brushes, and then the back-calculation from screen coordinates to plot coordinates is slightly tedious but not a big deal.

So that was the last thing, and I now have both d3panels and R/qtlcharts working for D3 version 4.

What’s the point?

The point of all this is that I have ideas for further plots I want to make in R/qtlcharts, such as a tool for exploring pleiotropy (that is, whether two traits are affected by a common genetic locus, or instead are each controlled by separate but closely linked loci). For that thing, I wanted a double-slider, and it seemed best to implement it using D3 version 4, which I did. But I want to use my slider with d3panels, and really I want to incorporate it into R/qtlcharts, so it was clear that I needed to spend some time refactoring.

And actually, I was surprised at how easy it was. (I thought it would be considerably more than a day’s work.) And I feel like I now kind of understand d3-force and d3-brush, so the effort involved was definitely worthwhile.