I just spent some time updating my d3panels library and R/qtlcharts package for D3 version 4. It took just about a day, and the majority of the time was spent puzzling over d3-force and d3-brush.
(Note that I just barely know what I’m doing, by which I mean I don’t know what I’m doing. I’m able to get things to work, but I don’t always know why.)
I was very glad that I’d written a bunch of tests, because I could use those to figure out what was working and what was not working, and whether my changes were effective. Tests, test, tests. There’s nothing better than tests for this sort of refactoring business.
I like the changes in D3 version 4, but trying to figure them out feels a bit like being back in 2012, when I was first trying to understand D3: I’ve not found many tutorials that explain how to use the new version, so I mostly focused on reading the API documentation, which isn’t easy for me to understand, or to walk through the code for some of [Michael Bostock]’s examples. There are loads of books and tutorials on D3, but they’re almost all still talking about D3 version 3. (This will change shortly. For example, the 2nd edition of Scott Murray’s excellent Interactive Data Visualization for the Web is at the printer, and covers D3 version 4.)
Irene Ros’s slides on what’s new in D3 V4 were super helpful. (Also I just saw Tom Roth’s nice tutorial on d3-force, linked on the D3 tutorials page; duh.)
My D3 code isn’t particularly fancy. (And note that I’m still writing
in CoffeeScript; I toyed with the idea of
switching to ES6,
particularly because we can now use (d) => d.x
.
But I love CoffeeScript and don’t want to lose list comprehensions,
and actually my favorite thing is:
some_option = chartOpts?.some_option ? the_default
I do fight with the indentation at times, but I’ve grown accustomed to it.
Anyway, back to the point of this: what did I have to change to get d3panels and R/qtlcharts to work with D3 version 4?
Simple replacements
The bulk of the changes were simple replacements:
d3.scale.linear()
→d3.scaleLinear()
d3.svg.line()
→d3.line()
d3.scale.category20().range()
→d3.schemeCategory20
d3.random.normal(10,3)
→d3.randomNormal(10,3)
Slightly more tricky:
d3.scale.ordinal().rangeBands([0,w],0,0)
became
d3.scaleBand().range([0,w])
.
Also, I had written (well, borrowed from somewhere) methods
.moveToFront()
and .moveToBack()
, which I can omit as D3 version 4
has .raise()
and .lower()
.
So really, the majority of the changes were made by for d3.
,
making some tiny edit, and then seeing if my tests were working.
d3-force
My D3 code is rather primitive. I’m basically just drawing and then
adding some .on("mouseover", something)
or .on("click", something_else)
for interactivity.
But force-directed graphics are pretty awesome, particularly for
beeswarm-type plots, so I
did make use of d3.layout.force
in two places.
But I didn’t really know what I was doing before, and that made the
change to d3.forceSimulation
a bit more puzzling. My code
implementing a beeswarm-type dot
chart
is really ugly and so not worth looking at or discussing. The new
version
is just 21 lines (vs 65 lines before), and way easier to read.
For
d3panels.dotchart
,
the main bit looks like this:
d3.range(scaledPoints.length).map( (i) ->
scaledPoints[i].fy = scaledPoints[i].y)
force = d3.forceSimulation(scaledPoints)
.force("x", d3.forceX((d) -> d.x))
.force("collide", d3.forceCollide(pointsize*1.1))
.on("tick", ticked)
I have a data set scaledPoints
which is an array of objects with x
and y
values for
point locations. I add .fy
to each element of the array, to prevent the
y
values from being changed. (This is a beeswarm-type chart
where the x-axis is a category, and I want those values to be
dynamically adjusted using the force, but the y-axis is the
quantitative value, and I don’t want those values to change.)
Next I use d3.forceSimulation
, pass in my data, and then add a force
that makes the points want to go towards their x value another another
force that makes them not collide with each other.
Finally, I have the ticked
function that does the updating of the
point locations.
ticked = () ->
points.attr("cx", (d) -> d.x)
.attr("cy", (d) -> d.y)
And that’s that. There’s a bit more code since I want the option of having the opposite orientation, with the categories on the y-axis and the quantitative values on the x-axis, but really it’s just those 9 lines of code plus a couple of blank lines.
It was hard work figuring them out, but only because I was spending too much time hacking away without understanding, rather than trying to come to some understanding before doing any hacking.
d3-brush
The last major thing I had to figure out was d3-brush. I’ve not
implemented any actual brushing in either
d3panels or
R/qtlcharts, but I did use
d3.svg.brush
in one my tests of
d3panels.scatterplot
,
to show that it could be done.
In my original brush
code,
which was applied for a matrix of three scatterplots, I was creating a
separate brush for each of the three scatterplots. And with the old
d3.svg.brush()
, you’d pass x- and y-axis scales with .x()
and
.y()
.
You don’t pass scales to the new d3.brush()
. Instead, you use
d3.event.selection
to grab the current selection in screen
coordinates and then have to convert them back to plot coordinates
with your scales' inverses.
But I revised my code to what I thought should be working, and which was actually working for the first of the three scatterplots, but I got cryptic errors if I tried to brush the other two scatterplots. And so finally, in the revised code, I decided to use a single brush that was applied across the three scatterplots. (It seems that you can implement multiple brushes, but it’s complicated.) The single-brush solution is perfectly fine for my test case, and actually it’s maybe easier to understand.
So the conversion from d3.svg.brush
to d3.brush
was really pretty
easy. I needed to abandon the multiple brushes, and then the
back-calculation from screen coordinates to plot coordinates is
slightly tedious but not a big deal.
So that was the last thing, and I now have both d3panels and R/qtlcharts working for D3 version 4.
What’s the point?
The point of all this is that I have ideas for further plots I want to make in R/qtlcharts, such as a tool for exploring pleiotropy (that is, whether two traits are affected by a common genetic locus, or instead are each controlled by separate but closely linked loci). For that thing, I wanted a double-slider, and it seemed best to implement it using D3 version 4, which I did. But I want to use my slider with d3panels, and really I want to incorporate it into R/qtlcharts, so it was clear that I needed to spend some time refactoring.
And actually, I was surprised at how easy it was. (I thought it would be considerably more than a day’s work.) And I feel like I now kind of understand d3-force and d3-brush, so the effort involved was definitely worthwhile.