reproducibility guidelines - the stupidest thing...

I’m in Delft, The Netherlands, for a few days this week, at a workshop to develop guidelines for reproducibility for AGILE conference papers. I’m here as one of the outside “experts”. (It’s always a bit disconcerting to be considered an expert.) The leaders of this effort wrote an interesting PeerJ paper last year, evaluating the reproducibility of previous AGILE papers.

There will be a community webinar tomorrow (2019-04-02) at 16:00 CEST (which is 9:00am US central daylight time), to report our initial thoughts and get community feedback.

I’m particularly interested in developing these guidelines and using what I learn to further the reproducibility of work published in the journals Genetics and G3. These journals have been doing a great job of enforcing their policy that data be made publicly available. They’ve also had a policy that software be made available, but there has not been much effort to enforce that, and it’s clear that many authors need further pressure (or encouragement) before they will comply.

I’ve done a better job with the reproducibility of my recent papers (about R/qtl2 and genotype diagnostics), but it remains a challenge to keep things well-organized and well-documented.

Regarding these reproducibility guidelines, I think authors will benefit from very specific suggestions that can accommodate different levels of computational “sophistication.” And I think a variety of specific examples that authors might use as templates will be particularly helpful. I’ve been impressed with the care that Will Valdar’s group has been devoting to reproducibility. For example, the recent paper by Shorter et al. is accompanied by a proper R Compendium. Will’s student Robert Corty has GitHub repositories for his recent papers on variance QTL, including this and that.