Differences between R/qtl and R/qtl2
There have been a number of big changes. We summarize some of them here in order to assist long-time R/qtl users in their transition to R/qtl2.
New data file formats
R/qtl cross objects can be converted to the new format with the
qtl2geno::convert2cross2() function, and so one can continue to use the R/qtl function
read.cross() to read in data and then convert them to the new format:
library(qtl) mycross <- read.cross("csv", file="mydata.csv") library(qtl2geno) mycross <- convert2cross2(mycross)
New data structures
The data structures used by R/qtl2 are completely different than those used by R/qtl. The new data structures are no simpler than before, but they tend to be a bit “flatter” (that is, less deeply nested). Most everything is a “list”, and we’re using fewer “attributes” and instead including such things as components at the top level.
Split into multiple packages
R/qtl2 is not a single package as R/qtl is. Rather it’s split into multiple packages:
- qtl2geno, for calculating genotype probabilities, imputations, and genetic maps
- qtl2scan, for QTL genome scans and related calculations
- qtl2plot, for data visualization
- qtl2convert, for converting data among the R/qtl2, R/qtl, and DOQTL formats
We have in mind that, for high-dimensional data, the QTL genotype probability calculations with qtl2geno will be performed once and saved, and that the genome scans with qtl2scan will be performed in “batch” (e.g., on a cluster) and also saved, and that interactive analyses will mostly be in the data visualizations with qtl2plot.
It can be confusing to remember which function is in which package. For this reason, we created an additional, largely empty package qtl2. If you load qtl2 with
library(qtl2), the three main packages, qtl2geno, qtl2scan, and qtl2plot, will all be loaded.
New functions names
The names of all of the main functions have changed, mostly with periods replaced by underscores. For example:
Treatment of intermediate calculations
We’re no longer storing intermediate calculations as part of the cross object. For example,
calc_genoprob(), to calculate QTL genotype probabilities given the observed marker data, returns a list with the probabilities.
scan1(), to perform a genome scan, takes these probabilities plus a phenotype matrix.
Use of individual identifiers for aligning data
Individual identifiers are now used to ensure the alignment of individuals, for example between the QTL genotype probabilties and the phenotype data.
For example, in
scan1(), to perform a genome scan, it’s necessary that the phenotype data carry the corresponding individual IDs as row names.
As a result, when subsetting out, say, females, when calling
scan1, you only need to subset one of the inputs, and the rest will be automatically subset for you.
out_all <- scan1(probs, pheno, kinship) out_f <- scan1(probs, pheno[sex=="female",], kinship)
Order of arguments when subsetting cross objects
In R/qtl, when subsetting a cross object, you can use square brackets, like this:
library(qtl) mycross[chr, ind]
But the order of those two arguments was not very well chosen. It’s better to think of individuals as rows and chromosomes as column, and so put the individuals first.
And so in R/qtl2, we’ve switched the order of arguments: in bracket subsetting of cross objects, individuals now come first.
library(qtl2geno) mycross[ind, chr]
In scan of X chr, need to provide special covariates
In R/qtl, when you perform a single-QTL scan of the X chromosome, it identifies appropriate covariates to include, to avoid spurious linkage due to sex and cross-direction differences.
In R/qtl2, you need to provide such covariates yourself via the
Xcovar argument to
scan1(), There is a function in qtl2geno,
get_x_covar, for deriving these, but you’re a bit more on your own.