R/lmmlite is a light implementation of the fit of linear mixed models for QTL analysis, following pylmm. It includes both plain R and C++ implementations (the latter using RcppEigen).
Here I seek to compare the speed of my R and C++ implementations to pylmm. For timings within R, I use the microbenchmark package.
The source for this vignette is at GitHub.
I’ll focus on the recla
dataset that I include with the package. This is data on a set of diversity outcross mice, from Recla et al. (2014) and Logan et al. (2013). The data are originally from the QTL Archive/Mouse Phenotype Database and I’ve also placed them online in the format used by R/qtl2. I used R/qtl2 to calculate genotype probabilities from the MUGA SNP array data. The data include 261 mice, and form a list with three components: the estimated kinship matrix, a matrix with 26 phenotypes, and a covariate matrix that contains an intercept (all 1’s) and sex (0 for female and 1 for male).
library(lmmlite)
library(microbenchmark)
library(broman)
data(recla)
As we’ll see, about half of the time is spent on an initial eigen decomposition of the kinship matrix. Missing values in the phenotype data are a bit of an annoyance, as we’ll need to re-compute the eigen decomposition for each batch of phenotypes that has the same set of missing values, as we need to omit individuals with missing phenotypes from the analysis.
I first determine the batches of phenotypes with common missing data patterns.
batch_cols_by_missing <-
function(mat)
{
# pattern of missing data (as character string with 0's and 1's
pat <- apply(is.na(mat), 2, function(a) paste(as.numeric(a), collapse=""))
u <- unique(pat)
result <- split(seq(along=pat), match(pat, u))
names(result) <- NULL
missing_pat <- lapply(strsplit(u, ""), function(a) which(a=="1"))
attr(result, "missing_pattern") <- missing_pat
result
}
batches <- batch_cols_by_missing(recla$pheno)
Now let’s compare the full analysis for the R and C++ implementations in lmmlite. I’ll run the code 10 times.
time_full <- microbenchmark(R={
for(i in seq(along=batches)) {
omit <- attr(batches, "missing_pattern")[[i]]
keep <- is.na(match(1:nrow(recla$pheno), omit))
eigenrot <- eigen_rotation(recla$kinship[keep,keep],
recla$pheno[keep,batches[[i]]],
recla$covar[keep,], use_cpp=FALSE)
for(j in 1:ncol(eigenrot$y))
res <- fitLMM(eigenrot$Kva, eigenrot$y[,j], eigenrot$X, reml=TRUE, use_cpp=FALSE)
} },
cpp={
for(i in seq(along=batches)) {
omit <- attr(batches, "missing_pattern")[[i]]
keep <- is.na(match(1:nrow(recla$pheno), omit))
eigenrot <- eigen_rotation(recla$kinship[keep,keep],
recla$pheno[keep,batches[[i]]],
recla$covar[keep,], use_cpp=TRUE)
for(j in 1:ncol(eigenrot$y))
res <- fitLMM(eigenrot$Kva, eigenrot$y[,j], eigenrot$X, reml=TRUE, use_cpp=TRUE)
} }, times=10)
print(time_full, digits=4)
## Unit: milliseconds
## expr min lq mean median uq max neval cld
## R 421.4 422.0 450.7 449.3 476.5 490.4 10 b
## cpp 168.3 170.1 170.7 170.4 171.3 174.1 10 a
The average computation time for the R code is 451 milliseconds, while the C++ code takes 171 milliseconds, a speed-up of 2.6x.
Here’s just the code to do the initial eigen decomposition and the “rotation” of the phenotypes and covariates.
time_eigen <- microbenchmark(R={
for(i in seq(along=batches)) {
omit <- attr(batches, "missing_pattern")[[i]]
keep <- is.na(match(1:nrow(recla$pheno), omit))
eigenrot <- eigen_rotation(recla$kinship[keep,keep],
recla$pheno[keep,batches[[i]]],
recla$covar[keep,], use_cpp=FALSE)
} },
cpp={
for(i in seq(along=batches)) {
omit <- attr(batches, "missing_pattern")[[i]]
keep <- is.na(match(1:nrow(recla$pheno), omit))
eigenrot <- eigen_rotation(recla$kinship[keep,keep],
recla$pheno[keep,batches[[i]]],
recla$covar[keep,], use_cpp=TRUE)
} }, times=10)
print(time_eigen, digits=4)
## Unit: milliseconds
## expr min lq mean median uq max neval cld
## R 209.2 268.3 275.3 269.8 274.8 327.9 10 b
## cpp 159.1 160.0 167.2 161.5 162.7 221.1 10 a
The mean computation time for the R code is 275 milliseconds, while the C++ code takes 167 milliseconds, a speed-up of 1.6x.
I’ll re-calculate the eigen decomposition for each batch, and save the results
eigenrot_cpp <- eigenrot_R <- vector("list", length(batches))
for(i in seq(along=batches)) {
omit <- attr(batches, "missing_pattern")[[i]]
keep <- is.na(match(1:nrow(recla$pheno), omit))
eigenrot_R[[i]] <- eigen_rotation(recla$kinship[keep,keep],
recla$pheno[keep,batches[[i]]],
recla$covar[keep,], use_cpp=FALSE)
eigenrot_cpp[[i]] <- eigen_rotation(recla$kinship[keep,keep],
recla$pheno[keep,batches[[i]]],
recla$covar[keep,], use_cpp=TRUE)
}
And now I’ll time the fit of the LMM.
time_llm <- microbenchmark(R={
for(i in seq(along=batches)) {
for(j in 1:ncol(eigenrot_R[[i]]$y))
res <- fitLMM(eigenrot_R[[i]]$Kva, eigenrot_R[[i]]$y[,j], eigenrot_R[[i]]$X, reml=TRUE, use_cpp=FALSE)
} },
cpp={
for(i in seq(along=batches)) {
for(j in 1:ncol(eigenrot_cpp[[i]]$y))
res <- fitLMM(eigenrot_cpp[[i]]$Kva, eigenrot_cpp[[i]]$y[,j], eigenrot_cpp[[i]]$X, reml=TRUE, use_cpp=TRUE)
} }, times=10)
print(time_llm, digits=4)
## Unit: milliseconds
## expr min lq mean median uq max neval cld
## R 161.464 162.051 164.437 163.37 163.698 178.903 10 b
## cpp 9.654 9.683 9.711 9.69 9.741 9.826 10 a
The mean computation time for the R code is 164 milliseconds, while the C++ code takes 10 milliseconds, a speed-up of 16.9x.
Now I’ll try running pylmm in the same way. First, I need to save the data to a set of CSV files.
for(i in c("kinship", "pheno", "covar"))
write.table(recla[[i]], paste0(i, ".csv"), sep=",", quote=FALSE,
row.names=FALSE, col.names=FALSE)
I’m not sure the best way to time the code, so I’ll just grab the start and stop time for the interesting bit.
# this chunk is python code
import lmm
import numpy
import time
n_times = 10
# load the datasets
kinship = numpy.genfromtxt('kinship.csv', delimiter=',')
pheno = numpy.genfromtxt('pheno.csv', delimiter=',')
covar = numpy.genfromtxt('covar.csv', delimiter=',')
# loop over the columns in pheno
# fit the LMM by REML or ML and print the results to STDIN
start_time = time.time()
for j in range(n_times):
for i in range(pheno.shape[1]):
result = lmm.LMM(pheno[:,[i]], kinship, X0=covar).fit(REML=True)
stop_time = time.time()
ave_time = (stop_time - start_time)/n_times
open('pylmm_time.txt', 'w').write(str(ave_time) + "\n")
This isn’t really a fair comparison, but I’m seeing a time of 1.47 sec for pylmm, 0.45 sec for the R version of lmmlite, and 0.17 sec for the C++ version of lmmlite, so speed-ups of 3.3x and 8.6x for the R and C++ code in lmmlite, vs pylmm.
As a more fair comparison, I’ll re-run the lmmlite analysis, considering each phenotype individually.
time_full2 <- microbenchmark(R={
for(i in 1:ncol(recla$pheno)) {
keep <- !is.na(recla$pheno[,i])
eigenrot <- eigen_rotation(recla$kinship[keep,keep],
recla$pheno[keep,i],
recla$covar[keep,], use_cpp=FALSE)
res <- fitLMM(eigenrot$Kva, eigenrot$y, eigenrot$X, reml=TRUE, use_cpp=FALSE)
}},
cpp={
for(i in 1:ncol(recla$pheno)) {
keep <- !is.na(recla$pheno[,i])
eigenrot <- eigen_rotation(recla$kinship[keep,keep],
recla$pheno[keep,i],
recla$covar[keep,], use_cpp=TRUE)
res <- fitLMM(eigenrot$Kva, eigenrot$y, eigenrot$X, reml=TRUE, use_cpp=TRUE)
}}, times=10)
print(time_full2, digits=4)
## Unit: seconds
## expr min lq mean median uq max neval cld
## R 1.649 1.714 1.736 1.737 1.747 1.837 10 b
## cpp 1.058 1.061 1.081 1.065 1.073 1.217 10 a
Now, in comparison to 1.47 sec for pylmm, I see 1.74 sec for the R version of lmmlite, and 1.08 sec for the C++ version of lmmlite, so speed-ups of 0.8x and 1.4x for the R and C++ code in lmmlite, vs pylmm. (Actually, the R version of lmmlite is slower than pylmm, so that’s not really a “speed-up”. The C++ version is faster, though.)
devtools::session_info()
## Session info --------------------------------------------------------------
## setting value
## version R version 3.2.2 (2015-08-14)
## system x86_64, darwin13.4.0
## ui X11
## language en_US.UTF-8
## collate en_US.UTF-8
## tz America/Chicago
## date 2015-11-24
## Packages ------------------------------------------------------------------
## package * version date source
## assertthat 0.1 2013-12-06 CRAN (R 3.2.0)
## broman * 0.61-3 2015-11-22 local
## codetools 0.2-14 2015-07-15 CRAN (R 3.2.0)
## colorspace 1.2-6 2015-03-11 CRAN (R 3.2.0)
## devtools 1.9.1 2015-09-11 CRAN (R 3.2.0)
## digest 0.6.8 2014-12-31 CRAN (R 3.2.0)
## evaluate 0.8 2015-09-18 CRAN (R 3.2.0)
## formatR 1.2.1 2015-09-18 CRAN (R 3.2.0)
## ggplot2 1.0.1.9003 2015-11-11 Github (hadley/ggplot2@23a748f)
## gtable 0.1.2 2012-12-05 CRAN (R 3.2.0)
## htmltools 0.2.6 2014-09-08 CRAN (R 3.2.0)
## knitr 1.11.16 2015-11-16 Github (yihui/knitr@6e8ce0c)
## lattice 0.20-33 2015-07-14 CRAN (R 3.2.0)
## lmmlite * 0.1-9 2015-11-24 local
## magrittr 1.5 2014-11-22 CRAN (R 3.2.0)
## memoise 0.2.1 2014-04-22 CRAN (R 3.2.0)
## microbenchmark * 1.4-2 2014-09-28 CRAN (R 3.2.0)
## multcomp 1.4-1 2015-07-23 CRAN (R 3.2.0)
## munsell 0.4.2 2013-07-11 CRAN (R 3.2.0)
## mvtnorm 1.0-3 2015-07-22 CRAN (R 3.2.0)
## plyr 1.8.3 2015-06-12 CRAN (R 3.2.0)
## Rcpp 0.12.2 2015-11-15 CRAN (R 3.2.2)
## rmarkdown 0.8.1 2015-10-10 CRAN (R 3.2.2)
## sandwich 2.3-4 2015-09-24 CRAN (R 3.2.2)
## scales 0.3.0 2015-08-25 CRAN (R 3.2.0)
## stringi 1.0-1 2015-10-22 CRAN (R 3.2.0)
## stringr 1.0.0 2015-04-30 CRAN (R 3.2.0)
## survival 2.38-3 2015-07-02 CRAN (R 3.2.0)
## TH.data 1.0-6 2015-01-05 CRAN (R 3.2.0)
## yaml 2.1.13 2014-06-12 CRAN (R 3.2.0)
## zoo 1.7-12 2015-03-16 CRAN (R 3.2.0)