author | Justin Bedo <cu@cua0.org> | 2018-09-06 16:21:17 +1000 |
---|---|---|
committer | Justin Bedo <cu@cua0.org> | 2018-09-06 16:21:17 +1000 |
commit | 0427cd1c270d3e64b368ed1da2ddef95816fab5a (patch) | |
tree | b8c2c6a1efe38fd40d8d90843a73ab4167854344 | |
parent | 895cad8c9c6fa7357e6ac2ba5812ee8652264789 (diff) |
Editing
-rw-r--r-- | abcbs_2018.md | 12 |
1 files changed, 8 insertions, 4 deletions
diff --git a/abcbs_2018.md b/abcbs_2018.md
index d13b9f3..63bb0a7 100644
--- a/abcbs_2018.md
+++ b/abcbs_2018.md
@@ -2,10 +2,14 @@
 
 Justin Bedő, Leon Di Stefano, and Tony Papenfuss
 
-> We show how Nix, a next generation cross-platform software deployment system, cleanly solves a number of reproducibility headaches in bioinformatics and computational biology.
-> Nix uses hash-based naming to ensure that its builds are uniquely-specified, isolation and completeness to ensure that they are deterministic, and a very simple programming language to ensure that they are easily-manageable.
+> A cornerstone of science is reproducibility, the ability to independently verify experimental research. For bioinformatics to support scientific reproducibility, the computational portion of a research project has to be well specified and recomputable. However, it is difficult to guarantee reproducibility for a bioinformatics pipeline, in part due to the large number of software invoked, their complicated interactions, and the size of our data. Recent approaches such as containerisation does not solve this problem as it simply shifts the difficulty to managing containers instead of managing software artifacts. Furthermore, the execution of a pipeline is usually disjoint from the container construction, adding further management difficulties.
+
+> We show how Nix, a next generation cross-platform software deployment system, can cleanly solve a number of reproducibility headaches in bioinformatics and computational biology.
+> Nix uses hash-based naming to ensure that its builds are uniquely specified, isolation and completeness to ensure that they are deterministic, and a simple programming language to ensure that they are easily manageable.
+> Nix is well supported and mature software with a large community that has been in development for over 10 years.
 >
-> Nix, like tools such as (mini)conda, can straightforwardly create and manage isolated environments, and with our transparent and lightweight extensions it can also succinctly describe computational pipelines, manage their execution in HPC environments or in parallel across a collection of machines, and produce portable containers (Docker or Singularity images) to share with others.
-> Nix has an extensive package collection which includes the whole of CRAN and Bioconductor, and while it lacks Bioconda's coverage of standalone bioinformatics tools, we show that Bioconda can be used within Nix expressions (with some attendant loss of reproducibility).
+> With our transparent and lightweight extensions Nix succinctly describe computational pipelines, manage their execution in HPC environments or in parallel across a collection of machines, and produce containers (e.g., Docker, Singularity) to share with others.
+> Nix has an extensive package collection that includes the whole of CRAN and Bioconductor, which can be leveraged in our pipelines.
+> While Nix lacks Bioconda's coverage of standalone bioinformatics tools, we show that Bioconda can be used within Nix expressions, with some attendant loss of reproducibility.
 >
 > In our talk we will use Nix to specify a typical bioinformatics pipeline, and show how it can be executed in whole or in part on an HPC queuing system.