From a1d18efc18772a233aa759b622c3a9960824f109 Mon Sep 17 00:00:00 2001
From: Justin Bedo
Date: Tue, 30 Apr 2019 09:47:49 +1000
Subject: cleanup examples and stray files

---
 abcbs_2018.md | 44 --------------------------------------------
 1 file changed, 44 deletions(-)
 delete mode 100644 abcbs_2018.md

(limited to 'abcbs_2018.md')

diff --git a/abcbs_2018.md b/abcbs_2018.md
deleted file mode 100644
index a8adc45..0000000
--- a/abcbs_2018.md
+++ /dev/null
@@ -1,44 +0,0 @@
-**Nix for reproducible research**
-
-Justin Bedő, Leon Di Stefano, and Tony Papenfuss
-
-> A challenge for bioinformaticians is to make our computations reproducible — that is, easy to rerun, combine, share, and guaranteed to generate the same results.
-> We show how Nix, a next generation cross-platform software deployment system, cleanly overcomes problems usually tackled with a combination of package managers (e.g., conda), containers (e.g., Docker, Singularity), and workflow engines (e.g., Toil, Ruffus).
->
-> On its own Nix can be used as a package manager; it can also easily create isolated development environments and export portable containers to share with others.
-> We have created a number of transparent and lightweight extensions that enable Nix to succinctly specify bioinformatics analysis environments and pipelines locally, in HPC environments, or in the cloud.
->
-> Nix uses hash-based naming to ensure that what it builds is uniquely specified, isolation and completeness to ensure that its build processes are deterministic, and a simple programming language to ensure that the whole system is easy to manage.
-> It has an extensive package collection, which includes all of CRAN and Bioconductor, and the conda package manager allowing access to Bioconda recipes.
-> Nix is well supported and general-purpose software that has been in development for over 10 years.
->
-> We will demonstrate how Nix with our extensions can be used to succinctly specify a typical bioinformatics pipeline and contrast this against other dedicated bioinformatics pipeline languages.
-> We then show how it can be executed in whole or in part on an HPC queuing system.
-> Finally, we show that the pipeline can also be executed using cloud resources.
-
-### Stuff to match in competitors
-
-- **A few standard pipelines**
-- Dealing with big files
-- Slightly complicated analyses
-- local, HPC, and cloud execution
-- Resumable, parallel
-- Bioconda import
-
-### Points of difference
-
-- **Full-stack reproducibility with one tool**
-- **A language rather than a configuration format (cf. CWL/JavaScript)**
-- Not bioinformatics-specific
-- Mature (~10y)
-- Containers obsolete (but easy to generate)
-- Higher level of reproducibility overall (hashing of inputs, outputs, derivations)
-- Safety
-  - Declarative language
-  - Type/tag system (to do)
-
-### Weaknesses
-
-- Small bioinformatics collection
-- No build execution stats
-- Subtleties around filesystems and the Nix store
-- 
cgit v1.2.3