BioNix is a tool for reproducible bioinformatics that unifies workflow engines, package managers, and containers. It is implemented as a lightweight library on top of the [Nix](https://nixos.org/nix/) deployment system.

BioNix is currently a work in progress, so documentation is sparse. Please get in contact with us for more information, for help, or to contribute (see the bottom of this page).

## Installation

BioNix requires no dependencies beyond [Nix](http://nixos.org/nix), which may be installed by:

```{sh}
curl -L https://nixos.org/nix/install | sh
```

If you do not have root access, a variety of [rootless install](https://nixos.wiki/wiki/Nix_Installation_Guide#Installing_without_root_permissions) options are available.

API docs can be generated by executing `nix build` in the `doc` directory and viewing `result/OEBPS/index.html`.

## Examples

Several examples are available in `./examples/`. The main example is presented in `./examples/default.nix` and can be built using `nix build` in `./examples/`. This sample pipeline performs variant calling using [`octopus`](https://github.com/luntergroup/octopus), alignment using [`bwa mem`](https://github.com/lh3/bwa), and preprocessing using [`samtools`](http://www.htslib.org/). See the documentation in `./examples/README.md` for more detail about this pipeline and the other examples.

- The pipeline itself is specified in `examples/call.nix` and `examples/default.nix`.
- The BioNix wrapper to run `octopus` is in `tools/octopus-call.nix`.
- The Nix expression for the `octopus` software itself can be found in [nixpkgs](https://github.com/NixOS/nixpkgs/blob/master/pkgs/by-name/oc/octopus-caller/package.nix).

## Constructing workflows

Writing workflows requires some familiarity with the Nix programming language and deployment system. Good introductions can be found [here](https://learnxinyminutes.com/docs/nix/) and [here](https://github.com/tazjin/nix-1p).

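
For orientation, the sketch below shows a few core Nix language features that any Nix-based workflow definition relies on: attribute sets, functions that take an attribute set of options followed by an input, and `let` bindings. It is plain Nix and does not call any BioNix function; the names `defaults` and `describe` and the option names `cores` and `memory` are hypothetical and chosen only for illustration.

```nix
let
  # An attribute set of default options (hypothetical names).
  defaults = { cores = 4; memory = 8; };

  # A curried function: it takes an attribute set of options first,
  # then an input, and returns a string describing the step.
  describe = { cores, memory }: input:
    "process ${input} using ${toString cores} cores and ${toString memory} GB";
in
# `//` merges two attribute sets; attributes on the right override the left.
# This evaluates to "process sample.fastq using 8 cores and 8 GB".
describe (defaults // { cores = 8; }) "sample.fastq"
```

Saved to a file (for example `sketch.nix`, an arbitrary name), the expression can be evaluated with `nix-instantiate --eval sketch.nix`. Real workflows build derivations rather than strings, so this only illustrates the syntax.
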
To understand how to construct workflows, it is recommended to study the examples provided. Thanks to the flexibility of Nix, workflows can be constructed in different ways to suit the intended purpose, and the examples illustrate some of the ways one might approach various problems.

For constructing tool wrappers, take a look at the existing wrappers in the `./tools/` directory. The wrappers for BWA are a good starting point.

## HPC execution

BioNix supports submitting jobs to computing queues rather than building them directly with the Nix build engine. The two supported engines are Slurm and PBS, represented by the `slurm` and `qsub` entries in the root BioNix tree. Each takes an attribute set of default parameters and returns a new tree of tools. Use tools out of these trees to submit jobs, and specify resource requirements as ordinary configuration options to the tools. The following resource parameters can be specified:

- *ppn*: The number of cores to request;
- *mem*: The amount of memory to request (GB);
- *walltime*: A string defining the maximum walltime.

Because job submission relies on side effects, sandboxed builds cannot be used: the sandbox must be disabled (`--option sandbox false` with `nix-build` or `--no-sandbox` with `nix build`).

### Slurm specifics

Slurm jobs are submitted by executing the `salloc` binary on the cluster. By default this is assumed to be `/usr/bin/salloc`; if this is not the case on your cluster, you need to additionally specify the path to `salloc` via the `salloc` parameter.

When launching the build, it is important that the `TMPDIR` environment variable points to a location on shared storage (i.e., available from all nodes). This location is used for temporary files during the execution of stages.

### PBS specifics

The PBS wrapper is considerably more complicated, as initiating interactive processes is not as reliable as with Slurm's `salloc`.
Consequently, jobs are submitted via non-interactive queue submissions and the queue is polled to determine when the submitted job has completed. The path to the PBS executables (i.e., `qsub` and `qstat`) has to be given in the `qsubPath` attribute. Furthermore, a temporary directory that is shared across all nodes must be specified in `tmpDir`.

## Distributed execution

Nix supports distributing jobs across a collection of machines. See the [manual](https://nixos.org/nix/manual/#chap-distributed-builds) and [wiki](https://nixos.wiki/wiki/Distributed_build) for more information.

## Citing

1. Bedő, J., Di Stefano, L., & Papenfuss, A. T. (2020). Unifying package managers, workflow engines, and containers: Computational reproducibility with BioNix. GigaScience, 9(11). https://doi.org/10.1093/gigascience/giaa121

## Getting help and contributing

For general questions and for reporting problems, please open an issue. For real-time help there is a chat room at [#bionix:nixos.org](https://matrix.to/#/#bionix:nixos.org).