author Justin Bedo <cu@cua0.org> 2019-06-11 16:44:36 +1000
committer Justin Bedo <cu@cua0.org> 2019-06-11 18:23:29 +1000
commit d12f635279ec1b43a57293c9c95919a489606d8c
tree 85339d6ca9fdbcc1a80a4963ee59ea557155ee8a
parent dc448fa97dfc683ccd491fc4a00e58b6d0f150ef
rewrite README documentation
Significant expansion of README to include installation instructions and instructions on how to use with HPC.
Diffstat (limited to 'README.md')
-rw-r--r-- README.md 122
1 files changed, 100 insertions, 22 deletions
diff --git a/README.md b/README.md
index 8858a25..be73a22 100644
--- a/README.md
+++ b/README.md
@@ -1,40 +1,118 @@
<h1 align="center"> BioNix </h1>
-BioNix is a tool for reproducible bioinformatics that unifies workflow engines, package managers, and containers.
-It is implemented as a lightweight library on top of the [Nix](https://nixos.org/nix/) deployment system.
+BioNix is a tool for reproducible bioinformatics that unifies workflow
+engines, package managers, and containers. It is implemented as a
+lightweight library on top of the [Nix](https://nixos.org/nix/)
+deployment system.
BioNix is currently a work in progress, so documentation is sparse.
+Please get in contact with us for more information, for help, or to
+contribute (see the bottom of this page).
-## Getting started
-
-Install [Nix](http://nixos.org/nix):
+## Installation
+BioNix has no dependencies beyond [Nix](http://nixos.org/nix),
+which can be installed with:
```{sh}
curl https://nixos.org/nix/install | sh
```
-To run a sample pipeline, clone this project and run `nix-build` in the `/examples` directory:
+If you do not have root access, a variety of [rootless
+install](https://nixos.wiki/wiki/Nix_Installation_Guide#Installing_without_root_permissions)
+options are available.
-```{sh}
-$ git clone https://github.com/PapenfussLab/bionix
-$ cd examples
-$ nix-build
-```
+API docs can be generated by executing `nix build` in the `doc`
+directory and viewing `result/OEBPS/index.html`.
+
+## Examples
+
+Several examples are available in `./examples/`. The main example is
+presented in `./examples/default.nix` and can be built using `nix build`
+in `./examples/`. This sample pipeline performs variant calling using
+[`platypus`](https://github.com/andyrimmer/Platypus), alignment using
+[`bwa mem`](https://github.com/lh3/bwa), and preprocessing using
+[`samtools`](http://www.htslib.org/).
+
+See the documentation in `./examples/README.md` for more detail about
+this pipeline and the other examples.
+
+- The pipeline itself is specified in `examples/call.nix` and
+ `examples/default.nix`.
+- The BioNix wrapper to run `platypus` is in
+ `tools/platypus-callVariants.nix`.
+- The Nix expression for the `platypus` software itself can be found in
+ [nixpkgs](https://github.com/NixOS/nixpkgs/blob/master/pkgs/applications/science/biology/platypus/default.nix).
+
+## Constructing workflows
+
+Writing workflows requires some familiarity with the Nix
+programming language and deployment system. Good introductions can be
+found [here](https://learnxinyminutes.com/docs/nix/) and
+[here](https://ebzzry.io/en/nix/).
+
+To understand how to construct workflows it is recommended to study the
+examples provided. Thanks to the flexibility of Nix, the workflows can
+be constructed in different ways to suit the intended purposes and the
+examples illustrate some of the ways one might approach various
+problems.
+
+For constructing tool wrappers, take a look in the `./tools/`
+directory for the currently existing tool wrappers. The wrappers for
+BWA are a good starting point.
+
+## HPC execution
+
+BioNix supports submission of jobs to computing queues rather than
+directly building them using the Nix build engine. The two supported
+engines are Slurm and PBS, represented by the `slurm` and `qsub` entries
+in the root BioNix tree. Each takes an attribute set of default
+parameters and returns a new tree of tools. Simply use tools out of
+these trees to submit jobs, and specify resource requirements as
+ordinary configuration options to the tools.
+
+The following resource parameters can be specified:
+
+- *ppn*: The number of cores to request;
+- *mem*: The amount of memory to request (GB);
+- *walltime*: A string defining the maximum walltime.
+
+As we rely on side effects to submit jobs, sandboxed builds cannot be
+used and sandboxing must be disabled (`--option sandbox false` with
+`nix-build` or `--no-sandbox` with `nix build`).
+
+### Slurm specifics
+
+Slurm jobs are submitted by executing the `salloc` binary on the
+cluster. By default this is assumed to be `/usr/bin/salloc`; if this is
+not the case on your cluster then you need to additionally specify the
+path to salloc via the `salloc` parameter.
-The sample pipeline performs variant calling using [`platypus`](https://github.com/andyrimmer/Platypus), alignment using [`bwa mem`](https://github.com/lh3/bwa), and preprocessing using [`samtools`](http://www.htslib.org/).
-BioNix will download or build all of the necessary software and create a soft link (`result`) to the workflow output.
+When launching the build, it is important that the `TMPDIR`
+environment variable points to a location which is on shared storage
+(i.e., available from all nodes). This will be the location used for
+temporary files during the execution of stages.
-Next, check out the code:
+### PBS specifics
-- The pipeline itself is specified in `examples/call.nix` and `examples/default.nix`.
-- The BioNix wrapper to run `platypus` is in `tools/platypus-callVariants.nix`.
-- The Nix expression for the `platypus` software itself can be found in [nixpkgs](https://github.com/NixOS/nixpkgs/blob/master/pkgs/applications/science/biology/platypus/default.nix).
+The PBS wrapper is considerably more complicated as initiating
+interactive processes is not as reliable as Slurm's `salloc`.
+Consequently, jobs are submitted via non-interactive queue submissions
+and the queue is polled to determine when the submitted job has completed.
-BioNix pipelines can be easily wrapped in shell scripts: see `examples/ex-tnpair/tnpair` for an example script that accepts a reference fasta, along with paired normal and tumor fastq files, and performs alignment, preprocessing, and variant calling with [`strelka`](https://github.com/Illumina/strelka).
+The path to the PBS executables (i.e., `qsub` and `qstat`) has to be
+given in the `qsubPath` attribute. Furthermore, a temporary directory
+that's shared across all nodes must be specified in `tmpDir`.
-Writing your own pipelines requires some familiarity with the Nix programming language and deployment system. Good introductions can be found [here](https://learnxinyminutes.com/docs/nix/) and [here](https://ebzzry.io/en/nix/).
+## Distributed execution
-We have successfully run BioNix pipelines in a zero-install manner (using a [statically linked binary](https://matthewbauer.us/blog/static-nix.html) and [user namespaces](https://www.redhat.com/en/blog/whats-next-containers-user-namespaces)), but this feature is currently unstable. Stay tuned!
+Nix has support for distributing jobs across a collection of
+machines. See the
+[manual](https://nixos.org/nix/manual/#chap-distributed-builds) and
+[wiki](https://nixos.wiki/wiki/Distributed_build) for more information.
-## Contact
+## Getting help and contributing
-Please come chat with us at [#bionix:cua0.org](http://matrix.to/#/#bionix:cua0.org).
+For general questions, issues, and
+[contributions](https://git-send-email.io), please
+[email](mailto:bionix@cua0.org) or [subscribe
+to](mailto:bionix+subscribe@cua0.org) our mailing list. You may also
+chat with us at [#bionix:cua0.org](http://matrix.to/#/#bionix:cua0.org).