ISB-CGC-pipelines Framework
This framework is built around the Google Genomics Pipelines API (described in more detail below). It is intended to let you run single tasks at scale, tailor how and when the tasks are submitted, and monitor them as they finish.
Note
With the advent of Google’s dsub, the ISB-CGC-pipelines framework is deprecated and no longer supported.
Google Genomics Pipelines
The so-called “Pipelines API” is a task runner that lets you run a command-line executable in Docker on a Google Compute Engine VM. Since it is truly a “task” runner rather than a full “pipeline” runner, we generally refer to it as GGP so that the usage of the word “pipeline” is not confusing. We also find the additional term “API” unnecessary.
GGP can be “called” either through a command-line interface (part of the Cloud SDK gcloud tool)
or as a web service API that can be called programmatically.
When using GGP from the command line, each task is defined in a YAML (or JSON) file.
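To make the idea of a task definition file concrete, here is a minimal Python sketch that builds one task description and serializes it as JSON. The field names (`name`, `docker.imageName`, `docker.cmd`, `resources.minimumCpuCores`, `resources.minimumRamGb`) are assumptions recalled from the older Pipelines API schema and should be checked against the Google documentation; the helper `make_task_definition`, the task name, the image, and the command are purely illustrative.

```python
import json


def make_task_definition(name, image, command, cpu_cores=1, ram_gb=3.75):
    """Build a dict describing a single containerized task.

    The keys used here are assumed, not confirmed by this document;
    verify them against the official schema before relying on them.
    """
    return {
        "name": name,
        # The Docker image to run and the command executed inside it.
        "docker": {"imageName": image, "cmd": command},
        # Minimum VM resources requested for the task.
        "resources": {"minimumCpuCores": cpu_cores, "minimumRamGb": ram_gb},
    }


# Hypothetical example task: run a trivial shell command in a stock image.
task = make_task_definition("say-hello", "ubuntu", "echo 'hello from the task'")

# Serialize to JSON, one of the two file formats mentioned above.
task_json = json.dumps(task, indent=2)
print(task_json)
```

The resulting JSON text could be saved to a file and handed to the command-line tool as the task definition. Generating the definition programmatically like this is mainly useful when many similar tasks differ only in a few values, such as the input file or the command.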
The Google documentation for the “Genomics Pipelines” can be found here and on readthedocs, and there are numerous easy-to-follow examples on GitHub here.