Running Batch Jobs on GCP via dsub
Official
Run viral-genomics jobs in parallel on GCP.
Data & Analytics
#batch processing #gcs #parallel jobs #dsub #google cloud batch #dockerized bioinformatics #viral genomics
Author: broadinstitute
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This Skill helps you execute compute-heavy viral NGS tasks on Google Cloud in a reliable, scalable way when running locally is too slow, too memory-intensive, or requires specialized Docker images.
Core Features & Use Cases
- Parallel batch execution with dsub: Runs many independent jobs from a single task TSV to maximize throughput.
- Dockerized tool execution: Uses a specified container image so tool versions and dependencies stay consistent.
- GCS-first data flow: Downloads declared inputs from GCS and uploads outputs/logs back to GCS for auditing and debugging.
- Common viral analysis scenarios: Useful for memory-heavy steps like VADR, BLAST, and genome assembly, especially when processing dozens of sequences or assemblies.
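As a rough illustration of the "single task TSV" and GCS-first data flow described above, the following Python sketch builds the text of a dsub tasks file. In dsub's tasks format, the header row names each column as `--env`, `--input`, or `--output` followed by a variable name, and every later row describes one job. The bucket name, path layout, sample names, and the variable names `SAMPLE`, `FASTA`, and `REPORT` are assumptions made for this example and are not taken from the Skill itself.

```python
def build_tasks_tsv(samples, bucket):
    """Return the text of a dsub tasks file: a header row plus one row per job.

    The bucket and the path layout below are hypothetical placeholders.
    """
    # Header cells pair a parameter kind with the variable name the job script sees.
    rows = [["--env SAMPLE", "--input FASTA", "--output REPORT"]]
    for sample in samples:
        rows.append([
            sample,                                        # passed as an environment variable
            f"gs://{bucket}/assemblies/{sample}.fasta",    # declared input, read from GCS
            f"gs://{bucket}/results/{sample}.report.txt",  # declared output, written back to GCS
        ])
    return "\n".join("\t".join(row) for row in rows) + "\n"


# Two hypothetical samples yield a header row and two task rows.
tasks_text = build_tasks_tsv(["sampleA", "sampleB"], "example-bucket")
print(tasks_text, end="")
```

Writing `tasks_text` to a file such as `tasks.tsv` gives one row per independent job, so adding a sample means adding a row and leaves the job script unchanged.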
Quick Start
Run dsub with your Google Cloud provider settings, a Docker image, a machine type, a script, and a tasks TSV. Each TSV row launches an independent containerized job and writes its logs to your GCS logging path.
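A minimal sketch of how such an invocation might be assembled, written in Python so the argument list can be inspected before anything is submitted. The flags used (`--provider`, `--project`, `--regions`, `--logging`, `--image`, `--machine-type`, `--script`, `--tasks`, `--wait`) are standard dsub options. Every value below is a hypothetical placeholder: the project, region, bucket, image name, machine type, and file names are all assumptions, as is the choice of `google-batch` as the provider. Check the options your installed dsub version and provider accept.

```python
import shlex


def build_dsub_command(project, region, logging_path, image,
                       machine_type, script, tasks_file,
                       provider="google-batch"):
    """Return the dsub argument list for one batch submission.

    The default provider is an assumption; use whichever provider your setup supports.
    """
    return [
        "dsub",
        "--provider", provider,
        "--project", project,
        "--regions", region,
        "--logging", logging_path,       # GCS path where per-task logs are written
        "--image", image,                # Docker image that pins tool versions
        "--machine-type", machine_type,
        "--script", script,              # script executed inside each container
        "--tasks", tasks_file,           # one job per TSV row
        "--wait",                        # block until all tasks finish
    ]


# All values here are placeholders for illustration only.
cmd = build_dsub_command(
    project="my-gcp-project",
    region="us-central1",
    logging_path="gs://example-bucket/logs/",
    image="example.registry/viral-tool:1.0",
    machine_type="n2-highmem-8",
    script="run_step.sh",
    tasks_file="tasks.tsv",
)
print(shlex.join(cmd))
```

The printed line can be pasted into a shell, or the list can be passed to `subprocess.run(cmd, check=True)` once dsub is installed and authenticated. Building the command as a list avoids shell-quoting problems with GCS paths.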
Dependency Matrix
Required Modules
None required
Components
Standard package
💻 Claude Code Installation
Recommended: Let Claude install automatically. Simply copy and paste the text below to Claude Code.
Please help me install this Skill: Name: Running Batch Jobs on GCP via dsub Download link: https://github.com/broadinstitute/viral-ngs/archive/main.zip#running-batch-jobs-on-gcp-via-dsub Please download this .zip file, extract it, and install it in the .claude/skills/ directory.