# BioImageTools/spark_cluster

**subworkflow**: Creates a Spark cluster and waits for it to be ready.

Tags: `spark`, `bigdata`, `infrastructure`
## Module Information

### Components

This subworkflow uses the following modules:
## Inputs

| Name | Type | Description |
|---|---|---|
| spark_work_dir | | Shared work directory for the Spark cluster. Structure: `[ val(spark_work_dir) ]` |
| data_dir | path | Paths to be mounted in the Spark workers for data access |
| spark_workers | integer | Number of workers in the cluster |
| spark_worker_cores | integer | Number of cores per worker |
| spark_gb_per_core | integer | Number of GB of memory per worker core |
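The inputs are passed positionally in the order listed above. A minimal sketch of wiring them up (the paths and sizing values are illustrative assumptions, not part of the module):

```nextflow
workflow {
    // Illustrative values only; adapt to your environment
    spark_work_dir = Channel.value(file('/shared/spark-work'))
    data_dir       = Channel.fromPath('/data/images')

    SPARK_CLUSTER(
        spark_work_dir,   // shared work directory for the cluster
        data_dir,         // paths mounted in the Spark workers
        4,                // spark_workers
        8,                // spark_worker_cores
        15                // spark_gb_per_core
    )
}
```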
## Outputs

| Name | Type | Description |
|---|---|---|
| done | | URI of the Spark cluster and its work directory. Structure: `[ spark_uri, spark_work_dir ]` |
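Downstream steps can consume the emitted tuple once the cluster is ready; a short sketch (the tuple shape follows the table above, and the message is illustrative):

```nextflow
SPARK_CLUSTER.out.done.view { spark_uri, spark_work_dir ->
    "Spark master: ${spark_uri} (work dir: ${spark_work_dir})"
}
```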
## Quick Start

Include this subworkflow in your Nextflow pipeline. Nextflow resolves `include` paths relative to the including script, so fetch the subworkflow from https://github.com/BioImageTools/nextflow-modules/tree/main/subworkflows/bits/spark_cluster into your project first (the local path below assumes the repository layout is preserved):

```nextflow
include { SPARK_CLUSTER } from './subworkflows/bits/spark_cluster/main'
```