BioImageTools/spark_cluster

subworkflow

Creates a Spark cluster and waits for it to be ready

Tags: spark, bigdata, infrastructure

Module Information

Repository: https://github.com/BioImageTools/nextflow-modules/tree/main/subworkflows/bits/spark_cluster
Source: BioImageTools
Organization: BioImageTools
Authors: @krokicki, @cgoina

Components

This subworkflow uses the following modules:

Inputs

| Name | Type | Description |
| --- | --- | --- |
| spark_work_dir | | Shared work directory for the Spark cluster. Structure: `[ val(spark_work_dir) ]` |
| data_dir | path | Paths to be mounted in the Spark workers for data access |
| spark_workers | integer | Number of workers in the cluster |
| spark_worker_cores | integer | Number of cores per worker |
| spark_gb_per_core | integer | GB of memory per worker core |

Outputs

| Name | Type | Description |
| --- | --- | --- |
| done | | URI of the Spark cluster and its work directory. Structure: `[ spark_uri, spark_work_dir ]` |

Quick Start

Include this subworkflow in your Nextflow pipeline:

include { SPARK_CLUSTER } from './subworkflows/bits/spark_cluster/main'

Note that Nextflow's `include` statement resolves local paths relative to the including script and does not accept remote URLs, so first copy or install the module from the repository above into your pipeline's `subworkflows/` directory.
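Once included, the subworkflow can be invoked with the inputs listed above. The sketch below is illustrative, not taken from the module source: the channel construction, parameter values, and the exact call signature are assumptions based on the input and output tables in this document.

```nextflow
// Hypothetical invocation sketch; names and values are illustrative.
include { SPARK_CLUSTER } from './subworkflows/bits/spark_cluster/main'

workflow {
    spark_work_dir     = Channel.value('/scratch/spark')   // shared work directory for the cluster
    data_dir           = Channel.fromPath('/data/images')  // paths mounted in the Spark workers
    spark_workers      = 4                                 // number of workers
    spark_worker_cores = 8                                 // cores per worker
    spark_gb_per_core  = 4                                 // GB of memory per worker core

    SPARK_CLUSTER(
        spark_work_dir,
        data_dir,
        spark_workers,
        spark_worker_cores,
        spark_gb_per_core
    )

    // `done` emits [ spark_uri, spark_work_dir ] once the cluster is ready
    SPARK_CLUSTER.out.done.view { uri, dir ->
        "Spark master at ${uri}, work dir ${dir}"
    }
}
```

Downstream Spark processes would typically consume the emitted `spark_uri` to submit jobs against the running cluster.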