BioImageTools/spark_cluster

subworkflow

Creates a Spark cluster and waits for it to be ready

Tags: spark, bigdata, infrastructure

Module Information

Repository: https://github.com/BioImageTools/nextflow-modules/tree/main/subworkflows/bits/spark_cluster
Source: BioImageTools
Organization: BioImageTools
Authors: @krokicki, @cgoina

Components

This subworkflow uses the following modules:

Inputs

| Name | Type | Description |
| --- | --- | --- |
| spark_work_dir | | Shared work directory for the Spark cluster. Structure: `[ val(spark_work_dir) ]` |
| data_dir | path | Paths to be mounted in the Spark workers for data access |
| spark_workers | integer | Number of workers in the cluster |
| spark_worker_cores | integer | Number of cores per worker |
| spark_gb_per_core | integer | GB of memory per worker core |

Outputs

| Name | Type | Description |
| --- | --- | --- |
| done | | URI of the Spark cluster and its work directory. Structure: `[ spark_uri, spark_work_dir ]` |

Quick Start

Include this subworkflow in your Nextflow pipeline:

include { SPARK_CLUSTER } from './subworkflows/bits/spark_cluster/main'

Note that Nextflow's `include` statement resolves local paths relative to the including script and does not accept remote URLs, so first copy or install the module from the repository above into your pipeline's `subworkflows/` directory.
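Once included, the subworkflow can be invoked with the inputs listed above. The sketch below is illustrative, not taken from the module source: the channel construction, parameter values, and the exact call signature are assumptions based on the input and output tables in this document.

```nextflow
// Hypothetical invocation sketch; names and values are illustrative.
include { SPARK_CLUSTER } from './subworkflows/bits/spark_cluster/main'

workflow {
    spark_work_dir     = Channel.value('/scratch/spark')   // shared work directory for the cluster
    data_dir           = Channel.fromPath('/data/images')  // paths mounted in the Spark workers
    spark_workers      = 4                                 // number of workers
    spark_worker_cores = 8                                 // cores per worker
    spark_gb_per_core  = 4                                 // GB of memory per worker core

    SPARK_CLUSTER(
        spark_work_dir,
        data_dir,
        spark_workers,
        spark_worker_cores,
        spark_gb_per_core
    )

    // `done` emits [ spark_uri, spark_work_dir ] once the cluster is ready
    SPARK_CLUSTER.out.done.view { uri, dir ->
        "Spark master at ${uri}, work dir ${dir}"
    }
}
```

Downstream Spark processes would typically consume the emitted `spark_uri` to submit jobs against the running cluster.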