BioImageTools/spark_startworker

module

Starts a Spark worker that runs until the terminate module is called.

spark

Module Information

Repository: https://github.com/BioImageTools/nextflow-modules/tree/main/modules/bits/spark/startworker
Source: BioImageTools
Organization: BioImageTools
Authors: @krokicki, @cgoina

Tools

spark

Apache Spark is an analytics engine for large-scale data processing

License: Apache License 2.0

Inputs

| Name | Type | Description |
| --- | --- | --- |
| spark_uri | string | URI of the Spark manager |
| cluster_work_dir | path | The cluster work directory where the manager will run |
| worker_id | string | Identifier (usually an index) for the worker |
| data_dir | path | Shared path (or list of paths) to mount into the worker process for data access |
| worker_cores | path | Total number of cores to allocate to the worker |
| worker_mem_in_gb | path | Total amount of memory, in GB, to allocate to the worker |

Outputs

| Name | Type | Description |
| --- | --- | --- |
| spark_uri | string | Full path to the cluster work directory where the manager is running |

Quick Start

Include this module in your Nextflow pipeline:

include { SPARK_STARTWORKER } from 'https://github.com/BioImageTools/nextflow-modules/tree/main/modules/bits/spark/startworker'
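
Nextflow generally resolves include paths against the local file system, so the GitHub URL above may need to be replaced by a path to a local copy of the module. The following is a minimal sketch, assuming the module directory has been copied into the pipeline at ./modules/bits/spark/startworker and that its entry script is named main.nf (both are assumptions, not confirmed by this page):

```nextflow
// Assumption: the module was copied (for example, from a clone of the
// repository) to ./modules/bits/spark/startworker relative to this script.
// Assumption: the module's entry script is main.nf; Nextflow adds the
// .nf extension automatically, so it is omitted here.
include { SPARK_STARTWORKER } from './modules/bits/spark/startworker/main'
```

The inputs listed above are then passed when SPARK_STARTWORKER is called inside a workflow block; the exact channel shapes are defined by the input declaration in the module's own script, so check it before wiring the call.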