BioImageTools/spark_startworker

module

Starts a Spark worker that runs until the terminate module is called.

spark

Module Information

Repository: https://github.com/BioImageTools/nextflow-modules/tree/main/modules/bits/spark/startworker
Source: BioImageTools
Organization: BioImageTools
Authors: @krokicki, @cgoina

Apache Spark is an analytics engine for large-scale data processing

License: Apache License 2.0

Inputs

Name	Type	Description
spark_uri	string	URI of the Spark manager
cluster_work_dir	path	The cluster work directory where the manager will run
worker_id	string	Identifier (usually an index) for the worker.
data_dir	path	Shared path (or list of paths) which should be mounted into the worker process for data access.
worker_cores	integer	Total number of cores to allocate to the worker
worker_mem_in_gb	integer	Total amount of memory, in GB, to allocate to the worker

Outputs

Name	Type	Description
spark_uri	string	Full path to the cluster work directory where the manager is running

Include this module in your Nextflow pipeline:

include { SPARK_STARTWORKER } from 'https://github.com/BioImageTools/nextflow-modules/tree/main/modules/bits/spark/startworker'
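
Nextflow's `include` statement typically resolves module scripts from the local filesystem, so the module is usually copied into the pipeline repository before it is included. The sketch below makes two assumptions that should be checked against the module's `main.nf`: that the module directory was placed at `./modules/bits/spark/startworker/` (mirroring its path in the repository), and that the process accepts the six documented inputs as separate arguments in the order listed above. All values, including the host name and paths, are hypothetical placeholders.

```nextflow
// Assumption: the module was copied to ./modules/bits/spark/startworker/main.nf
include { SPARK_STARTWORKER } from './modules/bits/spark/startworker/main'

workflow {
    // Hypothetical values for the documented inputs
    def spark_uri        = 'spark://manager-host:7077'       // URI of the Spark manager
    def cluster_work_dir = file('/shared/spark/cluster-01')  // manager's work directory
    def worker_id        = '1'                               // usually an index
    def data_dir         = file('/shared/data')              // mounted into the worker
    def worker_cores     = 4                                 // cores for this worker
    def worker_mem_in_gb = 16                                // memory for this worker, in GB

    // Assumption: the real process may group some of these values into tuples;
    // adjust the call to match the input block declared in main.nf.
    SPARK_STARTWORKER(
        spark_uri,
        cluster_work_dir,
        worker_id,
        data_dir,
        worker_cores,
        worker_mem_in_gb
    )
}
```

Because the worker runs until the terminate module is called, a pipeline that starts workers this way would also need to invoke that module once the Spark jobs have finished.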