BioImageTools/spark_startworker
Start a Spark worker that runs until the terminate module is called.
spark
Module Information
Tools
spark
Apache Spark is an analytics engine for large-scale data processing
License: Apache License 2.0
Inputs
| Name | Type | Description |
|---|---|---|
| spark_uri | string | URI of the Spark manager |
| cluster_work_dir | path | The cluster work directory where the manager will run |
| worker_id | string | Identifier (usually an index) for the worker |
| data_dir | path | Shared path (or list of paths) to mount into the worker process for data access |
| worker_cores | integer | Total number of cores to allocate to the worker |
| worker_mem_in_gb | number | Total amount of memory, in GB, to allocate to the worker |
Outputs
| Name | Type | Description |
|---|---|---|
| spark_uri | string | Full path to the cluster work directory where the manager is running |
Quick Start
Include this module in your Nextflow pipeline:
include { SPARK_STARTWORKER } from 'https://github.com/BioImageTools/nextflow-modules/tree/main/modules/bits/spark/startworker'
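A minimal sketch of calling the module from a workflow follows. It rests on several assumptions that are not confirmed by this page. First, Nextflow's `include` resolves local script paths, so the module is assumed to have been copied into the pipeline under `./modules/bits/spark/startworker/`. Second, the process is assumed to take the six inputs from the table above as separate positional arguments in the listed order. The real module may group some of them into tuples, so check its `main.nf` before using this. Third, every value shown (manager URI, directories, core and memory counts) is a hypothetical placeholder.

```nextflow
// Assumption: the module was installed locally; adjust the path to your layout.
include { SPARK_STARTWORKER } from './modules/bits/spark/startworker/main'

workflow {
    // Hypothetical placeholder values, not taken from the module documentation.
    def spark_uri        = 'spark://manager-host:7077'   // URI of the Spark manager
    def cluster_work_dir = file('/shared/spark/work')    // manager's cluster work directory
    def data_dir         = file('/shared/data')          // shared data path mounted into the worker
    def worker_id        = '1'                           // identifier (index) for this worker
    def worker_cores     = 4                             // cores to allocate to the worker
    def worker_mem_in_gb = 16                            // memory (GB) to allocate to the worker

    // Assumption: inputs are passed positionally in the order of the Inputs table.
    SPARK_STARTWORKER(
        spark_uri,
        cluster_work_dir,
        worker_id,
        data_dir,
        worker_cores,
        worker_mem_in_gb
    )
}
```

Because the worker runs until the terminate module is called, a pipeline that uses this sketch also needs a step that invokes that module. Otherwise the worker task would not be expected to finish on its own.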