JaneliaSciComp/dask_start

subworkflow

Creates a Dask cluster and waits for it to be ready

dask infrastructure distributed

Module Information

Repository
https://github.com/JaneliaSciComp/nextflow-modules/tree/main/subworkflows/janelia/dask_start
Source
Janelia
Organization
JaneliaSciComp
Authors
@krokicki, @cgoina

Inputs

Name Type Description
meta_and_files tuple Channel containing metadata and files that need to be accessible by the cluster. Structure: [ val(meta), path(files)... ]
distributed boolean If true, create a distributed Dask cluster; if false, return an empty context
dask_config file Path to Dask configuration file (optional)
dask_work_path directory Path to Dask work directory where cluster files will be stored (optional)
total_workers integer Total number of workers to start in the cluster
required_workers integer Minimum number of workers required before the cluster is considered ready
dask_worker_cpus integer Number of CPU cores allocated per worker
dask_worker_mem_gb integer Memory in GB allocated per worker
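
The sizing and readiness parameters above can be read as a simple rule. The following is a minimal Python sketch, not the subworkflow's actual implementation. The function names requested_resources and cluster_is_ready are hypothetical, and the sketch assumes that "ready" means the number of workers that have joined is at least required_workers.

```python
def requested_resources(total_workers: int, dask_worker_cpus: int, dask_worker_mem_gb: int) -> dict:
    """Aggregate CPU and memory requested if every worker starts (illustrative only)."""
    return {
        "cpus": total_workers * dask_worker_cpus,
        "mem_gb": total_workers * dask_worker_mem_gb,
    }


def cluster_is_ready(available_workers: int, required_workers: int) -> bool:
    """Assumed rule: the cluster is ready once at least required_workers have joined."""
    return available_workers >= required_workers


if __name__ == "__main__":
    # Hypothetical sizing: 4 workers, each with 2 CPU cores and 8 GB of memory.
    print(requested_resources(total_workers=4, dask_worker_cpus=2, dask_worker_mem_gb=8))
    # With required_workers=2, the cluster counts as ready before all 4 workers join.
    print(cluster_is_ready(available_workers=3, required_workers=2))
```

Setting required_workers below total_workers lets downstream work begin while the remaining workers are still starting; setting them equal waits for the full cluster.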

Outputs

Name Type Description
dask_context tuple Dask cluster context information. If distributed=true, contains cluster details; if distributed=false, contains an empty map. Structure: [ val(meta), val(dask_info) ], where dask_info is a map containing:
- scheduler_address: Address of the Dask scheduler
- cluster_work_dir: Path to the cluster work directory
- available_workers: Number of workers that joined the cluster
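
A downstream task has to handle both shapes of dask_info: the populated map and the empty map returned when distributed=false. The Python sketch below shows one way to do that, assuming the map has been passed to the task as a plain dictionary. The helper name scheduler_address_or_none and the example values are hypothetical.

```python
from typing import Optional


def scheduler_address_or_none(dask_info: dict) -> Optional[str]:
    """Return the scheduler address, or None when the context is empty."""
    # An empty map is what the output description gives for distributed=false.
    if not dask_info:
        return None
    return dask_info["scheduler_address"]


# Hypothetical values, for illustration only.
example_info = {
    "scheduler_address": "tcp://10.0.0.1:8786",
    "cluster_work_dir": "/tmp/dask-work",
    "available_workers": 4,
}

if __name__ == "__main__":
    print(scheduler_address_or_none(example_info))
    print(scheduler_address_or_none({}))
```

When an address is returned, a task could pass it to dask.distributed.Client(address) to connect to the running cluster; when None is returned, it could fall back to local execution.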

Quick Start

Include this subworkflow in your Nextflow pipeline:

include { DASK_START } from 'https://github.com/JaneliaSciComp/nextflow-modules/tree/main/subworkflows/janelia/dask_start'