# JaneliaSciComp/dask_start

**Type:** subworkflow

Creates a Dask cluster and waits for it to be ready.

**Tags:** `dask`, `infrastructure`, `distributed`
## Module Information

### Inputs
| Name | Type | Description |
|---|---|---|
| meta_and_files | tuple | Channel containing metadata and files that need to be accessible by the cluster. Structure: [ val(meta), path(files)... ] |
| distributed | boolean | If true, create a distributed Dask cluster; if false, return empty context |
| dask_config | file | Path to Dask configuration file (optional) |
| dask_work_path | directory | Path to Dask work directory where cluster files will be stored (optional) |
| total_workers | integer | Number of total workers to start in the cluster |
| required_workers | integer | Minimum number of workers required before the cluster is considered ready |
| dask_worker_cpus | integer | Number of CPU cores allocated per worker |
| dask_worker_mem_gb | integer | Memory in GB allocated per worker |
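As a sketch, the `meta_and_files` input channel carries a metadata map together with the files the cluster must be able to read (the sample id and file path below are illustrative, not part of the module):

```nextflow
// Hypothetical input channel with structure [ val(meta), path(files)... ]
meta_and_files = Channel.of(
    [ [id: 'sample1'], file('data/sample1.n5') ]  // illustrative meta map and data path
)
```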
### Outputs
| Name | Type | Description |
|---|---|---|
| dask_context | tuple | Dask cluster context information. Structure: `[ val(meta), val(dask_info) ]`. If `distributed=true`, `dask_info` contains the cluster details; if `distributed=false`, it is an empty map. |

When the cluster is distributed, `dask_info` is a map containing:

- `scheduler_address`: address of the Dask scheduler
- `cluster_work_dir`: path to the cluster work directory
- `available_workers`: number of workers that joined the cluster
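For example, a downstream step might extract the scheduler address from `dask_info` so that worker processes know where to connect (the field names match the documentation above; the channel wiring itself is an illustrative sketch):

```nextflow
// Hypothetical consumer of the dask_context output
DASK_START.out.dask_context
    | map { meta, dask_info ->
        // dask_info is empty when distributed=false, so guard the access
        def scheduler = dask_info?.scheduler_address ?: ''
        [ meta, scheduler ]
    }
```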
## Quick Start
Include this subworkflow in your Nextflow pipeline. Nextflow `include` statements take a local path, so first copy the subworkflow from https://github.com/JaneliaSciComp/nextflow-modules/tree/main/subworkflows/janelia/dask_start into your project, then include it (the exact path below assumes the repository layout is mirrored under your project root):

```nextflow
include { DASK_START } from './subworkflows/janelia/dask_start/main'
```
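A minimal invocation wiring up the inputs listed above might look like this. The argument order follows the Inputs table, and all concrete values (paths, worker counts, memory) are illustrative assumptions, not defaults of the module:

```nextflow
workflow {
    // Illustrative meta map and data path
    meta_and_files = Channel.of([ [id: 'sample1'], file('data/sample1.n5') ])

    DASK_START(
        meta_and_files,
        true,                           // distributed: create a real cluster
        file('conf/dask-config.yml'),   // dask_config (hypothetical path, optional)
        file('work/dask'),              // dask_work_path (hypothetical path, optional)
        4,                              // total_workers
        2,                              // required_workers
        4,                              // dask_worker_cpus
        16                              // dask_worker_mem_gb
    )
}
```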