...

One of the core features of Nextflow is the ability to cache task executions and re-use them in subsequent runs to minimize duplicate work. Resumability is useful both for recovering from errors and for iteratively developing a pipeline.

Resume option

You can enable resumability in Nextflow with the -resume flag when launching a pipeline with nextflow run.

...

Structure of work folder

When Nextflow runs, it assigns a unique ID to each task. This ID is used to create a separate execution directory within the work directory, where the task is executed and its results are stored. A task's unique ID is generated as a 128-bit hash number.
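
The relationship between a task hash and its execution directory can be sketched in shell. This is only an illustration, not Nextflow's own code: it assumes the 128-bit hash is written as 32 hexadecimal characters, with the first two used as a subdirectory name and the remaining 30 as the task directory name, which is the layout the directory names in a work folder suggest. The variable names and the hash value are hypothetical.

```shell
# Hypothetical 32-character hexadecimal task hash (128 bits)
task_hash="facd3e49b63eadd6248aa357083763c1"

# Assumption: the first two characters name the subdirectory,
# the remaining 30 characters name the task directory
prefix=$(printf '%s' "$task_hash" | cut -c1-2)
rest=$(printf '%s' "$task_hash" | cut -c3-)

task_dir="work/$prefix/$rest"
echo "$task_dir"
```

Splitting on the first two characters keeps any single directory under work from accumulating a very large number of entries.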

...

When a task requires recomputation, i.e. the conditions above are not fulfilled, its downstream tasks are automatically invalidated.

...

By default, the pipeline results are cached in a directory named work, located in the directory where the pipeline is launched.

We can use the tree command to list the contents of the work directory. Note: by default, tree does not print hidden files (those beginning with a dot .). Use the -a option to view all files.

Code Block
tree -a work

Example of work directory from the nf-core/smrnaseq pipeline:

Code Block
work/
├── 00
│   └── ba88dbe5efa0127c9872d349b32714
│       ├── Clone9_N3.fastp.fastq.gz -> /mnt/hpccs01/home/gauthiem/smrnaseq_cl/work/36/1a0fc3ca0c220051ed3ca36d7d2974/Clone9_N3.fastp.fastq.gz
│       ├── Clone9_N3_mature.bam
│       ├── Clone9_N3_mature.sam
│       ├── .command.begin
│       ├── .command.err
│       ├── .command.log
│       ├── .command.out
│       ├── .command.run
│       ├── .command.sh
│       ├── .command.trace
│       ├── .exitcode
│       ├── fasta_bidx.1.ebwt -> /mnt/hpccs01/home/gauthiem/smrnaseq_cl/work/ef/afac598b79ec7f0bd3d8ee66506fa4/fasta_bidx.1.ebwt
│       ├── fasta_bidx.2.ebwt -> /mnt/hpccs01/home/gauthiem/smrnaseq_cl/work/ef/afac598b79ec7f0bd3d8ee66506fa4/fasta_bidx.2.ebwt
│       ├── fasta_bidx.3.ebwt -> /mnt/hpccs01/home/gauthiem/smrnaseq_cl/work/ef/afac598b79ec7f0bd3d8ee66506fa4/fasta_bidx.3.ebwt
│       ├── fasta_bidx.4.ebwt -> /mnt/hpccs01/home/gauthiem/smrnaseq_cl/work/ef/afac598b79ec7f0bd3d8ee66506fa4/fasta_bidx.4.ebwt
│       ├── fasta_bidx.rev.1.ebwt -> /mnt/hpccs01/home/gauthiem/smrnaseq_cl/work/ef/afac598b79ec7f0bd3d8ee66506fa4/fasta_bidx.rev.1.ebwt
│       ├── fasta_bidx.rev.2.ebwt -> /mnt/hpccs01/home/gauthiem/smrnaseq_cl/work/ef/afac598b79ec7f0bd3d8ee66506fa4/fasta_bidx.rev.2.ebwt
│       ├── unmapped
│       │   └── Clone9_N3_mature_unmapped.fq.gz
│       └── versions.yml
├── 02
│   └── 87c26bf3248f4fafea7bfd75f4dc8a
│       ├── C1-N3-R1_S6_L001_R1_001.fastq.gz -> /mnt/hpccs01/home/gauthiem/smrnaseq_cl/work/stage-449030e3-e32d-4010-bc1c-9d6f1f62cc45/d2/11a8747c3a99874284cb3e80c0fa33/C1-N3-R1_S6_L001_R1_001.fastq.gz
│       ├── Clone1_N3.raw_fastqc.html
│       ├── Clone1_N3.raw_fastqc.zip
│       ├── Clone1_N3.raw.gz -> C1-N3-R1_S6_L001_R1_001.fastq.gz
│       ├── .command.begin
│       ├── .command.err
│       ├── .command.log
│       ├── .command.out
│       ├── .command.run
│       ├── .command.sh
│       ├── .command.trace
│       ├── .exitcode
│       └── versions.yml
├── 04
│   ├── 4597052a2b5e6fcc4266f2461b0884
│   │   ├── .command.begin
│   │   ├── .command.err
│   │   ├── .command.log
│   │   ├── .command.out
│   │   ├── .command.run
│   │   ├── .command.sh
│   │   ├── .command.trace
│   │   ├── Control_N2.raw_fastqc.html
│   │   ├── Control_N2.raw_fastqc.zip
│   │   ├── Control_N2.raw.gz -> Ctl-N2-R1_S2_L001_R1_001.fastq.gz
│   │   ├── Ctl-N2-R1_S2_L001_R1_001.fastq.gz -> /mnt/hpccs01/home/gauthiem/smrnaseq_cl/work/stage-449030e3-e32d-4010-bc1c-9d6f1f62cc45/0a/a104e9b9dbac753afb050de1d079eb/Ctl-N2-R1_S2_L001_R1_001.fastq.gz
│   │   ├── .exitcode
│   │   └── versions.yml
│   └── 9e40d148b1fe948d43fc18132875a0
│       ├── Clone1_N1_mature_hairpin.bam -> /mnt/hpccs01/home/gauthiem/smrnaseq_cl/work/66/7553250d29e8612d19d1feb0347a58/Clone1_N1_mature_hairpin.bam
│       ├── Clone1_N1_mature_hairpin.sorted.bam
│       ├── .command.begin
│       ├── .command.err
│       ├── .command.log
│       ├── .command.out
│       ├── .command.run
│       ├── .command.sh
│       ├── .command.trace
│       ├── .exitcode
│       ├── hairpin.fa_igenome.fa_idx.fa -> /mnt/hpccs01/home/gauthiem/smrnaseq_cl/work/95/6d0db42195231bb9861871c7700798/hairpin.fa_igenome.fa_idx.fa
│       └── versions.yml
....

Task execution directory

Within the work directory there are multiple task execution directories, one for each time a process is executed. These task directories are identified by the process execution hash. For example, the task directory fa/cd3e49b63eadd6248aa357083763c1 would be the location for the process identified by the hash fa/cd3e49.
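
A short hash such as fa/cd3e49 can be resolved to the full task directory with a shell glob. The sketch below is illustrative only: so that it can be run anywhere, it builds a throwaway mock work directory (the demo variable and the placeholder .command.sh content are hypothetical). Against a real run, only the lookup line would be needed, pointed at the actual work directory.

```shell
# Build a throwaway mock work directory (hypothetical, for demonstration only)
demo=$(mktemp -d)
mkdir -p "$demo/work/fa/cd3e49b63eadd6248aa357083763c1"
echo "echo placeholder task script" > "$demo/work/fa/cd3e49b63eadd6248aa357083763c1/.command.sh"

# Short hash identifying the process execution
short_hash="fa/cd3e49"

# Expand the short hash to the full task directory path
task_dir=$(ls -d "$demo/work/$short_hash"*)
echo "$task_dir"

# List everything in the task directory, including hidden files
ls -a "$task_dir"

# Remove the mock directory again
rm -rf "$demo"
```

The -a option matters here for the same reason as with tree: the .command.* files and .exitcode are hidden files.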

...

Code Block
nextflow clean [run_name|session_id] [options]

If no run name or session ID is provided, the command applies to the most recent run.