Samplesheet input

Nextflow pipelines generally need an input file, often referred to as a samplesheet, which contains information about the samples you would like to analyse.

...

The samplesheet has to be a comma-separated file with a header row and a minimum set of columns, which varies depending on the pipeline you want to run.
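As a quick illustration of the header requirement, here is a minimal Python sketch that reports which required columns are missing from a samplesheet. The function name `missing_columns` and the shortened file name in the example are assumptions made for this illustration; nf-core pipelines run their own validation, and this is not how they implement it.

```python
import csv
import io


def missing_columns(samplesheet_text, required):
    """Return the required column names that are absent from the header row."""
    reader = csv.reader(io.StringIO(samplesheet_text))
    header = next(reader, [])  # an empty file yields an empty header
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]


# Hypothetical two-column samplesheet (file name shortened for the example)
example = "sample,fastq_1\nClone1_N1,reads_R1.fastq.gz\n"

print(missing_columns(example, ["sample", "fastq_1"]))
print(missing_columns(example, ["sample", "fastq_1", "fastq_2", "strandedness"]))
```

The first call reports nothing missing, while the second lists the two extra columns a four-column pipeline would expect. Always check the usage page of the pipeline for the actual required columns.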

Examples of samplesheets

For the nf-core/smrnaseq pipeline, the samplesheet has to be a comma-separated file with the following 2 columns.

...

Column names have to be specified in a header row, as shown in the samplesheet example below:

...

sample,fastq_1
Clone1_N1,s3://ngi-igenomes/test-data/smrnaseq/C1-N1-R1_S4_L001_R1_001.fastq.gz
Clone1_N3,s3://ngi-igenomes/test-data/smrnaseq/C1-N3-R1_S6_L001_R1_001.fastq.gz
Clone9_N1,s3://ngi-igenomes/test-data/smrnaseq/C9-N1-R1_S7_L001_R1_001.fastq.gz
Clone9_N2,s3://ngi-igenomes/test-data/smrnaseq/C9-N2-R1_S8_L001_R1_001.fastq.gz
Clone9_N3,s3://ngi-igenomes/test-data/smrnaseq/C9-N3-R1_S9_L001_R1_001.fastq.gz
Control_N1,s3://ngi-igenomes/test-data/smrnaseq/Ctl-N1-R1_S1_L001_R1_001.fastq.gz
Control_N2,s3://ngi-igenomes/test-data/smrnaseq/Ctl-N2-R1_S2_L001_R1_001.fastq.gz
Control_N3,s3://ngi-igenomes/test-data/smrnaseq/Ctl-N3-R1_S3_L001_R1_001.fastq.gz

...

For the nf-core/rnaseq pipeline, the samplesheet has to be a comma-separated file with the following 4 columns:

...

Column names have to be specified in a header row, as shown in the samplesheet example below:

...

sample,fastq_1,fastq_2,strandedness
CONTROL_REP1,AEG588A1_S1_L002_R1_001.fastq.gz,AEG588A1_S1_L002_R2_001.fastq.gz,auto
CONTROL_REP1,AEG588A1_S1_L003_R1_001.fastq.gz,AEG588A1_S1_L003_R2_001.fastq.gz,auto
CONTROL_REP1,AEG588A1_S1_L004_R1_001.fastq.gz,AEG588A1_S1_L004_R2_001.fastq.gz,auto

...

Please note that in this example, the same sample (CONTROL_REP1) was sequenced across 3 lanes. The nf-core/rnaseq pipeline will concatenate the raw reads from rows that share the same sample name before performing any downstream analysis.
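To see which rows would be merged, the sketch below groups the example rows above by their `sample` value. It only illustrates the grouping idea; the function name `group_runs_by_sample` is an assumption for this example, and the pipeline performs the actual concatenation itself.

```python
import csv
import io
from collections import defaultdict

# Rows copied from the nf-core/rnaseq example above
SAMPLESHEET = """sample,fastq_1,fastq_2,strandedness
CONTROL_REP1,AEG588A1_S1_L002_R1_001.fastq.gz,AEG588A1_S1_L002_R2_001.fastq.gz,auto
CONTROL_REP1,AEG588A1_S1_L003_R1_001.fastq.gz,AEG588A1_S1_L003_R2_001.fastq.gz,auto
CONTROL_REP1,AEG588A1_S1_L004_R1_001.fastq.gz,AEG588A1_S1_L004_R2_001.fastq.gz,auto
"""


def group_runs_by_sample(text):
    """Map each sample name to the list of (fastq_1, fastq_2) pairs listed for it."""
    runs = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        runs[row["sample"]].append((row["fastq_1"], row["fastq_2"]))
    return dict(runs)


grouped = group_runs_by_sample(SAMPLESHEET)
for sample, runs in grouped.items():
    print(f"{sample}: {len(runs)} run(s) listed")
```

Here a single sample name maps to three pairs of FASTQ files, one per lane.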

Exercise 1

The following samplesheet file for the nf-core/rnaseq pipeline consisting of both single- and paired-end data is ready for analysis.

...

Solution:

There are 6 samples in total, as TREATMENT_REP3 has been sequenced twice. There are 3 single-end and 3 paired-end samples.
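One way to check this kind of count is sketched below. It assumes that a row with an empty `fastq_2` field describes single-end data and that rows sharing a `sample` name belong to the same sample. The rows are hypothetical and are not the exercise samplesheet; `layout_by_sample` is a name chosen for this example.

```python
import csv
import io

# Hypothetical rows for illustration only (NOT the exercise samplesheet)
SAMPLESHEET = """sample,fastq_1,fastq_2,strandedness
SAMPLE_A,a_L001_R1.fastq.gz,a_L001_R2.fastq.gz,auto
SAMPLE_B,b_L001_R1.fastq.gz,,auto
SAMPLE_B,b_L002_R1.fastq.gz,,auto
"""


def layout_by_sample(text):
    """Map each sample name to the set of read layouts found in its rows."""
    layouts = {}
    for row in csv.DictReader(io.StringIO(text)):
        # Assumption: an empty fastq_2 field means single-end data
        layout = "paired-end" if row["fastq_2"].strip() else "single-end"
        layouts.setdefault(row["sample"], set()).add(layout)
    return layouts


layouts = layout_by_sample(SAMPLESHEET)
print(len(layouts), "distinct samples")
for sample, kinds in layouts.items():
    print(sample, sorted(kinds))
```

Counting distinct sample names rather than rows is what keeps a sample sequenced more than once from being counted twice.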

Exercise 2

Find the minimum set of columns required in the samplesheet to run nf-core/ampliseq.

Solution:

You will need to go to the usage page of nf-core/ampliseq, which can be found at https://nf-co.re/ampliseq/2.9.0/docs/usage#samplesheet-input (make sure you are viewing the latest version of the pipeline).

The input specification section states that the samplesheet must contain at least 2 columns: sampleID and forwardReads.

Input folder

Some pipelines, like nf-core/ampliseq, let you specify the path to the folder that contains your input FASTQ files directly, as an alternative to using a samplesheet.

...
