...
The samplesheet has to be a comma-separated file with a header row and a minimum set of columns, which varies depending on the pipeline you want to run.
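Before launching a pipeline, it can help to check that the header row of a samplesheet contains the columns that pipeline expects. The following is a minimal Python sketch, not part of any nf-core pipeline: the helper name `missing_columns` is hypothetical, and the list of required columns is an assumption that has to be taken from the documentation of the pipeline you intend to run.

```python
import csv
import io


def missing_columns(samplesheet_text, required):
    """Return the required column names that are absent from the header row."""
    reader = csv.reader(io.StringIO(samplesheet_text))
    header = next(reader, [])  # the first row is expected to be the header
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]


# Hypothetical two-column samplesheet used only for illustration.
example = "sample,fastq_1\nClone1_N1,reads_R1.fastq.gz\n"

# Check against a two-column and a four-column requirement.
print(missing_columns(example, ["sample", "fastq_1"]))
print(missing_columns(example, ["sample", "fastq_1", "fastq_2", "strandedness"]))
```

An empty list means every required column was found; any names returned are the ones to add to the header before running the pipeline.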
Examples of samplesheets
For the nf-core/smrnaseq pipeline, the samplesheet has to be a comma-separated file with the following 2 columns.
...
Column names have to be specified in a header row, as shown in the samplesheet example below:
...
sample,fastq_1
Clone1_N1,s3://ngi-igenomes/test-data/smrnaseq/C1-N1-R1_S4_L001_R1_001.fastq.gz
Clone1_N3,s3://ngi-igenomes/test-data/smrnaseq/C1-N3-R1_S6_L001_R1_001.fastq.gz
Clone9_N1,s3://ngi-igenomes/test-data/smrnaseq/C9-N1-R1_S7_L001_R1_001.fastq.gz
Clone9_N2,s3://ngi-igenomes/test-data/smrnaseq/C9-N2-R1_S8_L001_R1_001.fastq.gz
Clone9_N3,s3://ngi-igenomes/test-data/smrnaseq/C9-N3-R1_S9_L001_R1_001.fastq.gz
Control_N1,s3://ngi-igenomes/test-data/smrnaseq/Ctl-N1-R1_S1_L001_R1_001.fastq.gz
Control_N2,s3://ngi-igenomes/test-data/smrnaseq/Ctl-N2-R1_S2_L001_R1_001.fastq.gz
Control_N3,s3://ngi-igenomes/test-data/smrnaseq/Ctl-N3-R1_S3_L001_R1_001.fastq.gz
...
For the nf-core/rnaseq pipeline, the samplesheet has to be a comma-separated file with the following 4 columns:
Column | Description |
---|---|
sample | Custom sample name. This entry will be identical for multiple sequencing libraries/runs from the same sample. Spaces in sample names are automatically converted to underscores ( |
fastq_1 | Full path to FastQ file for Illumina short reads 1. File has to be gzipped and have the extension “.fastq.gz” or “.fq.gz”. |
fastq_2 | Full path to FastQ file for Illumina short reads 2. File has to be gzipped and have the extension “.fastq.gz” or “.fq.gz”. |
strandedness | Sample strand-specificity. Must be one of |
Column names have to be specified in a header row, as shown in the samplesheet example below:
...
sample,fastq_1,fastq_2,strandedness
CONTROL_REP1,AEG588A1_S1_L002_R1_001.fastq.gz,AEG588A1_S1_L002_R2_001.fastq.gz,auto
CONTROL_REP1,AEG588A1_S1_L003_R1_001.fastq.gz,AEG588A1_S1_L003_R2_001.fastq.gz,auto
CONTROL_REP1,AEG588A1_S1_L004_R1_001.fastq.gz,AEG588A1_S1_L004_R2_001.fastq.gz,auto
...
Please note that in this example, the same sample (CONTROL_REP1) was sequenced across 3 lanes. The nf-core/rnaseq pipeline will concatenate the raw reads from rows that share the same sample name before performing any downstream analysis.
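To see which rows of a samplesheet would be treated as runs of the same sample, the rows can be grouped by the value in the sample column. The Python sketch below only illustrates that grouping on the example above; it does not reproduce how the pipeline concatenates reads internally, and the function name `group_runs_by_sample` is hypothetical.

```python
import csv
import io
from collections import defaultdict

# The three-lane example from above, embedded as text for illustration.
SAMPLESHEET = """sample,fastq_1,fastq_2,strandedness
CONTROL_REP1,AEG588A1_S1_L002_R1_001.fastq.gz,AEG588A1_S1_L002_R2_001.fastq.gz,auto
CONTROL_REP1,AEG588A1_S1_L003_R1_001.fastq.gz,AEG588A1_S1_L003_R2_001.fastq.gz,auto
CONTROL_REP1,AEG588A1_S1_L004_R1_001.fastq.gz,AEG588A1_S1_L004_R2_001.fastq.gz,auto
"""


def group_runs_by_sample(text):
    """Map each sample name to the list of (fastq_1, fastq_2) rows that share it."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        groups[row["sample"]].append((row["fastq_1"], row["fastq_2"]))
    return dict(groups)


runs = group_runs_by_sample(SAMPLESHEET)
for sample, pairs in runs.items():
    # Rows sharing a sample name are the ones whose reads would be merged.
    print(sample, "->", len(pairs), "run(s)")
```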
...
How many samples does it have in total? Tip: Make sure you check whether there are samples with replicates.
How many are single-end and how many are paired-end? Tip: Single-end samples have only 1 fastq.gz file, whereas paired-end samples have a pair of fastq.gz files (generally *_R{1,2}_001.fastq.gz).
sample,fastq_1,fastq_2,strandedness
CONTROL_REP1,AEG588A1_S1_L002_R1_001.fastq.gz,AEG588A1_S1_L002_R2_001.fastq.gz,forward
CONTROL_REP2,AEG588A2_S2_L002_R1_001.fastq.gz,AEG588A2_S2_L002_R2_001.fastq.gz,forward
CONTROL_REP3,AEG588A3_S3_L002_R1_001.fastq.gz,AEG588A3_S3_L002_R2_001.fastq.gz,forward
TREATMENT_REP1,AEG588A4_S4_L003_R1_001.fastq.gz,,reverse
TREATMENT_REP2,AEG588A5_S5_L003_R1_001.fastq.gz,,reverse
TREATMENT_REP3,AEG588A6_S6_L003_R1_001.fastq.gz,,reverse
TREATMENT_REP3,AEG588A6_S6_L004_R1_001.fastq.gz,,reverse
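For larger samplesheets, counting by eye becomes error-prone, so the same questions can be answered with a few lines of Python. This is a sketch under two assumptions: a sample is considered paired-end when its fastq_2 field is non-empty, and rows sharing a sample name count as one sample. The function name `layout_per_sample` is hypothetical.

```python
import csv
import io
from collections import Counter

SAMPLESHEET = """sample,fastq_1,fastq_2,strandedness
CONTROL_REP1,AEG588A1_S1_L002_R1_001.fastq.gz,AEG588A1_S1_L002_R2_001.fastq.gz,forward
CONTROL_REP2,AEG588A2_S2_L002_R1_001.fastq.gz,AEG588A2_S2_L002_R2_001.fastq.gz,forward
CONTROL_REP3,AEG588A3_S3_L002_R1_001.fastq.gz,AEG588A3_S3_L002_R2_001.fastq.gz,forward
TREATMENT_REP1,AEG588A4_S4_L003_R1_001.fastq.gz,,reverse
TREATMENT_REP2,AEG588A5_S5_L003_R1_001.fastq.gz,,reverse
TREATMENT_REP3,AEG588A6_S6_L003_R1_001.fastq.gz,,reverse
TREATMENT_REP3,AEG588A6_S6_L004_R1_001.fastq.gz,,reverse
"""


def layout_per_sample(text):
    """Map each unique sample name to 'paired-end' or 'single-end'."""
    layouts = {}
    for row in csv.DictReader(io.StringIO(text)):
        # An empty fastq_2 field is taken to mean a single-end library.
        has_mate = bool((row["fastq_2"] or "").strip())
        layouts[row["sample"]] = "paired-end" if has_mate else "single-end"
    return layouts


layouts = layout_per_sample(SAMPLESHEET)
counts = Counter(layouts.values())
print("unique samples:", len(layouts))
print("by layout:", dict(counts))
```

Because the dictionary is keyed on the sample name, a sample sequenced across several lanes is counted once rather than once per row.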
...
Find the minimal set of columns required in the samplesheet to run nf-core/ampliseq.
You will need to go to the usage page of nf-core/ampliseq, which can be found at https://nf-co.re/ampliseq/2.911.0/docs/usage#samplesheet-inputusage (make sure you are using the latest version of the pipeline). The input specification section states that the samplesheet must contain at least 2 columns:
Input folder
Some pipelines like nf-core/ampliseq will let you specify directly the path to the folder that contains your input FASTQ files, as an alternative to using a samplesheet.
...
--input_folder 'path/to/folder/containing/the/data/'
File names must follow a specific pattern. The default is /*_R{1,2}_001.fastq.gz, but this can be adjusted with the --extension parameter.
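To preview which files in a folder fit the default naming pattern, the pattern can be approximated with a regular expression. The Python sketch below is only an illustration and is not how the pipeline itself matches files: the regular expression is an assumed translation of *_R{1,2}_001.fastq.gz (with the brace expansion written as [12]), and the file names and the helper `matching_files` are hypothetical.

```python
import re

# Assumed regex equivalent of the glob *_R{1,2}_001.fastq.gz
DEFAULT_PATTERN = re.compile(r"^.*_R[12]_001\.fastq\.gz$")


def matching_files(file_names):
    """Return the file names that fit the default naming pattern."""
    return [name for name in file_names if DEFAULT_PATTERN.match(name)]


# Hypothetical folder listing used only for illustration.
names = [
    "sampleA_R1_001.fastq.gz",
    "sampleA_R2_001.fastq.gz",
    "sampleB_1.fastq.gz",
    "notes.txt",
]
print(matching_files(names))
```

Files that do not appear in the result would need either renaming or a different pattern passed through the --extension parameter.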
...