...
Now let’s prepare a samplesheet.csv file that specifies the names of your samples and the locations of their raw FASTQ files:
Code Block |
---|
sample,fastq_1
SRR24302008,/path/to/raw/FASTQ/files/SRR24302008.fastq.gz
SRR24302009,/path/to/raw/FASTQ/files/SRR24302009.fastq.gz
SRR24302010,/path/to/raw/FASTQ/files/SRR24302010.fastq.gz
SRR24302011,/path/to/raw/FASTQ/files/SRR24302011.fastq.gz
SRR24302012,/path/to/raw/FASTQ/files/SRR24302012.fastq.gz
SRR24302013,/path/to/raw/FASTQ/files/SRR24302013.fastq.gz
SRR24302014,/path/to/raw/FASTQ/files/SRR24302014.fastq.gz
SRR24302015,/path/to/raw/FASTQ/files/SRR24302015.fastq.gz
SRR24302016,/path/to/raw/FASTQ/files/SRR24302016.fastq.gz
SRR24302017,/path/to/raw/FASTQ/files/SRR24302017.fastq.gz
SRR24302018,/path/to/raw/FASTQ/files/SRR24302018.fastq.gz
SRR24302019,/path/to/raw/FASTQ/files/SRR24302019.fastq.gz
SRR24302020,/path/to/raw/FASTQ/files/SRR24302020.fastq.gz
SRR24302021,/path/to/raw/FASTQ/files/SRR24302021.fastq.gz
SRR24302022,/path/to/raw/FASTQ/files/SRR24302022.fastq.gz
SRR24302023,/path/to/raw/FASTQ/files/SRR24302023.fastq.gz
SRR24302024,/path/to/raw/FASTQ/files/SRR24302024.fastq.gz
SRR24302025,/path/to/raw/FASTQ/files/SRR24302025.fastq.gz
SRR24302026,/path/to/raw/FASTQ/files/SRR24302026.fastq.gz
SRR24302027,/path/to/raw/FASTQ/files/SRR24302027.fastq.gz |
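A samplesheet in this layout can be sanity-checked before it is passed to the pipeline. The following is a minimal sketch, not part of nf-core: `check_samplesheet` is a hypothetical helper name, and it assumes the two-column `sample,fastq_1` layout shown above, with no commas in the file paths.

```shell
# Hypothetical helper (not part of nf-core): sanity-check a two-column samplesheet.
# Reports each problem on stderr and returns non-zero if any problem was found.
check_samplesheet() {
    sheet="$1"
    status=0
    if [ ! -f "$sheet" ]; then
        echo "missing samplesheet: $sheet" >&2
        return 1
    fi
    {
        # The first line must be the expected header.
        IFS= read -r header
        if [ "$header" != "sample,fastq_1" ]; then
            echo "unexpected header: $header" >&2
            status=1
        fi
        # Every remaining row needs a sample name and an existing FASTQ file.
        while IFS=, read -r sample fastq || [ -n "$sample" ]; do
            # Skip completely empty lines.
            if [ -z "$sample$fastq" ]; then
                continue
            fi
            if [ -z "$sample" ] || [ ! -f "$fastq" ]; then
                echo "bad row: $sample,$fastq" >&2
                status=1
            fi
        done
    } < "$sheet"
    return "$status"
}
```

For example, `check_samplesheet samplesheet.csv && echo "samplesheet looks OK"` prints the message only when the header matches and every listed FASTQ file exists.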
To generate the samplesheet.csv file, use the following PBS Pro script (called “launch_create_smRNAseq_samplesheet.pbs”):
Code Block |
---|
#!/bin/bash -l
#PBS -N nfsmrnaseq
#PBS -l select=1:ncpus=2:mem=4gb
#PBS -l walltime=12:00:00
#work on current directory (folder)
cd $PBS_O_WORKDIR
#User defined variables
##########################################################
DIR='/path/to/raw/FASTQ/files'
INDEX='samplesheet.csv'
##########################################################
#load python module
module load python/3.10.8-gcccore-12.2.0
#fetch the script to create the sample metadata table
wget -L https://raw.githubusercontent.com/nf-core/rnaseq/master/bin/fastq_dir_to_samplesheet.py
chmod +x fastq_dir_to_samplesheet.py
#generate initial sample metadata file
./fastq_dir_to_samplesheet.py $DIR index.csv \
    --strandedness auto \
    --read1_extension .fastq.gz
#format index file
cat index.csv | awk -F "," '{print $1 "," $2}' > ${INDEX}
#Remove intermediate files:
rm index.csv fastq_dir_to_samplesheet.py |
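The "format index file" step in the script keeps only the first two columns of the intermediate index.csv. The sketch below illustrates that step in isolation. It assumes the intermediate file has the four columns `sample,fastq_1,fastq_2,strandedness`, with `fastq_2` left empty for single-end reads; check this against your own index.csv. The function name `keep_first_two_columns` is hypothetical.

```shell
# Keep only the first two comma-separated fields of each line
# (same awk expression as in the PBS script; reads stdin or the given files).
keep_first_two_columns() {
    awk -F "," '{print $1 "," $2}' "$@"
}

# Assumed example of the intermediate index.csv for single-end reads.
printf '%s\n' \
    'sample,fastq_1,fastq_2,strandedness' \
    'SRR24302008,/path/to/raw/FASTQ/files/SRR24302008.fastq.gz,,auto' |
    keep_first_two_columns
# prints:
# sample,fastq_1
# SRR24302008,/path/to/raw/FASTQ/files/SRR24302008.fastq.gz
```

Because the split is on every comma, this approach only works when sample names and file paths contain no commas themselves.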
...