Paired-end (PE) data sets
Generating a ‘samplesheet.csv’ file for PE data:
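When PE data is available for the RNA-seq analysis, create the ‘samplesheet.csv’ input file by adding the “--read2_extension” option to the PBS Pro script. The values passed to “--read1_extension” and “--read2_extension” must match the suffixes of your FASTQ files: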
    #!/bin/bash -l
    #PBS -N samplesheet
    #PBS -l select=1:ncpus=2:mem=4gb
    #PBS -l walltime=12:00:00

    # Work in the directory (folder) the job was submitted from
    cd "$PBS_O_WORKDIR"

    # User-defined variables
    ##########################################################
    # Double quotes are needed here so that $HOME is expanded
    DIR="$HOME/workshop/data/"
    INDEX='samplesheet.csv'
    ##########################################################

    # Load the Python module
    module load python/3.10.8-gcccore-12.2.0

    # Fetch the script that creates the sample metadata table
    wget -L https://raw.githubusercontent.com/nf-core/rnaseq/master/bin/fastq_dir_to_samplesheet.py
    chmod +x fastq_dir_to_samplesheet.py

    # Generate the initial sample metadata file
    ./fastq_dir_to_samplesheet.py "$DIR" "$INDEX" \
        --strandedness auto \
        --read1_extension _R1.fq.gz \
        --read2_extension _R2.fq.gz
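Before submitting the job, it can help to confirm that every read-1 file in the data folder has a read-2 mate with the expected suffix. The following is a minimal sketch, not part of the nf-core tooling: `check_pairs` is a hypothetical helper name, and it assumes that mates differ only in the suffix passed to “--read1_extension” and “--read2_extension”.

```shell
# Hypothetical helper (not part of nf-core): report R1 files without an R2 mate.
# Usage: check_pairs <fastq_dir> <read1_extension> <read2_extension>
# The body runs in a subshell, so its variables do not leak into the caller.
check_pairs() (
    dir="$1"
    r1_ext="$2"
    r2_ext="$3"
    missing=0
    for r1 in "$dir"/*"$r1_ext"; do
        # Skip the literal pattern when no R1 files are present
        [ -e "$r1" ] || continue
        # Swap the R1 suffix for the R2 suffix to get the expected mate
        r2="${r1%"$r1_ext"}$r2_ext"
        if [ ! -e "$r2" ]; then
            echo "missing mate for: $r1" >&2
            missing=$((missing + 1))
        fi
    done
    # Succeed only if every R1 file had a mate
    [ "$missing" -eq 0 ]
)
```

For the folder used above, the call would be `check_pairs "$HOME/workshop/data" _R1.fq.gz _R2.fq.gz`. It prints any unpaired files to standard error and returns a non-zero status if at least one is found.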
Paired-end FASTQ files:
sample,fastq_1,fastq_2,strandedness
control_1,/path/to/directory/containing/fastq_files/control-1_R1.fastq.gz,/path/to/directory/containing/fastq_files/control-1_R2.fastq.gz,auto
control_2,/path/to/directory/containing/fastq_files/control-2_R1.fastq.gz,/path/to/directory/containing/fastq_files/control-2_R2.fastq.gz,auto
control_3,/path/to/directory/containing/fastq_files/control-3_R1.fastq.gz,/path/to/directory/containing/fastq_files/control-3_R2.fastq.gz,auto
infected_1,/path/to/directory/containing/fastq_files/infected-1_R1.fastq.gz,/path/to/directory/containing/fastq_files/infected-1_R2.fastq.gz,auto
infected_2,/path/to/directory/containing/fastq_files/infected-2_R1.fastq.gz,/path/to/directory/containing/fastq_files/infected-2_R2.fastq.gz,auto
infected_3,/path/to/directory/containing/fastq_files/infected-3_R1.fastq.gz,/path/to/directory/containing/fastq_files/infected-3_R2.fastq.gz,auto
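
A generated samplesheet can be given a quick sanity check before it is passed to the pipeline. The sketch below uses a hypothetical helper, `check_samplesheet`, which is not part of nf-core. It assumes the four-column header shown above and that the two mates differ only by `_R1.` versus `_R2.` in the file name. It flags rows with the wrong number of fields and rows where “fastq_1” and “fastq_2” do not point to the same sample, a common copy-and-paste slip.

```shell
# Hypothetical helper (not part of nf-core): basic checks on a PE samplesheet.
# Usage: check_samplesheet <samplesheet.csv>
check_samplesheet() (
    awk -F',' '
        NR == 1 {
            # The header must match the paired-end layout shown above
            if ($0 != "sample,fastq_1,fastq_2,strandedness") {
                print "unexpected header: " $0
                bad = 1
            }
            next
        }
        NF != 4 {
            print "line " NR ": expected 4 fields, found " NF
            bad = 1
            next
        }
        {
            # Derive the expected R2 path from the R1 path and compare
            mate = $2
            sub(/_R1\./, "_R2.", mate)
            if (mate != $3) {
                print "line " NR ": fastq_1 and fastq_2 do not look like mates"
                bad = 1
            }
        }
        END { exit (bad + 0) }
    ' "$1"
)
```

Running `check_samplesheet samplesheet.csv` prints one message per problem row and returns a non-zero status if any problem is found.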