
Copy and paste the code below to the terminal:

Code Block
cp $HOME/workshop/2024/rnaseq/data/samplesheet.csv $HOME/workshop/2024-2/session4_RNAseq/runs/run3_RNAseq
cd $HOME/workshop/2024-2/session4_RNAseq/runs/run3_RNAseq

Line 1: Copy the samplesheet.csv file to the working directory.
Line 2: Move to the working directory.

Copy the PBS Pro script to run the nf-core/rnaseq pipeline:

Code Block
cp $HOME/workshop/2024/rnaseq/scripts/launch_nf-core_RNAseq_pipeline.pbs $HOME/workshop/2024-2/session4_RNAseq/runs/run3_RNAseq
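Before editing or submitting anything, it can help to confirm that both files landed in the run directory. The following is a minimal sketch only; the function name check_run_inputs is hypothetical and not part of the workshop material.

```shell
# Hypothetical helper: report whether the two workshop files are present
# in a run directory (defaults to the current directory).
check_run_inputs() {
    local run_dir="${1:-.}"
    local missing=0
    local f
    for f in samplesheet.csv launch_nf-core_RNAseq_pipeline.pbs; do
        if [ -e "$run_dir/$f" ]; then
            echo "found:   $f"
        else
            echo "missing: $f"
            missing=1
        fi
    done
    # Exit status is 0 only when both files were found
    return "$missing"
}

# Usage: run from inside run3_RNAseq
check_run_inputs . || echo "copy the missing file(s) before submitting"
```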

Adjusting the Trim Galore (read trimming) options

Print the content of the launch_nf-core_RNAseq_pipeline.pbs script:

Code Block
cat launch_nf-core_RNAseq_pipeline.pbs

#!/bin/bash -l
#PBS -N nfRNAseq
#PBS -l select=1:ncpus=2:mem=4gb
#PBS -l walltime=48:00:00

#work on current directory
cd $PBS_O_WORKDIR

#load java and set up memory settings to run nextflow
module load java
export NXF_OPTS='-Xms1g -Xmx4g'

nextflow run nf-core/rnaseq --input samplesheet.csv \
    --outdir results \
    -r 3.14.0 \
    --genome GRCm38-local \
    -profile singularity \
    --aligner star_salmon \
    --extra_trimgalore_args "--quality 30 --clip_r1 10 --clip_r2 10 --three_prime_clip_r1 1 --three_prime_clip_r2 1"
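The trimming behaviour is controlled entirely by the string passed to --extra_trimgalore_args. As a rough sketch of how that string could be assembled and annotated, the bash fragment below builds it from an array; the variable names trim_args and extra_trimgalore_args are illustrative assumptions. The option spelling is copied from the script above (Trim Galore's own help text writes the clipping options with a capital R, for example --clip_R1), and the read 2 options only apply to paired-end data.

```shell
# Sketch: assemble the Trim Galore options one per line so each can be
# commented and changed independently (option spelling copied from the script).
trim_args=(
    --quality 30              # trim low-quality ends, Phred cutoff 30
    --clip_r1 10              # drop 10 bp from the 5' end of read 1
    --clip_r2 10              # drop 10 bp from the 5' end of read 2 (paired-end)
    --three_prime_clip_r1 1   # drop 1 bp from the 3' end of read 1
    --three_prime_clip_r2 1   # drop 1 bp from the 3' end of read 2 (paired-end)
)

# Join the array into the single space-separated string that
# --extra_trimgalore_args expects
extra_trimgalore_args="${trim_args[*]}"
echo "$extra_trimgalore_args"
```

With this in place, the last line of the nextflow command could read --extra_trimgalore_args "$extra_trimgalore_args", so changing a cutoff means editing one array entry.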

Submitting the job

Code Block
qsub launch_nf-core_RNAseq_pipeline.pbs

Monitoring the Run

Code Block
qjobs
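qjobs appears to be a site-specific command. On other PBS Pro systems, a rough equivalent might look like the sketch below. It assumes the standard qstat client is installed and relies on the .nextflow.log file that Nextflow writes in the directory it was launched from; the function name show_run_status is hypothetical.

```shell
# Hypothetical helper: show this user's PBS jobs and the tail of the Nextflow log.
show_run_status() {
    # List this user's jobs if the PBS client is available on this machine
    if command -v qstat >/dev/null 2>&1; then
        qstat -u "${USER:-$(id -un)}" || echo "qstat could not reach the PBS server"
    else
        echo "qstat not found (not on a PBS login node?)"
    fi

    # Nextflow appends progress and errors to .nextflow.log in the launch directory
    if [ -f .nextflow.log ]; then
        tail -n 20 .nextflow.log
    else
        echo "no .nextflow.log yet (job may still be queued)"
    fi
}

# Usage: run from the directory where the job was submitted
show_run_status
```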

Outputs

The pipeline produces two folders: “work”, where all the processing is done, and “results”, which holds the pipeline's outputs. The content of the results folder is as follows:

Code Block

/work/training/2024/rnaseq/runs/run3_RNAseq/results/
├── fastqc
│   ├── SRR20622172_fastqc.html
│   ├── SRR20622172_fastqc.zip
│   ├── SRR20622173_fastqc.html
│   ├── SRR20622173_fastqc.zip
│   ├── SRR20622174_fastqc.html
│   ├── SRR20622174_fastqc.zip
│   ├── SRR20622175_fastqc.html
│   ├── SRR20622175_fastqc.zip
│   ├── SRR20622176_fastqc.html
│   ├── SRR20622176_fastqc.zip
│   ├── SRR20622177_fastqc.html
│   ├── SRR20622177_fastqc.zip
│   ├── SRR20622178_fastqc.html
│   ├── SRR20622178_fastqc.zip
│   ├── SRR20622179_fastqc.html
│   ├── SRR20622179_fastqc.zip
│   ├── SRR20622180_fastqc.html
│   └── SRR20622180_fastqc.zip
├── multiqc
│   └── star_salmon
├── pipeline_info
│   ├── execution_report_2024-08-08_12-45-46.html
│   ├── execution_timeline_2024-08-08_12-45-46.html
│   ├── execution_trace_2024-08-08_12-45-46.txt
│   ├── params_2024-08-08_14-01-19.json
│   ├── pipeline_dag_2024-08-08_12-45-46.html
│   └── software_versions.yml
├── star_salmon
│   ├── bigwig
│   ├── deseq2_qc
│   ├── dupradar
│   ├── featurecounts
│   ├── log
│   ├── metadata.xlsx
│   ├── picard_metrics
│   ├── qualimap
│   ├── rseqc
│   ├── salmon.merged.gene_counts_length_scaled.rds
│   ├── salmon.merged.gene_counts_length_scaled.tsv
│   ├── salmon.merged.gene_counts.rds
│   ├── salmon.merged.gene_counts_scaled.rds
│   ├── salmon.merged.gene_counts_scaled.tsv
│   ├── salmon.merged.gene_counts.tsv
│   ├── salmon.merged.gene_lengths.tsv
│   ├── salmon.merged.gene_tpm.tsv
│   ├── salmon.merged.transcript_counts.rds
│   ├── salmon.merged.transcript_counts.tsv
│   ├── salmon.merged.transcript_lengths.tsv
│   ├── salmon.merged.transcript_tpm.tsv
│   ├── samtools_stats
│   ├── SRR20622172
│   ├── SRR20622172.markdup.sorted.bam
│   ├── SRR20622172.markdup.sorted.bam.bai
│   ├── SRR20622173
│   ├── SRR20622173.markdup.sorted.bam
│   ├── SRR20622173.markdup.sorted.bam.bai
│   ├── SRR20622174
│   ├── SRR20622174.markdup.sorted.bam
│   ├── SRR20622174.markdup.sorted.bam.bai
│   ├── SRR20622175
│   ├── SRR20622175.markdup.sorted.bam
│   ├── SRR20622175.markdup.sorted.bam.bai
│   ├── SRR20622176
│   ├── SRR20622176.markdup.sorted.bam
│   ├── SRR20622176.markdup.sorted.bam.bai
│   ├── SRR20622177
│   ├── SRR20622177.markdup.sorted.bam
│   ├── SRR20622177.markdup.sorted.bam.bai
│   ├── SRR20622178
│   ├── SRR20622178.markdup.sorted.bam
│   ├── SRR20622178.markdup.sorted.bam.bai
│   ├── SRR20622179
│   ├── SRR20622179.markdup.sorted.bam
│   ├── SRR20622179.markdup.sorted.bam.bai
│   ├── SRR20622180
│   ├── SRR20622180.markdup.sorted.bam
│   ├── SRR20622180.markdup.sorted.bam.bai
│   ├── stringtie
│   └── tx2gene.tsv
└── trimgalore
    ├── fastqc
    ├── SRR20622172.fastq.gz_trimming_report.txt
    ├── SRR20622173.fastq.gz_trimming_report.txt
    ├── SRR20622174.fastq.gz_trimming_report.txt
    ├── SRR20622175.fastq.gz_trimming_report.txt
    ├── SRR20622176.fastq.gz_trimming_report.txt
    ├── SRR20622177.fastq.gz_trimming_report.txt
    ├── SRR20622178.fastq.gz_trimming_report.txt
    ├── SRR20622179.fastq.gz_trimming_report.txt
    └── SRR20622180.fastq.gz_trimming_report.txt
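Since this run changes the trimming options, the per-sample reports in the trimgalore folder are a natural place to check their effect. The sketch below is one possible way to pull the read totals from each report. It assumes each report contains Cutadapt's "Total reads processed:" summary line; the function name summarise_trimming is hypothetical.

```shell
# Hypothetical helper: print the read total from each trimming report.
# Assumes the report embeds Cutadapt's "Total reads processed:" summary line.
summarise_trimming() {
    local report_dir="${1:-results/trimgalore}"
    local report
    for report in "$report_dir"/*_trimming_report.txt; do
        [ -e "$report" ] || continue   # glob matched nothing
        # Sample name = file name without the report suffix
        printf '%s\t' "$(basename "$report" .fastq.gz_trimming_report.txt)"
        grep -m 1 'Total reads processed:' "$report" || echo "summary line not found"
    done
}

# Usage: run from the run directory (prints nothing if no reports exist yet)
summarise_trimming results/trimgalore
```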

Gene- and transcript-level expression quantification can be found in the ‘star_salmon’ directory.

Code Block
cd results/star_salmon

The following Salmon count tables are generated:

Code Block

#gene level expression
salmon.merged.gene_counts_length_scaled.rds
salmon.merged.gene_counts_length_scaled.tsv
salmon.merged.gene_counts.rds
salmon.merged.gene_counts_scaled.rds
salmon.merged.gene_counts_scaled.tsv
salmon.merged.gene_counts.tsv   <--- This file will be used for differential expression analysis using DESeq2
salmon.merged.gene_tpm.tsv
#transcript level expression
salmon.merged.transcript_counts.rds
salmon.merged.transcript_counts.tsv
salmon.merged.transcript_tpm.tsv
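Before taking salmon.merged.gene_counts.tsv into DESeq2, a quick shape check can catch a truncated or incomplete table. The sketch below assumes the file is tab-separated with gene_id and gene_name as its first two columns, followed by one column per sample; the function name summarise_counts is hypothetical.

```shell
# Hypothetical helper: count genes (rows) and samples (columns) in a merged
# Salmon count table. Assumes a tab-separated file whose first two columns
# are gene_id and gene_name, followed by one column per sample.
summarise_counts() {
    awk -F'\t' '
        NR == 1 { samples = NF - 2 }
        END     { printf "genes=%d samples=%d\n", NR - 1, samples }
    ' "$1"
}

# Usage: run from results/star_salmon once the pipeline has finished
if [ -f salmon.merged.gene_counts.tsv ]; then
    summarise_counts salmon.merged.gene_counts.tsv
fi
```

For this run, the sample count would be expected to match the nine SRR accessions listed in the results tree.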

Version	Old Version 1	New Version 2
Changes made by	Roberto Barrero Gumiel	Roberto Barrero Gumiel
Saved on	08/10/2024	08/10/2024
