Overview
Use a modified launch script to run the full nf-core/rnaseq pipeline, with the trimming parameters adjusted according to the QC output.
Inspect precomputed results
Run full nf-core/rnaseq pipeline
STEP1: Copy the metadata (samplesheet.csv) into the working folder (run2_RNAseq)
Code Block |
---|
cp $HOME/workshop/2024-2/session4_RNAseq/data/mouse/samplesheet.csv $HOME/workshop/2024-2/session4_RNAseq/runs/run2_RNAseq
cd $HOME/workshop/2024-2/session4_RNAseq/runs/run2_RNAseq |
Line 1: Copy the samplesheet.csv file to the working directory.
Line 2: Move to the working directory.
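Before going further it can help to confirm that the copy worked. The following is a minimal sketch, not part of the workshop scripts: `check_samplesheet` is a hypothetical helper, and it assumes the first line of the file is a column header.

```shell
# check_samplesheet is a hypothetical helper (not part of nf-core/rnaseq).
# It confirms the sample sheet exists and is non-empty, then prints its
# header and the number of data rows.
check_samplesheet() {
    local sheet="${1:-samplesheet.csv}"
    if [ ! -s "$sheet" ]; then
        echo "ERROR: $sheet is missing or empty" >&2
        return 1
    fi
    # Assumption: the first line is the column header.
    echo "Header: $(head -n 1 "$sheet")"
    # Every remaining non-blank line is counted as one data row.
    echo "Data rows: $(tail -n +2 "$sheet" | grep -c -v '^[[:space:]]*$')"
}

# Usage, from inside the working folder:
#   check_samplesheet samplesheet.csv
```

If the function reports an error, repeat the copy step before continuing.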
Copy the PBS Pro script to run the nf-core/rnaseq pipeline:
Code Block |
---|
cp $HOME/workshop/2024-2/session4_RNAseq/scripts/launch_nf-core_RNAseq_pipeline.pbs $HOME/workshop/2024-2/session4_RNAseq/runs/run2_RNAseq |
NOTE: If the commands above did not work, run the following instead to copy samplesheet.csv and the launch script, then move to the working directory:
Code Block |
---|
cp /work/training/2024/rnaseq/data/samplesheet.csv $HOME/workshop/2024-2/session4_RNAseq/runs/run2_RNAseq
cp /work/training/2024/rnaseq/scripts/launch_nf-core_RNAseq_pipeline.pbs $HOME/workshop/2024-2/session4_RNAseq/runs/run2_RNAseq
cd $HOME/workshop/2024-2/session4_RNAseq/runs/run2_RNAseq |
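The "try the workshop copy, otherwise use the shared copy" logic can also be written as one small function. This is only a sketch: `copy_with_fallback` is a hypothetical helper name, and the paths in the usage comment are the ones given in this section.

```shell
# copy_with_fallback is a hypothetical helper: copy the first source if it
# exists, otherwise copy the second one.
copy_with_fallback() {
    local primary="$1" fallback="$2" dest="$3"
    if [ -f "$primary" ]; then
        cp "$primary" "$dest"
    elif [ -f "$fallback" ]; then
        echo "Primary not found, using fallback: $fallback" >&2
        cp "$fallback" "$dest"
    else
        echo "ERROR: neither $primary nor $fallback exists" >&2
        return 1
    fi
}

# Usage with the paths from this section:
#   RUN_DIR=$HOME/workshop/2024-2/session4_RNAseq/runs/run2_RNAseq
#   copy_with_fallback \
#       $HOME/workshop/2024-2/session4_RNAseq/data/mouse/samplesheet.csv \
#       /work/training/2024/rnaseq/data/samplesheet.csv \
#       "$RUN_DIR"
```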
Adjusting the Trim Galore (read trimming) options
Print the content of the launch_nf-core_RNAseq_pipeline.pbs script:
Code Block |
---|
cat launch_nf-core_RNAseq_pipeline.pbs |
...
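To find the trimming settings quickly, the script can be searched rather than read in full. A minimal sketch follows; `show_trim_options` is a hypothetical helper, and the search terms `trim` and `clip` are an assumption about how such options are usually named, since the script's content is not reproduced here.

```shell
# show_trim_options is a hypothetical helper: list every line of a launch
# script that mentions trimming or clipping, prefixed with its line number.
show_trim_options() {
    local script="${1:-launch_nf-core_RNAseq_pipeline.pbs}"
    # Assumption: trimming-related options contain "trim" or "clip".
    grep -n -i -E 'trim|clip' "$script" \
        || echo "No trimming options found in $script"
}

# Usage:
#   show_trim_options launch_nf-core_RNAseq_pipeline.pbs
```

The line numbers in the output show where to edit the script before submitting it.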
Submitting the job
Code Block |
---|
qsub launch_nf-core_RNAseq_pipeline.pbs |
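`qsub` prints the identifier of the new job on standard output, so it can be saved for later status checks. The sketch below shows one way to do that; `submit_and_record` and the file name `last_job_id.txt` are hypothetical and not part of the workshop material.

```shell
# submit_and_record is a hypothetical helper: submit a PBS script and keep
# the job identifier that qsub prints on standard output.
submit_and_record() {
    local script="$1" job_id
    job_id=$(qsub "$script") || {
        echo "ERROR: qsub failed for $script" >&2
        return 1
    }
    # last_job_id.txt is an arbitrary file name chosen for this example.
    echo "$job_id" > last_job_id.txt
    echo "Submitted $script as job $job_id"
}

# Usage:
#   submit_and_record launch_nf-core_RNAseq_pipeline.pbs
#   qstat "$(cat last_job_id.txt)"
```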
Monitoring the Run
Code Block |
---|
qjobs |
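The scheduler shows whether the job is queued or running. Progress inside the pipeline is recorded in the `.nextflow.log` file that Nextflow writes to the directory it is launched from. The sketch below assumes the launch script starts Nextflow from the working folder; `show_progress` is a hypothetical helper.

```shell
# show_progress is a hypothetical helper: print the last lines of the
# Nextflow log, or a short message if the log does not exist yet.
show_progress() {
    local log="${1:-.nextflow.log}" lines="${2:-20}"
    if [ -f "$log" ]; then
        tail -n "$lines" "$log"
    else
        echo "No log at $log yet; the job may still be queued"
    fi
}

# Usage, from inside the working folder:
#   show_progress              # last 20 lines of .nextflow.log
#   show_progress .nextflow.log 50
```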
Outputs
The pipeline will produce two folders, one called “work,” where all the processing is done, and another called “results,” where we can find the pipeline's outputs. The content of the results folder is as follows:
Code Block |
---|
/work/training/2024/rnaseq/runs/run3_RNAseq/results/
├── fastqc
│ ├── SRR20622172_fastqc.html
│ ├── SRR20622172_fastqc.zip
│ ├── SRR20622173_fastqc.html
│ ├── SRR20622173_fastqc.zip
│ ├── SRR20622174_fastqc.html
│ ├── SRR20622174_fastqc.zip
│ ├── SRR20622175_fastqc.html
│ ├── SRR20622175_fastqc.zip
│ ├── SRR20622176_fastqc.html
│ ├── SRR20622176_fastqc.zip
│ ├── SRR20622177_fastqc.html
│ ├── SRR20622177_fastqc.zip
│ ├── SRR20622178_fastqc.html
│ ├── SRR20622178_fastqc.zip
│ ├── SRR20622179_fastqc.html
│ ├── SRR20622179_fastqc.zip
│ ├── SRR20622180_fastqc.html
│ └── SRR20622180_fastqc.zip
├── multiqc
│ └── star_salmon
├── pipeline_info
│ ├── execution_report_2024-08-08_12-45-46.html
│ ├── execution_timeline_2024-08-08_12-45-46.html
│ ├── execution_trace_2024-08-08_12-45-46.txt
│ ├── params_2024-08-08_14-01-19.json
│ ├── pipeline_dag_2024-08-08_12-45-46.html
│ └── software_versions.yml
├── star_salmon
│ ├── bigwig
│ ├── deseq2_qc
│ ├── dupradar
│ ├── featurecounts
│ ├── log
│ ├── metadata.xlsx
│ ├── picard_metrics
│ ├── qualimap
│ ├── rseqc
│ ├── salmon.merged.gene_counts_length_scaled.rds
│ ├── salmon.merged.gene_counts_length_scaled.tsv
│ ├── salmon.merged.gene_counts.rds
│ ├── salmon.merged.gene_counts_scaled.rds
│ ├── salmon.merged.gene_counts_scaled.tsv
│ ├── salmon.merged.gene_counts.tsv
│ ├── salmon.merged.gene_lengths.tsv
│ ├── salmon.merged.gene_tpm.tsv
│ ├── salmon.merged.transcript_counts.rds
│ ├── salmon.merged.transcript_counts.tsv
│ ├── salmon.merged.transcript_lengths.tsv
│ ├── salmon.merged.transcript_tpm.tsv
│ ├── samtools_stats
│ ├── SRR20622172
│ ├── SRR20622172.markdup.sorted.bam
│ ├── SRR20622172.markdup.sorted.bam.bai
│ ├── SRR20622173
│ ├── SRR20622173.markdup.sorted.bam
│ ├── SRR20622173.markdup.sorted.bam.bai
│ ├── SRR20622174
│ ├── SRR20622174.markdup.sorted.bam
│ ├── SRR20622174.markdup.sorted.bam.bai
│ ├── SRR20622175
│ ├── SRR20622175.markdup.sorted.bam
│ ├── SRR20622175.markdup.sorted.bam.bai
│ ├── SRR20622176
│ ├── SRR20622176.markdup.sorted.bam
│ ├── SRR20622176.markdup.sorted.bam.bai
│ ├── SRR20622177
│ ├── SRR20622177.markdup.sorted.bam
│ ├── SRR20622177.markdup.sorted.bam.bai
│ ├── SRR20622178
│ ├── SRR20622178.markdup.sorted.bam
│ ├── SRR20622178.markdup.sorted.bam.bai
│ ├── SRR20622179
│ ├── SRR20622179.markdup.sorted.bam
│ ├── SRR20622179.markdup.sorted.bam.bai
│ ├── SRR20622180
│ ├── SRR20622180.markdup.sorted.bam
│ ├── SRR20622180.markdup.sorted.bam.bai
│ ├── stringtie
│ └── tx2gene.tsv
└── trimgalore
  ├── fastqc
  ├── SRR20622172.fastq.gz_trimming_report.txt
  ├── SRR20622173.fastq.gz_trimming_report.txt
  ├── SRR20622174.fastq.gz_trimming_report.txt
  ├── SRR20622175.fastq.gz_trimming_report.txt
  ├── SRR20622176.fastq.gz_trimming_report.txt
  ├── SRR20622177.fastq.gz_trimming_report.txt
  ├── SRR20622178.fastq.gz_trimming_report.txt
  ├── SRR20622179.fastq.gz_trimming_report.txt
  └── SRR20622180.fastq.gz_trimming_report.txt |
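A quick way to confirm that the run completed for every sample is to check that each BAM file in the listing has its index next to it. This is a minimal sketch based only on the file names shown above; `check_bams` is a hypothetical helper.

```shell
# check_bams is a hypothetical helper: count the *.markdup.sorted.bam files
# in a directory and report any that lack a non-empty .bai index.
check_bams() {
    local dir="${1:-results/star_salmon}" bam missing=0 found=0
    for bam in "$dir"/*.markdup.sorted.bam; do
        [ -e "$bam" ] || continue          # the glob matched nothing
        found=$((found + 1))
        if [ ! -s "$bam.bai" ]; then
            echo "Missing index for $bam"
            missing=$((missing + 1))
        fi
    done
    echo "$found BAM file(s), $missing without an index"
    # Succeed only if at least one BAM exists and all of them are indexed.
    [ "$found" -gt 0 ] && [ "$missing" -eq 0 ]
}

# Usage, from inside the working folder:
#   check_bams results/star_salmon
```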
The gene-level and transcript-level expression quantification can be found in the ‘star_salmon’ directory.
Code Block |
---|
cd results/star_salmon |
The following count tables are generated:
Code Block |
---|
#gene level expression
salmon.merged.gene_counts_length_scaled.rds
salmon.merged.gene_counts_length_scaled.tsv
salmon.merged.gene_counts.rds
salmon.merged.gene_counts_scaled.rds
salmon.merged.gene_counts_scaled.tsv
salmon.merged.gene_counts.tsv <--- This file will be used for differential expression analysis using DESeq2
salmon.merged.gene_tpm.tsv
#transcript level expression
salmon.merged.transcript_counts.rds
salmon.merged.transcript_counts.tsv
salmon.merged.transcript_tpm.tsv |
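Before moving on to DESeq2 it is worth checking the size of the gene count table. The sketch below assumes the table is tab-separated with a header row whose first two columns are gene identifiers (gene_id and gene_name) and whose remaining columns are samples; `summarise_counts` is a hypothetical helper.

```shell
# summarise_counts is a hypothetical helper: report how many genes (rows)
# and samples (columns) a merged count table contains.
summarise_counts() {
    local table="${1:-salmon.merged.gene_counts.tsv}"
    awk -F '\t' '
        # Assumption: the first two header columns are gene_id and gene_name.
        NR == 1 { samples = NF - 2; next }
        NF > 0  { genes++ }
        END     { printf "%d genes x %d samples\n", genes, samples }
    ' "$table"
}

# Usage, from inside results/star_salmon:
#   summarise_counts salmon.merged.gene_counts.tsv
```

The number of samples reported should match the number of samples in samplesheet.csv.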