6.5 Huntington Disease samples profiling against MirGeneDB curated database

Overview

  • As in exercise 6.4, we will:

    • Use the “samplesheet.csv” metadata file created for the small RNAseq datasets in exercise 6.4.

    • Use a “nextflow.config” file in the working directory to override Nextflow parameters (e.g., specify where to find the pipeline assets).

    • Use a PBS script to run the expression profiling of miRNAs against MirGeneDB, a curated database that includes experimentally validated miRNAs.

Prepare pipeline inputs

Let’s move to the working directory:

cd $HOME/workshop/2024-2/session6_smallRNAseq/runs/run2_human_MirGeneDB

Now, let’s copy the samplesheet.csv, the nextflow.config, and the PBS script that runs the pipeline against MirGeneDB:

cp $HOME/workshop/2024-2/session6_smallRNAseq/data/human_disease/samplesheet.csv .
cp $HOME/workshop/2024-2/session6_smallRNAseq/scripts/nextflow.config .
cp $HOME/workshop/2024-2/session6_smallRNAseq/scripts/launch_nf-core_smallRNAseq_MirGeneDB.pbs .

Print the content of the launch script:

cat launch_nf-core_smallRNAseq_MirGeneDB.pbs

Submit the job to the HPC cluster:
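On a PBS cluster, jobs are typically submitted with qsub. The following is a minimal sketch, assuming qsub is available on the login node and that the launch script was copied to the current directory as described above. The guard is only there so the snippet fails gracefully on a machine without PBS:

```shell
# Name of the launch script copied earlier in this exercise.
PBS_SCRIPT=launch_nf-core_smallRNAseq_MirGeneDB.pbs

# Assumption: the cluster uses PBS and provides the qsub command.
if command -v qsub >/dev/null 2>&1; then
    # qsub prints the identifier of the newly queued job.
    qsub "$PBS_SCRIPT"
else
    echo "qsub not found: run this on the HPC login node" >&2
fi
```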

Monitor the progress:
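One common way to check on a PBS job is qstat. This is a sketch under the same assumption (a PBS scheduler with qstat on the PATH), and it lists only jobs owned by the current user:

```shell
# Fall back to `id -un` if the USER variable is not set.
ME="${USER:-$(id -un)}"

# Assumption: the cluster uses PBS and provides the qstat command.
if command -v qstat >/dev/null 2>&1; then
    # Show the queue state of all jobs belonging to this user.
    qstat -u "$ME"
else
    echo "qstat not found: run this on the HPC login node" >&2
fi
```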

The job will take several hours to run, so we will use precomputed results for the statistical analysis in the next section.

Outputs

The pipeline will produce two folders, one called “work,” where all the processing is done, and another called “results,” where we can find the pipeline's outputs. The content of the results folder is as follows:

The mature miRNA and hairpin expression counts can be found in the results/mirna_quant/edger_qc directory.

Let’s inspect the mature_counts.csv file, using the ‘cat’ command to print it on the screen:
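A minimal sketch of this inspection step, assuming the counts file is named mature_counts.csv and sits under results/mirna_quant/edger_qc relative to the working directory. Because count tables can be wide, the sketch also shows a truncated preview with head and cut:

```shell
# Assumed location of the mature miRNA counts, relative to the run folder.
COUNTS=results/mirna_quant/edger_qc/mature_counts.csv

if [ -f "$COUNTS" ]; then
    # Print the whole table on the screen.
    cat "$COUNTS"

    # Optional preview: first 5 rows and first 6 comma-separated columns.
    head -n 5 "$COUNTS" | cut -d',' -f1-6
else
    echo "File not found: $COUNTS (has the pipeline finished?)" >&2
fi
```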

Note: the “mature_counts.csv” file needs to be transposed prior to running the statistical analysis. This can be done either in the R script or with a script called “transpose_csv.py”.
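To illustrate what transposing means here, the sketch below swaps the rows and columns of a small toy table using awk. It is not the workshop's transpose_csv.py script. The file names and values are made up, and it assumes a simple CSV with no quoted fields containing commas:

```shell
# Work in a temporary directory so no real files are touched.
WORKDIR=$(mktemp -d)

# Toy table with made-up identifiers and counts.
cat > "$WORKDIR/toy.csv" <<'EOF'
id,mir-A,mir-B
s1,10,0
s2,20,5
EOF

# Store every cell, then print the table column by column.
awk -F',' '
{
    for (i = 1; i <= NF; i++) cell[NR, i] = $i
    if (NF > ncol) ncol = NF
}
END {
    for (i = 1; i <= ncol; i++) {
        line = cell[1, i]
        for (j = 2; j <= NR; j++) line = line "," cell[j, i]
        print line
    }
}' "$WORKDIR/toy.csv" > "$WORKDIR/toy_transposed.csv"

cat "$WORKDIR/toy_transposed.csv"
# Prints:
# id,s1,s2
# mir-A,10,20
# mir-B,0,5
```

Each row of the input becomes a column of the output, which is the same reshaping the statistical analysis expects from the real counts table.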

Let’s copy the transpose_csv.py script to the working folder:

To check how to use the script, do the following:

To transpose the initial “mature_counts.csv” file, do the following:

Let’s now print the transposed mature counts table: