...
As a first exercise, we will download and run nf-core/smrnaseq, a bioinformatics best-practice analysis pipeline for small RNA-Seq. We will use the test data provided by the developers to confirm that the pipeline installed successfully. This control dataset contains 8 samples.
Run the following commands:
Code Block |
---|
cd $HOME/workshop/2024-2/session3 |
mkdir smrnaseq_cl |
cd smrnaseq_cl |
export NXF_OPTS='-Xms1g -Xmx4g' |
nextflow pull file:///work/training/smrnaseq |
nextflow run file:///work/training/smrnaseq -profile test,singularity --outdir results -r 2.3.1 |
Line 1: Move to the directory created for this workshop.
Line 2: Make a temporary folder called smrnaseq_cl for Nextflow to test the smrnaseq pipeline.
Line 3: Change directory to the newly created folder smrnaseq_cl.
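Line 4: Limit the memory used by the Nextflow Java virtual machine, which can otherwise request a large amount of memory. Here -Xms1g sets the initial Java heap size to 1 GB and -Xmx4g caps the maximum heap size at 4 GB.
Lines 5-6: Download the pipeline with nextflow pull, then run it on the test data with nextflow run, using the test and singularity profiles and version 2.3.1 of the pipeline (-r 2.3.1).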
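The export keyword on Line 4 matters: a variable that is only assigned, without export, stays inside the current shell and is not passed on to programs started from it, such as nextflow. The following is a minimal sketch of that difference. It uses a child sh process as a stand-in for nextflow and does not run Nextflow itself.

```shell
# Start from a clean state so the demonstration does not depend on the current session
unset NXF_OPTS

# Plain assignment: the variable exists in this shell only
NXF_OPTS='-Xms1g -Xmx4g'
plain=$(sh -c 'echo "${NXF_OPTS:-not set}"')

# export: the variable is now passed on to child processes
export NXF_OPTS
exported=$(sh -c 'echo "${NXF_OPTS:-not set}"')

echo "child process after plain assignment: $plain"
echo "child process after export: $exported"
```

The first line should report "not set" and the second should show the two Java options.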
This will download the smrnaseq pipeline and then run the test code. It should take ~20-30 minutes to run to completion.
Nextflow will first download the pipeline:
...
It will then display the version of the pipeline that was downloaded: version 2.3.1.
It will also list all the parameters that differ from the pipeline defaults.
...
Before running a process, it will download the required Singularity images, along with the reference and input files needed for testing.
In the screenshot below, all the jobs which will be run are listed.
...
Going back to the terminal from which you launched the Nextflow analysis, you can check the Nextflow log to see how the analysis is progressing.
For example, in the screenshot below, taken halfway through the Nextflow analysis, several processes have run to completion for all 8 samples tested.
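Besides the progress shown in the terminal, Nextflow also writes a hidden log file named .nextflow.log in the directory from which it was launched. The sketch below prints the last lines of that file. The function name show_nextflow_log is a hypothetical helper, and the example path assumes the smrnaseq_cl folder created earlier.

```shell
# Print the last 20 lines of the Nextflow log found in a launch directory.
# show_nextflow_log is a hypothetical helper name, not a Nextflow command.
show_nextflow_log() {
    launch_dir="${1:-.}"
    log_file="$launch_dir/.nextflow.log"
    if [ -f "$log_file" ]; then
        tail -n 20 "$log_file"
    else
        echo "No .nextflow.log found in $launch_dir"
        return 1
    fi
}

# Example call; the path assumes the smrnaseq_cl folder created earlier
show_nextflow_log "$HOME/workshop/2024-2/session3/smrnaseq_cl" || true
```

Running this from a second terminal avoids interrupting the analysis in the first one.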
...
Move back to the workshop directory and create a separate rnaseq_pbs folder:
Code Block |
---|
mkdir -p $HOME/workshop/2024-2/session3/rnaseq_pbs |
cd $HOME/workshop/2024-2/session3/rnaseq_pbs |
Create the script file rnaseq_test.sh by running the following command:
Code Block |
---|
cat <<EOF > rnaseq_test.sh |
#!/bin/bash -l |
#PBS -N nfrnaseq_test |
#PBS -l select=1:ncpus=2:mem=4gb |
#PBS -l walltime=6:00:00 |
|
cd \$PBS_O_WORKDIR |
module load java |
export NXF_OPTS='-Xms1g -Xmx4g' |
nextflow run file:///work/training/nextflow_intro/rnaseq -r 3.14.0 -profile test,singularity --outdir results |
EOF |
Line 3: Set your PBS job name to nfrnaseq_test.
Line 4: Specify the CPU and memory resources to allocate to your job (here, 2 CPUs and 4 GB of memory).
Line 5: Allocate 6 hours of walltime for your job to run to completion.
Line 7: Change directory to $PBS_O_WORKDIR, a special environment variable created by PBS. This is the folder from which you ran the qsub command.
Line 8: Load the Java module.
Line 9: Limit the memory used by the Nextflow Java virtual machine, which can otherwise request a large amount of memory.
Line 10: Run the nf-core/rnaseq pipeline using the test data provided.
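Note the backslash in \$PBS_O_WORKDIR. Because the script is written with an unquoted heredoc (cat <<EOF), the shell expands any unescaped variable at the moment the file is created, when PBS_O_WORKDIR does not exist yet. The backslash keeps the variable name literal so that it is resolved later, when the job runs. The sketch below illustrates this with a made-up variable, DEMO_VAR, and a temporary file. Neither is part of the workshop material.

```shell
# Write a small demo file into a temporary directory
demo_dir=$(mktemp -d)
DEMO_VAR="expanded when the file was written"

# Unquoted EOF: $DEMO_VAR is expanded now, \$DEMO_VAR is kept literal
cat <<EOF > "$demo_dir/demo.sh"
echo "unescaped: $DEMO_VAR"
echo "escaped: \$DEMO_VAR"
EOF

# Show what actually ended up in the file
cat "$demo_dir/demo.sh"
```

In the resulting file, the first line contains the value of DEMO_VAR, while the second still contains the text $DEMO_VAR.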
You can check the content of the PBS script you just created using the command:
Code Block |
---|
cat rnaseq_test.sh |
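As an optional extra check, bash can parse a script without executing it by using the -n option. The sketch below wraps this in a hypothetical helper named check_script_syntax. It only detects shell syntax errors. It does not validate the #PBS directives or confirm that the pipeline path exists.

```shell
# Parse a script with bash -n (no commands are executed) and report the result.
# check_script_syntax is a hypothetical helper name.
check_script_syntax() {
    script="$1"
    if bash -n "$script"; then
        echo "$script: no syntax errors found"
    else
        echo "$script: syntax errors found" >&2
        return 1
    fi
}

# Check the PBS script if it exists in the current folder
if [ -f rnaseq_test.sh ]; then
    check_script_syntax rnaseq_test.sh
fi
```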
Make the script executable and then submit your job to the PBS queue by running the following commands:
...
Once again, you can monitor your jobs using the qstat -u $USER command.
The test should take ~30 min to run.
...