...

  • Be notified about job events: use the -m option, e.g. to be sent an email if the job is aborted (a), when it begins (b), and when it ends (e): #PBS -m abe

  • Give the job a name: to find your job in a long list, give it a meaningful name with the -N option: #PBS -N MyJob01

  • Merge the error file into the standard output file: #PBS -j oe

  • Override the email address: to send the job notification email to another address, use the -M option, e.g. #PBS -M bob@bob.com
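
The options above can be combined at the top of a single job script. The following is a minimal sketch, not a site-approved template: the job name and email address are the examples from the list, while the resource request, the walltime, and the echo message are placeholder assumptions. PBS_JOBNAME is a variable PBS sets inside a running job, so the script falls back to a fixed label when it is run by hand.

```shell
#!/bin/bash -l
#PBS -N MyJob01
#PBS -m abe
#PBS -M bob@bob.com
#PBS -j oe
#PBS -l select=1:ncpus=1:mem=1gb
#PBS -l walltime=00:05:00

# PBS sets PBS_JOBNAME inside a job; use a fallback label when run by hand
job_label="${PBS_JOBNAME:-MyJob01 (not running under PBS)}"
echo "Running job: $job_label"
```

Because the #PBS lines are ordinary comments to bash, the same file can be checked with bash before it is handed to qsub.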

...

Example 2

From the Introduction to the Unix Shell for HPC users course, let's run the do-stats.sh script as a job.

...

Code Block
cd ~/workshop/2024-2/shell-lesson-data/north-pacific-gyre

Then, create the submission script (copy and paste!)

Code Block
#!/bin/bash -l
#PBS -N GooStatsRun01
#PBS -l select=1:ncpus=1:mem=2gb
#PBS -l walltime=00:30:00
#PBS -m abe
cd $PBS_O_WORKDIR

# Calculate stats for data files.
for datafile in NENE*A.txt NENE*B.txt
do
    echo $datafile
    bash goostats.sh $datafile stats-$datafile
done

Save this file as do-goostats.pbs
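
Before submitting, it can help to preview which files the loop will pick up. The sketch below is a dry run under stated assumptions: it builds a throwaway folder of empty files whose names merely imitate the lesson data (the exact file names are illustrative), then prints the goostats.sh commands the loop would run instead of running them.

```shell
# Throwaway folder with empty files named like the lesson data (names are illustrative)
demo_dir=$(mktemp -d)
touch "$demo_dir/NENE01729A.txt" "$demo_dir/NENE01729B.txt" "$demo_dir/NENE01971Z.txt"

# Dry run: print the commands the loop would execute, without running goostats.sh
planned=$(
    cd "$demo_dir" || exit 1
    for datafile in NENE*A.txt NENE*B.txt
    do
        echo "bash goostats.sh $datafile stats-$datafile"
    done
)
echo "$planned"

# Clean up the throwaway folder
rm -r "$demo_dir"
```

Only the files ending in A.txt and B.txt match the two patterns; the file ending in Z.txt is skipped.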

Now submit to the scheduler:

...

$PBS_O_WORKDIR is a special environment variable created by PBS. It holds the path of the folder where you ran the qsub command.
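
A slightly more defensive form of the cd line is sketched below. It assumes nothing beyond the variable itself: when the script is run by hand, outside PBS, the variable is unset, so the sketch falls back to the current folder, and it stops if the folder cannot be entered. The fallback and the error message are additions for illustration, not part of the course script.

```shell
# Inside a job PBS sets PBS_O_WORKDIR; outside a job it is unset,
# so fall back to the current folder to keep the script testable by hand
workdir="${PBS_O_WORKDIR:-$PWD}"

# Stop early if the folder cannot be entered, rather than running in the wrong place
cd "$workdir" || { echo "Cannot change to $workdir" >&2; exit 1; }
echo "Working in: $(pwd)"
```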

Example 3

We can run commands to check and trim fastq sequences.

Start by changing directory:

Code Block
cd $HOME/workshop/2024-2

Then, download the data and scripts (Copy and Paste!):

Code Block
wget https://github.com/eresearchqut/hpc_training_3/archive/refs/tags/1.0.tar.gz

Then, unpack the downloaded file with the tar command:

Code Block
tar xvzf 1.0.tar.gz
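
In the tar command, x extracts, v lists each file as it is processed, z handles gzip compression, and f names the archive file. Swapping x for t lists the contents without unpacking anything, which is a quick way to see what folder an archive will create. The sketch below demonstrates this on a small stand-in archive built in a temporary folder; the stand-in only imitates the layout of the real download and its contents are placeholders.

```shell
# Build a small stand-in archive (placeholder contents, not the real download)
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/hpc_training_3-1.0/data"
echo "placeholder" > "$demo_dir/hpc_training_3-1.0/stage1.pbs"
tar -czf "$demo_dir/1.0.tar.gz" -C "$demo_dir" hpc_training_3-1.0

# t = list contents, z = gzip, f = archive file name; nothing is unpacked
listing=$(tar -tzf "$demo_dir/1.0.tar.gz")
echo "$listing"

# Clean up the throwaway folder
rm -r "$demo_dir"
```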

Change into the newly created folder (Don’t forget TAB completion):

Code Block
cd hpc_training_3-1.0

Examine the data and scripts:

Code Block
ls
cat stage1.pbs

stage1.pbs

Code Block
#!/bin/bash -l
#PBS -N Stage1-FastQC
#PBS -m abe
#PBS -l select=1:ncpus=2:mem=4gb
#PBS -l walltime=1:00:00

# Change to folder where we launched the job
cd $PBS_O_WORKDIR

# Make a directory for the fastqc results
mkdir fastqc

# Load the fastqc module
module load fastqc/0.11.7-java-1.8.0_92

# Run fastqc
fastqc --threads 2 --outdir fastqc data/sample1_R1.fastq.gz data/sample1_R2.fastq.gz

This job script loads the fastqc app from the module system and runs it on the two sample files. It creates the fastqc directory, which holds the results of the run.
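
One way to keep the fastqc thread count in step with the CPU request is to read it from the job environment instead of typing it twice. The sketch below assumes that PBS exports an NCPUS variable inside the job, which is worth confirming on your own system; outside a job it defaults to 1. It only prints the command it would run, so it is safe to try anywhere.

```shell
# Assumption: PBS exports NCPUS inside a job; default to 1 elsewhere
threads="${NCPUS:-1}"

# Build the fastqc command line from the same file names as stage1.pbs
fastqc_cmd="fastqc --threads $threads --outdir fastqc data/sample1_R1.fastq.gz data/sample1_R2.fastq.gz"

# Dry run: show the command rather than executing it
echo "$fastqc_cmd"
```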

stage2.pbs

Code Block
#!/bin/bash -l
#PBS -N Stage2-seqtk
#PBS -m abe
#PBS -l select=1:ncpus=2:mem=8gb
#PBS -l walltime=1:00:00

# Change to folder where we launched the job
cd $PBS_O_WORKDIR

# Make a directory for the seqtk results
mkdir trimmed

# Use Singularity to supply seqtk and trim the files
singularity exec https://depot.galaxyproject.org/singularity/seqtk:1.4--he4a0461_1 bash -c \
  'seqtk trimfq data/sample1_R1.fastq.gz | \
  gzip --no-name > trimmed/sample1_R1.fastq.trimmed.gz'

singularity exec https://depot.galaxyproject.org/singularity/seqtk:1.4--he4a0461_1 bash -c \
  'seqtk trimfq data/sample1_R2.fastq.gz | \
  gzip --no-name > trimmed/sample1_R2.fastq.trimmed.gz'

This job script runs the seqtk app over the sample fastq files using default settings. Since seqtk is not available on the HPC, we use singularity to supply the app. The script saves the trimmed fastq files in the trimmed folder.
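
The two singularity commands in stage2.pbs differ only in R1 versus R2, so they could also be written as a loop. The following is a sketch of that idea, not a replacement for the course script: it runs the real command only when singularity and the input file are both present, and otherwise prints what it would have run. The variable names are illustrative.

```shell
# Container image URL copied from stage2.pbs
image="https://depot.galaxyproject.org/singularity/seqtk:1.4--he4a0461_1"

for read in R1 R2
do
    in_file="data/sample1_${read}.fastq.gz"
    out_file="trimmed/sample1_${read}.fastq.trimmed.gz"
    trim_cmd="seqtk trimfq $in_file | gzip --no-name > $out_file"

    if command -v singularity >/dev/null 2>&1 && [ -f "$in_file" ]; then
        mkdir -p trimmed    # -p: no error if the folder already exists
        singularity exec "$image" bash -c "$trim_cmd"
    else
        # Dry run when singularity or the data is missing
        echo "Would run: singularity exec $image bash -c '$trim_cmd'"
    fi
done
```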

stage3.pbs

Code Block
#!/bin/bash -l
#PBS -N Stage3-multiqc
#PBS -m abe
#PBS -l select=1:ncpus=2:mem=8gb
#PBS -l walltime=1:00:00

# Change to folder where we launched the job
cd $PBS_O_WORKDIR

# Make a directory for the multiqc results
mkdir multiqc

# Use Singularity to supply multiqc
singularity exec https://depot.galaxyproject.org/singularity/multiqc:1.21--pyhdfd78af_0 \
  multiqc --force --outdir multiqc data fastqc trimmed

This job script runs multiqc to create a summary of the data and the operations of the previous stages. As with seqtk in stage2, multiqc is not available on the HPC, so we run it via singularity. The results are saved in the multiqc folder.

These scripts must be run one after another; they cannot run at the same time, because stage3 summarises the fastqc and trimmed folders that stage1 and stage2 create.

Launch stage1.pbs

Code Block
qsub stage1.pbs
qjobs
cat Stage1-FastQC.o{tab}
cat Stage1-FastQC.e{tab}
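
PBS names these files after the job: the job name set with -N, then .o (standard output) or .e (standard error), then the job number. The sketch below shows a quick way to see which of the two files has any content, using the shell's -s test. It works on stand-in files in a temporary folder; the job number 1234567 and the file contents are invented for the demonstration.

```shell
# Stand-in output files (job number and contents are invented)
demo_dir=$(mktemp -d)
echo "Analysis complete" > "$demo_dir/Stage1-FastQC.o1234567"
: > "$demo_dir/Stage1-FastQC.e1234567"    # an empty error file

# -s is true when a file exists and is not empty
report=$(
    for f in "$demo_dir"/Stage1-FastQC.[oe]*
    do
        if [ -s "$f" ]; then
            echo "$(basename "$f"): has content"
        else
            echo "$(basename "$f"): empty"
        fi
    done
)
echo "$report"

# Clean up the throwaway folder
rm -r "$demo_dir"
```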

When finished, launch stage2.pbs

Code Block
qsub stage2.pbs
qjobs
cat Stage2-seqtk.o{tab}
cat Stage2-seqtk.e{tab}

Finally, when stage2 is finished, launch stage3.pbs

Code Block
qsub stage3.pbs
qjobs
cat Stage3-multiqc.o{tab}
cat Stage3-multiqc.e{tab}

Because of the -m abe option, each job sends you an email when it begins and when it ends (or is aborted). Receiving the email can be handier than repeatedly checking qjobs.
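
As an alternative to launching each stage by hand, PBS can hold a job until an earlier one has finished successfully, using the -W depend=afterok option of qsub. The sketch below is an assumption about how that could look for these three scripts, not part of the course material, and your site may have its own recommendations. It relies on qsub printing the new job's ID, and it does nothing except print a message when qsub is not installed.

```shell
if command -v qsub >/dev/null 2>&1; then
    # qsub prints the new job's ID, which the next submission depends on
    job1=$(qsub stage1.pbs)
    job2=$(qsub -W depend=afterok:"$job1" stage2.pbs)
    job3=$(qsub -W depend=afterok:"$job2" stage3.pbs)
    chain_status="submitted $job1 -> $job2 -> $job3"
else
    chain_status="qsub not found: run this on the HPC login node"
fi
echo "$chain_status"
```

With afterok, a later stage starts only if the job it depends on exits without error.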

Once stage3 is finished, open up the hpc_training_3-1.0 folder in Windows Explorer/Finder and examine the files, especially in the multiqc folder.
