...
QUT eResearch data storage and computing infrastructure leverages both on-prem and hybrid cloud resources. QUT hosts a local High-Performance Computing (HPC) facility comprising 254 compute nodes, with a total of 7,256 conventional CPU cores (14,512 virtual cores with hyperthreading) and 74 GPUs. The highest memory allocation is 6 TB. Additionally, the HPC file storage for data analytics has a disk capacity of 1,800 TB.
Getting an HPC account
Apply for a new HPC account here.
Introduction
All jobs on the HPC must be scheduled via the PBS Pro batch job submission system. Jobs are submitted to the PBS Pro scheduler specifying the required resources, for example, the number of CPUs, the amount of memory, and the estimated time to complete the job (walltime). PBS will then schedule the job to run on the cluster when the requested resources are or become available. Existing HPC usage principles are in place to ensure fair access by all users.
...
Find below information on basic PBS commands:
Command | Description |
qsub -I -S /bin/bash | Submit an interactive job for 1 node with 1 CPU and 1 GB of memory for 1 hr |
qsub commandfile | Submit batch jobs to the HPC cluster using the PBS 'qsub' command. For example: [username@hpc ~]$ qsub commandfile or [username@hpc ~]$ qsub -q default -l walltime=10:00:00,mem=3000MB commandfile where commandfile is a file containing PBS commands along with other user-defined executable command(s). See qsub Options below for more options. |
qstat -u username | Displays a summary of the status of PBS jobs and queues for the given username. See man qstat for details of options. |
qjobs | Displays a summary of the jobs that you have submitted |
qdel jobid | Delete your job from a queue. The jobid is returned by qsub at job submission time, and is also displayed in the qstat output. |
qhold jobid | Place a hold on your job in the queue, stopping it from running. |
qrls -h u jobid | Release a user hold on your job, allowing it to run. |
qrerun jobid | Terminate an executing job and return it to a queue. |
...
qsub option | Description |
---|---|
#PBS -A acct | Causes the job time to be charged to "acct". |
#PBS -N myJob | Assigns a job name. The default is the name of the PBS job script. |
#PBS -l nodes=4:ppn=2 | The number of nodes and processors per node (deprecated). |
#PBS -l select=1:ncpus=2 | The number of chunks/nodes (select=1) and processors per chunk/node (ncpus=2). |
#PBS -l ngpus=2 | The number of graphics processing units (GPUs) required. |
#PBS -l walltime=01:00:00 | Sets the maximum wall-clock time during which this job can run (walltime=hh:mm:ss). |
#PBS -l mem=n{mb|gb} | Sets the maximum amount of memory per chunk/node allocated to the job. |
#PBS -l vmem=n{mb|gb} | Sets the maximum amount of virtual memory allocated to the job (deprecated). |
#PBS -q queuename | Assigns your job to a specific queue. |
#PBS -o mypath/my.out | The path and file name for standard output. |
#PBS -e mypath/my.err | The path and file name for standard error. |
#PBS -j oe | Join option that merges the standard error stream with the standard output stream of the job. |
#PBS -m {a|b|e} | Causes email to be sent to the user's email address (by default) when the job is aborted (a), begins (b), or ends (e). |
#PBS -M email-address | Sends email notifications to a specific email address (only needed if it differs from the user's email address). |
#PBS -P project | Specifies what project the job belongs to. |
#PBS -r n | Indicates that a job should not rerun if it fails. |
#PBS -S shell | Sets the shell to use. Make sure the full path to the shell is correct. |
#PBS -V | Exports all environment variables to the job. |
#PBS -W | Used to set job dependencies between two or more jobs. |
A Job Script Example
An example PBS Pro submission script is provided below:
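    #!/bin/bash -l
    # Job name and queue
    #PBS -N Example_Job
    #PBS -q default
    # 2 chunks/nodes, each with 16 CPUs and 4 GB of memory
    #PBS -l select=2:ncpus=16:mem=4gb
    #PBS -l walltime=<hh:mm:ss>
    #PBS -o <output-file>
    #PBS -e <error-file>

    # Run from the directory the job was submitted from
    cd "$PBS_O_WORKDIR"
    module load java
    conda activate ConsGenome
    # -r takes a MATLAB statement, so the script is named without the .m extension
    matlab -nodisplay -nosplash -r example_job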
Here, the line "-l select=2:ncpus=16:mem=4gb" sets the resources required for the job: select specifies the number of nodes (or chunks of resource) required, ncpus indicates the number of CPUs per chunk/node required, and mem sets the memory per chunk/node.
Find below a table illustrating how the above command works:
select | ncpus | mem (GB) | Description |
---|---|---|---|
2 | 16 | 4 | 32-processor job with 8 GB memory, using 2 nodes with 16 processors and 4 GB memory per node |
4 | 8 | 2 | 32-processor job with 8 GB memory, using 4 nodes with 8 processors and 2 GB memory per node |
16 | 1 | 1 | 16-processor job with 16 GB memory, using 16 nodes with 1 processor and 1 GB memory per node |
8 | 16 | 32 | 128-processor job with 256 GB memory, using 8 nodes with 16 processors and 32 GB memory per node |
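The totals in the table follow from multiplying the per-chunk values by the number of chunks. The short sketch below reproduces that arithmetic; job_totals is a hypothetical helper written for illustration (it is not a PBS command), and it assumes mem is given in whole gigabytes per chunk.

```shell
#!/bin/bash
# Hypothetical helper (not part of PBS): total resources requested by
# "-l select=<chunks>:ncpus=<cpus>:mem=<mem>gb".
job_totals() {
    local select=$1 ncpus=$2 mem_gb=$3
    # Totals are the per-chunk values multiplied by the number of chunks.
    echo "$(( select * ncpus )) CPUs, $(( select * mem_gb )) GB"
}

job_totals 2 16 4     # prints: 32 CPUs, 8 GB
job_totals 8 16 32    # prints: 128 CPUs, 256 GB
```

Checking the totals this way before submitting helps avoid requesting more than intended, since a small change to select scales both CPUs and memory.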
The line "-l walltime=<hh:mm:ss>" is the time limit you set for the job. If your job exceeds this time, the scheduler will terminate the job. It is recommended to find the usual runtime for a job of the same size and add some more (say 20%) to it.
For example, if a job normally takes approximately 10 hours, the walltime limit for a similar job could be set to 12 hours, e.g. "-l walltime=12:00:00".
By setting a walltime, the scheduler can perform job scheduling more efficiently. It also reduces occasions where errors leave the job stalled but still taking up resources for the default (much longer) walltime limit.
To find out what the default queue walltimes are, run the "qstat -q" command.
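The 20% rule of thumb can be turned into a small calculation. The following is a minimal sketch, not an official tool: walltime_with_margin is a hypothetical function name, it assumes the measured runtime is given in whole minutes, and it rounds the result up to the next minute.

```shell
#!/bin/bash
# Hypothetical helper: add a safety margin to a measured runtime and
# print it in the hh:mm:ss form expected by "-l walltime=".
#   $1 = measured runtime in minutes
#   $2 = margin in percent (assumed default: 20)
walltime_with_margin() {
    local runtime_min=$1 margin_pct=${2:-20}
    # Integer arithmetic only; the +99 rounds up to the next whole minute.
    local total_min=$(( (runtime_min * (100 + margin_pct) + 99) / 100 ))
    printf '%02d:%02d:00\n' $(( total_min / 60 )) $(( total_min % 60 ))
}

walltime_with_margin 600    # 10 hours measured, prints: 12:00:00
```

The printed value can be used directly in a job script, for example "#PBS -l walltime=12:00:00".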
Job management
The qstat command displays the status of the PBS scheduler and queues. Using the flags -Qa shows the queue partitions available. If no queue is defined, the PBS scheduler will assign the job to a queue based on the resources requested. The following table shows the commonly used queues:
quick | short | medium | long | huge |
---|---|---|---|---|
...
The table below describes the different job states through the life cycle of a job. Some attributes are only applicable when submitting jobs to an Enterprise PBS Professional complex.
Job State | Description |
B | Job arrays only: job array has Begun. |
E | Job is Exiting after having run. |
F | Job has Finished exiting and execution. The job was completed successfully and had no application errors. |
| Job has Finished exiting and execution; however, the job experienced application errors. |
H | Job is Held. A job is put into a held state by the server or by a user or administrator. A job stays in a held state until it is released by a user or an eResearch administrator. |
Q | Job is Queued, eligible and waiting to be run. |
R | Job is Running. |
S | Job is Suspended by the server. A job is put into the suspended state when a higher priority job needs the resources. |
T | Job is in Transition (being moved to a new location). |
U | Job is User-suspended. |
W | Job is Waiting for its requested execution time to be reached or job specified a staging request which failed for some reason. |
X | Sub jobs only; sub job is finished (expired). |
...
| quick | short | medium | long | huge |
Priority | 161 | 16020 | 160 | 160 | 160 |
Max CPU per job | 500 | 200 | 100 | 40 | 40 |
Max Node | 29 | 29 | 29 | 29 | 29 |
Min Walltime (hr) | 1 | 8 | 24 | 72 | 72 |
Max Walltime (hr) | 8 | 24 | 72 | 168 | 168 |
Default Walltime (hr) | 1 | 24 | 24 | 72 | 72 |
Default Memory (gb) | 2 | 2 | 2 | 2 | 2 |
Max RunningJobs | 500 | 4505000 | 400 | 200 | 200 |
Max QueuedJobs | 20000 | 5000 | 5000 | 2000 | 2000 |
Queue Scheduling Issues
The scheduling algorithm used on the HPC aims to:
promote large-scale parallel use of the HPC
allow equal access to resources for all users
provide good turnaround for all users
minimize the impact of jobs on one another
...
resources are strictly allocated, so jobs will not start unless there are sufficient free resources (e.g. CPUs and memory).
queued jobs are shuffled so that jobs from different users are "interleaved". This means your first job should appear near the top of the queue even if there are many jobs in the queue.
From a user's perspective, it is very important that you minimize your requests for resources (e.g. walltime, memory and CPUs). Otherwise, your job may be queued or suspended longer than necessary. Of course, make sure you ask for sufficient resources; a little experimentation might help.
...
PBS sets multiple environment variables at submission time. The following PBS variables are commonly used in command files:
Variable Name | Description |
PBS_ARRAYID | Array ID numbers for jobs submitted with the -t flag. For example, a job submitted with #PBS -t 1-8 will run eight identical copies of the shell script. The value of PBS_ARRAYID will be an integer between 1 and 8. |
PBS_ENVIRONMENT | Set to PBS_BATCH to indicate that the job is a batch job; otherwise, set to PBS_INTERACTIVE to indicate that the job is a PBS interactive job. |
PBS_JOBID | Full jobid assigned to this job. Often used to uniquely name output files for this job, for example: mpirun -np 16 ./a.out >output.${PBS_JOBID} |
PBS_JOBNAME | Name of the job. This can be set using the -N option in the PBS script (or from the command line). The default job name is the name of the PBS script. |
PBS_NODEFILE | Contains a list of the nodes assigned to the job. If multiple CPUs on a node have been assigned, the node will be listed in the file more than once. By default, mpirun assigns jobs to nodes in the order they are listed in this file. |
PBS_O_HOME | The value of the HOME variable in the environment in which qsub was executed. |
PBS_O_HOST | The name of the host upon which the qsub command is running. |
PBS_O_PATH | Original PBS path. Used with pbsdsh. |
PBS_O_QUEUE | Queue job was submitted to. |
PBS_O_WORKDIR | PBS sets the environment variable PBS_O_WORKDIR to the directory from which the batch job was submitted. |
PBS_QUEUE | Queue job is running in (typically this is the same as PBS_O_QUEUE). |
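As a rough illustration of how these variables are typically combined in a command file, the sketch below changes to the submission directory, builds a per-job output file name, and counts the entries in the node file. The fallback values (local, the current directory, and 1 slot) are assumptions added so that the snippet also runs outside a PBS job; they are not set by PBS.

```shell
#!/bin/bash
# Fallbacks after ":-" are assumptions for running outside PBS.
job_id=${PBS_JOBID:-local}
work_dir=${PBS_O_WORKDIR:-$PWD}

# Start in the directory the job was submitted from.
cd "$work_dir" || exit 1

# Name the output file after the job id so runs do not overwrite each other.
out_file="output.${job_id}"

# Each line of the node file is one assigned CPU slot, as described above.
if [ -n "$PBS_NODEFILE" ] && [ -r "$PBS_NODEFILE" ]; then
    slots=$(( $(wc -l < "$PBS_NODEFILE") ))
else
    slots=1
fi

echo "job ${job_id}: ${slots} slot(s), working in ${work_dir}, writing ${out_file}"
```

Inside a real job, the slot count could be passed on to a parallel launcher, for example as the value of mpirun's -np option.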
Interactive PBS Jobs
The use of PBS is not limited to batch jobs only. It also allows users to use the compute nodes interactively, when needed.
For example, users can work with the developer environments provided by Matlab or R on compute nodes, and run their jobs (until the walltime expires).
Instead of preparing a submission script, users pass the job requirements directly to the qsub command. For instance, the following PBS script:
...
corresponds to the following interactive job command:
Code Block |
---|
qsub -I -X -q default -S /bin/bash -l select=1:ncpus=2:mem=4gb -l walltime=05:00:00 |