2024-2: What is the HPC?
The High Performance Cluster/Computer (HPC) is a computer system designed to provide researchers with a computing environment that is larger and more available than their desktop or laptop. If you have high demands on compute, storage or time, the HPC will help you.
HPC Parts
While we talk about the HPC as a single unit, it is in fact made of many parts. As indicated on the diagram, we have Login Nodes, Compute Nodes and Shared Storage. The QUT HPC, Lyra, has around 150 compute nodes, 10000 CPUs, 88TB of memory and 20 petabytes of storage. The parts are racked together in an offsite datacentre in Brisbane. People refer to the parts by different names, but they generally mean the same thing. Head Node, Login Node and similar terms all refer to the individual server you log on to when you connect to the HPC.
HPC Software
HPCs typically run the Linux operating system, so your software needs to be Linux compatible. The particular Linux distribution is not important, as much of it is replaced or not used while running your software. Since the HPC is used by many people, we have to be careful how we install software so that one person’s needs do not conflict with anyone else’s. We achieve this by either partially installing the software (modules) or installing it to your home folder.
Accessing the HPC
The HPC is a server system, which means you do not walk up to it and use it directly. You must connect to the HPC over the network using the SSH protocol. Lyra has been configured to use your QUT username and password, but your account must be enabled first; access to the HPC is not automatic for QUT staff and students. To activate your account, visit the IT Helpdesk and search for HPC.
Storage on the HPC
On Lyra, there are a number of storage locations, with two being very important for you.
Your home folder is a space set aside for just you. You can store data here related to your use of the HPC that should not be shared. Everyone gets a home folder the first time they log in. It is not possible to share your home folder with anyone else.
The Shared Work area is a space for creating shared folders. Here we create folders for teams or projects that multiple people can have access to. You do not get a shared folder by default; it must be requested.
Both of these storage spaces are shared amongst all the nodes of Lyra, so when you run a job, it will automatically have access to the data.
There is other shared storage on Lyra, such as the software folder. The software folder contains all the applications installed for the users of the HPC. This folder is also shared with all the nodes.
Each of Lyra’s nodes has local high-speed NVMe storage that can be used while your job is running, but any data saved here must be copied to shared storage before the job finishes or it will be lost.
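As an illustration of that last step, here is a minimal Python sketch that copies everything from a scratch folder into a shared folder before a job ends. The function name copy_results_back and both folders are hypothetical; the example uses temporary directories as stand-ins, because the real locations of node-local and shared storage on Lyra are not given here.

```python
import shutil
import tempfile
from pathlib import Path


def copy_results_back(scratch_dir: Path, shared_dir: Path) -> list[Path]:
    """Copy every file under scratch_dir into shared_dir, keeping sub-folders."""
    copied = []
    for src in sorted(scratch_dir.rglob("*")):
        if src.is_file():
            dest = shared_dir / src.relative_to(scratch_dir)
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dest)  # copy2 also preserves timestamps
            copied.append(dest)
    return copied


if __name__ == "__main__":
    # Temporary directories stand in for node-local scratch and shared storage
    # (assumption: the real paths are site-specific and not shown here).
    with tempfile.TemporaryDirectory() as scratch, tempfile.TemporaryDirectory() as shared:
        scratch_dir, shared_dir = Path(scratch), Path(shared)
        (scratch_dir / "output").mkdir()
        (scratch_dir / "output" / "result.txt").write_text("42\n")
        for path in copy_results_back(scratch_dir, shared_dir):
            print("saved", path.relative_to(shared_dir))
```

In a real job, a step like this would run as the final command, after the calculation has written its output to the local storage.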
Accessing Storage
While logged on to Lyra, you can use the cd command to change to any folder and read, write, create or delete anything you have access to. To bring data in from outside the HPC, you can use tools like scp and WinSCP, or OS tools like the Windows File Explorer and Mac Finder.
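For readers who script their transfers, the following Python sketch shows the general shape of an scp upload command. It only builds and prints the command; it does not run it. The username, host name and file paths are placeholders, not Lyra's real details, and the helper scp_upload_command is a name made up for this example.

```python
import shlex


def scp_upload_command(local_path, username, host, remote_path):
    """Build the argument list for copying one local file to the HPC with scp."""
    # scp takes the source first, then the destination as user@host:path
    return ["scp", local_path, f"{username}@{host}:{remote_path}"]


if __name__ == "__main__":
    # Hypothetical username, host name and paths; substitute your own.
    cmd = scp_upload_command("data.csv", "jsmith", "lyra.example.edu", "project/data.csv")
    print(shlex.join(cmd))
```

A remote path that does not start with a slash is interpreted relative to your home folder on the remote machine.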
HPC Scheduling System
To ensure fair access for all users of the HPC, Lyra runs “scheduling” software called PBS Pro. To run software on Lyra, you must submit jobs to PBS. A job is a text file with instructions to PBS, in which you nominate the resources your software needs to carry out its task. The resources are: the number of CPUs, the amount of memory, how long the job will take, and other options such as the number and type of GPUs. The last part of the job is the commands that perform your calculations.
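To make the structure of a job file concrete, here is a small Python sketch that assembles one as text. The #PBS -N and #PBS -l lines follow common PBS Pro usage, but the exact resource names and limits accepted on Lyra are an assumption here (ngpus in particular is site-dependent), and build_pbs_job and the example program name are hypothetical.

```python
def build_pbs_job(name, ncpus, mem_gb, walltime, commands, ngpus=0):
    """Return the text of a PBS job script requesting the given resources."""
    # Resource request: one chunk with the CPUs and memory the software needs
    select = f"select=1:ncpus={ncpus}:mem={mem_gb}gb"
    if ngpus:
        select += f":ngpus={ngpus}"  # assumption: the site exposes GPUs as "ngpus"
    lines = [
        "#!/bin/bash",
        f"#PBS -N {name}",
        f"#PBS -l {select}",
        f"#PBS -l walltime={walltime}",  # how long the job may run, HH:MM:SS
        "",
        "# Run from the folder the job was submitted from",
        'cd "$PBS_O_WORKDIR"',
        "",
        *commands,  # the commands that perform the calculations
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # "./my_analysis" is a placeholder for your own program.
    print(build_pbs_job("example", ncpus=4, mem_gb=16, walltime="02:00:00",
                        commands=["./my_analysis input.dat > output.txt"]))
```

The printed text would be saved to a file and handed to PBS with the qsub command. In practice most people write the job file directly in a text editor; the function only shows which parts it contains.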
Once PBS receives your job, it will look across all the nodes for space to run it. If space is available, PBS will reserve the resources you requested exclusively for your use. If there is not enough space to run your job immediately, it will be placed in a queue. As other jobs finish and their resources are released, PBS will use the queues to find the next job to start.
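The behaviour described above can be pictured with a toy model. The Python sketch below is not how PBS Pro is implemented; it is a deliberately simplified illustration that tracks only CPUs, starts each job on the first node with enough free CPUs, and queues the rest. All names and numbers are made up.

```python
from collections import deque


def start_jobs(free_cpus, jobs, running, queue):
    """Start each job on the first node with enough free CPUs, else queue it."""
    for name, ncpus in jobs:
        node = next((n for n, free in free_cpus.items() if free >= ncpus), None)
        if node is None:
            queue.append((name, ncpus))  # no space right now: the job waits
        else:
            free_cpus[node] -= ncpus     # reserve the CPUs for this job only
            running[name] = (node, ncpus)


def finish_job(name, free_cpus, running, queue):
    """Release a finished job's CPUs, then retry the waiting jobs in order."""
    node, ncpus = running.pop(name)
    free_cpus[node] += ncpus
    waiting = list(queue)
    queue.clear()
    start_jobs(free_cpus, waiting, running, queue)


if __name__ == "__main__":
    free_cpus = {"node1": 8, "node2": 4}  # made-up nodes and CPU counts
    running, queue = {}, deque()
    start_jobs(free_cpus, [("a", 8), ("b", 4), ("c", 6)], running, queue)
    print("running:", running, "queued:", list(queue))
    finish_job("a", free_cpus, running, queue)
    print("running:", running, "queued:", list(queue))
```

In this example job c has to wait because no node has six free CPUs, and it starts as soon as job a finishes and frees node1. A real scheduler also weighs memory, time limits, GPUs and fairness between users.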
Working with the Results
Once your job has finished successfully, you will have the results you were looking for. You can examine the results in your terminal session, transfer them to your laptop to examine, or feed them into yet more work for Lyra.