2024-2: Transferring Data to and from the HPC

You will very likely need to transfer data both to and from the HPC.

Three methods are available for transferring data:

  • Windows File Explorer

  • SCP/SFTP

  • curl/wget on the server

Windows File Explorer

Using the Windows File Explorer, we can connect to our Home folder and the shared Work folder.

For the home folder we use \\hpc-fs\home in the address bar and for the shared Work folder we use \\hpc-fs\work in the address bar.

From here, the normal Windows file operations work: we can copy files to the HPC, copy files from the HPC, and copy, rename, or delete files in place.

When copying text files, you may be caught out by CRLF versus LF line endings. By default, Windows text editors such as Notepad end each line with a carriage return (CR) followed by a line feed (LF), whereas the Linux tools on the HPC expect LF alone, so the extra CR can cause scripts and data files to be misread. To fix this, run the dos2unix program on the file from the HPC command line; it removes the extra CR characters and makes the file compatible with the HPC.
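
As a minimal sketch of the conversion, the commands below build a small sample file with CRLF line endings, count the lines that contain a CR, convert the file, and count again. The file name crlf_demo.sh is hypothetical, and the tr fallback is our own addition for machines where dos2unix is not installed.

```shell
# Create a small sample file with Windows (CRLF) line endings.
printf 'echo hello\r\necho world\r\n' > crlf_demo.sh

# Hold a literal carriage return character in a variable (portable across shells).
cr=$(printf '\r')

# Count the lines that contain a carriage return before conversion.
before=$(grep -c "$cr" crlf_demo.sh)

# Convert in place with dos2unix when it is installed;
# otherwise strip the CR characters with tr (assumed fallback).
if command -v dos2unix >/dev/null 2>&1; then
    dos2unix crlf_demo.sh
else
    tr -d '\r' < crlf_demo.sh > crlf_demo.tmp && mv crlf_demo.tmp crlf_demo.sh
fi

# Count again; grep exits non-zero when nothing matches, hence "|| true".
after=$(grep -c "$cr" crlf_demo.sh || true)
echo "lines with CR: before=$before after=$after"
```

For a real file, the only step needed is dos2unix followed by the file name.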

If you regularly transfer data to and from the HPC, you can “Map a Drive”.

In File Explorer, right-click “This PC” and choose “Map Network Drive…”.

For your Home folder, choose a drive letter, then enter \\hpc-fs\home into the Folder box. Ensure “Reconnect at sign-in” is selected, then choose Finish.

For the Work folder, repeat the process, but enter \\hpc-fs\work in the Folder box.

Make sure you have your HPC Home folder mapped for the rest of the training.

To see this demonstrated, watch this video:

https://mediahub.qut.edu.au/media/t/0_ylaejs40

SCP/SFTP

You can also transfer files with tools such as WinSCP (or PSFTP from the PuTTY suite). WinSCP can convert the line endings of text files as they are copied.
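
If you prefer the command line and your computer has an OpenSSH client, the scp command does the same job. The sketch below only builds and prints the two commands so that you can check them first. The username, host name, and file name are hypothetical placeholders, not real HPC details; substitute your own.

```shell
# Hypothetical values: replace with your own username, login host, and file.
HPC_USER="your_username"
HPC_HOST="hpc.example.org"
FILE="results.csv"

# Upload: copy a local file into your home directory on the HPC.
upload_cmd="scp $FILE $HPC_USER@$HPC_HOST:~/"

# Download: copy the same file from the HPC into the current local directory.
download_cmd="scp $HPC_USER@$HPC_HOST:~/$FILE ."

# Print the commands for review; run them yourself once the values are correct.
echo "$upload_cmd"
echo "$download_cmd"
```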

curl/wget on the server

It is possible to download files directly on the HPC using tools like curl and wget.

Say we wanted to download the file:

ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR206/072/SRR20622172/SRR20622172.fastq.gz

we could use the command:

wget -nc ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR206/072/SRR20622172/SRR20622172.fastq.gz
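
The -nc (no-clobber) option tells wget to skip the download when the file already exists. curl has no direct equivalent, so a rough sketch of the same behaviour is shown below. The function name fetch_no_clobber is our own and is not part of curl.

```shell
# Download a URL with curl unless the target file already exists,
# approximating what wget's -nc (no-clobber) option does.
fetch_no_clobber() {
    url=$1
    file=${url##*/}    # file name = everything after the last '/'
    if [ -e "$file" ]; then
        echo "skip: $file already exists"
    else
        # -O saves the download under its remote file name.
        curl -O "$url"
    fi
}

# Usage, with the same file as the wget command above:
# fetch_no_clobber ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR206/072/SRR20622172/SRR20622172.fastq.gz
```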

If you are transferring many files, script the transfers into a PBS job, or run them on one of the data transfer nodes, eresdt001 or eresdt002.
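
A minimal sketch of such a PBS job follows. It writes a list of URLs to a file, then writes a job script that downloads every URL in that list. The file names urls.txt and download_job.pbs, the job name, and the resource requests are all assumptions for illustration; the exact resource syntax depends on how the scheduler is configured, so check the HPC documentation before submitting.

```shell
# List the files to download, one URL per line (hypothetical file name).
printf '%s\n' \
    'ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR206/072/SRR20622172/SRR20622172.fastq.gz' \
    > urls.txt

# Write the job script. The quoted 'EOF' stops the shell from expanding
# $PBS_O_WORKDIR now; the scheduler sets that variable when the job runs.
cat > download_job.pbs <<'EOF'
#!/bin/bash
#PBS -N download_data
#PBS -l select=1:ncpus=1:mem=1gb
#PBS -l walltime=01:00:00

# Start in the directory the job was submitted from.
cd "$PBS_O_WORKDIR"

# -i reads the URLs from a file; -nc skips files that already exist.
wget -nc -i urls.txt
EOF

# Submit the job from the HPC command line with:
# qsub download_job.pbs
```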

Downloading Data for the Next Module

Some data files are needed for the next module, so let's download a zip file, unzip it, then copy its contents to the HPC.

  1. Download this file: https://swcarpentry.github.io/shell-novice/data/shell-lesson-data.zip

  2. Locate the file in your downloads, right-click the file and choose “Extract All”

  3. Create a workshop folder in your HPC Home folder

  4. Create a 2024-2 folder inside the workshop folder in your HPC Home folder

  5. Drag the “shell-lesson-data” folder into the 2024-2 folder.
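
If you are already logged in to the HPC command line, steps 3 and 4 above can also be done there. This is only a sketch, and it assumes that your HPC Home folder is the directory that $HOME points to.

```shell
# Create the workshop folder and the 2024-2 folder inside it in one step.
# -p creates missing parent folders and does not complain if they exist.
mkdir -p "$HOME/workshop/2024-2"

# Confirm that the folder is there.
ls -d "$HOME/workshop/2024-2"
```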