BaseSpace data download

Aim:

Download sequencing data generated for your project from Illumina's BaseSpace to your HPC account.

Download the data from BaseSpace

  • Create a BaseSpace account

  • Ask the project owner for access to the project (a link to the project should be enough).

Install the BaseSpace Sequence Hub CLI

source: CLI Overview

Log into QUT’s HPC using your account credentials (see above)

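Run the following commands (the first creates $HOME/bin if it does not already exist, since wget cannot write into a missing folder):

mkdir -p $HOME/bin
wget "https://launch.basespace.illumina.com/CLI/latest/amd64-linux/bs" -O $HOME/bin/bs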

then change the file permissions to make the downloaded binary executable:

chmod u+x $HOME/bin/bs
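
The bs commands in the following steps assume the shell can find the binary. A minimal sketch, assuming $HOME/bin is not already on your PATH, that adds it for the current session:

```shell
# Add $HOME/bin to PATH for this session only, unless it is already listed.
case ":$PATH:" in
    *":$HOME/bin:"*) ;;                      # already on PATH, nothing to do
    *) export PATH="$HOME/bin:$PATH" ;;
esac
```

To make this permanent you could add the export line to your shell start-up file (for example ~/.bashrc, assuming bash is your login shell).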

Authenticate the connection to the BaseSpace server:

bs auth

Find the project ID for the data you are interested in downloading. For example, you can derive this from the URL when logged into BaseSpace.
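
As an illustration of reading the ID from the URL, the sketch below assumes the project URL has the form .../projects/<ID>; the URL and the ID 123456789 are made-up placeholders, not a real project.

```shell
# Hypothetical project URL copied from the browser address bar.
url="https://basespace.illumina.com/projects/123456789"

# Keep only the run of digits that follows "/projects/".
project_id=$(printf '%s\n' "$url" | sed -n 's#.*/projects/\([0-9][0-9]*\).*#\1#p')
echo "$project_id"

# Alternatively, once authenticated, the CLI can list the projects you can access:
#   bs list projects
```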

NOTE: If you have not yet created a folder to store raw data, you can run, for example, the following command:
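
A possible command, assuming you keep raw data under a folder named raw_data in your home directory (the name and location are placeholders; use whatever suits your storage allocation):

```shell
# Create the raw data folder; -p avoids an error if it already exists.
mkdir -p "$HOME/raw_data"
```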

then you can create a subfolder for your project, for example:
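
For instance, with a hypothetical project folder named my_project inside the assumed raw_data folder:

```shell
# Create the project subfolder (and any missing parent folders).
mkdir -p "$HOME/raw_data/my_project"
```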

You are now ready to download the fastq.gz files.

Download the files using a PBS Pro script (named, for example, launch_fetch_BaseSpaceData.pbs):
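
A minimal sketch of such a script, written to disk with a here-document. The job name, resource requests, walltime, project ID (123456789) and output path are all placeholder assumptions to adapt to your project and your HPC's policies; the bs download project call with -i, -o and --extension follows the BaseSpace CLI usage.

```shell
# Write a minimal PBS Pro job script to the current folder.
# All values inside are illustrative placeholders, not site-specific settings.
cat > launch_fetch_BaseSpaceData.pbs <<'EOF'
#!/bin/bash
#PBS -N fetch_BaseSpaceData
#PBS -l select=1:ncpus=1:mem=4gb
#PBS -l walltime=02:00:00

# Start in the folder the job was submitted from.
cd "$PBS_O_WORKDIR"

# Replace 123456789 with your project ID and adjust the output folder.
"$HOME/bin/bs" download project -i 123456789 -o "$HOME/raw_data/my_project" --extension=fastq.gz
EOF
```

The --extension=fastq.gz option restricts the download to FASTQ files; drop it if you want every file in the project.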

Where: -i is the project ID number and -o is the path to the output folder where the data will be downloaded.

Submit the job to the PBS Pro scheduler (queue):
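
Submission would typically look like the sketch below, assuming the script is in the current folder and uses the example name from above. The check for qsub is only there so the snippet fails gently when run somewhere other than the HPC.

```shell
job_script=launch_fetch_BaseSpaceData.pbs

# qsub is only available where the PBS Pro client tools are installed.
if command -v qsub >/dev/null 2>&1; then
    qsub "$job_script"        # prints the job ID on success
else
    echo "qsub not found: run this on the HPC login node" >&2
fi
```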

Monitor progress:
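
One way to do this, assuming the standard PBS Pro qstat client and that the USER environment variable holds your HPC username:

```shell
# List your own queued and running jobs (state Q = queued, R = running).
if command -v qstat >/dev/null 2>&1; then
    qstat -u "$USER"
    status_checked=yes
else
    echo "qstat not found: run this on the HPC login node" >&2
    status_checked=no
fi
```

You can also list the output folder from time to time to see the fastq.gz files arriving.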

It can take ~15-20 min to download ~170GB of data.