Aim

Analyse 10x Genomics* single cell RNA-Seq data.

This analysis workflow is split into two main sections: 1) ‘Upstream’ analysis on QUT’s HPC (high-performance computing cluster) using a Nextflow workflow, nf-core/scrnaseq; 2) ‘Downstream’ analysis in R, primarily using the Seurat package.

*Note: this workflow can be adapted to work with scRNA-Seq datasets generated by technologies other than 10x Genomics.

Requirements

An HPC account. If you do not have one, please request one here.
Access to an rVDI virtual desktop machine with 64GB RAM. For information about rVDI and requesting a virtual machine, see here.
Nextflow installed on your HPC home account. If you haven’t already installed Nextflow, do so by following the guide here.
Your scRNA-Seq data (FASTQ files) stored on the HPC. If you have difficulty transferring the files to the HPC, submit a support ticket here.
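
The Nextflow and FASTQ requirements above can be checked from an HPC terminal before starting. The following is a minimal sketch, not an official QUT script: the function names `check_nextflow` and `count_fastqs` are hypothetical, and it assumes your FASTQ files are gzipped and sit directly inside a single directory.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helpers; names and layout are assumptions.

# Report whether the nextflow executable is on your PATH.
check_nextflow() {
    if command -v nextflow >/dev/null 2>&1; then
        echo "nextflow: found"
    else
        echo "nextflow: NOT found"
        return 1
    fi
}

# Count gzipped FASTQ files directly inside the given directory.
count_fastqs() {
    local dir="$1"
    find "$dir" -maxdepth 1 -type f \
        \( -name '*.fastq.gz' -o -name '*.fq.gz' \) | wc -l | tr -d ' '
}
```

For example, run `check_nextflow` and then `count_fastqs /path/to/your/fastqs` (a placeholder path). A count of 0 suggests the transfer has not completed or the files are in a different directory.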

Connect to an rVDI virtual desktop machine

Ensure you already have access to a 64GB RAM rVDI virtual machine, or request access by following the guide here.

scRNA-Seq datasets are often very large and need a lot of memory to analyse. The downstream analysis is run in R on a Windows machine, and a typical PC is unlikely to have enough RAM, so we use virtual machines with 64GB RAM. You can also run the rest of this analysis from within the virtual machine.

To access and run your rVDI virtual desktop:

Go to https://rvdi.qut.edu.au/

Click on ‘VMware Horizon HTML Access’

Log on with your QUT username and password

*NOTE: you need to be connected to the QUT network first, either on campus or remotely via VPN.

1. nf-core/scrnaseq

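10x scRNA-Seq data is typically processed using Cell Ranger. The nf-core/scrnaseq pipeline can run Cell Ranger for you as one of its supported aligners. The pipeline documentation is here: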

https://nf-co.re/scrnaseq/

NOTE: sometimes your 10x data has already been processed by your sequencing company using Cell Ranger. In this case you can skip the nf-core/scrnaseq analysis and go straight to the downstream Seurat analysis.
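
If you do need to run the pipeline yourself, the sketch below shows one way to prepare its input and launch it. It is illustrative only. The helper names `make_samplesheet` and `run_pipeline`, the Illumina-style file naming (`<sample>_S1_L001_R1_001.fastq.gz`), the `singularity` profile, the `10XV3` protocol and all paths are assumptions. Check the parameter names against the nf-core/scrnaseq documentation for the version you run.

```shell
#!/usr/bin/env bash
# Sketch only: file naming, paths, profile and protocol are assumptions.

# Print a samplesheet (sample,fastq_1,fastq_2) with one row per
# *_R1_001.fastq.gz file, paired with the matching *_R2_001.fastq.gz name.
make_samplesheet() {
    local fastq_dir="$1"
    local r1 base sample
    echo "sample,fastq_1,fastq_2"
    for r1 in "$fastq_dir"/*_R1_001.fastq.gz; do
        [ -e "$r1" ] || continue      # no matching files: skip the literal pattern
        base=$(basename "$r1")
        # Sample name = everything before the Illumina "_S<number>" field.
        sample="${base%%_S[0-9]*}"
        echo "${sample},${r1},${fastq_dir}/${base/_R1_001/_R2_001}"
    done
}

# Launch the pipeline with Cell Ranger as the aligner (defined, not called here).
run_pipeline() {
    nextflow run nf-core/scrnaseq \
        -profile singularity \
        --input samplesheet.csv \
        --aligner cellranger \
        --protocol 10XV3 \
        --fasta /path/to/genome.fa \
        --gtf /path/to/genes.gtf \
        --outdir results
}
```

A typical use would be `make_samplesheet /path/to/your/fastqs > samplesheet.csv`, followed by `run_pipeline` once you have reviewed the samplesheet and replaced the placeholder reference paths.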

2. Seurat

Cell Ranger (and nf-core/scrnaseq) generates a default directory and file output structure for each sample, which we’ll use in R to complete our analysis. Each sample has a directory named after the sample, with an ‘outs’ subdirectory beneath it. This ‘outs’ directory contains various files and subdirectories. The subdirectory that contains the count matrix data we need for Seurat analysis is called ‘filtered_feature_bc_matrix’.
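
Before moving to R, it can help to confirm that every sample has the expected count matrix directory. The sketch below assumes the layout described above (`<sample>/outs/filtered_feature_bc_matrix`) and that all sample directories sit under one parent directory. The function name `list_count_matrices` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: assumes <parent>/<sample>/outs/filtered_feature_bc_matrix.

# For each sample directory under the given parent directory, report
# whether its filtered count matrix directory is present.
list_count_matrices() {
    local results_dir="$1"
    local sample_dir sample
    for sample_dir in "$results_dir"/*/; do
        [ -d "$sample_dir" ] || continue   # no subdirectories: nothing to report
        sample=$(basename "$sample_dir")
        if [ -d "${sample_dir}outs/filtered_feature_bc_matrix" ]; then
            echo "${sample}: ok"
        else
            echo "${sample}: missing"
        fi
    done
}
```

Running `list_count_matrices /path/to/cellranger/output` (a placeholder path) prints one line per sample. Each directory reported as ok is the kind of path that Seurat's `Read10X()` function takes as its `data.dir` argument in R.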

eResearch - single cell RNA-Seq analysis (10x Genomics)
