Table of Contents |
---|
...
Open RStudio (you can type it in the Windows search bar)
Create a new R script: ‘File’ → “New File” → “R script”
Save this script where your samples folders are (‘File’ → ‘Save’). These should be on your H or W drive. Save the script file as ‘scrnaseq.R’
In the following sections you will be copying and running the R code into your scrnaseq.R script.
Cell Ranger (and nfcore/scrnaseq) generates a default folder and file output structure. There will be a main folder that contains all the sample subfolders (NOTE: this is where you must save your R script). Each sample folder will have an ‘outs’ subfolder. This ‘outs’ folder contains a ‘filtered_feature_bc_matrix’ folder, which contains the files that Seurat uses in its analysis.
...
You can manually set your working directory in RStudio by selecting ‘Session' -> 'Set working directory' -> 'Choose directory'. Choose the same directory as you saved your scrnaseq.R script, previous section. This will output the setwd(...)
command with your working directory into the console window (bottom left panel). Copy this command to replace the default setwd(...)
line in your R script.
...
This section is split into two three subsections - “i) Choosing the correct resolution”, where you’ll use the clustree tool to identify the optimal number of clusters, and “ii) Calculate the clusters” where you’ll use the clustree results to generate optimised clusters, and then “iii) Plot the results“, where you’ll generate 'before and after' plots, to visualise how filtration (inc. removing the non-target cells) changed the data structure.
...
Code Block |
---|
#### 2m(i). Choosing the correct resolution #### # Install and load the clustree package: install.packages("clustree") library(clustree) # Then generate a range of clusters, from 0 to 1, at 0.1 increments ("resolution = seq(0, 1, 0.1)"). mat3_clust <- FindNeighbors(mat3_filt, dims = 1:10) mat3_clust <- FindClusters(mat3_clust, resolution = seq(0, 1, 0.1), verbose = F) # Convert the results into a Seurat object, which can be used as input into "clustree()" clus_seurat <- CreateSeuratObject(counts = mat3_clust@assays$RNA@countsclust@assays$RNA$counts, meta.data = mat3_clust[[]]) clus_seurat[['TSNE']] <- CreateDimReducObject(embeddings = Embeddings(object = mat3_clust, reduction = "pca"), key = "tSNE_") # Generate the tree. # Refer to the clustree manual for tips on how to use this tree to choose the optimal `resolution` # https://cran.r-project.org/web/packages/clustree/vignettes/clustree.html clustree(clus_seurat, prefix = "RNA_snn_res.") + scale_color_manual(values=c25) + scale_edge_color_continuous(low = "blue", high = "red") |
...
ii) Calculate the clusters
In this subsection we’ll re-cluster the data based on the clustree results.
Choose the the resolution = xx
score in FindClusters()
to the optimal resolution that we can see in the clustree plot (e.g. 0.5, or 0.6, etc.).
Code Block |
---|
#### 2m(bii). Calculate the clusters #### # First you need to re-run the variable feature calculation, scaling, dimensionality reduction (PCA, t-SNE and UMAP). # This may take several minutes to run. mat3_filt <- FindVariableFeatures(mat3_filt, selection.method = "vst", nfeatures = nrow(mat3_filt)) all.genes <- rownames(mat3_filt) mat3_filt <- ScaleData(mat3_filt, features = all.genes) mat3_filt <- RunPCA(mat3_filt, dims = 1:3, verbose = F) mat3_filt <- RunUMAP(mat3_filt, dims = 1:3, verbose = F) mat3_filt <- RunTSNE(mat3_filt, dims = 1:3, verbose = F) # Then you generate a 'nearest neighbor' graph by calculating the neighborhood overlap (Jaccard index) # between every cell and identify clusters of cells based on shared nearest neighbor (SNN). mat3_filt <- FindNeighbors(mat3_filt, dims = 1:10) mat3_filt <- FindClusters(mat3_filt, resolution = 0.6) # You can see how many cells there are per cluster like so: cellcount <- as.data.frame(table(mat3_filt@meta.data[4])) names(cellcount) <- c("Cluster", "Cell_count") cellcount |
iii) Plot the results
Here we’ll plot ‘before and after’ filtration plots, to examine how filtration affected the data structure.
We’ll plot all clusters, for PCA, t-SNE and UMAP, and you can also choose an individual cluster to plot for any of these dimensionality reduction plot types.
Code Block |
---|
#### 2m(iii). Plot the results ####
## PCA Plot ##
# Before filtration:
p <- DimPlot(mat3, reduction = "pca", pt.size = 2, cols = c25) +
theme_bw() +
theme(legend.title=element_text(size=14), legend.text = element_text(size = 14), axis.title=element_text(size=18), axis.text=element_text(size=14)) + labs(color="Cluster")
p
# After filtration:
p <- DimPlot(mat3_filt, reduction = "pca", pt.size = 2, cols = c25) +
theme_bw() +
theme(legend.title=element_text(size=14), legend.text = element_text(size = 14), axis.title=element_text(size=18), axis.text=element_text(size=14)) + labs(color="Cluster")
p
# Export as pdf and tiff
tiff_exp <- paste0(sample, "_pca_filtered.tiff")
ggsave(file = tiff_exp, dpi = 300, compression = "lzw", device = "tiff", plot = p, width = 20, height = 20, units = "cm")
pdf_exp <- paste0(sample, "_pca_filtered.pdf")
ggsave(file = pdf_exp, device = "pdf", plot = p, width = 20, height = 20, units = "cm")
## PCA: Plot one individual cluster ##
# Here you can visualise a single cluster by colouring it a specific colour.
# First define the colours, based on the cluster information in your seurat data.
# You can change the background colours ("gray70") and the highlighted cluster colour ("red") to whatever you like.
newcols <- rep("gray70", length(levels(mat3_filt))-1)
newcols <- c(newcols, "red")
# Enter which cluster you want to visualise in the `myclust` variable below.
# This is based on the cluster numbers in the previous plot.
myclust <- 3
# Plot the cluster, placing this cluster on top (the `order` parameter).
p <- DimPlot(mat3_filt, reduction = "pca", pt.size = 2, cols = newcols, order = myclust) +
theme_bw() +
theme(legend.title=element_text(size=14), legend.text = element_text(size = 14), axis.title=element_text(size=18), axis.text=element_text(size=14)) + labs(color="Cluster")
p
# Export as a pdf and tiff
tiff_exp <- paste0(sample, "clust_", myclust, "_pca_filtered.tiff")
ggsave(file = tiff_exp, dpi = 300, compression = "lzw", device = "tiff", plot = p, width = 20, height = 20, units = "cm")
pdf_exp <- paste0(sample, "clust_", myclust, "_pca_filtered.pdf")
ggsave(file = pdf_exp, device = "pdf", plot = p, width = 20, height = 20, units = "cm")
|