Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Table of Contents

...

  1. Open RStudio (you can type it in the Windows search bar)

  2. Create a new R script: ‘File’ → “New File” → “R script”

  3. Save this script where your samples folders are (‘File’ → ‘Save’). These should be on your H or W drive. Save the script file as scrnaseq.R

In the following sections you will be copying and running the R code into your scrnaseq.R script.

Cell Ranger (and nfcore/scrnaseq) generates a default folder and file output structure. There will be a main folder that contains all the sample subfolders (NOTE: this is where you must save your R script). Each sample folder will have an ‘outs’ subfolder. This ‘outs’ folder contains a ‘filtered_feature_bc_matrix’ folder, which contains the files that Seurat uses in its analysis.

...

You can manually set your working directory in RStudio by selecting ‘Session' -> 'Set working directory' -> 'Choose directory'. Choose the same directory as you saved your scrnaseq.R script, previous section. This will output the setwd(...) command with your working directory into the console window (bottom left panel). Copy this command to replace the default setwd(...) line in your R script.

...

Code Block
#### 2g. Processing expression data (dimensionality reduction) ####

# Normalise data
mat3 <- NormalizeData(mat2)

# Identification of variable features
mat3 <- FindVariableFeatures(mat3, selection.method = "vst", nfeatures = nrow(mat3))

# Scaling the data
all.genes <- rownames(mat3)
mat3 <- ScaleData(mat3, features = all.genes)

# Perform linear dimensional reduction (PCA. This may take several minutes to run)
mat3 <- RunPCA(mat3, features = VariableFeatures(object = mat3))

...

Code Block
#### 2j. Remove low quality or outlier cells ####

# Generate a violin plot.
VlnPlot(mat2, features = c("nFeature_RNA", "nCount_RNA"), ncol = 3)

# And a scatter plot
FeatureScatter(mat2, feature1 = "nCount_RNA", feature2 = "nFeature_RNA")

# Use these plot to choose and enter a minimum (nFeature_RNA > xxx) and maxiumum (nFeature_RNA < xxxx) below
mat3_filt <- subset(mat3, subset = nFeature_RNA > 200 & nFeature_RNA < 4000)

2k. Visualise gene expression by marker

In the section 2i (plotting dimensionality reduction data) we visualised total expression, i.e. all genes within each cell.

In this section we will visualise expression for specific markers within each cell, using the same dimensionality reduction data (PCA, UMAP and t-SNE) that we generated in the previous section.

This has a variety of uses: to identify patterns of differential expression between cells for specific markers, identify 'non-target' cells - i.e. expression of markers that are known to be not expressed in target cells, marker-based heterogeneity of expression, etc.

In section 2f you selected a set of markers. To confirm which markers they are:

Code Block
markers

If you wish to plot a different set of markers, you can do so by changing the set of markers in the markers <- c(..) code and re-running that line. You can choose 1 marker, or as many as you like. Be aware though that there will be a plot generated for every marker provided.

Now generate the plots. You can change the colours in the plots (cols = c("red", "lightgrey") and the point size (pt.size = 1). Note that the default colours show the lowest expression in red. This is so you can more easily see which cells don't express the diagnostic markers.

In addition to the dimensionality reduction plots, you can visualise expression for your selected markers with a heatmap (note: this requires at least 2 markers, preferably several, to be visually meaningful). You can change the colour range as you like by providing a high, centre and low colour (scale_fill_gradientn(colors = c("darkorange", "floralwhite", "dodgerblue4")))

Code Block
#### 2k. Visualise gene expression by marker ####

# PCA

p <- FeaturePlot(mat3_filt, features = markers, reduction = "pca", cols = c("red", "lightgrey"), pt.size = 1)
p

# Export as a 300dpi tiff
tiff_exp <- paste0(sample, "_pca_markers.tiff")
ggsave(file = tiff_exp, dpi = 300, compression = "lzw", device = "tiff", plot = p, width = 20, height = 20, units = "cm")
# Export as a pdf
pdf_exp <- paste0(sample, "_pca_markers.pdf")
ggsave(file = pdf_exp, device = "pdf", plot = p, width = 20, height = 20, units = "cm")


# UMAP

p <- FeaturePlot(mat3, features = markers, reduction = "umap", cols = c("red", "lightgrey"), pt.size = 1)
p

tiff_exp <- paste0(sample, "_umap_markers.tiff")
ggsave(file = tiff_exp, dpi = 300, compression = "lzw", device = "tiff", plot = p, width = 20, height = 20, units = "cm")
pdf_exp <- paste0(sample, "_umap_markers.pdf")
ggsave(file = pdf_exp, device = "pdf", plot = p, width = 20, height = 20, units = "cm")


# t-SNE

p <- FeaturePlot(mat3, features = markers, reduction = "tsne", cols = c("red", "lightgrey"), pt.size = 1)
p

tiff_exp <- paste0(sample, "_tsne_markers.tiff")
ggsave(file = tiff_exp, dpi = 300, compression = "lzw", device = "tiff", plot = p, width = 20, height = 20, units = "cm")
pdf_exp <- paste0(sample, "_tsne_markers.pdf")
ggsave(file = pdf_exp, device = "pdf", plot = p, width = 20, height = 20, units = "cm")


# Heatmap

p <- DoHeatmap(mat3, features = markers, group.bar = FALSE) + 
scale_fill_gradientn(colors = c("darkorange", "floralwhite", "dodgerblue4")) + 
theme(text = element_text(size = 16))
p

tiff_exp <- paste0(sample, "_hmap_markers.tiff")
ggsave(file = tiff_exp, dpi = 300, compression = "lzw", device = "tiff", plot = p, width = 20, height = 20, units = "cm")
pdf_exp <- paste0(sample, "_hmap_markers.pdf")
ggsave(file = pdf_exp, device = "pdf", plot = p, width = 20, height = 20, units = "cm")


# Clustered heatmap

# You can also cluster the cells and then plot a heatmap based on this cluster data, to see if expression of your chosen markers is related to clusters

# First, generate the clusters for your sample
# Choose the `resolution =` score carefully (between 0 and 1). A lower score means fewer clusters. 
# You can adjust this higher or lower to see how it affects your clusters.
mat3 <- FindNeighbors(mat3, dims = 1:10)
mat3 <- FindClusters(mat3, resolution = 0.5)

# Then generate the heatmap
p <- DoHeatmap(mat3, features = markers, raster = T) + 
scale_fill_gradientn(colors = c("darkorange", "floralwhite", "dodgerblue4")) + 
theme(text = element_text(size = 16)) + labs(color = "Dose (mg)")
p

tiff_exp <- paste0(sample, "_hmap_markers_clustered.tiff")
ggsave(file = tiff_exp, dpi = 300, compression = "lzw", device = "tiff", plot = p, width = 20, height = 20, units = "cm")

pdf_exp <- paste0(sample, "_hmap_markers_clustered.pdf")
ggsave(file = pdf_exp, device = "pdf", plot = p, width = 20, height = 20, units = "cm")

2l. Remove non-target cells