Table of Contents |
---|
...
Open RStudio (you can type it in the Windows search bar)
Create a new R script: ‘File’ → “New File” → “R script”
Save this script where your samples folders are (‘File’ → ‘Save’). These should be on your H or W drive. Save the script file as ‘scrnaseq.R’
In the following sections you will be copying and running the R code into your scrnaseq.R script.
Cell Ranger (and nfcore/scrnaseq) generates a default folder and file output structure. There will be a main folder that contains all the sample subfolders (NOTE: this is where you must save your R script). Each sample folder will have an ‘outs’ subfolder. This ‘outs’ folder contains a ‘filtered_feature_bc_matrix’ folder, which contains the files that Seurat uses in its analysis.
...
You can manually set your working directory in RStudio by selecting ‘Session' -> 'Set working directory' -> 'Choose directory'. Choose the same directory as you saved your scrnaseq.R script, previous section. This will output the setwd(...)
command with your working directory into the console window (bottom left panel). Copy this command to replace the default setwd(...)
line in your R script.
...
Code Block |
---|
#### 5d.Differential expression: one cluster vs another cluster ####
# In this section you'll select to clusters to compare. Any upregulated genes (postive log fold change) are more highly expressed in the second cluster.
# First, select your baseline cluster:
## **USER INPUT**
declust <- 0
# Then, the cluster you want to compare it to:
## **USER INPUT**
declust2 <- 1
# Then run the Seurat DE function.
DE_genes <- FindMarkers(data, ident.1 = declust, ident.2 = declust2, logfc.threshold = 0.2)
# As in the previous section, you can run `tail(DE_genes)` to see if you need to reduce `logfc.threshold = 0.2` so as to capture all **significantly** DE genes.
tail(DE_genes)
# And again you can filter by p value and log fold change (i.e. change `p_val_adj` and `lfc_threshold` to suitable numbers), count the number of sig DE genes and view the top few.
DE_genes_sig <- DE_genes[DE_genes$p_val_adj < 0.05, ]
lfc_threshold <- 0.
DE_genes_sig_lfc <- DE_genes_sig[DE_genes_sig$avg_log2FC < -lfc_threshold | DE_genes_sig$avg_log2FC > lfc_threshold, ]
nrow(DE_genes_sig_lfc)
head(DE_genes_sig_lfc)
# Export the table of DE genes as a csv file:
write.csv(DE_genes_sig_lfc, paste0("DE_1vs1_clust", declust, "_vs_clust", declust2, "_", sample, ".csv"))
## Plotting ##
# Pull out the top 20 genes to plot (or change to the range or names of genes you want to plot, as in the previous section.)
plotgenes <- rownames(DE_genes_sig)[1:20]
# Scatter plot
p <- FeaturePlot(data, features = plotgenes, cols = c("red", "lightgrey"), reduction = "pca", pt.size = 1)
p
# Export as a pdf and tiff
tiff_exp <- paste0(sample, "clust_", declust, "clust_", declust2, "_DE_genes.tiff")
ggsave(file = tiff_exp, dpi = 300, compression = "lzw", device = "tiff", plot = p, width = 20, height = 20, units = "cm")
pdf_exp <- paste0(sample, "clust_", declust, "clust_", declust2, "_DE_genes.pdf")
ggsave(file = pdf_exp, device = "pdf", plot = p, width = 20, height = 20, units = "cm")
# Dot plot
p <- DotPlot(data, features = plotgenes, dot.scale = 8, cols = c("red", "lightgrey")) + RotatedAxis() +
ylab("Cluster") + xlab("Genes") +
theme(text = element_text(size = 18))
p
# Export as a pdf and tiff
tiff_exp <- paste0(sample, "clust_", declust, "clust_", declust2, "_DE_genes_dotplot.tiff")
ggsave(file = tiff_exp, dpi = 300, compression = "lzw", device = "tiff", plot = p, width = 20, height = 20, units = "cm")
pdf_exp <- paste0(sample, "clust_", declust, "clust_", declust2, "_DE_genes_dotplot.pdf")
ggsave(file = pdf_exp, device = "pdf", plot = p, width = 20, height = 20, units = "cm")
# Violin plots
p <- VlnPlot(data, features = plotgenes, cols = c25, pt.size = 0)
p
# Export as a pdf and tiff
tiff_exp <- paste0(sample, "clust_", declust, "clust_", declust2, "_DE_genes_violin.tiff")
ggsave(file = tiff_exp, dpi = 300, compression = "lzw", device = "tiff", plot = p, width = 20, height = 20, units = "cm")
pdf_exp <- paste0(sample, "clust_", declust, "clust_", declust2, "_DE_genes_violin.pdf")
ggsave(file = pdf_exp, device = "pdf", plot = p, width = 20, height = 20, units = "cm")
# Heatmap
p <- DoHeatmap(data, features = plotgenes, raster = T) +
scale_fill_gradientn(colors = c("darkorange", "floralwhite", "dodgerblue4")) +
theme(text = element_text(size = 16))
p
# Export as a pdf and tiff
tiff_exp <- paste0(sample, "clust_", declust, "clust_", declust2, "_DE_genes_heatmap.tiff")
ggsave(file = tiff_exp, dpi = 300, compression = "lzw", device = "tiff", plot = p, width = 20, height = 20, units = "cm")
pdf_exp <- paste0(sample, "clust_", declust, "clust_", declust2, "_DE_genes_heatmap.pdf")
ggsave(file = pdf_exp, device = "pdf", plot = p, width = 20, height = 20, units = "cm")
# You can go back to the start of this section and change declust <- and declust2 <- to compare different clusters.
|
5e. Differential expression: every cluster vs all other cells
Code Block |
---|
#### 5e. Differential expression: every cluster vs all other cells #### # In section 5c we compared a chosen cluster to all other clusters/cells. In this section we can automate this somewhat and do this for every cluster at the same time. # Unlike the other DE section, you don't need to choose any clusters to compare, just run the Seurat function: DE_genes <- FindAllMarkers(data, logfc.threshold = 0.2) # As with the previous sections, use `tail()` to see if you've captured all sig DE genes: tail(DE_genes[order(DE_genes$p_val_adj, decreasing = F),]) # Now filter by adjusted p and lfc (enter your own parameters), then re-order by cluster: DE_genes_sig <- DE_genes[DE_genes$p_val_adj < 0.05, ] DE_genes_sig <- DE_genes_sig[order(DE_genes_sig$cluster),] lfc_threshold <- 0.3 DE_genes_sig_lfc <- DE_genes_sig[DE_genes_sig$avg_log2FC < -lfc_threshold | DE_genes_sig$avg_log2FC > lfc_threshold, ] # You can see how many DE genes per cluster were found: summary(DE_genes_sig_lfc$cluster) # Then export this table of DE genes per cluster as a csv file: write.csv(DE_genes_sig, paste0("DE_eachclustVSall_", sample, ".csv")) |