...
Open RStudio (you can type it in the Windows search bar)
Create a new R script: ‘File’ → ‘New File’ → ‘R Script’
Save this script where your sample folders are (‘File’ → ‘Save’). These should be on your H or W drive. Save the script file as ‘scrnaseq.R’.
In the following sections you will copy the R code into your scrnaseq.R script and run it.
Cell Ranger (and nf-core/scrnaseq) generates a default folder and file output structure. There will be a main folder that contains all the sample subfolders (NOTE: this is where you must save your R script). Each sample folder will have an ‘outs’ subfolder. This ‘outs’ folder contains a ‘filtered_feature_bc_matrix’ folder, which contains the files that Seurat uses in its analysis.
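The folder layout described above can be checked from R before starting the analysis. The following is a minimal sketch using base R only. The helper name `check_sample_folders` and the demonstration folders are hypothetical, and the sketch assumes that every immediate subfolder of the main folder is a sample folder, so any other folders you keep there will be reported as lacking a matrix.

```r
# Hypothetical helper: report which sample folders contain the expected
# 'outs/filtered_feature_bc_matrix' subfolder.
check_sample_folders <- function(main_dir = ".") {
  # Immediate subfolders of the main folder are treated as samples (assumption)
  sample_dirs <- list.dirs(main_dir, full.names = TRUE, recursive = FALSE)
  matrix_dirs <- file.path(sample_dirs, "outs", "filtered_feature_bc_matrix")
  data.frame(
    sample = basename(sample_dirs),
    has_matrix = dir.exists(matrix_dirs),
    stringsAsFactors = FALSE
  )
}

# Demonstration on a throwaway folder tree (not your real data)
layout_dir <- file.path(tempdir(), "demo_main")
dir.create(file.path(layout_dir, "sampleA", "outs", "filtered_feature_bc_matrix"),
           recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(layout_dir, "sampleB", "outs"),
           recursive = TRUE, showWarnings = FALSE)
folder_check <- check_sample_folders(layout_dir)
print(folder_check)
```

On your own data you would call `check_sample_folders()` with no arguments after setting the working directory to the main folder.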
...
You can manually set your working directory in RStudio by selecting ‘Session’ → ‘Set Working Directory’ → ‘Choose Directory’. Choose the same directory where you saved your scrnaseq.R script in the previous section. This will output the setwd(...) command with your working directory into the console window (bottom left panel). Copy this command to replace the default setwd(...) line in your R script.
...
Code Block |
---|
#### 3j. Plot marker expression before and after filtration ####
# Select your set of markers (you could re-enter the same ones you used earlier, or use a different set):
## **USER INPUT**
markers <- c("P2ry12", "Tmem119")
# Select which type of plot you want to generate ("pca", "umap" or "tsne"):
## **USER INPUT**
redplot <- "pca"
## Before filtration ##
p <- FeaturePlot(mat3, features = markers, reduction = redplot, cols = c("lightgrey", "red"), pt.size = 1)
p
# Export as a pdf and tiff
tiff_exp <- paste0(sample, "_", redplot, "_markers.tiff")
ggsave(file = tiff_exp, dpi = 300, compression = "lzw", device = "tiff", plot = p, width = 20, height = 20, units = "cm")
pdf_exp <- paste0(sample, "_", redplot, "_markers.pdf")
ggsave(file = pdf_exp, device = "pdf", plot = p, width = 20, height = 20, units = "cm")
## After filtration ##
p <- FeaturePlot(mat3_filt, features = markers, reduction = redplot, cols = c("lightgrey", "red"), pt.size = 1)
p
# Export as a pdf and tiff
tiff_exp <- paste0(sample, "_", redplot, "_markers_filtered.tiff")
ggsave(file = tiff_exp, dpi = 300, compression = "lzw", device = "tiff", plot = p, width = 20, height = 20, units = "cm")
pdf_exp <- paste0(sample, "_", redplot, "_markers_filtered.pdf")
ggsave(file = pdf_exp, device = "pdf", plot = p, width = 20, height = 20, units = "cm")
|
...
3k. Output filtered results
Now we will export the filtered dataset (non-target cells removed, other filtration applied) for use in the next sections of this analysis workflow.
You need to run this entire '3. Analysis using the seurat package' section once for every sample you have, because the other sections rely on the output created here for each sample.
This will create a file called <samplename>_seurat_filtered.rds in your working directory. This contains all the data in your mat3_filt object, including raw counts, filtered counts, gene and sample IDs, dimensionality reduction data, etc.
Code Block |
---|
#### 3k. Output filtered results ####
# Output the filtered Seurat object as a file
# This file will be imported in the other sections of this workflow
# This is a large amount of data, so may take a few minutes
saveRDS(mat3_filt, file = paste0(sample, "_seurat_filtered.rds"))
|
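After exporting, it can be worth confirming that the file was written and can be read back. The sketch below shows the save-and-reload round trip with base R. The object `demo_obj`, the sample name `demo_sample` and the temporary output folder are stand-ins so that the sketch runs on its own; in the workflow you would use mat3_filt and your own sample name. For a real Seurat object, printing the reloaded object and comparing its summary with the original is probably a more practical check than `identical()`.

```r
# Stand-in object and sample name so the sketch runs on its own;
# in the workflow these would be mat3_filt and your own sample name
demo_obj <- list(counts = matrix(1:6, nrow = 2), note = "stand-in for mat3_filt")
demo_sample <- "demo"
out_file <- file.path(tempdir(), paste0(demo_sample, "_seurat_filtered.rds"))
saveRDS(demo_obj, file = out_file)

# Confirm the file was written and report its size in MB
stopifnot(file.exists(out_file))
cat("Saved", basename(out_file), "-", round(file.size(out_file) / 1024^2, 3), "MB\n")

# Read it back and check that it matches what was saved
reloaded <- readRDS(out_file)
roundtrip_ok <- identical(reloaded, demo_obj)
```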
4. Aggregate clustering
This section of the report involves combining (aggregating) two or more sample datasets. This allows a direct comparison of these samples, rather than analysing them separately.
IMPORTANT: To run this section, you must have processed all the samples you wish to combine in the '3. Analysis using the seurat package' section. This means that outliers and non-target cells have already been identified and removed. Section 3 only has to be run once for each sample, as it outputs a datafile for each sample that is imported into this section.
ALSO IMPORTANT: Aggregate datasets can be very large and utilise a lot of memory. Keep an eye on your memory usage (in the Environment tab in RStudio). If you use all available memory your rVDI machine may crash.
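Besides the Environment tab, memory use can also be checked from the console. The sketch below lists the largest objects in an environment and then frees one of them, using base R only. The helper name `largest_objects` and the demonstration objects are hypothetical. Note that `object.size()` gives an estimate of an object's size, not an exact figure.

```r
# Hypothetical helper: list the largest objects in an environment, in MB
largest_objects <- function(env = globalenv(), n = 5) {
  obj_names <- ls(envir = env)
  sizes_mb <- vapply(
    obj_names,
    function(nm) as.numeric(object.size(get(nm, envir = env))) / 1024^2,
    numeric(1)
  )
  head(sort(sizes_mb, decreasing = TRUE), n)
}

# Demonstration with stand-in objects in a separate environment
mem_env <- new.env()
assign("small", 1:10, envir = mem_env)
assign("big", numeric(1e6), envir = mem_env)
top_objects <- largest_objects(mem_env)
print(round(top_objects, 2))

# Remove objects you no longer need, then ask R to release the memory
rm("big", envir = mem_env)
invisible(gc())
```

In your own session, calling `largest_objects()` with no arguments reports on the objects shown in the Environment tab.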
4a. Choose samples to aggregate and import data
Running your sample through section 3 created a datafile called <samplename>_seurat_filtered.rds, so if your sample was called 'liver', the file would be liver_seurat_filtered.rds.
These datafiles will be in your working directory.
Code Block |
---|
#### 4a. Choose samples to aggregate and import data ####
# See which samples you've run through section 3
# This will list every <samplename>_seurat_filtered.rds data file that you've generated
dir(pattern = "seurat_filtered.rds")
# Enter the sample names you wish to aggregate from the list above (just the names, without the '_seurat_filtered.rds').
## **USER INPUT**
samples <- c("Cerebellum", "Cortex")
# Give your aggregate samples a name (for example, "high_low"). This is for naming plots and such.
sample <- "Cerebellum_Cortex"
# Now import the data files for those samples.
samplist <- list()
for (i in seq_along(samples)) {
  # Read each sample's data file and also store it under its sample name
  assign(samples[i], readRDS(paste0(samples[i], "_seurat_filtered.rds")))
  samplist[[i]] <- get(samples[i])
}
# Then merge these datasets (note that this tags your cells with your sample names)
# samplist[-1] passes all remaining samples as a list, so this works for two or more samples
data <- merge(samplist[[1]], y = samplist[-1], add.cell.ids = samples, project = sample)
# See a summary of your data object (note that 'samples' means cells and 'features' means genes)
data
# This should contain the combined number of cells from all your samples.
# You can see the number of cells in each sample by typing a sample name below.
## **USER INPUT**
Cerebellum
|
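A mistyped sample name in step 4a only shows up as an error when readRDS() fails to find the file. As a minimal sketch, the sample names can instead be derived from the data files and compared with the names you intend to aggregate before importing anything. The helper name `available_samples`, the demonstration folder and the empty placeholder files are hypothetical; 'Liver' is included only to show what a missing sample looks like.

```r
# Hypothetical helper: sample names that have a filtered data file in 'path'
available_samples <- function(path = ".") {
  files <- dir(path, pattern = "_seurat_filtered\\.rds$")
  sub("_seurat_filtered\\.rds$", "", files)
}

# Demonstration in a throwaway folder with empty placeholder files
agg_dir <- file.path(tempdir(), "demo_aggregate")
dir.create(agg_dir, showWarnings = FALSE)
invisible(file.create(file.path(
  agg_dir, c("Cerebellum_seurat_filtered.rds", "Cortex_seurat_filtered.rds")
)))

found <- available_samples(agg_dir)
wanted <- c("Cerebellum", "Cortex", "Liver")

# Report any requested samples that have no data file
missing_samples <- setdiff(wanted, found)
if (length(missing_samples) > 0) {
  message("No filtered data file found for: ", paste(missing_samples, collapse = ", "))
}
```

In the workflow, `available_samples()` with no arguments looks in your working directory, and `wanted` would be your own `samples` vector.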