Table of Contents

...

  1. Open RStudio (you can type it in the Windows search bar)

  2. Create a new R script: ‘File’ → ‘New File’ → ‘R Script’

  3. Save this script in the folder that contains your sample folders (‘File’ → ‘Save’). This folder should be on your H or W drive. Name the script file scrnaseq.R

In the following sections you will copy the R code into your scrnaseq.R script and run it from there.

Cell Ranger (and nf-core/scrnaseq) generates a default folder and file output structure. There is a main folder that contains all the sample subfolders (NOTE: this is where you must save your R script). Each sample folder has an ‘outs’ subfolder. This ‘outs’ folder contains a ‘filtered_feature_bc_matrix’ folder, which holds the files that Seurat uses in its analysis.
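As a quick sanity check of the layout described above, the sketch below lists every subfolder of the main folder and reports whether it contains the three expected matrix files. Note that check_cellranger_layout is a hypothetical helper written for this page, not a Seurat or Cell Ranger function, and it assumes the file names shown for the standard Cell Ranger output (barcodes.tsv.gz, features.tsv.gz, matrix.mtx.gz).

```r
# Hypothetical helper (not part of Seurat or Cell Ranger): checks that each
# sample folder under `main_dir` contains outs/filtered_feature_bc_matrix
# with the three files Seurat reads.
check_cellranger_layout <- function(main_dir = ".") {
  expected <- c("barcodes.tsv.gz", "features.tsv.gz", "matrix.mtx.gz")

  # Sample folders are assumed to be the immediate subfolders of the main folder
  samples <- list.dirs(main_dir, full.names = FALSE, recursive = FALSE)

  # TRUE for a folder only if all three files are present
  vapply(samples, function(s) {
    matrix_dir <- file.path(main_dir, s, "outs", "filtered_feature_bc_matrix")
    all(file.exists(file.path(matrix_dir, expected)))
  }, logical(1))
}

# Run from the main folder (where scrnaseq.R is saved)
print(check_cellranger_layout("."))
```

Any folder that is not a sample folder will simply be reported as FALSE, so only the entries for your samples need to be TRUE.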

...

You can manually set your working directory in RStudio by selecting ‘Session’ → ‘Set Working Directory’ → ‘Choose Directory’. Choose the directory where you saved your scrnaseq.R script in the previous section. RStudio will then print the setwd(...) command with your working directory in the console window (bottom left panel). Copy this command to replace the default setwd(...) line in your R script.

...

3a. Select a sample to work with and import the data into R

################################################################

#TO DO: Make sure the folder structure provided follows the Cell Ranger output format.

Code Block
├── filtered_feature_bc_matrix
│   ├── barcodes.tsv.gz
│   ├── features.tsv.gz
│   └── matrix.mtx.gz

Clarify how the upstream Nextflow pipeline outputs these files, and update the text to direct readers to the correct output folder.

Code Block
mat <- Read10X(data.dir = "H:/nfcore-scrnaseq_R_analysis/test_data/Choroid/filtered_feature_bc_matrix")

################################################################

In this section we’ll choose one of our samples to analyse, then import that sample’s scRNA-Seq dataset into R. You can re-run this and the following sections for each sample dataset that you have. Just replace the sample name in sample <- "xxxxxx" below.
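To illustrate how the sample name maps onto the folder layout, here is a minimal sketch that builds the matrix folder path from sample and only attempts the import if that folder exists. It assumes the working directory is the main folder and that each sample follows the ‘outs’ → ‘filtered_feature_bc_matrix’ layout; the variable name data_dir is introduced here for illustration, and the tutorial's own import code may differ.

```r
sample <- "xxxxxx"  # replace with the name of one of your sample folders

# Path to the matrix folder, relative to the working directory (the main folder).
# Assumes the default Cell Ranger layout: <sample>/outs/filtered_feature_bc_matrix
data_dir <- file.path(sample, "outs", "filtered_feature_bc_matrix")

if (requireNamespace("Seurat", quietly = TRUE) && dir.exists(data_dir)) {
  # Read10X reads barcodes.tsv.gz, features.tsv.gz and matrix.mtx.gz from data_dir
  mat <- Seurat::Read10X(data.dir = data_dir)
  # For a single feature type this is a sparse matrix of features (rows) by cells (columns)
  print(dim(mat))
} else {
  message("Could not import: check that '", data_dir,
          "' exists and that Seurat is installed.")
}
```

Checking the folder first gives a clearer message than the error Read10X raises when the directory is missing, which usually means the sample name or working directory is wrong.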

...

3d. Plot of highly variable genes

################################################################

Question for Paul:

The following command returns a warning.

...

Code Block
Warning: The default method for RunUMAP has changed from calling Python UMAP via reticulate to the R-native UWOT using the cosine metric
To use Python UMAP via reticulate, set umap.method to 'umap-learn' and metric to 'correlation'
This message will be shown once per session

################################################################

Using the FindVariableFeatures results, we can visualise the most highly variable genes, along with a count of the variable and non-variable genes in your dataset. The code below outputs the top 10 genes, but you can adjust this number as desired (i.e. in top_genes <- head(VariableFeatures(mat3), 10), change 10 to another number).

...

3e. PCA, UMAP and t-SNE plots (plotting dimensionality reduction data)

################################################################

Question for Paul:

One warning comes up when running RunUMAP:

Code Block
Warning: The default method for RunUMAP has changed from calling Python UMAP via reticulate to the R-native UWOT using the cosine metric
To use Python UMAP via reticulate, set umap.method to 'umap-learn' and metric to 'correlation'
This message will be shown once per session

################################################################

In section 3c we ran dimensionality reduction based on Principal Component Analysis (PCA).

...

3f. Remove low quality or outlier cells

################################################################

Question for Paul:

Code Block
Warning: Default search for "data" layer in "RNA" assay yielded no results; utilizing "counts" layer instead.

################################################################

From the Seurat website:

Seurat allows you to easily explore QC metrics and filter cells based on any user-defined criteria. A few QC metrics commonly used by the community include

...