Examine your count table
Differential expression analysis starts from a count table: the number of sequence reads mapped to each defined gene region, per sample. There are several ways to generate this table; for this exercise, we will use the output of the Nextflow nf-core/rnaseq analysis you completed in the previous workshop sessions.
To access this count table:
Go to the W:\training\2024\rnaseq\run3_RNAseq\results folder, which contains the results from running the nf-core/rnaseq pipeline. The output folders from task 3 look like this:
├── results
│   ├── fastqc
│   ├── multiqc
│   ├── pipeline_info
│   ├── star_salmon
│   └── trimgalore
The count table can be found in the star_salmon folder. A list of files and folders in the star_salmon folder will look like this:
.
├── bigwig
├── CD49fmNGFRm_rep1
├── CD49fmNGFRm_rep1.markdup.sorted.bam
├── CD49fmNGFRm_rep1.markdup.sorted.bam.bai
├── CD49fmNGFRm_rep2
├── CD49fmNGFRm_rep2.markdup.sorted.bam
├── CD49fmNGFRm_rep2.markdup.sorted.bam.bai
├── CD49fmNGFRm_rep3
├── CD49fmNGFRm_rep3.markdup.sorted.bam
├── CD49fmNGFRm_rep3.markdup.sorted.bam.bai
├── CD49fpNGFRp_rep1
├── CD49fpNGFRp_rep1.markdup.sorted.bam
├── CD49fpNGFRp_rep1.markdup.sorted.bam.bai
├── CD49fpNGFRp_rep2
├── CD49fpNGFRp_rep2.markdup.sorted.bam
├── CD49fpNGFRp_rep2.markdup.sorted.bam.bai
├── CD49fpNGFRp_rep3
├── CD49fpNGFRp_rep3.markdup.sorted.bam
├── CD49fpNGFRp_rep3.markdup.sorted.bam.bai
├── deseq2_qc
├── dupradar
├── featurecounts
├── log
├── MTEC_rep1
├── MTEC_rep1.markdup.sorted.bam
├── MTEC_rep1.markdup.sorted.bam.bai
├── MTEC_rep2
├── MTEC_rep2.markdup.sorted.bam
├── MTEC_rep2.markdup.sorted.bam.bai
├── MTEC_rep3
├── MTEC_rep3.markdup.sorted.bam
├── MTEC_rep3.markdup.sorted.bam.bai
├── picard_metrics
├── qualimap
├── rseqc
├── salmon.merged.gene_counts_length_scaled.rds
├── salmon.merged.gene_counts_length_scaled.tsv
├── salmon.merged.gene_counts.rds
├── salmon.merged.gene_counts_scaled.rds
├── salmon.merged.gene_counts_scaled.tsv
├── salmon.merged.gene_counts.tsv   <---- We will use this feature counts file for DESeq2 expression analysis.
├── salmon.merged.gene_tpm.tsv
├── salmon.merged.transcript_counts.rds
├── salmon.merged.transcript_counts.tsv
├── salmon.merged.transcript_tpm.tsv
├── salmon_tx2gene.tsv
├── samtools_stats
└── stringtie
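Rather than scanning a long listing by eye, you can ask the shell to locate the file. The following is a minimal sketch, assuming a Unix-like shell and a results directory laid out as shown above; the function name find_gene_counts is a hypothetical helper introduced here for illustration, not part of the pipeline.

```shell
# Hypothetical helper: print the path of the merged gene-count table
# found anywhere beneath the directory given as the first argument.
find_gene_counts() {
    find "$1" -type f -name 'salmon.merged.gene_counts.tsv'
}

# Example call, run only if a "results" directory exists in the
# current working directory (an assumption about where you are).
if [ -d results ]; then
    find_gene_counts results
fi
```

Matching on the exact file name skips the similarly named length_scaled, scaled and tpm tables in the same folder.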
The expression count file that we are interested in is salmon.merged.gene_counts.tsv. To preview its first lines, run:
head salmon.merged.gene_counts.tsv
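By default, head prints only the first ten lines, so it does not tell you how large the table is. The sketch below reports the number of data rows and columns. It assumes the file is tab-delimited with a single header line, as the .tsv extension suggests; count_table_summary is a hypothetical helper name, not a pipeline command.

```shell
# Hypothetical helper: report the size of a tab-separated count table.
# Assumes line 1 is a header and every later line is one gene.
count_table_summary() {
    awk -F '\t' '
        NR == 1 { cols = NF }    # column count taken from the header line
        END { printf "rows=%d columns=%d\n", NR - 1, cols }
    ' "$1"
}

# Example call, run only if the file is in the current directory.
if [ -f salmon.merged.gene_counts.tsv ]; then
    count_table_summary salmon.merged.gene_counts.tsv
    # Preview the first five lines, limited to the first four columns.
    head -n 5 salmon.merged.gene_counts.tsv | cut -f 1-4
fi
```

The column count includes any gene identifier columns at the start of each row, so the number of samples is smaller than the reported value.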