Preparing input feature counts table for DESeq2 analysis
The nf-core/rnaseq pipeline produces count tables whose values are not integers, but DESeq2 requires integer counts. One option for converting the values to integers is awk, a Unix command-line tool, as follows:
#convert counts to integers (int() truncates the decimals); the header line is passed through unchanged
awk 'BEGIN{FS=OFS="\t"} NR==1{print; next} {print $1, $2, int($3), int($4), int($5), int($6), int($7), int($8), int($9), int($10)}' salmon.merged.gene_counts.tsv > salmon.merged.gene_counts.integers.tsv
#remove the gene symbols column (column 2), which contains duplicated IDs
awk 'BEGIN{FS=OFS="\t"} {print $1, $3, $4, $5, $6, $7, $8, $9, $10}' salmon.merged.gene_counts.integers.tsv > salmon.merged.gene_counts.integers.GeneAcc.tsv
The commands above assume eight sample columns ($3 to $10). The number of replicates varies from one study to another, so adjust the commands manually by adding or removing the sample column references ($3, $4, etc.) to match your data.
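If editing the column list by hand for each study is inconvenient, both steps can be combined in a loop over the fields. The following is only a minimal sketch. It assumes a tab-separated table with one header line, gene IDs in column 1, gene symbols in column 2, and counts in every remaining column. The function name counts_to_integers is hypothetical and is not part of nf-core/rnaseq or DESeq2.

```shell
# Hypothetical helper: keep column 1 (gene IDs), drop column 2 (gene symbols),
# and truncate every remaining column to an integer, for any number of samples.
counts_to_integers() {
  awk 'BEGIN { FS = OFS = "\t" }
  {
    line = $1                        # gene ID column is kept unchanged
    for (i = 3; i <= NF; i++)        # start at 3 to skip the gene symbols column
      # header line (NR == 1): copy sample names as-is; data lines: truncate with int()
      line = line OFS (NR == 1 ? $i : int($i))
    print line
  }' "$1"
}
```

It could then be called as: counts_to_integers salmon.merged.gene_counts.tsv > salmon.merged.gene_counts.integers.GeneAcc.tsv. Note that int() truncates toward zero (1.9 becomes 1) and does not round to the nearest integer.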