
This guide describes, step by step, how to 1) convert BAM files (e.g., publicly available data) to FASTQ; and 2) run the Nextflow nf-core/sarek variant calling pipeline.


Code Block
conda activate liver

Prepare a file called environment.yml. Tip: use a text editor (e.g., vim or nano) to copy and paste the code below into the file.
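
As a rough sketch of what environment.yml might contain, the snippet below writes a minimal file from the shell. The environment name liver and the tools samtools and bedtools are taken from the commands used in this guide; the channel list and the absence of pinned versions are assumptions, so adjust them to match your site.

```shell
# Write a minimal environment.yml (assumed contents; add version pins if needed)
cat > environment.yml <<'EOF'
name: liver
channels:
  - conda-forge
  - bioconda
dependencies:
  - samtools
  - bedtools
EOF

# Show the file that was written
cat environment.yml
```

The environment can then be built with `conda env create -f environment.yml` and activated with `conda activate liver`.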


Move to the folder that contains the BAM files and prepare the following script (e.g., launch_BAM2FASTQ.pbs):

Code Block
#!/bin/bash -l
#PBS -l walltime=24:00:00
#PBS -l mem=8gb
#PBS -l ncpus=4


#activate the conda environment with the necessary tools
conda activate liver

#Sort reads in each BAM file by read name (-n) using 4 CPUs (-@ 4); -o names the sorted output file
for i in *.bam
do
  echo "$i"
  samtools sort -@ 4 -n -o "${i%%.bam}_sorted.bam" "$i"
done

#Extract paired-end reads in FASTQ format
for file in *sorted.bam
do
  echo "$file"
  bedtools bamtofastq -i "$file" -fq "${file%%.bam}_R1.fastq" -fq2 "${file%%.bam}_R2.fastq"
  #compress FASTQ files to run using the sarek pipeline
  gzip -c -9 "${file%%.bam}_R1.fastq" > "${file%%.bam}_R1.fastq.gz"
  gzip -c -9 "${file%%.bam}_R2.fastq" > "${file%%.bam}_R2.fastq.gz"
done
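
Once the job has finished, it may be worth checking that each pair of compressed FASTQ files holds the same number of reads before passing them to sarek. The sketch below is one possible check, not part of the original workflow: count_reads is a hypothetical helper, and the _R1.fastq.gz / _R2.fastq.gz suffixes are assumed to match the file names produced by the script above.

```shell
# Count reads in a gzipped FASTQ file (each read occupies 4 lines)
count_reads() {
  echo $(( $(gzip -dc "$1" | wc -l) / 4 ))
}

# Compare R1 and R2 read counts for every sample in the current folder
for r1 in *_R1.fastq.gz; do
  [ -e "$r1" ] || continue                  # skip when the glob matches nothing
  r2="${r1%_R1.fastq.gz}_R2.fastq.gz"
  n1=$(count_reads "$r1")
  n2=$(count_reads "$r2")
  if [ "$n1" -eq "$n2" ]; then
    echo "OK       $r1 ($n1 read pairs)"
  else
    echo "MISMATCH $r1: R1=$n1 R2=$n2"
  fi
done
```

A MISMATCH line suggests that the conversion was interrupted or that the BAM file contained unpaired reads.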

Submit the job to the PBS scheduler:
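
A minimal sketch of the submission step, assuming the script was saved as launch_BAM2FASTQ.pbs (the example name used above) and that the PBS client tools are on the PATH. The guard around qsub is only there so the snippet fails gracefully on machines without PBS.

```shell
# Script name assumed from the example above; change it if yours differs
JOB_SCRIPT=launch_BAM2FASTQ.pbs

if command -v qsub >/dev/null 2>&1; then
  qsub "$JOB_SCRIPT"      # prints the job identifier on success
else
  echo "qsub not found: run this on a node where PBS is available" >&2
fi
```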


Check the submitted job(s):
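
One common way to list your own jobs is qstat with the -u option. The snippet below is a sketch that assumes a standard PBS installation; PBS_USER is a hypothetical variable introduced here for illustration.

```shell
# Use the current login name; fall back to `id -un` if USER is unset
PBS_USER="${USER:-$(id -un)}"

if command -v qstat >/dev/null 2>&1; then
  qstat -u "$PBS_USER"    # list queued and running jobs for this user
else
  echo "qstat not found: run this on a node where PBS is available" >&2
fi
```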

Code Block


Create a conda environment with nf-core

Code Block
conda create --name nf-core python=3.8 nf-core nextflow
conda activate nf-core

Code Block
nf-core download sarek -r 3.1.2 --outdir nf-core-sarek -x none -c none