...
If you haven’t been set up on the HPC or haven’t used it previously, click on this link for information on how to get access to and use the HPC:
Need a link here for HPC access and usage
Creating a shared workspace on the HPC
...
To request a node using PBS, submit a shell script containing your RAM/CPU/analysis time requirements and the code needed to run your analysis. For an overview of submitting a PBS job, see here:
Need a link here for creating PBS jobs
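As a rough illustration of what such a script can look like, the sketch below writes a small job file and shows how it would be submitted. It assumes PBS Professional-style resource syntax (`select=...:ncpus=...:mem=...`); Torque-based systems typically use `nodes=1:ppn=...` instead. The file name `sort_bams.pbs`, the job name, and all resource values are hypothetical placeholders, so check your HPC's documentation for the correct settings.

```shell
# Write a hypothetical PBS job script; every resource value below is a placeholder.
cat > sort_bams.pbs <<'EOF'
#!/bin/bash
#PBS -N sort_bams
#PBS -l select=1:ncpus=24:mem=64gb
#PBS -l walltime=04:00:00

# Start in the directory the job was submitted from
cd "$PBS_O_WORKDIR"

# Analysis commands go here, for example:
module load samtools
samtools --version
EOF

# Submit on the HPC with: qsub sort_bams.pbs
```

The `#PBS` lines carry the CPU, RAM, and run-time requests, and everything after them is the code that runs on the allocated node.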
Alternatively, you can start up an ‘interactive’ node, using the following:
...
Reads for (B):
Above length : 209322 (100%)
Below length : 0 (0%)
Post-demultiplexing file processing
The output files from lima are bam files, one for each sample. By default, each file is named after the barcode pair used for that sample. Your sequencing company should provide a samples table listing the barcode pairs and sample IDs. Rename each barcode-named file to the corresponding sample ID in this table.
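The renaming can be scripted instead of done by hand. The following is a minimal sketch, not a tested part of this workflow: it assumes the samples table has been saved as a two-column CSV (`barcode_pair,sample_id`) with a header row, and that each file is named exactly `<barcode_pair>.bam`. The function name `rename_by_sample_table` and the table layout are hypothetical, so adjust them to match the files you actually received.

```shell
# Hypothetical helper: rename <barcode_pair>.bam to <sample_id>.bam.
# Assumes a two-column CSV (barcode_pair,sample_id) with a header row.
rename_by_sample_table() {
    # Skip the header row and strip carriage returns left by Windows editors.
    tail -n +2 "$1" | tr -d '\r' | while IFS=, read -r barcode sample || [ -n "$barcode" ]
    do
        # Ignore blank or incomplete rows.
        [ -n "$barcode" ] && [ -n "$sample" ] || continue
        if [ -e "${barcode}.bam" ]; then
            # -n refuses to overwrite an existing file with the same sample ID.
            mv -n "${barcode}.bam" "${sample}.bam"
            echo "renamed ${barcode}.bam -> ${sample}.bam"
        else
            echo "no file found for ${barcode}" >&2
        fi
    done
}

# Usage, run in the directory containing the bam files:
#   rename_by_sample_table samples.csv
```

Barcode pairs with no matching file are reported on standard error, which makes it easy to spot samples that are missing or named differently.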
For other downstream analysis, such as using these samples in the ampliseq pipeline (recommended), these bam files need to be converted to fastq files.
First the bam files need to be sorted using samtools.
Run the following in each directory where the output bam files reside. Note that this loops through every bam file in the directory and deletes each unsorted file after sorting it, so ensure only the output bam files are in that directory.
Code Block |
---|
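module load samtools
# Sort every bam file in the current directory
for f1 in *.bam
do
echo "working with $f1"
# Delete the unsorted file only if sorting succeeded
samtools sort -@ 24 -o "${f1%.bam}.sorted.bam" "$f1" && rm "$f1"
done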
|
This sorts each bam file, writes the result to *.sorted.bam, and removes the unsorted original.
Then convert these sorted bam files to fastq files using the bedtools bamToFastq tool.
Code Block |
---|
module load bedtools
# Single-file form: bamToFastq -i filename.sorted.bam -fq filename.sorted.fq
# Convert every sorted bam file in the current directory
for f1 in *.sorted.bam
do
echo "working with $f1"
bamToFastq -i "$f1" -fq "${f1%.sorted.bam}.fastq"
done
|
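Before moving on, it can be worth checking that no reads were lost during conversion. The sketch below is an optional sanity check under stated assumptions: the helper name `fastq_read_count` is hypothetical, it assumes four lines per fastq record (which is how bamToFastq writes its output), and it assumes the sorted bam files are still in the same directory as the fastq files.

```shell
# Count reads in a fastq file, assuming the standard four lines per record.
fastq_read_count() {
    echo $(( $(wc -l < "$1") / 4 ))
}

# Compare each sorted bam with its fastq; skipped when samtools is not on PATH.
if command -v samtools >/dev/null 2>&1; then
    for bam in *.sorted.bam
    do
        # Skip the literal pattern when no sorted bam files are present.
        [ -e "$bam" ] || continue
        fq="${bam%.sorted.bam}.fastq"
        bam_n=$(samtools view -c "$bam")
        fq_n=$(fastq_read_count "$fq")
        if [ "$bam_n" = "$fq_n" ]; then
            echo "OK $bam: $bam_n reads"
        else
            echo "MISMATCH $bam: $bam_n in bam, $fq_n in fastq" >&2
        fi
    done
fi
```

`samtools view -c` counts every record in the bam file, so the two numbers should match for the unaligned bam files produced by lima.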
The fastq files can then be used in downstream analysis. Follow our ampliseq (16S analysis pipeline) guide here: