Bisulfite conversion WGBS

1 Test bisulfite conversion rate
- 1.1 Download data from BaseSpace.
- 1.2 Initial data management

Test bisulfite conversion rate

See email from Vikki 12-02-2023.

Hi Roberto and Paul,
The Bisulfate test run has completed if you could still help with checking the conversion efficiency?
Here’s the links to BaseSpace:
Run:
BaseSpace Sequence Hub
Project:
BaseSpace Sequence Hub
Sample “test 1” should be the converted sample and the “test 2” should be the non-converted DNA.
Thank you and let me know if you have any questions.
Best wishes,
Vikki

Download data from BaseSpace.

Followed this guide: Downloading data from BaseSpace

PBS job submitted:

#!/bin/bash -l
#PBS -N fetchBaseSpace
#PBS -l walltime=24:00:00
#PBS -l mem=32gb
#PBS -l ncpus=8
#PBS -m bae
#PBS -M paul.whatmore@qut.edu.au
#PBS -j oe

# Run from the directory the job was submitted from
cd "$PBS_O_WORKDIR"

# Fetch data from BaseSpace: -i is the project ID, -o the output directory
bs download project -i 382493133 -o /work/eresearch_bio/projects/CARF/Alison_Carey_WGBS/data --extension=fastq.gz

Raw data downloaded to /work/eresearch_bio/projects/CARF/Alison_Carey_WGBS/data
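Before moving anything, it may be worth confirming that the download completed cleanly. The following is a minimal sketch, not part of the original workflow: `check_fastq_gz` is a hypothetical helper that runs `gzip -t` on every `.fastq.gz` file under a directory and reports how many were checked and how many failed.

```shell
# Hypothetical helper (not part of the original workflow): test the gzip
# integrity of every .fastq.gz file found under the given directory.
check_fastq_gz() {
    dir="$1"
    n=0
    bad=0
    list=$(mktemp)
    find "$dir" -type f -name '*.fastq.gz' > "$list"
    while IFS= read -r f; do
        n=$((n + 1))
        # gzip -t decompresses to nowhere and fails on truncated/corrupt files
        if ! gzip -t "$f" 2>/dev/null; then
            echo "CORRUPT: $f"
            bad=$((bad + 1))
        fi
    done < "$list"
    rm -f "$list"
    echo "checked=$n corrupt=$bad"
    # Return non-zero if any file failed the test
    [ "$bad" -eq 0 ]
}

# Example call, using the download directory from the PBS job above:
# check_fastq_gz /work/eresearch_bio/projects/CARF/Alison_Carey_WGBS/data
```

For this project the expected result would be 8 files checked and none corrupt, matching the file count noted under initial data management.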

Initial data management

FASTQ files were moved to /work/eresearch_bio/projects/CARF/Alison_Carey_WGBS/fastq:

mv /work/eresearch_bio/projects/CARF/Alison_Carey_WGBS/data/**/*.fastq.gz /work/eresearch_bio/projects/CARF/Alison_Carey_WGBS/fastq
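Note that in bash, `**` only matches recursively when the `globstar` option is enabled (`shopt -s globstar`); otherwise it behaves like a single `*` and matches exactly one directory level. That is enough if each sample sits in its own subdirectory directly under the download directory, which is assumed here rather than confirmed. A sketch of a depth-independent alternative using `find` is shown below; `collect_fastq` is a hypothetical helper name.

```shell
# Hypothetical helper: move every .fastq.gz under SRC (at any depth) into DEST.
# Assumes DEST is not located inside SRC.
collect_fastq() {
    src="$1"
    dest="$2"
    mkdir -p "$dest"
    # -n: do not overwrite a file that already exists in DEST
    find "$src" -type f -name '*.fastq.gz' -exec mv -n {} "$dest"/ \;
}

# Example call, using the paths from above:
# collect_fastq /work/eresearch_bio/projects/CARF/Alison_Carey_WGBS/data \
#               /work/eresearch_bio/projects/CARF/Alison_Carey_WGBS/fastq
```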

The data comprise two paired-end samples, ‘test-1’ and ‘test-2’, spread across 8 files in the raw data. Each sample's reads are split across lanes, so the per-lane FASTQ files need to be concatenated into a single R1 file and a single R2 file per sample.

Doing this via an interactive PBS session:

qsub -I -S /bin/bash -l walltime=11:59:00 -l select=1:ncpus=32:mem=128gb

Concatenating lanes (run in fastq directory):

# Combining test 1 first read pairs
cat *test-1*R1_001.fastq.gz > test_1_R1.fastq.gz
# Combining test 1 second read pairs
cat *test-1*R2_001.fastq.gz > test_1_R2.fastq.gz
# Combining test 2 first read pairs
cat *test-2*R1_001.fastq.gz > test_2_R1.fastq.gz
# Combining test 2 second read pairs
cat *test-2*R2_001.fastq.gz > test_2_R2.fastq.gz
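Concatenating gzip files with `cat` works without decompressing because a series of gzip members is itself a valid gzip stream. A quick sanity check after concatenation is that R1 and R2 contain the same number of reads. The sketch below is an illustration only: `count_reads` and `check_pair` are hypothetical helpers, and the read count assumes standard four-line FASTQ records.

```shell
# Hypothetical helpers (not part of the original workflow).

# Count reads in a gzipped FASTQ file, assuming four lines per record.
count_reads() {
    lines=$(gzip -dc "$1" | wc -l)
    echo $((lines / 4))
}

# Report the read counts of an R1/R2 pair; returns non-zero if they differ.
check_pair() {
    r1=$(count_reads "$1")
    r2=$(count_reads "$2")
    echo "R1=$r1 R2=$r2"
    [ "$r1" -eq "$r2" ]
}

# Example calls, run in the fastq directory:
# check_pair test_1_R1.fastq.gz test_1_R2.fastq.gz
# check_pair test_2_R1.fastq.gz test_2_R2.fastq.gz
```

Running the check before deleting the per-lane files means a mismatch can still be investigated against the originals.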

The original per-lane data files can then be removed: rm Alison*