Aim

Assess sequence polymorphism in five horse genes of interest by comparing amplicon sequencing data from healthy and unhealthy horses.

...

Code Block
>EquCab3.0_Glucagon_ADIPOQ_201000001|ADIPOQ-201 cds:protein_coding
ATGGGACAATGTGTCTCTGGTTGTCTGACTAGATCAAGGAAAGACTATGTGTGTGTGTGTGCTTGTGCGTACATGTGTGTGCAAGTATGTGTATGTATGTATATGTATGTGTGTGTTTGGGTTGGGTGTGCTGTTTGGGGTCTGCTCTCATGGCTGACAGTGCAGATTTGGATTCCAGGACTCAGGATGCTGTTGCTCCAAGCTGTTCTATTGCTACTAGTCCTGCCGAGTCCGGGTGAGGTTACCACGACTGAAGAGACTCTGCCCAAGGAGGGCTGCGCAGGTTGGATGGCAGGCATCCCAGGGCATCCTGGCCACAATGGGACCCCAGGCCGTGATGGCAGAGATGGCACCCCTGGCGAGAAGGGTGAGAAAGGAGATCCAGGTCTTGTTGGGCCTAAGGGTGATGCTGGTGAAACTGGAGTGCCTGGAGTTGAAGGTCCCAGAGGCTTTCCGGGAATCCCAGGCAGGAAAGGAGAACCTGGAGAAAGTTCCTATGTATACCGCTCAGCATTCAGTGTAGGATTGGAGACCCGAGTCACCGTCCCCAATGTTCCCATTCGTTTTACCAAGATCTTCTACAATCAGCAAAACCACTATGATGGCAGCACGGGCAAATTCCACTGCAACATTCCTGGGCTGTACTACTTCTCCTACCACATCACAGTCTACTTGAAGGATGTGAAGGTCAGCCTCTACAAGAAGGACAAGGCTGTGCTCTTCACCTATGACCAGTACCAGGACAAGAACTTGGACCAGGCCTCAGGCTCTGTTCTCCTCTATCTGGAGAAGGGCGACCAAGTCTGGCTCCAGGTGTATGGGGATGGAGATCATAATGGGCTCTATGCCGATAATGTCAATGACTCCACCTTCACAGGCTTCCTTCTCTACCACGACACCAACTGA

ConsGenome pipeline: Creating a conda environment

See this tutorial for additional information: https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html

The ConsGenome workflow requires the following tools specified in an ‘environment.yml’ file:

Code Block
name: ConsGenome
channels:
  - bioconda
  - conda-forge
dependencies:
  - python=3.7.8
  - bwa=0.7.17
  - spades=3.15.2
  - samtools=1.7
  - bedtools=2.27.1
  - bcftools=1.9
  - blast=2.5.0
  - seqtk=1.3
  - trim-galore=0.6.2

Create a conda environment called ‘ConsGenome’

Code Block
conda env create -f environment.yml

Activate the environment to make the above tools available. NOTE: the ConsGenome environment must be activated before running Nextflow.

Code Block
conda activate ConsGenome
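
A quick way to confirm that activation worked is to check that the tools from environment.yml can be found on PATH. The following is a minimal sketch, not part of the ConsGenome pipeline. The executable names are assumptions, since a conda package name does not always match the command it installs (for example, the spades package provides spades.py and the blast package provides blastn, among others).

```python
import shutil

# Assumed executable names for the packages listed in environment.yml.
# Adjust this list if your installation exposes different commands.
EXPECTED_TOOLS = [
    "bwa", "spades.py", "samtools", "bedtools",
    "bcftools", "blastn", "seqtk", "trim_galore",
]


def missing_tools(tools, which=shutil.which):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(EXPECTED_TOOLS)
    if missing:
        print("Not found on PATH:", ", ".join(missing))
        print("Is the ConsGenome environment activated?")
    else:
        print("All expected tools found.")
```

Run it once after `conda activate ConsGenome`. Any tool it reports as missing suggests that the environment is not active or was not created completely.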

Deactivate the environment

Code Block
conda deactivate

Nextflow - ConsGenome pipeline

...

To run the pipeline, prepare the following:

  • index.csv - a file listing, for each sample, the sample ID and the paths to read1 and, if applicable, read2

  • nextflow.config - a file specifying parameter options, such as the genome/transcriptome/amplicon reference against which reads are mapped and a consensus sequence is predicted

  • launch.pbs - PBS Pro script to submit the ConsGenome job to the HPC cluster

Example index.csv file:

Code Block
sampleid,read1,read2
AP01,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/AP01_R1.fq.gz,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/AP01_R2.fq.gz
AP02,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/AP02_R1.fq.gz,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/AP02_R2.fq.gz
AP03,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/AP03_R1.fq.gz,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/AP03_R2.fq.gz
AP04,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/AP04_R1.fq.gz,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/AP04_R2.fq.gz
AP06,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/AP06_R1.fq.gz,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/AP06_R2.fq.gz
AP07,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/AP07_R1.fq.gz,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/AP07_R2.fq.gz
AP08,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/AP08_R1.fq.gz,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/AP08_R2.fq.gz
AP10,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/AP10_R1.fq.gz,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/AP10_R2.fq.gz
AP11,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/AP11_R1.fq.gz,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/AP11_R2.fq.gz
AP12,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/AP12_R1.fq.gz,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/AP12_R2.fq.gz
AP13,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/AP13_R1.fq.gz,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/AP13_R2.fq.gz
AP14,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/AP14_R1.fq.gz,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/AP14_R2.fq.gz
AP15,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/AP15_R1.fq.gz,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/AP15_R2.fq.gz
AP17,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/AP17_R1.fq.gz,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/AP17_R2.fq.gz
AP18,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/AP18_R1.fq.gz,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/AP18_R2.fq.gz
AP19,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/AP19_R1.fq.gz,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/AP19_R2.fq.gz
AP20,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/AP20_R1.fq.gz,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/AP20_R2.fq.gz
MC01,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/MC01_R1.fq.gz,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/MC01_R2.fq.gz
MC02,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/MC02_R1.fq.gz,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/MC02_R2.fq.gz
MC03,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/MC03_R1.fq.gz,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/MC03_R2.fq.gz
MC04,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/MC04_R1.fq.gz,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/MC04_R2.fq.gz
MC05,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/MC05_R1.fq.gz,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/MC05_R2.fq.gz
MC06,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/MC06_R1.fq.gz,/work/APP_dp18app/nextflow/data/QCed_fastq_pairs/MC06_R2.fq.gz
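
Writing index.csv by hand is error-prone when there are many samples. The sketch below builds it from a directory of FASTQ files, assuming the `_R1.fq.gz` / `_R2.fq.gz` naming seen in the example above. The function names are hypothetical helpers, not part of ConsGenome. Leaving read2 empty when no mate file exists is also an assumption about what the pipeline accepts for single-end data.

```python
import csv
from pathlib import Path


def build_index_rows(fastq_dir, r1_suffix="_R1.fq.gz", r2_suffix="_R2.fq.gz"):
    """Return (sampleid, read1, read2) tuples, one per R1 file in fastq_dir.

    read2 is an empty string when no matching R2 file exists.
    """
    rows = []
    for read1 in sorted(Path(fastq_dir).glob("*" + r1_suffix)):
        # The sample ID is the file name with the R1 suffix removed.
        sample_id = read1.name[: -len(r1_suffix)]
        read2 = read1.with_name(sample_id + r2_suffix)
        rows.append((sample_id, str(read1), str(read2) if read2.exists() else ""))
    return rows


def write_index(rows, out_path="index.csv"):
    """Write rows to a CSV file with the header used in the example above."""
    with open(out_path, "w", newline="") as handle:
        # Use plain "\n" line endings rather than the csv module default "\r\n".
        writer = csv.writer(handle, lineterminator="\n")
        writer.writerow(["sampleid", "read1", "read2"])
        writer.writerows(rows)
```

For the data above, `write_index(build_index_rows("/work/APP_dp18app/nextflow/data/QCed_fastq_pairs"))` should produce a file in the same layout as the example, with samples sorted by file name.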

Example nextflow.config file:

Code Block
params {
  outdir = "results"
  indexfile = "index.csv"
  genome = "/work/APP_dp18app/nextflow/data/ref/Equus_caballus_EquCan3.0_GCG_MC2R_MC4R_POMC_ADIPOQ_genes.cds.fasta"
  paired = true
}

process {
  withLabel: mapping {
    memory = 32.GB
  }
}
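
Before submitting the job, it can save a failed run to check that the files named in index.csv and nextflow.config actually exist. The sketch below is a hypothetical pre-flight check, not part of ConsGenome. Its arguments mirror the `indexfile`, `genome` and `paired` parameters above, and it assumes that `paired = true` means every sample needs a read2 file. The pipeline may perform its own validation.

```python
import csv
from pathlib import Path


def check_inputs(indexfile, genome, paired=True):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    if not Path(genome).is_file():
        problems.append(f"reference not found: {genome}")
    with open(indexfile, newline="") as handle:
        reader = csv.DictReader(handle)
        # read2 is only required when the run is paired-end (assumption).
        required = ["sampleid", "read1"] + (["read2"] if paired else [])
        absent = [c for c in required if c not in (reader.fieldnames or [])]
        if absent:
            return problems + [f"missing column(s): {', '.join(absent)}"]
        for row in reader:
            # Check each required read column (everything after sampleid).
            for column in required[1:]:
                path = (row[column] or "").strip()
                if not path:
                    problems.append(f"{row['sampleid']}: {column} is empty")
                elif not Path(path).is_file():
                    problems.append(f"{row['sampleid']}: {column} not found: {path}")
    return problems
```

Call `check_inputs` with the same values as in nextflow.config and print each returned problem. An empty list means the reference and all listed read files were found.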

...