...
Assemble the genomes of two Mortierella strains (Mortierella alpina and Mortierella sp.) sequenced at QUT.
Identify DAG (diacylglycerol) and phospholipase (PLA, PLB, PLC and/or PLD) genes.
...
Code Block |
---|
conda create -n aaftf "python>=3.6" bbmap trimmomatic bowtie2 bwa pilon sourmash blast minimap2 spades megahit novoplasty biopython -c bioconda |
Activate the conda environment:
Code Block |
---|
conda activate aaftf |
Install the aaftf pipeline as follows:
Code Block |
---|
python -m pip install git+https://github.com/stajichlab/AAFTF.git |
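Before running the pipeline, it can help to confirm that the tools installed into the environment are actually on the PATH. The snippet below is a minimal sketch; `check_tools` is a hypothetical helper name, and the executable names in the commented usage line are assumptions about what the conda packages provide (for example `blastn` from `blast`, `spades.py` from `spades`, `bbduk.sh` from `bbmap`).

```shell
# check_tools NAME...
# Report each executable that is not found on PATH.
# Returns 0 if all were found, 1 if at least one is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example (executable names are assumptions, adjust as needed):
# check_tools bowtie2 bwa minimap2 megahit blastn spades.py bbduk.sh
```

Run it with the environment activated; any name printed as missing points to a package that did not install or an executable with a different name.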
When finished, deactivate the environment:
Code Block |
---|
conda deactivate |
2. Running the AAFTF pipeline
...
Code Block |
---|
qjobs #alternatively use qstat -u USERNAME |
List of public genomes
https://www.ncbi.nlm.nih.gov/genome/browse/#!/eukaryotes/11236/
Fetch phospholipase proteins:
Code Block |
---|
#step1: decompress the file
gzip -d GCA_015679415.1_UCR_MalpAD072_1.0_protein.faa.gz
#step2: put each sequence on a single line and strip the "lcl|" prefix from the headers
python2.7 /work/speight_team/projects/yeast_genomes/scripts/extract_seqs.py GCA_015679415.1_UCR_MalpAD072_1.0_protein.faa 0 | sed 's/lcl|//' > GCA_015679415.1_UCR_MalpAD072_1.0_protein.mod.faa
#step3: check the file
less -S GCA_015679415.1_UCR_MalpAD072_1.0_protein.mod.faa
#step4: use grep to fetch the proteins of interest (each matching header plus its sequence line)
grep -A 1 "Phospholipase" GCA_015679415.1_UCR_MalpAD072_1.0_protein.mod.faa | sed '/^--$/d' > Phospholipase_proteins.fasta |
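If the `extract_seqs.py` script is not available, the same steps (single-line sequences, `lcl|` prefix removed, keyword filter on the header) could be approximated with awk alone. This is a sketch, not the script used above: `fasta_grep` is a hypothetical function name, and the match is case-insensitive by assumption, so it would also pick up headers containing "phospholipase" in lower case, which the `grep` call above would skip.

```shell
# fasta_grep KEYWORD FILE
# Print every FASTA record whose header contains KEYWORD (case-insensitive),
# with the sequence joined onto one line and a leading "lcl|" removed.
fasta_grep() {
    awk -v kw="$1" '
        BEGIN { kw = tolower(kw) }
        /^>/ {
            # New header: first flush the previous record if it was selected.
            if (keep && seq != "") print header "\n" seq
            keep = (index(tolower($0), kw) > 0)
            header = $0
            sub(/^>lcl[|]/, ">", header)
            seq = ""
            next
        }
        { seq = seq $0 }   # append wrapped sequence lines
        END { if (keep && seq != "") print header "\n" seq }
    ' "$2"
}

# Example, run on the decompressed protein file from step1:
# fasta_grep "Phospholipase" GCA_015679415.1_UCR_MalpAD072_1.0_protein.faa > Phospholipase_proteins.fasta
```

Because the filter works record by record, it needs no `-A 1` context and no cleanup of `--` separator lines.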