25S1W1 - 6. Run full pipeline with metadata


Aim:

  • Process metagenomics data using metadata groups (--metadata metadata.tsv) to enable the generation of alpha and beta diversity analyses.

Interactive HPC session

Open a Terminal (Mac users) or PuTTY (Windows users) and paste the text below into the command prompt to start an interactive session:

qsub -I -S /bin/bash -l walltime=4:00:00 -l select=1:ncpus=4:mem=8gb

The interactive session should start in less than a minute.

Running nf-core/ampliseq with metadata information

  • Metadata is optional to run the ampliseq pipeline, but for performing downstream analysis such as barplots, diversity indices (alpha and beta diversities) or differential abundance testing, a metadata file is essential.

  • The public data used in this workshop has no associated metadata, so we will use an artificially created metadata.tsv file that assigns the first 15 samples to a “control” group and the remaining samples to a group called “illumina” (the technology used to create the amplicon data).

Let’s create a working folder for this exercise and move to it:

mkdir $HOME/workshop/2025/S1W1/metagenomics/runs/run3_ampliseq_metadata
cd $HOME/workshop/2025/S1W1/metagenomics/runs/run3_ampliseq_metadata

Let’s copy the samplesheet.tsv, launch script and metadata file to the newly created folder:

cp /work/training/2025/S1W1/session3_metagenomics/runs/run3_ampliseq_metadata/samplesheet.tsv .
cp /work/training/2025/S1W1/session3_metagenomics/runs/run3_ampliseq_metadata/launch_nfcore_ampliseq_illumina_metadata.pbs .
cp /work/training/2025/S1W1/session3_metagenomics/runs/run3_ampliseq_metadata/metadata.tsv .
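Once the files are copied, a quick sanity check is that every sample in the samplesheet also has a row in the metadata file. The sketch below is one way to do that; it assumes both files are tab-delimited, have a header row, and keep the sample ID in column 1. The function names sample_ids and compare_ids are illustrative only and are not part of ampliseq.

```shell
#!/usr/bin/env bash
# Print the sorted, unique sample IDs found in column 1 of a
# tab-delimited file, skipping the header row.
sample_ids() {
    awk -F'\t' 'NR > 1 && $1 != "" { print $1 }' "$1" | sort -u
}

# Print sample IDs that appear in only one of the two files.
# Unindented lines are found only in the first file (samplesheet);
# tab-indented lines are found only in the second file (metadata).
compare_ids() {
    local sheet_ids meta_ids
    sheet_ids=$(mktemp)
    meta_ids=$(mktemp)
    sample_ids "$1" > "$sheet_ids"
    sample_ids "$2" > "$meta_ids"
    # comm -3 suppresses the lines common to both sorted inputs.
    comm -3 "$sheet_ids" "$meta_ids"
    rm -f "$sheet_ids" "$meta_ids"
}

# Example (run in the working folder):
# compare_ids samplesheet.tsv metadata.tsv
```

No output means the two files list the same set of sample IDs.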

Print the content of the metadata file (e.g., cat metadata.tsv):

ID	condition
Illumina1	control
Illumina2	control
Illumina3	control
Illumina4	control
Illumina5	control
Illumina6	control
Illumina7	control
Illumina8	control
Illumina9	control
Illumina10	control
Illumina11	control
Illumina12	control
Illumina13	control
Illumina14	control
Illumina15	control
Illumina16	illumina
Illumina17	illumina
Illumina18	illumina
Illumina19	illumina
Illumina20	illumina
Illumina21	illumina
Illumina22	illumina
Illumina23	illumina
Illumina24	illumina
Illumina25	illumina
Illumina26	illumina
Illumina27	illumina
Illumina28	illumina
Illumina29	illumina
Illumina30	illumina
Illumina31	illumina
Illumina32	illumina
Illumina33	illumina
Illumina34	illumina
Illumina35	illumina
Illumina36	illumina
Illumina37	illumina
Illumina38	illumina
Illumina39	illumina
Illumina40	illumina
Illumina41	illumina
Illumina42	illumina
Illumina43	illumina
Illumina44	illumina
Illumina45	illumina
Illumina46	illumina
Illumina47	illumina
Illumina48	illumina
Illumina49	illumina
Illumina50	illumina
Illumina51	illumina
Illumina52	illumina
Illumina53	illumina
Illumina54	illumina
Illumina55	illumina
Illumina56	illumina
Illumina57	illumina
Illumina58	illumina
Illumina59	illumina
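Before launching the pipeline, it can help to confirm the group sizes in the metadata file. The sketch below assumes a tab-delimited file with a header row and the group label in column 2, as in the listing above. The function name count_by_condition is illustrative only.

```shell
#!/usr/bin/env bash
# Count samples per group in a tab-delimited metadata file.
# Assumes a header row and the group label in column 2.
count_by_condition() {
    awk -F'\t' '
        NR > 1 && NF >= 2 { n[$2]++ }
        END { for (g in n) print g "\t" n[g] }
    ' "$1" | sort
}

# Example (run in the folder that holds metadata.tsv):
# count_by_condition metadata.tsv
```

For the file listed above, this should report 15 samples in the control group and 44 in the illumina group.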

Print the content of the launch script:

cat launch_nfcore_ampliseq_illumina_metadata.pbs

The parameters:

  • -r 2.9.0 runs version 2.9.0 of the ampliseq workflow. This is important for version control.

  • -profile singularity tells Nextflow to run each pipeline step inside a Singularity container, which is the container type we use on the HPC.

  • --single_end Our data is single-end, so this parameter is required. Paired-end is the default, so paired-end data needs no extra parameter.

  • --ignore_failed_trimming Some of the samples in the public dataset are of poor quality and fail the adapter trimming step. We ignore these failures in this practice session, but with your own dataset you will want to address them in other ways (e.g. re-sequence samples, remove them as outliers, etc.).

  • --input "data/samplesheet.tsv" The samplesheet you created. In this case it must be in a ‘data’ subdirectory, but you can keep it anywhere you like as long as you provide the full path to it.

  • --FW_primer "GGATTAGATACCCBRGTAGTC" --RV_primer "TCACGRCACGAGCTGACGAC" The forward and reverse primers are taken from the paper “Comparison of Illumina versus Nanopore 16S rRNA Gene Sequencing of the Human Nasal Microbiota”, which states:

The hypervariable V5 and V6 regions (276 base pairs—bp) of the 16S rRNA gene were amplified using the 785F (5′-GGA TTA GAT ACC CBR GTA GTC-3′) and 1061R (5′-TCA CGR CAC GAG CTG ACG AC-3′) primers [20]

  • --outdir results The output directory for results. You can call this whatever you like.

  • --metadata The path to a tab-delimited file with the sample “ID” in column 1 and the “condition” in column 2 (see above).
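The parameters above can be assembled into a single Nextflow call. The block below is only a sketch built from that list, not a copy of the workshop launch script: the real .pbs file is expected to also contain PBS directives and environment setup that are not reproduced here, and the file paths are taken from the examples in this page. The array name ampliseq_cmd is illustrative only.

```shell
#!/usr/bin/env bash
# Sketch of the Nextflow call described by the parameters above.
# Assumption: the real launch script also carries PBS directives
# (walltime, CPUs, memory) and environment setup not shown here.
ampliseq_cmd=(
    nextflow run nf-core/ampliseq
    -r 2.9.0
    -profile singularity
    --single_end
    --ignore_failed_trimming
    --input "data/samplesheet.tsv"
    --FW_primer "GGATTAGATACCCBRGTAGTC"
    --RV_primer "TCACGRCACGAGCTGACGAC"
    --metadata "metadata.tsv"
    --outdir results
)

# Print the assembled command for review (dry run only).
printf '%s ' "${ampliseq_cmd[@]}"
echo

# To launch it for real, uncomment the next line:
# "${ampliseq_cmd[@]}"
```

Keeping the command in an array makes it easy to review or log the exact call before anything is submitted.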

 

Submit job to the cluster

qsub launch_nfcore_ampliseq_illumina_metadata.pbs

The job will take about one hour to complete.

Find the results for alpha and beta diversity in the ./results/qiime2/diversity folder:

results/qiime2/diversity/
├── alpha_diversity
│   ├── evenness_vector
│   ├── faith_pd_vector
│   ├── observed_features_vector
│   └── shannon_vector
├── beta_diversity
│   ├── bray_curtis_distance_matrix-condition
│   ├── bray_curtis_pcoa_results-PCoA
│   ├── jaccard_distance_matrix-condition
│   ├── jaccard_pcoa_results-PCoA
│   ├── unweighted_unifrac_distance_matrix-condition
│   ├── unweighted_unifrac_pcoa_results-PCoA
│   ├── weighted_unifrac_distance_matrix-condition
│   └── weighted_unifrac_pcoa_results-PCoA
└── WARNING The sampling depth of 500 seems too small for rarefaction.txt

Let’s now inspect precomputed results:

Windows PC: open File Explorer and type the address below to connect to your home directory on the HPC, then browse to the /workshop/2025/S1W1/session3_metagenomics folder

\\hpc-fs\home\workshop\2025\S1W1\session3_metagenomics

Mac: open Finder and press “command” + “k” to open the “Connect to Server” prompt, type the address below, then browse to the /workshop/2025/S1W1/session3_metagenomics folder

smb://hpc-fs/home/workshop/2025/S1W1/session3_metagenomics

Navigate to the runs/run3_ampliseq_metadata folder:

results/
├── barrnap
│   ├── rrna.arc.gff
│   ├── rrna.bac.gff
│   ├── rrna.euk.gff
│   ├── rrna.mito.gff
│   └── summary.tsv
├── cutadapt
│   ├── cutadapt_summary.tsv
│   ├── Illumina10.trimmed.cutadapt.log
│   ├── Illumina11.trimmed.cutadapt.log
│   ├── Illumina12.trimmed.cutadapt.log
│   ├── Illumina13.trimmed.cutadapt.log
│   ├── Illumina14.trimmed.cutadapt.log
│   ├── Illumina15.trimmed.cutadapt.log
│   ├── Illumina16.trimmed.cutadapt.log
│   ├── Illumina17.trimmed.cutadapt.log
│   ├── Illumina18.trimmed.cutadapt.log
│   ├── Illumina19.trimmed.cutadapt.log
│   ├── Illumina1.trimmed.cutadapt.log
│   ├── Illumina20.trimmed.cutadapt.log
│   ├── Illumina21.trimmed.cutadapt.log
│   ├── Illumina22.trimmed.cutadapt.log
│   ├── Illumina23.trimmed.cutadapt.log
│   ├── Illumina24.trimmed.cutadapt.log
│   ├── Illumina25.trimmed.cutadapt.log
│   ├── Illumina26.trimmed.cutadapt.log
│   ├── Illumina27.trimmed.cutadapt.log
│   ├── Illumina28.trimmed.cutadapt.log
│   ├── Illumina29.trimmed.cutadapt.log
│   ├── Illumina2.trimmed.cutadapt.log
│   ├── Illumina30.trimmed.cutadapt.log
│   ├── Illumina31.trimmed.cutadapt.log
│   ├── Illumina32.trimmed.cutadapt.log
│   ├── Illumina33.trimmed.cutadapt.log
│   ├── Illumina34.trimmed.cutadapt.log
│   ├── Illumina35.trimmed.cutadapt.log
│   ├── Illumina36.trimmed.cutadapt.log
│   ├── Illumina37.trimmed.cutadapt.log
│   ├── Illumina38.trimmed.cutadapt.log
│   ├── Illumina39.trimmed.cutadapt.log
│   ├── Illumina3.trimmed.cutadapt.log
│   ├── Illumina40.trimmed.cutadapt.log
│   ├── Illumina41.trimmed.cutadapt.log
│   ├── Illumina42.trimmed.cutadapt.log
│   ├── Illumina43.trimmed.cutadapt.log
│   ├── Illumina44.trimmed.cutadapt.log
│   ├── Illumina45.trimmed.cutadapt.log
│   ├── Illumina46.trimmed.cutadapt.log
│   ├── Illumina47.trimmed.cutadapt.log
│   ├── Illumina48.trimmed.cutadapt.log
│   ├── Illumina49.trimmed.cutadapt.log
│   ├── Illumina4.trimmed.cutadapt.log
│   ├── Illumina50.trimmed.cutadapt.log
│   ├── Illumina51.trimmed.cutadapt.log
│   ├── Illumina52.trimmed.cutadapt.log
│   ├── Illumina53.trimmed.cutadapt.log
│   ├── Illumina54.trimmed.cutadapt.log
│   ├── Illumina55.trimmed.cutadapt.log
│   ├── Illumina56.trimmed.cutadapt.log
│   ├── Illumina57.trimmed.cutadapt.log
│   ├── Illumina58.trimmed.cutadapt.log
│   ├── Illumina59.trimmed.cutadapt.log
│   ├── Illumina5.trimmed.cutadapt.log
│   ├── Illumina6.trimmed.cutadapt.log
│   ├── Illumina7.trimmed.cutadapt.log
│   ├── Illumina8.trimmed.cutadapt.log
│   └── Illumina9.trimmed.cutadapt.log
├── dada2
│   ├── args
│   ├── ASV_seqs.fasta
│   ├── ASV_table.tsv
│   ├── ASV_tax.silva_138.tsv
│   ├── ASV_tax_species.silva_138.tsv
│   ├── DADA2_stats.tsv
│   ├── DADA2_table.rds
│   ├── DADA2_table.tsv
│   ├── log
│   ├── QC
│   └── ref_taxonomy.silva_138.txt
├── fastqc
│   ├── Illumina10_fastqc.html
│   ├── Illumina11_fastqc.html
│   ├── Illumina12_fastqc.html
│   ├── Illumina13_fastqc.html
│   ├── Illumina14_fastqc.html
│   ├── Illumina15_fastqc.html
│   ├── Illumina16_fastqc.html
│   ├── Illumina17_fastqc.html
│   ├── Illumina18_fastqc.html
│   ├── Illumina19_fastqc.html
│   ├── Illumina1_fastqc.html
│   ├── Illumina20_fastqc.html
│   ├── Illumina21_fastqc.html
│   ├── Illumina22_fastqc.html
│   ├── Illumina23_fastqc.html
│   ├── Illumina24_fastqc.html
│   ├── Illumina25_fastqc.html
│   ├── Illumina26_fastqc.html
│   ├── Illumina27_fastqc.html
│   ├── Illumina28_fastqc.html
│   ├── Illumina29_fastqc.html
│   ├── Illumina2_fastqc.html
│   ├── Illumina30_fastqc.html
│   ├── Illumina31_fastqc.html
│   ├── Illumina32_fastqc.html
│   ├── Illumina33_fastqc.html
│   ├── Illumina34_fastqc.html
│   ├── Illumina35_fastqc.html
│   ├── Illumina36_fastqc.html
│   ├── Illumina37_fastqc.html
│   ├── Illumina38_fastqc.html
│   ├── Illumina39_fastqc.html
│   ├── Illumina3_fastqc.html
│   ├── Illumina40_fastqc.html
│   ├── Illumina41_fastqc.html
│   ├── Illumina42_fastqc.html
│   ├── Illumina43_fastqc.html
│   ├── Illumina44_fastqc.html
│   ├── Illumina45_fastqc.html
│   ├── Illumina46_fastqc.html
│   ├── Illumina47_fastqc.html
│   ├── Illumina48_fastqc.html
│   ├── Illumina49_fastqc.html
│   ├── Illumina4_fastqc.html
│   ├── Illumina50_fastqc.html
│   ├── Illumina51_fastqc.html
│   ├── Illumina52_fastqc.html
│   ├── Illumina53_fastqc.html
│   ├── Illumina54_fastqc.html
│   ├── Illumina55_fastqc.html
│   ├── Illumina56_fastqc.html
│   ├── Illumina57_fastqc.html
│   ├── Illumina58_fastqc.html
│   ├── Illumina59_fastqc.html
│   ├── Illumina5_fastqc.html
│   ├── Illumina6_fastqc.html
│   ├── Illumina7_fastqc.html
│   ├── Illumina8_fastqc.html
│   └── Illumina9_fastqc.html
├── input
│   ├── metadata.tsv
│   └── samplesheet.tsv
├── multiqc
│   ├── multiqc_data
│   ├── multiqc_plots
│   └── multiqc_report.html
├── overall_summary.tsv
├── phyloseq
│   └── dada2_phyloseq.rds
├── pipeline_info
│   ├── execution_report_2025-03-28_10-55-51.html
│   ├── execution_timeline_2025-03-28_10-55-51.html
│   ├── execution_trace_2025-03-28_10-55-51.txt
│   ├── params_2025-03-28_10-56-00.json
│   ├── pipeline_dag_2025-03-28_10-55-51.html
│   └── software_versions.yml
├── qiime2
│   ├── abundance_tables
│   ├── alpha-rarefaction
│   ├── ancom
│   ├── barplot
│   ├── diversity
│   ├── input
│   ├── phylogenetic_tree
│   ├── rel_abundance_tables
│   └── representative_sequences
└── summary_report
    ├── dada2_taxonomic_classification_per_taxonomy_level.svg
    ├── evenness_vector_spearman.svg
    ├── faith_pd_vector_spearman.svg
    ├── observed_features_vector_spearman.svg
    ├── rrna_detection_with_barrnap.svg
    ├── shannon_vector_spearman.svg
    ├── stacked_barchart_of_reads.svg
    ├── summary_report.html
    └── versions.yml

Move to the results/qiime2/diversity folder and evaluate the alpha diversity results. In particular, open the interactive “index.html” report for each alpha diversity metric generated:

results/qiime2/diversity/alpha_diversity/
├── evenness_vector
│   ├── column-condition.jsonp
│   ├── dist
│   ├── index.html
│   ├── kruskal-wallis-pairwise-condition.csv
│   ├── metadata.tsv
│   └── q2templateassets
├── faith_pd_vector
│   ├── column-condition.jsonp
│   ├── dist
│   ├── index.html
│   ├── kruskal-wallis-pairwise-condition.csv
│   ├── metadata.tsv
│   └── q2templateassets
├── observed_features_vector
│   ├── column-condition.jsonp
│   ├── dist
│   ├── index.html
│   ├── kruskal-wallis-pairwise-condition.csv
│   ├── metadata.tsv
│   └── q2templateassets
└── shannon_vector
    ├── column-condition.jsonp
    ├── dist
    ├── index.html
    ├── kruskal-wallis-pairwise-condition.csv
    ├── metadata.tsv
    └── q2templateassets
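Each alpha diversity folder also holds a kruskal-wallis-pairwise-condition.csv file, which can be read from the command line instead of through the HTML report. The sketch below prints one named column from a simple CSV file. It assumes a header row and no commas inside quoted fields, and the column name p-value used in the example is an assumption about the file's header that you should check against your own output. The function name csv_column is illustrative only.

```shell
#!/usr/bin/env bash
# Print one named column from a simple comma-separated file.
# Assumes a header row and no commas inside quoted fields.
csv_column() {
    local file=$1 name=$2
    awk -F',' -v name="$name" '
        NR == 1 {
            # Locate the requested column in the header row.
            for (i = 1; i <= NF; i++) if ($i == name) col = i
            if (!col) {
                print "column not found: " name > "/dev/stderr"
                exit 1
            }
            next
        }
        { print $col }
    ' "$file"
}

# Example (the column name is an assumption, check the header first):
# csv_column results/qiime2/diversity/alpha_diversity/shannon_vector/kruskal-wallis-pairwise-condition.csv p-value
```

Running head -n 1 on the file first shows which column names are actually available.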

Move to the results/qiime2/diversity folder and evaluate the beta diversity results. In particular, open the interactive “index.html” report for each beta diversity metric generated:

results/qiime2/diversity/beta_diversity/
├── bray_curtis_distance_matrix-condition
│   ├── control-boxplots.pdf
│   ├── control-boxplots.png
│   ├── illumina-boxplots.pdf
│   ├── illumina-boxplots.png
│   ├── index.html
│   ├── permanova-pairwise.csv
│   ├── q2templateassets
│   └── raw_data.tsv
├── bray_curtis_pcoa_results-PCoA
│   ├── css
│   ├── emperor.html
│   ├── img
│   ├── index.html
│   ├── js
│   ├── q2templateassets
│   ├── templates
│   └── vendor
├── jaccard_distance_matrix-condition
│   ├── control-boxplots.pdf
│   ├── control-boxplots.png
│   ├── illumina-boxplots.pdf
│   ├── illumina-boxplots.png
│   ├── index.html
│   ├── permanova-pairwise.csv
│   ├── q2templateassets
│   └── raw_data.tsv
├── jaccard_pcoa_results-PCoA
│   ├── css
│   ├── emperor.html
│   ├── img
│   ├── index.html
│   ├── js
│   ├── q2templateassets
│   ├── templates
│   └── vendor
├── unweighted_unifrac_distance_matrix-condition
│   ├── control-boxplots.pdf
│   ├── control-boxplots.png
│   ├── illumina-boxplots.pdf
│   ├── illumina-boxplots.png
│   ├── index.html
│   ├── permanova-pairwise.csv
│   ├── q2templateassets
│   └── raw_data.tsv
├── unweighted_unifrac_pcoa_results-PCoA
│   ├── css
│   ├── emperor.html
│   ├── img
│   ├── index.html
│   ├── js
│   ├── q2templateassets
│   ├── templates
│   └── vendor
├── weighted_unifrac_distance_matrix-condition
│   ├── control-boxplots.pdf
│   ├── control-boxplots.png
│   ├── illumina-boxplots.pdf
│   ├── illumina-boxplots.png
│   ├── index.html
│   ├── permanova-pairwise.csv
│   ├── q2templateassets
│   └── raw_data.tsv
└── weighted_unifrac_pcoa_results-PCoA
    ├── css
    ├── emperor.html
    ├── img
    ├── index.html
    ├── js
    ├── q2templateassets
    ├── templates
    └── vendor
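With this many folders, it can be convenient to list every interactive report in one go rather than browsing folder by folder. The sketch below is one way to do that with find; the function name list_reports is illustrative only.

```shell
#!/usr/bin/env bash
# List every interactive report (index.html) below a results folder,
# sorted by path.
list_reports() {
    find "$1" -type f -name 'index.html' | sort
}

# Example (run in the folder that holds results/):
# list_reports results/qiime2/diversity
```

For the diversity folder shown above, this should list one report per alpha diversity metric and one per beta diversity folder.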
