Data
Experiment Accession | Sample | Run Accession (FASTQ) | Experiment Title | Organism Name | Instrument | Submitter | Study Accession | Study Title | Sample Accession | Total Size, Mb | Total Spots | Total Bases | Library Strategy | Library Source | Library Selection |
SRX14748451 | S1 | SRR18645307 | Homo sapiens | Homo sapiens | MinION | Drexel University | SRP367676 | Multiplex structural variant detection by whole-genome mapping and nanopore sequencing. | SRS12509856 | 821.1 | 348226 | 972620520 | OTHER | GENOMIC | other |
SRX19406878 | S2 | SRR23513621 | NA12878 DNA sequencing from nanopore WSG consortium - basecalled sequences (Guppy 6.1.3 super accuracy) | Homo sapiens | MinION | Garvan Institute of Medical Research | SRP421403 | Curated publicly available nanopore datasets | SRS16801715 | 78526.8 | 11173458 | 97545895593 | WGS | GENOMIC | RANDOM |
ERX8211413 | S3 | ERR8578833 | MinION sequencing | Homo sapiens | MinION | the university of hong kong | ERP135493 | Target enrichment sequencing and variant calling on medical exome using ONT MinION | ERS10590135 | 8961.02 | 9636172 | 10382057986 | Targeted-Capture | GENOMIC | PCR |
ERX8211414 | S4 | ERR8578834 | MinION sequencing | Homo sapiens | MinION | the university of hong kong | ERP135493 | Target enrichment sequencing and variant calling on medical exome using ONT MinION | ERS10590135 | 10669.72 | 10644000 | 12212807287 | Targeted-Capture | GENOMIC | PCR |
SRX13322984 | S5 | SRR17138639 | Nanopore targeted sequencing (ReadUntil/ReadFish) of NA12878-HG001- basecalled sequences | Homo sapiens | MinION | Garvan Institute of Medical Research | SRP349335 | Comprehensive genetic diagnosis of tandem repeat expansion disorders with programmable targeted nanopore sequencing | SRS11230712 | 6629.97 | 5513156 | 7815960904 | WGS | GENOMIC | other |
SRX13323057 | S6 | SRR17138566 | Nanopore targeted sequencing (ReadUntil/ReadFish) of NA12878-HG001- basecalled sequences | Homo sapiens | MinION | Garvan Institute of Medical Research | SRP349335 | Comprehensive genetic diagnosis of tandem repeat expansion disorders with programmable targeted nanopore sequencing | SRS11230747 | 17107.98 | 12278391 | 20238395479 | WGS | GENOMIC | other |
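The Total Spots and Total Bases columns are enough to get a rough feel for each run before any mapping. The sketch below copies those two columns from the table and derives the mean read length (bases divided by spots) and an approximate genome-wide depth. The genome size of 3.1 Gb is an assumption introduced here, not a value from the table, and for the targeted runs (S3 to S6) a genome-wide figure understates the depth over the targeted regions.

```shell
#!/bin/sh
set -eu

# Columns: sample, Total Spots, Total Bases (copied from the table above).
runs="S1 348226 972620520
S2 11173458 97545895593
S3 9636172 10382057986
S4 10644000 12212807287
S5 5513156 7815960904
S6 12278391 20238395479"

# Assumption: a haploid human genome of roughly 3.1 Gb.
genome_size=3100000000

# Mean read length = bases / spots (truncated to an integer);
# approximate depth = bases / genome size.
summary=$(printf '%s\n' "$runs" | awk -v g="$genome_size" '
    { printf "%s\t%d\t%.2f\n", $1, $3 / $2, $3 / g }')

printf 'sample\tmean_read_length\tapprox_depth\n'
printf '%s\n' "$summary"
```

Under this assumption S2 is the only run whose total yield exceeds 20x of the whole genome, which is worth keeping in mind because the wf-human-variation help shown later lists `--bam_min_coverage` with a default of 20.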
Mapping
Let’s run the pipeline with the --help option to list the available parameters:
module load java
nextflow run epi2me-labs/wf-alignment -profile singularity --help
N E X T F L O W ~ version 23.12.0-edge
Launching `https://github.com/epi2me-labs/wf-alignment` [nostalgic_galileo] DSL2 - revision: e1fd7a51dc [master]

WARN: Config setting `prov.formats` is not defined, no provenance reports will be produced

||||||||||      _____ ____ ___ ____  __  __ _____      _       _
||||||||||     | ____|  _ \_ _|___ \|  \/  | ____|    | | __ _| |__  ___
|||||          |  _| | |_) | |  __) | |\/| |  _| _____| |/ _` | '_ \/ __|
|||||          | |___|  __/| | / __/| |  | | |__|_____| | (_| | |_) \__ \
||||||||||     |_____|_|  |___|_____|_|  |_|_____|    |_|\__,_|_.__/|___/
||||||||||     wf-alignment v1.1.2-ge1fd7a5
--------------------------------------------------------------------------------
Typical pipeline command:

  nextflow run epi2me-labs/wf-alignment \
      --fastq 'wf-alignment-demo/fastq' \
      --references 'wf-alignment-demo/references'

Input Options
  --fastq                 [string]   FASTQ files to use in the analysis.
  --bam                   [string]   BAM or unaligned BAM (uBAM) files to use in the analysis.
  --analyse_unclassified  [boolean]  Analyse unclassified reads from input directory. By default the workflow will not process reads in the unclassified directory.
  --references            [string]   Path to a directory containing FASTA reference files.
  --reference_mmi_file    [string]   Path to an MMI index file to be used as reference.
  --counts                [string]   Path to a CSV file containing expected counts as a control.

Sample Options
  --sample_sheet          [string]   A CSV file used to map barcodes to sample aliases. The sample sheet can be provided when the input data is a directory containing sub-directories with FASTQ files.
  --sample                [string]   A single sample name for non-multiplexed data. Permissible if passing a single .fastq(.gz) file or directory of .fastq(.gz) files.

Output Options
  --out_dir               [string]   Directory for output of all workflow results. [default: output]
  --prefix                [string]   Optional prefix attached to each of the output filenames.

Advanced options
  --depth_coverage        [boolean]  Calculate depth coverage statistics and include them in the report. [default: true]
  --minimap_preset        [choice]   Pre-defined parameter sets for `minimap2`, covering most common use cases. [default: dna]
                                       * dna
                                       * rna
  --minimap_args          [string]   String of command line arguments to be passed on to `minimap2`.

Miscellaneous Options
  --threads               [integer]  Number of CPU threads to use for the alignment step. [default: 4]
  --disable_ping          [boolean]  Enable to prevent sending a workflow ping.

Other parameters
  --monochrome_logs       [boolean]  null
  --validate_params       [boolean]  null [default: true]
  --show_hidden_params    [boolean]  null

!! Hiding 4 params, use --show_hidden_params to show them !!
--------------------------------------------------------------------------------
If you use epi2me-labs/wf-alignment for your analysis please cite:

* The nf-core framework
  https://doi.org/10.1038/s41587-020-0439-x
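As a sketch of how these options combine for one sample, the script below assembles a wf-alignment command using only flags listed in the help output above. The paths (`fastq/S1.fastq.gz`, `references`, `results/S1`) and the thread count are hypothetical placeholders, not values from the original. By default the script only prints the command; it runs the pipeline when `RUN=1` is set.

```shell
#!/bin/sh
set -eu

# Hypothetical inputs: adjust to where your data actually lives.
sample="S1"
fastq="fastq/${sample}.fastq.gz"
references="references"          # directory containing FASTA reference files
out_dir="results/${sample}"

# Build the command as positional arguments so that quoting is preserved.
# Only options shown in the --help output are used.
set -- nextflow run epi2me-labs/wf-alignment \
    -profile singularity \
    --fastq "$fastq" \
    --references "$references" \
    --sample "$sample" \
    --out_dir "$out_dir" \
    --threads 8

alignment_cmd="$*"

if [ "${RUN:-0}" = "1" ]; then
    # Execute the pipeline.
    "$@"
else
    # Dry run: print the command instead of executing it.
    printf '%s\n' "$alignment_cmd"
fi
```

Printing first makes it easy to check the paths for each sample before committing cluster time; `RUN=1 sh run_alignment.sh` (script name hypothetical) would then launch the run.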
Variant calling
nextflow run epi2me-labs/wf-human-variation -profile singularity --help
N E X T F L O W ~ version 23.12.0-edge
Launching `https://github.com/epi2me-labs/wf-human-variation` [amazing_fourier] DSL2 - revision: 5651930a05 [master]

WARN: Config setting `prov.formats` is not defined, no provenance reports will be produced

||||||||||      _____ ____ ___ ____  __  __ _____      _       _
||||||||||     | ____|  _ \_ _|___ \|  \/  | ____|    | | __ _| |__  ___
|||||          |  _| | |_) | |  __) | |\/| |  _| _____| |/ _` | '_ \/ __|
|||||          | |___|  __/| | / __/| |  | | |__|_____| | (_| | |_) \__ \
||||||||||     |_____|_|  |___|_____|_|  |_|_____|    |_|\__,_|_.__/|___/
||||||||||     wf-human-variation v2.2.0-g5651930
--------------------------------------------------------------------------------
Typical pipeline command:

  nextflow run epi2me-labs/wf-human-variation \
      --bam 'wf-human-variation-demo/demo.bam' \
      --basecaller_cfg 'dna_r10.4.1_e8.2_400bps_hac_prom' \
      --mod \
      --ref 'wf-human-variation-demo/demo.fasta' \
      --sample_name 'DEMO' \
      --snp \
      --sv

Workflow Options
  --sv                          [boolean]  Call for structural variants.
  --snp                         [boolean]  Call for small variants
  --cnv                         [boolean]  Call for copy number variants.
  --str                         [boolean]  Enable Straglr to genotype STR expansions.
  --mod                         [boolean]  Enable output of modified calls to a bedMethyl file [requires input BAM with Ml and Mm tags]

Main options
  --sample_name                 [string]   Sample name to be displayed in workflow outputs. [default: SAMPLE]
  --bam                         [string]   BAM or unaligned BAM (uBAM) files for the sample to use in the analysis.
  --ref                         [string]   Path to a reference FASTA file.
  --basecaller_cfg              [choice]   Name of the model to use for selecting a small variant calling model. [default: dna_r10.4.1_e8.2_400bps_sup@v4.1.0]
                                             * dna_r10.4.1_e8.2_260bps_fast@v4.1.0
                                             * dna_r10.4.1_e8.2_260bps_hac@v4.1.0
                                             * dna_r10.4.1_e8.2_260bps_sup@v4.1.0
                                             * dna_r10.4.1_e8.2_400bps_fast@v4.1.0
                                             * dna_r10.4.1_e8.2_400bps_fast@v4.2.0
                                             * dna_r10.4.1_e8.2_400bps_fast@v4.3.0
                                             * dna_r10.4.1_e8.2_400bps_hac@v4.1.0
                                             * dna_r10.4.1_e8.2_400bps_hac@v4.3.0
                                             * dna_r10.4.1_e8.2_400bps_sup@v4.1.0
                                             * dna_r10.4.1_e8.2_400bps_sup@v4.3.0
                                             * dna_r9.4.1_e8_fast@v3.4
                                             * dna_r9.4.1_e8_hac@v3.3
                                             * dna_r9.4.1_e8_sup@v3.3
                                             * dna_r9.4.1_e8_sup@v3.6
                                             * custom
                                             * dna_r10.4.1_e8.2_260bps_hac@v4.0.0
                                             * dna_r10.4.1_e8.2_260bps_sup@v4.0.0
                                             * dna_r10.4.1_e8.2_400bps_hac
                                             * dna_r10.4.1_e8.2_400bps_hac@v3.5.2
                                             * dna_r10.4.1_e8.2_400bps_hac@v4.0.0
                                             * dna_r10.4.1_e8.2_400bps_hac@v4.2.0
                                             * dna_r10.4.1_e8.2_400bps_hac_prom
                                             * dna_r10.4.1_e8.2_400bps_sup@v3.5.2
                                             * dna_r10.4.1_e8.2_400bps_sup@v4.0.0
                                             * dna_r10.4.1_e8.2_400bps_sup@v4.2.0
                                             * dna_r9.4.1_450bps_hac
                                             * dna_r9.4.1_450bps_hac_prom
  --bam_min_coverage            [number]   Minimum read coverage required to run analysis. [default: 20]
  --bed                         [string]   An optional BED file enumerating regions to process for variant calling.
  --annotation                  [boolean]  SnpEff annotation. [default: true]
  --phased                      [boolean]  Perform phasing.
  --include_all_ctgs            [boolean]  Call for variants on all sequences in the reference, otherwise small and structural variants will only be called on chr{1..22,X,Y,MT}.
  --output_gene_summary         [boolean]  If set to true, the workflow will generate gene-level coverage summaries.
  --out_dir                     [string]   Directory for output of all workflow results. [default: output]

Structural variant calling options
  --tr_bed                      [string]   Input BED file containing tandem repeat annotations for the reference genome.

Structural variant benchmarking options
  --sv_benchmark                [boolean]  Benchmark called structural variants.

Copy number variant calling options
  --use_qdnaseq                 [boolean]  Use QDNAseq for CNV calling.
  --qdnaseq_bin_size            [choice]   Bin size for QDNAseq in kbp. [default: 500]
                                             * 1
                                             * 5
                                             * 10
                                             * 15
                                             * 30
                                             * 50
                                             * 100
                                             * 500
                                             * 1000

Modified base calling options
  --force_strand                [boolean]  Require modkit to call strand-aware modifications.

Short tandem repeat expansion genotyping options
  --sex                         [choice]   Sex (XX or XY) to be passed to Straglr-genotype.
                                             * XY
                                             * XX

Advanced Options
  --depth_intervals             [boolean]  Output a bedGraph file with entries for each genomic interval featuring homogeneous depth.
  --GVCF                        [boolean]  Enable to output a gVCF file in addition to the VCF outputs (experimental).
  --downsample_coverage         [boolean]  Downsample the coverage to along the genome.
  --downsample_coverage_target  [number]   Average coverage or reads to use for the analyses. [default: 60]

Multiprocessing Options
  --threads                     [integer]  Set max number of threads to use for more intense processes (limited by config executor cpus) [default: 4]
  --ubam_map_threads            [integer]  Set max number of threads to use for aligning reads from uBAM (limited by config executor cpus) [default: 8]
  --ubam_sort_threads           [integer]  Set max number of threads to use for sorting and indexing aligned reads from uBAM (limited by config executor cpus) [default: 3]
  --ubam_bam2fq_threads         [integer]  Set max number of threads to use for uncompressing uBAM and generating FASTQ for alignment (limited by config executor cpus) [default: 1]
  --merge_threads               [integer]  Set max number of threads to use for merging alignment files (limited by config executor cpus) [default: 4]
  --modkit_threads              [integer]  Total number of threads to use in modkit modified base calling (limited by config executor cpus) [default: 4]

Miscellaneous Options
  --disable_ping                [boolean]  Enable to prevent sending a workflow ping.

Other parameters
  --monochrome_logs             [boolean]  null
  --validate_params             [boolean]  null [default: true]
  --show_hidden_params          [boolean]  null

!! Hiding 28 params, use --show_hidden_params to show them !!
--------------------------------------------------------------------------------
If you use epi2me-labs/wf-human-variation for your analysis please cite:

* The nf-core framework
  https://doi.org/10.1038/s41587-020-0439-x