MetaPhage Options
These are the parameters supported by the main.nf Nextflow script. Advanced users can invoke the script directly or modify the configuration file using this reference.
General options
--readPath
Specify the folder where your datasets are stored. Default is $projectDir/dataset
. Note that $projectDir corresponds to the MetaPhage folder.
--metaPath
Specify the folder where the dataset's metadata are stored. Default is $readPath/metadata/
.
--dbPath
Specify the folder where databases are stored. Default is $projectDir/db
.
--virome_dataset
(Boolean) Specify whether the provided dataset is a virome. This affects how several tools are run. Default is false
.
--singleEnd
(Boolean) Specify if your datasets are in single-end mode. Default is false
. Please note that single-end mode is not supported yet.
--outdir
Specify the folder where your results are stored. Default is $projectDir/output
.
--temp_dir
Specify the folder where your temporary files are stored. Default is $projectDir/temp
.
--workDir
Specify the directory where temporary task files are created. Default is $projectDir/work
.
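The general options above are passed to main.nf as double-dash parameters. The following is a minimal sketch of how such a command line could be assembled, assuming Nextflow is on the PATH and the command is launched from the MetaPhage folder. The helper name build_command and the example paths are illustrative only and are not part of MetaPhage; the command is printed, not executed.

```python
import shlex


def build_command(read_path, db_path, outdir, script="main.nf"):
    """Return the argument list for a run with explicit folders.

    Hypothetical helper: only the parameter names come from this reference.
    """
    read_path = read_path.rstrip("/")
    return [
        "nextflow", "run", script,
        "--readPath", read_path,
        # Mirrors the documented default layout: $readPath/metadata/
        "--metaPath", f"{read_path}/metadata/",
        "--dbPath", db_path,
        "--outdir", outdir,
    ]


# Example paths are placeholders; adapt them to your installation.
cmd = build_command("./dataset", "./db", "./output")
print(shlex.join(cmd))
```

Keeping the arguments in a list (rather than one long string) avoids quoting problems if a folder name contains spaces.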
Quality check and trimming
--skip_qtrimming
(Boolean) Specify whether to perform the quality trimming of your reads or not. Default is false
.
--adapter_forward and --adapter_reverse
Specify the adapter sequences. Defaults are AGATCGGAAGAGCACACGTCTGAACTCCAGTCA for forward and AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT for reverse.
--mean_quality
Within a read, every base with a quality below --mean_quality (default is 15) is marked as "unqualified". If more than 40% of the bases in a read are unqualified, that read is excluded from the analysis. See https://github.com/OpenGene/fastp for more information.
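The rule described above can be sketched in a few lines. This is an illustration of the documented behaviour only, not fastp's actual implementation; the function name is hypothetical, and the treatment of empty reads and of reads at exactly 40% is an assumption.

```python
def is_read_kept(qualities, mean_quality=15, max_unqualified_percent=40):
    """Return True if a read passes the unqualified-base filter.

    qualities: per-base Phred quality scores of one read.
    """
    if not qualities:
        return False  # assumption: an empty read is discarded
    # A base is "unqualified" when its quality is below the threshold.
    unqualified = sum(1 for q in qualities if q < mean_quality)
    percent = 100 * unqualified / len(qualities)
    # The read is excluded only when MORE than the limit is unqualified.
    return percent <= max_unqualified_percent


print(is_read_kept([30, 30, 30, 10, 10]))  # 40% unqualified
print(is_read_kept([30, 30, 10, 10, 10]))  # 60% unqualified
```

Raising --mean_quality makes more bases count as unqualified, so more reads are discarded.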
--trimming_quality
Sliding window trimming is enabled in both the 5'→3' and 3'→5' directions with a 4 bp window. Bases inside the window are trimmed if their mean quality is less than --trimming_quality (default is 15). See https://github.com/OpenGene/fastp for more information.
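To make the sliding-window idea concrete, here is a simplified sketch. It assumes the window advances one base at a time from each end and stops at the first window whose mean quality reaches the threshold; fastp's real algorithm may differ in details. The function name and the handling of reads shorter than the window are assumptions.

```python
def trim_read(qualities, trimming_quality=15, window=4):
    """Return (start, end) indices of the region kept after trimming.

    The kept bases are qualities[start:end]; (0, 0) means nothing is kept.
    """
    n = len(qualities)
    w = min(window, n)  # assumption: short reads use one shrunken window
    if w == 0:
        return (0, 0)

    def window_mean(i):
        return sum(qualities[i:i + w]) / w

    # 5'->3': drop leading bases until a window reaches the threshold.
    start = 0
    while start + w <= n and window_mean(start) < trimming_quality:
        start += 1
    if start + w > n:
        return (0, 0)  # no window is good enough: the whole read is trimmed

    # 3'->5': drop trailing bases until a window reaches the threshold.
    end = n
    while end - w >= start and window_mean(end - w) < trimming_quality:
        end -= 1
    return (start, end)


quals = [2, 2, 2, 2, 30, 30, 30, 30, 30, 30, 2, 2, 2, 2]
print(trim_read(quals))
```

Because the decision is based on the window mean, a few low-quality bases next to high-quality ones can survive at either end.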
--keep_phix
(Boolean) Specify whether to keep the phiX reads instead of removing them. Default is false
.
--mod_phix
Specify the modality of phiX removal. There are 3 possibilities:
- phiX174: (default) search and remove the complete genome of Coliphage phiX174 isolate S1 (GenBank: AF176027.1, https://www.ncbi.nlm.nih.gov/nuccore/AF176027). The genome is automatically downloaded if not already present in ./db/phix/.
- WA11: search and remove the complete genome of Coliphage WA11 (GenBank: DQ079895.1, https://www.ncbi.nlm.nih.gov/nuccore/DQ079895). The genome is automatically downloaded if not already present in ./db/phix/.
- custom: search and remove the sequence specified with --file_phix_alone (path to the .fasta file; the path is relative to the pipeline's root directory, for example --file_phix_alone ./db/phix/genome.fasta).
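Since the custom modality needs a companion parameter, it can help to check the combination before launching a run. The sketch below is a hypothetical pre-flight check, not part of MetaPhage; it only encodes the three modalities and the --file_phix_alone requirement described above.

```python
PHIX_MODES = ("phiX174", "WA11", "custom")


def check_phix_options(mod_phix="phiX174", file_phix_alone=None):
    """Return a list of problems found in the phiX-removal options."""
    problems = []
    if mod_phix not in PHIX_MODES:
        problems.append(f"unknown --mod_phix value: {mod_phix!r}")
    elif mod_phix == "custom" and not file_phix_alone:
        # The custom modality has no built-in genome to fall back on.
        problems.append("--mod_phix custom requires --file_phix_alone")
    return problems


print(check_phix_options())
print(check_phix_options("custom"))
print(check_phix_options("custom", "./db/phix/genome.fasta"))
```

An empty list means the options are consistent with this reference.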
Microbial taxonomy
--skip_bacterial_taxo
(Boolean) Specify whether to skip the microbial taxonomy classification step (Kraken2 and Krona) or not. Default is false
.
--skip_kraken2
(Boolean) Specify whether to skip the microbial taxonomy classification with Kraken2 or not. Default is false
.
--mod_kraken2
Specify the modality of the short read alignment with Kraken2. There are 3 possibilities:
- miniBAV: (default) align against RefSeq bacteria, archaea, and viral libraries. Pre-built database taken from https://ccb.jhu.edu/software/kraken2/downloads.shtml.
- miniBAVH: align against RefSeq bacteria, archaea, and viral libraries, and against the GRCh38 human genome. Pre-built database taken from https://ccb.jhu.edu/software/kraken2/downloads.shtml.
- custom: align using your custom database. With this modality you also have to specify --file_kraken2_db, which is the path to the folder containing the hash.k2d, opts.k2d and taxo.k2d files. The path is relative to the pipeline's root directory, for example --file_kraken2_db ./db/kraken2/folder/.
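For the custom modality, a quick check that the folder really contains the three expected files can save a failed run. The following sketch is a hypothetical helper (not part of MetaPhage or Kraken2); the demo uses a temporary folder in which taxo.k2d is deliberately left out.

```python
import tempfile
from pathlib import Path

# File names as listed for --file_kraken2_db in this reference.
REQUIRED_K2D_FILES = ("hash.k2d", "opts.k2d", "taxo.k2d")


def missing_kraken2_files(db_folder):
    """Return the required database files that are absent from db_folder."""
    folder = Path(db_folder)
    return [name for name in REQUIRED_K2D_FILES
            if not (folder / name).is_file()]


# Demo on a throwaway folder that lacks one of the three files.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "hash.k2d").touch()
    (Path(tmp) / "opts.k2d").touch()
    demo_missing = missing_kraken2_files(tmp)

print(demo_missing)
```

In practice you would call missing_kraken2_files with the same path you pass to --file_kraken2_db.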
--skip_krona
(Boolean) Specify whether to skip the generation of the Krona-compatible text file with KrakenTools/kreport2krona.py or not. Default is false
.
Assembly
--skip_megahit
(Boolean) Specify whether to skip the assembly with MEGAHIT or not. Default is true
.
--skip_metaquast
(Boolean) Specify whether to skip the assembly evaluation with metaQUAST or not. Default is false
.
Phage mining
--skip_mining
(Boolean) Specify whether to skip the entire phage mining step (VIBRANT, Phigaro, VirSorter, VirFinder) or not. Default is false. To exclude one or more individual tools from this step, use the tool-specific skip parameters instead.
--skip_vibrant
(Boolean) Specify whether to skip the phage mining with VIBRANT or not. Default is false
.
--mod_vibrant
Specify the modality of the phage mining with VIBRANT. There are 2 possibilities:
- legacy: (default) use VIBRANT 1.0.1.
- standard: use VIBRANT 1.2.1. Not working yet (does not produce output).
--skip_phigaro
(Boolean) Specify whether to skip the phage mining with Phigaro or not. Default is false
.
--mod_phigaro
Specify the modality of the phage mining with Phigaro. There are 2 possibilities:
- standard: (default) use Phigaro 2.3.0.
- custom: use Phigaro with your custom config file. With this modality you also have to specify --file_figaro_config, which is the path to the .yml config file containing your custom parameters to run the miner.
--skip_virsorter
(Boolean) Specify whether to skip the phage mining with VirSorter or not. Default is false
.
--mod_virsorter
Specify the modality of the phage mining with VirSorter. There are 3 possibilities:
- legacy: (default) use VirSorter 1.0.6.
- standard: use VirSorter 2.0.beta. Not working yet (does not produce output).
- custom: use VirSorter with your custom database. With this modality you also have to specify --file_virsorter_db, which is the path to the folder containing the database files. Please verify that your database files match those required by the VirSorter version (1.0.6).
--skip_virfinder
(Boolean) Specify whether to skip the phage mining with VirFinder or not. Default is false
.
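The relationship between --skip_mining and the tool-specific skip flags can be summarised as follows. This sketch reflects one reading of the descriptions above (the global flag overrides the individual ones); how main.nf actually combines them is an assumption, and the function name is hypothetical.

```python
# Tool names match the suffixes of the documented --skip_<tool> parameters.
MINERS = ("vibrant", "phigaro", "virsorter", "virfinder")


def active_miners(params):
    """Return the miners that would run for a given set of parameters.

    params: dict of parameter names (without dashes) to booleans.
    """
    if params.get("skip_mining", False):
        return []  # the whole mining step is skipped
    # Otherwise each tool runs unless its own skip flag is set.
    return [m for m in MINERS if not params.get(f"skip_{m}", False)]


print(active_miners({}))
print(active_miners({"skip_phigaro": True, "skip_virfinder": True}))
print(active_miners({"skip_mining": True}))
```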
Dereplication and reads mapping
--skip_dereplication
(Boolean) Specify whether to skip the dereplication of viral scaffolds or not. Default is false
.
--minlen
Specify the minimal length in bp for a viral consensus scaffold. Default is 1000
.
Viral taxonomy
--skip_viral_taxo
(Boolean) Specify whether to skip the viral taxonomy classification step (vcontact2, graphanalyzer) or not. Default is false
.
--skip_vcontact2
(Boolean) Specify whether to skip the phage taxonomy classification with vcontact2. Default is false
.
--mod_vcontact2
Specify the modality of the phage taxonomy analysis with vConTACT2. Currently there are 2 possibilities (new modalities will be added periodically):
- Jan2022: (default) use vConTACT2 with the outputs generated by the https://github.com/RyanCook94/inphared.pl script run in January 2022.
- custom: use vConTACT2 with your custom reference genomes and taxonomy. With this modality you also have to specify --file_vcontact2_db, which is the path to the folder containing the vConTACT2_proteins.faa, vConTACT2_gene_to_genome.csv and data_excluding_refseq.tsv files. The path is relative to the pipeline's root directory, for example --file_vcontact2_db ./db/inphared/custom/.
--vcontact2_file_head
Specify the prefix of the INphared files (vConTACT2 database). Default is 20Jan2022_vConTACT2_. If you use a different version of inphared, specify the corresponding file prefix string (usually dd/mm/yyyy_vConTACT2_).
--skip_graphanalyzer
(Boolean) Specify whether to skip the automatic phage taxonomy assignment with graphanalyzer and the generation of the taxonomy table CSV file. Default is false
.
Plots and report
--skip_miner_comparison
(Boolean) Specify whether to skip the miner comparison plot (UpSet plot) or not. Default is false
.
--skip_summary
(Boolean) Specify whether to skip the generation of the summary table and of the per-sample violin plots or not. Default is false
.
--skip_taxonomy_table
(Boolean) Specify whether to skip the taxonomy table generation or not. Default is false
.
--skip_heatmap
(Boolean) Specify whether to skip the heatmap plot creation or not. Default is false
.
--heatmap_var
Specify the name of the variable (metadata column) to use for the subdivision of the heatmap's top dendrogram. Required in order to generate the heatmap.
--skip_alpha_diversity
(Boolean) Specify whether to skip the alpha-diversity plots creation or not. Default is false
.
--alpha_var1
Specify the name of the variable (metadata column) to use for the alpha-diversity sample clustering (on the x-axis). Required in order to generate the alpha-diversity plots.
--alpha_var2
Specify the name of the variable (metadata column) to use for the alpha-diversity color mapping. Required in order to generate the alpha-diversity plots.
--skip_beta_diversity
(Boolean) Specify whether to skip the beta-diversity plots creation or not. Default is false
.
--beta_var
Specify the name of the variable (metadata column) to use for the beta-diversity color mapping. Required in order to generate the beta-diversity plots.
--skip_violin_plots
(Boolean) Specify whether to skip the violin plot (samples clustered by a variable) or not. Default is false
.
--violin_var
Specify the name of the variable (metadata column) to use for the violin plot clustering. Required in order to generate the violin plot.
--skip_report
(Boolean) Specify whether to skip the report step (MultiQC report) or not. Default is false
.
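Several plot parameters (--heatmap_var, --alpha_var1, --alpha_var2, --beta_var, --violin_var) must name an existing metadata column. The sketch below shows one way to verify this before a run. It assumes the metadata is a comma-separated file with a header row, which this reference does not state; the helper name, column names and values are invented for illustration.

```python
import csv
import io


def missing_metadata_columns(metadata_file, variables):
    """Return the requested variables that are not metadata columns.

    metadata_file: open text file (assumed CSV with a header row).
    """
    header = next(csv.reader(metadata_file), [])
    return [v for v in variables if v not in header]


# Hypothetical metadata table and parameter values.
sample = io.StringIO(
    "Sample,Treatment,Timepoint\n"
    "S1,control,0\n"
    "S2,treated,1\n"
)
requested = {
    "heatmap_var": "Treatment",
    "alpha_var1": "Timepoint",
    "beta_var": "Site",  # not a column in the sample table
}
missing = missing_metadata_columns(sample, requested.values())
print(missing)
```

With a real dataset, open the metadata file found under --metaPath instead of the in-memory sample.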