MetaPhage Options

These are the parameters supported by the main.nf Nextflow script. Advanced users can invoke the script or modify the configuration file using this reference.

General options

`--readPath`

Specify the folder where your datasets are stored. Default is $projectDir/dataset. Note that $projectDir corresponds to the MetaPhage folder.

`--metaPath`

Specify the folder where the dataset's metadata are stored. Default is $readPath/metadata/.

`--dbPath`

Specify the folder where databases are stored. Default is $projectDir/db.

`--virome_dataset`

(Boolean) Specify whether the dataset provided is a virome. This affects how several tools are run. Default is false.

`--singleEnd`

(Boolean) Specify if your datasets are in single-end mode. Default is false. Please note that single-end mode is not supported yet.

`--outdir`

Specify the folder where your results are stored. Default is $projectDir/output.

`--temp_dir`

Specify the folder where your temporary files are stored. Default is $projectDir/temp.

`--workDir`

Specify the directory where the temporary files of each task are created. Default is $projectDir/work.
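As a rough illustration of how the general options combine on the command line, the sketch below assembles a `nextflow run main.nf` invocation from a dictionary of parameters. The helper name `build_command` and the example paths are hypothetical, and writing booleans explicitly as `true`/`false` is an assumption about how the values are passed.

```python
import shlex


def build_command(params, script="main.nf"):
    """Return a `nextflow run` command line for the given pipeline parameters.

    `params` maps parameter names (without the leading `--`) to values.
    Booleans are written as the strings "true"/"false" (an assumption).
    """
    args = ["nextflow", "run", script]
    for name, value in params.items():
        if isinstance(value, bool):
            value = "true" if value else "false"
        args += [f"--{name}", str(value)]
    # shlex.join quotes any argument that needs it for a POSIX shell.
    return shlex.join(args)


# Hypothetical paths, for illustration only.
command = build_command({
    "readPath": "./dataset",
    "dbPath": "./db",
    "outdir": "./output",
    "virome_dataset": True,
})
print(command)
```

Any parameter left out of the dictionary keeps the default documented in this reference.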

Quality check and trimming

`--skip_qtrimming`

(Boolean) Specify whether to skip the quality trimming of your reads or not. Default is false.

`--adapter_forward` and `--adapter_reverse`

Specify the adapter sequences. Defaults are AGATCGGAAGAGCACACGTCTGAACTCCAGTCA for forward and AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT for reverse.

`--mean_quality`

For each read, every base with a quality below --mean_quality (default is 15) is marked as "unqualified". If more than 40% of the bases in a read are unqualified, that read is excluded from the analysis. See https://github.com/OpenGene/fastp for more information.
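The rule above can be sketched in a few lines of Python. This is only an illustration of the described filter, not fastp's implementation; the function name and the handling of empty reads are assumptions.

```python
def passes_quality_filter(qualities, mean_quality=15, max_unqualified_percent=40):
    """Return True if a read is kept under the rule described above.

    `qualities` is the list of per-base Phred scores of one read.
    A base is "unqualified" when its score is below `mean_quality`.
    """
    if not qualities:
        # Assumption: an empty read is never kept.
        return False
    unqualified = sum(1 for q in qualities if q < mean_quality)
    # The read is dropped only when MORE than 40% of its bases are unqualified.
    return 100 * unqualified / len(qualities) <= max_unqualified_percent


print(passes_quality_filter([30] * 10))            # all bases qualified
print(passes_quality_filter([10] * 5 + [30] * 5))  # 50% unqualified
```

With the defaults, a read with exactly 40% unqualified bases is still kept, because the rule excludes a read only above that percentage.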

`--trimming_quality`

Sliding-window trimming is enabled in both the 5'→3' and the 3'→5' direction, with a window of 4 bp. Bases inside the window are trimmed if their mean quality is less than --trimming_quality (default is 15). See https://github.com/OpenGene/fastp for more information.
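To make the sliding-window idea concrete, here is a simplified sketch that trims one base at a time from each end while the 4-base window at that end has a mean quality below the threshold. It is an approximation written for illustration; fastp's actual algorithm may differ in its details, and the function name is hypothetical.

```python
def trim_sliding_window(qualities, trimming_quality=15, window=4):
    """Return the per-base qualities that remain after trimming both ends."""

    def is_low(start):
        # Mean quality of the window that begins at index `start`.
        chunk = qualities[start:start + window]
        return sum(chunk) / len(chunk) < trimming_quality

    start, end = 0, len(qualities)
    # 5'->3': advance the left edge while the leading window is low quality.
    while end - start >= window and is_low(start):
        start += 1
    # 3'->5': pull back the right edge while the trailing window is low quality.
    while end - start >= window and is_low(end - window):
        end -= 1
    return qualities[start:end]


read = [2, 2, 2, 2, 2, 2, 30, 30, 30, 30, 30, 30]
print(trim_sliding_window(read))
```

In this simplified version, a read shorter than the window is returned unchanged.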

`--keep_phix`

(Boolean) Specify whether to keep the phiX reads instead of removing them. Default is false.

`--mod_phix`

Specify the modality of phix removal. There are 3 possibilities:

- `phiX174` (default): search for and remove the complete genome of Coliphage phiX174 isolate S1 (GenBank: AF176027.1, https://www.ncbi.nlm.nih.gov/nuccore/AF176027). The genome is downloaded automatically if not already present in ./db/phix/.
- `WA11`: search for and remove the complete genome of Coliphage WA11 (GenBank: DQ079895.1, https://www.ncbi.nlm.nih.gov/nuccore/DQ079895). The genome is downloaded automatically if not already present in ./db/phix/.
- `custom`: search for and remove the sequence specified with --file_phix_alone (path to the .fasta file; the path is relative to the pipeline's root directory, for example --file_phix_alone ./db/phix/genome.fasta).

Microbial taxonomy

`--skip_bacterial_taxo`

(Boolean) Specify whether to skip the microbial taxonomy classification step (Kraken2 and Krona) or not. Default is false.

`--skip_kraken2`

(Boolean) Specify whether to skip the microbial taxonomy classification with Kraken2 or not. Default is false.

`--mod_kraken2`

Specify the modality of the short read alignment with Kraken2. There are 3 possibilities:

- `miniBAV` (default): align against the RefSeq bacteria, archaea, and viral libraries. Pre-built database taken from https://ccb.jhu.edu/software/kraken2/downloads.shtml.
- `miniBAVH`: align against the RefSeq bacteria, archaea, and viral libraries, and against the GRCh38 human genome. Pre-built database taken from https://ccb.jhu.edu/software/kraken2/downloads.shtml.
- `custom`: align using your custom database. With this modality you also have to specify --file_kraken2_db, which is the path to the folder containing the hash.k2d, opts.k2d and taxo.k2d files. The path is relative to the pipeline's root directory, for example --file_kraken2_db ./db/kraken2/folder/.
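Before launching the pipeline with the custom modality, it can be useful to confirm that the database folder holds the three files listed above. The following check is a small sketch for that purpose; the function name is hypothetical and the pipeline itself is not known to perform this check.

```python
from pathlib import Path

# File names taken from the description of --file_kraken2_db above.
REQUIRED_KRAKEN2_FILES = ("hash.k2d", "opts.k2d", "taxo.k2d")


def missing_kraken2_files(db_dir):
    """Return the required Kraken2 files that are absent from `db_dir`."""
    db_dir = Path(db_dir)
    return [name for name in REQUIRED_KRAKEN2_FILES
            if not (db_dir / name).is_file()]


# Example with the folder used in the text; an empty list means the folder is complete.
print(missing_kraken2_files("./db/kraken2/folder/"))
```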

`--skip_krona`

(Boolean) Specify whether to skip the generation of the Krona-compatible text file with KrakenTools/kreport2krona.py or not. Default is false.

Assembly

`--skip_megahit`

(Boolean) Specify whether to skip the assembly with MEGAHIT or not. Default is true.

`--skip_metaquast`

(Boolean) Specify whether to skip the assembly evaluation with metaQUAST or not. Default is false.

Phage mining

`--skip_mining`

(Boolean) Specify whether to skip the entire phage mining step (VIBRANT, Phigaro, VirSorter, VirFinder) or not. Default is false. To exclude only one or some of the tools from this step, use their specific skip parameters instead.

`--skip_vibrant`

(Boolean) Specify whether to skip the phage mining with VIBRANT or not. Default is false.

`--mod_vibrant`

Specify the modality of the phage mining with VIBRANT. There are 2 possibilities:

- `legacy` (default): use VIBRANT 1.0.1.
- `standard`: use VIBRANT 1.2.1. Not working yet (does not produce output).

`--skip_phigaro`

(Boolean) Specify whether to skip the phage mining with Phigaro or not. Default is false.

`--mod_phigaro`

Specify the modality of the phage mining with Phigaro. There are 2 possibilities:

- `standard` (default): use Phigaro 2.3.0.
- `custom`: use Phigaro with your custom config file. With this modality you also have to specify --file_figaro_config, which is the path to the .yml config file containing your custom parameters to run the miner.

`--skip_virsorter`

(Boolean) Specify whether to skip the phage mining with VirSorter or not. Default is false.

`--mod_virsorter`

Specify the modality of the phage mining with VirSorter. There are 3 possibilities:

- `legacy` (default): use VirSorter 1.0.6.
- `standard`: use VirSorter 2.0.beta. Not working yet (does not produce output).
- `custom`: use VirSorter with your custom database. With this modality you also have to specify --file_virsorter_db, which is the path to the folder containing the database files. Please verify that your database files match the files required by the VirSorter version in use (1.0.6).

`--skip_virfinder`

(Boolean) Specify whether to skip the phage mining with VirFinder or not. Default is false.
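The interaction between --skip_mining and the per-tool skip parameters can be summarized with a short sketch. It only mirrors the behaviour described in this section; the function name is hypothetical and the code is not taken from the pipeline.

```python
# Per-tool skip parameters documented in this section.
MINER_SKIP_FLAGS = {
    "VIBRANT": "skip_vibrant",
    "Phigaro": "skip_phigaro",
    "VirSorter": "skip_virsorter",
    "VirFinder": "skip_virfinder",
}


def enabled_miners(params):
    """Return the miners that would run for the given parameter values.

    Parameters missing from `params` take their documented default (false).
    """
    if params.get("skip_mining", False):
        # Skipping the whole step overrides the per-tool parameters.
        return []
    return [tool for tool, flag in MINER_SKIP_FLAGS.items()
            if not params.get(flag, False)]


print(enabled_miners({"skip_phigaro": True}))
print(enabled_miners({"skip_mining": True}))
```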

Dereplication and reads mapping

`--skip_dereplication`

(Boolean) Specify whether to skip the dereplication of viral scaffolds or not. Default is false.

`--minlen`

Specify the minimal length in bp for a viral consensus scaffold. Default is 1000.

Viral taxonomy

`--skip_viral_taxo`

(Boolean) Specify whether to skip the viral taxonomy classification step (vcontact2, graphanalyzer) or not. Default is false.

`--skip_vcontact2`

(Boolean) Specify whether to skip the phage taxonomy classification with vcontact2. Default is false.

`--mod_vcontact2`

Specify the modality of the phage taxonomy analysis with vConTACT2. Currently there are 2 possibilities (new modalities will be added periodically):

- `Jan2022` (default): use vConTACT2 with the outputs generated by the https://github.com/RyanCook94/inphared.pl script, run in January 2022.
- `custom`: use vConTACT2 with your custom reference genomes and taxonomy. With this modality you also have to specify --file_vcontact2_db, which is the path to the folder containing the vConTACT2_proteins.faa, vConTACT2_gene_to_genome.csv and data_excluding_refseq.tsv files. The path is relative to the pipeline's root directory, for example --file_vcontact2_db ./db/inphared/custom/.

`--vcontact2_file_head`

Specify the prefix of the INPHARED files (vConTACT2 database). Default is `20Jan2022_vConTACT2_`. If you use a different version of INPHARED, specify its file prefix string (usually dd/mm/yyyy_vConTACT2_).

`--skip_graphanalyzer`

(Boolean) Specify whether to skip the automatic phage taxonomy assignment with graphanalyzer and the generation of the taxonomy table CSV file. Default is false.

Plots and report

`--skip_miner_comparison`

(Boolean) Specify whether to skip the miner comparison plot (UpSet plot) or not. Default is false.

`--skip_summary`

(Boolean) Specify whether to skip the generation of the summary table and of the per-sample violin plots or not. Default is false.

`--skip_taxonomy_table`

(Boolean) Specify whether to skip the taxonomy table generation or not. Default is false.

`--skip_heatmap`

(Boolean) Specify whether to skip the heatmap plot creation or not. Default is false.

`--heatmap_var`

Specify the name of the variable (metadata column) to use for the subdivision of the heatmap's top dendrogram. Required in order to generate the heatmap.

`--skip_alpha_diversity`

(Boolean) Specify whether to skip the alpha-diversity plots creation or not. Default is false.

`--alpha_var1`

Specify the name of the variable (metadata column) to use for the alpha-diversity sample clustering (on the x-axis). Required in order to generate the alpha-diversity plots.

`--alpha_var2`

Specify the name of the variable (metadata column) to use for the alpha-diversity color mapping. Required in order to generate the alpha-diversity plots.

`--skip_beta_diversity`

(Boolean) Specify whether to skip the beta-diversity plots creation or not. Default is false.

`--beta_var`

Specify the name of the variable (metadata column) to use for the beta-diversity color mapping. Required in order to generate the beta-diversity plots.

`--skip_violin_plots`

(Boolean) Specify whether to skip the violin plot (samples grouped by a variable) or not. Default is false.

`--violin_var`

Specify the name of the variable (metadata column) to use for the violin plot clustering. Required in order to generate the violin plot.
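Since each of the plot variables must name a metadata column, a quick check before running the pipeline can catch a misspelled name. The sketch below assumes, purely for illustration, that the metadata is a comma-separated table with a header row; the column names and the function name are hypothetical.

```python
import csv
import io

# Plot parameters documented in this section that expect a metadata column name.
PLOT_VARIABLES = ("heatmap_var", "alpha_var1", "alpha_var2", "beta_var", "violin_var")


def unknown_plot_variables(metadata_csv, params):
    """Return the plot parameters whose value is not a column of the metadata.

    `metadata_csv` is the text of a CSV table with a header row (an assumption).
    """
    header = next(csv.reader(io.StringIO(metadata_csv)))
    return {name: params[name] for name in PLOT_VARIABLES
            if name in params and params[name] not in header}


# Hypothetical metadata table and parameter values.
metadata = "Sample,Treatment,Timepoint\nS1,control,0\nS2,treated,7\n"
print(unknown_plot_variables(metadata, {"heatmap_var": "Treatment", "beta_var": "Site"}))
```

An empty dictionary means every variable that was set matches a metadata column.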

`--skip_report`

(Boolean) Specify whether to skip the report step (MultiQC report) or not. Default is false.
