Post-assembly workflow

assembly-ERGA-post-assembly-QC/main-workflow

Author(s)
Cristóbal Gallardo Alba
Version
2
Last updated
Sep 6, 2024
License
AGPL-3.0-or-later
Tags
ReferenceGenome
name:ERGA

Features
Tutorial
ERGA post-assembly QC

Workflow Testing
Tests: ✅
Results: Not yet automated
FAIRness
PURL
https://gxy.io/GTN:W00000
flowchart TD
  0["ℹ️ Input Dataset\nMetadata file"];
  style 0 stroke:#2c3143,stroke-width:4px;
  1["ℹ️ Input Parameter\nNCBI taxonomic ID"];
  style 1 fill:#ded,stroke:#393,stroke-width:4px;
  2["ℹ️ Input Dataset\nNCBI taxdump directory"];
  style 2 stroke:#2c3143,stroke-width:4px;
  3["ℹ️ Input Collection\nLong-read FASTQ files"];
  style 3 stroke:#2c3143,stroke-width:4px;
  4["ℹ️ Input Dataset\nPrimary genome assembly file fasta"];
  style 4 stroke:#2c3143,stroke-width:4px;
  5["ℹ️ Input Parameter\nPloidy for model to use"];
  style 5 fill:#ded,stroke:#393,stroke-width:4px;
  6["ℹ️ Input Dataset\nDIAMOND database"];
  style 6 stroke:#2c3143,stroke-width:4px;
  7["ℹ️ Input Collection\nHi-C reverse"];
  style 7 stroke:#2c3143,stroke-width:4px;
  8["ℹ️ Input Collection\nHi-C forward"];
  style 8 stroke:#2c3143,stroke-width:4px;
  9["Meryl"];
  3 -->|output| 9;
  10["Collapse Collection"];
  3 -->|output| 10;
  11["Create BlobtoolKit"];
  4 -->|output| 11;
  0 -->|output| 11;
  2 -->|output| 11;
  1 -->|output| 11;
  12["gfastats"];
  4 -->|output| 12;
  13["Convert compressed file to uncompressed."];
  4 -->|output| 13;
  14["gfastats"];
  4 -->|output| 14;
  15["Diamond"];
  4 -->|output| 15;
  6 -->|output| 15;
  16["Collapse Collection"];
  7 -->|output| 16;
  17["Collapse Collection"];
  8 -->|output| 17;
  18["Meryl"];
  9 -->|read_db| 18;
  19["Map with minimap2"];
  10 -->|output| 19;
  4 -->|output| 19;
  20["Smudgeplot"];
  10 -->|output| 20;
  21["Replace"];
  13 -->|output1| 21;
  22["Bandage Image"];
  14 -->|output| 22;
  23["BWA-MEM2"];
  16 -->|output| 23;
  4 -->|output| 23;
  24["BWA-MEM2"];
  17 -->|output| 24;
  4 -->|output| 24;
  25["Merqury"];
  4 -->|output| 25;
  18 -->|read_db| 25;
  d2dbf498-5155-43b2-bf2a-ffed5c45100d["Output\nMerqury on input dataset(s): stats"];
  25 --> d2dbf498-5155-43b2-bf2a-ffed5c45100d;
  style d2dbf498-5155-43b2-bf2a-ffed5c45100d stroke:#2c3143,stroke-width:4px;
  0b43d206-253f-4c0b-a1f4-971402851c6d["Output\nMerqury on input dataset(s): plots"];
  25 --> 0b43d206-253f-4c0b-a1f4-971402851c6d;
  style 0b43d206-253f-4c0b-a1f4-971402851c6d stroke:#2c3143,stroke-width:4px;
  811a77e8-eb99-4dc7-9295-f0000fcb9fe5["Output\nMerqury on input dataset(s): QV stats"];
  25 --> 811a77e8-eb99-4dc7-9295-f0000fcb9fe5;
  style 811a77e8-eb99-4dc7-9295-f0000fcb9fe5 stroke:#2c3143,stroke-width:4px;
  26["Meryl"];
  18 -->|read_db| 26;
  27["Samtools stats"];
  19 -->|alignment_output| 27;
  28["Busco"];
  21 -->|outfile| 28;
  af98aae6-f1a3-493c-9cef-08e0926210d3["Output\nBusco on input dataset(s): full table"];
  28 --> af98aae6-f1a3-493c-9cef-08e0926210d3;
  style af98aae6-f1a3-493c-9cef-08e0926210d3 stroke:#2c3143,stroke-width:4px;
  929439b7-c688-45ea-848a-97c13d3e0028["Output\nBusco on input dataset(s): short summary"];
  28 --> 929439b7-c688-45ea-848a-97c13d3e0028;
  style 929439b7-c688-45ea-848a-97c13d3e0028 stroke:#2c3143,stroke-width:4px;
  29["Filter and merge"];
  24 -->|bam_output| 29;
  23 -->|bam_output| 29;
  30["Merqury plot 2"];
  25 -->|png_files| 30;
  31["Merqury plot 1"];
  25 -->|png_files| 31;
  32["Merqury plot 3"];
  25 -->|png_files| 32;
  33["Merqury plot 5"];
  25 -->|png_files| 33;
  34["Merqury plot 4"];
  25 -->|png_files| 34;
  35["GenomeScope"];
  26 -->|read_db_hist| 35;
  5 -->|output| 35;
  36["BlobToolKit"];
  15 -->|blast_tabular| 36;
  11 -->|blobdir| 36;
  28 -->|busco_table| 36;
  19 -->|alignment_output| 36;
  37["PretextMap"];
  29 -->|outfile| 37;
  38["BlobToolKit"];
  36 -->|blobdir| 38;
  39["BlobToolKit"];
  36 -->|blobdir| 39;
  40["BlobToolKit"];
  36 -->|blobdir| 40;
  41["Pretext Snapshot"];
  37 -->|pretext_map_out| 41;
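
To see which workflow inputs a given step ultimately depends on, the edges of the graph above can be loaded into a short script. The following is only an illustrative sketch: it copies the Hi-C contact-map branch of the flowchart (steps 4, 7, 8, 16, 17, 23, 24, 29, 37 and 41) rather than the whole graph, and the helper name `upstream` is hypothetical, not part of Galaxy.

```python
# Edges copied from the flowchart above (Hi-C contact-map branch only).
# Keys are step numbers; values are the steps they feed into.
EDGES = {
    4: [23, 24],  # Primary genome assembly -> both BWA-MEM2 runs
    7: [16],      # Hi-C reverse -> Collapse Collection
    8: [17],      # Hi-C forward -> Collapse Collection
    16: [23],     # Collapse Collection -> BWA-MEM2
    17: [24],     # Collapse Collection -> BWA-MEM2
    23: [29],     # BWA-MEM2 -> Filter and merge
    24: [29],     # BWA-MEM2 -> Filter and merge
    29: [37],     # Filter and merge -> PretextMap
    37: [41],     # PretextMap -> Pretext Snapshot
}

# Steps 0-8 are the workflow inputs in the flowchart.
WORKFLOW_INPUTS = set(range(9))


def upstream(target, edges):
    """Return every step that `target` depends on, directly or indirectly."""
    # Invert the edge list so each step maps to its direct parents.
    parents = {}
    for src, dsts in edges.items():
        for dst in dsts:
            parents.setdefault(dst, set()).add(src)

    # Walk backwards from the target, collecting each ancestor once.
    seen, stack = set(), [target]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


needed = sorted(upstream(41, EDGES) & WORKFLOW_INPUTS)
print(needed)  # inputs required by Pretext Snapshot (step 41)
```

Under these assumptions the script reports that Pretext Snapshot needs only the primary assembly (4) and the two Hi-C collections (7 and 8). The same approach extends to any other branch by adding its edges.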

Inputs

Input                     Label
Input dataset             Metadata file
Input parameter           NCBI taxonomic ID
Input dataset             NCBI taxdump directory
Input dataset collection  Long-read FASTQ files
Input dataset             Primary genome assembly file (fasta)
Input parameter           Ploidy for model to use
Input dataset             DIAMOND database
Input dataset collection  Hi-C reverse
Input dataset collection  Hi-C forward

Outputs

From Output Label
Input parameter NCBI taxonomic ID
Input parameter Ploidy for model to use
toolshed.g2.bx.psu.edu/repos/bgruening/gfastats/gfastats/1.2.0+galaxy0 gfastats
toolshed.g2.bx.psu.edu/repos/galaxy-australia/smudgeplot/smudgeplot/0.2.5+galaxy+2 Smudgeplot
toolshed.g2.bx.psu.edu/repos/iuc/bandage/bandage_image/2022.09+galaxy4 Bandage Image
toolshed.g2.bx.psu.edu/repos/iuc/merqury/merqury/1.3+galaxy2 Merqury
toolshed.g2.bx.psu.edu/repos/devteam/samtools_stats/samtools_stats/2.0.4 Samtools stats
toolshed.g2.bx.psu.edu/repos/iuc/busco/busco/5.4.4+galaxy0 Busco
toolshed.g2.bx.psu.edu/repos/iuc/genomescope/genomescope/2.0+galaxy2 GenomeScope
toolshed.g2.bx.psu.edu/repos/iuc/pretext_snapshot/pretext_snapshot/0.0.3+galaxy1 Pretext Snapshot

Tools

Tool Links
CONVERTER_gz_to_uncompressed
__EXTRACT_DATASET__
toolshed.g2.bx.psu.edu/repos/bgruening/blobtoolkit/blobtoolkit/4.0.7+galaxy2 View in ToolShed
toolshed.g2.bx.psu.edu/repos/bgruening/diamond/bg_diamond/2.0.15+galaxy0 View in ToolShed
toolshed.g2.bx.psu.edu/repos/bgruening/gfastats/gfastats/1.2.0+galaxy0 View in ToolShed
toolshed.g2.bx.psu.edu/repos/bgruening/gfastats/gfastats/1.3.6+galaxy0 View in ToolShed
toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_find_and_replace/1.1.4 View in ToolShed
toolshed.g2.bx.psu.edu/repos/devteam/samtools_stats/samtools_stats/2.0.4 View in ToolShed
toolshed.g2.bx.psu.edu/repos/galaxy-australia/smudgeplot/smudgeplot/0.2.5+galaxy+2 View in ToolShed
toolshed.g2.bx.psu.edu/repos/iuc/bandage/bandage_image/2022.09+galaxy4 View in ToolShed
toolshed.g2.bx.psu.edu/repos/iuc/bellerophon/bellerophon/1.0+galaxy0 View in ToolShed
toolshed.g2.bx.psu.edu/repos/iuc/busco/busco/5.4.4+galaxy0 View in ToolShed
toolshed.g2.bx.psu.edu/repos/iuc/bwa_mem2/bwa_mem2/2.2.1+galaxy0 View in ToolShed
toolshed.g2.bx.psu.edu/repos/iuc/genomescope/genomescope/2.0+galaxy2 View in ToolShed
toolshed.g2.bx.psu.edu/repos/iuc/merqury/merqury/1.3+galaxy2 View in ToolShed
toolshed.g2.bx.psu.edu/repos/iuc/meryl/meryl/1.3+galaxy6 View in ToolShed
toolshed.g2.bx.psu.edu/repos/iuc/minimap2/minimap2/2.28+galaxy0 View in ToolShed
toolshed.g2.bx.psu.edu/repos/iuc/pretext_map/pretext_map/0.1.9+galaxy0 View in ToolShed
toolshed.g2.bx.psu.edu/repos/iuc/pretext_snapshot/pretext_snapshot/0.0.3+galaxy1 View in ToolShed
toolshed.g2.bx.psu.edu/repos/nml/collapse_collections/collapse_dataset/5.1.0 View in ToolShed
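
The ToolShed identifiers listed above follow the pattern `<host>/repos/<owner>/<repository>/<tool>/<version>`. As a minimal sketch, assuming that six-part layout holds for every ToolShed entry, the snippet below splits an identifier into its fields. The names `ShedTool` and `parse_tool_id` are hypothetical helpers for illustration. Built-in tools such as `CONVERTER_gz_to_uncompressed` do not match the pattern and are rejected.

```python
from typing import NamedTuple


class ShedTool(NamedTuple):
    host: str
    owner: str
    repo: str
    tool: str
    version: str


def parse_tool_id(tool_id: str) -> ShedTool:
    """Split '<host>/repos/<owner>/<repo>/<tool>/<version>' into fields."""
    parts = tool_id.split("/")
    if len(parts) != 6 or parts[1] != "repos":
        raise ValueError(f"not a ToolShed tool ID: {tool_id!r}")
    host, _, owner, repo, tool, version = parts
    return ShedTool(host, owner, repo, tool, version)


# Three identifiers taken from the list above.
tool_ids = [
    "toolshed.g2.bx.psu.edu/repos/bgruening/gfastats/gfastats/1.2.0+galaxy0",
    "toolshed.g2.bx.psu.edu/repos/bgruening/gfastats/gfastats/1.3.6+galaxy0",
    "toolshed.g2.bx.psu.edu/repos/iuc/minimap2/minimap2/2.28+galaxy0",
]

for shed_tool in map(parse_tool_id, tool_ids):
    print(shed_tool.owner, shed_tool.tool, shed_tool.version)
```

Parsing the list this way makes it easy to spot that gfastats is pinned at two different versions (1.2.0+galaxy0 and 1.3.6+galaxy0), so both need to be installed.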

To use this workflow in Galaxy, either download the workflow file, or right-click the download link and copy its URL, which you can then paste into Galaxy's workflow import form.

Importing into Galaxy

The steps below import the workflow directly into the Galaxy server of your choice.
Hands-on: Importing a workflow
  • Click on Workflow on the top menu bar of Galaxy. You will see a list of all your workflows.
  • Click on Import at the top-right of the screen.
  • Provide your workflow
    • Option 1: Paste the URL of the workflow into the box labelled “Archived Workflow URL”
    • Option 2: Upload the workflow file in the box labelled “Archived Workflow File”
  • Click the Import workflow button

Below is a short video demonstrating how to import a workflow from GitHub using this procedure:

Video: Importing a workflow from URL

Version History

Version  Commit     Time                 Comments
6        49cec9ea8  2024-09-04 10:06:48  Rename post-assembly-workflow.ga to main_workflow.ga
5        cf8e17d6d  2024-08-28 09:16:42  Add workflow test
4        93e0f8507  2024-06-28 10:00:05  Change mapping tool to minimap and add more details about the downloaded databases
3        807026b84  2023-06-23 12:49:16  Update topics/assembly/tutorials/ERGA-post-assembly-QC/workflows/main_workflow.ga
2        c5bb8031a  2023-06-23 09:04:53  Fix testes
1        893cd527b  2023-04-28 09:18:57  Add images and rename tutorial

For Admins

Installing the workflow tools

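# Requires Ephemeris, which provides workflow-to-tools, shed-tools and workflow-install.
# Replace GALAXY with your server URL and API_KEY with an admin API key.
wget https://training.galaxyproject.org/training-material/topics/assembly/tutorials/ERGA-post-assembly-QC/workflows/main_workflow.ga -O workflow.ga
workflow-to-tools -w workflow.ga -o tools.yaml
shed-tools install -g GALAXY -a API_KEY -t tools.yaml
workflow-install -g GALAXY -a API_KEY -w workflow.ga --publish-workflows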
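
Before installing anything, it can help to check which ToolShed tools a workflow file references. A Galaxy `.ga` file is JSON with a top-level `steps` mapping, and each step carries a `tool_id` (null for input steps). The sketch below reads that field directly; it is not how Ephemeris is implemented, the function name `shed_tool_ids` is hypothetical, and `SAMPLE_GA` is a made-up, stripped-down stand-in for the real `workflow.ga`.

```python
import json

# Hypothetical, heavily reduced stand-in for a downloaded workflow.ga file.
SAMPLE_GA = """
{
  "a_galaxy_workflow": "true",
  "steps": {
    "0": {"type": "data_input", "tool_id": null},
    "1": {"type": "tool",
          "tool_id": "toolshed.g2.bx.psu.edu/repos/iuc/meryl/meryl/1.3+galaxy6"},
    "2": {"type": "tool", "tool_id": "CONVERTER_gz_to_uncompressed"},
    "3": {"type": "tool",
          "tool_id": "toolshed.g2.bx.psu.edu/repos/iuc/meryl/meryl/1.3+galaxy6"}
  }
}
"""


def shed_tool_ids(workflow: dict) -> list:
    """Return the sorted, de-duplicated ToolShed tool IDs used by a workflow."""
    ids = {step.get("tool_id") for step in workflow.get("steps", {}).values()}
    # Keep only ToolShed-hosted tools; input steps (None) and built-in
    # tools have no '/repos/' segment in their ID.
    return sorted(i for i in ids if i and "/repos/" in i)


workflow = json.loads(SAMPLE_GA)
for tool_id in shed_tool_ids(workflow):
    print(tool_id)
```

To inspect the real file, replace `json.loads(SAMPLE_GA)` with `json.load(open("workflow.ga"))` after running the `wget` command above.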