
These workflows are associated with Large genome assembly and polishing.

To use these workflows in Galaxy, either click a link to download the workflow file, or right-click the link and copy its address, which you can then paste into the Galaxy workflow import form.
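Before importing a downloaded workflow, it can help to check what it contains. The sketch below assumes the downloaded file is a JSON document with a top-level `name` and a `steps` mapping keyed by step index, each step carrying `type`, `label`, and `name` fields; these field names are an assumption to verify against your own file. The `example` string is a hand-written stand-in, not one of the workflows listed here.

```python
import json


def summarize_workflow(ga_text):
    """Return the workflow name and a list of (step index, type, label) tuples."""
    wf = json.loads(ga_text)
    steps = wf.get("steps", {})
    summary = []
    # Step keys are assumed to be numeric strings ("0", "1", ...).
    for key in sorted(steps, key=int):
        step = steps[key]
        # Fall back to the tool name when a step has no label.
        label = step.get("label") or step.get("name") or ""
        summary.append((int(key), step.get("type", ""), label))
    return wf.get("name", ""), summary


# Hypothetical stand-in for the text of a downloaded workflow file.
example = json.dumps({
    "name": "Example workflow",
    "steps": {
        "0": {"type": "data_input", "label": "long reads"},
        "1": {"type": "tool", "label": None, "name": "Flye"},
    },
})

if __name__ == "__main__":
    name, steps = summarize_workflow(example)
    print(name)
    for index, step_type, label in steps:
        print(index, step_type, label)
```

To inspect a real download, read the file's text with `open(path).read()` and pass it to `summarize_workflow`.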

Assembly polishing - upgraded
Anna Syme

Last updated May 8, 2024

License: GPL-3.0-or-later
flowchart TD
  0["ℹ️ Input Dataset\nAssembly to be polished"];
  style 0 stroke:#2c3143,stroke-width:4px;
  1["ℹ️ Input Dataset\nlong reads"];
  style 1 stroke:#2c3143,stroke-width:4px;
  2["ℹ️ Input Parameter\nminimap setting for long reads"];
  style 2 fill:#ded,stroke:#393,stroke-width:4px;
  3["ℹ️ Input Dataset\nIllumina reads R1"];
  style 3 stroke:#2c3143,stroke-width:4px;
  4["🛠️ Subworkflow\nRacon polish with long reads, x4 - upgraded"];
  style 4 fill:#edd,stroke:#900,stroke-width:4px;
  0 -->|output| 4;
  1 -->|output| 4;
  2 -->|output| 4;
  5["Medaka polish"];
  4 -->|Assembly polished by long reads using Racon| 5;
  1 -->|output| 5;
  e3136060-bce7-4af3-87c4-9dcbb0d1f531["Output\nAssembly polished by long reads using Medaka"];
  5 --> e3136060-bce7-4af3-87c4-9dcbb0d1f531;
  style e3136060-bce7-4af3-87c4-9dcbb0d1f531 stroke:#2c3143,stroke-width:4px;
  6["Fasta statistics after Racon long read polish"];
  4 -->|Assembly polished by long reads using Racon| 6;
  7["Fasta statistics after Medaka polish"];
  5 -->|out_consensus| 7;
  8["🛠️ Subworkflow\nRacon polish with Illumina reads R1 only, x2 - upgraded"];
  style 8 fill:#edd,stroke:#900,stroke-width:4px;
  5 -->|out_consensus| 8;
  3 -->|output| 8;
  9["Fasta statistics after Racon short read polish"];
  8 -->|Assembly polished by short reads using Racon| 9;
Assembly with Flye - upgraded
Anna Syme

Last updated May 8, 2024

License: GPL-3.0-or-later
flowchart TD
  0["ℹ️ Input Dataset\nlong reads"];
  style 0 stroke:#2c3143,stroke-width:4px;
  1["Flye: assembly"];
  0 -->|output| 1;
  3960e31d-7a9e-400c-bb21-f6e47b75e649["Output\nFlye assembly on input dataset(s) (consensus)"];
  1 --> 3960e31d-7a9e-400c-bb21-f6e47b75e649;
  style 3960e31d-7a9e-400c-bb21-f6e47b75e649 stroke:#2c3143,stroke-width:4px;
  e524f295-a957-4c91-838c-f8e98e809b6c["Output\nFlye assembly on input dataset(s) (assembly_graph)"];
  1 --> e524f295-a957-4c91-838c-f8e98e809b6c;
  style e524f295-a957-4c91-838c-f8e98e809b6c stroke:#2c3143,stroke-width:4px;
  48b854e2-dd6e-4345-8d05-09abca6659da["Output\nFlye assembly on input dataset(s) (Graphical Fragment Assembly)"];
  1 --> 48b854e2-dd6e-4345-8d05-09abca6659da;
  style 48b854e2-dd6e-4345-8d05-09abca6659da stroke:#2c3143,stroke-width:4px;
  8672c172-71a7-432c-9679-a8e37f36cf53["Output\nFlye assembly on input dataset(s) (assembly_info)"];
  1 --> 8672c172-71a7-432c-9679-a8e37f36cf53;
  style 8672c172-71a7-432c-9679-a8e37f36cf53 stroke:#2c3143,stroke-width:4px;
  2["Fasta statistics"];
  1 -->|consensus| 2;
  3["Quast genome report"];
  1 -->|consensus| 3;
  17cdf8e0-8ad4-4570-afae-1861934fc678["Output\nQuast on input dataset(s):  HTML report"];
  3 --> 17cdf8e0-8ad4-4570-afae-1861934fc678;
  style 17cdf8e0-8ad4-4570-afae-1861934fc678 stroke:#2c3143,stroke-width:4px;
  4["Bandage image: Flye assembly"];
  1 -->|assembly_gfa| 4;
  e66bd129-146f-48dc-95b8-39a2a1ffb68d["Output\nBandage Image on input dataset(s): Assembly Graph Image"];
  4 --> e66bd129-146f-48dc-95b8-39a2a1ffb68d;
  style e66bd129-146f-48dc-95b8-39a2a1ffb68d stroke:#2c3143,stroke-width:4px;
  5["Bar chart: show contig sizes"];
  1 -->|assembly_info| 5;
  6d0a4e23-d631-4e37-8930-3d22fc91369b["Output\nBar chart showing contig sizes"];
  5 --> 6d0a4e23-d631-4e37-8930-3d22fc91369b;
  style 6d0a4e23-d631-4e37-8930-3d22fc91369b stroke:#2c3143,stroke-width:4px;
Assess genome quality - upgraded
Anna Syme

Last updated May 8, 2024

License: GPL-3.0-or-later
flowchart TD
  0["ℹ️ Input Dataset\nPolished assembly"];
  style 0 stroke:#2c3143,stroke-width:4px;
  1["ℹ️ Input Dataset\nReference genome"];
  style 1 stroke:#2c3143,stroke-width:4px;
  2["Busco: assess assembly"];
  0 -->|output| 2;
  219e4952-36e0-4b01-a407-e774b5b02dca["Output\nBusco short summary"];
  2 --> 219e4952-36e0-4b01-a407-e774b5b02dca;
  style 219e4952-36e0-4b01-a407-e774b5b02dca stroke:#2c3143,stroke-width:4px;
  3["Quast: assess assembly"];
  1 -->|output| 3;
  0 -->|output| 3;
  77ab0186-4cfd-460f-b71e-39a923414ef4["Output\nQuast on input dataset(s):  HTML report"];
  3 --> 77ab0186-4cfd-460f-b71e-39a923414ef4;
  style 77ab0186-4cfd-460f-b71e-39a923414ef4 stroke:#2c3143,stroke-width:4px;
Combined workflows for large genome assembly - upgraded
Anna Syme

Last updated May 8, 2024

License: GPL-3.0-or-later
flowchart TD
  0["ℹ️ Input Dataset\nlong reads"];
  style 0 stroke:#2c3143,stroke-width:4px;
  1["ℹ️ Input Dataset\nR1"];
  style 1 stroke:#2c3143,stroke-width:4px;
  2["ℹ️ Input Dataset\nR2"];
  style 2 stroke:#2c3143,stroke-width:4px;
  3["ℹ️ Input Parameter\nminimap settings for long reads"];
  style 3 fill:#ded,stroke:#393,stroke-width:4px;
  4["ℹ️ Input Dataset\nReference genome for Quast"];
  style 4 stroke:#2c3143,stroke-width:4px;
  5["🛠️ Subworkflow\nkmer counting - meryl - upgraded"];
  style 5 fill:#edd,stroke:#900,stroke-width:4px;
  1 -->|output| 5;
  6["🛠️ Subworkflow\nData QC - upgraded"];
  style 6 fill:#edd,stroke:#900,stroke-width:4px;
  1 -->|output| 6;
  2 -->|output| 6;
  0 -->|output| 6;
  7["🛠️ Subworkflow\nTrim and filter reads - fastp - upgraded "];
  style 7 fill:#edd,stroke:#900,stroke-width:4px;
  1 -->|output| 7;
  2 -->|output| 7;
  0 -->|output| 7;
  8["🛠️ Subworkflow\nAssembly with Flye - upgraded"];
  style 8 fill:#edd,stroke:#900,stroke-width:4px;
  7 -->|fastp filtered long reads| 8;
  9["🛠️ Subworkflow\nAssembly polishing - upgraded"];
  style 9 fill:#edd,stroke:#900,stroke-width:4px;
  8 -->|Flye assembly on input datasets consensus| 9;
  7 -->|fastp filtered R1 reads| 9;
  7 -->|fastp filtered long reads| 9;
  3 -->|output| 9;
  10["🛠️ Subworkflow\nAssess genome quality - upgraded"];
  style 10 fill:#edd,stroke:#900,stroke-width:4px;
  9 -->|Assembly polished by long reads using Medaka| 10;
  4 -->|output| 10;
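The edges in the flowchart above imply an order in which the subworkflows can run. As a minimal sketch, the dependencies can be written as a Python dictionary and sorted with the standard library's `graphlib` (Python 3.9 or later). The shortened step names are for illustration only, and this is not how Galaxy itself schedules steps.

```python
from graphlib import TopologicalSorter

# Each key lists the inputs or subworkflows it depends on,
# transcribed from the edges of the combined workflow flowchart.
dependencies = {
    "kmer counting - meryl": {"R1"},
    "Data QC": {"R1", "R2", "long reads"},
    "Trim and filter reads - fastp": {"R1", "R2", "long reads"},
    "Assembly with Flye": {"Trim and filter reads - fastp"},
    "Assembly polishing": {
        "Assembly with Flye",
        "Trim and filter reads - fastp",
        "minimap settings for long reads",
    },
    "Assess genome quality": {"Assembly polishing", "Reference genome for Quast"},
}

# static_order() yields every node after all of its dependencies.
order = list(TopologicalSorter(dependencies).static_order())

if __name__ == "__main__":
    for step in order:
        print(step)
```

Several valid orders exist, since k-mer counting and Data QC do not depend on the assembly steps; the only guarantee is that each step appears after everything it consumes.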
Data QC - upgraded
Anna Syme

Last updated May 8, 2024

License: GPL-3.0-or-later
flowchart TD
  0["ℹ️ Input Dataset\nInput file: long reads"];
  style 0 stroke:#2c3143,stroke-width:4px;
  1["ℹ️ Input Dataset\nInput file: Illumina reads R1"];
  style 1 stroke:#2c3143,stroke-width:4px;
  2["ℹ️ Input Dataset\nInput file: Illumina reads R2"];
  style 2 stroke:#2c3143,stroke-width:4px;
  3["Nanoplot: long reads"];
  0 -->|output| 3;
  73d0e4cf-366e-41c1-810a-b269638826b3["Output\nNanoPlot on input dataset(s): HTML report"];
  3 --> 73d0e4cf-366e-41c1-810a-b269638826b3;
  style 73d0e4cf-366e-41c1-810a-b269638826b3 stroke:#2c3143,stroke-width:4px;
  4["FastQC on R1"];
  1 -->|output| 4;
  5["FastQC on R2"];
  2 -->|output| 5;
  6["MultiQC: combine fastQC reports"];
  4 -->|text_file| 6;
  5 -->|text_file| 6;
  8baf8700-876e-4a74-ad99-6e656f3ba618["Output\nMultiQC on input dataset(s): Webpage"];
  6 --> 8baf8700-876e-4a74-ad99-6e656f3ba618;
  style 8baf8700-876e-4a74-ad99-6e656f3ba618 stroke:#2c3143,stroke-width:4px;
Racon polish with Illumina reads (R1 only), x2 - upgraded
Anna Syme

Last updated May 8, 2024

License: GPL-3.0-or-later
flowchart TD
  0["ℹ️ Input Dataset\nAssembly to be polished"];
  style 0 stroke:#2c3143,stroke-width:4px;
  1["ℹ️ Input Dataset\nIllumina reads, R1, in fastq.gz format"];
  style 1 stroke:#2c3143,stroke-width:4px;
  2["Minimap2 round 1: map reads to assembly"];
  1 -->|output| 2;
  0 -->|output| 2;
  3["Racon round 1: polish assembly"];
  0 -->|output| 3;
  2 -->|alignment_output| 3;
  1 -->|output| 3;
  4["Minimap2 round 2: map reads to assembly"];
  1 -->|output| 4;
  3 -->|consensus| 4;
  5["Racon round 2: polish assembly"];
  3 -->|consensus| 5;
  4 -->|alignment_output| 5;
  1 -->|output| 5;
  594819c3-668e-4575-b9a6-4459ffacf952["Output\nAssembly polished by short reads using Racon"];
  5 --> 594819c3-668e-4575-b9a6-4459ffacf952;
  style 594819c3-668e-4575-b9a6-4459ffacf952 stroke:#2c3143,stroke-width:4px;
Racon polish with long reads, x4 - upgraded
Anna Syme

Last updated May 8, 2024

License: GPL-3.0-or-later
flowchart TD
  0["ℹ️ Input Dataset\nAssembly to be polished"];
  style 0 stroke:#2c3143,stroke-width:4px;
  1["ℹ️ Input Dataset\nlong reads"];
  style 1 stroke:#2c3143,stroke-width:4px;
  2["ℹ️ Input Parameter\nminimap setting for long reads "];
  style 2 fill:#ded,stroke:#393,stroke-width:4px;
  3["Minimap2: map long reads to assembly"];
  2 -->|output| 3;
  1 -->|output| 3;
  0 -->|output| 3;
  4["Racon: polish 1"];
  0 -->|output| 4;
  3 -->|alignment_output| 4;
  1 -->|output| 4;
  5["Minimap2: map long reads to polished assembly 1"];
  2 -->|output| 5;
  1 -->|output| 5;
  4 -->|consensus| 5;
  6["Racon: polish 2"];
  4 -->|consensus| 6;
  5 -->|alignment_output| 6;
  1 -->|output| 6;
  7["Minimap2: map long reads to polished assembly 2"];
  2 -->|output| 7;
  1 -->|output| 7;
  6 -->|consensus| 7;
  8["Racon: polish 3"];
  6 -->|consensus| 8;
  7 -->|alignment_output| 8;
  1 -->|output| 8;
  9["Minimap2: map long reads to polished assembly 3"];
  2 -->|output| 9;
  1 -->|output| 9;
  8 -->|consensus| 9;
  10["Racon: polish 4"];
  8 -->|consensus| 10;
  9 -->|alignment_output| 10;
  1 -->|output| 10;
  bcf0f03c-5951-46a7-aa38-545aed9bc183["Output\nAssembly polished by long reads using Racon"];
  10 --> bcf0f03c-5951-46a7-aa38-545aed9bc183;
  style bcf0f03c-5951-46a7-aa38-545aed9bc183 stroke:#2c3143,stroke-width:4px;
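The flowchart above repeats one pattern four times: map the long reads to the current assembly, then polish that assembly with Racon. A rough command-line equivalent can be sketched as below. The function only builds command strings and does not run them. The file names are hypothetical, PAF output is assumed, and the "minimap setting for long reads" parameter is assumed to correspond to minimap2's `-x` preset (for example `map-ont` or `map-pb`); the options used by the Galaxy tools may differ.

```python
def racon_rounds(assembly, reads, preset="map-ont", rounds=4):
    """Build (not run) the minimap2 and racon commands for each polishing round."""
    commands = []
    current = assembly
    for i in range(1, rounds + 1):
        paf = f"round{i}.paf"              # hypothetical alignment file name
        polished = f"polished_{i}.fasta"   # hypothetical output file name
        # minimap2 takes the target (assembly) first, then the query (reads).
        commands.append(f"minimap2 -x {preset} {current} {reads} > {paf}")
        # racon takes the reads, the overlaps, then the target sequences.
        commands.append(f"racon {reads} {paf} {current} > {polished}")
        # The polished output becomes the target of the next round.
        current = polished
    return commands


if __name__ == "__main__":
    for command in racon_rounds("assembly.fasta", "long_reads.fastq"):
        print(command)
```

Calling `racon_rounds(..., rounds=2)` gives the same two-round shape as the short-read polishing workflow, although the appropriate minimap2 preset for Illumina reads would differ.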
Trim and filter reads - fastp - upgraded
Anna Syme

Last updated May 8, 2024

License: GPL-3.0-or-later
flowchart TD
  0["ℹ️ Input Dataset\nIllumina reads R1"];
  style 0 stroke:#2c3143,stroke-width:4px;
  1["ℹ️ Input Dataset\nIllumina reads R2"];
  style 1 stroke:#2c3143,stroke-width:4px;
  2["ℹ️ Input Dataset\nlong reads"];
  style 2 stroke:#2c3143,stroke-width:4px;
  3["fastp on short reads"];
  0 -->|output| 3;
  1 -->|output| 3;
  656e4138-41ab-4561-8989-33de9ac9a2f3["Output\nfastp report on short reads html"];
  3 --> 656e4138-41ab-4561-8989-33de9ac9a2f3;
  style 656e4138-41ab-4561-8989-33de9ac9a2f3 stroke:#2c3143,stroke-width:4px;
  0d53f347-0368-47cb-953a-2e4dac57e013["Output\nfastp filtered R1 reads"];
  3 --> 0d53f347-0368-47cb-953a-2e4dac57e013;
  style 0d53f347-0368-47cb-953a-2e4dac57e013 stroke:#2c3143,stroke-width:4px;
  10fbe1e5-400c-4ffe-8794-9b776b0d7322["Output\nfastp report on short reads json"];
  3 --> 10fbe1e5-400c-4ffe-8794-9b776b0d7322;
  style 10fbe1e5-400c-4ffe-8794-9b776b0d7322 stroke:#2c3143,stroke-width:4px;
  639ed3f7-0e51-4e5d-b6f8-081378962109["Output\nfastp filtered R2 reads"];
  3 --> 639ed3f7-0e51-4e5d-b6f8-081378962109;
  style 639ed3f7-0e51-4e5d-b6f8-081378962109 stroke:#2c3143,stroke-width:4px;
  4["fastp on long reads"];
  2 -->|output| 4;
  5e0d2c3d-41a4-4823-ae9c-b1e4d2826541["Output\nfastp report on long reads html"];
  4 --> 5e0d2c3d-41a4-4823-ae9c-b1e4d2826541;
  style 5e0d2c3d-41a4-4823-ae9c-b1e4d2826541 stroke:#2c3143,stroke-width:4px;
  e6018ad6-86f4-4e78-8cf2-ccc8b97022fe["Output\nfastp filtered long reads"];
  4 --> e6018ad6-86f4-4e78-8cf2-ccc8b97022fe;
  style e6018ad6-86f4-4e78-8cf2-ccc8b97022fe stroke:#2c3143,stroke-width:4px;
  69f8383b-a1be-4a74-95f4-3dba35e01426["Output\nfastp report on long reads json"];
  4 --> 69f8383b-a1be-4a74-95f4-3dba35e01426;
  style 69f8383b-a1be-4a74-95f4-3dba35e01426 stroke:#2c3143,stroke-width:4px;
kmer counting - meryl - upgraded
Anna Syme

Last updated May 8, 2024

License: GPL-3.0-or-later
flowchart TD
  0["ℹ️ Input Dataset\nIllumina reads R1"];
  style 0 stroke:#2c3143,stroke-width:4px;
  1["Meryl - count kmers"];
  0 -->|output| 1;
  899ddd93-4c0f-4f81-a973-8120494ed983["Output\nMeryl on input dataset(s): read-db.meryldb"];
  1 --> 899ddd93-4c0f-4f81-a973-8120494ed983;
  style 899ddd93-4c0f-4f81-a973-8120494ed983 stroke:#2c3143,stroke-width:4px;
  2["Meryl - generate histogram"];
  1 -->|read_db| 2;
  3["GenomeScope"];
  2 -->|read_db_hist| 3;
  efc727b6-1ef4-4c4c-8cce-35c7d3cc8aac["Output\nGenomeScope on input dataset(s) Transformed log plot"];
  3 --> efc727b6-1ef4-4c4c-8cce-35c7d3cc8aac;
  style efc727b6-1ef4-4c4c-8cce-35c7d3cc8aac stroke:#2c3143,stroke-width:4px;
  701df341-5767-44bc-ade2-6af498ab7467["Output\nGenomeScope on input dataset(s) Transformed linear plot"];
  3 --> 701df341-5767-44bc-ade2-6af498ab7467;
  style 701df341-5767-44bc-ade2-6af498ab7467 stroke:#2c3143,stroke-width:4px;
  85fa4004-b351-47b3-84aa-6788d338037a["Output\nGenomeScope on input dataset(s) Log plot"];
  3 --> 85fa4004-b351-47b3-84aa-6788d338037a;
  style 85fa4004-b351-47b3-84aa-6788d338037a stroke:#2c3143,stroke-width:4px;
  c71ce055-98f0-4354-9397-2f8833b90cc4["Output\nGenomeScope on input dataset(s) Linear plot"];
  3 --> c71ce055-98f0-4354-9397-2f8833b90cc4;
  style c71ce055-98f0-4354-9397-2f8833b90cc4 stroke:#2c3143,stroke-width:4px;
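To illustrate what the counting and histogram steps compute, here is a small pure-Python sketch. It is not Meryl's implementation and would be far too slow for real read sets. Counting each k-mer together with its reverse complement (the "canonical" form) is an assumption made here for illustration. The histogram maps each multiplicity to the number of distinct k-mers seen that many times, which is the kind of table a k-mer profiling tool such as GenomeScope takes as input.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def count_kmers(reads, k):
    """Count canonical k-mers across reads, skipping k-mers with non-ACGT bases."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[canonical(kmer)] += 1
    return counts


def histogram(counts):
    """Return sorted (multiplicity, number of distinct k-mers) pairs."""
    return sorted(Counter(counts.values()).items())


if __name__ == "__main__":
    toy_reads = ["ACGTAC", "GTACGT"]  # toy data, not real reads
    for multiplicity, n_kmers in histogram(count_kmers(toy_reads, 3)):
        print(multiplicity, n_kmers)
```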

