QC + Mapping + Counting - Ref Based RNA Seq - Transcriptomics - GTN - subworkflows

transcriptomics-ref-based/qc-mapping-counting

Author(s): Bérénice Batut, Mallory Freeberg, Mo Heydarian, Anika Erxleben, Pavankumar Videm, Clemens Blank, Maria Doyle, Nicola Soranzo, Peter van Heusden, Lucille Delisle
Version: 9
Last updated: Jul 5, 2024
License: MIT
Tags: transcriptomics

Features
Tutorial: Reference-based RNA-Seq data analysis
Workflow Testing
Tests: ✅
Results: Not yet automated
PURL: https://gxy.io/GTN:W00246
flowchart TD
  0["ℹ️ Input Collection\nPaired list collection with PE fastqs"];
  style 0 stroke:#2c3143,stroke-width:4px;
  1["ℹ️ Input Dataset\nDrosophila_melanogaster.BDGP6.32.109_UCSC.gtf.gz"];
  style 1 stroke:#2c3143,stroke-width:4px;
  2["🛠️ Subworkflow\nFastQC"];
  style 2 fill:#edd,stroke:#900,stroke-width:4px;
  0 -->|output| 2;
  5f7652f1-e225-4ad1-8dbd-4d21544edb89["Output\nmultiqc_fastqc_html"];
  2 --> 5f7652f1-e225-4ad1-8dbd-4d21544edb89;
  style 5f7652f1-e225-4ad1-8dbd-4d21544edb89 stroke:#2c3143,stroke-width:4px;
  3["🛠️ Subworkflow\ncutadapt"];
  style 3 fill:#edd,stroke:#900,stroke-width:4px;
  0 -->|output| 3;
  41c7fbb5-655b-457c-8ba7-0e2eeab3d7ee["Output\nmultiqc_cutadapt_html"];
  3 --> 41c7fbb5-655b-457c-8ba7-0e2eeab3d7ee;
  style 41c7fbb5-655b-457c-8ba7-0e2eeab3d7ee stroke:#2c3143,stroke-width:4px;
  4["🛠️ Subworkflow\nSTAR + multiQC"];
  style 4 fill:#edd,stroke:#900,stroke-width:4px;
  1 -->|output| 4;
  3 -->|out_pairs| 4;
  8aa5ef30-3f09-4a93-944d-0d89101c056a["Output\nmultiqc_star_html"];
  4 --> 8aa5ef30-3f09-4a93-944d-0d89101c056a;
  style 8aa5ef30-3f09-4a93-944d-0d89101c056a stroke:#2c3143,stroke-width:4px;
  11af5c57-91b4-496c-9b0c-b02904963f81["Output\nSTAR_BAM"];
  4 --> 11af5c57-91b4-496c-9b0c-b02904963f81;
  style 11af5c57-91b4-496c-9b0c-b02904963f81 stroke:#2c3143,stroke-width:4px;
  5["🛠️ Subworkflow\nmore QC"];
  style 5 fill:#edd,stroke:#900,stroke-width:4px;
  1 -->|output| 5;
  4 -->|STAR_BAM| 5;
  f2eed352-ca21-4d65-8810-f5a1d3c282b4["Output\nmultiqc_read_distrib_html"];
  5 --> f2eed352-ca21-4d65-8810-f5a1d3c282b4;
  style f2eed352-ca21-4d65-8810-f5a1d3c282b4 stroke:#2c3143,stroke-width:4px;
  b306cb12-a275-4c6d-b609-47fdc208864b["Output\nmultiqc_reads_per_chrom_html"];
  5 --> b306cb12-a275-4c6d-b609-47fdc208864b;
  style b306cb12-a275-4c6d-b609-47fdc208864b stroke:#2c3143,stroke-width:4px;
  3375d63c-cdc3-4fbb-8a55-6f504c934918["Output\nmultiqc_gene_body_cov_html"];
  5 --> 3375d63c-cdc3-4fbb-8a55-6f504c934918;
  style 3375d63c-cdc3-4fbb-8a55-6f504c934918 stroke:#2c3143,stroke-width:4px;
  3ea82568-5698-49a7-88fe-91381070aac2["Output\nmultiqc_dup_html"];
  5 --> 3ea82568-5698-49a7-88fe-91381070aac2;
  style 3ea82568-5698-49a7-88fe-91381070aac2 stroke:#2c3143,stroke-width:4px;
  6["🛠️ Subworkflow\nDetermine strandness"];
  style 6 fill:#edd,stroke:#900,stroke-width:4px;
  4 -->|STAR_BAM| 6;
  1 -->|output| 6;
  4 -->|signal_unique_str1| 6;
  4 -->|signal_unique_str2| 6;
  4 -->|reads_per_gene| 6;
  fb810859-f2d0-43f8-ac7c-5c714c5c6805["Output\ninferexperiment"];
  6 --> fb810859-f2d0-43f8-ac7c-5c714c5c6805;
  style fb810859-f2d0-43f8-ac7c-5c714c5c6805 stroke:#2c3143,stroke-width:4px;
  9727824a-3eb2-4430-92d1-b3c40c3041d1["Output\npgt"];
  6 --> 9727824a-3eb2-4430-92d1-b3c40c3041d1;
  style 9727824a-3eb2-4430-92d1-b3c40c3041d1 stroke:#2c3143,stroke-width:4px;
  105313d8-e31a-405d-8fcd-cc5fd93275e2["Output\nmultiqc_star_counts_html"];
  6 --> 105313d8-e31a-405d-8fcd-cc5fd93275e2;
  style 105313d8-e31a-405d-8fcd-cc5fd93275e2 stroke:#2c3143,stroke-width:4px;
  7["🛠️ Subworkflow\ncount STAR"];
  style 7 fill:#edd,stroke:#900,stroke-width:4px;
  1 -->|output| 7;
  4 -->|reads_per_gene| 7;
  5fee8aff-4023-43f1-a653-f5af5357d798["Output\ncounts_from_star"];
  7 --> 5fee8aff-4023-43f1-a653-f5af5357d798;
  style 5fee8aff-4023-43f1-a653-f5af5357d798 stroke:#2c3143,stroke-width:4px;
  bd3388e6-5b45-4fdc-9780-3efd1c34ebf8["Output\ncounts_from_star_sorted"];
  7 --> bd3388e6-5b45-4fdc-9780-3efd1c34ebf8;
  style bd3388e6-5b45-4fdc-9780-3efd1c34ebf8 stroke:#2c3143,stroke-width:4px;
  7b7c698b-4808-4b45-adf1-686f8d273d18["Output\nGene length"];
  7 --> 7b7c698b-4808-4b45-adf1-686f8d273d18;
  style 7b7c698b-4808-4b45-adf1-686f8d273d18 stroke:#2c3143,stroke-width:4px;
  8["🛠️ Subworkflow\ncount featureCount"];
  style 8 fill:#edd,stroke:#900,stroke-width:4px;
  1 -->|output| 8;
  4 -->|STAR_BAM| 8;
  f0de4714-4df8-4506-90d9-384537ad663e["Output\nfeatureCounts_sorted"];
  8 --> f0de4714-4df8-4506-90d9-384537ad663e;
  style f0de4714-4df8-4506-90d9-384537ad663e stroke:#2c3143,stroke-width:4px;
  8b9d6c76-6e82-4691-b8bc-9996d6ae1594["Output\nfeatureCounts_gene_length"];
  8 --> 8b9d6c76-6e82-4691-b8bc-9996d6ae1594;
  style 8b9d6c76-6e82-4691-b8bc-9996d6ae1594 stroke:#2c3143,stroke-width:4px;
  152ba01e-d4f2-4227-8812-87648a1c19ea["Output\nmultiqc_featureCounts_html"];
  8 --> 152ba01e-d4f2-4227-8812-87648a1c19ea;
  style 152ba01e-d4f2-4227-8812-87648a1c19ea stroke:#2c3143,stroke-width:4px;
  46c7a2e8-7819-4715-a028-7ad1de9ed605["Output\nfeatureCounts"];
  8 --> 46c7a2e8-7819-4715-a028-7ad1de9ed605;
  style 46c7a2e8-7819-4715-a028-7ad1de9ed605 stroke:#2c3143,stroke-width:4px;
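
The dependencies between the inputs and subworkflows in the flowchart above can be turned into one valid execution order with the standard `tsort` utility. This is only a sketch: the edges are read off the flowchart, and the short identifiers (`fastq_pairs`, `gtf`, `star`, and so on) are hypothetical stand-ins for the step labels, not names used by Galaxy.

```shell
#!/bin/sh
# Each line is "upstream downstream", taken from the arrows in the flowchart.
# Identifiers are illustrative abbreviations of the step labels (assumption).
subworkflow_order() {
  tsort <<'EOF'
fastq_pairs fastqc
fastq_pairs cutadapt
gtf star
cutadapt star
gtf more_qc
star more_qc
gtf strandness
star strandness
gtf count_star
star count_star
gtf count_featurecounts
star count_featurecounts
EOF
}

# Print one topological order; several orders are valid, so the exact
# sequence may differ between tsort implementations.
subworkflow_order
```

Whatever order is printed, cutadapt precedes STAR, and STAR precedes the QC, strandness, and counting subworkflows, matching the arrows in the diagram.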

Inputs

Input                     Label
Input dataset collection  Paired list collection with PE fastqs
Input dataset             Drosophila_melanogaster.BDGP6.32.109_UCSC.gtf.gz

Outputs

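From                  Output label
FastQC                multiqc_fastqc_html
cutadapt              multiqc_cutadapt_html
STAR + multiQC        multiqc_star_html, STAR_BAM
more QC               multiqc_read_distrib_html, multiqc_reads_per_chrom_html, multiqc_gene_body_cov_html, multiqc_dup_html
Determine strandness  inferexperiment, pgt, multiqc_star_counts_html
count STAR            counts_from_star, counts_from_star_sorted, Gene length
count featureCount    featureCounts_sorted, featureCounts_gene_length, multiqc_featureCounts_html, featureCounts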

Tools

Tool
Cut1
__FLATTEN__
toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_sort_header_tool/1.1.1
toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_tail_tool/1.1.0
toolshed.g2.bx.psu.edu/repos/devteam/fastqc/fastqc/0.73+galaxy0
toolshed.g2.bx.psu.edu/repos/devteam/picard/picard_MarkDuplicates/2.18.2.4
toolshed.g2.bx.psu.edu/repos/devteam/samtools_idxstats/samtools_idxstats/2.0.4
toolshed.g2.bx.psu.edu/repos/iuc/featurecounts/featurecounts/2.0.3+galaxy1
toolshed.g2.bx.psu.edu/repos/iuc/gtftobed12/gtftobed12/357
toolshed.g2.bx.psu.edu/repos/iuc/length_and_gc_content/length_and_gc_content/0.1.2
toolshed.g2.bx.psu.edu/repos/iuc/multiqc/multiqc/1.11+galaxy1
toolshed.g2.bx.psu.edu/repos/iuc/pygenometracks/pygenomeTracks/3.8+galaxy1
toolshed.g2.bx.psu.edu/repos/iuc/rgrnastar/rna_star/2.7.10b+galaxy3
toolshed.g2.bx.psu.edu/repos/lparsons/cutadapt/cutadapt/4.0+galaxy1
toolshed.g2.bx.psu.edu/repos/nilesh/rseqc/rseqc_geneBody_coverage/5.0.1+galaxy2
toolshed.g2.bx.psu.edu/repos/nilesh/rseqc/rseqc_infer_experiment/5.0.1+galaxy2
toolshed.g2.bx.psu.edu/repos/nilesh/rseqc/rseqc_read_distribution/5.0.1+galaxy2

To use these workflows in Galaxy, either download the workflow file, or right-click the workflow link and copy its URL, which can then be pasted into Galaxy's workflow import form.

Importing into Galaxy

Below are the instructions for importing these workflows directly into your Galaxy server of choice to start using them!
Hands-on: Importing a workflow
  • Click on Workflow on the top menu bar of Galaxy. You will see a list of all your workflows.
  • Click on Import at the top-right of the screen
  • Provide your workflow
    • Option 1: Paste the URL of the workflow into the box labelled “Archived Workflow URL”
    • Option 2: Upload the workflow file in the box labelled “Archived Workflow File”
  • Click the Import workflow button
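
Before uploading a file through Option 2, it can help to confirm that the download really is a Galaxy workflow and not, for example, an HTML error page. The sketch below assumes the file is in Galaxy's native `.ga` JSON format, which carries an `a_galaxy_workflow` key; the function name `looks_like_galaxy_workflow` and the default file name `workflow.ga` are illustrative, and the check is a plain text match, not a full JSON validation.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the file is non-empty and contains the
# "a_galaxy_workflow" marker found in native Galaxy (.ga) workflow files.
looks_like_galaxy_workflow() {
  [ -s "$1" ] && grep -q '"a_galaxy_workflow"' "$1"
}

# Check the file given as the first argument (default: workflow.ga).
file="${1:-workflow.ga}"
if looks_like_galaxy_workflow "$file"; then
  echo "$file looks like a Galaxy workflow"
else
  echo "$file does not look like a Galaxy workflow"
fi
```

A file that fails this check is unlikely to import, so downloading it again is usually the quicker fix.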

Below is a short video demonstrating how to import a workflow from GitHub using this procedure:

Video: Importing a workflow from URL

Version History

Version Commit Time Comments
19 a1251f286 2024-07-05 09:38:54 Removed 'comments' tags
18 068c0f303 2024-07-05 09:28:05 Updated 'QC + Mapping + Counting' workflow
17 3377d5c6f 2023-10-20 13:31:21 update workflow to have steps in the same order as in the tutorial
16 41dead43e 2023-05-02 10:31:07 add mo orcid to workflows
15 36eb5cf82 2023-04-28 17:26:00 update workflows and tests
14 f35bb9e74 2023-04-27 13:30:02 update zenodo try to make workflow test working
13 8fc9c9026 2023-04-25 07:46:15 add creators and licence to workflows
12 dc21d9ddb 2023-04-22 08:29:08 update images and results, rearrange workflow for part1
11 9921a8623 2023-04-21 12:37:10 Update first part of the tutorial
10 6203157c4 2022-05-05 08:25:29 revert bdc1fd3
9 4d2f611a6 2022-04-28 15:20:51 subset BAM before gene body coverage
8 bdc1fd3ce 2022-04-28 08:35:56 switch order of fastqc and flatten
7 8ff9bda0f 2022-04-14 21:15:02 update workflow to fit test
6 bae8287b9 2022-04-14 12:52:02 update qc workflow and test
5 e08c38b2b 2022-04-05 19:36:51 add tag
4 35d565217 2022-04-05 13:18:22 update workflows
3 667ff3de9 2020-01-22 10:59:29 annotation
2 eb4d724e0 2020-01-15 10:41:35 Workflow renaming
1 e477f2b7f 2019-09-10 09:22:59 Split workflow and add more tests

For Admins

Installing the workflow tools

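# Requires the Ephemeris command-line tools (for example: pip install ephemeris).
# Replace GALAXY with your Galaxy server URL and API_KEY with an API key for that server (installing tools requires an admin key).
wget https://training.galaxyproject.org/training-material/topics/transcriptomics/tutorials/ref-based/workflows/qc-mapping-counting.ga -O workflow.ga
workflow-to-tools -w workflow.ga -o tools.yaml
shed-tools install -g GALAXY -a API_KEY -t tools.yaml
workflow-install -g GALAXY -a API_KEY -w workflow.ga --publish-workflows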