GTN Proteogenomics1 Database Creation

proteomics-proteogenomics-dbcreation/galaxy-workflow-mouse-rnaseq-dbcreation

Version: 8
Last updated: Feb 9, 2021
License: None Specified, defaults to CC-BY-4.0
Tags: proteomics

Features

Tutorial: Proteogenomics 1: Database Creation

Workflow Testing
Tests: ❌
Results: Not yet automated

FAIRness
PURL: https://gxy.io/GTN:W00172
flowchart TD
  0["ℹ️ Input Dataset\nFASTQ_ProB_22LIST.fastqsanger"];
  style 0 stroke:#2c3143,stroke-width:4px;
  1["ℹ️ Input Dataset\nMus_musculus.GRCm38.86.gtf"];
  style 1 stroke:#2c3143,stroke-width:4px;
  2["ℹ️ Input Dataset\nTrimmed_ref_5000_uniprot_cRAP.fasta"];
  style 2 stroke:#2c3143,stroke-width:4px;
  3["HISAT2"];
  0 -->|output| 3;
  4["Editing GTF File"];
  1 -->|output| 4;
  70b6f719-803f-4fa9-8394-a3b64b71ec01["Output\nchr_corrected_gtf"];
  4 --> 70b6f719-803f-4fa9-8394-a3b64b71ec01;
  style 70b6f719-803f-4fa9-8394-a3b64b71ec01 stroke:#2c3143,stroke-width:4px;
  5["FASTA-to-Tabular"];
  2 -->|output| 5;
  6["FreeBayes"];
  3 -->|output_alignments| 6;
  7["StringTie"];
  4 -->|outfile| 7;
  3 -->|output_alignments| 7;
  8["Filter Tabular"];
  5 -->|output| 8;
  9["CustomProDB"];
  3 -->|output_alignments| 9;
  6 -->|output_vcf| 9;
  10["GffCompare"];
  4 -->|outfile| 10;
  7 -->|output_gtf| 10;
  11["SQLite to tabular"];
  9 -->|output_genomic_mapping_sqlite| 11;
  12["SQLite to tabular"];
  9 -->|output_variant_annotation_sqlite| 12;
  13["FASTA Merge Files and Filter Unique Sequences"];
  9 -->|output_rpkm| 13;
  9 -->|output_snv| 13;
  9 -->|output_indel| 13;
  a5054be9-dd74-4d60-a4c6-494fdbcb97a1["Output\nCustomProDB_FASTA"];
  13 --> a5054be9-dd74-4d60-a4c6-494fdbcb97a1;
  style a5054be9-dd74-4d60-a4c6-494fdbcb97a1 stroke:#2c3143,stroke-width:4px;
  14["FASTA-to-Tabular"];
  9 -->|output_rpkm| 14;
  15["Convert gffCompare annotated GTF to BED"];
  10 -->|transcripts_annotated| 15;
  16["Column Regex Find And Replace"];
  11 -->|query_results| 16;
  17["Column Regex Find And Replace"];
  12 -->|query_results| 17;
  18["FASTA-to-Tabular"];
  13 -->|output| 18;
  19["Filter Tabular"];
  14 -->|output| 19;
  20["Translate BED transcripts"];
  15 -->|output| 20;
  21["Query Tabular"];
  17 -->|out_file1| 21;
  e9468479-4ebd-4ced-8086-c10e916b97c5["Output\nvariant_annoation_sqlite"];
  21 --> e9468479-4ebd-4ced-8086-c10e916b97c5;
  style e9468479-4ebd-4ced-8086-c10e916b97c5 stroke:#2c3143,stroke-width:4px;
  22["Column Regex Find And Replace"];
  18 -->|output| 22;
  23["Reference Protein Accessions"];
  19 -->|output| 23;
  8 -->|output| 23;
  19aa5a85-5933-4ab5-ae76-e4b1ec5b2f85["Output\nReference_Protein_Accessions"];
  23 --> 19aa5a85-5933-4ab5-ae76-e4b1ec5b2f85;
  style 19aa5a85-5933-4ab5-ae76-e4b1ec5b2f85 stroke:#2c3143,stroke-width:4px;
  24["bed to protein map"];
  20 -->|translation_bed| 24;
  25["Tabular-to-FASTA"];
  22 -->|out_file1| 25;
  26["Concatenate datasets"];
  24 -->|output| 26;
  16 -->|out_file1| 26;
  27["FASTA Merge Files and Filter Unique Sequences"];
  2 -->|output| 27;
  25 -->|output| 27;
  20 -->|translation_fasta| 27;
  924bce7c-5c6a-4cb3-a459-18784a55ae01["Output\nUniprot_cRAP_SAV_indel_translatedbed_FASTA"];
  27 --> 924bce7c-5c6a-4cb3-a459-18784a55ae01;
  style 924bce7c-5c6a-4cb3-a459-18784a55ae01 stroke:#2c3143,stroke-width:4px;
  28["Genomic_mapping_sqlite"];
  26 -->|out_file1| 28;
  6542c4e5-289a-4be4-9b42-7a4e170ba198["Output\ngenomic_mapping_sqlite"];
  28 --> 6542c4e5-289a-4be4-9b42-7a4e170ba198;
  style 6542c4e5-289a-4be4-9b42-7a4e170ba198 stroke:#2c3143,stroke-width:4px;
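
The flowchart can also be read as a dependency graph. The sketch below is an illustration only: the edge list is transcribed by hand from the numbered steps in the diagram, and `upstream_inputs` is a hypothetical helper name, not part of Galaxy. It walks the edges backwards to show which of the three input datasets a given step depends on.

```python
from collections import defaultdict

# Step-to-step edges (source, target), transcribed by hand from the flowchart.
EDGES = [
    (0, 3), (1, 4), (2, 5), (3, 6), (4, 7), (3, 7), (5, 8), (3, 9), (6, 9),
    (4, 10), (7, 10), (9, 11), (9, 12), (9, 13), (9, 14), (10, 15), (11, 16),
    (12, 17), (13, 18), (14, 19), (15, 20), (17, 21), (18, 22), (19, 23),
    (8, 23), (20, 24), (22, 25), (24, 26), (16, 26), (2, 27), (25, 27),
    (20, 27), (26, 28),
]

# The three input datasets of the workflow, keyed by step number.
INPUTS = {
    0: "FASTQ_ProB_22LIST.fastqsanger",
    1: "Mus_musculus.GRCm38.86.gtf",
    2: "Trimmed_ref_5000_uniprot_cRAP.fasta",
}


def upstream_inputs(step, edges=EDGES):
    """Return the sorted input step numbers that `step` depends on."""
    parents = defaultdict(set)
    for source, target in edges:
        parents[target].add(source)

    seen = set()
    stack = [step]
    while stack:
        node = stack.pop()
        for parent in parents.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return sorted(n for n in seen if n in INPUTS)


if __name__ == "__main__":
    # Steps that feed the labelled workflow outputs in the diagram.
    for step in (4, 13, 21, 23, 27, 28):
        names = [INPUTS[i] for i in upstream_inputs(step)]
        print(step, names)
```

Read this way, the final merged FASTA (step 27) is the only labelled output that draws on all three inputs according to the diagram.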

Inputs

Input Label
Input dataset FASTQ_ProB_22LIST.fastqsanger
Input dataset Mus_musculus.GRCm38.86.gtf
Input dataset Trimmed_ref_5000_uniprot_cRAP.fasta

Outputs

From Output Label
Input dataset FASTQ_ProB_22LIST.fastqsanger
Input dataset Mus_musculus.GRCm38.86.gtf
Input dataset Trimmed_ref_5000_uniprot_cRAP.fasta
toolshed.g2.bx.psu.edu/repos/iuc/hisat2/hisat2/2.1.0+galaxy5 HISAT2
toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_replace_in_column/1.1.3 Replace Text Editing GTF File
toolshed.g2.bx.psu.edu/repos/devteam/fasta_to_tabular/fasta2tab/1.1.1 FASTA-to-Tabular
toolshed.g2.bx.psu.edu/repos/devteam/freebayes/freebayes/1.3.1 FreeBayes
toolshed.g2.bx.psu.edu/repos/iuc/stringtie/stringtie/2.1.1 StringTie
toolshed.g2.bx.psu.edu/repos/iuc/filter_tabular/filter_tabular/2.0.0 Filter Tabular
toolshed.g2.bx.psu.edu/repos/galaxyp/custom_pro_db/custom_pro_db/1.22.0 CustomProDB
toolshed.g2.bx.psu.edu/repos/iuc/gffcompare/gffcompare/0.11.2 GffCompare
toolshed.g2.bx.psu.edu/repos/iuc/sqlite_to_tabular/sqlite_to_tabular/2.0.0 SQLite to tabular
toolshed.g2.bx.psu.edu/repos/iuc/sqlite_to_tabular/sqlite_to_tabular/2.0.0 SQLite to tabular
toolshed.g2.bx.psu.edu/repos/galaxyp/fasta_merge_files_and_filter_unique_sequences/fasta_merge_files_and_filter_unique_sequences/1.2.0 FASTA Merge Files and Filter Unique Sequences
toolshed.g2.bx.psu.edu/repos/devteam/fasta_to_tabular/fasta2tab/1.1.1 FASTA-to-Tabular
toolshed.g2.bx.psu.edu/repos/galaxyp/gffcompare_to_bed/gffcompare_to_bed/0.2.1 Convert gffCompare annotated GTF to BED
toolshed.g2.bx.psu.edu/repos/galaxyp/regex_find_replace/regexColumn1/1.0.0 Column Regex Find And Replace
toolshed.g2.bx.psu.edu/repos/galaxyp/regex_find_replace/regexColumn1/1.0.0 Column Regex Find And Replace
toolshed.g2.bx.psu.edu/repos/devteam/fasta_to_tabular/fasta2tab/1.1.1 FASTA-to-Tabular
toolshed.g2.bx.psu.edu/repos/iuc/filter_tabular/filter_tabular/2.0.0 Filter Tabular
toolshed.g2.bx.psu.edu/repos/galaxyp/translate_bed/translate_bed/0.1.0 Translate BED transcripts
toolshed.g2.bx.psu.edu/repos/iuc/query_tabular/query_tabular/3.0.0 Query Tabular
toolshed.g2.bx.psu.edu/repos/galaxyp/regex_find_replace/regexColumn1/1.0.0 Column Regex Find And Replace
toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_cat/0.1.0 Concatenate datasets Reference Protein Accessions
toolshed.g2.bx.psu.edu/repos/galaxyp/bed_to_protein_map/bed_to_protein_map/0.2.0 bed to protein map
toolshed.g2.bx.psu.edu/repos/devteam/tabular_to_fasta/tab2fasta/1.1.1 Tabular-to-FASTA
toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_cat/0.1.0 Concatenate datasets
toolshed.g2.bx.psu.edu/repos/galaxyp/fasta_merge_files_and_filter_unique_sequences/fasta_merge_files_and_filter_unique_sequences/1.2.0 FASTA Merge Files and Filter Unique Sequences
toolshed.g2.bx.psu.edu/repos/iuc/query_tabular/query_tabular/3.0.0 Query Tabular Genomic_mapping_sqlite

Tools

Tool Links
toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_cat/0.1.0 View in ToolShed
toolshed.g2.bx.psu.edu/repos/bgruening/text_processing/tp_replace_in_column/1.1.3 View in ToolShed
toolshed.g2.bx.psu.edu/repos/devteam/fasta_to_tabular/fasta2tab/1.1.1 View in ToolShed
toolshed.g2.bx.psu.edu/repos/devteam/freebayes/freebayes/1.3.1 View in ToolShed
toolshed.g2.bx.psu.edu/repos/devteam/tabular_to_fasta/tab2fasta/1.1.1 View in ToolShed
toolshed.g2.bx.psu.edu/repos/galaxyp/bed_to_protein_map/bed_to_protein_map/0.2.0 View in ToolShed
toolshed.g2.bx.psu.edu/repos/galaxyp/custom_pro_db/custom_pro_db/1.22.0 View in ToolShed
toolshed.g2.bx.psu.edu/repos/galaxyp/fasta_merge_files_and_filter_unique_sequences/fasta_merge_files_and_filter_unique_sequences/1.2.0 View in ToolShed
toolshed.g2.bx.psu.edu/repos/galaxyp/gffcompare_to_bed/gffcompare_to_bed/0.2.1 View in ToolShed
toolshed.g2.bx.psu.edu/repos/galaxyp/regex_find_replace/regexColumn1/1.0.0 View in ToolShed
toolshed.g2.bx.psu.edu/repos/galaxyp/translate_bed/translate_bed/0.1.0 View in ToolShed
toolshed.g2.bx.psu.edu/repos/iuc/filter_tabular/filter_tabular/2.0.0 View in ToolShed
toolshed.g2.bx.psu.edu/repos/iuc/gffcompare/gffcompare/0.11.2 View in ToolShed
toolshed.g2.bx.psu.edu/repos/iuc/hisat2/hisat2/2.1.0+galaxy5 View in ToolShed
toolshed.g2.bx.psu.edu/repos/iuc/query_tabular/query_tabular/3.0.0 View in ToolShed
toolshed.g2.bx.psu.edu/repos/iuc/sqlite_to_tabular/sqlite_to_tabular/2.0.0 View in ToolShed
toolshed.g2.bx.psu.edu/repos/iuc/stringtie/stringtie/2.1.1 View in ToolShed
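
Each tool identifier in the table follows the same slash-separated pattern: Tool Shed host, the literal `repos`, repository owner, repository name, tool name, and version. As a minimal sketch (the `ToolShedId` type and `parse_tool_id` function are hypothetical names introduced here for illustration), an identifier can be split into those parts like this:

```python
from typing import NamedTuple


class ToolShedId(NamedTuple):
    host: str
    owner: str
    repository: str
    tool: str
    version: str


def parse_tool_id(tool_id: str) -> ToolShedId:
    """Split 'host/repos/owner/repository/tool/version' into its parts."""
    parts = tool_id.split("/")
    if len(parts) != 6 or parts[1] != "repos":
        raise ValueError(f"not a Tool Shed tool ID: {tool_id!r}")
    host, _, owner, repository, tool, version = parts
    return ToolShedId(host, owner, repository, tool, version)


if __name__ == "__main__":
    parsed = parse_tool_id(
        "toolshed.g2.bx.psu.edu/repos/iuc/hisat2/hisat2/2.1.0+galaxy5"
    )
    print(parsed.owner, parsed.repository, parsed.version)
```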

To use this workflow in Galaxy, either click the download link to save the workflow file, or right-click the link and copy its address so you can paste it into Galaxy's workflow import form.

Importing into Galaxy

The steps below import the workflow directly into the Galaxy server of your choice.
Hands-on: Importing a workflow
  • Click on Workflow on the top menu bar of Galaxy. You will see a list of all your workflows.
  • Click on Import at the top-right of the screen.
  • Provide your workflow
    • Option 1: Paste the URL of the workflow into the box labelled “Archived Workflow URL”
    • Option 2: Upload the workflow file in the box labelled “Archived Workflow File”
  • Click the Import workflow button
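
Option 1 can in principle be scripted as well. The sketch below only builds the HTTP request and does not send it. It rests on assumptions that should be checked against your server's API documentation: that the Galaxy API accepts a POST to `/api/workflows` with an `archive_source` field holding the workflow URL, and that the API key may be passed in an `x-api-key` header. The server address and key are placeholders, and `build_import_request` is a hypothetical helper name.

```python
import json
import urllib.request

# URL of the workflow file, as used in the "For Admins" section of this page.
WORKFLOW_URL = (
    "https://training.galaxyproject.org/training-material/topics/proteomics/"
    "tutorials/proteogenomics-dbcreation/workflows/"
    "galaxy-workflow-mouse_rnaseq_dbcreation.ga"
)


def build_import_request(galaxy_url, api_key, workflow_url=WORKFLOW_URL):
    """Build (but do not send) a request asking Galaxy to import a workflow by URL."""
    # Assumption: 'archive_source' is the payload field for a workflow URL.
    payload = json.dumps({"archive_source": workflow_url}).encode("utf-8")
    return urllib.request.Request(
        galaxy_url.rstrip("/") + "/api/workflows",
        data=payload,
        # Assumption: the API key is accepted in the 'x-api-key' header.
        headers={"Content-Type": "application/json", "x-api-key": api_key},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder server and key; replace both before sending anything.
    request = build_import_request("https://galaxy.example.org", "YOUR_API_KEY")
    print(request.get_method(), request.full_url)
    # To actually send the request: urllib.request.urlopen(request)
```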

Below is a short video demonstrating how to import a workflow from GitHub using this procedure:

Video: Importing a workflow from URL

Version History

Version Commit Time Comments
12 28fc28c45 2021-02-08 19:18:49 Update galaxy-workflow-mouse_rnaseq_dbcreation.ga
11 1eca54ca9 2021-01-27 21:47:56 Update galaxy-workflow-mouse_rnaseq_dbcreation.ga
10 0db38ae94 2021-01-27 21:41:38 Update galaxy-workflow-mouse_rnaseq_dbcreation.ga
9 13e2ffc4d 2021-01-27 21:27:34 Update galaxy-workflow-mouse_rnaseq_dbcreation.ga
8 2c4b4646c 2021-01-27 21:17:15 Update galaxy-workflow-mouse_rnaseq_dbcreation.ga
7 f99b54e3c 2021-01-27 21:01:48 Update galaxy-workflow-mouse_rnaseq_dbcreation.ga
6 667ff3de9 2020-01-22 10:59:29 annotation
5 eb4d724e0 2020-01-15 10:41:35 Workflow renaming
4 55fe079b2 2020-01-13 16:30:56 WoLF PSORT WF
3 361236c41 2019-04-04 09:00:14 Changed format of workflows
2 6eef55b7e 2019-02-27 18:54:36 Updated install_tutorial_requirements.sh + minor fixes (#1275)
1 a928824de 2018-08-25 09:12:50 add protegenomics dbcreation tutorial

For Admins

Installing the workflow tools

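# Download the workflow file
wget https://training.galaxyproject.org/training-material/topics/proteomics/tutorials/proteogenomics-dbcreation/workflows/galaxy-workflow-mouse_rnaseq_dbcreation.ga -O workflow.ga
# List the Tool Shed tools the workflow needs (workflow-to-tools is part of Ephemeris)
workflow-to-tools -w workflow.ga -o tools.yaml
# Install those tools; replace GALAXY with the server URL and API_KEY with an admin API key
shed-tools install -g GALAXY -a API_KEY -t tools.yaml
# Import the workflow into the server and publish it
workflow-install -g GALAXY -a API_KEY -w workflow.ga --publish-workflows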
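
Before installing anything, it can help to see which tools a workflow file refers to. The sketch below is a rough illustration, not a description of how the command-line tools above work internally. It assumes the `.ga` file is JSON with a `steps` mapping whose entries carry a `tool_id` (empty for input datasets); the sample is a hand-written fragment, not the real workflow file, and `workflow_tool_ids` is a hypothetical helper name.

```python
import json

# Hand-written fragment in the assumed .ga layout (not the real workflow file).
SAMPLE_GA = json.loads("""
{
  "a_galaxy_workflow": "true",
  "steps": {
    "0": {"type": "data_input", "tool_id": null,
          "label": "FASTQ_ProB_22LIST.fastqsanger"},
    "3": {"type": "tool",
          "tool_id": "toolshed.g2.bx.psu.edu/repos/iuc/hisat2/hisat2/2.1.0+galaxy5"},
    "11": {"type": "tool",
           "tool_id": "toolshed.g2.bx.psu.edu/repos/iuc/sqlite_to_tabular/sqlite_to_tabular/2.0.0"},
    "12": {"type": "tool",
           "tool_id": "toolshed.g2.bx.psu.edu/repos/iuc/sqlite_to_tabular/sqlite_to_tabular/2.0.0"}
  }
}
""")


def workflow_tool_ids(workflow):
    """Return the sorted, de-duplicated tool IDs used by a workflow dict."""
    steps = workflow.get("steps", {}).values()
    ids = {step.get("tool_id") for step in steps}
    # Input steps have no tool ID, so drop empty values.
    return sorted(tool_id for tool_id in ids if tool_id)


if __name__ == "__main__":
    for tool_id in workflow_tool_ids(SAMPLE_GA):
        print(tool_id)
```

To try it on the downloaded file instead of the sample, load it with `json.load(open("workflow.ga"))` and pass the result to `workflow_tool_ids`.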