Annotate, prepare tests and publish Galaxy workflows in workflow registries

Overview
Questions:
  • How can a Galaxy workflow be annotated to improve reusability?

  • What are the best practices for testing a Galaxy workflow?

  • How can a Galaxy workflow be submitted to the Intergalactic Workflow Commission (IWC)?

Objectives:
  • Annotate a Galaxy workflow with essential metadata

  • Annotate and apply best practices to data analysis Galaxy workflows for consistency and reusability

  • Implement robust tests to ensure workflow reliability and accuracy

  • Successfully integrate key data analysis Galaxy workflows into IWC, improving accessibility and usability

Requirements:
Time estimation: 2 hours
Level: Intermediate
Supporting Materials:
Published: May 26, 2025
Last modification: May 26, 2025
License: Tutorial Content is licensed under Creative Commons Attribution 4.0 International License. The GTN Framework is licensed under MIT
Revision: 1

Research data is accumulating at an unprecedented rate, presenting significant challenges for achieving fully reproducible science. As a result, implementing high-quality management of scientific data has become a global priority. One key aspect of this effort is the use of computational workflows, which describe the complex, multi-step methods used for data analysis. In Galaxy, workflows are a powerful feature that allows researchers to link multiple steps of complex analyses seamlessly. To maximize their impact, these workflows should adhere to best practices that make them FAIR: Findable, Accessible, Interoperable, and Reusable.

The FAIR principles (Findable, Accessible, Interoperable, and Reusable) provide practical guidelines for enhancing the value of research data:

  • Findable: Easy to locate through rich metadata and unique identifiers.
  • Accessible: Stored in a way that allows them to be retrieved by those who need them.
  • Interoperable: Usable across different systems and platforms without extensive adaptation.
  • Reusable: Well-documented and clearly licensed to enable reuse in different contexts.
    Comment

    You can learn more about FAIR Data, Workflows, and Research in our dedicated topic with numerous tutorials

Applying these principles to workflows is equally important for good data management:

  • Enhanced Discoverability: Well-annotated and documented workflows are easier to find, making them more likely to be used and cited.
  • Improved Reproducibility: Standardized and tested workflows ensure that analyses can be reproduced, validating research findings.
  • Community Collaboration: Sharing workflows through centralized registries fosters collaboration and innovation within the bioinformatics community.
  • Sustainability: Regular updates and versioning ensure that workflows remain compatible with the latest tools, extending their lifespan and utility.

While making a workflow FAIR might seem complicated (Wilkinson et al. 2025, Goble et al. 2020), publications like “Ten quick tips for building FAIR workflows” (de Visser et al. 2023) provide practical guidelines to simplify the process:

This image provides ten quick tips for building FAIR workflows, focusing on findability (registering the workflow and describing it with rich metadata), accessibility (making source code available in a public repository and providing example input data and results), interoperability (adhering to file format standards and making the workflow portable), and reusability (providing a reproducible computational environment, adding a configuration file with defaults, modularizing the workflow, and offering clear documentation).

Figure 1: Ten quick tips for building FAIR workflows. Source: de Visser et al. 2023

Using Galaxy as a workflow management system, many of these tips (all related to Interoperability and Reusability) are already fulfilled:

  • Tip 5 (Interoperability): “The tools integrated in a workflow should adhere to file format standards.”
  • Tip 6 (Interoperability): “Make the workflow portable.”
  • Tip 7 (Reusability): “Provide a reproducible computational environment to run the workflow.”
  • Tip 8 (Reusability): “Add a configuration file with defaults.”
  • Tip 9 (Reusability): “Modularize the workflow.”

In this tutorial, we will demonstrate how to fulfill the remaining tips:

  • Tip 1 (Findability): “Register the workflow.”
  • Tip 2 (Findability): “Describe the workflow with rich metadata.”
  • Tip 3 (Accessibility): “Make source code available in a public code repository.”
  • Tip 4 (Accessibility): “Provide example input data and results along with the workflow.”
  • Tip 10 (Reusability): “Provide clear and concise workflow documentation.”

To illustrate the process, we will use a simple workflow with 2 steps as an example.

Agenda

In this tutorial, we will cover:

  1. Prepare the workflow
  2. Annotate the Galaxy workflow with essential metadata
  3. Adhere to best practices for Galaxy workflows
  4. Prepare example input data and results
  5. Register the workflow by making the workflow available in the public IWC GitHub repository
    1. Check for workflow eligibility
    2. Generate tests
    3. Submit the workflow to IWC
  6. Conclusion

Prepare the workflow

Here, we will use a workflow running Falco (FastQC alternative, de Sena Brandine and Smith 2021) and MultiQC (Ewels et al. 2016) but you can use your own workflow.

Hands On: Create the workflow in Galaxy
  1. Go to the workflow page
  2. Create a new workflow

    1. Click Workflow on the top bar
    2. Click the new workflow galaxy-wf-new button
    3. Give it a clear and memorable name
    4. Clicking Save will take you directly into the workflow editor for that workflow
    5. Need more help? Please see the How to make a workflow subsection here

  3. Add a single data input

  4. Add Falco (Galaxy version 1.2.4+galaxy0) and MultiQC (Galaxy version 1.27+galaxy3)

  5. Connect the input to Falco
  6. Define MultiQC parameter
    • Which tool was used generate logs?: FastQC
  7. Connect the text_file output of Falco to the MultiQC input

Annotate the Galaxy workflow with essential metadata

The first step to FAIRify a Galaxy workflow is to fulfill Tip 2 (Findability): “Describe the workflow with rich metadata”. Describing the workflow with rich metadata helps both humans and machines understand its purpose, facilitating discovery by search engines. Galaxy lets you attach metadata to a workflow directly in its interface.

Hands On: Open workflow attributes
  1. Click on Attributes on the left side of the workflow editor (in older Galaxy versions, click on galaxy-pencil Edit Attributes on the right side instead)

Some workflow metadata should appear.

Question
  1. What is the name of the workflow?
  1. “Unnamed Workflow”

Metadata should include details about the workflow’s components and clearly outline its purpose, scope, and limitations. This ensures the workflow is easily findable and understandable.

Question

Which metadata is supported in the Galaxy workflow interface?

The Galaxy workflow interface supports:

  • Name
  • Version
  • Annotation: These notes will be visible when this workflow is viewed.
  • License
  • Creator: Either a person or an organization.
  • Tags: Apply tags to make it easy to search for and find items with the same tag.

The Galaxy workflow interface supports only some metadata. Is it enough to fulfill Tip 2 and also Tip 1 (“Register the workflow.”)? WorkflowHub, a workflow registry described in more detail later, supports the following metadata:

| Name | Description | Mandatory |
|---|---|---|
| Title | This field is mandatory and, for some workflow types, is pre-filled with the title of the workflow. | Yes |
| Description | If a CWL (abstract) file is given, the description will be parsed automatically out of the doc attribute. In any other case this field can be used to write some documentation that will be shown on the workflow page. | No |
| Source | If the workflow came from an external repository (e.g. GitHub), you can include its original URL here. | No |
| Maturity | This field can be used to specify the maturity state of the workflow. The two available options are: work-in-progress, stable. | No |
| Teams | Every workflow registration is linked to one or more teams. | Yes |
| Licence | The standard licence is Apache Software Licence 2.0. If you did not make the workflow yourself, be sure that the licence corresponds to the licence of the source you took the workflow from (for example GitHub licences). | No |
| Sharing | Specify who can view the summary, get access to the content, and edit the workflow. This is possibly already filled in according to the selected project. | No |
| Tags | Choose an appropriate tag for your workflow. Please check if your tag is already available and use the existing one if so. If you make a new tag, keep it simple without capitals or spaces. For example, all new covid-19 workflows need to be tagged with covid-19. | No |
| Creators | This is an important section where all the people who were involved in making or publishing this workflow are listed. | No |
Question
  1. Which WorkflowHub metadata is supported in the Galaxy workflow interface?
  2. Which values should we put for our workflow?
| WorkflowHub metadata | Galaxy metadata | Value for a mapping workflow |
|---|---|---|
| Title | Name | Quality control and mapping |
| Description | Annotation | Workflow runs quality control and mapping of paired-end short-reads data. |
| Source | Not supported in Galaxy workflow interface | |
| Teams | Not supported in Galaxy workflow interface | |
| License | License | MIT License |
| Sharing | Supported in Galaxy workflow interface but not in the same way as WorkflowHub | |
| Tags | Tags | sequence-analysis, mapping |
| Creators | Creator | |

Let’s annotate the workflow

Hands On: Add metadata to the workflow
  1. Add a proper title in Name
  2. Add a workflow description in Annotation
  3. Select a license in License

    Comment

    The most appropriate licenses are MIT or GNU Affero (if you want copyleft).

  4. Add creators in Creator with their name and a unique identifier (typically an orcid.org ID)

    Comment

    Do not forget to click on Save

  5. Add tags in Tags
  6. Save the workflow
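
Once saved, this metadata is stored with the workflow. As a rough sketch, assuming that a workflow exported as a `.ga` file is JSON with top-level `name`, `annotation`, `license`, `creator`, and `tags` keys (an assumption about the export format; the helper name `missing_metadata` is hypothetical), a few lines of Python can flag fields that are still empty:

```python
import json

# Top-level keys assumed to hold the metadata edited in the Attributes panel.
REQUIRED_KEYS = ("name", "annotation", "license", "creator", "tags")


def missing_metadata(workflow: dict) -> list[str]:
    """Return the assumed metadata keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not workflow.get(key)]


# Hypothetical stand-in for the content of an exported .ga file.
example = json.loads("""
{
  "name": "Quality control",
  "annotation": "",
  "license": "MIT",
  "creator": [],
  "tags": ["sequence-analysis"]
}
""")

print(missing_metadata(example))
```

For a real workflow, `example` would be replaced by the result of `json.load` on the downloaded `.ga` file.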

Tip 2 is now fulfilled. We can now move toward the other recommendations:

  • Tip 1 (Findability): “Register the workflow.”
  • Tip 3 (Accessibility): “Make source code available in a public code repository.”
  • Tip 4 (Accessibility): “Provide example input data and results along with the workflow.”
  • Tip 10 (Reusability): “Provide clear and concise workflow documentation.”

    The workflow annotation provides a short and concise description of the workflow, so Tip 10 is partially fulfilled. In addition to the workflow annotation, each step can be annotated and commented so that users understand the purpose of each step.

    Let’s add a label and an annotation to each step, and comments to the workflow.

    Hands On: Add tool label and annotation and comments
    1. Add tool label and step annotation to Falco
      1. Click on Falco
      2. Fill in Label with Quality control in the right panel
      3. Fill in Step Annotation with This step uses Falco (FastQC alternative) to generate statistics of raw reads quality including basic statistics, per base sequence quality, per sequence quality scores, adapter content, etc.
      4. Save the workflow
    2. Add tool label and step annotation to MultiQC
    3. Add comments and annotations to the workflow to create a high-resolution image of the Galaxy workflow by following the dedicated tutorial

Before we address Tips 3 and 4, let’s use the built-in Galaxy wizard to check whether our workflow adheres to best practices for Galaxy workflows.

Adhere to best practices for Galaxy workflows

Galaxy has a built-in wizard that checks workflows against community-developed best practices. It recommends, for example, adding a license and creators, but it also helps ensure that workflows are easy to test, that they work well as subworkflows and with invocation reports, and that they can be consumed easily via an API. Following these recommendations greatly enhances reusability.

Hands On: Check alignment with best practices
  1. Click on galaxy-wf-best-practices Best Practices on the right

    When you are editing a workflow, there are a number of additional steps you can take to ensure that it is a Best Practice workflow and will be more reusable.

    1. Open a workflow for editing
    2. In the workflow menu bar, you’ll find the galaxy-wf-options Workflow Options dropdown menu.
    3. Click on it and select galaxy-wf-best-practices Best Practices from the dropdown menu.

      screenshot showing the best practices menu item in the gear dropdown.

    4. This will take you to a new side panel, which allows you to investigate and correct any issues with your workflow.

      screenshot showing the best practices side panel. several issues are raised like a missing annotation with a link to add that, and non-optional inputs that are unconnected. Additionally several items already have green checks like the workflow defining creator information and a license.

    The Galaxy community also has a guide on best practices for maintaining workflows. This guide includes the best practices from the Galaxy workflow panel, plus:

    • adding tests to the workflow
    • publishing the workflow on GitHub, a public GitLab server, or another public version-controlled repository
    • registering the workflow with a workflow registry such as WorkflowHub or Dockstore

A side panel will open showing a review of the best practices, with:

  • Green checks indicating that certain elements are already correctly defined (e.g., creator information, license).
  • Orange warnings highlighting potential issues such as missing annotations, unconnected inputs, or other workflow problems.

By following the steps with orange warnings and checking the side panel, we will ensure our workflow adheres to best practices and is ready for public use.

Question

How many orange warnings do we have?

2

The first warning is Some workflow inputs are missing labels and/or annotations. To follow best practices, all inputs should be explicit (with labelled input nodes) and tool steps should not have disconnected data inputs (even though the GUI can handle this) or consume workflow parameters. Older style “runtime parameters” should only be used for post job actions and newer type workflow parameter inputs should be used to manipulate tool logic.
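
The input part of this check can be approximated outside the editor. The sketch below assumes that an exported `.ga` file stores its steps under a `steps` key and that each step carries `type`, `label`, and `annotation` fields; these field names, the list of input types, and the helper name are assumptions made for illustration, not a description of how the Galaxy panel is implemented:

```python
# Step types assumed to represent workflow inputs in an exported .ga file.
INPUT_TYPES = {"data_input", "data_collection_input", "parameter_input"}


def unlabeled_inputs(workflow: dict) -> list[str]:
    """List input steps that lack a label or an annotation."""
    problems = []
    for step_id, step in workflow.get("steps", {}).items():
        if step.get("type") not in INPUT_TYPES:
            continue  # only input steps are checked here
        if not step.get("label"):
            problems.append(f"step {step_id}: missing label")
        if not step.get("annotation"):
            problems.append(f"step {step_id}: missing annotation")
    return problems


# Hypothetical two-step workflow: one unlabeled input and one tool step.
example = {
    "steps": {
        "0": {"type": "data_input", "label": None, "annotation": ""},
        "1": {"type": "tool", "label": None, "annotation": ""},
    }
}

print(unlabeled_inputs(example))
```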

Hands On: Add label to input
  1. Click on Input dataset: Missing a label and annotation
  2. Fill in Label in the 1: Input dataset side panel that opened on the right
  3. Save the workflow
  4. Check the best practices
Question

Is the input now compliant with best practices?

Not yet: the input is still missing an annotation.

Let’s add an annotation to the input.

Hands On: Add annotation to input
  1. Click on Input dataset: Missing a label and annotation
  2. Fill in Annotation in the 1: Input dataset side panel that opened on the right
  3. Save the workflow
  4. Check the best practices

We now have a green check for the input. The second orange warning mentions This workflow has no labeled outputs, please select and label at least one output.

As with inputs, workflows should define explicit, labeled outputs. While Galaxy does not require this, declaring and labeling outputs offers significant advantages. A workflow with clearly defined outputs provides an explicit interface, making it easier to generate reports, test the workflow, and document it.
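
To see which outputs form this explicit interface, the labels can be read from an exported workflow. This is a minimal sketch, assuming each step of the `.ga` file lists its marked outputs under a `workflow_outputs` key with a `label` field (an assumption about the file layout; the helper name is hypothetical):

```python
def labeled_outputs(workflow: dict) -> list[str]:
    """Collect the labels of all outputs marked as workflow outputs."""
    labels = []
    for step in workflow.get("steps", {}).values():
        for output in step.get("workflow_outputs", []):
            if output.get("label"):
                labels.append(output["label"])
    return labels


# Hypothetical excerpt: Falco exposes a labeled and an unlabeled output.
example = {
    "steps": {
        "0": {"workflow_outputs": []},
        "1": {
            "workflow_outputs": [
                {"label": "falco_html", "output_name": "html_file"},
                {"label": None, "output_name": "text_file"},
            ]
        },
        "2": {"workflow_outputs": [{"label": "multiqc_html", "output_name": "html_report"}]},
    }
}

print(labeled_outputs(example))
```

An empty result would correspond to the warning about a workflow with no labeled outputs.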

Hands On: Add labels to outputs
  1. Rename the labels of all outputs to include the tool name (e.g. the Falco output text_file becomes falco_text_file)
  2. Check the boxes to the left of the HTML outputs of both Falco and MultiQC in the middle panel
  3. Save the workflow
  4. Check the best practices
Question

Is the workflow following all best practices?

Yes!

After confirming that all the best practices are applied, make the workflow public. This allows other users to access, reuse, and share the workflow in the Galaxy community.

Hands On: Make the workflow public
  1. Make the workflow public

    1. Click on galaxy-workflows-activity Workflows in the Galaxy activity bar (on the left side of the screen, or in the top menu bar of older Galaxy instances). You will see a list of all your workflows
    2. Click on the history-share Share button of the workflow you would like to publish
    3. Click on Make Workflow accessible. This makes the workflow publicly accessible but unlisted.
    4. To also list the workflow for all users on the Public workflows tab of the galaxy-workflows-activity Workflows page, click Make Workflow publicly available in Published Workflows

Prepare example input data and results

To fulfill Tip 4 (“Provide example input data and results along with the workflow.”), we will create test data that can be used to test the workflow. The data should be small and simple, so it can interact with the workflow and generate results without excessive processing time.
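
If your own data is too large to serve as test data, one simple option is to keep only the first reads of a FASTQ file. The sketch below is one possible way to do that, assuming a gzip-compressed FASTQ file with exactly four lines per read; the function name and file names are hypothetical. Taking the first reads is not a random sample, so check that the subset still exercises the workflow in a meaningful way.

```python
import gzip
import itertools
import os
import tempfile


def head_fastq(src: str, dest: str, n_reads: int) -> int:
    """Copy the first n_reads records (4 lines each) from src to dest.

    Returns the number of reads actually written.
    """
    with gzip.open(src, "rt") as fin, gzip.open(dest, "wt") as fout:
        lines = list(itertools.islice(fin, 4 * n_reads))
        fout.writelines(lines)
    return len(lines) // 4


# Demo on a throwaway file containing three reads.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "full.fastq.gz")
    dest = os.path.join(tmp, "small.fastq.gz")
    with gzip.open(src, "wt") as handle:
        for i in range(3):
            handle.write(f"@read{i}\nACGT\n+\nIIII\n")
    kept = head_fastq(src, dest, n_reads=2)
    with gzip.open(dest, "rt") as handle:
        subset = handle.read()

print(kept)
```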

Hands On: Create an example Galaxy History
  1. Create a new Galaxy history for this analysis.

    To create a new history simply click the new-history icon at the top of the history panel:

    UI for creating new history

  2. Rename the history.

    1. Click on galaxy-pencil (Edit) next to the history name (which by default is “Unnamed history”)
    2. Type the new name
    3. Click on Save
    4. To cancel renaming, click the galaxy-undo “Cancel” button

    If you do not have the galaxy-pencil (Edit) next to the history name (which can be the case if you are using an older version of Galaxy) do the following:

    1. Click on Unnamed history (or the current name of the history) (Click to rename history) at the top of your history panel
    2. Type the new name
    3. Press Enter

  3. galaxy-upload Upload test data to the newly created history from the following link:

    https://zenodo.org/record/3977236/files/female_oral2.fastq-4143.gz
    
    • Copy the link location
    • Click galaxy-upload Upload Data at the top of the tool panel

    • Select galaxy-wf-edit Paste/Fetch Data
    • Paste the link(s) into the text field

    • Press Start

    • Close the window

  4. galaxy-eye Preview the datasets to verify that the uploaded data looks correct and is ready for analysis.

  5. Run the workflow using the uploaded test data.

    1. Click on galaxy-workflows-activity Workflows in the Galaxy activity bar (on the left side of the screen, or in the top menu bar of older Galaxy instances). At the top of the resulting page you will have the option to switch between the My workflows, Workflows shared with me and Public workflows tabs. Select the tab you want to see all workflows in that category.
    2. Click on the workflow-run Run workflow button of the workflow you would like to use
    3. Configure the workflow as needed
    4. Click the Run Workflow button at the top-right of the screen

    You may have to refresh your history to see the queued jobs

  6. Make the history public

    Sharing your history allows others to import and access the datasets, parameters, and steps of your history.

    Access the history sharing menu via the History Options dropdown (galaxy-history-options), and clicking “history-share Share or Publish”

    1. Share via link
      • Open the History Options galaxy-history-options menu at the top of your history panel and select “history-share Share or Publish”
        • galaxy-toggle Make History accessible
        • A Share Link will appear that you give to others
      • Anybody who has this link can view and copy your history
    2. Publish your history
      • galaxy-toggle Make History publicly available in Published Histories
      • Anybody on this Galaxy server will see your history listed under the Published Histories tab opened via the galaxy-histories-activity Histories activity
    3. Share only with another user.
      • Enter an email address for the user you want to share with in the Please specify user email input below Share History with Individual Users
      • Your history will be shared only with this user.
    4. Finding histories others have shared with me
      • Click on the galaxy-histories-activity Histories activity in the activity bar on the left
      • Click the Shared with me tab
      • Here you will see all the histories others have shared with you directly

    Note: If you want to make changes to your history without affecting the shared version, make a copy by going to History Options galaxy-history-options icon in your history and clicking Copy this History

With this example history, we have fulfilled Tip 4. We will now fulfill Tip 1 (“Register the workflow.”) and Tip 3 (“Make source code available in a public code repository.”) at the same time.

Register the workflow by making the workflow available in the public IWC GitHub repository

To make your workflow FAIR, it’s essential to start by making it findable. Let’s now register the workflow in a public registry that enables systematic scientific annotations and supports multiple workflow languages. We recommend using specialized workflow registries such as WorkflowHub (Gustafsson et al. 2024) or Dockstore (Yuen et al. 2021), which cater to workflows written in different languages and provide unique features like digital object identifiers (DOIs) for easy citation and version tracking.

A Galaxy workflow can be registered by anyone on the WorkflowHub and Dockstore by following their documentation. In this section, we will see how to register a workflow on both WorkflowHub and Dockstore (Tip 1 - “Register the workflow.”) by making it available in a public GitHub repository (Tip 3 - “Make source code available in a public code repository.”) maintained by the Intergalactic Workflow Commission (IWC), a Galaxy community effort.

The IWC maintains high-quality Galaxy Workflows via a Galaxy Workflows Library. All workflows are reviewed and tested before publication and with every new Galaxy release. Deposited workflows follow best practices and are versioned using GitHub releases. Workflows also contain important metadata. Additionally the IWC collects further best practices, tips and tricks, FAQs and assists the community in designing high-quality Galaxy workflows.

The IWC offers guidelines for adding workflows, which involve the following steps:

  1. Check for workflow eligibility
  2. Ensure workflows follow best-practices (already done)
  3. Generate tests

Check for workflow eligibility

IWC collects production workflows targeted at users that want to analyze their own data. As such, the workflow should be sufficiently generic that users can provide their own data.

The IWC encourages, but does not require, links to related Galaxy Training Network tutorials. Importantly, each workflow should be described in a way that lets a user run it on their own data without modifying the workflow. If we want to deposit a workflow that accompanies a tutorial, we have to make sure that the workflow does not refer to datasets that only make sense in the context of the tutorial.

By fulfilling the previous tips as we did above, we ensure that the workflow is eligible.

Generate tests

This is usually the most difficult part, and we encourage all new contributors to IWC to propose their workflows even if they did not manage to generate tests. However, the publication of these workflows will be sped up if tests are already present.

Find input datasets

To test the workflow, we need input datasets. By fulfilling Tip 4 (“Provide example input data and results along with the workflow.”), we generated a toy dataset. We now need to publish it to Zenodo to have a permanent URL, also allowing others to easily retrieve and reuse the data when running or validating the workflow.

Hands On: Upload test data to Zenodo
  1. Create an account on Zenodo or log in using your GitHub account.
  2. Click on the “+” button at the top of the page and select “New upload”.
  3. Drag and drop the test dataset files into the upload area.
  4. Fill in the metadata:
    • Resource type: Dataset.
    • Title: Dataset for "..." workflow
    • Creator: Add the relevant author(s).
    • Description: This dataset is associated with the Galaxy workflow "..."
    • License: Choose GNU General Public License v3.0 or later
  5. Click “Publish” to make the dataset publicly available.

Uploading test data to Zenodo ensures that it has a permanent DOI, making it easy to reference in workflow documentation, publications, and testing pipelines.

Generate test from a workflow invocation

To generate tests, we can either write test cases by hand or use a workflow invocation to generate a test case. We will take the second approach. Let’s first prepare the folder to store the tests in the IWC git repository.

Hands On: Prepare the folder for the tests
  1. Fork the IWC GitHub repository to your GitHub account.

  2. Clone your fork locally:

    git clone https://github.com/yourusername/iwc.git
    cd iwc
    
  3. Create a new branch for adding the workflow:

    git checkout -b add-new-workflow
    
  4. Create a new directory under one of the directories that represent categories

    Comment

    If no category is suitable, we can create a new category directory. The directory that contains our workflow(s) should be named appropriately, as its name will become the name of the repository deployed to the iwc-workflows GitHub organization. Use only lower-case letters and - in the names of categories and repositories.

  5. Move into the newly created directory
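
The naming rule from the comment above can be checked with a short regular expression. This is only a sketch: allowing digits and rejecting leading, trailing, or doubled hyphens are assumptions that go beyond the rule as stated, and the function name is hypothetical.

```python
import re

# Lower-case letters and hyphens, as required above. Digits are also allowed
# here, and hyphens may only separate non-empty parts (both are assumptions).
NAME_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_valid_name(name: str) -> bool:
    """Check a category or workflow directory name against the pattern."""
    return NAME_PATTERN.fullmatch(name) is not None


for candidate in ("quality-control", "Quality_Control", "read-qc-"):
    print(candidate, is_valid_name(candidate))
```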

Earlier, we executed the workflow on our example dataset. We can use this workflow invocation to generate the tests using Planemo, a software development kit for tools and workflows (Bray et al. 2023).

Hands On: Extract the tests from the example history workflow invocation
  1. Install Planemo (if not already installed) by following the Planemo installation guide.

  2. Get your Galaxy API key from the Galaxy server.

    1. In your browser, open your Galaxy homepage
    2. Log in, or register a new account, if it’s the first time you’re logging in
    3. Go to User -> Preferences in the top menu bar, then click on Manage API key
    4. If there is no current API key available, click on Create a new key to generate it
    5. Copy your API key to somewhere convenient, you will need it throughout this tutorial

  3. Get the workflow invocation.

    • Go to the workflow invocations page
      • Before Galaxy 24.0: Go to User > Workflow Invocations
      • In Galaxy 24.0: Go to Data > Workflow Invocations
      • Above Galaxy 24.1: Go to Workflow Invocation in the activity bar on the left
    • Open the most recent item
    • Find the invocation id:
      • Below 24.0, you can get it here:

        Screenshot of the workflow invocation page in older Galaxy versions. It shows a "View Report" button, two progress bars (4 of 4 steps scheduled, 3 of 3 jobs complete), a "Download BioCompute Object" button, and collapsible "Inputs", "Outputs", and "Steps" sections. The invocation id is displayed at the top right (6e32d21c3708b2b6 in this example).

      • Above Galaxy 24.1 (activity bar), you can find the workflow invocation id from the URL. For example, https://usegalaxy.org/workflows/invocations/be5c48c113145dd5 means that the workflow invocation id is be5c48c113145dd5.

  4. Extract the workflow with test data using:

    planemo workflow_test_init \
       --from_invocation <workflow_invocation_id> \
       --galaxy_url <galaxy_server_url> \
       --galaxy_user_key <your_api_key>
    

This will place in the current working directory:

  • a <workflow_name>.ga with the workflow
  • a <workflow_name>-tests.yml to describe the tests
  • a test-data folder with input file and selected outputs
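
The invocation id passed to --from_invocation can also be extracted programmatically from an invocation URL of the form shown earlier. A minimal sketch, assuming the id is always the last path segment of the URL (the helper name is hypothetical):

```python
from urllib.parse import urlparse


def invocation_id_from_url(url: str) -> str:
    """Return the last path segment of an invocation URL."""
    path = urlparse(url).path.rstrip("/")
    return path.rsplit("/", 1)[-1]


print(invocation_id_from_url(
    "https://usegalaxy.org/workflows/invocations/be5c48c113145dd5"
))
```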
Question

In the <workflow_name>-tests.yml,

  1. How is the input file provided?
  2. How are the workflow outputs checked for validity?
  3. How big are the files in the test-data folder?
  1. The input file is given by:

    job:
      FastQ:
        class: File
        path: test-data/FastQ.fastqsanger.gz
        filetype: fastqsanger.gz
    
  2. The workflow outputs are checked for validity by comparing the generated outputs to the files stored in the test-data folder.

    outputs:
      multiqc_html:
        path: test-data/multiqc_html.html
      falco_html:
        path: test-data/falco_html.html
    
  3. The files are quite large:

    • FastQ.fastqsanger.gz: 132 KB
    • falco_html.html: 169 KB
    • multiqc_html.html: 5 MB

Some of the files in the test-data folder are quite large: the MultiQC HTML report (5 MB) is above the 1 MB limit of IWC. To limit the size of the IWC GitHub repository, we will remove the test files from the test-data folder. To do that, we will:

  1. use the file stored on Zenodo for the input.
  2. edit the test comparisons to use assertions testing the output content, rather than comparing the entire output file with test data.

    A description of the available assertions can be found in the Galaxy XML documentation.

    Do not hesitate to look at different test files in the IWC GitHub for examples.
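
Before deleting anything, it can help to list which files actually exceed the limit. The following sketch assumes that 1 MB means 1,000,000 bytes and only looks at files directly inside the folder; the function name is hypothetical and the demo files are throwaway stand-ins with sizes similar to those listed above.

```python
import os
import tempfile

SIZE_LIMIT = 1_000_000  # 1 MB, taken here as 10**6 bytes (an assumption)


def oversized_files(folder: str, limit: int = SIZE_LIMIT) -> list[str]:
    """Return the names of files directly inside folder larger than limit bytes."""
    return sorted(
        entry.name
        for entry in os.scandir(folder)
        if entry.is_file() and entry.stat().st_size > limit
    )


# Demo with throwaway files mimicking the sizes of the two HTML reports.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "falco_html.html"), "wb") as handle:
        handle.write(b"x" * 169_000)
    with open(os.path.join(tmp, "multiqc_html.html"), "wb") as handle:
        handle.write(b"x" * 5_000_000)
    too_big = oversized_files(tmp)

print(too_big)
```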

Hands On: Edit the test description
  1. Remove the test-data folder
  2. Open the <workflow_name>-tests.yml file

  3. Replace the path of the input with the Zenodo URL:

      https://zenodo.org/record/3977236/files/female_oral2.fastq-4143.gz
    
  4. Ensure all test output names are included.

  5. For Falco step

    1. Open the Falco HTML output on Galaxy
    2. Search for a text specific to this data (e.g. female_oral2_fastq-4143_gz.gz here)
    3. In the <workflow_name>-tests.yml file, replace the path line of the falco_html output with:

      asserts:
         has_text:
           text: "female_oral2_fastq-4143_gz.gz"
      
  6. Do the same for MultiQC
  7. Save the file
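
The edits above can be summarized as a transformation of the test description. The sketch below mirrors the YAML structure as a plain Python dictionary to stay dependency-free. It assumes that remote inputs are declared with a `location` key in place of `path` (an assumption based on how IWC test files are commonly written), and the two helper names are hypothetical. The MultiQC output is left unchanged here because the text to search for depends on your own report.

```python
import json

ZENODO_URL = "https://zenodo.org/record/3977236/files/female_oral2.fastq-4143.gz"

# Test description as extracted by Planemo, written as a dict for illustration.
test_case = {
    "job": {
        "FastQ": {
            "class": "File",
            "path": "test-data/FastQ.fastqsanger.gz",
            "filetype": "fastqsanger.gz",
        }
    },
    "outputs": {
        "multiqc_html": {"path": "test-data/multiqc_html.html"},
        "falco_html": {"path": "test-data/falco_html.html"},
    },
}


def use_remote_input(test: dict, input_name: str, url: str) -> None:
    """Point an input at a URL instead of a local file (assumed 'location' key)."""
    entry = test["job"][input_name]
    entry.pop("path", None)
    entry["location"] = url


def expect_text(test: dict, output_name: str, text: str) -> None:
    """Replace a file comparison with a has_text assertion."""
    test["outputs"][output_name] = {"asserts": {"has_text": {"text": text}}}


use_remote_input(test_case, "FastQ", ZENODO_URL)
expect_text(test_case, "falco_html", "female_oral2_fastq-4143_gz.gz")
print(json.dumps(test_case, indent=2))
```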
Comment

If the workflow uses built-in indexes, we should use the indexes available on CernVM-FS (CVMFS), a distributed filesystem designed for sharing read-only data across the globe. IWC continuous integration uses it for the tests.

Lint the workflow

To make sure the workflow and its tests are syntactically correct, we can now run planemo workflow_lint:

Hands On: Lint the workflow
  1. Run the linter on the extracted workflow using

     planemo workflow_lint <workflow_file.ga>
    

Test the workflow against an instance that has all tools installed

Before submitting the workflow to IWC, it is useful to run the tests against a Galaxy instance using Planemo, so we can easily see what is failing and how the outputs we get differ from our expectations:

Hands On: Run the tests
  1. Run the tests on the extracted workflow using

     planemo test \
        --galaxy_url <galaxy_server_url> \
        --galaxy_user_key <your_api_key> \
        <workflow_file.ga>
    

If the tests are not passing because of an error in the test file, we can modify the test file and use Planemo to check that the test is valid against the same invocation.

planemo workflow_test_on_invocation \
   --galaxy_url <galaxy_server_url> \
   --galaxy_user_key <your_api_key> \
   <workflow-tests.yml> \
   <workflow_invocation_id>

Add required metadata

Once the workflow tests have been created and run, we are almost ready to submit the workflow to IWC. A few last steps remain to add the required metadata.

The first step is to generate a .dockstore.yml file that contains metadata needed for Dockstore.

Hands On: Generate a `.dockstore.yml` file
  1. Run:

    planemo dockstore_init
    
  2. Open the generated .dockstore.yml and verify that author names and details are correct.

We now need

  1. a README.md file that briefly describes the workflow
  2. a CHANGELOG.md file that lists changes, additions, and fixes.

    For that, we need to follow the formatting and principles proposed on keepachangelog.com.

Hands On: Create README and CHANGELOG files
  1. Create a README.md file inside the workflow folder (example README).

  2. Create a CHANGELOG.md file inside the workflow folder given the following template:

    # Changelog
       
    ## [0.1] yyyy-mm-dd
       
    First release.
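
Filling in the template by hand is enough, but the date can also be generated. As a small sketch (the function name is hypothetical), the first entry can be built with the standard library, which formats dates as yyyy-mm-dd:

```python
import datetime


def initial_changelog(version: str, day: datetime.date) -> str:
    """Build a first CHANGELOG.md entry following the template above."""
    return f"# Changelog\n\n## [{version}] {day.isoformat()}\n\nFirst release.\n"


print(initial_changelog("0.1", datetime.date.today()))
```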
    

Finally, there is currently no user interface within Galaxy to define release versions, so we have to manually set a release: "0.1" key value pair in the .ga file.
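
This manual edit can also be scripted, since a .ga file is JSON. The sketch below is one possible approach, not an IWC-provided tool; the function name and the shortened example document are hypothetical. The key is appended at the end of the object, which is equivalent because key order carries no meaning in JSON.

```python
import json


def set_release(workflow_text: str, release: str) -> str:
    """Return the workflow JSON text with its 'release' key set."""
    workflow = json.loads(workflow_text)
    workflow["release"] = release
    return json.dumps(workflow, indent=4)


# Shortened, hypothetical stand-in for the content of a .ga file.
example = """
{
    "format-version": "0.1",
    "license": "GPL-3.0-or-later",
    "name": "NameOfTheWorkflow",
    "steps": {}
}
"""

updated = set_release(example, "0.1")
print(updated)
```

For a real workflow, the text would be read from and written back to the .ga file, and the result should be checked again with planemo workflow_lint.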

Hands On: Edit the release version in the workflow
  1. Edit the workflow.ga file to include the release number

    ],
    "format-version": "0.1",
    "license": "GPL-3.0-or-later",
    "release": "0.1",
    "name": "NameOfTheWorkflow",
    "steps": {
    "0": {
    

Submit the workflow to IWC

Hands On: Submit the workflow to IWC
  1. Commit the changes to the created branch
  2. Push the branch to your fork
  3. Open a Pull Request (PR) on the IWC GitHub repository to submit your workflow.

Once the PR is reviewed and merged, the workflow will be available in Galaxy’s IWC GitHub repository (Tip 3 - “Make source code available in a public code repository.”) and indexed in both WorkflowHub and Dockstore registries (Tip 1 - “Register the workflow.”), making it discoverable and reusable by the community.

Comment

The GTN page “Best practices for workflows in GitHub repositories” provides additional recommendations for structuring and maintaining workflows in GitHub.

Conclusion

By following this tutorial, you have ensured that your Galaxy workflow follows the “Ten quick tips for building FAIR workflows” (de Visser et al. 2023) and adheres to best practices. From creating and refining the workflow to creating test data and submitting it to the IWC, each step contributes to making workflows more reliable, shareable, and reusable.

For further workflow optimization and maintenance, the Galaxy community provides a comprehensive guide on best practices. This guide extends beyond the best practices panel in Galaxy and includes additional recommendations, such as:

  • Adding automated tests to validate workflow functionality.
  • Publishing workflows on platforms like GitHub, GitLab, or other public version-controlled repositories.
  • Registering workflows in well-known registries like WorkflowHub and Dockstore, ensuring wider accessibility and discoverability.

By continuously following these practices, you contribute to a stronger, more open, and collaborative bioinformatics community, where workflows are easily shared, improved, and adapted to new challenges.