Using BioImage.IO models for image analysis in Galaxy

Overview
Creative Commons License: CC-BY
Questions:
  • How can I apply a pre-trained deep learning model to an image?

  • How does the BioImage.IO format integrate with Galaxy?

  • What kind of outputs are generated by the model?

Objectives:
  • Learn how to run a BioImage.IO model using Galaxy

  • Understand how to format image inputs and model axes

  • Interpret and download the model output

Time estimation: 30 minutes
Published: Apr 9, 2025
Last modification: Apr 16, 2025
License: Tutorial Content is licensed under Creative Commons Attribution 4.0 International License. The GTN Framework is licensed under MIT
Revision: 5

Deep learning models are increasingly used in bioimage analysis for tasks such as segmentation, classification, and restoration (e.g., Moen et al. 2019). The BioImage Model Zoo (BioImage.IO; Wei et al. 2021) is a repository of pre-trained AI models that share a common metadata model, which allows their reuse across different tools and platforms.

Each model in BioImage.IO is tailored for a specific biological task — for example, segmenting nuclei, detecting mitochondria, or identifying neuronal structures — and trained on specific imaging modalities such as electron or fluorescence microscopy (e.g., von Chamier et al. 2021, Gómez-de-Mariscal et al. 2021).

This tutorial will guide you through the process of applying one of these BioImage.IO models to an input image using Galaxy (Batut et al. 2018). You will learn how to upload and configure the model, set the correct input parameters, and interpret the output files.

Agenda

In this tutorial, we will cover:

  1. Available BioImage.IO models in Galaxy
  2. Example: Segmentation
    1. Get the data
    2. Run the model on your image
    3. Post-processing of the model output
  3. Conclusion

Available BioImage.IO models in Galaxy

As of version 2.4.1+galaxy3 of the Galaxy tool Process image using a BioImage.IO model, only the PyTorch-based BioImage.IO models listed in the table below are compatible with the tool:

Model name | Task | Imaging modality | Sample / species | Link
🪴 PlatynereisEMnucleiSegmentationBoundaryModel | Nuclei segmentation | Electron microscopy | Platynereis | View model
🪴 PlatynereisEMcellsSegmentationBoundaryModel | Cell segmentation | Electron microscopy | Platynereis | View model
🦠 LiveCellSegmentationBoundaryModel | Live cell segmentation | Phase-contrast microscopy | Various cell types | View model
🔬 HyLFM-Net-stat | Light field reconstruction | Light field and fluorescence light microscopy | Zebrafish | View model
🪴 3DUNetArabidopsisApicalStemCells | Stem cell segmentation | Confocal / light sheet | Arabidopsis root | View model
🧬 CovidIFCellSegmentationBoundaryModel | Cell segmentation | Fluorescence light microscopy | Infected human cells | View model
🧬 NucleiSegmentationBoundaryModel | Nucleus segmentation | Fluorescence light microscopy | Generic / various | View model
🧬 HPANucleusSegmentation | Nucleus segmentation | Immunofluorescence | Human Protein Atlas | View model
🧠 NeuronSegmentationInEM (Membrane prediction) | Neuron segmentation | Electron microscopy | Brain tissue | View model
🧫 HPACellSegmentationModel | Cell segmentation | Immunofluorescence | Human Protein Atlas | View model
🧪 MitochondriaEMSegmentationBoundaryModel | Mitochondria segmentation | Electron microscopy | Human | View model

Example: Segmentation

Here we illustrate the type of information that is both useful for understanding the model’s biological context and necessary for using the Galaxy tool — specifically, the input axes and input size parameters.

As an example, we consider the following model: 🧬 NucleiSegmentationBoundaryModel

This model segments nuclei in fluorescence microscopy images. It predicts boundary maps and foreground probabilities for nucleus segmentation, primarily in images stained with DAPI. The outputs are designed to be post-processed with methods such as Multicut or Watershed to achieve instance-level segmentation (object-based segmentation).

You can find similar details for other models directly on BioImage.IO by viewing each model’s card. Look under the “inputs” section of the model’s resource description file (RDF) to find the required axes and input size values. These parameters are essential for running the model correctly in Galaxy.
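
To make the role of these two parameters more concrete, the following minimal Python sketch reads the axes and shape from an RDF-like structure. The dictionary is a hand-written, hypothetical excerpt: its field values are assumptions chosen for illustration and are not copied from a real model card, so always take the actual values from the RDF of the model you use.

```python
# Hypothetical excerpt of the "inputs" section of a model RDF,
# written as a Python dict. All values are illustrative assumptions.
rdf_excerpt = {
    "inputs": [
        {
            "name": "input0",
            "axes": "bcyx",             # batch, channel, y, x
            "shape": [1, 1, 256, 256],  # one entry per axis
        }
    ]
}


def describe_input(rdf):
    """Return (axes, {axis: size}) for the first input of an RDF-like dict."""
    first_input = rdf["inputs"][0]
    axes = first_input["axes"]
    shape = first_input["shape"]
    if len(axes) != len(shape):
        raise ValueError("axes and shape must have the same length")
    return axes, dict(zip(axes, shape))


axes, sizes = describe_input(rdf_excerpt)
print(f"axes: {axes} ({len(axes)} axes)")
print(f"sizes per axis: {sizes}")
```

In this made-up excerpt the input has four axes, which is the kind of information you need later when choosing the axes option in the Galaxy tool.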

Get the data

Hands On: Data Upload
  1. Create a new history for this tutorial.

  2. Download the following image and import it into your Galaxy history. For the purpose of this tutorial, we will use one image to test only one of the 11 available models:

    If you are importing the image via URL:

    • Copy the link location
    • Click Upload Data at the top of the tool panel

    • Select Paste/Fetch Data
    • Paste the link(s) into the text field

    • Press Start

    • Close the window

    If you are importing the image from the shared data library:

    As an alternative to uploading the data from a URL or your computer, the files may also have been made available from a shared data library:

    1. Go into Libraries (left panel)
    2. Navigate to the correct folder as indicated by your instructor.
      • On most Galaxies, tutorial data will be provided in a folder named GTN - Material -> Topic Name -> Tutorial Name.
    3. Select the desired files
    4. Click on Add to History near the top and select as Datasets from the dropdown menu
    5. In the pop-up window, choose

      • “Select history”: the history you want to import the data to (or create a new one)
    6. Click on Import

  3. Rename the datasets appropriately if needed (e.g. "BioImage.IO model", "Test image")

  4. Confirm the datatypes are correct (pt for the model, tiff or png for the image)

    • Click on the pencil icon for the dataset to edit its attributes
    • In the central panel, click the Datatypes tab at the top
    • Under Assign Datatype, select the datatype from the “New type” dropdown
      • Tip: you can start typing the datatype into the field to filter the dropdown menu
    • Click the Save button

  5. Import the BioImage.IO model from the Galaxy file repository:

    • Click on Upload Data
    • Go to the Choose from repository tab
    • Navigate to: ML models -> bioimaging-models
    • Select the desired model file (for this tutorial, choose nucleisegmentationboundarymodel.pt)
    • Click Import to add it to your history

    If you are importing the model from the shared data library:

    As an alternative to uploading the data from a URL or your computer, the files may also have been made available from a shared data library:

    1. Go into Libraries (left panel)
    2. Navigate to the correct folder as indicated by your instructor.
      • On most Galaxies, tutorial data will be provided in a folder named GTN - Material -> Topic Name -> Tutorial Name.
    3. Select the desired files
    4. Click on Add to History near the top and select as Datasets from the dropdown menu
    5. In the pop-up window, choose

      • “Select history”: the history you want to import the data to (or create a new one)
    6. Click on Import

Run the model on your image

Hands On: Run BioImage.IO model
  1. Process image using a BioImage.IO model ( Galaxy version 2.4.1+galaxy3) with the following parameters:
    • param-file “BioImage.IO model”: nucleisegmentationboundarymodel.pt
    • param-file “Input image”: test_image_nuclei.png
    • param-text “Size of the input image”: 256,256,1,1
    • param-select “Axes of the input image”: Four axes (e.g., bcyx, byxc)
    Comment: Axes and size

    The “Size of the input image” and “Axes of the input image” parameters are crucial: the tool uses them to transform the input image into the format that the BioImage.IO model requires. The correct values are provided in the RDF file that comes with the chosen model on BioImage.IO.
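
To illustrate what such a transformation can look like, here is a minimal NumPy sketch that turns a two-dimensional image into a four-axis tensor. It is not the Galaxy tool's actual implementation; the random array standing in for the image and the variable names are assumptions made for this example.

```python
import numpy as np

# Stand-in for a single-channel 2D image with axes (y, x).
image_yx = np.random.default_rng(0).random((256, 256)).astype(np.float32)

# Add singleton batch and channel axes in front to obtain the axes "bcyx".
tensor_bcyx = image_yx[np.newaxis, np.newaxis, :, :]

print(image_yx.shape)     # (256, 256)
print(tensor_bcyx.shape)  # (1, 1, 256, 256)

# A different axis order such as "byxc" needs the channel axis last.
tensor_byxc = np.moveaxis(tensor_bcyx, 1, -1)
print(tensor_byxc.shape)  # (1, 256, 256, 1)
```

The pixel values are identical in all three arrays; only the number and order of the axes differ, which is why a model fed with the wrong axes either fails or reads the data incorrectly.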

The model will process the input image and generate two outputs:

  • Two predicted images (written in one TIFF file)
  • A predicted tensor matrix (.npy)

Figure 1 below visualizes the two predicted images generated by the 🧬 NucleiSegmentationBoundaryModel. Predicted Image 1 contains the foreground probabilities and Predicted Image 2 contains the boundary map.

Example TIFF output of the nuclei segmentation model.

Figure 1: The output generated by the nuclei segmentation model for the example data. The intensity values in all three images range from 0 (black) to 1 (white), with gray values in between.

Galaxy provides a basic preview using its TIFF visualization tool. However, BioImage.IO models sometimes write several predicted images into a single TIFF file.

To properly explore the results, click on the visualize icon of the output file; this gives you the option to display the dataset using the Avivator tool.

Inspect output with Avivator.

Figure 2: Visualize your TIFF output with Avivator in Galaxy

An alternative is to download the file and open it locally using image analysis tools such as Fiji/ImageJ, napari, or QuPath.
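
If you prefer Python, the downloaded tensor file can also be inspected with NumPy. The sketch below is self-contained, so it first writes a synthetic array to a temporary file; the file name `prediction.npy` and the shape (1, 2, 256, 256) are assumptions for illustration, and the shape of your real output may differ.

```python
import os
import tempfile

import numpy as np

# Synthetic stand-in for the downloaded tensor. The shape is an assumption:
# (batch, two predicted images, y, x).
fake_prediction = np.random.default_rng(0).random((1, 2, 256, 256)).astype(np.float32)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "prediction.npy")  # hypothetical file name
    np.save(path, fake_prediction)

    # With a real download, only this line is needed (with your own path).
    prediction = np.load(path)

print("shape:", prediction.shape)
print("dtype:", prediction.dtype)
print("value range:", float(prediction.min()), "to", float(prediction.max()))
```

Printing the shape first is a quick way to check how many predicted images the tensor holds before plotting or processing it further.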

Question: Check your understanding
  1. Why do the image axes matter when using a model?
  2. What happens if the image size does not match the model input?
  3. What are TIFF and NPY formats?
  4. How can you interpret the output of the model, and what does it tell you about your input image?
  Solution:
  1. Deep learning models are trained on images with a specific shape and axis order; mismatches cause errors or wrong results.
  2. The model will fail to run or produce invalid output.
  3. TIFF (.tif) is a standard format for storing image data, commonly used in microscopy and bioimaging. It can be easily viewed and interpreted visually. NPY (.npy) is a binary format used by NumPy to store arrays. In this case, it contains the raw prediction tensor produced by the model, which can be useful for further analysis or visualization with Python tools.
  4. The model generates a predicted image that highlights or segments specific structures (e.g. nuclei, cells, mitochondria) based on what it learned during training. By comparing the output image to the input, users can see which regions were detected or classified, helping to extract biological meaning from the raw data.

Post-processing of the model output

There are two challenges when it comes to using the model output for subsequent analysis. First, the model produces a single output file with two images (boundary maps and foreground probabilities), so for subsequent analysis we need to extract the corresponding information from that file. Second, neither of the two images produced by the model directly corresponds to segmentation results. Although the foreground image (Predicted Image 1) looks like a binary image with intensity 0 for the image background and intensity 1 for the image foreground, it is not. For example, there are fine contours with intensity values slightly below 1 between closely clustered cell nuclei. Thus, to obtain segmentation results, we first need to threshold the foreground probabilities (values ranging between 0 and 1) to determine the image foreground (as a binary image without any values between 0 and 1).

However, directly thresholding the foreground probabilities loses information for closely clustered cell nuclei, where the crucial information is stored in the boundary map. To cope with that, we will extract both images (the foreground probabilities and the boundary map) from the output file, combine their information into a single image, and then perform thresholding.
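
The effect can be illustrated with a small one-dimensional example in plain Python. The intensity profiles below are made up for this sketch (they are not model output), and the threshold of 0.5 is an arbitrary choice; the helper `count_objects` is likewise hypothetical.

```python
def count_objects(profile, threshold):
    """Count runs of consecutive values above the threshold in a 1D profile."""
    count = 0
    inside = False
    for value in profile:
        above = value > threshold
        if above and not inside:
            count += 1
        inside = above
    return count


# Made-up 1D intensity profiles across two touching nuclei.
foreground = [0.0, 0.9, 1.0, 1.0, 0.95, 1.0, 1.0, 0.9, 0.0]
boundaries = [0.1, 0.6, 0.0, 0.0, 0.80, 0.0, 0.0, 0.6, 0.1]

# Subtracting the boundary values lowers the pixels between the two nuclei.
combined = [f - b for f, b in zip(foreground, boundaries)]

print(count_objects(foreground, 0.5))  # → 1 (the two nuclei are merged)
print(count_objects(combined, 0.5))    # → 2 (the two nuclei are separated)
```

Thresholding the foreground alone yields a single object because the dip between the nuclei (0.95) stays above the threshold, whereas the combined profile drops well below it at the shared boundary.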

Hands On: Extract the segmentation results from the model output
  1. Split image along axes ( Galaxy version 2.2.3+galaxy1) with the following parameters:
    • param-file “Input Image”: the output from “Run BioImage.IO model”
    • param-select “Axis to split along”: Q-axis (other or unknown axis)
    • param-check “Squeeze result images”

    This produces a dataset collection with two items (the two images). Next, we need to extract the first dataset (Predicted Image 1) from this collection.

  2. Extract dataset with the following parameters:
    • param-file “Input List”: the output from the previous step
    • param-select “How should a dataset be selected”: Select by index
    • param-select “Element index”: 0

    This will yield the file 1.tiff in your history (Predicted Image 1).

  3. Extract dataset with the following parameters:
    • param-file “Input List”: the output of Split image along axes (step 1)
    • param-select “How should a dataset be selected”: Select by index
    • param-select “Element index”: 1

    This will yield the file 2.tiff in your history (Predicted Image 2).

    Next, we will combine the information from the two images into a single image.

  4. Process images using arithmetic expressions ( Galaxy version 1.26.4+galaxy2) with the following parameters:
    • param-text “Expression”: foreground - boundaries
    • param-repeat “Input images”:
      • param-file “Image”: 1.tiff
      • param-text “Variable for representation of the image within the expression”: foreground
    • param-repeat “Input images”:
      • param-file “Image”: 2.tiff
      • param-text “Variable for representation of the image within the expression”: boundaries
    Question

    What is the motivation for combining the information from the two images with this arithmetic expression?

    Solution:

    Each pixel of the foreground image (Predicted Image 1) corresponds to the probability of that pixel being part of a foreground object. We have also seen that the boundaries image (Predicted Image 2) uses white (intensity value 1) to encode pixels which likely correspond to boundaries of cell nuclei, and lower intensity values for others. Thus, we can interpret the boundaries image as boundary probabilities, in the sense that each pixel of that image corresponds to the probability of that pixel being part of an object boundary (i.e. the boundary of a nucleus). With the expression foreground - boundaries, we reduce the foreground probability of each pixel by the probability that the pixel lies on an object boundary. Equivalently, up to a constant offset of 1, we add the probability of the pixel being part of the image foreground and the probability of that pixel not being part of an object boundary.

    Thus, in the resulting image, the intensity of each pixel can be interpreted as the probability of that pixel being part of the interior of a foreground object. This better preserves information about the individual cell nuclei in the image, which is especially crucial for closely clustered cell nuclei, as opposed to considering the foreground probabilities alone.

    This interpretation also gives rise to the choice of 0.6 as the threshold value in the next step (see below). Naturally, we would choose 0.5 to determine the image regions for which the probability is higher than 50% that the image pixels correspond to the interior of foreground objects (cell nuclei), but to improve the separation of closely clustered cell nuclei, it is good practice to choose a threshold that is somewhat higher.

  5. Threshold image ( Galaxy version 0.18.1+galaxy3) with the following parameters:
    • param-file “Input image”: the output of the Process images using arithmetic expressions ( Galaxy version 1.26.4+galaxy2) tool
    • param-select “Thresholding method”: Manual
    • param-text “Threshold value”: 0.6
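
For readers who want to reproduce the logic of these steps outside Galaxy, the following NumPy sketch mirrors them on a tiny synthetic array. It is only an approximation under stated assumptions: the stack layout (2, y, x), the synthetic values, and the use of a strict `>` comparison for the threshold are choices made for this example and are not taken from the Galaxy tools.

```python
import numpy as np

# Synthetic stand-in for the model output: two stacked images (2, y, x).
# Index 0 plays the role of the foreground probabilities,
# index 1 the role of the boundary map.
stack = np.zeros((2, 4, 6), dtype=np.float32)
stack[0, 1:3, 1:5] = 1.0  # one foreground blob covering two touching objects
stack[1, 1:3, 3] = 0.9    # boundary column separating the two objects

# Steps 1-3: split the stack along its first axis into two images.
foreground, boundaries = stack[0], stack[1]

# Step 4: the arithmetic expression "foreground - boundaries".
interior = foreground - boundaries

# Step 5: manual threshold at 0.6 (assumed strict), giving a binary mask.
mask = interior > 0.6

print(mask.astype(int))
```

In the printed mask, the column at the boundary is set to 0, so the single foreground blob is split into two separate regions.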

Finally, you could follow the “Hands On: Segment image” from the “Introduction to Image Analysis using Galaxy” tutorial to create a segmentation overlay (e.g., see Figure 3 below) or to perform cell counting.

Segmentation overlay.

Figure 3: Overlay of the original input image and the contours of the segmentation results obtained using the BioImage.IO model and post-processing.
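
To clarify what cell counting means for a binary image, here is a minimal sketch in plain Python that counts connected regions with a flood fill. It does not reproduce the Galaxy tools used in the referenced tutorial; the function `count_components`, the use of 4-connectivity, and the tiny example mask are assumptions for illustration.

```python
from collections import deque


def count_components(mask):
    """Count 4-connected groups of truthy pixels in a 2D list of lists."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    count = 0
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            count += 1  # found an unvisited object; flood-fill it
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                neighbors = ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1))
                for ny, nx in neighbors:
                    in_bounds = 0 <= ny < height and 0 <= nx < width
                    if in_bounds and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count


# Tiny made-up binary mask with two separate objects.
binary_mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
print(count_components(binary_mask))  # → 2
```

Each connected region of foreground pixels is counted once, so a mask in which touching nuclei have been separated gives a more accurate count than one in which they are merged.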

Conclusion

In this tutorial, you learned how to run a BioImage.IO model on a biological image using Galaxy. By uploading a compatible model and image, setting the appropriate size and axes, and running the tool, you obtained both a predicted image and a tensor matrix representing the model output.

This provides a fast, reproducible way to apply deep learning models in the context of bioimage analysis — all within Galaxy.