Galaxy meets Onedata — distributed storage for science

Author: Łukasz Opioła
esg-wp4 esg contributing

Posted on: 17 March 2025

Thanks to the efforts undertaken in the EuroScienceGateway project, Galaxy now offers integration with Onedata. It can be used as a remote source for data import/export (a.k.a. Files Source Plugin) and as a storage location for Galaxy datasets (an Object Store). The integration includes BYOD (Bring Your Own Data) and BYOS (Bring Your Own Storage) templates.

All the good stuff has been integrated into Galaxy version 24.2 and is now live on the EU server: https://usegalaxy.eu.

Is it for me?

You can give it a go even if you have never used Onedata before. Take a look at the tutorial that will take you through creating your first free account in the Onedata sandbox (demo) environment. If you like it and want more, let us know — we’ll see what can be done to organize a fully-fledged Onedata ecosystem for your use cases.

If your organization is already a part of a Onedata ecosystem or you have access to Onedata services for science, like the EGI DataHub, then it’s definitely worth looking into.

What’s the point?

Good question! It’s all explained in this tutorial.

Getting started

There is a range of tutorials concerning Onedata on the Galaxy training portal. We recommend starting with this one, which will give you a basic understanding of the Galaxy & Onedata integration and guide you through the next steps.

The image below shows the Onedata file browser UI and the Galaxy Upload tool, which makes it possible to import data stored in Onedata Spaces.

Onedata & Galaxy interfaces.

A quick overview of Onedata

Onedata (https://onedata.org) is a data management platform that provides easy and unified access to globally distributed storage resources. It’s an open-source project, developed since 2013 by a team at the Cyfronet Computing Center in Krakow (Poland).

Onedata creates a POSIX-compatible, virtual file system layer spanning geographically distributed data providers hosting heterogeneous storage resources. The virtualized data can be accessed using multiple interfaces: Web GUI, FUSE-based POSIX mount, REST API, CDMI API, Python libraries, or S3. Regardless of the interface, users get the same, unified view of all their data.

Onedata virtual FS.
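
To make the POSIX side of this concrete, here is a minimal sketch of what "unified view" can mean in practice: once a space is mounted through the FUSE-based client, ordinary file APIs work on it like on any local directory. The example uses only the Python standard library. The mount point `/mnt/onedata/my-space` and the helper name `summarize_space` are assumptions made for illustration, not names defined by Onedata or Galaxy.

```python
from pathlib import Path


def summarize_space(space_root: Path) -> dict[str, int]:
    """Map each regular file under space_root to its size in bytes.

    Keys are paths relative to space_root. Nothing here is Onedata-specific:
    the function only relies on the directory behaving like a POSIX tree.
    """
    return {
        str(path.relative_to(space_root)): path.stat().st_size
        for path in sorted(space_root.rglob("*"))
        if path.is_file()
    }


if __name__ == "__main__":
    # Hypothetical mount point and space name; adjust to your own setup.
    mount = Path("/mnt/onedata/my-space")
    if mount.is_dir():
        for name, size in summarize_space(mount).items():
            print(f"{size:>12}  {name}")
    else:
        print(f"{mount} is not available; nothing to list")
```

The same files would also be visible through the Web GUI or the other interfaces listed above; the sketch only illustrates the POSIX mount case.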

The data management capabilities of Onedata and its ability to facilitate cross-organizational collaborative data sharing may be a great asset for your data processing pipelines in Galaxy.

Read on in the tutorial!

Funding

These organisations or grants provided funding support for the development of this resource

