# Data source integration

### Overview

Questions
• How can I write a tool that can import data into Galaxy from an external database?

• What are “data sources” and how do they function?

• Is there any ready-to-use example?

Time estimation: 10 minutes

Last modification: May 27, 2021

# Data Source Integration

An important goal of Galaxy is scalability. A major bottleneck in the analysis of big data sets is the time and space it takes to copy them.

Galaxy provides an interface through which it can communicate with other servers and bring data directly into a user's Galaxy environment, without the user having to “download” the data first. In this hands-on, we will use the DoRiNA server (Blin et al. 2014) as the resource. The main point of this short section is: if you have a data source that you think is very important for your research with Galaxy, let us know!
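The round trip behind such a data source can be sketched in a few lines of Python. This is a minimal illustration, not Galaxy's implementation: it assumes the commonly described convention in which Galaxy passes a return address in a `GALAXY_URL` parameter and the external site answers with a `URL` parameter pointing at the result. The host names, the `example_source` tool id, and both function names are hypothetical.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical addresses, used only for illustration.
GALAXY_TOOL_RUNNER = "https://galaxy.example.org/tool_runner"
EXTERNAL_SEARCH = "https://datasource.example.org/search"


def galaxy_to_source(tool_id: str) -> str:
    """Step 1: Galaxy sends the browser to the external site with a return address."""
    return_address = f"{GALAXY_TOOL_RUNNER}?{urlencode({'tool_id': tool_id})}"
    # GALAXY_URL is an assumed parameter name, following the convention described above.
    return f"{EXTERNAL_SEARCH}?{urlencode({'GALAXY_URL': return_address})}"


def source_to_galaxy(incoming_url: str, result_url: str) -> str:
    """Step 2: the external site sends the browser back, naming where the result lives."""
    query = parse_qs(urlparse(incoming_url).query)
    return_address = query["GALAXY_URL"][0]
    separator = "&" if "?" in return_address else "?"
    # URL is likewise an assumed parameter name; Galaxy would fetch it server-side.
    return return_address + separator + urlencode({"URL": result_url})


outbound = galaxy_to_source("example_source")
callback = source_to_galaxy(outbound, "https://datasource.example.org/results/123.bed")
print(outbound)
print(callback)
```

In this sketch only addresses travel through the browser. The data set itself would be fetched by the Galaxy server from the result address, which is why the user never has to download it.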

### Hands on!

1. Create a new history called “doRiNA”
2. Go to Get Data::doRiNA search
3. Choose hg19 from the drop-down list -> Search Database
4. Leave everything as is and choose from the Regulators (set A) drop-down list “hsa-let-7astar-CLASH” -> Search doRiNA
5. Use the “Send to Galaxy” button
6. Notice the new History Item

See the tutorial section of the DoRiNA website for more detailed examples. That was easy! If you want your database of choice to be accessible as easily as this, let us know!

### Key points

• It is possible to couple an external data resource with a Galaxy server

• The external data resource is accessed through its native interface

• Data flows from the external data resource to the Galaxy server without the need of “downloading” the data

Have questions about this tutorial? Check out the FAQ page for the Development in Galaxy topic to see if your question is listed there. If not, please ask your question on the GTN Gitter Channel or the Galaxy Help Forum.

# References

1. Blin, K., C. Dieterich, R. Wurmus, N. Rajewsky, M. Landthaler et al., 2014 DoRiNA 2.0—upgrading the doRiNA database of RNA interactions in post-transcriptional regulation. Nucleic Acids Research 43: D160–D167. 10.1093/nar/gku1180

# Citing this Tutorial

1. Bérénice Batut, Saskia Hiltemann, Gianmauro Cuccuru, Helena Rasche, 2021 Data source integration (Galaxy Training Materials). https://training.galaxyproject.org/archive/2021-08-01/topics/dev/tutorials/data-source-integration/tutorial.html Online; accessed TODAY
2. Batut et al., 2018 Community-Driven Data Analysis Training for Biology. Cell Systems. 10.1016/j.cels.2018.05.012

### BibTeX

@misc{dev-data-source-integration,
author = "Bérénice Batut and Saskia Hiltemann and Gianmauro Cuccuru and Helena Rasche",
title = "Data source integration (Galaxy Training Materials)",
year = "2021",
month = "05",
day = "27",
url = "\url{https://training.galaxyproject.org/archive/2021-08-01/topics/dev/tutorials/data-source-integration/tutorial.html}",
note = "[Online; accessed TODAY]"
}
@article{Batut_2018,
doi = {10.1016/j.cels.2018.05.012},
url = {https://doi.org/10.1016%2Fj.cels.2018.05.012},
year = 2018,
month = {jun},
publisher = {Elsevier {BV}},
volume = {6},
number = {6},
pages = {752--758.e1},
author = {B{\'{e}}r{\'{e}}nice Batut and Saskia Hiltemann and Andrea Bagnacani and Dannon Baker and Vivek Bhardwaj and Clemens Blank and Anthony Bretaudeau and Loraine Brillet-Gu{\'{e}}guen and Martin {\v{C}}ech and John Chilton and Dave Clements and Olivia Doppelt-Azeroual and Anika Erxleben and Mallory Ann Freeberg and Simon Gladman and Youri Hoogstrate and Hans-Rudolf Hotz and Torsten Houwaart and Pratik Jagtap and Delphine Larivi{\`{e}}re and Gildas Le Corguill{\'{e}} and Thomas Manke and Fabien Mareuil and Fidel Ram{\'{\i}}rez and Devon Ryan and Florian Christoph Sigloch and Nicola Soranzo and Joachim Wolff and Pavankumar Videm and Markus Wolfien and Aisanjiang Wubuli and Dilmurat Yusuf and James Taylor and Rolf Backofen and Anton Nekrutenko and Björn Grüning},
title = {Community-Driven Data Analysis Training for Biology},
journal = {Cell Systems}
}