
Integrate and query local datasets and distant RDF data with AskOmics using Semantic Web technologies




Published: Jul 17, 2020
Last Updated: Jul 26, 2021

How to explore data


Studying biological mechanisms requires integrating local and remote data, then querying the combined result.

.image-00[ Local data tables and remote data about genes and proteins are combined in data integration. This produces a graph in which a differential expression points to a gene, which points to a protein. The data is then queried and results are produced.]

What is the Semantic Web?

Semantic Web

A set of W3C recommendations for integrating data and domain knowledge, and for querying and reasoning over them.


RDF: Set of triples

.image-01[ a small graph, subject points to object with an arrow labelled predicate.]

nextprot:P01137 :hasTaxon taxon:9606 .
nextprot:P01137 :hasSequence "MPPSGLRLLL" .

RDF: triples form a labeled directed graph

# Description
nextprot:P01137 rdf:type nextprot:Protein .
taxon:9606 rdf:type nextprot:Organism .
# Data
nextprot:P01137 :hasTaxon taxon:9606 .
nextprot:P01137 :hasSequence "MPPSGLRLLL" .

.image-02[ A graphic with two regions, data description and data. In the data is a circle labelled nextprot:P01137 which points to a sequence via a hasSequence arrow. The nextprot points to a taxon:9606 with a hasTaxon arrow. The taxon points to a nextprot:Organism in the data description region. The nextprot protein points to nextprot:Protein in the data description region via an rdf:type arrow.]
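
To make the "labeled directed graph" idea concrete, here is a minimal Python sketch that indexes the four triples above as edges. It is only an illustration: `build_graph` is a hypothetical helper, the prefixed names are kept as plain strings rather than expanded to full IRIs, and a real application would use an RDF library instead.

```python
from collections import defaultdict

# Triples copied from the slide, kept as plain strings (prefixes are not expanded).
triples = [
    ("nextprot:P01137", "rdf:type", "nextprot:Protein"),
    ("taxon:9606", "rdf:type", "nextprot:Organism"),
    ("nextprot:P01137", ":hasTaxon", "taxon:9606"),
    ("nextprot:P01137", ":hasSequence", '"MPPSGLRLLL"'),
]


def build_graph(triples):
    """Index triples as subject -> list of (predicate, object) labelled edges."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return dict(graph)


graph = build_graph(triples)

# Each triple is one arrow: subject --predicate--> object.
for predicate, obj in graph["nextprot:P01137"]:
    print(f"nextprot:P01137 --{predicate}--> {obj}")
```

Subjects and objects become nodes and each predicate becomes an edge label, which is why the description triples and the data triples end up in the same graph.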


SELECT ?gene
WHERE {
    ?gene rdf:type :Gene .
    ?gene :hasTaxon taxon:9606 .
}
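
The query pattern above selects every `?gene` that is typed `:Gene` and has the taxon `taxon:9606`. A minimal Python sketch of that matching is shown here, assuming a tiny in-memory set of triples. The identifiers `:geneA`, `:geneB` and `taxon:10090` and the helper `select_genes` are hypothetical and do not come from the slides; a SPARQL engine performs this matching far more generally.

```python
# Hypothetical triples for illustration; the gene identifiers are not from the slides.
triples = {
    (":geneA", "rdf:type", ":Gene"),
    (":geneA", ":hasTaxon", "taxon:9606"),
    (":geneB", "rdf:type", ":Gene"),
    (":geneB", ":hasTaxon", "taxon:10090"),
}


def select_genes(triples, taxon):
    """Return the sorted subjects that are typed :Gene and have the given taxon."""
    # First pattern: ?gene rdf:type :Gene
    genes = {s for s, p, o in triples if p == "rdf:type" and o == ":Gene"}
    # Second pattern: ?gene :hasTaxon <taxon>
    in_taxon = {s for s, p, o in triples if p == ":hasTaxon" and o == taxon}
    # A variable shared by both patterns must bind to the same node in each.
    return sorted(genes & in_taxon)


print(select_genes(triples, "taxon:9606"))
```

Both patterns share the variable `?gene`, so only nodes that satisfy the two patterns at once are returned.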

SPARQL: entity matching allows federated queries



.pull-right[ .image-03[ Dataset 1 and 2 are shown as two silos, each with different small graphs. Each has a red node. Those nodes are connected via a dashed line. A picture of a cloud points at the two datasets, and their individual graphs collapsed into one larger graph. A query is sent to this cloud which comes out as a result table.] ]
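
The idea that a shared entity links two datasets can be sketched in Python as follows. This is a simplification: the two datasets are merged in memory, whereas a federated SPARQL query sends parts of the pattern to remote endpoints (the `SERVICE` keyword in SPARQL 1.1). Apart from `nextprot:P01137` and `taxon:9606`, every identifier and predicate here is hypothetical.

```python
# Two hypothetical datasets that share one identifier, nextprot:P01137.
dataset_1 = {
    (":de1", ":concernsGene", ":geneA"),
    (":geneA", ":encodes", "nextprot:P01137"),
}
dataset_2 = {
    ("nextprot:P01137", ":hasTaxon", "taxon:9606"),
}


def objects(triples, subject, predicate):
    """Return every object reachable from subject through predicate."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def taxa_of_gene(triples, gene):
    """Follow gene -> protein -> taxon across whatever triples are available."""
    taxa = set()
    for protein in objects(triples, gene, ":encodes"):
        taxa |= objects(triples, protein, ":hasTaxon")
    return taxa


# Because both datasets use the same identifier, their union is one connected graph.
merged = dataset_1 | dataset_2

print(taxa_of_gene(dataset_1, ":geneA"))  # dataset 1 alone cannot answer
print(taxa_of_gene(merged, ":geneA"))     # the merged graph can
```

The query only succeeds on the merged graph because the same node name appears in both silos, which is the role of the dashed line between the red nodes in the figure.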

What is AskOmics?


Web software for data integration and query using Semantic Web. The main functionalities are:

AskOmics can be used as standalone software or with Galaxy

Data integration with AskOmics

Local data integration

AskOmics generates the graph of data and the abstraction

AskOmics uses the file structure (e.g. header of TSV files) to generate the graph of data description: the abstraction

.image-04[ Two tables are provided, pointing to RDF abstraction with a small graph of DE, Gene, and their attributes. And RDF data which has the same graph as abstraction, but with real identifiers.]

The rest of each file is converted to RDF triples that represent the data.
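
The split between abstraction and data can be sketched in Python as follows. This is not AskOmics code: the function `tsv_to_triples`, the example table, and the choice of `owl:Class` and `rdfs:domain` to describe the header are assumptions made for illustration, and the vocabulary AskOmics actually emits may differ. The sketch also assumes the first column names the entity.

```python
import csv
import io

# Hypothetical TSV content; column names and values are illustrative only.
TSV = "Gene\tchromosome\tstart\ngeneA\tchr1\t100\ngeneB\tchr2\t250\n"


def tsv_to_triples(text):
    """Split a TSV into abstraction triples (header) and data triples (rows)."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    header, records = rows[0], rows[1:]
    entity, attributes = header[0], header[1:]

    # Abstraction: the first column names the entity, the others its attributes.
    abstraction = [(f":{entity}", "rdf:type", "owl:Class")]
    abstraction += [(f":{attr}", "rdfs:domain", f":{entity}") for attr in attributes]

    # Data: one typed node per row, plus one triple per attribute value.
    data = []
    for record in records:
        node = f":{record[0]}"
        data.append((node, "rdf:type", f":{entity}"))
        for attr, value in zip(attributes, record[1:]):
            data.append((node, f":{attr}", f'"{value}"'))
    return abstraction, data


abstraction, data = tsv_to_triples(TSV)
for triple in abstraction + data:
    print(" ".join(triple), ".")
```

The header alone determines the abstraction, so two files with the same columns yield the same description graph regardless of how many rows they contain.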

Distant RDF data integration

pip3 install abstractor
abstractor -s -o nextprot_abstraction.ttl -f turtle

Query multiple data sources with AskOmics

Query composition

.image-06[ A picture of an RDF graph with many nodes. On the right is a query interface of some sort.]

Key Points

Thank you!

This material is the result of a collaborative work. Thanks to the Galaxy Training Network and all the contributors! Galaxy Training Network Tutorial Content is licensed under Creative Commons Attribution 4.0 International License.