Querying web polystores

Yasar Khan; Antoine Zimmermann; Alokkumar Jha; Dietrich Rebholz-Schuhmann; Ratnesh Sahay

doi:10.1109/BigData.2017.8258299

Conference paper. Year: 2018

Yasar Khan

Role: Author

Insight Centre for Data Analytics [Galway]

Antoine Zimmermann

Role: Author
PersonId : 4097
IdHAL : antoine-zimmermann
ORCID : 0000-0003-1502-6986
IdRef : 133375676

Laboratoire Hubert Curien

École des Mines de Saint-Étienne

Alokkumar Jha

Role: Author

Insight Centre for Data Analytics [Galway]

Dietrich Rebholz-Schuhmann

Role: Author

Insight Centre for Data Analytics [Galway]

Ratnesh Sahay

Role: Author

Digital Enterprise Research Institute

Insight Centre for Data Analytics [Galway]

Abstract

The database, semantic web, and linked data communities have proposed solutions that federate queries over multiple data sources sharing a single data model. Data retrieval requirements from broad and varied domains such as healthcare and life sciences (HCLS) are now challenging this convention, primarily because real-life scenarios use several data models (CSV, JSON, RDB, RDF, XML, etc.) simultaneously. It is no longer practical to assume that high-volume data of many kinds (graph, key-value, stream, text, table, tree, etc.), residing in specialised storage engines, will first be converted to a common data model, then stored in a general-purpose storage engine, and finally queried over the Web. With genomics datasets growing from petascale to exascale, it is important to exploit such vast domain resources in their native data models, querying them directly in their specialised storage engines. In this paper, we propose a Web-based query federation mechanism, called PolyWeb, that unifies query answering over multiple native data models (CSV, RDB, and RDF). We demonstrate PolyWeb on a cancer genomics use case, where the description of biological and chemical entities (e.g., genes, diseases, drugs, pathways) often spans multiple data models. To assess the benefits and limitations of evaluating queries over native data models, we compare PolyWeb with a state-of-the-art query federation engine in terms of result completeness, source selection, and overall query execution time.

Keywords

Databases; Query Federation; Query Planning; Cancer Genomics; Semantic Web; Linked Data

Domains

Modeling and simulation

Contributor: Florent Breuil

https://hal-emse.ccsd.cnrs.fr/emse-01879779

Submitted on: Monday, 24 September 2018, 12:02:01

Last modified on: Wednesday, 30 October 2024, 19:42:19

Dates and versions

emse-01879779 , version 1 (24-09-2018)

Identifiers

HAL Id : emse-01879779 , version 1
DOI : 10.1109/BigData.2017.8258299

Cite

Yasar Khan, Antoine Zimmermann, Alokkumar Jha, Dietrich Rebholz-Schuhmann, Ratnesh Sahay. Querying web polystores. 2017 IEEE International Conference on Big Data (Big Data), Dec 2017, Boston, United States. pp.3190-3195, ⟨10.1109/BigData.2017.8258299⟩. ⟨emse-01879779⟩

Collections

UNIV-ST-ETIENNE EMSE IOGS UNIV-RENNES1 CNRS IRISA PARISTECH FAYOL-ENSMSE ISCOD-ENSMSE TDS-MACS UR1-MATH-STIC UR1-UFR-ISTIC UNIV-RENNES UDL ANR UR1-MATH-NUM LABORATOIRE-HUBERT-CURIEN INSTITUT-MINES-TELECOM
