A SPARQL extension for generating RDF from heterogeneous formats

Abstract: RDF aims to be the universal abstract data model for structured data on the Web. While there are efforts to convert data to RDF, the vast majority of data available on the Web does not conform to RDF. Indeed, exposing data in RDF, either natively or through wrappers, can be very costly. Furthermore, in the emerging Web of Things, the resource constraints of devices prevent them from processing RDF graphs. Hence one cannot expect all the data on the Web to be available as RDF anytime soon. Several tools can generate RDF from non-RDF data, and transformation or mapping languages have been designed to offer more flexible solutions (GRDDL, XSPARQL, R2RML, RML, CSVW, etc.). In this paper, we introduce a new language, SPARQL-Generate, that generates RDF from: (i) an RDF Dataset, and (ii) a set of documents in arbitrary formats. As SPARQL-Generate is designed as an extension of SPARQL 1.1, it can provably: (i) be implemented on top of any existing SPARQL engine, and (ii) leverage the SPARQL extension mechanism to deal with an open set of formats. Furthermore, we show evidence that (iii) it can be easily learned by knowledge engineers who know SPARQL 1.1, and (iv) our first naive open source implementation performs better than the reference implementation of RML for big transformations.
Document type:
Conference paper
14th ESWC 2017, May 2017, Portoroz, Slovenia. Springer International Publishing, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Volume 10249, pp. 35-50, 〈10.1007/978-3-319-58068-5_3〉

https://hal-emse.ccsd.cnrs.fr/emse-01504637
Contributor: Florent Breuil
Submitted on: Monday, April 10, 2017 - 14:09:11
Last modified on: Thursday, July 26, 2018 - 01:10:52

Citation

Maxime Lefrançois, Antoine Zimmermann, Mohammad Noorani Bakerally. A SPARQL extension for generating RDF from heterogeneous formats. 14th ESWC 2017, May 2017, Portoroz, Slovenia. Springer International Publishing, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Volume 10249, pp. 35-50, 〈10.1007/978-3-319-58068-5_3〉. 〈emse-01504637〉
