Feeding PIDza to VIVO: data ingest with SPARQL-Generate

Sandra Mierz; Maxime Lefrançois

doi:10.5281/zenodo.5027304

Conference paper. Year: 2021


Sandra Mierz

Role: Author

TIB Leibniz Information Centre For Science and Technology University Library

Maxime Lefrançois

Role: Author
PersonId : 523
IdHAL : mlefranc
ORCID : 0000-0001-9814-8991
IdRef : 181017741

École des Mines de Saint-Étienne

Laboratoire d'Informatique, de Modélisation et d'Optimisation des Systèmes

Institut Henri Fayol

Département Informatique et systèmes intelligents

Abstract

The first hurdle after installing VIVO is to fill it with an initial set of data about an institution, its researchers, and their publications. Done manually, this is a cumbersome and time-consuming process. One way to overcome this is to use open data that carries persistent identifiers (PIDs) such as ROR, ORCID, or DOI. The advantage lies in the reduced processing of input data: because the data does not need to be disambiguated, ingestion is reduced to mapping the data to the VIVO ontology. While several tools exist that can import a single PID-identified object into VIVO, the release of DataCite Commons takes this approach to the next level. DataCite Commons offers an interface to a so-called PID-Graph: a structure of multiple connected data objects, each identified by a PID. It enables queries that exploit the connections between several PIDs, for example querying an organization (identified by a ROR iD), its affiliated persons (identified by their ORCID iDs), and subsequently their publications (identified by DOIs), and thus provides a quick data basis for an empty research information system. In this talk, we will present a microservice that imports data from the DataCite Commons PID-Graph and the ROR API into VIVO (https://github.com/vivo-community/generate2vivo). This microservice is based on lifting rules defined in the SPARQL-Generate RDF transformation language, which we will give an overview of beforehand. SPARQL-Generate is an expressive template-based language for generating RDF streams or text streams from RDF datasets and document streams in arbitrary formats (for more information, see the website https://w3id.org/sparql-generate/).
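To illustrate why PID-identified input reduces ingestion to a mapping step, here is a minimal Python sketch. It is not the generate2vivo implementation, which uses SPARQL-Generate rules. The sample record, its field names, and all identifiers in it are hypothetical, and the two linking predicates come from a placeholder namespace. VIVO itself models affiliation and authorship through intermediate nodes (for example positions and authorships), which this sketch deliberately leaves out.

```python
# Hypothetical, hand-written sample shaped like a PID-graph result:
# organization (ROR iD) -> persons (ORCID iD) -> publications (DOI).
# None of these identifiers are real.
SAMPLE = {
    "id": "https://ror.org/00example0",
    "name": "Example Institute",
    "people": [
        {
            "id": "https://orcid.org/0000-0000-0000-0001",
            "name": "Ada Example",
            "publications": [
                {
                    "id": "https://doi.org/10.1234/example.1",
                    "title": "An example paper",
                },
            ],
        },
    ],
}

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
FOAF = "http://xmlns.com/foaf/0.1/"
BIBO = "http://purl.org/ontology/bibo/"
# Placeholder namespace for the simplified links (an assumption for this sketch).
EX = "http://example.org/hypothetical#"


def iri(value):
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value):
    """Quote a plain string literal, escaping backslashes and double quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def triple(subject, predicate, obj):
    return f"{subject} {predicate} {obj} ."


def map_to_triples(org):
    """Yield N-Triples lines for one organization record.

    Each PID is reused directly as the subject IRI, so no
    disambiguation or entity matching is needed.
    """
    org_iri = iri(org["id"])
    yield triple(org_iri, iri(RDF_TYPE), iri(FOAF + "Organization"))
    yield triple(org_iri, iri(RDFS_LABEL), literal(org["name"]))
    for person in org["people"]:
        person_iri = iri(person["id"])
        yield triple(person_iri, iri(RDF_TYPE), iri(FOAF + "Person"))
        yield triple(person_iri, iri(RDFS_LABEL), literal(person["name"]))
        yield triple(person_iri, iri(EX + "affiliatedWith"), org_iri)
        for pub in person["publications"]:
            pub_iri = iri(pub["id"])
            yield triple(pub_iri, iri(RDF_TYPE), iri(BIBO + "Document"))
            yield triple(pub_iri, iri(RDFS_LABEL), literal(pub["title"]))
            yield triple(person_iri, iri(EX + "authorOf"), pub_iri)


if __name__ == "__main__":
    print("\n".join(map_to_triples(SAMPLE)))
```

Because every subject IRI is the PID itself, loading the same record twice yields identical triples rather than duplicate entities, which is the practical benefit the abstract attributes to PID-based input.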

Keywords

VIVO; SPARQL-Generate; Jena; DataCite; DataCite Commons; PID; ROR; ORCID; DOI

Domains

Modeling and simulation; Computer Science [cs]; Artificial Intelligence [cs.AI]; Programming Languages [cs.PL]; Web

Contributor: Florent Breuil

https://hal-emse.ccsd.cnrs.fr/emse-03270629

Submitted on: Friday, 25 June 2021, 08:51:46

Last modified on: Tuesday, 17 September 2024, 15:46:08

Dates and versions

emse-03270629 , version 1 (25-06-2021)

Identifiers

HAL Id : emse-03270629 , version 1
DOI : 10.5281/zenodo.5027304

Cite

Sandra Mierz, Maxime Lefrançois. Feeding PIDza to VIVO: data ingest with SPARQL-Generate. 12th VIVO Conference (VIVO21), Jun 2021, On Line, France. ⟨10.5281/zenodo.5027304⟩. ⟨emse-03270629⟩


Collections

EMSE PRES_CLERMONT CNRS FAYOL-ENSMSE LIMOS ISCOD-ENSMSE TDS-MACS CLERMONT-AUVERGNE-INP INSTITUT-MINES-TELECOM

