Conference papers

Feeding PIDza to VIVO: data ingest with SPARQL-Generate

Abstract : The first hurdle after installing VIVO is to fill it with an initial set of data about an institution, its researchers and their publications. Done manually, this is a cumbersome and time-consuming process. One way to overcome this is to use open data that carries a persistent identifier (PID) such as a ROR ID, an ORCID iD or a DOI. The advantage lies in the reduced processing of input data: since the data does not need to be disambiguated, the ingestion process reduces to mapping the data to the VIVO ontology. While several tools exist that can import a single PID-identified object into VIVO, the release of DataCite Commons takes this approach to the next level. DataCite Commons offers an interface to a so-called PID Graph: a structure of multiple connected data objects, each identified by a PID. It enables queries that exploit the connections between several PIDs, for example querying an organization (identified by a ROR ID), its affiliated persons (identified by their ORCID iDs) and subsequently their publications (identified by DOIs), thus providing a quick data basis for an empty research information system. In this talk, we will present a microservice that imports data from the DataCite Commons PID Graph and the ROR API into VIVO ( https://github.com/vivo-community/generate2vivo ). This microservice is based on lifting rules defined in the SPARQL-Generate RDF transformation language, which we will give an overview of beforehand. SPARQL-Generate is an expressive template-based language for generating RDF streams or text streams from RDF datasets and document streams in arbitrary formats (for more information, see https://w3id.org/sparql-generate/ ).
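
The organization, persons and publications traversal described in the abstract can be illustrated with a minimal sketch. It is written in plain Python rather than SPARQL-Generate, so it is not the lifting rules used by generate2vivo. The endpoint URL, the GraphQL field names, the sample response and the choice of `foaf:Organization`, `foaf:Person` and `bibo:Document` as target classes are all assumptions made for illustration; relationships between the objects (affiliations, authorships) are deliberately left out.

```python
import json

# Assumption: endpoint of the DataCite GraphQL API (not stated in the abstract).
DATACITE_GRAPHQL_URL = "https://api.datacite.org/graphql"

# Assumption: illustrative query shape; check field names against the live schema.
QUERY = """
query ($ror: ID!) {
  organization(id: $ror) {
    id
    name
    people(first: 10) {
      nodes {
        id
        name
        publications(first: 10) { nodes { id titles { title } } }
      }
    }
  }
}
"""

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
FOAF = "http://xmlns.com/foaf/0.1/"
BIBO = "http://purl.org/ontology/bibo/"


def build_payload(ror_id):
    """Return the JSON body that would be POSTed to the GraphQL endpoint."""
    return {"query": QUERY, "variables": {"ror": ror_id}}


def to_ntriples(org):
    """Map one organization node of the response to N-Triples lines."""
    lines = []

    def emit(subject, predicate, obj):
        lines.append(f"<{subject}> <{predicate}> {obj} .")

    def literal(text):
        # json.dumps yields a double-quoted, escaped string usable as a literal.
        return json.dumps(text, ensure_ascii=False)

    emit(org["id"], RDF_TYPE, f"<{FOAF}Organization>")
    emit(org["id"], RDFS_LABEL, literal(org["name"]))
    for person in org.get("people", {}).get("nodes", []):
        emit(person["id"], RDF_TYPE, f"<{FOAF}Person>")
        emit(person["id"], RDFS_LABEL, literal(person["name"]))
        for pub in person.get("publications", {}).get("nodes", []):
            emit(pub["id"], RDF_TYPE, f"<{BIBO}Document>")
            for title in pub.get("titles", []):
                emit(pub["id"], RDFS_LABEL, literal(title["title"]))
    return lines


# Hypothetical response with placeholder identifiers (not real records).
SAMPLE_RESPONSE = {"data": {"organization": {
    "id": "https://ror.org/00example",
    "name": "Example University",
    "people": {"nodes": [{
        "id": "https://orcid.org/0000-0000-0000-0000",
        "name": "Jane Doe",
        "publications": {"nodes": [{
            "id": "https://doi.org/10.0000/example",
            "titles": [{"title": "An Example Publication"}],
        }]},
    }]},
}}}

if __name__ == "__main__":
    for line in to_ntriples(SAMPLE_RESPONSE["data"]["organization"]):
        print(line)
```

Because every object already carries a PID, the identifiers can serve directly as subject IRIs and no disambiguation step is needed; the sketch only has to walk the nested response and assign types and labels.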
https://hal-emse.ccsd.cnrs.fr/emse-03270629
Contributor : Florent Breuil
Submitted on : Friday, June 25, 2021 - 8:51:46 AM
Last modification on : Tuesday, July 13, 2021 - 3:09:33 AM

Citation

Sandra Mierz, Maxime Lefrançois. Feeding PIDza to VIVO: data ingest with SPARQL-Generate. 12th VIVO Conference (VIVO21), Jun 2021, On Line, France. ⟨10.5281/zenodo.5027304⟩. ⟨emse-03270629⟩