Prov API for recording and managing provenance information

Purpose

  • let communities and EUDAT infrastructure tools record provenance information
  • prototype work to evaluate a neo4j backend for this purpose
  • provide a concrete implementation to evaluate and test different use cases
  • act as an early prototype for specifying and developing a provenance API for future infrastructure projects, which can be discussed in current projects such as ENVRI+ and EUDAT

References

Components

  • provio library to generate a neo4j graph from W3C PROV documents in JSON (or XML)
  • example Python API to generate, update, and modify provenance graphs
  • example provenance sources: log files (plus a wrapper to call the API), example clients
  • (optional: ProvStore, a free service for storing provenance documents, https://provenance.ecs.soton.ac.uk/store/ )
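
To make the first component more concrete, the sketch below shows one way a W3C PROV document in JSON could be mapped to graph nodes and edges. It is not the provio library's code: the function names (`prov_json_to_graph`, `graph_to_cypher`), the edge type names, and the example document are assumptions made for illustration. It uses only the Python standard library and emits Cypher statements as strings instead of talking to a neo4j server.

```python
import json

# Node sections of a PROV-JSON document and the graph label used for each.
NODE_SECTIONS = {"entity": "Entity", "activity": "Activity", "agent": "Agent"}

# Relation sections: (PROV-JSON key, source attribute, target attribute, edge type).
# The edge type names are a hypothetical naming choice, not taken from provio.
RELATION_SECTIONS = [
    ("wasGeneratedBy", "prov:entity", "prov:activity", "WAS_GENERATED_BY"),
    ("used", "prov:activity", "prov:entity", "USED"),
    ("wasAssociatedWith", "prov:activity", "prov:agent", "WAS_ASSOCIATED_WITH"),
    ("wasDerivedFrom", "prov:generatedEntity", "prov:usedEntity", "WAS_DERIVED_FROM"),
]


def prov_json_to_graph(prov_doc):
    """Return (nodes, edges) extracted from a PROV-JSON dictionary."""
    nodes = []
    for section, label in NODE_SECTIONS.items():
        for identifier in sorted(prov_doc.get(section, {})):
            nodes.append((label, identifier))
    edges = []
    for section, src_key, dst_key, edge_type in RELATION_SECTIONS:
        for relation in prov_doc.get(section, {}).values():
            edges.append((relation[src_key], edge_type, relation[dst_key]))
    return nodes, edges


def graph_to_cypher(nodes, edges):
    """Render nodes and edges as Cypher MERGE statements (illustration only).

    Real code should pass identifiers as query parameters instead of
    interpolating them into the statement text.
    """
    statements = []
    for label, identifier in nodes:
        statements.append("MERGE (:%s {id: '%s'})" % (label, identifier))
    for src, edge_type, dst in edges:
        statements.append(
            "MATCH (a {id: '%s'}), (b {id: '%s'}) MERGE (a)-[:%s]->(b)"
            % (src, dst, edge_type)
        )
    return statements


# Hypothetical example document: one evaluation step reads a file and writes a plot.
EXAMPLE_DOC = json.loads("""
{
  "prefix": {"ex": "http://example.org/"},
  "entity": {"ex:input.nc": {}, "ex:plot.png": {}},
  "activity": {"ex:evaluation": {}},
  "used": {"_:u1": {"prov:activity": "ex:evaluation", "prov:entity": "ex:input.nc"}},
  "wasGeneratedBy": {"_:g1": {"prov:entity": "ex:plot.png", "prov:activity": "ex:evaluation"}}
}
""")

if __name__ == "__main__":
    example_nodes, example_edges = prov_json_to_graph(EXAMPLE_DOC)
    for statement in graph_to_cypher(example_nodes, example_edges):
        print(statement)
```

Keeping the PROV-to-graph mapping separate from the database access makes the mapping testable without a running neo4j instance.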

Initial use cases

The initial use cases center on application scenarios from the geosciences, especially the ENES climate community. Generic topics related to geoscience infrastructure will be discussed in the context of the ENVRI+ project. The results will feed into the discussion of future provenance handling in the EUDAT GEF. Concrete use cases:

  • ENES community: log file of climate data evaluation workflow --> neo4j graph
  • ENES community: provenance capture as part of web processing workflow execution
  • EUDAT: provenance capture as part of GEF processing workflow (as no real GEF implementation is available, the ENES OGC WPS implementation of the previous use case acts as a prototype)
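
The first use case (log file to provenance graph) could start from a step like the one sketched below, which turns workflow log lines into a PROV-JSON dictionary. The log format (`<timestamp> <EVENT> <name>` with the events START, END, READ, WRITE), the function name `log_to_prov_json`, and the `ex` namespace are all assumptions for illustration; the actual ENES workflow logs may look quite different.

```python
import json


def log_to_prov_json(lines, prefix="ex"):
    """Convert log lines in a hypothetical format to a PROV-JSON dictionary."""
    doc = {
        "prefix": {prefix: "http://example.org/"},
        "entity": {},
        "activity": {},
        "used": {},
        "wasGeneratedBy": {},
    }
    current = None  # qualified name of the activity currently running
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip lines that do not match the assumed format
        timestamp, event, name = parts
        qname = "%s:%s" % (prefix, name)
        if event == "START":
            current = qname
            doc["activity"][qname] = {"prov:startTime": timestamp}
        elif event == "END" and qname in doc["activity"]:
            doc["activity"][qname]["prov:endTime"] = timestamp
            current = None
        elif event == "READ" and current is not None:
            doc["entity"].setdefault(qname, {})
            relation_id = "_:u%d" % (len(doc["used"]) + 1)
            doc["used"][relation_id] = {
                "prov:activity": current,
                "prov:entity": qname,
            }
        elif event == "WRITE" and current is not None:
            doc["entity"].setdefault(qname, {})
            relation_id = "_:g%d" % (len(doc["wasGeneratedBy"]) + 1)
            doc["wasGeneratedBy"][relation_id] = {
                "prov:entity": qname,
                "prov:activity": current,
            }
    return doc


# Hypothetical log excerpt of a climate data evaluation run.
EXAMPLE_LOG = [
    "2016-03-01T10:00:00 START evaluation",
    "2016-03-01T10:00:05 READ input.nc",
    "2016-03-01T10:02:00 WRITE plot.png",
    "2016-03-01T10:02:01 END evaluation",
]

if __name__ == "__main__":
    print(json.dumps(log_to_prov_json(EXAMPLE_LOG), indent=2, sort_keys=True))
```

The resulting dictionary follows the PROV-JSON layout (`entity`, `activity`, `used`, `wasGeneratedBy`), so it could be passed to whatever component loads PROV documents into the graph backend.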
