taverna-users mailing list archives

From Nikos Minadakis <minad...@ics.forth.gr>
Subject Re: Taverna Player provenance
Date Mon, 01 Dec 2014 15:43:56 GMT
Hello Stian and Everyone,

Sorry for the delay in answering; I only just found out that I had to 
send this email :P

So, my main questions, as you already mentioned, are the following:

1) Can we track the provenance of data by using Taverna?

You already answered this and added that:
- The Portal is not capturing who is doing which interaction, so if the 
run is shared with multiple people that might be something additional 
to add.
- I checked, and the BioVel Portal is not yet using the version of 
Taverna Server that allows capturing/export of Provenance (2.5.4) -
but is planning that upgrade soon.

This is fine by me. Could you be more specific about which versions 
will support this functionality?

2) Can I use my own provenance schema in case I don't want to use 
PROV-O? (CRM digital is an example, and we are using it at FORTH.)

3) To be more specific about my requirements:
I want to use Taverna to implement a complex scientific workflow that 
supports interactions from different actors and institutions.

So let's say that the workflow consists of 3 steps, A-B-C (in real 
life there are many more), and that 2 institutions, In1 and In2, will 
interact.

In1 will start with step A. When this step finishes In2 will be 
informed automatically and will continue with Step B. When step B is 
finished In1 will be automatically informed in order to continue with 
step C and finish the workflow. Now imagine this carried out by 20 
institutions and hundreds of steps.

As a result we should be able to know who did what and when, not only 
for the concrete steps but also for the actions carried out internally 
in each institution (data provenance, runs, etc.). And because these 
actions are so specialized, PROV-O may not be enough for tracking such 
provenance, so another schema may have to be used for it.

Of course it would be great, as a final goal, to be able to go back: 
to start from the final product of the execution and, by following the 
provenance information, take as many steps back as needed and extract 
the previous results.
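
To make the "go back" requirement concrete, here is a minimal sketch in 
Python. It is not Taverna's data model or PROV-O; the StepRecord class, 
its field names, and the "out-A" style identifiers are all hypothetical, 
and it assumes each result records the single result it was derived from.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Optional


@dataclass(frozen=True)
class StepRecord:
    """One executed step. All field names are hypothetical."""
    step: str                           # step label, e.g. "A"
    institution: str                    # who performed it, e.g. "In1"
    finished_at: datetime               # when the step finished
    output: str                         # identifier of the result produced
    derived_from: Optional[str] = None  # output id of the preceding step


def trace_back(records: Dict[str, StepRecord], final_output: str) -> List[StepRecord]:
    """Walk from a final output back to the first step, newest first."""
    chain: List[StepRecord] = []
    seen = set()
    current: Optional[str] = final_output
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle in provenance at {current}")
        seen.add(current)
        record = records[current]       # KeyError if a link is missing
        chain.append(record)
        current = record.derived_from
    return chain


# The A-B-C example: In1 runs A, In2 runs B, In1 runs C.
records = {r.output: r for r in [
    StepRecord("A", "In1", datetime(2014, 12, 1, 9, 0), "out-A"),
    StepRecord("B", "In2", datetime(2014, 12, 1, 10, 0), "out-B", "out-A"),
    StepRecord("C", "In1", datetime(2014, 12, 1, 11, 0), "out-C", "out-B"),
]}

for record in trace_back(records, "out-C"):
    print(record.step, record.institution, record.finished_at.isoformat(), record.output)
```

Each record answers who, what, and when, and the derived_from link is 
what lets the walk reach earlier results. A real workflow with several 
inputs per step would need a list of links instead of one.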

My final question is: Is Taverna capable of fulfilling such a 
requirement? And if not, to what degree does it support it, and how 
much effort would it take to extend it?

If I could have an answer by the end of this week it would be great, 
since I will report it at a new project's meeting next week.

Thank you,

Nikos

On 2014-11-03 18:22, Stian Soiland-Reyes wrote:
> Nikos Minadakis and myself had a chat about provenance requirements
> for when running workflows through the Taverna Player - in particular
> at the http://portal.biovel.eu/
>
>
> Nikos would want to get this kind of provenance out of a workflow run
> in the Player/portal:
>
> * modification events
> * who did these modifications
> * when
> * inputs and outputs
>
> (and presumably which workflow :) )
>
>
> An example is the Data Refinement Workflow
> https://portal.biovel.eu/workflows/641 which has several user
> interactions that should be tracked.
>
> Nikos would like to have access to the provenance primarily as
> machine-accessible, but preferably also in a human-readable kind of
> report. In particular he would like to mix-in his own
> provenance-specific schema (crm digital ?).
>
>
> I described how the Taverna Server can capture provenance of the
> details of a workflow run and expose that as a Data Bundle -
>
> https://github.com/taverna/taverna-prov#structure-of-exported-provenance
>
> This is however basically a trace of every step of the workflow - and
> would include the user modification as a series of events, like:
>
> 1) at 15:42:00 the workflow 1298319283 was started as run 2781721
> 2)  at 15:42:12 in run 2781721, the Interaction service named
> "Ask_user_5" in workflow 1298319283 responded an output value 51231.
> The value contains "GBIF".
> 3) at 15:42:53 in run 2781721, the Interaction service named
> "run_analysis" in workflow 1298319283 used an input value 51231. The
> value contains "GBIF".
>
>
> Nikos would like to connect these interactions with who was using the
> Portal. I checked, and the BioVel Portal is not yet using the version
> of Taverna Server that allows capturing/export of Provenance (2.5.4) -
> but is planning that upgrade soon.
>
> The Taverna Server does not know who started the run from the
> Player/Portal - so the Player would need to inject that additional
> provenance afterwards.  (This should be doable within the same bundle
> as it is a ZIP file where you can just add files).
>
> I don't believe the Portal is capturing who is doing which interaction
> - so if the run is shared with multiple people that might be something
> additional to add.
>
> It might be needed to mark up or understand the workflow so that the
> resulting provenance focuses only on the steps that are 'important'
> scientifically.
>
>
>
> ACTION Nikos: To respond to this email with a deeper list of
> requirements / queries that the provenance should be able to
> capture/expose.
>
> ACTION Nikos: Ping back to this thread in a week's time so we won't
> forget :)
>
> ACTION Stian/Rob: Ask Rob if it is possible to turn on provenance for
> workflow runs in the development instance of the portal
>
> ACTION Rob/Alan: Is it possible for the player/portal to know who does
> which interaction? Can they tell the server or should it be injected
> after the fact?
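
The quoted message notes that the bundle is a ZIP file to which files 
can simply be added. A minimal sketch of that idea in Python follows, 
using only the standard zipfile module. The function name, the 
"player/started-by.json" path, and the JSON keys are hypothetical and 
are not part of any Taverna format; whether a real Data Bundle also 
needs its manifest updated is not covered here.

```python
import json
import os
import tempfile
import zipfile


def add_user_annotation(bundle_path, run_id, user,
                        arcname="player/started-by.json"):
    """Append a small JSON file naming the user to an existing ZIP file.

    The archive path and JSON keys are hypothetical placeholders.
    """
    payload = json.dumps({"run": run_id, "startedBy": user}, indent=2)
    # Mode "a" appends to the archive without rewriting existing entries.
    with zipfile.ZipFile(bundle_path, mode="a") as bundle:
        if arcname in bundle.namelist():
            raise FileExistsError(arcname)
        bundle.writestr(arcname, payload)
    return arcname


# Demonstration on a stand-in ZIP file, not a real exported bundle.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "run.bundle.zip")
    with zipfile.ZipFile(path, "w") as z:
        z.writestr("existing.txt", "placeholder for exported provenance\n")
    add_user_annotation(path, "2781721", "some-portal-user")
    with zipfile.ZipFile(path) as z:
        print(z.namelist())
```

Appending keeps the server's exported trace untouched, so the Player's 
extra statements stay clearly separate from what the server recorded.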

