taverna-dev mailing list archives

From Nadeesh Dilanga <nadeesh...@gmail.com>
Subject Re: GSoC 2016 Docker support for Taverna
Date Wed, 23 Mar 2016 08:54:38 GMT
Hi Stian et al,
I have drafted my proposal [1] and would appreciate everyone's feedback on
it. Please let me know if it does not align with your original expectations
for this project, or whether it needs any scope-level changes.

Apart from that, on TAVERNA-900, can you please clarify the following:

"Create a Docker tool for executing Taverna activities (TAVERNA-879) - *this
allows any Taverna steps to be used by other CWL engines*"

[1] -
https://docs.google.com/document/d/1DKYuzr2hA5brQ2xBz_AVQgMWXB5qm6rftbWoGbnbXrg/edit?usp=sharing

On Tue, Mar 22, 2016 at 1:42 PM, Nadeesh Dilanga <nadeesh092@gmail.com>
wrote:

> Hi,
> Thank you very much for the quick response. I will go through these a bit
> more and get back when I hit any roadblocks.
>
> On Mon, Mar 21, 2016 at 10:15 PM, Stian Soiland-Reyes <stain@apache.org>
> wrote:
>
>> On 21 March 2016 at 00:51, Nadeesh Dilanga <nadeesh092@gmail.com> wrote:
>>
>> > First of all, I apologize for the delayed response. I wanted to give
>> > myself a bit more time to understand what Taverna is and what exactly
>> > the expected outcome of the project is (the tutorials, related slide
>> > decks and YouTube videos were very helpful), because this will be my
>> > one and only GSoC proposal and I want it to be perfect!
>>
>> Thanks!  You don't have to do it perfect - just great! :-))
>>
>> > 1. Taverna is a BPMN-like (but more extensive and more widely scoped in
>> > features) workflow engine which has several ways of creating workflows
>> > and different interfaces for accessing them.
>>
>> While I guess we don't like to be compared with BPMN, I think you are
>> correct. :)
>>
>>
>> > 2. When creating workflows, one major extension point to cater for
>> > custom use cases is to plug in or create your own services/service
>> > types, which is a great model IMHO. This project is in fact to write an
>> > adapter (an activity plugin, which I believe is the executor of a
>> > service invocation) for when someone needs to run something on Docker
>> > at some phase of their workflow.
>>
>> Correct - thus one could have a workflow with multiple tools from
>> different docker images.
>>
>>
>> > If #2 is correct, can you please give me an example of a use case which
>> > led to this project idea, because I feel I may be missing something here.
>> > IMHO, even for Docker it will eventually be a service invocation from
>> > the workflow's point of view, and what Taverna needs is some activity
>> > plugins that are aware of the particular transport protocols.
>>
>> We already have the Tool activity which allows you to run command line
>> tools - however such workflows are hard to share as anyone receiving
>> it may not have that tool installed, or in the same version/location.
>>
>> While approaches like https://www.debian.org/devel/debian-med/ and
>> BioLinux have helped towards "How to get it installed" - it then moves
>> the requirement to a particular operating system, which in a way is
>> worse.
>>
>> Docker solves the "How to consistently install this tool" problem -
>> and even works (almost) seamlessly from OS X and Windows. It adds nice
>> reproducibility aspects as you can mark the exact snapshot version of
>> the docker image you have used.
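
As a rough illustration of the "exact snapshot version" point above: Docker lets an image be referenced either by a mutable tag or by an immutable content digest (`name@sha256:...`). A minimal Python sketch could check which form a workflow step uses. The helper name `is_pinned` is hypothetical and not part of Taverna, and the digest in the example is a made-up placeholder.

```python
import re

# A digest reference ends in "@sha256:" followed by 64 lowercase hex characters.
_DIGEST_SUFFIX = re.compile(r"@sha256:[0-9a-f]{64}$")


def is_pinned(image_ref: str) -> bool:
    """Return True if image_ref names an exact image by digest.

    Hypothetical helper for illustration only: it checks the syntax of the
    reference and does not contact any registry.
    """
    return _DIGEST_SUFFIX.search(image_ref) is not None


if __name__ == "__main__":
    # Placeholder digest (64 zeros), not the digest of any real image.
    fake_digest = "sha256:" + "0" * 64
    print(is_pinned("dockerbiotools/hmmer:latest"))          # tag only
    print(is_pinned("dockerbiotools/hmmer@" + fake_digest))  # digest form
```

A workflow that records the digest form can be re-run later against the same image contents, whereas a tag such as `latest` may point to a different image over time.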
>>
>>
>> There are now also initiatives such as http://bioboxes.org/ (and to a
>> certain degree http://bio.tools/ ) which describe bioinformatics tools
>> as Docker images - thus these can in theory be used directly from
>> Taverna.
>>
>>
>> Perhaps part of the project would be to define a use case so we find
>> some actual command lines we want to run in a Taverna workflow - e.g.
>> to run HMMER for sequence alignment using
>> https://hub.docker.com/r/dockerbiotools/hmmer/ using sequences fetched
>> from an EBI web service?  I am not sure how much of the bioinformatics
>> side you would like to get into! :)
>>
>>
>>
>> > (Example: an HTTP service hosted in Docker needs an HTTP activity
>> > plugin; a message broker service hosted in Docker needs an AMQP- or
>> > MQTT-like activity plugin.)
>>
>> Yes, but I don't think we want to run many of those kind of services
>> from Taverna, I was thinking more of running just command line tools
>> that happen to be packaged as Docker images.
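
To make "running just command line tools that happen to be packaged as Docker images" a bit more concrete, here is a minimal Python sketch of how such a step could be turned into a `docker run` invocation. The function name, the `/data` mount point and the input file names are assumptions for illustration. It also assumes the image does not define an entrypoint that changes how the arguments are interpreted. This is not how the existing Taverna Tool activity is implemented.

```python
import shlex
from typing import List, Sequence


def docker_run_argv(
    image: str,
    tool_argv: Sequence[str],
    host_dir: str,
    container_dir: str = "/data",
) -> List[str]:
    """Build an argument list that runs tool_argv inside the given image.

    Hypothetical helper: host_dir is bind-mounted at container_dir and used
    as the working directory, so relative file names in tool_argv resolve
    to files the workflow has staged on the host.
    """
    return [
        "docker", "run",
        "--rm",                               # remove the container afterwards
        "-v", f"{host_dir}:{container_dir}",  # share input/output files
        "-w", container_dir,                  # run the tool in the shared dir
        image,
        *tool_argv,
    ]


if __name__ == "__main__":
    # File names are placeholders for inputs staged by an earlier step.
    argv = docker_run_argv(
        "dockerbiotools/hmmer",
        ["hmmscan", "profiles.hmm", "query.fasta"],
        host_dir="/tmp/job1",
    )
    print(shlex.join(argv))
```

Building a list rather than a shell string means the result could be handed to `subprocess.run(argv, check=True)` without any shell quoting issues.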
>>
>> > 3. Or is the case to invoke some composite applications that are
>> > deployed/installed in Docker, regardless of what the protocols are?
>>
>> No, this would get a bit more complex, so I would stay away from that
>> for the GSOC project - although of course the potential is very
>> interesting motivation as well.
>>
>> I think this is what I described in
>> https://issues.apache.org/jira/browse/TAVERNA-941
>>
>>
>> > If #3 is correct, what we run in the Docker container can be another
>> > Taverna workflow. If that is the case, your idea of "Save workflow as
>> > Docker image" will be a superb addition!
>>
>> Yes! It should then be possible! But.. why? :)  Run with older Taverna
>> version?
>>
>> One interesting thing, if there's also "Save workflow as Docker image"
>> and such an image is then added to a workflow as a Docker image, would
>> be to "unwrap" it and show the inner workflow in Taverna.
>>
>> With Docker there's a big danger of going down the "It's turtles all
>> the way down" recursion - hence I tried to scope the GSOC ideas to be
>> more concrete about running command line tools.
>>
>>
>> > So with this, I would like to understand what the Taverna community
>> > expects from "Invoking Docker from Taverna" in this GSoC project, so
>> > that I can be more specific in my project proposal and make it the best
>> > project for Taverna this summer.
>> >
>> >
>> >
>> > On Fri, Mar 18, 2016 at 7:18 AM, Stian Soiland-Reyes <stain@apache.org>
>> > wrote:
>> >
>> >> On 17 March 2016 at 15:22, alaninmcr <alaninmcr@googlemail.com> wrote:
>> >> >> I found Docker to be an excellent solution for scaling and easy
>> >> >> deployment, and obviously a hot topic these days in enterprises
>> >> >> that want to implement a microservices-based
>> >> >> architecture/deployment for low-footprint servers/services.
>> >> >>
>> >> >> I presume the idea behind Docker support for Taverna is NOT from a
>> >> >> microservice standpoint, but more from a packaging and deployment
>> >> >> perspective. Please correct me if I am wrong.
>> >>
>> >> No, you are right in that our current Docker ideas would not be about
>> >> creating Taverna (or Taverna workflow) as a micro-service, but to use
>> >> Docker for execution.
>> >>
>> >> A similar aspect could be to use Docker to start up a set of
>> >> microservices accompanying the Workflow, and then access them from
>> >> Taverna workflow using the existing WSDL and REST activities.
>> >> This is something that I am interested in within the
>> >> http://bioexcel.eu/ project - but is a bit more architecturally
>> >> challenging as it would mean things like dynamic port bindings in the
>> >> workflow configuration. It
>> >>
>> >> I've tracked this as https://issues.apache.org/jira/browse/TAVERNA-941
>> >> but IMHO it would be a too big task for a GSOC project.
>> >>
>> >>
>> >> > There are two separate issues:
>> >> >
>> >> > https://issues.apache.org/jira/browse/TAVERNA-901 is to allow Taverna
>> >> > workflows to include steps that are tools inside Docker containers.
>> >> > That would be deployment of an existing Docker container.
>> >> >
>> >> > https://issues.apache.org/jira/browse/TAVERNA-879 is to create Docker
>> >> > containers for Taverna workflows. That is packaging and (because the
>> >> > containers will be part of a CWL workflow) deployment.
>> >>
>> >> Nadeesh, I've added your interest to
>> >>
>> >> https://cwiki.apache.org/confluence/display/TAVERNADEV/2016-03+GSOC+2016
>> >>
>> >> but if you are more interested in packaging for Docker, then perhaps
>> >> we could look at the existing Docker wrapping of Taverna Server
>> >>
>> >> https://hub.docker.com/r/taverna/taverna-server/
>> >> https://github.com/taverna-extras/taverna-server-docker
>> >>
>> >> and consider doing something similar for our command line tools
>> >> "executeworkflow" and "tavlang".
>> >>
>> >> That shouldn't take you too long - so you may want to prototype one of
>> >> TAVERNA-901 and TAVERNA-879 as well.
>> >>
>> >>
>> >> I know Dmitry used wsdl-generic as a command line tool as in
>> >> http://inb.bsc.es/documents/galaxygears/ which could also be
>> >> interesting as a Docker container (e.g. for running WSDL services
>> >> within a CWL workflow), but I am not sure where the source code for
>> >> that is (is that outside Apache, Dmitry?)
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> >> If that is the case, can you please clarify what the current
>> >> >> packaging and deployment model is?
>> >>
>> >>
>> >> For Taverna 2.5 we used install4j via Maven to package into an
>> >> installer:
>> >>
>> >> https://github.com/apache/incubator-taverna-commandline/blob/old/taverna-commandline-product-core-20141228/pom.xml#L1712
>> >>
>> >> That's what made the installers we have at
>> >> https://taverna.incubator.apache.org/download/command-line-tool/
>> >>
>> >> One packaging task we could consider for Taverna 3.0 is to update
>> >>
>> >> https://github.com/apache/incubator-taverna-commandline/tree/master/taverna-commandline-product
>> >> to use install4j or similar to generate such installers also for
>> >> Taverna 3, which has a slightly different folder structure.
>> >>
>> >> As an open source project we have 5 licenses for install4j, but we
>> >> have not asked the author yet if this is still valid under Apache.
>> >> Now that we release under the Apache license instead of LGPL, we would
>> >> ironically be allowed to bundle the binary Oracle JRE rather than
>> >> having to use the open source OpenJDK builds.
>> >>
>> >> But I'm afraid such a task would not involve Docker - as I think most
>> >> users of Taverna Command line would not have Docker (or even the right
>> >> Java version) installed.
>> >>
>> >>
>> >>
>> >> > There is no current mechanism for packaging up something to run a
>> >> > specific Taverna workflow. You can run workflows from the command
>> >> > line tool or on a Taverna Server.
>> >>
>> >> Making a recipe for generating Docker images for running a particular
>> >> Taverna Workflow could be interesting. We could then have "Save
>> >> workflow as Docker image" built into Taverna!
>> >>
>> >> If you are thinking about such an idea, feel free to suggest it as a
>> >> new Jira task!
>> >>
>> >>
>> >>
>> >> Overall - you don't have to pick exactly our ideas - you can be
>> >> inspired by them and will have to write your own proposal about what
>> >> work you propose to do (which should be reasonably scoped and
>> >> scheduled) and say how Apache Taverna would benefit.
>> >>
>> >> Looking forward to hearing more about your ideas!
>> >>
>> >> --
>> >> Stian Soiland-Reyes
>> >> Apache Taverna (incubating), Apache Commons RDF (incubating)
>> >> http://orcid.org/0000-0001-9842-9718
>> >>
>>
>>
>>
>> --
>> Stian Soiland-Reyes
>> Apache Taverna (incubating), Apache Commons RDF (incubating)
>> http://orcid.org/0000-0001-9842-9718
>>
>
>
