taverna-dev mailing list archives

From "Amila Karunathilaka (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TAVERNA-901) Run Docker from Taverna
Date Wed, 09 Mar 2016 06:25:40 GMT

    [ https://issues.apache.org/jira/browse/TAVERNA-901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15186606#comment-15186606

Amila Karunathilaka commented on TAVERNA-901:

Hi Stian,
I'm Amila Karunathilaka, a 3rd-year undergraduate student in Computer Science & Engineering
at the University of Moratuwa.
I'm interested in this project idea. I have knowledge of and experience with Java, Docker
and Kubernetes.
I followed your references.
I would like to know more about this project. Please give me some tips for getting started.

Thank you.

> Run Docker from Taverna
> -----------------------
>                 Key: TAVERNA-901
>                 URL: https://issues.apache.org/jira/browse/TAVERNA-901
>             Project: Apache Taverna
>          Issue Type: Story
>          Components: Taverna Common Activities
>            Reporter: Stian Soiland-Reyes
>              Labels: docker, gsoc2016, tool, unix, workflow
> h2. GSOC: Add Docker support to Taverna
> The proposed GSOC project is to add support for invoking Docker containers within Taverna
by adding a Docker Activity plugin.
> Tasks include:
> * Propose JSON model for describing a {{docker run}} command
> * (Optional) Validate Docker activity config, e.g. can the docker image be pulled?
> * Investigate: New Docker activity, or modify existing External Tool activity?
> * Make/modify a Taverna Activity plugin for executing Docker (may or may not be based
on the External Tool activity)
> * (Optional) Capture docker metadata and add to workflow run provenance (e.g. which docker
image ID was pulled)
> * (Optional) Add Bioboxes support
> * (Optional) Integrate with CWL support (TAVERNA-900)
> Other Taverna/Docker-related tasks can of course also be proposed by the students.
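As an illustration of the first task above, here is a minimal Java sketch of how a JSON model for a {{docker run}} command might look once parsed, and how it could be turned into an argument list. The class name {{DockerRunSketch}}, the {{Config}} fields (image, args, env, volumes) and the example values are all hypothetical; they are not part of any Taverna API or of the issue itself. Only the {{docker run}} options {{--rm}}, {{-e}} and {{-v}} are real Docker CLI flags.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch only: a hypothetical in-memory form of a Docker activity config. */
class DockerRunSketch {

    /** Field names are assumptions; a real JSON model may differ. */
    static final class Config {
        final String image;                 // e.g. "debian:stable"
        final List<String> args;            // command and arguments run inside the container
        final Map<String, String> env;      // environment variables, name -> value
        final Map<String, String> volumes;  // host path -> container path

        Config(String image, List<String> args,
               Map<String, String> env, Map<String, String> volumes) {
            this.image = image;
            this.args = args;
            this.env = env;
            this.volumes = volumes;
        }
    }

    /** Builds the argument list for a "docker run" invocation. */
    static List<String> toCommand(Config c) {
        List<String> cmd = new ArrayList<>();
        cmd.add("docker");
        cmd.add("run");
        cmd.add("--rm"); // remove the container once it exits
        for (Map.Entry<String, String> e : c.env.entrySet()) {
            cmd.add("-e");
            cmd.add(e.getKey() + "=" + e.getValue());
        }
        for (Map.Entry<String, String> v : c.volumes.entrySet()) {
            cmd.add("-v");
            cmd.add(v.getKey() + ":" + v.getValue());
        }
        cmd.add(c.image);
        cmd.addAll(c.args);
        return cmd;
    }

    public static void main(String[] argv) {
        Map<String, String> env = new LinkedHashMap<>();
        env.put("LANG", "C");
        Map<String, String> volumes = new LinkedHashMap<>();
        volumes.put("/tmp/in", "/data");
        Config c = new Config("debian:stable", Arrays.asList("ls", "/data"), env, volumes);
        // Only prints the command line; it does not start Docker.
        System.out.println(String.join(" ", toCommand(c)));
    }
}
```

The resulting list could be handed to {{java.lang.ProcessBuilder}} to actually launch the container; whether that belongs in a new activity or in the External Tool activity is one of the open questions listed above.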
> h2. Docker
> [Docker|https://www.docker.com/] is a Linux container virtualization platform. A Linux
_container_ is a special kernel feature which, similar to _chroot jails_, behaves as a separate
machine but, unlike Virtual Machines, does not have the overhead of virtualization of hardware.

> Docker is popular in the _devops_ movement as it provides an easy way to install dependencies
for software development and deployment, e.g. to run servers for MySQL, Apache Solr or Node.js.
> In brief a _Docker Image_ contains a virtual Linux file system (e.g. a miniature Debian
installation). A _Docker Container_ is a particular execution of a Docker Image, which typically
runs a single process as installed within the container, and may have network ports exposed
to the world, or have parts of the host computer's file system mounted within the inner container.
> One great advantage of Docker is that it simplifies tool *installation*, as each Docker
image is a _self-contained Linux distribution_ which doesn't have to be compatible with the
host computer (beyond the kernel).
> For Windows and OS X users Docker automatically manages a virtual machine running the
Linux containers, but Docker containers can also be deployed in the cloud or on a local cluster,
e.g. using _Docker Machine_.
> Docker images can be created from a {{Dockerfile}}, which basically lists the commands
to run to prepare the image. Docker images can be chained together using _base images_ - for
instance to build on an image with MySQL, the Dockerfile says {{FROM mysql}}.
> Thus Docker is also an important tool for *reproducibility*, as these images can be automatically
kept up to date and are distributed through the [Docker hub|https://hub.docker.com/]. In bioinformatics,
this has led to [Bioboxes|http://bioboxes.org/], a standard for creating interchangeable bioinformatics
software containers.
> h2. Taverna
> [Apache Taverna|http://taverna.incubator.apache.org/] (incubating) is a Java-based workflow
system with a graphical design interface. Taverna workflows can combine many different service
types, including REST and WSDL services, command line tools, scripts (e.g. BeanShell, R) and
custom plugins (e.g. BioMart).
> Taverna workflows can be executed on the desktop, on the command line, or on a Taverna
server installation, which can be controlled from a web portal, a mobile app, or integrated
into third-party applications.
> Taverna is used in a [wide range of sciences|http://taverna.incubator.apache.org/introduction/taverna-in-use/]
for data analysis and processing, including bioinformatics, cheminformatics, biodiversity
and musicology. Workflow engine features include provenance tracking, implicit parallelism/iterations,
retry/failover and looping. 
> Taverna workflows are commonly shared on [myExperiment|http://www.myexperiment.org],
and can either be created graphically in the [Taverna workbench|http://taverna.incubator.apache.org/download/workbench/],
programmatically using the [Taverna Language API|http://taverna.incubator.apache.org/download/language/]
or by generating workflow definitions in the [SCUFL2|http://taverna.incubator.apache.org/documentation/scufl2/]
> h2. Community engagement
> Interested GSOC students are requested to engage early with the [dev@taverna|http://taverna.incubator.apache.org/community/lists#devtaverna]
mailing list to describe their ideas for approaching this project, to clarify the tasks and
for any questions and issues.
> As a first step, the prospective applicant should leave a comment on this Jira issue
to indicate their interest, and the GSOC mentors would be happy to assist on any questions.

> As the project starts we are expecting the student to become part of the dev@taverna
community to regularly discuss their progress. 
> h2. Mentors
> An important part of GSOC is the personal mentoring from existing members of the open
source community. Our job is not just to teach you how to successfully get through the GSOC
programme, but also to motivate you and make sure you progress. We will show you how to contribute
to open source, debug, improve, document, test and release your code as part of Apache Taverna.

> The GSOC mentors for Apache Taverna have experience from guiding multiple earlier GSOC
students and local students, and can be contacted privately for day-to-day interaction and
> Mentors for this GSOC project:
> * Stian Soiland-Reyes

This message was sent by Atlassian JIRA
