taverna-dev mailing list archives

From Stian Soiland-Reyes <st...@apache.org>
Subject Re: Finalize Docker Invoke JSON format
Date Wed, 08 Jun 2016 16:41:49 GMT
Perhaps we need a configuration/preference per Workbench or per workflow
for where to run the Docker images. It's going to differ per user, e.g.
if you use Docker on a Mac then it will be some local IP address for the
virtual machine, but then it's probably going to be the same for each
workflow, or for each step in the same workflow.

So do you think we need it to be part of the Activity configuration?
Perhaps unnecessary flexibility..



I don't think we need to aim for a Docker Machine-type or Docker
Swarm-type scenario, although that would be cool to keep in mind to
enable cloud-based workflows:

https://docs.docker.com/machine/
https://docs.docker.com/swarm/overview/


If we do it as a configuration, then I guess we should use
org.apache.taverna.configuration Configuration Manager

https://taverna.incubator.apache.org/javadoc/taverna-osgi/org/apache/taverna/configuration/Configurable.html

see for instance
https://github.com/apache/incubator-taverna-engine/blob/master/taverna-database-configuration-impl/src/main/java/org/apache/taverna/configuration/database/impl/DatabaseConfigurationImpl.java


then users could modify the equivalent of

conf/docker.conf

to specify those defaults.
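
As a rough sketch of how such defaults might be resolved (Python purely
for illustration; the key names, the port and the example IP address are
hypothetical, and this is not the Taverna Configuration Manager API):
built-in defaults are overridden by the user's conf file, which in turn
could be overridden per Activity if we decide that flexibility is needed.

```python
# Hypothetical built-in defaults; none of these key names come from Taverna.
DEFAULTS = {
    "docker.remote.host": "localhost",
    "docker.remote.port": "2375",
}


def parse_conf(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # ignore lines without a key=value pair
        settings[key.strip()] = value.strip()
    return settings


def effective_config(user_text, activity_overrides=None):
    """Layer the settings: built-in defaults < user conf < per-Activity overrides."""
    config = dict(DEFAULTS)
    config.update(parse_conf(user_text))
    config.update(activity_overrides or {})
    return config


if __name__ == "__main__":
    # Example user file for Docker running in a local virtual machine.
    mac_user_conf = """
    # address of the local Docker virtual machine (example value)
    docker.remote.host = 192.168.99.100
    """
    print(effective_config(mac_user_conf))
```

With this layering a user sets the Docker host once, and every workflow
step picks it up unless an Activity explicitly overrides it.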



On 5 June 2016 at 05:52, Nadeesh Dilanga <nadeesh092@gmail.com> wrote:
> Hi Stian,
> A new container for every execution means that we do not start a container
> with a given name, but let Docker generate the name as you mentioned, and
> simply start a container on every workflow node execution? Can one workflow
> have more than one Docker invocation step?
>
> Yes, now it really makes sense why we need a JSON format for Docker run. And
> that should be the Docker run JSON payload.
>
> The following is a simple format I tried for invoking a Docker run (starting
> a container).
>
> curl -X POST -H "Content-Type: application/json" -d '{
> "Hostname":"","User":"","Memory":0,"MemorySwap":0,"AttachStdin":false,"AttachStdout":true,"AttachStderr":true,
> "PortSpecs":null,"Tty":false,"OpenStdin":false,"StdinOnce":false,"Env":null,"Cmd":["date"],"Image":"ubuntu",
> "Tag":"latest","Volumes":{"/tmp":{}
> },"WorkingDir":"","DisableNetwork":false,"ExposedPorts":{"22/tcp": {} }
>
> Here is an example from the API doc with more elements to pass:
> request [2] and response [3].
>
> And sure, let's start with dumping logs to stdout.
>
> So the question now is whether we go ahead with a comprehensive request
> like [2]. On top of that, the host name is a mandatory field for which
> we need input from the workflow interface. What do you think?
>
>
> [2]
> POST /containers/create HTTP/1.1
> Content-Type: application/json
>
> {
>   "Hostname": "",
>   "Domainname": "",
>   "User": "",
>   "AttachStdin": false,
>   "AttachStdout": true,
>   "AttachStderr": true,
>   "Tty": false,
>   "OpenStdin": false,
>   "StdinOnce": false,
>   "Env": [ "FOO=bar", "BAZ=quux" ],
>   "Cmd": [ "date" ],
>   "Entrypoint": "",
>   "Image": "ubuntu",
>   "Labels": { "com.example.vendor": "Acme", "com.example.license": "GPL",
>               "com.example.version": "1.0" },
>   "Volumes": { "/volumes/data": {} },
>   "WorkingDir": "",
>   "NetworkDisabled": false,
>   "MacAddress": "12:34:56:78:9a:bc",
>   "ExposedPorts": { "22/tcp": {} },
>   "StopSignal": "SIGTERM",
>   "HostConfig": {
>     "Binds": ["/tmp:/tmp"],
>     "Links": ["redis3:redis"],
>     "Memory": 0, "MemorySwap": 0, "MemoryReservation": 0, "KernelMemory": 0,
>     "CpuShares": 512, "CpuPeriod": 100000, "CpuQuota": 50000,
>     "CpusetCpus": "0,1", "CpusetMems": "0,1",
>     "BlkioWeight": 300,
>     "BlkioWeightDevice": [{}],
>     "BlkioDeviceReadBps": [{}], "BlkioDeviceReadIOps": [{}],
>     "BlkioDeviceWriteBps": [{}], "BlkioDeviceWriteIOps": [{}],
>     "MemorySwappiness": 60, "OomKillDisable": false, "OomScoreAdj": 500,
>     "PidsLimit": -1,
>     "PortBindings": { "22/tcp": [{ "HostPort": "11022" }] },
>     "PublishAllPorts": false, "Privileged": false, "ReadonlyRootfs": false,
>     "Dns": ["8.8.8.8"], "DnsOptions": [""], "DnsSearch": [""],
>     "ExtraHosts": null,
>     "VolumesFrom": ["parent", "other:ro"],
>     "CapAdd": ["NET_ADMIN"], "CapDrop": ["MKNOD"], "GroupAdd": ["newgroup"],
>     "RestartPolicy": { "Name": "", "MaximumRetryCount": 0 },
>     "NetworkMode": "bridge",
>     "Devices": [],
>     "Ulimits": [{}],
>     "LogConfig": { "Type": "json-file", "Config": {} },
>     "SecurityOpt": [],
>     "CgroupParent": "",
>     "VolumeDriver": "",
>     "ShmSize": 67108864
>   },
>   "NetworkingConfig": {
>     "EndpointsConfig": {
>       "isolated_nw": {
>         "IPAMConfig": { "IPv4Address": "172.20.30.33",
>                         "IPv6Address": "2001:db8:abcd::3033" },
>         "Links": ["container_1", "container_2"],
>         "Aliases": ["server_x", "server_y"]
>       }
>     }
>   }
> }
>
>
> [3] And the response comes as
>
> HTTP/1.1 201 Created
> Content-Type: application/json
>
> { "Id":"e90e34656806", "Warnings":[] }
>
>
>
>
> On Sat, Jun 4, 2016 at 6:39 AM, Stian Soiland-Reyes <stain@apache.org>
> wrote:
>
>> Hi!
>>
>> I think we can assume that docker images are in the registry, although I
>> know it's possible to specify alternate registries as well, which is useful
>> for commercial entities. It could even be possible to store a custom docker
>> image within the Taverna wfbundle zip, although it would make it ginormous
>> ;) so let's start by assuming a public image in the Docker Hub,
>> probably versioned.
>>
>> Secondly, no, I don't think we can assume the container exists, as
>> workflows can be shared. A good motivator for the functionality you are
>> building is that a Taverna workflow that relies on command line tools can
>> be shared with others who don't have those tools already installed (in
>> expected location, path etc).
>>
>> I think for many use cases the container Name is not important and can be
>> new for every execution, e.g. let Docker assign its usual Jolly Badger
>> style names.
>>
>> What is the Docker API for setting up a container? Do you need to pull the
>> images first? It would be cool if we can pull at the beginning of a
>> workflow run (e.g. asynchronously when an Activity is configured) so that
>> the pull does not delay execution at a later step.
>>
>> We will probably need a volume mount of a temporary directory, so that
>> input and output files can be provided to the command, but if you prefer it
>> might be easier to start with stdin and stdout support; similar to the Tool
>> Activity.
>> On 4 Jun 2016 5:10 a.m., "Nadeesh Dilanga" <nadeesh092@gmail.com> wrote:
>>
>> > Hi all,
>> > I am starting this thread to discuss and finalize the docker commands we
>> > need to expose to the client side (Taverna).
>> >
>> > The latest stable Docker remote API is version 1.23 [1], and it has
>> > several APIs that can be useful.
>> >
>> > The original JIRA [2] mentioned a JSON format for a docker run. I
>> > hope it meant the docker config.json?
>> >
>> > Given that we use the remote APIs, I would like to know what the
>> > expectations are:
>> >
>> > 1. Do we assume that images are created and published to the registry?
>> > 2. Do we assume that the docker container is created?
>> >
>> > If #1 and #2 are done, then we are talking about starting the
>> > container (~docker run). If that is the case, with the remote APIs we
>> > only need the following, and no JSON is needed:
>> >
>> > Request: POST /containers/(id or name)/start
>> > Response: HTTP/1.1 204 No Content
>> >
>> > There are other responses too:
>> >
>> > Status Codes:
>> >
>> >    - *204* – no error
>> >    - *304* – container already started
>> >    - *404* – no such container
>> >    - *500* – server error
>> >
>> >
>> >
>> > [1] -
>> > https://docs.docker.com/engine/reference/api/docker_remote_api_v1.23/
>> >
>>
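
The create-then-start sequence from the quoted messages could be sketched
roughly like this (Python purely for illustration; the function names are
made up here, and nothing is sent over the network). It only builds the
POST /containers/create request, reads the Id from a 201 response like
[3], and maps the /start status codes listed above. No container name is
passed, so Docker would pick a generated one.

```python
import json

# Status codes for POST /containers/(id or name)/start, as listed in the thread.
START_STATUS = {
    204: "no error",
    304: "container already started",
    404: "no such container",
    500: "server error",
}


def build_create_request(image, cmd, env=None):
    """Return (method, path, headers, body) for a minimal create request."""
    payload = {
        "Image": image,
        "Cmd": list(cmd),
        "AttachStdin": False,
        "AttachStdout": True,
        "AttachStderr": True,
        "Tty": False,
    }
    if env:
        # Docker expects environment variables as "NAME=value" strings.
        payload["Env"] = ["%s=%s" % (k, v) for k, v in sorted(env.items())]
    headers = {"Content-Type": "application/json"}
    return "POST", "/containers/create", headers, json.dumps(payload)


def container_id_from_response(status, body):
    """Extract the container Id from a 201 Created response body."""
    if status != 201:
        raise RuntimeError("container not created, HTTP %d" % status)
    return json.loads(body)["Id"]


def start_path(container_id):
    """Path of the start request for a container that has been created."""
    return "/containers/%s/start" % container_id


def describe_start_status(status):
    """Translate the status code of the start request into a message."""
    return START_STATUS.get(status, "unexpected status %d" % status)


if __name__ == "__main__":
    print(build_create_request("ubuntu", ["date"]))
    cid = container_id_from_response(201, '{"Id":"e90e34656806","Warnings":[]}')
    print(start_path(cid), "->", describe_start_status(204))
```

A real activity would send these requests to whichever Docker host is
configured and would probably add a volume mount for input and output
files later on.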



-- 
Stian Soiland-Reyes
Apache Taverna (incubating), Apache Commons
http://orcid.org/0000-0001-9842-9718
