mesos-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Nils De Moor <nils.de.m...@gmail.com>
Subject Re: Docker odd behavior
Date Thu, 23 Oct 2014 08:50:49 GMT
Sorry for that, here is the direct link:
http://mail-archives.apache.org/mod_mbox/mesos-user/201410.mbox/%3CCALnA0DwbqJdEm3at7SonsXmCwxnN3%3DCUrwgoPBHXJDoOFyJjig%40mail.gmail.com%3E

On Thu, Oct 23, 2014 at 10:49 AM, Nils De Moor <nils.de.moor@gmail.com>
wrote:

> I had the same issue as you did, here is how I fixed it:
> http://mail-archives.apache.org/mod_mbox/mesos-user/201410.mbox/browser
>
> On Thu, Oct 23, 2014 at 1:42 AM, Connor Doyle <connor@mesosphere.io>
> wrote:
>
>> Hi Eduardo,
>>
>> There is a known defect in Mesos that matches your description:
>> https://issues.apache.org/jira/browse/MESOS-1915
>> https://issues.apache.org/jira/browse/MESOS-1884
>>
>> A fix will be included in the next release.
>> https://reviews.apache.org/r/26486
>>
>> You see the killTask because the default --task_launch_timeout value for
>> Marathon is 60 seconds.
>> Created an issue to make the logging around this better:
>> https://github.com/mesosphere/marathon/issues/732
>>
>> --
>> Connor
>>
>>
>> On Oct 22, 2014, at 16:18, Eduardo Jiménez <yoeduardoj@gmail.com> wrote:
>>
>> > Hi,
>> >
>> > I've started experimenting with mesos using the docker containerizer,
>> and running a simple example got into a very strange state.
>> >
>> > I have mesos-0.20.1, marathon-0.7 setup on EC2, using Amazon Linux:
>> >
>> > Linux <ip> 3.14.20-20.44.amzn1.x86_64 #1 SMP Mon Oct 6 22:52:46 UTC
>> 2014 x86_64 x86_64 x86_64 GNU/Linux
>> >
>> > Docker version 1.2.0, build fa7b24f/1.2.0
>> >
>> > I start the mesos slave with these relevant options:
>> >
>> > --cgroups_hierarchy=/cgroup
>> > --containerizers=docker,mesos
>> > --executor_registration_timeout=5mins
>> > --isolation=cgroups/cpu,cgroups/mem
>> >
>> > I launched a very simple app, which is from the mesosphere examples:
>> >
>> > {
>> >   "container": {
>> >     "type": "DOCKER",
>> >     "docker": {
>> >       "image": "libmesos/ubuntu"
>> >     }
>> >   },
>> >   "id": "ubuntu-docker2",
>> >   "instances": "1",
>> >   "cpus": "0.5",
>> >   "mem": "512",
>> >   "uris": [],
>> >   "cmd": "while sleep 10; do date -u +%T; done"
>> > }
>> >
>> > The app launches, but then mesos states the task is KILLED, yet the
>> docker container is STILL running. Here's the sequence of logs from that
>> mesos-slave.
>> >
>> > 1) Task gets created and assigned:
>> >
>> > I1022 17:44:13.971096 15195 slave.cpp:1002] Got assigned task
>> ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799 for framework
>> 20141017-172055-3489660938-5050-1603-0000
>> > I1022 17:44:13.971367 15195 slave.cpp:1112] Launching task
>> ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799 for framework
>> 20141017-172055-3489660938-5050-1603-0000
>> > I1022 17:44:13.973047 15195 slave.cpp:1222] Queuing task
>> 'ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799' for executor
>> ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799 of framework
>> '20141017-172055-3489660938-5050-1603-0000
>> > I1022 17:44:13.989893 15195 docker.cpp:743] Starting container
>> 'c1fc27c8-13e9-484f-a30c-cb062ec4c978' for task
>> 'ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799' (and executor
>> 'ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799') of framework
>> '20141017-172055-3489660938-5050-1603-0000'
>> >
>> > So far so good. The log statements right next to "Starting container"
>> is:
>> >
>> > I1022 17:45:14.893309 15196 slave.cpp:1278] Asked to kill task
>> ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799 of framework
>> 20141017-172055-3489660938-5050-1603-0000
>> > I1022 17:45:14.894579 15196 slave.cpp:2088] Handling status update
>> TASK_KILLED (UUID: 660dfd13-61a0-4e3f-9590-fba0d1a42ab2) for task
>> ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799 of framework
>> 20141017-172055-3489660938-5050-1603-0000 from @0.0.0.0:0
>> > W1022 17:45:14.894798 15196 slave.cpp:1354] Killing the unregistered
>> executor 'ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799' of framework
>> 20141017-172055-3489660938-5050-1603-0000 because it has no tasks
>> > E1022 17:45:14.925014 15192 slave.cpp:2205] Failed to update resources
>> for container c1fc27c8-13e9-484f-a30c-cb062ec4c978 of executor
>> ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799 running task
>> ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799 on status update for
>> terminal task, destroying container: No container found
>> >
>> > After this, there's several log messages like this:
>> >
>> > I1022 17:45:14.926197 15194 status_update_manager.cpp:320] Received
>> status update TASK_KILLED (UUID: 660dfd13-61a0-4e3f-9590-fba0d1a42ab2) for
>> task ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799 of framework
>> 20141017-172055-3489660938-5050-1603-0000
>> > I1022 17:45:14.926378 15194 status_update_manager.cpp:373] Forwarding
>> status update TASK_KILLED (UUID: 660dfd13-61a0-4e3f-9590-fba0d1a42ab2) for
>> task ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799 of framework
>> 20141017-172055-3489660938-5050-1603-0000 to master@10.0.0.147:5050
>> > W1022 17:45:16.169214 15196 status_update_manager.cpp:181] Resending
>> status update TASK_KILLED (UUID: 660dfd13-61a0-4e3f-9590-fba0d1a42ab2) for
>> task ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799 of framework
>> 20141017-172055-3489660938-5050-1603-0000
>> > I1022 17:45:16.169275 15196 status_update_manager.cpp:373] Forwarding
>> status update TASK_KILLED (UUID: 660dfd13-61a0-4e3f-9590-fba0d1a42ab2) for
>> task ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799 of framework
>> 20141017-172055-3489660938-5050-1603-0000 to master@10.0.0.147:5050
>> >
>> >
>> > Eventually the TASK_KILLED update is acked and the Mesos UI shows the
>> task as killed. By then, the process should be dead, but its not.
>> >
>> > $ sudo docker ps
>> > CONTAINER ID        IMAGE                    COMMAND
>> CREATED             STATUS              PORTS               NAMES
>> > f76784e1af8b        libmesos/ubuntu:latest   "/bin/sh -c 'while s   5
>> hours ago         Up 5 hours
>> mesos-c1fc27c8-13e9-484f-a30c-cb062ec4c978
>> >
>> >
>> > The container shows in the UI like this:
>> >
>> > ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799
>>  ubuntu-docker2.0995fb7f-5a13-11e4-a18e-56847afe9799     KILLED  5 hours
>> ago     5 hours ago
>> > And its been running the whole time.
>> >
>> > There's no other logging indicating why killTask was invoked, which
>> makes this extremely frustrating to debug.
>> >
>> > Has anyone seen something similar?
>> >
>> > Thanks,
>> >
>> > Eduardo
>> >
>>
>>
>

Mime
View raw message