mesos-user mailing list archives

From Benjamin Anderson
Subject Re: Troubles with slave recovery via Docker containerizer on 0.23.0
Date Thu, 06 Aug 2015 16:37:25 GMT
Hi Tim,

That's the output from `docker inspect`. I've gisted the full contents
of the container's log file (in all of its JSON-encoded glory) here:

The slave itself isn't logging much of interest, just various
"Executor has terminated with unknown status" messages, etc.

For context, my slave container is running Mesos 0.23.0, installed from
packages on Ubuntu 14.04. Docker is at 1.6.2.


On Wed, Aug 5, 2015 at 4:28 PM, Tim Chen wrote:
> Hi Ben,
> Did you get the command from docker inspect or from the slave log?
> If it's from the slave log, note that we don't print the exact way we
> exec the command there; we just join the exec arguments with a space in
> between.
> What's the exact error in the slave/sandbox stderr log?
> Tim
> On Wed, Aug 5, 2015 at 4:18 PM, Benjamin Anderson wrote:
>> Hi there - I'm working on setting up a Mesos environment with the
>> Docker containerizer and can't seem to get the recovery feature
>> working. I'm running CoreOS, so the slave processes themselves are
>> containerized. I have no issues running jobs without the recovery
>> features enabled, but all jobs fail to boot when I add the following
>> flags:
>>     MESOS_DOCKER_MESOS_IMAGE=myrepo/my-slave-container
>> Inspecting the Docker images and their log output reveals that the
>> container invocation appears to be flawed - see this gist:
>> The containerizer is attempting to invoke an unquoted command via
>> `/bin/sh -c`, which, predictably, fails to pass the complete command.
>> This results in the error message shown in the second file in the
>> linked gist.
>> This is reproducible manually; quoting the arguments to `/bin/sh -c`
>> results in success (at least, it correctly receives the supplied
>> arguments).
>> I gather that this is related to MESOS-2115, and it's clear that this
>> patch[1] changed that behavior significantly, but if it introduced a
>> bug I can't see it. It's possible that my instance is configured
>> incorrectly as well; the documentation here is a bit vague and there
>> aren't many examples on the web.
>> Thanks in advance,
>> --
>> b
>> [1]:
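
The quoting behavior described in the original report can be reproduced
outside Mesos and Docker. The following is a minimal sketch, assuming a
POSIX /bin/sh; the `echo hello world` command is a hypothetical stand-in
for the executor command line, not what the containerizer actually runs.

```shell
# Unquoted: sh -c treats only the first word ("echo") as the command string.
# The remaining words become $0 ("hello") and $1 ("world") of that command,
# so echo runs with no arguments and prints an empty line.
unquoted=$(/bin/sh -c echo hello world)

# Quoted: the whole string is handed to sh -c as a single command string.
quoted=$(/bin/sh -c 'echo hello world')

printf 'unquoted=[%s]\nquoted=[%s]\n' "$unquoted" "$quoted"
```

If the arguments reach `/bin/sh -c` as separate words, as in the first
case, only the first word is executed, which matches the symptom of the
command being truncated.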
