mesos-user mailing list archives

From Sharma Podila <>
Subject Re: Exposing executor container
Date Tue, 12 Aug 2014 20:48:01 GMT
You may already know this, but, this does sound similar to

There was a possible (and partial) solution in using soft limits for memory
for which a ticket was opened.
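
To make the soft-limit idea concrete, here is a minimal sketch at the cgroup level. It assumes the cgroup v1 memory controller, where memory.limit_in_bytes is the hard cap and memory.soft_limit_in_bytes is the threshold the kernel tries to reclaim down to under memory pressure. The function name set_memory_limits and the directory argument are hypothetical, chosen for illustration; this is not how Mesos itself configures containers.

```python
from pathlib import Path


def set_memory_limits(cgroup_dir, soft_bytes, hard_bytes):
    """Write a soft and a hard memory limit into a cgroup v1 memory directory.

    cgroup_dir is a hypothetical path to the container's memory cgroup.
    """
    if soft_bytes > hard_bytes:
        raise ValueError("soft limit must not exceed the hard limit")

    cgroup = Path(cgroup_dir)
    # Hard cap: allocations beyond this trigger reclaim and then the OOM killer.
    (cgroup / "memory.limit_in_bytes").write_text(str(hard_bytes))
    # Soft limit: the kernel only pushes usage back toward this value
    # when the host is under memory pressure.
    (cgroup / "memory.soft_limit_in_bytes").write_text(str(soft_bytes))
```

With a scheme like this, a task could burst above its soft limit during artifact downloads without being killed, as long as it stays below the hard cap. The real cgroup path for a container depends on how the slave is configured, so any concrete path passed in would be an assumption.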

On Tue, Aug 12, 2014 at 1:17 PM, Thomas Petr <> wrote:

> That solution would likely cause us more pain -- we'd still need to figure
> out an appropriate amount of resources to request for artifact downloads /
> extractions, our scheduler would need to be sophisticated enough to only
> accept offers from the same slave that the setup task ran on, and we'd need
> to manage some new shared artifact storage location outside of the
> containers. Is splitting workflows into multiple tasks like this a common
> pattern?
> I personally agree that tasks manually overriding cgroups limits is a
> little sketchy (and am curious how MESOS-1279 would affect this
> discussion), but I doubt that we'll be the last people to attempt something
> like this. In other words, we acknowledge we're going rogue by
> temporarily overriding the limits... are there other implications of
> exposing the container ID that you're worried about?
> Do you have any thoughts about my other idea (overriding the fetcher
> executable for a task)?
> Thanks,
> Tom
> On Tue, Aug 12, 2014 at 2:05 PM, Vinod Kone <> wrote:
>> Thanks Thomas for the clarification.
>> One solution you could consider would be separating out the setup
>> (fetch/extract) phase and running phase into separate mesos tasks. That way
>> you can give the setup task the resources needed for fetching/extracting and as
>> soon as it is done, you can send a TASK_FINISHED so that the resources used
>> by that task are reclaimed by Mesos. That would give you the dynamism you
>> need. Would that work in your scenario?
>> Having the executor change cgroup limits behind the scenes, opaquely to
>> Mesos, seems like a recipe for problems in the future to me, since it could
>> lead to temporary over-commit of resources and affect isolation across
>> containers.
>> On Tue, Aug 12, 2014 at 10:45 AM, Thomas Petr <> wrote:
>>> Hey Vinod,
>>> We're not using mesos-fetcher to download the executor -- we ensure our
>>> executor exists on the slaves beforehand (during machine provisioning, to
>>> be exact). The issue that Whitney is talking about is OOMing while fetching
>>> artifacts necessary for task execution (like the JAR for a web service).
>>> Our own executor
>>> <>
>>> has some nice enhancements around S3 downloads and artifact caching that we
>>> don't necessarily want to lose if we switched back to using mesos-fetcher.
>>> Surfacing the container ID seems like a trivial change, but another
>>> alternative could be to allow frameworks to specify an alternative fetcher
>>> executable (perhaps in CommandInfo?).
>>> Thanks,
>>> Tom
>>> On Tue, Aug 12, 2014 at 1:09 PM, Vinod Kone <> wrote:
>>>> Hi Whitney,
>>>> While we could conceivably set the container id in the environment of
>>>> the executor, I would like to understand the problem you are facing.
>>>> The fetching and extracting of the executor is done by
>>>> mesos-fetcher, a process forked by the slave and run under the slave's cgroup.
>>>> AFAICT, this shouldn't cause an OOM in the executor. Does your executor do
>>>> more fetches/extracts once it is launched (e.g., for user's tasks)?
