mesos-user mailing list archives

From Kapil Malik <kma...@adobe.com>
Subject RE: Custom docker executor
Date Sun, 09 Aug 2015 03:16:58 GMT
Hi Tim,

Thanks again. Inline.


From: Tim Chen [mailto:tim@mesosphere.io]
Sent: 08 August 2015 23:23
To: user@mesos.apache.org
Subject: Re: Custom docker executor

Hi Kapil,

Thanks, this information is very useful.

About failure callbacks in Chronos, I agree it's indeed something that Chronos should just
provide as it's a common feature for job dependency managers. We can have more discussions
on your github issue.
[kmalik] : Agreed. I am also thinking of raising a PR with Marathon-like functionality. This
might be the simplest option for now.

Marathon doesn't have a failure callback, as it simply keeps retrying a long-running service.
Do you need a callback for every failed attempt? As for the event stream, if it is too much
information, perhaps Marathon can provide a filter so that you subscribe only to the events
you care about. Feel free to open an issue on GitHub and we can discuss it there as well.
[kmalik] : Hmm, right. Actually, I realize that in my case it’s more of a health check. A filter
would still be good to have; sure, I will open an issue.
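
Until such a filter exists in Marathon, the filtering could be done on the subscriber side. The
following is only a minimal sketch, not Marathon's API: it assumes each event arrives as a JSON
object with an `eventType` field and, for task status updates, a `taskStatus` field (my reading
of the event bus documentation). The function names and the chosen set of failure states are
hypothetical.

```python
import json

# Event types and task states worth forwarding (assumed for illustration).
WANTED_EVENT_TYPES = {"status_update_event"}
FAILURE_STATES = {"TASK_FAILED", "TASK_LOST", "TASK_KILLED"}


def is_interesting(event):
    """Return True only for task status updates that report a failure."""
    if event.get("eventType") not in WANTED_EVENT_TYPES:
        return False
    return event.get("taskStatus") in FAILURE_STATES


def filter_events(raw_events):
    """Parse JSON event payloads and keep only the interesting ones."""
    kept = []
    for raw in raw_events:
        try:
            event = json.loads(raw)
        except ValueError:
            continue  # skip malformed payloads instead of failing the whole batch
        if isinstance(event, dict) and is_interesting(event):
            kept.append(event)
    return kept
```

An HTTP callback endpoint would pass each request body through `filter_events` and forward
only what survives, which keeps the "information overload" away from the API server.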

About custom health checks, Mesos already lets you provide a health check command and
specify the time interval and the grace period that the health check can tolerate; frameworks
just need to integrate with this feature. Marathon already does, using a command to
health check a job. However, this feature is not yet supported for Docker containers, and I
will be looking into this soon. Would providing a command that runs inside the Docker container
for health checks be sufficient for you?
[kmalik] : True, Mesos has good options for implementing health checks. But Marathon / Chronos
integrate with them only in a limited capacity today. Since I don’t want to write a new framework,
I thought of achieving it via a custom executor at least.
Yes, custom health check support via docker exec would be nice. Something along the lines of
Kubernetes health probes (being greedy ☺ ).
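
To make the interval / grace period idea concrete, here is a minimal sketch of a command-based
health check loop. It is not the Mesos or Marathon implementation; `docker_exec_check`,
`monitor` and all of their parameters are hypothetical names. The only external command it
relies on is `docker exec CONTAINER COMMAND`, and it assumes exit code 0 means healthy.

```python
import subprocess
import time


def docker_exec_check(container, command):
    """Build a check that runs `command` inside `container` via `docker exec`."""
    def check():
        # Exit code 0 is treated as healthy, anything else as unhealthy.
        return subprocess.call(["docker", "exec", container] + list(command)) == 0
    return check


def monitor(check, interval, grace_period, max_failures,
            max_checks=None, sleep=time.sleep, clock=time.monotonic):
    """Poll `check` and return "unhealthy" after too many consecutive failures.

    Failures seen during the first `grace_period` seconds are ignored.
    Returns "healthy" if `max_checks` checks complete without tripping.
    """
    start = clock()
    failures = 0
    checks = 0
    while max_checks is None or checks < max_checks:
        healthy = check()
        checks += 1
        if healthy:
            failures = 0
        elif clock() - start >= grace_period:
            failures += 1
            if failures >= max_failures:
                return "unhealthy"
        sleep(interval)
    return "healthy"
```

For example, `monitor(docker_exec_check("my-container", ["true"]), interval=10, grace_period=30,
max_failures=3)` would poll every 10 seconds and give up after three consecutive failures
outside the grace period. The `sleep` and `clock` parameters exist only so the loop can be
exercised without waiting.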

About Spark jobs, have you considered using the new cluster mode for Spark on Mesos instead (in
Spark 1.4)? I've worked on that feature, and it launches
all your Spark jobs as Mesos tasks, similar to Marathon and Chronos, but it's built into Spark
and can speak the Spark submit protocol natively.
[kmalik] : I see, sure will go through this. Thanks
Anyhow, can you specify more exactly which orphan Spark workers you are referring to? I believe
that if you launch Spark jobs that talk to Mesos there shouldn't be any need for Spark workers,
since those are for Standalone mode.
[kmalik] : Pardon my use of incorrect terminology. I meant that the Chronos job (running the Spark
driver inside a docker) was killed, but the ‘spark framework’ which it registered with
Mesos was still alive. Since my API server deals only with Chronos / Marathon, it had no way
to identify and clean up the Spark framework registered by the docker.
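
One way an API server could find and remove such a leftover framework is sketched below. This
is only an illustration under assumptions: that the master's state JSON lists frameworks with
`id`, `name` and `active` fields, that the master accepts a POST to `/master/teardown` with a
`frameworkId` form field (please verify both against your Mesos version), and that the Spark
application name starts with an identifier the API server knows, such as the job name. The
function names are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request


def find_frameworks(state, name_prefix):
    """Return IDs of active frameworks whose name starts with `name_prefix`."""
    return [
        fw["id"]
        for fw in state.get("frameworks", [])
        if fw.get("active", True) and fw.get("name", "").startswith(name_prefix)
    ]


def teardown_request(master_url, framework_id):
    """Build (but do not send) a POST asking the master to tear down a framework."""
    body = urlencode({"frameworkId": framework_id}).encode("ascii")
    return Request(master_url.rstrip("/") + "/master/teardown", data=body, method="POST")
```

The request object can be sent with `urllib.request.urlopen` once the job is known to be dead.
Building the request separately from sending it makes it easy to log or review what would be
torn down first.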

Tim

On Sat, Aug 8, 2015 at 3:45 AM, Kapil Malik <kmalik@adobe.com> wrote:
Hi Tim,

Thank you for the quick reply. As I mentioned, we need to run short-lived jobs (currently using
Chronos) and long-lived jobs (currently using Marathon).
While I cannot provide elaborate details, our requirements include the following:


1. Failure callback for jobs scheduled on Chronos / Marathon

   a. Chronos doesn’t provide decent callback hooks today (please correct me if I am
      wrong): https://github.com/mesos/chronos/issues/473 . Even for a successfully completed
      job, I need a dependent job that makes a call to my service.

   b. Marathon has the option of subscribing to the event bus
      (https://mesosphere.github.io/marathon/docs/event-bus.html), but we are afraid that it
      might result in information overload, since it sends all sorts of events.

2. Custom health checks
This is not a pre/post hook per se. But for a long-running (finite = Chronos / infinite = Marathon)
job, we need some periodic health checks. Marathon has a basic HTTP health check, which is good
for a start, but we may need slightly more elaborate health checks. Again, Chronos doesn’t have
them at all.

3. Managing Spark jobs
Users of our API can submit docker images which run a Spark job on Mesos. Thus, the Spark
driver runs inside the user docker on Chronos / Marathon, and registers another Mesos framework
for Spark, running on other Mesos slaves.
Now, in the real world with huge amounts of data, it often happens that a Spark job fails for one
reason or another. This leaves some orphaned Spark workers and needs manual cleanup. With
a custom executor, we may pass a hint that it’s a Spark job docker, so that it can ensure
appropriate cleanup in case of failure.

So when you mentioned “hooks that can be performed pre and post container launch”, can
you provide some examples? Are they available as plugins / extensions on mesos or docker?

@Mike Michel, thank you for the powerstrip (https://github.com/ClusterHQ/powerstrip)
link. Looks quite useful, will go through it in detail and see whether it can serve some of
our requirements.

Thanks and regards,

Kapil Malik | kmalik@adobe.com | 33430 / 8800836581

From: Tim Chen [mailto:tim@mesosphere.io]
Sent: 08 August 2015 13:42
To: user@mesos.apache.org
Subject: Re: Custom docker executor

Hi Kapil,

What kind of pre/post actions would you like to perform?

The community has been contributing hooks that can be performed pre and post container launch,
so I'd like to see what your use cases are.
Perhaps the new hooks can satisfy your need, or maybe there is even some other way that can
already do what you would like to achieve.

Tim

On Sat, Aug 8, 2015 at 1:01 AM, Kapil Malik <kmalik@adobe.com> wrote:
… posting in a fresh thread
Hi,

We have a use case to run multi-user workloads on Mesos. Users provide docker images encapsulating
application logic, which we (we = say some “Central API”) schedule on Chronos / Marathon.
However, we need to run some standard pre/post steps for every docker submitted by users.
We have the following options:


1. Ask every user to embed their logic inside a pre-defined docker template which will
perform the pre/post steps.

==> This is error prone, makes us dependent on whether the users followed the template, and
is not very popular with users either.

2. Extend every user docker (FROM <>) and find a way to add the pre/post steps in
our docker. Refer to this docker when scheduling on Chronos / Marathon.

==> Building new dockers does not scale as users and applications grow.

3. Write a custom executor which will perform the pre/post steps and manage the user
docker lifetime.

==> Deals with the user docker lifetime and is obviously complex.

Is there a standard / openly available DockerExecutor which manages the docker lifetime and
which I can extend to build my custom executor?
For instance, do you suggest extending https://github.com/apache/mesos/blob/master/src/docker/executor.cpp
as a starting point? Can I access it in Java?

This way I will be concerned only with my custom logic (pre/post steps) and still get benefits
of a standard way to manage docker containers.
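
As a rough illustration of the third option, the pre/post logic itself can be kept separate from
the container handling. The sketch below is not a Mesos executor and does not use the executor
API; it only shows one possible ordering policy, and `run_with_hooks` and its parameters are
hypothetical names. Each step is an argv list, so the main command could be something like
`["docker", "run", "--rm", image]`.

```python
import subprocess


def run_with_hooks(pre_steps, main_cmd, post_steps, run=subprocess.call):
    """Run pre steps, then the main command, then post steps.

    Each step is an argv list. If a pre step fails, the main command is
    skipped. Post steps always run, so cleanup happens even after a failure.
    Returns the exit code of the first failing pre step or of the main command.
    """
    exit_code = 0
    try:
        for step in pre_steps:
            exit_code = run(step)
            if exit_code != 0:
                return exit_code
        exit_code = run(main_cmd)
        return exit_code
    finally:
        for step in post_steps:
            run(step)  # post-step failures are ignored in this sketch
```

Running the post steps in a `finally` block is a design choice made here so that cleanup is not
skipped when the user container fails; a real executor would also need to report task status
back to Mesos and handle kill requests.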


Thanks and regards,

Kapil Malik | kmalik@adobe.com | 33430 / 8800836581


