aurora-dev mailing list archives

From meghdoot bhattacharya <meghdoo...@yahoo.com.INVALID>
Subject Re: Support instance-specific TaskConfig in CreateJob API
Date Tue, 16 Aug 2016 06:49:20 GMT
DockerContainerizer integration has drawbacks. You can probably use Thermos (with the MesosContainerizer)
and launch docker containers through it, using "instanceId" to pull in configs and then pass
options to docker run.
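Roughly, a sketch of that approach in an .aurora config could look like the
following (assuming the standard {{mesos.instance}} binding; the per-instance
options file under /etc/myservice/ is made up for the example):

    # Hypothetical .aurora config sketch: Thermos itself runs `docker run`,
    # so the instance id is known when the container options are chosen.
    run_container = Process(
      name = 'run_container',
      cmdline = 'docker run --rm '
                '$(cat /etc/myservice/instance-{{mesos.instance}}.opts) '
                'myservice:latest'
    )

    task = Task(
      name = 'myservice',
      processes = [run_container],
      resources = Resources(cpu = 1, ram = 512*MB, disk = 1*GB)
    )

    jobs = [Service(
      cluster = 'devcluster',
      role = 'www-data',
      environment = 'prod',
      name = 'myservice',
      task = task,
      instances = 2
    )]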
Alternatively you could use what we are building: https://github.com/mesos/docker-compose-executor

where we massage the final compose file in sandbox using custom plugins: https://github.com/mesos/docker-compose-executor/blob/master/src/main/java/com/paypal/mesos/executor/pluginapi/ComposeExecutorPluginImpl.java

We get the pod IP etc. and generate the final compose file. One can use the instanceId in this
case as well to grab configs and edit the final compose file, for example in your use case.
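For illustration only (this is not the actual plugin API), the kind of
per-instance massaging a plugin could do might look roughly like this,
assuming the instance id and pod IP are handed to it:

    # Illustrative sketch only -- not the docker-compose-executor plugin API.
    # Idea: take the generated compose file and patch in values that depend
    # on the instance id before `docker-compose up` runs.
    import yaml  # assumes PyYAML is available in the executor environment

    def patch_compose(compose_path, instance_id, pod_ip):
        with open(compose_path) as f:
            compose = yaml.safe_load(f)

        for name, service in compose.get('services', {}).items():
            # per-instance log label, matching the use case discussed below
            service.setdefault('labels', {})['loglabel'] = (
                '%s-instance-%d' % (name, instance_id))
            # per-instance volume so each instance gets its own storage
            service.setdefault('volumes', []).append(
                'data-%s-%d:/data' % (name, instance_id))
            # expose the pod IP to the container
            service.setdefault('environment', {})['POD_IP'] = pod_ip

        with open(compose_path, 'w') as f:
            yaml.safe_dump(compose, f)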
Thx




      From: Mauricio Garavaglia <mauriciogaravaglia@gmail.com>
 To: dev@aurora.apache.org 
 Sent: Monday, August 15, 2016 11:12 AM
 Subject: Re: Support instance-specific TaskConfig in CreateJob API
   
On Mon, Aug 15, 2016 at 2:49 PM, David McLaughlin <dmclaughlin@apache.org>
wrote:

> On Mon, Aug 15, 2016 at 10:40 AM, Mauricio Garavaglia <
> mauriciogaravaglia@gmail.com> wrote:
>
> > Hi,
> >
> > WRT the constraint use case:
> >
> > Let's say I have two instances of a service; these instances would need
> > different Docker arguments:
> >
>
> But that's less about using Docker and more about how you're using Docker in
> particular? Also the features of your executor? For example, if you're
> using Thermos you can access the instance id in your process definition.
>
>
For example, using the ceph rbd volume plugin [1] to control how each
instance stores its data, or the journald log driver [2] to later centralize
the logs related to an instance in logstash.

I agree that those are not the trivial docker use cases, but I don't think
they are far beyond the limits of what we should be able to do.

Regarding accessing the instance id in the executor: by the time thermos
starts it's a bit late, as the container has already been created.

[1] http://ceph.com/planet/getting-started-with-the-docker-rbd-volume-plugin/
[2] https://docs.docker.com/engine/admin/logging/journald/
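To make the gap concrete, this is roughly what one would want to express for
a single instance, sketched with Aurora's existing Docker/Parameter config
objects (the values shown for instance 0 are illustrative); the problem is
that these parameters are fixed for the whole job, and instance-specific
values are not available when the container is created:

    # Sketch of the per-instance docker arguments these use cases need.
    # A job-wide TaskConfig cannot vary these values per instance today.
    container = Container(
      docker = Docker(
        image = 'myservice:latest',
        parameters = [
          # journald log driver, tagging log lines with a per-instance label
          Parameter(name = 'log-driver', value = 'journald'),
          Parameter(name = 'label', value = 'loglabel=serviceA-instance-0'),
          # each instance mounts its own rbd-backed volume
          Parameter(name = 'volume-driver', value = 'rbd'),
          Parameter(name = 'volume', value = 'serviceA-data-0:/data'),
        ]
      )
    )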



>
>
> >
> > - Labels. Log drivers use container labels to identify who is producing
> > these logs. The container name doesn't work with Mesos of course. So you
> > have loglabel=serviceA-instance-0 and the other instance has
> > loglabel=serviceA-instance-1.
> > - Volumes. Each instance must read/write on its own volume, in a similar
> > way to how each aurora instance writes to its own distributed log
> > instance.
> > - Ip addresses, etc.
> >
> > That means having separate jobs would be the thing to do right now. But
> > if for HA reasons we don't want to let aurora schedule them on the same
> > rack, the solution would be to add constraints on both jobs manually
> > assigning them to a predefined set of racks.
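(For reference, the within-job rack anti-affinity being referred to looks
roughly like the sketch below, assuming a 'rack' attribute is published on
the agents; the constraint is scoped to one job and cannot span two separate
jobs.)

    task = Task(
      name = 'serviceA',
      processes = [Process(name = 'main', cmdline = 'docker run myservice:latest')],
      resources = Resources(cpu = 1, ram = 512*MB, disk = 1*GB)
    )

    jobs = [Service(
      cluster = 'devcluster',
      role = 'www-data',
      environment = 'prod',
      name = 'serviceA',
      instances = 2,
      # at most one instance per rack value
      constraints = {'rack': 'limit:1'},
      task = task
    )]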
> >
> > Coping with constraints is not the only use case; the same goes for
> > rolling updates. Having different jobs means that the rolling update
> > process needs to be manually implemented on top of the aurora api
> > instead of using the one that is provided out of the box.
> >
> >
> >
> >
> > On Mon, Aug 15, 2016 at 2:05 PM, Maxim Khutornenko <maxim@apache.org>
> > wrote:
> >
> > > I would love to hear more about constraint use cases that don't work
> > > across jobs to see if/how we can extend Aurora to support them.
> > >
> > > As far as heterogeneous jobs go, that effort would require rethinking
> > > quite a few assumptions around fundamental Aurora principles to ensure
> > > we don't lock ourselves into the corner wrt future features by
> > > accepting an "easy to do" change short-term. I am -1 on supporting
> > > anything specific for adhoc jobs only. IMO, this has to be an
> > > all-or-nothing feature adding support for heterogeneous jobs across
> > > the stack.
> > >
> > > If you guys feel strongly about this idea, please craft a high-level
> > > design summary for the community to explore and review.
> > >
> > > On Sat, Aug 13, 2016 at 7:43 AM, Mauricio Garavaglia <
> > > mauriciogaravaglia@gmail.com> wrote:
> > >
> > > > Hi,
> > > >
> > > > We have been experimenting with the idea of having heterogeneous tasks
> > > > in a job, mainly to support different docker container configurations
> > > > (like volumes to let tasks have different storage, different labels for
> > > > logging purposes, or ip addresses).
> > > > The main reason for using this instead of separate jobs is that
> > > > scheduling constraints don't work across jobs, and we may want to have
> > > > rack anti-affinity for the different instances.
> > > >
> > > > You can check how it works on the README in the repo [
> > > > https://github.com/medallia/aurora/tree/0.13.0-medallia]. Basically the
> > > > job includes a list of parameters that are later interpolated in the
> > > > task config during mesos task creation, so this happens at a later time
> > > > and the different values to apply to each instance are held in the
> > > > config. We can start discussing if you think the design is sound or the
> > > > feature could be helpful and start working to move it upstream.
> > > >
> > > > We used StartJobUpdate to achieve the same purpose but it required
> > > > some gymnastics during deployment that we wanted to avoid. Regarding
> > > > Min Cai's issue about short-lived tasks finishing before the update
> > > > starts, we solved it by initially configuring all the tasks with a
> > > > dummy NOP ("no operation") process that just sits there waiting to be
> > > > updated.
> > > >
> > > > Mauricio
> > > >
> > > >
> > > > On Fri, Aug 12, 2016 at 3:17 PM, Min Cai <mincai@gmail.com> wrote:
> > > >
> > > > > Thanks Maxim. Please see my previous email to David's comments for
> > > > > a more detailed response.
> > > > >
> > > > > On Fri, Aug 12, 2016 at 9:24 AM, Maxim Khutornenko <maxim@apache.org>
> > > > > wrote:
> > > > >
> > > > > > I am cautious about merging createJob and startJobUpdate as we
> > > > > > don't support updates of adhoc jobs. It's logically unclear what
> > > > > > adhoc job update would mean as adhoc job instances are not intended
> > > > > > to survive terminal state.
> > > > > >
> > > > >
> > > > > +1. Our adhoc job instances could be short-lived and finished way
> > > > > before StartJobUpdate calls are made to Aurora.
> > > > >
> > > > >
> > > > > >
> > > > > > Even if we decided to do so I am afraid it would not help with
> > > > > > the scenario of creating a new heterogeneous job as the updater
> > > > > > only supports a single TaskConfig target.
> > > > > >
> > > > >
> > > > > We will have to make N StartJobUpdate calls to update N distinct
> > > > > task configs so it will be expensive if N is large like > 10K.
> > > > >
> > > > >
> > > > > >
> > > > > > Speaking broadly, Aurora is built around the idea of homogeneous
> > > > > > jobs. It's possible to have different task configs to support
> > > > > > canaries and update rolls but we treat that state as *temporary*
> > > > > > until config reconciliation completes.
> > > > > >
> > > > >
> > > > > Agreed that homogeneous jobs are an important design consideration
> > > > > for *long-running* jobs like Services. However, most adhoc jobs are
> > > > > heterogeneous by nature. For example, they might need to process
> > > > > different input files and write to different output files. Or they
> > > > > might take different parameters etc. It would be nice to extend
> > > > > Aurora to support heterogeneous tasks so that it can be used for
> > > > > broader use cases as a meta-scheduler.
> > > > >
> > > > > Thanks, - Min
> > > > >
> > > > >
> > > > > > On Fri, Aug 12, 2016 at 8:03 AM, David McLaughlin <dmclaughlin@apache.org>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi Min,
> > > > > > >
> > > > > > > I'd prefer to add support for ad-hoc jobs to startJobUpdate and
> > > > > > > completely remove the notion of job create.
> > > > > > >
> > > > > > > " Also, even the
> > > > > > > > StartJobUpdate API is not scalable to a job with 10K
~ 100K
> > task
> > > > > > > instances
> > > > > > > > and each instance has different task config since
we will
> have
> > to
> > > > > > invoke
> > > > > > > > StartJobUpdate for each distinct task config."
> > > > > > >
> > > > > > >
> > > > > > > What is the use case for that? Aurora was designed to have
> > > > > > > those as separate jobs.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > David
> > > > > > >
> > > > > > > On Thu, Aug 11, 2016 at 2:56 PM, Min Cai <mincai@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hey fellow Aurora team:
> > > > > > > >
> > > > > > > > We would like to propose a simple and backwards compatible
> > > > > > > > feature in the CreateJob API so that we can support
> > > > > > > > instance-specific TaskConfigs. The use case here is for an Adhoc
> > > > > > > > job which has different resource settings as well as different
> > > > > > > > command line arguments for each task instance. Aurora today
> > > > > > > > already supports heterogeneous tasks for the same job via the
> > > > > > > > StartJobUpdate API, i.e. we can update the job instances to use
> > > > > > > > different task configs. This works reasonably well for long
> > > > > > > > running tasks like Services. However, it is not feasible for
> > > > > > > > Adhoc jobs where each task will finish right away before we even
> > > > > > > > have a chance to invoke StartJobUpdate. Also, even the
> > > > > > > > StartJobUpdate API is not scalable to a job with 10K ~ 100K task
> > > > > > > > instances and each instance has different task config since we
> > > > > > > > will have to invoke StartJobUpdate for each distinct task config.
> > > > > > > >
> > > > > > > > The proposal we have is to add an optional field in
> > > > > > > > JobConfiguration for instance-specific task configs. It will
> > > > > > > > override the default task config for the given instance ID
> > > > > > > > ranges if specified. Otherwise, everything will be backwards
> > > > > > > > compatible with the current API. The implementation of this
> > > > > > > > change also seems to be very simple. We only need to plumb the
> > > > > > > > instance-specific task configs when we call
> > > > > > > > statemanager.insertPendingTasks in the
> > > > > > > > SchedulerThriftInterface.createJob function.
> > > > > > > >
> > > > > > > >  /**
> > > > > > > >  * Description of an Aurora job. One task will be scheduled for
> > > > > > > >  * each instance within the job.
> > > > > > > >  */
> > > > > > > > @@ -328,13 +343,17 @@ struct JobConfiguration {
> > > > > > > >    4: string cronSchedule
> > > > > > > >    /** Collision policy to use when handling overlapping cron
> > > > > > > >     * runs. Default is KILL_EXISTING. */
> > > > > > > >    5: CronCollisionPolicy cronCollisionPolicy
> > > > > > > > -  /** Task configuration for this job. */
> > > > > > > > +  /** Default task configuration for all instances of this job. */
> > > > > > > >    6: TaskConfig taskConfig
> > > > > > > >    /**
> > > > > > > >    * The number of instances in the job. Generated instance IDs
> > > > > > > >    * for tasks will be in the range [0, instances).
> > > > > > > >    */
> > > > > > > >    8: i32 instanceCount
> > > > > > > > +  /**
> > > > > > > > +  * The instance specific task configs that override the
> > > > > > > > +  * default task config for given instanceId ranges.
> > > > > > > > +  */
> > > > > > > > +  10: optional set<InstanceTaskConfig> instanceTaskConfigs
> > > > > > > >  }
> > > > > > > >
> > > > > > > > Please let us know your comments and suggestions.
> > > > > > > >
> > > > > > > > Thanks, - Min
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>


  