aurora-dev mailing list archives

From Bryan Helmkamp <br...@codeclimate.com>
Subject Re: Suitability of Aurora for one-time tasks
Date Wed, 26 Feb 2014 21:08:56 GMT
Sure. Yes, they are shell commands and yes they are provided different
configuration on each run.

In effect, we have a number of different job types that are queued up and
that we need to run as quickly as possible. Each job type has different
resource requirements. Every time we run a job, we provide different
arguments (the "payload"). For example:

$ ./do_something.sh SOME_ID (Requires 1 CPU and 1GB RAM)
$ ./do_something_else.sh SOME_OTHER_ID (Requires 4 CPU and 4GB RAM)
[... there are about 12 of these ...]
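
(For illustration, a rough sketch of how the first of these might look as an
Aurora task -- the {{payload}} template variable and the disk size are
assumptions; only the command, CPU, and RAM come from the example above:)

# do_something.aurora -- hypothetical sketch, not a tested config
run_one = Process(
  name = 'run_one',
  # SOME_ID would be supplied per run via the assumed 'payload' binding
  cmdline = './do_something.sh {{payload}}')

do_something = Task(
  name = 'do_something',
  processes = [run_one],
  resources = Resources(cpu = 1.0, ram = 1 * GB, disk = 8 * GB))  # disk assumed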

-Bryan

On Wed, Feb 26, 2014 at 3:58 PM, Bill Farner <wfarner@apache.org> wrote:
> Can you offer some more details on what the workload execution looks like?
>  Are these shell commands?  An application that's provided different
> configuration?
>
> -=Bill
>
>
> On Wed, Feb 26, 2014 at 12:45 PM, Bryan Helmkamp <bryan@codeclimate.com> wrote:
>
>> Thanks, Kevin. The idea of always-on workers of varying sizes is
>> effectively what we have right now in our non-Mesos world. The problem
>> is that sometimes we end up with not enough workers for certain
>> classes of jobs (e.g. High Memory), while part of the cluster sits
>> idle.
>>
>> Conceptually, in my mind we would define approximately a dozen Tasks,
>> one for each type of work we need to perform (with different resource
>> requirements), and then run Jobs, each with a Task and a unique
>> payload, but I don't think this model works with Mesos. It seems we'd
>> need to create a unique Task for every Job.
>>
>> -Bryan
>>
>> On Wed, Feb 26, 2014 at 3:35 PM, Kevin Sweeney <kevints@apache.org> wrote:
>> > A job is a group of nearly-identical tasks plus some constraints like
>> > rack diversity. The scheduler considers each task within a job
>> > equivalently schedulable, so you can't vary things like resource
>> > footprint. It's perfectly fine to have several jobs with just a single
>> > task, as long as each has a different job key (which is (role,
>> > environment, name)).
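>> >
>> > A minimal sketch of that first option -- one single-task job per work
>> > type, each with its own job key and resource footprint (the names,
>> > sizes, and {{payload}} binding are illustrative assumptions, not a
>> > tested config):
>> >
>> > # per_type_jobs.aurora
>> > analyze = Process(name = 'analyze',
>> >   cmdline = './do_something.sh {{payload}}')
>> >
>> > jobs = [
>> >   Job(
>> >     task = Task(name = 'do_something',
>> >       processes = [analyze],
>> >       resources = Resources(cpu = 1.0, ram = 1 * GB, disk = 8 * GB)),
>> >     cluster = 'west',
>> >     role = 'service-account-name',
>> >     environment = 'prod',
>> >     name = 'do_something',  # a distinct (role, environment, name) per type
>> >     instances = 1,
>> >   ),
>> >   # ... one Job per remaining work type ...
>> > ]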
>> >
>> > Another approach is to have a bunch of uniform always-on workers (in
>> > different sizes). This can be expressed as a Service like so:
>> >
>> > # workers.aurora
>> > class Profile(Struct):
>> >   queue_name = Required(String)
>> >   resources = Required(Resources)
>> >   instances = Required(Integer)
>> >
>> > HIGH_MEM = Resources(cpu = 8.0, ram = 32 * GB, disk = 64 * GB)
>> > HIGH_CPU = Resources(cpu = 16.0, ram = 4 * GB, disk = 64 * GB)
>> >
>> > work_forever = Process(name = 'work_forever',
>> >   cmdline = '''
>> >     # TODO: Replace this with something that isn't pseudo-bash
>> >     while true; do
>> >       work_item=`take_from_work_queue {{profile.queue_name}}`
>> >       do_work "$work_item"
>> >       tell_work_queue_finished "{{profile.queue_name}}" "$work_item"
>> >     done
>> >   ''')
>> >
>> > task = Task(processes = [work_forever],
>> >   resources = '{{profile.resources}}',  # Note this is static per queue-name.
>> > )
>> >
>> > service = Service(
>> >   task = task,
>> >   cluster = 'west',
>> >   role = 'service-account-name',
>> >   environment = 'prod',
>> >   name = '{{profile.queue_name}}_processor',
>> >   instances = '{{profile.instances}}',  # Scale here.
>> > )
>> >
>> > jobs = [
>> >   service.bind(profile = Profile(
>> >     resources = HIGH_MEM,
>> >     queue_name = 'graph_traversals',
>> >     instances = 50,
>> >   )),
>> >   service.bind(profile = Profile(
>> >     resources = HIGH_CPU,
>> >     queue_name = 'compilations',
>> >     instances = 200,
>> >   )),
>> > ]
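>> >
>> > Each bound service above gets its own job key, so launching them is one
>> > `aurora create` per key -- roughly (a sketch, assuming the usual
>> > CLUSTER/ROLE/ENV/NAME job key form):
>> >
>> > aurora create west/service-account-name/prod/graph_traversals_processor workers.aurora
>> > aurora create west/service-account-name/prod/compilations_processor workers.aurora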
>> >
>> >
>> > On Wed, Feb 26, 2014 at 11:46 AM, Bryan Helmkamp <bryan@codeclimate.com> wrote:
>> >
>> >> Thanks, Bill.
>> >>
>> >> Am I correct in understanding that it is not possible to parameterize
>> >> individual Jobs, just Tasks? Therefore, since I don't know the job
>> >> definitions up front, I will have parameterized Task templates and
>> >> generate a new Task every time I need to run a Job?
>> >>
>> >> Is that the recommended route?
>> >>
>> >> Our work is very non-uniform so I don't think work-stealing would be
>> >> efficient for us.
>> >>
>> >> -Bryan
>> >>
>> >> On Wed, Feb 26, 2014 at 12:49 PM, Bill Farner <wfarner@apache.org> wrote:
>> >> > Thanks for checking out Aurora!
>> >> >
>> >> > My short answer is that Aurora should handle thousands of short-lived
>> >> > tasks/jobs per day without trouble.  (If you proceed with this
>> >> > approach and encounter performance issues, feel free to file
>> >> > tickets!)  The DSL does have some mechanisms for parameterization.
>> >> > In your case, since you probably don't know all the job definitions
>> >> > upfront, you'll probably want to parameterize with environment
>> >> > variables.  I don't see this described in our docs, but there's a
>> >> > little detail at the option declaration [1].
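>> >> >
>> >> > For instance (a rough sketch, not taken from the docs -- the variable
>> >> > name and how it gets bound per run are assumptions), the template can
>> >> > leave the per-run value out entirely and read it from the environment:
>> >> >
>> >> > # analysis.aurora -- hypothetical sketch
>> >> > run_analysis = Process(
>> >> >   name = 'run_analysis',
>> >> >   # WORK_PAYLOAD is assumed to be set per run when the job is created
>> >> >   cmdline = './do_something.sh "$WORK_PAYLOAD"')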
>> >> >
>> >> > Another approach worth considering is work-stealing, using a single
>> >> > job as your pool of workers.  I would find this easier to manage, but
>> >> > it would only be suitable if your work items are sufficiently uniform.
>> >> >
>> >> > Feel free to continue the discussion!  We're also pretty active in
>> >> > our IRC channel if you'd prefer that medium.
>> >> >
>> >> >
>> >> > [1]
>> >> > https://github.com/apache/incubator-aurora/blob/master/src/main/python/apache/aurora/client/options.py#L170-L183
>> >> >
>> >> >
>> >> > -=Bill
>> >> >
>> >> >
>> >> > On Tue, Feb 25, 2014 at 10:11 PM, Bryan Helmkamp <bryan@codeclimate.com> wrote:
>> >> >
>> >> >> Hello,
>> >> >>
>> >> >> I am considering Aurora for a key component of our infrastructure.
>> >> >> Awesome work being done here.
>> >> >>
>> >> >> My question is: How suitable is Aurora for running short-lived tasks?
>> >> >>
>> >> >> Background: We (Code Climate) do static analysis of tens of thousands
>> >> >> of repositories every day. We run a variety of forms of analysis,
>> >> >> with heterogeneous resource requirements, and thus our interest in Mesos.
>> >> >>
>> >> >> Looking at Aurora, a lot of the core features look very helpful to
>> >> >> us. Where I am getting hung up is figuring out how to model
>> >> >> short-lived tasks as tasks/jobs. Long-running resource allocations
>> >> >> are not really an option for us due to the variation in our workloads.
>> >> >>
>> >> >> My first thought was to create a Task for each type of analysis we
>> >> >> run, and then start a new Job with the appropriate Task every time we
>> >> >> want to run analysis (regulated by a queue). This doesn't seem to work
>> >> >> though. I can't `aurora create` the same `.aurora` file multiple times
>> >> >> with different Job names (as far as I can tell). Also there is the
>> >> >> problem of how to customize each Job slightly (e.g. a payload).
>> >> >>
>> >> >> An obvious alternative is to create a unique Task every time we want
>> >> >> to run work. This would result in tens of thousands of tasks being
>> >> >> created every day, and from what I can tell Aurora does not intend to
>> >> >> be used like that. (Please correct me if I am wrong.)
>> >> >>
>> >> >> Basically, I would like to hook my job queue up to Aurora to perform
>> >> >> the actual work. There are a dozen different types of jobs, each with
>> >> >> different performance requirements. Every time a job runs, it has a
>> >> >> unique payload containing the definition of the work to be performed.
>> >> >>
>> >> >> Can Aurora be used this way? If so, what is the proper way to model
>> >> >> this with respect to Jobs and Tasks?
>> >> >>
>> >> >> Any/all help is appreciated.
>> >> >>
>> >> >> Thanks!
>> >> >>
>> >> >> -Bryan
>> >> >>
>> >> >> --
>> >> >> Bryan Helmkamp, Founder, Code Climate
>> >> >> bryan@codeclimate.com / 646-379-1810 / @brynary
>> >> >>
>> >>
>> >>
>> >>
>> >> --
>> >> Bryan Helmkamp, Founder, Code Climate
>> >> bryan@codeclimate.com / 646-379-1810 / @brynary
>> >>
>>
>>
>>
>> --
>> Bryan Helmkamp, Founder, Code Climate
>> bryan@codeclimate.com / 646-379-1810 / @brynary
>>



-- 
Bryan Helmkamp, Founder, Code Climate
bryan@codeclimate.com / 646-379-1810 / @brynary
